Netgate Discussion Forum

    Daily rc.update_bogons.sh results in zombie procs

    General pfSense Questions
    13 Posts 3 Posters 1.1k Views 2 Watching
    • JeGr LAYER 8 Moderator

      Today it accumulated on most systems again. Just to check it:

      [2.4.4-RELEASE][root@fwl01.<***>.de]/root: ps laxwww | grep 91981 | grep -v grep
          0 72702 91981   0  20  0       0      0 -        Z     -       0:00.00 <defunct>
          0 91981 92573   0  20  0    8416   2316 piperd   I     -       0:00.00 cron: running job (cron)
          0 92534 91981   0  40 20    6968   2828 wait     INs   -       0:00.00 /bin/sh /etc/rc.update_bogons.sh
      

      It's rc.update_bogons.sh again. Any idea how we could debug this and find out why it happens at all?

      Don't forget to upvote 👍 those who kindly offered their time and brainpower to help you!

      If you're interested, I'm available to discuss details of German-speaking paid support (for companies) if needed.

      • jimp Rebel Alliance Developer Netgate

        Look at ps uxawwd and see where that falls in the process tree.

        I'm not sure what might result in that. Does it happen if you run it manually? If so, try running it with sh -x /etc/rc.update_bogons.sh and see if anything sticks out.
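
To see which parent owns a zombie, it can help to list only the defunct entries together with their parent PIDs. This is a minimal sketch, assuming `ps -axo pid,ppid,stat,command` output is piped in; `list_zombies` is a hypothetical helper name and not part of pfSense:

```shell
#!/bin/sh
# list_zombies: hypothetical helper. Reads "PID PPID STAT COMMAND" rows on
# stdin and prints "PID PPID" for every row whose state starts with Z.
list_zombies() {
    # NR > 1 skips the header row that ps prints first.
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2 }'
}

# Typical use on a live system:
#   ps -axo pid,ppid,stat,command | list_zombies
```

The second column is the process that has not yet reaped the zombie, which is the one worth inspecting in the `ps uxawwd` tree.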

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • JeGr LAYER 8 Moderator

          Yep, it's the fetch that is kind of "hanging":

          root   92573   0.0  0.0    6368   2296  -  Is   30Apr20      0:03.89 |-- /usr/sbin/cron -s
          root   91981   0.0  0.0    8416   2316  -  I    03:01        0:00.00 | `-- cron: running job (cron)
          root   72702   0.0  0.0       0      0  -  Z    11:52        0:00.00 |   |-- <defunct>
          root   92534   0.0  0.0    6968   2828  -  INs  03:01        0:00.00 |   `-- /bin/sh /etc/rc.update_bogons.sh
          root   87274   0.0  0.0    9264   6536  -  IN   17:13        0:00.01 |     `-- /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogonsv6 https://files.pfsense.org/lists/fullbogons-ipv6.txt
          

          Problems with the "files" server perhaps? I'll try running it manually...

          Edit: before running the rc script manually, I tried the URL by hand. A browser takes ages to load it, and a wget from another pfSense instance hangs at "Connecting to files.pfsense.org..." and times out after several minutes:

          [2.5.0-DEVELOPMENT][root@mirage.....to]/root: wget https://files.pfsense.org/lists/fullbogons-ipv6.txt
          --2020-06-05 17:17:54--  https://files.pfsense.org/lists/fullbogons-ipv6.txt
          Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
          Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
          Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443...
          

          • jimp Rebel Alliance Developer Netgate

            The zombie and the bogons update are at the same level, though. But if you kill the fetch do the others go away?

            We have had some issues with the files server which we're working to resolve, but I'm not aware of it making anything hang like that repeatedly.

            • JeGr LAYER 8 Moderator

              See my edit above: it seems fetch/curl/wget takes ages, falls back to the next IP, etc.

              [2.5.0-DEVELOPMENT][root@mirage.....to]/root: wget https://files.pfsense.org/lists/fullbogons-ipv6.txt
              --2020-06-05 17:17:54--  https://files.pfsense.org/lists/fullbogons-ipv6.txt
              Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
              Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
              Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... connected.
              HTTP request sent, awaiting response... 200 OK
              Length: 1841962 (1.8M) [text/plain]
              Saving to: 'fullbogons-ipv6.txt'
              
              fullbogons-ipv6.txt             1%[                                                  ]  23.66K  5.61KB/s    eta 5m 17s
              
              

              It took around 6 minutes before the download started at all. That is definitely not normal: regular package updates etc. are much faster and have no trouble failing over to another IP.

              I guess the whole process takes so long that the PHP process that started it times out or goes zombie. Since this only started happening again recently, that would fit with the problems you are having on the "files" server?

              • jimp Rebel Alliance Developer Netgate

                Maybe so. Though there is a problem right this moment, there wasn't one overnight. So the behavior may be different at the moment. It's already being investigated here, so hopefully resolved shortly.

                • JeGr LAYER 8 Moderator

                  Interesting. The download breaks off partway through and retries; the retry fails on IPv4, switches to IPv6, fails again, and finally succeeds via the v6 address ::119:41, where it instantly jumps to ~2MB/s and finishes without a hitch:

                  Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
                  Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
                  Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... connected.
                  HTTP request sent, awaiting response... 200 OK
                  Length: 1841962 (1.8M) [text/plain]
                  Saving to: 'fullbogons-ipv6.txt'
                  
                  fullbogons-ipv6.txt            84%[=========================================>        ]   1.48M  3.84KB/s    in 6m 12s
                  
                  2020-06-05 17:26:39 (4.09 KB/s) - Connection closed at byte 1556131. Retrying.
                  
                  --2020-06-05 17:26:40--  (try: 2)  https://files.pfsense.org/lists/fullbogons-ipv6.txt
                  Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... failed: Connection refused.
                  Connecting to files.pfsense.org (files.pfsense.org)|2607:ee80:10::119:40|:443... failed: Connection refused.
                  Connecting to files.pfsense.org (files.pfsense.org)|2607:ee80:10::119:41|:443... connected.
                  HTTP request sent, awaiting response... 206 Partial Content
                  Length: 1841962 (1.8M), 285831 (279K) remaining [text/plain]
                  Saving to: 'fullbogons-ipv6.txt'
                  
                  fullbogons-ipv6.txt           100%[++++++++++++++++++++++++++++++++++++++++++=======>]   1.76M   496KB/s    in 0.6s
                  
                  2020-06-05 17:26:50 (496 KB/s) - 'fullbogons-ipv6.txt' saved [1841962/1841962]
                  

                  Another download now also reaches .41 over IPv4, so it seems .40 is a bit faulty at the moment, while .41 had some issues but now responds well again. If that happened while cron was updating the bogons, it could explain the hanging fetch process, with all those timeouts, failures, retries, etc.

                  • JeGr LAYER 8 Moderator

                    Ah so I was running the update process with

                    sh -x /etc/rc.update_bogons.sh nosleep

                    (otherwise it sleeps for minutes to hours...) and it fails immediately with an authentication error:

                    + /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                    Certificate verification failed for /C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
                    34374274104:error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed:/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/crypto/openssl/ssl/s3_clnt.c:1269:
                    fetch: https://files.pfsense.org/lists/fullbogons-ipv4.txt: Authentication error
                    

                    I'll check other systems where the download failed but I assume they could all have that problem.

                    Funny: the process/script doesn't go further. It won't exit and it won't skip or go away. Fetch just sits there doing nothing at all anymore.

                    • JeGr LAYER 8 Moderator

                      Yep, confirmed. Other systems (2.4.4-p3 and 2.4.5 alike) have the same problem:

                      [2.4.5-RELEASE][root@fwl01.....de]/root: sh -x /etc/rc.update_bogons.sh nosleep
                      + proc_error=''
                      + /usr/local/sbin/read_xml_tag.sh boolean system/do_not_send_uniqueid
                      + do_not_send_uniqueid=false
                      + [ false '!=' true ]
                      + /usr/sbin/gnid
                      + uniqueid=1c3a576e6ca2d88ad608
                      + export 'HTTP_USER_AGENT=/:1c3a576e6ca2d88ad608'
                      + echo 'rc.update_bogons.sh is starting up.'
                      + logger
                      + [ nosleep '=' '' ]
                      + echo 'rc.update_bogons.sh is beginning the update cycle.'
                      + logger
                      + [ -f /var/etc/bogon_custom ]
                      + v4url=https://files.pfsense.org/lists/fullbogons-ipv4.txt
                      + v6url=https://files.pfsense.org/lists/fullbogons-ipv6.txt
                      + v4urlcksum=https://files.pfsense.org/lists/fullbogons-ipv4.txt.md5
                      + v6urlcksum=https://files.pfsense.org/lists/fullbogons-ipv6.txt.md5
                      + process_url /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                      + local 'file=/tmp/bogons'
                      + local 'url=https://files.pfsense.org/lists/fullbogons-ipv4.txt'
                      + local 'filename=fullbogons-ipv4.txt'
                      + local 'ext=txt'
                      + /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                      Certificate verification failed for /C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
                      34374270280:error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed:/build/ce-crossbuild-245/sources/FreeBSD-src/crypto/openssl/ssl/s3_clnt.c:1269:
                      fetch: https://files.pfsense.org/lists/fullbogons-ipv4.txt: Authentication error
                      

                      fetch doesn't return from the auth error and doesn't seem to exit; the cron job goes stale and the rc script's shell goes zombie after enough waiting.

                      So it seems the problem is two-fold:

                      1. SSL auth error on files.pfsense.org - these things can happen
                      2. fetch not exiting after a failure and thus blocking/zombifying the parent processes
                        Correction: fetch is configured to retry with "-a", and "-w 600" makes it wait 10 minutes before retrying. But it never stops retrying.

                      Anything to help there?
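
One way to avoid an open-ended retry is to drop "-a"/"-w" from fetch and put a fixed upper bound on the attempts in the calling shell instead. The following is only a sketch under that assumption; `retry_bounded` is a hypothetical wrapper, not something the pfSense script or fetch(1) provides, and the values in the usage comment are made up:

```shell
#!/bin/sh
# retry_bounded: hypothetical wrapper, not part of pfSense or fetch(1).
# Usage: retry_bounded MAX_TRIES DELAY_SECONDS command [args...]
# Runs the command until it succeeds or MAX_TRIES attempts have failed.
retry_bounded() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still coming.
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example with assumed values: at most 3 tries, 60 seconds apart.
#   retry_bounded 3 60 /usr/bin/fetch -T 30 -q -o /tmp/bogons "$v4url"
```

With a bound like this, a persistent failure such as the certificate error ends the job with a non-zero status instead of leaving fetch waiting under cron.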

                      • jimp Rebel Alliance Developer Netgate

                        Well, 1 should be fixed shortly. Not sure about 2.

                        • JeGr LAYER 8 Moderator

                          I was a bit off on 2). It seems to be the way fetch works with "-a" and "-w": "-a" tells it to retry (seemingly indefinitely!) and "-w 600" makes it wait 10 minutes before the next try. So it throws the auth failure, waits 10 minutes, fails again, and again, and again, and somewhere along the way loses its parent to a zombie 💀
                          It also seems that once that happens, the script's own mechanism for detecting a running "bogon_update" no longer sees it as running and so starts a new one (which becomes a zombie, too).
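
A common way to keep runs from piling up, independent of whatever check the script itself does, is an atomic lock directory. This is a generic sketch and not the mechanism rc.update_bogons.sh uses; the lock path and the function names are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical lock directory; the real script may use a different mechanism.
LOCKDIR="${LOCKDIR:-/tmp/update_bogons.lock}"

# acquire_lock: mkdir is atomic, so only one caller can create the directory.
# Returns non-zero if the lock is already held.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# release_lock: remove the lock directory so the next run can start.
release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

# Intended use at the top of a cron job:
#   acquire_lock || { echo "previous run still active" | logger; exit 0; }
#   trap release_lock EXIT
```

A lock alone does not help if the holder never exits, so it only makes sense together with a download that is guaranteed to finish, for example through a timeout.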

                          • itpp21 @JeGr

                            My own fix/solution: locate the section below and replace it, provided the commented-out lines match your copy.
                            /etc/rc.update_bogons.sh

                            # Set default values if not overridden
                            v4url=${v4url:-"https://files.pfsense.org/lists/fullbogons-ipv4.txt"}
                            v6url=${v6url:-"https://files.pfsense.org/lists/fullbogons-ipv6.txt"}
                            v4urlcksum=${v4urlcksum:-"${v4url}.md5"}
                            v6urlcksum=${v6urlcksum:-"${v6url}.md5"}
                            
                            # process_url /tmp/bogons "${v4url}"
                            # process_url /tmp/bogonsv6 "${v6url}"
                            
                            # -f: do not complain if a file is missing (e.g. on the first run)
                            rm -f /tmp/bogons /tmp/fullbogons-ipv4.txt.md5 /tmp/bogonsv6 /tmp/fullbogons-ipv6.txt.md5
                            # --max-time 120 caps each transfer at two minutes, so a bad server cannot hang the job.
                            # Note: -k disables TLS certificate verification.
                            curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv4.txt     -o /tmp/bogons
                            curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv4.txt.md5 -o /tmp/fullbogons-ipv4.txt.md5
                            curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv6.txt     -o /tmp/bogonsv6
                            curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv6.txt.md5 -o /tmp/fullbogons-ipv6.txt.md5
                            
                            if [ "$proc_error" != "" ]; then
                            	# Relaunch and sleep
                            	sh /etc/rc.update_bogons.sh &
                            	exit
                            fi
                            
                            # BOGON_V4_CKSUM=`/usr/bin/fetch -T 30 -q -o - "${v4urlcksum}" | awk '{ print $4 }'`
                            # ON_DISK_V4_CKSUM=`md5 /tmp/bogons | awk '{ print $4 }'`
                            # BOGON_V6_CKSUM=`/usr/bin/fetch -T 30 -q -o - "${v6urlcksum}" | awk '{ print $4 }'`
                            # ON_DISK_V6_CKSUM=`md5 /tmp/bogonsv6 | awk '{ print $4 }'`
                            
                            BOGON_V4_CKSUM=`awk '{ print $4 }' /tmp/fullbogons-ipv4.txt.md5`
                            ON_DISK_V4_CKSUM=`md5 /tmp/bogons | awk '{ print $4 }'`
                            BOGON_V6_CKSUM=`awk '{ print $4 }' /tmp/fullbogons-ipv6.txt.md5`
                            ON_DISK_V6_CKSUM=`md5 /tmp/bogonsv6 | awk '{ print $4 }'`
                            
                            if [ "$BOGON_V4_CKSUM" = "$ON_DISK_V4_CKSUM" ] || [ "$BOGON_V6_CKSUM" = "$ON_DISK_V6_CKSUM" ]; then
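
One thing to watch with the comparison above: if both downloads fail, the published and the on-disk checksum variables both end up empty, and two empty strings compare as equal. The sketch below shows one way to guard against that; `md5_field` and `checksums_match` are hypothetical helper names, and the `MD5 (file) = <hash>` layout is assumed from the script's use of field 4:

```shell
#!/bin/sh
# md5_field: print the 4th whitespace-separated field of the first line,
# which is the hash in BSD md5 output such as "MD5 (file) = <hash>".
md5_field() {
    awk 'NR == 1 { print $4 }'
}

# checksums_match: hypothetical helper. Succeeds only if both hashes are
# non-empty and identical, so a failed download cannot "match" by accident.
checksums_match() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" = "$2" ]
}

# Possible use with the files from the snippet above:
#   remote=$(md5_field < /tmp/fullbogons-ipv4.txt.md5)
#   local_sum=$(md5 /tmp/bogons | md5_field)
#   checksums_match "$remote" "$local_sum" || echo "IPv4 bogons checksum mismatch" | logger
```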
                            
                            
                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.