Netgate Discussion Forum

    Daily rc.update_bogons.sh results in zombie procs

    General pfSense Questions
    • jimp Rebel Alliance Developer Netgate

      Look at ps uxawwd and see where that falls in the process tree.

      I'm not sure what might result in that. Does it happen if you run it manually? If so, try running it with sh -x /etc/rc.update_bogons.sh and see if anything sticks out.
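
As a rough aid for the first step, here is a minimal sketch that picks zombie processes and their parent PIDs out of ps output. The helper name list_zombies is hypothetical, and it assumes ps is asked for the columns pid, ppid, stat in that order with a header line:

```shell
# list_zombies: read "pid ppid stat command..." lines on stdin (header first)
# and print "PID PPID" for every process whose state starts with Z (zombie).
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2 }'
}
```

On a FreeBSD-based system this could be fed with `ps -axo pid,ppid,stat,command | list_zombies`; the parent PID in the second column is the process that has not yet reaped the zombie.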

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • JeGr LAYER 8 Moderator

        Yep, it's the fetch that is kind of "hanging":

        root   92573   0.0  0.0    6368   2296  -  Is   30Apr20      0:03.89 |-- /usr/sbin/cron -s
        root   91981   0.0  0.0    8416   2316  -  I    03:01        0:00.00 | `-- cron: running job (cron)
        root   72702   0.0  0.0       0      0  -  Z    11:52        0:00.00 |   |-- <defunct>
        root   92534   0.0  0.0    6968   2828  -  INs  03:01        0:00.00 |   `-- /bin/sh /etc/rc.update_bogons.sh
        root   87274   0.0  0.0    9264   6536  -  IN   17:13        0:00.01 |     `-- /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogonsv6 https://files.pfsense.org/lists/fullbogons-ipv6.txt
        

        Problems with the "files" server perhaps? I'll try running it manually...

        Edit: before running the rc script manually, I tried the URL by hand. The browser takes ages to load, and a wget from another pfSense instance sits at "Connecting to files.pfsense.org..." for ages and times out after several minutes:

        [2.5.0-DEVELOPMENT][root@mirage.....to]/root: wget https://files.pfsense.org/lists/fullbogons-ipv6.txt
        --2020-06-05 17:17:54--  https://files.pfsense.org/lists/fullbogons-ipv6.txt
        Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
        Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
        Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443...
        

        Don't forget to upvote 👍 those who kindly offered their time and brainpower to help you!

        If you're interested, I'm available to discuss details of German-speaking paid support (for companies) if needed.

        • jimp Rebel Alliance Developer Netgate

          The zombie and the bogons update are at the same level, though. But if you kill the fetch do the others go away?

          We have had some issues with the files server which we're working to resolve, but I'm not aware of it making anything hang like that repeatedly.

          • JeGr LAYER 8 Moderator

            See my edit above: it seems the fetch/curl/wget takes ages, falls back to the next IP, etc.

            [2.5.0-DEVELOPMENT][root@mirage.....to]/root: wget https://files.pfsense.org/lists/fullbogons-ipv6.txt
            --2020-06-05 17:17:54--  https://files.pfsense.org/lists/fullbogons-ipv6.txt
            Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
            Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
            Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... connected.
            HTTP request sent, awaiting response... 200 OK
            Length: 1841962 (1.8M) [text/plain]
            Saving to: 'fullbogons-ipv6.txt'
            
            fullbogons-ipv6.txt             1%[                                                  ]  23.66K  5.61KB/s    eta 5m 17s
            
            

            It took around 6 minutes before the download started at all. That is definitely not normal, since regular package updates etc. are much faster and have no trouble falling back to another IP.

            My guess is that the whole process takes so long that the PHP process that started it times out or goes zombie. Since this only started recurring recently, would that fit with you having problems on the "files" server?

            • jimp Rebel Alliance Developer Netgate

              Maybe so. Though there is a problem right this moment, there wasn't one overnight. So the behavior may be different at the moment. It's already being investigated here, so hopefully resolved shortly.

              • JeGr LAYER 8 Moderator

                Interesting. The download closes partway through and breaks, then the retry fails to reach the IPv4 addresses, switches to IPv6, fails again, and finally succeeds via the IPv6 address ::119:41, where it instantly jumps to ~2MB/s and loads without a hitch:

                Resolving files.pfsense.org (files.pfsense.org)... 162.208.119.41, 162.208.119.40, 2607:ee80:10::119:40, ...
                Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.41|:443... failed: Operation timed out.
                Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... connected.
                HTTP request sent, awaiting response... 200 OK
                Length: 1841962 (1.8M) [text/plain]
                Saving to: 'fullbogons-ipv6.txt'
                
                fullbogons-ipv6.txt            84%[=========================================>        ]   1.48M  3.84KB/s    in 6m 12s
                
                2020-06-05 17:26:39 (4.09 KB/s) - Connection closed at byte 1556131. Retrying.
                
                --2020-06-05 17:26:40--  (try: 2)  https://files.pfsense.org/lists/fullbogons-ipv6.txt
                Connecting to files.pfsense.org (files.pfsense.org)|162.208.119.40|:443... failed: Connection refused.
                Connecting to files.pfsense.org (files.pfsense.org)|2607:ee80:10::119:40|:443... failed: Connection refused.
                Connecting to files.pfsense.org (files.pfsense.org)|2607:ee80:10::119:41|:443... connected.
                HTTP request sent, awaiting response... 206 Partial Content
                Length: 1841962 (1.8M), 285831 (279K) remaining [text/plain]
                Saving to: 'fullbogons-ipv6.txt'
                
                fullbogons-ipv6.txt           100%[++++++++++++++++++++++++++++++++++++++++++=======>]   1.76M   496KB/s    in 0.6s
                
                2020-06-05 17:26:50 (496 KB/s) - 'fullbogons-ipv6.txt' saved [1841962/1841962]
                

                Another download now also reaches .41 over IPv4. It seems .40 is a bit faulty at the moment, and .41 had some issues but now responds well again. If that happened while cron was updating the bogons, it could explain the hanging fetch process, with all those timeouts, failures, retries, etc.

                • JeGr LAYER 8 Moderator

                  Ah so I was running the update process with

                  sh -x /etc/rc.update_bogons.sh nosleep

                  (otherwise it sleeps for minutes to hours...) and it fails immediately with an authentication error:

                  + /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                  Certificate verification failed for /C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
                  34374274104:error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed:/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/crypto/openssl/ssl/s3_clnt.c:1269:
                  fetch: https://files.pfsense.org/lists/fullbogons-ipv4.txt: Authentication error
                  

                  I'll check other systems where the download failed but I assume they could all have that problem.

                  Funny: the process/script doesn't go further. It won't exit and it won't skip or go away. Fetch just sits there doing nothing at all anymore.

                  • JeGr LAYER 8 Moderator

                    Yep, confirmed. Other systems (2.4.4-p3 and 2.4.5 alike) have the same problem:

                    [2.4.5-RELEASE][root@fwl01.....de]/root: sh -x /etc/rc.update_bogons.sh nosleep
                    + proc_error=''
                    + /usr/local/sbin/read_xml_tag.sh boolean system/do_not_send_uniqueid
                    + do_not_send_uniqueid=false
                    + [ false '!=' true ]
                    + /usr/sbin/gnid
                    + uniqueid=1c3a576e6ca2d88ad608
                    + export 'HTTP_USER_AGENT=/:1c3a576e6ca2d88ad608'
                    + echo 'rc.update_bogons.sh is starting up.'
                    + logger
                    + [ nosleep '=' '' ]
                    + echo 'rc.update_bogons.sh is beginning the update cycle.'
                    + logger
                    + [ -f /var/etc/bogon_custom ]
                    + v4url=https://files.pfsense.org/lists/fullbogons-ipv4.txt
                    + v6url=https://files.pfsense.org/lists/fullbogons-ipv6.txt
                    + v4urlcksum=https://files.pfsense.org/lists/fullbogons-ipv4.txt.md5
                    + v6urlcksum=https://files.pfsense.org/lists/fullbogons-ipv6.txt.md5
                    + process_url /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                    + local 'file=/tmp/bogons'
                    + local 'url=https://files.pfsense.org/lists/fullbogons-ipv4.txt'
                    + local 'filename=fullbogons-ipv4.txt'
                    + local 'ext=txt'
                    + /usr/bin/fetch -a -w 600 -T 30 -q -o /tmp/bogons https://files.pfsense.org/lists/fullbogons-ipv4.txt
                    Certificate verification failed for /C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
                    34374270280:error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed:/build/ce-crossbuild-245/sources/FreeBSD-src/crypto/openssl/ssl/s3_clnt.c:1269:
                    fetch: https://files.pfsense.org/lists/fullbogons-ipv4.txt: Authentication error
                    

                    fetch doesn't return from the auth error and doesn't seem to quit/exit; the cron job goes stale and the rc shell script becomes a zombie after waiting long enough.

                    So it seems the problem is two-fold:

                    1. SSL auth error on files.pfsense.org - these things can happen
                    2. fetch not exiting after a failure and thus blocking/zombifying the parent processes
                       Correction: fetch is configured to retry with "-a" and waits 10 minutes ("-w 600") between retries. But it never stops retrying.

                    Anything to help there?
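
One possible way around point 2 is to drop "-a"/"-w" and bound the retries in the script itself. The following is only a minimal sketch under that assumption; retry_limited is a hypothetical helper, not something that exists in rc.update_bogons.sh:

```shell
# retry_limited MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, but give up after MAX_TRIES attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry_limited() {
    local max_tries delay try
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $try of $max_tries failed: $1" >&2
        # Only wait if another attempt is still coming.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}
```

A call such as `retry_limited 3 600 /usr/bin/fetch -T 30 -q -o /tmp/bogons "$v4url"` would then stop after three failed attempts instead of waiting on the server indefinitely, so the cron job can finish and be reaped.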

                    • jimp Rebel Alliance Developer Netgate

                      Well, 1 should be fixed shortly. Not sure about 2.

                      • JeGr LAYER 8 Moderator

                        I was a bit off on 2). It seems that's just how fetch works with "-a" and "-w": "-a" tells it to retry (seemingly forever!) and "-w 600" makes it wait 10 minutes before the next try. So it throws the auth failure, waits 10 minutes, fails again, and again, and again, and somewhere along the way loses its parent to a zombie 💀
                        It also seems that once it has become a zombie, the script's own check for an already running "bogon_update" no longer sees it, so a new one gets started (which becomes a zombie, too).
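
How the script detects an already running instance is not shown in this thread. Purely as an illustration of one alternative, here is a sketch of a lock directory with a recorded PID; the function names and the lock path in the usage note are hypothetical:

```shell
# acquire_lock LOCKDIR
# Returns 0 if this shell now holds the lock, 1 if another live process holds it.
acquire_lock() {
    local lockdir oldpid
    lockdir=$1
    # mkdir is atomic: only one caller can create the directory.
    if mkdir "$lockdir" 2>/dev/null; then
        echo "$$" > "$lockdir/pid"
        return 0
    fi
    # The lock already exists: check whether the recorded PID is still present.
    oldpid=$(cat "$lockdir/pid" 2>/dev/null) || oldpid=""
    if [ -n "$oldpid" ] && kill -0 "$oldpid" 2>/dev/null; then
        return 1
    fi
    # Stale lock: take it over (two simultaneous takeovers could still race).
    echo "$$" > "$lockdir/pid"
    return 0
}

# release_lock LOCKDIR: remove the lock directory and its PID file.
release_lock() {
    rm -rf "$1"
}
```

A script could call `acquire_lock /tmp/bogons_update.lock || exit 0` near the top and `trap 'release_lock /tmp/bogons_update.lock' EXIT` right after, so that overlapping cron runs exit early instead of piling up.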

                        • itpp21 @JeGr

                          My own fix: in /etc/rc.update_bogons.sh, locate the section below and replace it, provided the commented-out lines match your copy:

                          # Set default values if not overriden
                          v4url=${v4url:-"https://files.pfsense.org/lists/fullbogons-ipv4.txt"}
                          v6url=${v6url:-"https://files.pfsense.org/lists/fullbogons-ipv6.txt"}
                          v4urlcksum=${v4urlcksum:-"${v4url}.md5"}
                          v6urlcksum=${v6urlcksum:-"${v6url}.md5"}
                          
                          # process_url /tmp/bogons "${v4url}"
                          # process_url /tmp/bogonsv6 "${v6url}"
                          
                          # Remove stale downloads (-f: no error if a file does not exist yet)
                          rm -f /tmp/bogons
                          rm -f /tmp/fullbogons-ipv4.txt.md5
                          rm -f /tmp/bogonsv6
                          rm -f /tmp/fullbogons-ipv6.txt.md5
                          # Cap each transfer at 120 seconds so a stalled server cannot hang the script.
                          # Note: -k skips TLS certificate verification.
                          curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv4.txt     -o /tmp/bogons
                          curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv4.txt.md5 -o /tmp/fullbogons-ipv4.txt.md5
                          curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv6.txt     -o /tmp/bogonsv6
                          curl --max-time 120 -k https://files.pfsense.org/lists/fullbogons-ipv6.txt.md5 -o /tmp/fullbogons-ipv6.txt.md5
                          
                          if [ "$proc_error" != "" ]; then
                          	# Relaunch and sleep
                          	sh /etc/rc.update_bogons.sh &
                          	exit
                          fi
                          
                          # BOGON_V4_CKSUM=`/usr/bin/fetch -T 30 -q -o - "${v4urlcksum}" | awk '{ print $4 }'`
                          # ON_DISK_V4_CKSUM=`md5 /tmp/bogons | awk '{ print $4 }'`
                          # BOGON_V6_CKSUM=`/usr/bin/fetch -T 30 -q -o - "${v6urlcksum}" | awk '{ print $4 }'`
                          # ON_DISK_V6_CKSUM=`md5 /tmp/bogonsv6 | awk '{ print $4 }'`
                          
                          BOGON_V4_CKSUM=`cat /tmp/fullbogons-ipv4.txt.md5 | awk '{ print $4 }'`
                          ON_DISK_V4_CKSUM=`md5 /tmp/bogons | awk '{ print $4 }'`
                          BOGON_V6_CKSUM=`cat /tmp/fullbogons-ipv6.txt.md5 | awk '{ print $4 }'`
                          ON_DISK_V6_CKSUM=`md5 /tmp/bogonsv6 | awk '{ print $4 }'`
                          
                          if [ "$BOGON_V4_CKSUM" = "$ON_DISK_V4_CKSUM" ] || [ "$BOGON_V6_CKSUM" = "$ON_DISK_V6_CKSUM" ]; then
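
One thing to watch with the comparison above: if a download fails, both the fetched checksum and the on-disk checksum can end up empty, and two empty strings compare as equal. A minimal sketch of a stricter check, using hypothetical helper names and assuming the "MD5 (file) = hash" layout that the awk '{ print $4 }' calls rely on:

```shell
# md5_field: read "MD5 (file) = <hash>" text on stdin and print only the hash.
md5_field() {
    awk '{ print $4 }'
}

# cksum_matches EXPECTED ACTUAL
# Succeeds only when both values are non-empty and identical.
cksum_matches() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" = "$2" ]
}
```

With these, a test like `cksum_matches "$BOGON_V4_CKSUM" "$ON_DISK_V4_CKSUM"` would fail when either side is empty instead of treating a failed download as a match.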
                          
                          
                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.