Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    [solved] 2.0.3 to 2.1 release - squid dies

    Scheduled Pinned Locked Moved Problems Installing or Upgrading pfSense Software
    15 Posts 2 Posters 13.0k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • E
      Ecnerwal
      last edited by

      6 days later, and squid was down again (actually went down on the 5th day, I guess). Log reports something on the order of "out of swap space" (settings exactly the same as on 2.0.3, which never had a problem (or never a noticeable problem) and ran for ~3 months without a hitch - 16GB RAM, 32GB dedicated swap partition, as far as I recall.) I'll copy in what the log actually reported when I'm on that network again (this machine lives on the network I've left at 2.0.3 for now, and will until the network I upgraded is actually solid, unless I give up and back-grade that one.)

      Restarted squid (dropping the memory cache from 8 to 7 gig) and everything was sluggish as all get out. Barked up a few wrong trees looking for a non-pfSense network problem, then rebooted pfSense and everything hunky again… for now.
      Filter on squid: (package restart 9:44, reboot 10:50)

      Sep 26 16:19:06	kernel: pid 61679 (squid), uid 62, was killed: out of swap space
      Sep 26 16:19:07	squid[61285]: Squid Parent: child process 61679 exited due to signal 9
      Sep 27 09:44:47	php: /status_services.php: The command '/usr/local/etc/rc.d/squid.sh stop' returned exit code '1', the output was '2013/09/27 09:44:42| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: Could not send signal 15 to process 61679: (3) No such process'
      Sep 27 09:44:49	squid[10615]: Squid Parent: child process 11020 started
      Sep 27 10:50:50	squid[10615]: Squid Parent: child process 11020 exited with status 0
      Sep 27 10:50:50	php: /reboot.php: The command '/usr/local/etc/rc.d/squid.sh stop' returned exit code '1', the output was '2013/09/27 10:50:45| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl"'
      Sep 27 10:52:36	squid[85746]: Squid Parent: child process 86095 started
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:36	php: rc.start_packages: Not calling package sync code for dependency squid of squid because some include files are missing.
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:36	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:38	php: rc.start_packages: Not calling package sync code for dependency squid of squid because some include files are missing.
      Sep 27 10:52:41	php: rc.start_packages: The command '/usr/local/etc/rc.d/squid.sh stop' returned exit code '1', the output was '2013/09/27 10:52:36| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: No running copy'
      Sep 27 10:52:43	squid[32503]: Squid Parent: child process 32737 started
      Sep 27 10:52:43	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:43| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: Could not send signal 1 to process 86095: (3) No such process'
      Sep 27 10:52:44	php: rc.start_packages: The command '/usr/local/etc/rc.d/squid.sh stop' returned exit code '1', the output was '2013/09/27 10:52:39| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl"'
      Sep 27 10:52:46	squid[42886]: Squid Parent: child process 43273 started
      Sep 27 10:52:46	php: rc.start_packages: The command '/usr/pbi/squid-amd64/sbin/squid -k reconfigure' returned exit code '1', the output was '2013/09/27 10:52:46| aclParseAclLine: WARNING: empty ACL: acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl" squid: ERROR: Could not send signal 1 to process 32737: (3) No such process'
      

      Filter on swap, which indicates that this did happen on 9/16 (running 2.0.3), but if so, squid restarted cleanly and there was no "great sluggishness" nor a need to manually restart squid. When I finally upped the GUI log entries enough to see that far back looking for squid, there is, in fact, this entry 7 seconds after it was killed:
      Sep 16 20:28:30 squid[43487]: Squid Parent: child process 44026 started

      Sep 16 20:28:23	kernel: swap zone exhausted, increase kern.maxswzone
      Sep 16 20:28:23	kernel: pid 13449 (squid), uid 62, was killed: out of swap space
      Sep 16 20:28:23	kernel: swap zone exhausted, increase kern.maxswzone
      Sep 16 20:28:23	kernel: pid 30069 (php), uid 0, was killed: out of swap space
      Sep 21 11:15:05	squid[55785]: Failed to verify one of the swap directories, Check cache.log for details. Run 'squid -z' to create swap directories if needed, or if running Squid for the first time.
      Sep 21 11:39:16	squid[5931]: Failed to verify one of the swap directories, Check cache.log for details. Run 'squid -z' to create swap directories if needed, or if running Squid for the first time.
      Sep 21 12:06:29	squid[92872]: Failed to verify one of the swap directories, Check cache.log for details. Run 'squid -z' to create swap directories if needed, or if running Squid for the first time.
      Sep 26 16:19:06	kernel: swap zone exhausted, increase kern.maxswzone
      Sep 26 16:19:06	kernel: pid 61679 (squid), uid 62, was killed: out of swap space
      Sep 26 16:19:06	kernel: swap zone exhausted, increase kern.maxswzone
      Sep 26 16:19:06	kernel: pid 31134 (php), uid 0, was killed: out of swap space
      

      pfSense on i5 3470/DQ77MK/16GB/500GB

      1 Reply Last reply Reply Quote 0
      • K
        kejianshi
        last edited by

        What does your squid cache management setup page look like?

        1 Reply Last reply Reply Quote 0
        • E
          Ecnerwal
          last edited by

          Hmmm - what are folks doing to post pretty pictures here? Hosting them elsewhere, I suppose. Also, goodnight ;-)

          The hard way: Hard disk cache size 320000 MB
          hard disk cache system ufs
          hard disk cache location /squid/cache
          memory cache size  7168 (was 8192)
          min object size 0
          max object size 4000000
          max object size in RAM 3200
          level 1 subdirectories 128
          memory replacement Heap LFUDA
          cache replacement Heap LFUDA
          low-water % 96
          high water % 98

          pfSense on i5 3470/DQ77MK/16GB/500GB

          1 Reply Last reply Reply Quote 0
          • K
            kejianshi
            last edited by

            HDD cache size is too big (assuming its being filled to that level)

            with your RAM, you should reduce it to about 120000 (very safe).

            Also be sure not to dedicate more than half of your RAM to RAM cache.

            The reason you can't just make the HDD cache however big you have space for is that it take RAM to index HDD cache.

            If you actually fill all that cache, at some point between 120000MB and say 200000MB, you will probably exhaust your RAM and soon after, your swap.

            Check to see how much space is has actually been used so far.

            df -h

            1 Reply Last reply Reply Quote 0
            • E
              Ecnerwal
              last edited by

              8GB is only half the RAM. But I've started de-tuning, dropping to 7. I suppose I'll probably start reducing disk cache as well, but the math isn't there, or the squid docs are wrong.

              df -h output is in first post. 167GB used on that date

              Dashboard claim of swap usage (only place I see that reported) is typically 0-12%, and memory cache size (while in 2.0.3) was tuned while keeping an eye on it (aiming for 0).

              per squid docs, 320GB of cache should require 4.5 GB of index on 64 bit system at 14MB/GB. An additional 112MB is needed to index 8GB in RAM cache. 13GB or so total, leaving 3GB of 16 for everything else, if it were even full (it's not)

              at 167GB, memory index should be less than 2.5 GB, which comes out less than 11GB for memory index and 8GB RAM cache and 112MB memory cache index.

              Yes, mention is made of "hot objects" exceeding cache allocation. But I've got 3GB (5, supposedly, at current cache fill) of RAM and the swap disk if things are really going to get exciting, and I find it hard to believe that the full 32GB of swap has been used up, given that swap use is normally 0, or very close to it, and I've never seen it above 12%

              Mostly, what I notice is that 2.0.3 handled the exception "gracefully" while 2.1 kills squid, does not restart it, and runs like cold molasses (ie, very slowly) if I restart squid without rebooting the whole system…

              pfSense on i5 3470/DQ77MK/16GB/500GB

              1 Reply Last reply Reply Quote 0
              • K
                kejianshi
                last edited by

                OK - But my install was blowing up even on 2.03 with excessively large HDD cache.  When I started allocating no more than 20x my available RAM to HDD cache my system became very reliable.  And that was the only change I made, so that was for sure the issue.  I calculate available RAM for me to be no more than 1/2 total RAM since I dedicate 1/2 the total RAM to RAM cache already.

                For me anyway.  Thats also why I switched to a 64GB SSD.  I realized I had no practical use for more HDD unless I also seriously upgraded my RAM.

                I'm on 2.1 now and its rock solid with these same settings.

                Uptime 12 Days 20 Hours 37 Minutes 19 Seconds  (not that thats amazing long, but its long enough to suspect there are no problems)

                1 Reply Last reply Reply Quote 0
                • K
                  kejianshi
                  last edited by

                  P.S.  "167GB used on that date"  -  Yeah.  That probably means you are caching lots of really small objects also.  In that case, even a 120GB HDD cache might put you close to the ragged edge.

                  I'd have figured somewhere in that range it would crash or at least start swapping really annoyingly.

                  1 easy fix is double your RAM if you a married to a really big HDD cache.

                  1 Reply Last reply Reply Quote 0
                  • E
                    Ecnerwal
                    last edited by

                    kejianshi - What is your RAM amount?

                    Back on the network in question, updated numbers:

                    182 GB this morning.

                    78% memory use, 0% swap per dashboard (this is at 7GB RAM cache still, down one from the previous setup, but only up for 25 hours or so since reboot)

                    Dropped disk cache size to 220GB, moved min object size on disk up to 4, dropped the watermarks 1% each.

                    We'll see how that plays out, and reduce further if need be. Or buy more RAM, but to some extent my focus is on getting the maximum use from what I have (quite a bit) rather than simply throwing more at it.

                    I've found very little on tuning things like the number of level 1 subdirectories - the built-in commentary is rather limited on the subject, and most other things I can find about squid are similar. Likewise choice of hard disk cache system…or min object size on disk, for that matter.

                    pfSense on i5 3470/DQ77MK/16GB/500GB

                    1 Reply Last reply Reply Quote 0
                    • K
                      kejianshi
                      last edited by

                      Oddly - I have found that if anything 2.1 is easier on my memory.  Doesn't go up in the 80% very often now and the disk cache seems to purge alot more than it used to.

                      I'm only running 4GB of ram on this one, but the math on cache is simple and fairly constant.

                      For you, I'd keep an eye on exactly when you start having problems.  I wouldn't set the HDD cache size to prevent the system from running out of swap.  I'd set it up to avoid swapping at all. My swap size is 8GB but if I were using even 1KB of that, I'd assume something wasn't optimal and adjust down.

                      1 Reply Last reply Reply Quote 0
                      • E
                        Ecnerwal
                        last edited by

                        I have cut down both disk and RAM cache, and it's at least not crapping out entirely anymore or so far. But it has made me look in the logs a bit more.

                        One thing I noticed in the logs was a complaint about number of url_rewrite_children, so I bumped that from 5 to 50 (custom options in the GUI - I was going to mess about with the conf file but noticed that it was already in there set to 5, so I set it to 50) but it does not appear to be working - in that I'm still getting complaints that would make it seem it's set to 5. 50 is what it says in /usr/local/etc/squid/squid.conf, however, so the option is being passed from the GUI.

                        The main system log has this:
                        Oct 9 09:47:43 squid[33282]: The url_rewriter helpers are crashing too rapidly, need help!

                        and at the time of those, this in squid cache log

                        2013/10/09 09:47:43| WARNING: url_rewriter #5 (FD 22) exited
                        2013/10/09 09:47:43| WARNING: url_rewriter #3 (FD 18) exited
                        2013/10/09 09:47:43| WARNING: url_rewriter #4 (FD 19) exited
                        2013/10/09 09:47:43| Too few url_rewriter processes are running
                        FATAL: The url_rewriter helpers are crashing too rapidly, need help!
                        
                        

                        While the squid cache log has lots of this (and```
                        url_rewrite_children 50

                        2013/10/11 08:34:47| WARNING: All url_rewriter processes are busy.
                        2013/10/11 08:50:59| WARNING: All url_rewriter processes are busy.
                        2013/10/11 08:53:10| WARNING: All url_rewriter processes are busy.
                        2013/10/11 08:53:10| Consider increasing the number of url_rewriter processes to at least 16 in your config file.
                        2013/10/11 08:54:04| WARNING: All url_rewriter processes are busy.
                        2013/10/11 08:54:04| Consider increasing the number of url_rewriter processes to at least 15 in your config file.

                        pfSense on i5 3470/DQ77MK/16GB/500GB

                        1 Reply Last reply Reply Quote 0
                        • K
                          kejianshi
                          last edited by

                          Today my RAM used is at 90%.  Still haven't hit the swap ever and no lockups or crashes.  For me, pfsense works like a champ so long as I don't get silly with the cache settings.  Its been 25 days since a restart, shutdown or reboot.

                          I'm glad yours is working well now.

                          1 Reply Last reply Reply Quote 0
                          • K
                            kejianshi
                            last edited by

                            BTW, what is your RAM and what are your current cache settings?  (Incase someone later wants to know what is working)

                            1 Reply Last reply Reply Quote 0
                            • E
                              Ecnerwal
                              last edited by

                              RAM is 16 GB, as stated in my signature (and a few times higher in the thread), as it has been since I actually deployed in February 2013 (two identical systems on different networks, second one deployed in July 2013 - some i386 fiddling before building the 64bit systems for serious use.) One is still running 2.0.3

                              Cache detuned to 7GB RAM and 160 GB disk with the post-2.1 troubles.
                              Typical daily throughput in the 30GB range, 150-175 users (per lightsquid - about 100 people, but people with multiple devices get counted twice or three times)

                              I am a bit disturbed by the url_rewrite_chrildren being set to 50, but asking to be set to 15-16 as though they are still at 5. 50 was picked since the largest number I found was 52, while most were suggesting the teens, so it seemed like it would cover most of the issues with that - but it's not clear that it's "taking."

                              I also added the vm.pmap.shpgperproc tunable in system tunables due to complaints in the main system log about "Approaching the limit on PV entries" and raised it from 200 to 500.

                              $ cat /usr/local/etc/squid/squid.conf
                              # Do not edit manually !
                              http_port 172.XX.XX.1:3128
                              http_port 127.0.0.1:3128 transparent
                              icp_port 0
                              
                              pid_filename /var/run/squid.pid
                              cache_effective_user proxy
                              cache_effective_group proxy
                              error_directory /usr/pbi/squid-amd64/etc/squid/errors/en
                              icon_directory /usr/pbi/squid-amd64/etc/squid/icons
                              visible_hostname localhost
                              cache_mgr admin@localhost
                              access_log /var/squid/log/access.log
                              cache_log /var/squid/log/cache.log
                              cache_store_log none
                              logfile_rotate 14
                              shutdown_lifetime 3 seconds
                              # Allow local network(s) on interface(s)
                              acl localnet src  172.XX.XX.0/255.255.192.0
                              uri_whitespace strip
                              
                              cache_mem 7200 MB
                              maximum_object_size_in_memory 3200 KB
                              memory_replacement_policy heap LFUDA
                              cache_replacement_policy heap LFUDA
                              cache_dir aufs /squid/cache 160000 128 256
                              minimum_object_size 4 KB
                              maximum_object_size 4000000 KB
                              offline_mode off
                              cache_swap_low 50
                              cache_swap_high 80
                              
                              # No redirector configured
                              
                              # Setup some default acls
                              acl all src 0.0.0.0/0.0.0.0
                              acl localhost src 127.0.0.1/255.255.255.255
                              acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901  3128 1025-65535 
                              acl sslports port 443 563  
                              acl manager proto cache_object
                              acl purge method PURGE
                              acl connect method CONNECT
                              acl dynamic urlpath_regex cgi-bin \?
                              cache deny dynamic
                              http_access allow manager localhost
                              
                              # Allow external cache managers
                              acl ext_manager_1 src 127.0.0.1 
                              http_access allow manager ext_manager_1
                              acl ext_manager_2 src 172.XX.XX.1 
                              http_access allow manager ext_manager_2
                              
                              http_access deny manager
                              http_access allow purge localhost
                              http_access deny purge
                              http_access deny !safeports
                              http_access deny CONNECT !sslports
                              
                              # Always allow localhost connections
                              http_access allow localhost
                              
                              quick_abort_min 4000 KB
                              quick_abort_max 0 KB
                              quick_abort_pct 40
                              request_body_max_size 0 KB
                              reply_body_max_size 0 deny all
                              delay_pools 1
                              delay_class 1 2
                              delay_parameters 1 -1/-1 -1/-1
                              delay_initial_bucket_level 100
                              # Throttle extensions matched in the url
                              acl throttle_exts urlpath_regex -i "/var/squid/acl/throttle_exts.acl"
                              delay_access 1 allow throttle_exts
                              delay_access 1 deny all
                              
                              # Custom options
                              redirect_program /usr/pbi/squidguard-amd64/bin/squidGuard -c /usr/pbi/squidguard-amd64/etc/squidGuard/squidGuard.conf
                              redirector_bypass off
                              url_rewrite_children 50
                              # Setup allowed acls
                              # Allow local network(s) on interface(s)
                              http_access allow localnet
                              # Default block all to be sure
                              http_access deny all
                              

                              pfSense on i5 3470/DQ77MK/16GB/500GB

                              1 Reply Last reply Reply Quote 0
                              • First post
                                Last post
                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.