Netgate Discussion Forum

    Squid 100% CPU every hour since it's been started

      mpreynolds

      Hi all,

      I'm using pfSense 2.2.2 and I've recently installed Squid 3.4.10_2 and set it to transparent mode. The problem I'm facing is that every hour after the process starts, Squid uses 100% CPU for about 2.5 minutes. While this is happening, Squid stops responding and no one can browse the internet. As the CPU usage drops, everything starts working again… until an hour later, like clockwork.

      While this happens, if I try using "squidclient mgr:info" I get the message saying "Sending HTTP request ... done." and it hangs until the process becomes responsive again and then I receive a message saying "Alarm Clock".

      So obviously this alarm clock is going off every hour to tell the process to do something; the problem is I have no idea what. Can anyone shed some light on this, or suggest what I can try to figure out what Squid is doing?

      Also in the cache.log, at the time Squid starts working again, I get a message saying "Select loop Error. Retry 1".

      I do have a fairly large cache, at 50GB, because I'm caching both Windows and Apple updates. Could Squid be doing some kind of hourly check and struggling with a cache that large, or with the drive speed (it's a Samsung mSATA SSD in an APU box)? I switched the cache from UFS to AUFS in case that was the cause, but it hasn't made any apparent difference.

      I guess at this point I'm just trying to figure out what exactly Squid is doing every hour that's causing this.

      Thanks!
      Matt
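
One way to narrow down when Squid stops answering is to poll it on a schedule and log every failed probe. The sketch below is an illustration, not something from the thread: the function names `probe` and `watch`, the 15-second interval, and the 10-second timeout are all assumptions. Note also that "Alarm clock" is how a shell typically reports a process terminated by SIGALRM, so the message may come from squidclient's own timeout rather than from an hourly timer inside Squid; that reading is a guess.

```python
import subprocess
import sys
import time


def probe(cmd, timeout_s=10.0):
    """Run cmd once and return (responded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=True,
        )
        responded = True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # Timed out, exited non-zero, or could not be started at all.
        responded = False
    return responded, time.monotonic() - start


def watch(cmd, interval_s=15.0, timeout_s=10.0, rounds=None):
    """Probe repeatedly; write a timestamped line to stderr for each failure."""
    done = 0
    while rounds is None or done < rounds:
        responded, elapsed = probe(cmd, timeout_s)
        if not responded:
            stamp = time.strftime("%Y/%m/%d %H:%M:%S")
            print(f"{stamp} no reply after {elapsed:.1f}s", file=sys.stderr)
        time.sleep(interval_s)
        done += 1
```

Calling `watch(["squidclient", "mgr:info"])` on the firewall would leave a list of timestamps that can be lined up against cache.log and the CPU spikes.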

        duanes

        This question has been repeatedly asked, but there has never been a single response.  It is consistent and repeatable across installations.

        It has been a problem on every version of pfSense since I started using pfSense 1.9 in 2010, including current (Nov. 2015) releases.

        Squid by itself does not have the issue (or it is not as noticeable).  If Squidguard is enabled, then every hour (the exact time depends on when squidguard was started), the Squid process goes to 100% CPU for 30-40 seconds.  During this time, no internet traffic is passed.

        Also, if ShallaBlack list is enabled, the hang is 2-3 times longer.

        For the same reasons as above, I originally had the hd cache set for 100GB.  But, I have tried reducing this all the way down to 500MB.

        PS, this is running on a very capable Dell R210 with dual core Xeon, 16GB RAM and a 500GB Black drive.

          kivimart

          I have set the "Memory Cache Size" to 16 MB.

          This fixes the problem with the internet hang every hour.

          My cache is now 50GB of a total of 120GB.

            duanes

            This is the first reasonable answer I have heard.  THANKS!

            I tried greatly reducing the HD cache size to 1GB and the RAM cache to 500MB.  That helped, but it still hangs for 20-25s every hour.

            I will try a very small RAM cache… although a RAM cache is partly why I bought 16GB of RAM.

              duanes

              It took a couple days for the problem to come back, but it DOES still happen.  However…. The hang time is greatly reduced to about 10 seconds, but it is still present.

              Squid settings - Local HD Cache 1000Mb (UFS), Max Object size=4mb.  Local RAM Cache 1000Mb, Max Obj size=256kb.

              I have just reduced the Ram to 10Mb and max obj size to 100Kb.

              I will post any updates, but it will probably take a few days to get a consistent result.

                cracker1985

                Hello,

                Do you use an auth helper, such as NTLM auth?

                  Derelict LAYER 8 Netgate

                  @duanes:

                  It took a couple days for the problem to come back, but it DOES still happen.  However…. The hang time is greatly reduced to about 10 seconds, but it is still present.

                  Squid settings - Local HD Cache 1000Mb (UFS), Max Object size=4mb.  Local RAM Cache 1000Mb, Max Obj size=256kb.

                  I have just reduced the Ram to 10Mb and max obj size to 100Kb.

                  I will post any updates, but it will probably take a few days to get a consistent result.

                  For the record: b = bits, B = bytes. Which is it?

                  Chattanooga, Tennessee, USA
                  A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                  DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                  Do Not Chat For Help! NO_WAN_EGRESS(TM)

                    duanes

                    Sorry, B for bytes.

                      duanes

                      No authentication is used for the proxy.  All users have the same access level.

                      So… after a few days of running, I am not seeing any of the hangs we had before.  The key points are a very small HD and RAM cache, and high/low water marks set very close together (95/94%).

                      I am now going to boost the HD cache to 80GB and the max HD cache object size to 1000MB.  I will give it some time and see what happens.

                        kemecs

                        Hello!

                        Has it improved?
                        I've had the same problem for a long time,
                        even though I use the latest version of everything.

                        2016/01/04 07:43:14 kid1| Select loop Error. Retry 1
                        2016/01/04 08:43:14 kid1| Select loop Error. Retry 1
                        2016/01/04 09:43:14 kid1| Select loop Error. Retry 1
                        2016/01/04 10:43:13 kid1| Select loop Error. Retry 1
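
The hourly pattern in those cache.log lines can be checked mechanically. A minimal sketch, assuming every log line starts with a 19-character `YYYY/MM/DD HH:MM:SS` timestamp as in the excerpt; the helper names `error_times` and `intervals_seconds` are made up for illustration.

```python
from datetime import datetime

# The four lines quoted in the post above.
LOG_LINES = """\
2016/01/04 07:43:14 kid1| Select loop Error. Retry 1
2016/01/04 08:43:14 kid1| Select loop Error. Retry 1
2016/01/04 09:43:14 kid1| Select loop Error. Retry 1
2016/01/04 10:43:13 kid1| Select loop Error. Retry 1
""".splitlines()


def error_times(lines, needle="Select loop Error"):
    """Return the timestamps of all lines containing needle."""
    times = []
    for line in lines:
        if needle in line:
            stamp = line.strip()[:19]  # "YYYY/MM/DD HH:MM:SS"
            times.append(datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Return the gap in whole seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = intervals_seconds(error_times(LOG_LINES))
print(gaps)  # [3600, 3600, 3599]
```

Gaps this close to 3600 seconds point to a timer relative to process start rather than to load-dependent behaviour, which matches what other posters describe.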

                          duanes

                          Yes - The HD cache seems to work fine, however if I try to use the RAM cache in any meaningful capacity, then I get the same problems.

                          Also, the problem DOES seem to appear again if the HD cache is too large.  My system is a Dell R210 (not a heavy-duty machine, but it has decent IO).  I have limited the HD cache to about 50GB.  The trick is to ensure that the high-water and low-water marks are only 1 percentage point apart.  Mine are set at 95% and 94%.  The hang seems to be in clearing the cache and it is VERY intrusive.

                          Finally, I have the RAM cache set to 1MB and the max object size to 1KB (minimum settings).  I have 16GB of RAM available, but any time I start increasing the RAM cache and max object size to 1MB, the hang starts showing up in a matter of hours (or a few days for a 4GB RAM cache setting).  I believe it to be the same process in which old items are flushed hourly; however, it is EXCRUCIATINGLY slow, even though it is a RAM-based operation.

                          So, I've backed my physical RAM down to 4GB, and set the RAM cache to 1MB and the max object size to 1KB.  The HD is 100GB, but I have the HD cache set to 50GB with 1GB as the max object size (this will cache virtually all software updates).  I have the cache policy set to keep the largest items longer and use diskd as the drive access method.

                          So far, the hangs have not returned after 3+ weeks of operation.
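
To put numbers on the water-mark advice: Squid's documentation for `cache_swap_low`/`cache_swap_high` says replacement begins once disk usage passes the low mark and gets more aggressive toward the high mark, and that on a large cache it may make sense to set the two closer together. The sketch below only does the arithmetic for a 50GB cache, treated as 50,000 MB (an assumption about how the size is entered). Whether this window actually determines the length of the hang is not established in this thread. The function name is hypothetical.

```python
def watermark_window_mb(cache_size_mb, low_pct, high_pct):
    """Return (low_mb, high_mb, window_mb) for the given water marks."""
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("need 0 <= low_pct <= high_pct <= 100")
    low_mb = cache_size_mb * low_pct / 100
    high_mb = cache_size_mb * high_pct / 100
    return low_mb, high_mb, high_mb - low_mb


# Compare Squid's default marks (90/95) with the 94/95 used in the post above.
for low, high in [(90, 95), (94, 95)]:
    low_mb, high_mb, window = watermark_window_mb(50_000, low, high)
    print(f"{low}/{high}: eviction between {low_mb:.0f} MB and {high_mb:.0f} MB "
          f"({window:.0f} MB window)")
```

With 90/95 the window is 2,500 MB; with 94/95 it shrinks to 500 MB, so each pass has far less data it could be asked to evict.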

                            kemecs

                            My options now look like this.

                            My "Proxy Server: Cache Management" settings:

                            Squid Cache General Settings:
                            Low-Water Mark in % = 93
                            High-Water Mark in % = 95

                            Squid Hard Disk Cache Settings:
                            Hard Disk Cache Size = 60000
                            Hard Disk Cache System = ufs
                            Level 1 Directories = 8
                            Minimum Object Size = 32
                            Maximum Object Size = 256

                            Squid Memory Cache Settings:
                            Memory Cache Size = 5120
                            Maximum Object Size in RAM = 512
                            Memory Replacement Policy = Heap GDSF

                            What exactly do you have set?

                            thx

                              duanes

                              I've been working on this for quite some time -

                              I started getting the hangs again when I upped the RAM cache, so I keep the RAM cache size to an absolute minimum (which is sad, because I specifically bought a ton of RAM thinking it would be better than HD cache).  Also, the low/high water marks need to be as close together as possible.  Apparently, the hourly trash collection process runs at a high priority and prevents all other activity, or maybe locks something.  Either way, it is very intrusive.

                              These settings have been running without the hang for 33 days.

                              Squid Cache General Settings:
                              Low-Water Mark in % = 94
                              High-Water Mark in % = 95

                              Squid Hard Disk Cache Settings:
                              Cache Replacement Policy: Heap LFUDA
                              Hard Disk Cache Size = 80000
                              Hard Disk Cache System = diskd
                              Level 1 Directories = 16
                              Minimum Object Size = 0
                              Maximum Object Size = 2000

                              Squid Memory Cache Settings:
                              Memory Cache Size = 300
                              Maximum Object Size in RAM = 4
                              Memory Replacement Policy = Heap GDSF
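
For anyone comparing against a hand-written squid.conf, the GUI values above correspond roughly to the directives rendered below. This is a sketch under several assumptions that the thread does not confirm: the units of each GUI field (MB for the disk cache, disk object maximum, and memory cache; KB for the minimum object size and the in-RAM maximum), the cache path `/var/squid/cache`, and 256 level-2 directories. The directive names are standard Squid ones, but the file the pfSense package actually generates may differ.

```python
# GUI values from the post above; keys and units are assumptions (see lead-in).
SETTINGS = {
    "low_water_pct": 94,
    "high_water_pct": 95,
    "disk_policy": "heap LFUDA",
    "disk_system": "diskd",
    "cache_path": "/var/squid/cache",  # assumed path, not stated in the thread
    "disk_cache_mb": 80000,
    "l1_dirs": 16,
    "l2_dirs": 256,                    # assumed; not shown in the post
    "min_object_kb": 0,
    "max_object_mb": 2000,
    "mem_cache_mb": 300,
    "max_mem_object_kb": 4,
    "mem_policy": "heap GDSF",
}


def render_squid_conf(s):
    """Render the settings dict as squid.conf directive lines."""
    return "\n".join([
        f"cache_swap_low {s['low_water_pct']}",
        f"cache_swap_high {s['high_water_pct']}",
        f"cache_replacement_policy {s['disk_policy']}",
        # Object size limits are placed before cache_dir so they apply to it.
        f"minimum_object_size {s['min_object_kb']} KB",
        f"maximum_object_size {s['max_object_mb']} MB",
        f"cache_dir {s['disk_system']} {s['cache_path']} "
        f"{s['disk_cache_mb']} {s['l1_dirs']} {s['l2_dirs']}",
        f"cache_mem {s['mem_cache_mb']} MB",
        f"maximum_object_size_in_memory {s['max_mem_object_kb']} KB",
        f"memory_replacement_policy {s['mem_policy']}",
    ])


conf = render_squid_conf(SETTINGS)
print(conf)
```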

                                sporkrom

                                Hi,
                                I have exactly the same problem with pfSense 2.2.6, squid3 in transparent mode, squidguard and the squid virus check.
                                I bought the pfSense XG-1540 with 2x128GB SSD, 32GB RAM, 2x10GE, 6x1GE and one 1TB USB 3.0 external SSD drive (used only for the squid cache).
                                After changing the Squid memory cache settings to

                                Memory Cache Size = 300  (before 8092)
                                Maximum Object Size in RAM = 4 (before 1024)

                                the problem was fixed.

                                But now the appliance uses only 4GB RAM….
                                The performance is great, because of the SSD cache HDD.

                                Sincerely
                                Roman

                                  kemecs

                                  Unfortunately, the problem has not disappeared.
                                  I tried again yesterday.
                                  The new pfSense (FreeBSD 10.3-RELEASE) does not help with this problem.

                                  It is interesting that it occurs every 60 minutes.

                                  Sincerely
                                  kemecs

                                    duanes

                                    I am pretty certain that this is an hourly garbage collection issue in Squid and that there is no way to overcome it.  For some reason the garbage collection runs as a blocking thread and stops all network traffic.  Additionally, all existing connections are dropped.  Finally, the collection runs every 60 minutes counted from boot time.

                                    I do see that having a memory cache of ANY size greatly increases the hang time.  There are a number of complaints about this around on the internet, but none of the responders seem to really grasp the problem.

                                    I have also found that I had to limit my squid HD cache size to about 40GB.  I wanted a larger cache to hold all of the MS updates, AV updates and other various files that tend to be large and repetitive.  Alas, I believe that I am stuck with the problem for now.

                                      aGeekhere

                                      Same problem. Is there any chance this garbage collection process can be set to low priority?
                                      100.00% (squid-1) -f /usr/local/etc/squid/squid.co

                                      This looks like a bug. Can someone file a new bug for this issue here: https://redmine.pfsense.org/projects/pfsense-packages

                                      I have posted the bug here https://redmine.pfsense.org/issues/6485

                                      Never Fear, A Geek is Here!

                                        aGeekhere

                                        For the hard disk cache: once it reaches about 30GB of 200GB, squid starts pulling a high load.

                                        
                                        last pid: 34059;  load averages:  0.65,  0.87,  0.87  up 10+04:05:45    22:20:31
                                        327 processes: 3 running, 307 sleeping, 17 waiting
                                        Mem: 127M Active, 2984M Inact, 450M Wired, 3688K Cache, 336M Buf, 349M Free
                                        
                                          PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
                                           11 root     155 ki31     0K    32K CPU0    0 236.6H  75.00% [idle{idle: cpu0}]
                                           11 root     155 ki31     0K    32K RUN     1 236.2H  68.16% [idle{idle: cpu1}]
                                         4349 squid     37    0   191M 89032K kqread  1  59:26  12.35% (squid-1) -f /usr/local/etc/squid/squid.co
                                         6952 root      52    0   262M 28748K piperd  1   0:01   5.57% php-fpm: pool nginx (php-fpm)
                                        10213 root      52    0   262M 29140K accept  0   0:00   2.29% php-fpm: pool nginx (php-fpm)
                                        79057 squid     20    0 37660K 13416K sbwait  0   0:03   0.29% (squidGuard) -c /usr/local/etc/squidGuard/
                                           12 root     -92    -     0K   272K WAIT    0  34:56   0.00% [intr{irq260: re1}]
                                           12 root     -92    -     0K   272K WAIT    1  25:14   0.00% [intr{irq261: re2}]
                                           12 root     -60    -     0K   272K WAIT    0  20:42   0.00% [intr{swi4: clock}]
                                           19 root      16    -     0K    16K syncer  0  10:05   0.00% [syncer]
                                            5 root     -16    -     0K    16K pftm    0   8:07   0.00% [pf purge]
                                           15 root     -16    -     0K    16K -       0   2:44   0.00% [rand_harvestq]
                                         4223 unbound   20    0 55640K 26308K kqread  0   2:40   0.00% /usr/local/sbin/unbound -c /var/unbound/un
                                        26898 root      20    0 30140K 17968K select  1   2:19   0.00% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.c
                                        23145 root      20    0 28608K  6416K kqread  0   2:04   0.00% nginx: worker process (nginx)
                                         6187 squid     20    0 37752K  3544K select  0   2:00   0.00% (pinger) (pinger)
                                        60390 squid     20    0 37752K  3544K select  0   1:48   0.00% (pinger) (pinger)
                                        40522 squid     20    0 37752K  3544K select  0   1:48   0.00% (pinger) (pinger)
                                        
                                        

                                        Going to try setting a 20GB cache.

                                        Update:
                                        The high load stops with a 20GB cache. Raising it to 30GB to see at what point the issue starts.

                                        Load averages: 0.24, 0.19, 0.08

                                          aGeekhere

                                          It looks like this issue has been reported in the Squid bug tracker:
                                          http://bugs.squid-cache.org/show_bug.cgi?id=4477
