Netgate Discussion Forum

    Varnish stops working after few days

    pfSense Packages
      xudus

      Hi guys,

      Got a fresh install of 2.0.1 with pfBlocker, pfflowd, squid, squidGuard and varnish. On average, CPU usage is no greater than 30%, RAM is at 16%, swap at 0% and HDD at 4%. After a reboot everything works as it should. However, anywhere from a few hours to a few days after the reboot, Varnish stops passing headers and routing to the different servers. Restarting the service (from the GUI) doesn't fix it. Only after a pfSense restart does everything start working again… and after a few hours or days it does the same thing.

      BTW this is a second box (same pfs version + packages) that has this behavior.

      Any ideas?

        marcelloc

        I've been using varnish2 in production for months without any restart.

        Are there any alerts in the logs, and could you check whether the varnish daemon is still up after those few hours or days?

        Did you check whether any other service is running on the same port, such as the pfSense GUI or its redirect rule?
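A rough sketch of the port check described above, assuming FreeBSD's `sockstat -4 -l` column layout (USER, COMMAND, PID, FD, PROTO, LOCAL ADDRESS, FOREIGN ADDRESS). The function name `listeners_on_port` is made up for this illustration and is not part of pfSense or the Varnish package.

```shell
#!/bin/sh
# Hypothetical helper: reads `sockstat -4 -l` output on stdin and prints
# "COMMAND LOCAL-ADDRESS" for every TCP socket listening on the given port.
listeners_on_port() {
    port="$1"
    awk -v port="$port" '
        # $5 is the protocol column, $6 the local address (e.g. *:80)
        $5 ~ /^tcp/ && $6 ~ (":" port "$") { print $2, $6 }
    '
}

# Usage on the firewall itself:
#   sockstat -4 -l | listeners_on_port 80
```

If anything other than `varnishd` shows up for port 80, that service is competing with Varnish for the port.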

        Treinamentos de Elite: http://sys-squad.com

        Help a community developer! ;D

          xudus

          marcelloc, thank you for the quick reply. Looking at the output (see below), varnish is the only service on port 80 and it is running. No alerts that I can see.

          There are 4 domains and 3 internal HTTP servers. No matter which domain I try to access from the outside, it always takes me to the first server configured under backends. Even after I removed the first backend, it still takes me to it no matter which domain I'm trying to access!

          Any thoughts?

          
          [2.0.1-RELEASE][root@xld.noc]/root(10): sockstat -4 -l
          USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
          root     sshd       28374 5  tcp4   *:22                  *:*
          root     lighttpd   36318 10 tcp4   *:8088                *:*
          dhcpd    dhcpd      9264  16 udp4   *:67                  *:*
          dhcpd    dhcpd      9264  20 udp4   *:20148               *:*
          nobody   dnsmasq    1966  3  udp4   *:53                  *:*
          nobody   dnsmasq    1966  4  tcp4   *:53                  *:*
          nobody   dnsmasq    1966  10 udp4   *:35409               *:*
          root     php        5813  10 udp4   *:*                   *:*
          root     php        53032 10 udp4   *:*                   *:*
          root     php        10759 10 udp4   *:*                   *:*
          root     php        24700 10 udp4   *:*                   *:*
          proxy    squid      26703 13 udp4   *:60333               *:*
          proxy    squid      26703 19 tcp4   192.168.192.2:3128    *:*
          proxy    squid      26703 20 tcp4   192.168.77.2:3128     *:*
          proxy    squid      26703 21 tcp4   192.168.71.1:3128     *:*
          proxy    squid      26703 22 tcp4   127.0.0.1:3128        *:*
          proxy    squid      26703 23 udp4   *:4827                *:*
          proxy    squid      26703 26 udp4   *:3401                *:*
          nobody   varnishd   58060 6  tcp4   *:80                  *:*
          root     varnishd   57880 4  tcp4   127.0.0.1:81          *:*
          root     syslogd    10165 14 udp4   *:514                 *:*
          root     bsnmpd     59549 9  udp4   *:*                   *:*
          root     bsnmpd     59549 10 udp4   *:161                 *:*
          _ntp     ntpd       6099  10 udp4   192.168.171.2:123     *:*
          _ntp     ntpd       6099  11 udp4   192.168.192.2:123     *:*
          _ntp     ntpd       6099  13 udp4   192.168.71.1:123      *:*
          _ntp     ntpd       6099  14 udp4   192.168.177.2:123     *:*
          root     miniupnpd  50392 10 tcp4   *:2189                *:*
          root     miniupnpd  50392 11 udp4   *:1900                *:*
          root     miniupnpd  50392 12 udp4   192.168.192.2:55451   *:*
          root     php        63861 10 udp4   *:*                   *:*
          root     php        62960 10 udp4   *:*                   *:*
          root     inetd      50089 10 udp4   127.0.0.1:6969        *:*
          [2.0.1-RELEASE][root@xld.noc]/root(11): ps aux | grep varnish
          root   57880  0.0  4.1 85968 84512  ??  Ss   12:03PM   0:01.84 varnishd: Varnish-Mgr xld.noc (varnishd)
          nobody 58060  0.0  4.1 92664 85452  ??  I    12:03PM   0:24.59 varnishd: Varnish-Chld xld.noc (varnishd)
          root   46966  0.0  0.1  3524  1260   0  S+    4:06PM   0:00.01 grep varnish
          
          
            xudus

            Something is really screwed up. I disabled varnish in the GUI and I'm still able to access the server behind pfSense! Mind you, I still have the original issue of being presented with the first backend regardless of the domain being accessed.

              marcelloc

              Check whether there is a leftover NAT rule that forwards requests to this first server.

              With varnish stopped, you shouldn't be able to reach the backends.

              Also check under System -> Advanced that the webgui redirect rule is disabled.

                xudus

                No NAT and webgui is disabled.

                As I stated in the first post, everything works properly after a reboot. This system was running for almost two weeks without issues.

                Not sure if it matters, but varnish was installed first, and squid & squidguard were installed two weeks later. Could the order of installation make a difference?

                  marcelloc

                  @xudus:

                  Not sure if it matters, but varnish was installed first, and squid & squidguard were installed two weeks later. Could the order of installation make a difference?

                  Probably not.
                  There is something really weird about this setup. With varnish stopped, you can still get to port 80, so what is forwarding the requests to the internal host?

                    xudus

                    I had the same behavior on the previous box, so I rebuilt the whole solution on a new box. The issue showed up on the new build too. That's two different boxes with this bug.

                    Just rebooted pfSense. Everything is working. The question is for how long. I have a feeling that it has to do with squid/squidguard. I guess next time it happens, I'll remove squid/squidguard and see if it makes a difference.

                      xudus

                      It happened again! 10 days after the reboot, varnish is malfunctioning.

                        canefield

                        Dear all,

                        I'm running into the same problem: after a couple of days Varnish stops and won't come back online until pfSense is rebooted. Then, a couple of days later, the system again stops responding to my external requests. How come?

                        Thanks,
                        Canefield

                          marcelloc

                          Is there any log entry, alert, or message during a manual service restart that would help identify this problem?

                          I've been using it on amd64 for a long time without crashes.

                          Regards,
                          Marcello Coutinho

                            canefield

                            Yes, after a couple of tests the following error shows up:

                            "php: : The command '/usr/local/etc/rc.d/varnish.sh' returned exit code '2', the output was '
                            kern.ipc.nmbclusters: 65536
                            sysctl: kern.ipc.nmbclusters: Invalid argument
                            kern.ipc.somaxconn: 16384 -> 16384
                            kern.maxfiles: 131072 -> 131072
                            kern.maxfilesperproc: 104856 -> 104856
                            kern.threads.max_threads_per_proc: 4096 -> 4096
                            NB: Storage size limited to 2GB on 32 bit architecture,
                            NB: otherwise we could run out of address space.
                            Message from VCC-compiler:
                            Reference to unknown backend 'CANLB' at
                            ('input' Line 55 Pos 28)
                            .backend = CANLB;
                            ---------------------------###########-
                            In director specification starting at:
                            ('input' Line 53 Pos 1)
                            director CA client {
                            ########------------
                            Running VCC-compiler failed, exit 1
                            VCL compilation failed'"

                            Thanks a lot,
                            Canefield

                              marcelloc

                              canefield,

                              something is messing up the config. The director CA refers to a backend named CANLB that the VCL compiler cannot find:

                              Message from VCC-compiler: Reference to unknown backend 'CANLB' at ('input' Line 55 Pos 28) .backend = CANLB; ---------------------------###########-
                              In director specification starting at: ('input' Line 53 Pos 1) director CA client { ########------------ Running VCC-compiler failed, exit 1 VCL compilation failed
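One way to spot this kind of mismatch before restarting the service is to compare the names used in `.backend = NAME;` (and `req.backend = NAME;`) against the `backend` and `director` declarations in the generated VCL. The sketch below is only a text-level approximation, not a VCL parser; the function name `undeclared_backends` is hypothetical, and the path of the VCL file that the pfSense package generates is not given in this thread, so it is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: print every name that is referenced as a backend
# in the VCL file "$1" but never declared with "backend NAME {" or
# "director NAME <type> {". Inline backend definitions are ignored.
undeclared_backends() {
    vcl="$1"
    # Names declared at the start of a line as a backend or a director.
    declared=$(sed -n \
        -e 's/^[[:space:]]*backend[[:space:]]\{1,\}\([A-Za-z0-9_]\{1,\}\).*/\1/p' \
        -e 's/^[[:space:]]*director[[:space:]]\{1,\}\([A-Za-z0-9_]\{1,\}\).*/\1/p' \
        "$vcl")
    # Names referenced through ".backend = NAME;".
    sed -n 's/.*\.backend[[:space:]]*=[[:space:]]*\([A-Za-z0-9_]\{1,\}\)[[:space:]]*;.*/\1/p' "$vcl" |
    sort -u |
    while read -r name; do
        printf '%s\n' "$declared" | grep -qxF "$name" || printf '%s\n' "$name"
    done
}

# Usage (the path is a placeholder):
#   undeclared_backends /path/to/generated.vcl
```

For a config like canefield's, this would be expected to print `CANLB`, pointing at a backend that was renamed or removed in the GUI while a director still refers to it.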

                                xudus

                                marcelloc, is there any particular command that I could run that would help with finding the root cause?

                                TIA,
                                Dave

                                  marcelloc

                                  xudus,

                                  Try running the startup command from the console or over SSH:

                                  /usr/local/etc/rc.d/varnish.sh restart

                                    xudus

                                    This is what I'm getting:

                                    
                                    kern.ipc.nmbclusters: 65536
                                    sysctl: kern.ipc.nmbclusters: Invalid argument
                                    kern.ipc.somaxconn: 16384 -> 16384
                                    kern.maxfiles: 131072 -> 131072
                                    kern.maxfilesperproc: 104856 -> 104856
                                    kern.threads.max_threads_per_proc: 4096 -> 4096
                                    storage_malloc: max size 128 MB.
                                    Using old SHMFILE
                                    
                                    
                                      marcelloc

                                      @xudus:

                                      This is what I'm getting:

                                      There are no fatal varnish errors in this log, so it should be running.

                                        xudus

                                        Sorry, it is running because I just restarted pfSense. I'll post the output the next time it malfunctions.

                                          xudus

                                          marcelloc, it did it again.

                                          The output from /usr/local/etc/rc.d/varnish.sh restart is the same as before (no errors). Is there any other place I could poke around to see what is breaking varnish?

                                            marcelloc

                                            xudus,

                                            Check with netstat -an whether the varnish ports are still up.
                                            Check with ps ax | grep -i varnish whether varnish is still running.

                                            You can also create a cron job that restarts varnish every two days, for example, to work around this random error.

                                            Regards,
                                            Marcello Coutinho
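A minimal sketch of the process check and the restart workaround suggested above. The function names are invented for illustration; only `/usr/local/etc/rc.d/varnish.sh restart` comes from this thread. Note that xudus reported a GUI restart did not help in his case, so this is a workaround to try rather than a confirmed fix.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `ps ax` listing on stdin contains
# a varnishd process (the grep command itself is filtered out).
varnish_running() {
    grep -v 'grep' | grep -q 'varnishd'
}

# Hypothetical wrapper: restart through the package rc script only when
# no varnishd process is found.
restart_if_down() {
    if ps ax | varnish_running; then
        echo "varnishd is running"
    else
        echo "varnishd not found, restarting"
        /usr/local/etc/rc.d/varnish.sh restart
    fi
}

# Example system-crontab style line for an unconditional restart at 04:00
# on every second day of the month (days 1, 3, 5, ...), so the interval is
# not exactly two days across month boundaries:
#   0 4 */2 * * root /usr/local/etc/rc.d/varnish.sh restart
```

How the cron entry is installed on pfSense (for example through a cron management package) is left open here, since the thread does not say.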

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.