Netgate Discussion Forum

    2.3 stops routing traffic every 1 or 2 days.

    General pfSense Questions
      denmly

      Hi.

      Like a few others, I'm having a problem where my pfSense FW stops routing traffic.

      I'm able to access the web interface from the internet and reboot the server, and then it comes up again with no problems.

      My initial setup:

      One VMware guest running pfSense 2.3 (i386) with 2 Xeon X5650 CPUs, 1 GB RAM and a 20 GB disk (primary FW).
      One HW box running 2.2.6 with a Geode(TM) Integrated Processor by AMD PCS CPU, 256 MB RAM and a 1 GB CF card (backup FW with CARP enabled).

      Then I reinstalled the primary FW with the x64 version of 2.3 and restored its configuration. I also installed a backup FW on VMware with version 2.3 and restored the backup FW configuration onto it, so my setup is now:

      Two pfSense FWs running 2.3, with CARP
      HW: VMware guest with 2 Xeon X5650 CPUs, 1 GB RAM and a 20 GB disk.

      Internet connection is a 1 Gbit connection.

      But the problem still happens every 1 or 2 days.

      This weekend I stopped an SNMP test from my Zabbix server, and then it ran one more day before it stopped passing traffic.

      This morning the FW stopped again, so I accessed the /status.php page and saved a copy of the status_output.tgz file.

      I'm unsure of what to try next…

      Regards Michael

        Guest

        I would try installing 2 x 4 GB of RAM in the server. Then you could try raising
        the mbuf size, which might prevent the problem if you are running
        out of space. And if the small amount of RAM itself is the problem, that is solved too.

        All data passing through the firewall's CPU also hits the memory system, and the current amount
        may not be enough for everything it has to do. It may not have been a problem in the past, but going
        forward I really think a few more GB of RAM would be a good investment.

          rlrobs

          I have the same problem with pfSense 2.3 on a Dell PowerEdge 2900.

          Dell PowerEdge 2900
          32 GB RAM
          Quad core
          HD SAS: 512 GB
          4 Intel interfaces.

          packages:
          Suricata
          PFBlocker
          Zabbix-agent-LTS
          OpenVPN Client Export.

            adam65535

            The Dell R320 system I updated from 2.1.5 to 2.3 also had a similar symptom using the igb driver.  It is the secondary of a pair of Dell R320s in an HA (primary/secondary failover) setup.  The primary is still on 2.1.5.  The secondary ran fine as master on 2.3 for about 10 hours or so, and then connections to some IPs stopped working.  It was weird because I couldn't ping some systems on the network but I could ping others.  Same thing with remote systems: some I could ping and others I couldn't.  I could not ping the ISP's router (pfSense's default route) but I could pass traffic through it.  Network traces on the other systems and routers showed that packets went out and were sent back, but pfSense didn't see them (or dropped them?).

            Interrupts went to about 30% when the problem started around 5am several days ago.  Even after I moved traffic back to the primary (CARP), the interrupts were still pegged at 30% although no traffic was going through the secondary (that I could tell).  Unfortunately I didn't think to do a netstat -i to see which IRQs were maxed or to look at dropped packets, as it was very early in the morning for me.  Keep in mind that this is a backup site that is not active, so not much traffic goes through it except transaction logs, etc.

            I noticed that I still had hw.igb.num_queues set to 2, from trying to optimize the drivers on pfSense 2.1.5 to limit nmbclusters as far as I remember (my memory is not that good though :)).  It seemed like a big coincidence that 2 is half the number of CPUs on the system and that maybe around half or 1/4 of the connections were failing (hyperthreading disabled in BIOS).  The driver was creating 4 queues according to netstat -i.

            I removed that setting and am hoping it was related.  I will be doing tests during business hours over the next week or so to try to determine whether the problem is resolved.

            Do you have num_queues set also by chance?  It is not needed any more from everything I read.

            (I will update this post with the network card model numbers when I get to work in the morning).
            I have kern.ipc.nmbclusters="131072" set and am using about 43000 of them in my setup with 4 cores and 8 interfaces (two 4-port Intel cards).
            Running 3 site-to-site IPsec tunnels, OpenVPN (not in use at any time), port forwards, CARP (of course), and the built-in load balancer.
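
As a rough sanity check on the figures quoted above, the share of the mbuf cluster limit in use can be worked out directly. This is only a sketch of the arithmetic; the function name is made up for illustration and is not part of pfSense or FreeBSD.

```python
def mbuf_cluster_utilization(in_use: int, limit: int) -> float:
    """Return the fraction of the mbuf cluster limit currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return in_use / limit


# Figures quoted in the post: about 43000 clusters in use,
# with kern.ipc.nmbclusters set to 131072.
usage = mbuf_cluster_utilization(43000, 131072)
print(f"{usage:.1%}")  # about a third of the limit
```

At roughly one third of the limit, mbuf cluster exhaustion looks unlikely to be the cause on this particular box, although that says nothing about the other systems in the thread.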

              denmly

              Yesterday I tried upgrading the FW to 4 GB of RAM…

              But it died last night at about 21:00...

              I'm almost ready to downgrade to 2.2.6, because this is driving me crazy...

              Is there a place to get the old ISO files online?

                cmb

                @adam65535:

                I noticed that I still had a hw.igb.num_queues set to 2 trying to optimize the drivers on pfsense 2.1.5 to limit nmbclusters from what I remember (my memory is not that good though :)).

                There were problems in igb multi-queue in the old drivers, that's why people ended up setting num_queues to 1 or a small number. In all FreeBSD 10.x and newer base versions (2.2.0-2.3.1+), you shouldn't specify hw.igb.num_queues at all. Remove that from loader.conf and/or loader.conf.local to let it use the default (1 queue per CPU core).

                It's possible setting num_queues to some non-default number causes problems, especially if a low number, as I doubt much testing happens in those circumstances.
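
A minimal sketch of how one might check a loader configuration for a leftover hw.igb.num_queues override, as suggested above. The function name and the sample file contents are hypothetical and made up for this example; only the tunable name comes from the thread.

```python
import re

# Matches an uncommented assignment such as: hw.igb.num_queues="2"
_NUM_QUEUES = re.compile(r'^\s*hw\.igb\.num_queues\s*=\s*"?(\d+)"?\s*(?:#.*)?$')


def find_num_queues_overrides(conf_text: str) -> list[tuple[int, int]]:
    """Return (line_number, value) for each active hw.igb.num_queues line."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = _NUM_QUEUES.match(line)
        if match:
            hits.append((lineno, int(match.group(1))))
    return hits


# Hypothetical loader.conf.local contents, invented for illustration.
sample = '''kern.ipc.nmbclusters="131072"
hw.igb.num_queues="2"
# hw.igb.num_queues="1"
'''
print(find_num_queues_overrides(sample))
```

On a real system the text would presumably come from /boot/loader.conf and /boot/loader.conf.local; any line the function reports is a candidate for removal so the driver falls back to its default.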

                  cmb

                  denmly: I PMed you a link to a kernel to try with instructions.

                    denmly

                    @cmb:

                    denmly: I PMed you a link to a kernel to try with instructions.

                    I'll try this kernel right away

                      denmly

                      The new kernel is installed, and now it's just wait and see… :-)

                        ulicky

                        I have the same problem on 2 identical machines; before 2.3 it was OK.

                        Supermicro board + 4x igb interfaces
                        Intel(R) Xeon(R) CPU X3430 @ 2.40GHz - 4 CPUs: 1 package(s) x 4 core(s)
                        Memory usage 2% of 8148 MiB

                        Any solution for that?

                          byusinger84

                          Having the same issue on this post: https://forum.pfsense.org/index.php?topic=110710

                            mer

                            ulicky and byusinger84, can you console in or ssh to the box?  Assuming the interfaces are igb or em, see if there are any messages related to "watchdog timeout".
                            I don't have any fixes, but if you have those interfaces, it may be related to something a few other folks are seeing.
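
For anyone who saves their console or system log output to a file, the check described above could be sketched as a simple case-insensitive filter. The function name and the sample log lines are made up for illustration and are not taken from any system in this thread.

```python
def find_watchdog_timeouts(log_text: str) -> list[str]:
    """Return every log line that mentions a watchdog timeout (any case)."""
    needle = "watchdog timeout"
    return [line for line in log_text.splitlines() if needle in line.lower()]


# Invented sample log lines, for illustration only.
sample_log = """\
igb1: link state changed to UP
igb1: Watchdog timeout -- resetting
em0: link state changed to DOWN
"""
matches = find_watchdog_timeouts(sample_log)
for line in matches:
    print(line)
```

An empty result only means the phrase was not in the text supplied; it does not rule out the problem if the relevant messages had already rotated out of the log.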

                              byusinger84

                              @mer:

                              ulicky and byusinger84, can you console in or ssh to the box?  Assuming the interfaces are igb or em, see if there are any messages related to "watchdog timeout".
                              I don't have any fixes, but if you have those interfaces, it may be related to something a few other folks are seeing.

                              I'll check this out the next time the LAN interface freezes again.

                                thx2000

                                I'm pretty sure I'm experiencing the same problem, as mentioned in this post: https://forum.pfsense.org/index.php?topic=110320.0

                                I've noticed that if I leave the system on long enough the LAN interface will eventually drop offline after 2-3 days even without any SIP traffic through the VPN.  I'll try to check for watchdog timeout messages the next time it occurs.

                                  denmly

                                  Just experienced another of these fw breakdowns even with the new kernel from CMB :-(

                                  it came at the same time that a big transfer of data started through a site to site vpn tunnel…

                                    cmb

                                    @denmly:

                                    Just experienced another of these fw breakdowns even with the new kernel from CMB :-(

                                    it came at the same time that a big transfer of data started through a site to site vpn tunnel…

                                    That's not good, maybe something different in your case. Others have had promising results with the no-netmap kernel, though it hasn't been long enough yet to have a lot of confidence. What type of VPN?

                                      denmly

                                      IPsec VPN to another pfSense 2.3…

                                        cmb

                                        Could you get me a status tgz from your system? Browse to status.php and click the link to download the tgz. Email the file or a link to it to cmb at pfsense dot org.

                                          denmly

                                          Email is now sent to you.

                                            byusinger84

                                            @denmly:

                                            Just experienced another of these fw breakdowns even with the new kernel from CMB :-(

                                            it came at the same time that a big transfer of data started through a site to site vpn tunnel…

                                            I also experienced the same issue even using the new kernel. Also I don't think this is related to SIP traffic because one of the sites that's had the issue doesn't use SIP.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.