Netgate Discussion Forum

    PfSense Crash, cannot find root cause. Help!!

    General pfSense Questions
    11 Posts 3 Posters 1.1k Views
    • KOM

      Kernel panics are usually caused by misbehaving hardware. I'm not a FreeBSD tech but nobody else has replied yet. I may be totally off-base here.

      When your crash happens, it seems to be servicing the NIC:

      curthread    = 0xfffff8000b9ec620: pid 12 "irq296: igb4:que 0"
      current process		= 12 (irq296: igb4:que 0)
      

      Also:

      <7>sonewconn: pcb 0xfffff804073d21d0: Listen queue overflow: 193 already in queue awaiting acceptance (27 occurrences)
      

      which might be fixed by adding kern.ipc.somaxconn=4096 in System - Advanced - System Tunables.

      Read this and pay attention to the section on igb(4) cards. Try what is recommended re: setting kern.ipc.nmbclusters.

      https://docs.netgate.com/pfsense/en/latest/hardware/tuning-and-troubleshooting-network-cards.html
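If you want to tally these messages across the whole dump rather than eyeball them, here is a minimal Python sketch. It assumes the dump text uses exactly the two line formats quoted above; `summarize`, `CURTHREAD_RE` and `OVERFLOW_RE` are names made up for illustration and are not part of any pfSense or FreeBSD tooling.

```python
import re

# Sample lines copied from the crash dump quoted above.
SAMPLE = '''curthread    = 0xfffff8000b9ec620: pid 12 "irq296: igb4:que 0"
<7>sonewconn: pcb 0xfffff804073d21d0: Listen queue overflow: 193 already in queue awaiting acceptance (27 occurrences)
'''

# Hypothetical patterns, written only against the two formats shown above.
CURTHREAD_RE = re.compile(r'curthread\s*=\s*0x[0-9a-f]+: pid (\d+) "([^"]+)"')
OVERFLOW_RE = re.compile(
    r'sonewconn: pcb (0x[0-9a-f]+): Listen queue overflow: '
    r'(\d+) already in queue awaiting acceptance(?: \((\d+) occurrences\))?'
)


def summarize(text):
    """Return ((pid, thread_name) or None, {pcb: (max_queued, total_occurrences)})."""
    thread = None
    overflows = {}
    for line in text.splitlines():
        m = CURTHREAD_RE.search(line)
        if m:
            thread = (int(m.group(1)), m.group(2))
            continue
        m = OVERFLOW_RE.search(line)
        if m:
            pcb = m.group(1)
            queued = int(m.group(2))
            # A line without an "(N occurrences)" suffix counts as one event.
            count = int(m.group(3)) if m.group(3) else 1
            prev_queued, prev_count = overflows.get(pcb, (0, 0))
            overflows[pcb] = (max(prev_queued, queued), prev_count + count)
    return thread, overflows


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Grouping by pcb address shows whether one listening socket is responsible for all of the overflows or whether several are affected.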

      • scottys

        Thank you. I read that, but the system has been running smoothly for over a year, so I thought that couldn't be it and stopped reading before getting to the section on the cards. All my WANs are on the "bce" card (4 ports, 4 WANs) and my LAN is on the "igb" card (4 ports, 1 used for LAN).

        So basically it looks like there was an mbuf overflow on the NIC(s), from what you can tell? Obviously something was happening, as this is repeated 50 times in the dump:

        sonewconn: pcb 0xfffff804073d21d0: Listen queue overflow: 193 already in queue awaiting acceptance (27 occurrences)
        

        So basically I just need to increase the memory allocation size for my NICs? I find that hard to believe because of what I see on the backup pfSense that is currently running. Right now is about peak traffic, so it is under the most load it gets, and it shows MBUF Usage: 3% (29136/1000000).
        It never really moves from that 3% (I have yet to see it above 3%).
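To put a number on that, here is a small Python sketch that works out the exact percentage and remaining headroom from the dashboard string. It assumes the text always has the form "N% (used/limit)"; `parse_mbuf_usage` and `usage_percent` are hypothetical helper names, not pfSense functions.

```python
import re


def parse_mbuf_usage(text):
    """Extract (used, limit) from a string such as 'MBUF Usage: 3% (29136/1000000)'."""
    m = re.search(r'\((\d+)\s*/\s*(\d+)\)', text)
    if not m:
        raise ValueError("no '(used/limit)' pair found in: %r" % text)
    return int(m.group(1)), int(m.group(2))


def usage_percent(used, limit):
    """Return mbuf usage as a percentage of the configured limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * used / limit


if __name__ == "__main__":
    used, limit = parse_mbuf_usage("MBUF Usage: 3% (29136/1000000)")
    print("%.2f%% used, %d mbufs of headroom" % (usage_percent(used, limit), limit - used))
```

With the figures above, 29136 of 1000000 is a little under 3%, so on the backup box at least, the mbuf limit is nowhere near exhausted.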

        • KOM

          The crash happened while the system was talking to the igb NIC driver. What it was doing I can't tell you. Those sonewconn errors might have nothing to do with it, or everything. I don't know that either. I'm just trying to give you suggestions and options. What you do is up to you.

          I also noticed snort in your process list. While debugging this, you might want to temporarily disable any heavy packages like snort, suricata, or pfblocker just to rule them out. For example, there was an issue several months ago where a pfB list exceeded some threshold which started causing problems for people until they bumped a system tunable.

          • scottys @KOM

            @KOM said in PfSense Crash, cannot find root cause. Help!!:

            I don't know that either. I'm just trying to give you suggestions and options. What you do is up to you.

            I understand completely. I'm just trying to understand.

            For example, there was an issue several months ago where a pfB list exceeded some threshold which started causing problems for people until they bumped a system tunable.

            Do you happen to know what this is? (the tuneable).

            I was running Snort, pfBlockerNG, SquidProxy and SquidGuard at the time of the crash. Since the crash, all of those services have been disabled. The only thing I can think of that would cause this is the OpenVAS vulnerability scan running on our networks, but we have been hit with such scans from the outside, and this isn't the first time I have run the scan; it is run about once every 3 months or so. So this pfSense has gone through at least 4 internal scans, and I know our servers have been hit with the same scanners, as I see them in Snort.

            • KOM

              @scottys said in PfSense Crash, cannot find root cause. Help!!:

              Do you happen to know what this is? (the tuneable).

              It was actually the firewall state table size, which is controlled via System - Advanced - Firewall & NAT - Firewall Maximum States. Default is 200000 and they recommend bumping it to 400000.

              • scottys @KOM

                @KOM Looking at the description, I think this could be the culprit
                "Maximum number of table entries for systems such as aliases, sshguard, snort, etc, combined"

                Since I did see some stuff with sshguard (OpenVAS scanning) and tens of thousands of snort alerts, add pfBlockerNG country blocking and SquidGuard's list blocking, and I think it could easily hit 400,000 entries.

                Besides bumping it up, do you know of some kind of maintenance I can do to ensure that table stays under 400k? (if that was the culprit of the crash)

                • KOM

                  No, not really. There are several Zabbix packages, but I don't know if that metric is being tracked or not with the FreeBSD OS template.
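If nothing tracks it out of the box, a rough do-it-yourself check might look like the Python sketch below. It assumes the limit in question is pf's table-entries limit (which is what the quoted description suggests), that `pfctl -s memory` prints lines of the form "name hard limit N", and that `pfctl -t NAME -T show` prints one address per line. The sample text and the table names are illustrative only, not taken from the system in this thread, and all function names are hypothetical.

```python
import subprocess

# Illustrative sample of `pfctl -s memory` output; real values will differ.
SAMPLE_LIMITS = """states        hard limit   400000
src-nodes     hard limit   400000
frags         hard limit     5000
table-entries hard limit   400000
"""


def parse_pf_limits(text):
    """Parse lines shaped like '<name> hard limit <number>' into a dict."""
    limits = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[1:3] == ["hard", "limit"] and parts[3].isdigit():
            limits[parts[0]] = int(parts[3])
    return limits


def count_entries(table_listings):
    """Count non-blank lines across per-table listings (table name -> listing text)."""
    return sum(
        1
        for listing in table_listings.values()
        for line in listing.splitlines()
        if line.strip()
    )


def run_pfctl(*args):
    """Run pfctl on the firewall itself (needs root); not exercised in this sketch."""
    result = subprocess.run(["pfctl", *args], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    limit = parse_pf_limits(SAMPLE_LIMITS)["table-entries"]
    # Hypothetical table names and contents, for illustration only.
    used = count_entries({"example_blocklist": "10.0.0.0/8\n192.168.1.1\n",
                          "example_alias": "203.0.113.5\n"})
    print("%d of %d table entries used" % (used, limit))
```

On a real box you would feed `run_pfctl("-s", "memory")` into `parse_pf_limits` and build the listings from `pfctl -s Tables` plus `pfctl -t NAME -T show` for each table, then alert from cron when usage gets close to the limit.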

                  • scottys

                    bump just in case that isn't the issue and it is something else

                    @KOM Thank you for your help. I am in no way disregarding what you have told me. I am currently testing with our backup to ensure stability with the new tunables. You did say

                    I'm not a FreeBSD tech but nobody else has replied yet. I may be totally off-base here

                    I just need to ensure that you are right on target

                    Thank you for all your help

                    • Stewart @KOM

                      @KOM said in PfSense Crash, cannot find root cause. Help!!:

                      Kernel panics are usually caused by misbehaving hardware. I'm not a FreeBSD tech but nobody else has replied yet. I may be totally off-base here.

                      When your crash happens, it seems to be servicing the NIC:

                      curthread    = 0xfffff8000b9ec620: pid 12 "irq296: igb4:que 0"
                      current process		= 12 (irq296: igb4:que 0)
                      

                      Also:

                      <7>sonewconn: pcb 0xfffff804073d21d0: Listen queue overflow: 193 already in queue awaiting acceptance (27 occurrences)
                      

                      which might be fixed by adding kern.ipc.somaxconn=4096 in System - Advanced - System Tunables.

                      Read this and pay attention to the section on igb(4) cards. Try what is recommended re: setting kern.ipc.nmbclusters.

                      https://docs.netgate.com/pfsense/en/latest/hardware/tuning-and-troubleshooting-network-cards.html

                      Nothing really to add but I find it ironic that you say "I'm not a FreeBSD tech..." and then go on to troubleshoot the crash dump, suggest what appears to be a kernel change in the System Tunables, and give references. Then start talking about adjusting the Firewall State sizes. I kinda think that makes you "...a FreeBSD tech...", at least more than you think you are. :)

                      • KOM

                        I try to help out where I can. Even though I've been here five years or so, I still remember the feeling of being new and posing a question into the void and getting no response. If I think I can even point them in the right direction, I'll reply. You might notice that this forum has very few unanswered posts. Not all issues can be resolved via the community forums, but I think we have a pretty high success rate and that helps the project's reputation & success.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.