Netgate Discussion Forum

    Help Understanding a Crash [kernel panic]

    General pfSense Questions
    31 Posts 4 Posters 4.9k Views
      bmeeks

      Performance would probably be about the same with Snort. Even though Suricata itself is multithreaded, the netmap-mediated interface between the host OS stack and the NIC currently exposes only a single pair of host stack TX/RX rings. That in turn constrains Suricata to single-threaded operation for packet acquisition (same as Snort).

      But this is something I am currently working on improving along with the Suricata development team. Perhaps this improvement will be ready in time for the Suricata 6.0.4 release. We will see.

      In the meantime, why not just switch over to Legacy Blocking Mode for a while? I would switch over to Snort only if that really is a problem for you. Switching would mean reconfiguring a lot of stuff.

        None 0 @bmeeks

        @bmeeks Right. Thanks.

        In Legacy Mode, do I need to enable and configure SID Mgmt for the ET rules to work, or just set IPS Policy + Ruleset?

          bmeeks @None 0

          @none-0 said in Help Understanding a Crash [kernel panic]:

          @bmeeks Right. Thanks.

          In Legacy Mode, do I need to enable and configure SID Mgmt for the ET rules to work, or just set IPS Policy + Ruleset?

          IPS Policies are only available for Snort Subscriber Rules. That's because that feature depends on a special metadata tag that the Snort rules authors include with their rules package. The ET rules do not have that metadata tag. Each Snort rule is tagged by the authors with one or more IPS Policy tags, and the IPS Policy feature in the Snort and Suricata packages keys off that tag to select rules. Since the ET rules don't have the tag, the feature can't work for them.

          So when using ET rules, you will need to manually enable the ET categories and/or the individual rules you want. You can do that either via SID MGMT (the best way, in my opinion), or on the CATEGORIES and RULES tabs by checking boxes and clicking icons.

          You can use both techniques at the same time. So use IPS Policy to let it auto-select the Snort Subscriber rules, and then use either SID MGMT or the manual process to choose your ET rules.
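          To make the tag-based selection concrete, here is a minimal Python sketch of the idea. It is not the package's actual code. It assumes the policy tags appear inside a rule's `metadata` option in the form `policy <name> <action>`, and both rules below are made-up examples with hypothetical SIDs.

```python
import re

# Hypothetical, shortened rules for illustration only (not real rule text).
SNORT_RULE = ('alert tcp any any -> any 80 (msg:"example subscriber rule"; '
              'metadata:policy balanced-ips drop, policy security-ips drop; '
              'sid:1000001; rev:1;)')
ET_RULE = ('alert tcp any any -> any 80 (msg:"example ET rule"; '
           'metadata:created_at 2021_01_01; sid:2000001; rev:1;)')


def policy_tags(rule):
    """Return the set of IPS policy names found in a rule's metadata option."""
    match = re.search(r"metadata:([^;]*);", rule)
    if not match:
        return set()
    tags = set()
    for item in match.group(1).split(","):
        words = item.split()
        # Assumed layout of a policy entry: policy <name> <action>
        if len(words) >= 2 and words[0] == "policy":
            tags.add(words[1])
    return tags


def selected_by_policy(rule, policy):
    """True when the rule carries the requested policy tag."""
    return policy in policy_tags(rule)


print(selected_by_policy(SNORT_RULE, "balanced-ips"))
print(selected_by_policy(ET_RULE, "balanced-ips"))
```

          A rule with no policy tag is never selected, whichever policy is chosen, which is why the ET categories have to be enabled by hand or through SID MGMT.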

            None 0

            Hello guys,

            I had another crash, apparently for the same reason: IPs bouncing between two MACs, followed by a call to netmap_ring_reinit. Strangely, I was using Legacy Mode this time (so it may be a different type of problem), and it's the first crash in 13 days.

            ddb.txt
            msgbuf.txt

              bmeeks @None 0

              @none-0 said in Help Understanding a Crash [kernel panic]:

              Hello guys,

              I had another crash, apparently for the same reason: IPs bouncing between two MACs, followed by a call to netmap_ring_reinit. Strangely, I was using Legacy Mode this time (so it may be a different type of problem), and it's the first crash in 13 days.

              ddb.txt
              msgbuf.txt

              Using Legacy Mode would take netmap 100% out of the equation, and thus the netmap_ring_reinit() function could not be called. That is a netmap-specific function, and it will only be called by netmap itself. So you had something running in netmap mode, and Snort and Suricata are the only two packages I am aware of on pfSense that use netmap. Perhaps you had a duplicate zombie process running on a Suricata interface (two instances of Suricata running on the same interface). One old instance might have been running in netmap mode and a newer one in Legacy Mode, but both on the same interface.

              You are running 2.5.2 RELEASE, so you don't have access to the latest Suricata updates with the new 6.0.3 binary. It will be interesting to see how that update performs for you when it becomes available.
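              One way to check for such a duplicate is to compare the command lines of the running Suricata processes. The Python sketch below is only an illustration: it assumes you have already captured the command lines (for example from `ps` output) and that two instances on the same interface would share the same `--pidfile` argument. The function name is hypothetical, and the sample command line follows the pattern the package uses on this system.

```python
import shlex
from collections import defaultdict


def find_duplicate_instances(command_lines):
    """Group Suricata command lines by --pidfile and return groups of 2+."""
    by_pidfile = defaultdict(list)
    for line in command_lines:
        args = shlex.split(line)
        # Skip anything that is not a Suricata binary.
        if not args or not args[0].endswith("suricata"):
            continue
        if "--pidfile" in args:
            idx = args.index("--pidfile")
            if idx + 1 < len(args):
                by_pidfile[args[idx + 1]].append(line)
    return {pid: cmds for pid, cmds in by_pidfile.items() if len(cmds) > 1}


# Sample command line in the form pfSense uses to launch Suricata.
CMD = ("/usr/local/bin/suricata -i igb1 -D "
       "-c /usr/local/etc/suricata/suricata_25400_igb1/suricata.yaml "
       "--pidfile /var/run/suricata_igb125400.pid")

print(find_duplicate_instances([CMD]))       # a single instance
print(find_duplicate_instances([CMD, CMD]))  # the same interface twice
```

              An empty result means each pidfile is claimed by at most one process. Any entry in the result is a candidate zombie to stop before restarting the interface.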

                None 0 @bmeeks

                oh man 😦
                I figured that even if that were the case, the other process wouldn't still be running after the restart, but I checked anyway:
                /usr/local/bin/suricata -i igb1 -D -c /usr/local/etc/suricata/suricata_25400_igb1/suricata.yaml --pidfile /var/run/suricata_igb125400.pid
                Only this one now.

                You are running 2.5.2 RELEASE, so you don't have access to the latest Suricata updates with the new 6.0.3 binary. It will be interesting to see how that update performs for you when it becomes available.

                I'm looking forward to the release! Unfortunately I can't move to devel and try it right now, but I will update this topic (if possible) when I do.

                Thanks again, @bmeeks!

                  None 0

                  Had another crash today :/
                  Same reason. Should I uninstall and reinstall Suricata?

                    bmeeks @None 0

                    @none-0 said in Help Understanding a Crash [kernel panic]:

                    Had another crash today :/
                    Same reason. Should I uninstall and reinstall Suricata?

                    No, uninstalling and then reinstalling the exact same binary is unlikely to have any effect on that error.

                    It is likely a problem within the compiled binary code, and the exact same code will get installed again if you remove and reinstall the package. The only way to have a meaningful test would be to move to the 2.6.0 snapshot branch and install the newer Suricata from there. That package has a completely different binary in it.

                    In my experience with that error, I have not seen it cause a kernel panic, but that does not mean it couldn't. Maybe it is tickling something in your system just right such that it triggers the crash. My suspicion is multithreaded access to the single pair of netmap RX/TX rings exposed by the host stack in the older netmap API used by Suricata in pfSense CE and pfSense+ RELEASE versions today. There is new code to address that in the 6.0.3 Suricata package currently available in the 2.6.0 Snapshots branch.

                      None 0 @bmeeks

                      @bmeeks Just thought it could be a problem with my installation, but it doesn't seem to make sense...
                      Alright, disabled blocking, and gonna try the DEV branch when I have time.

                      Thanks!

                        bmeeks @None 0

                        @none-0 said in Help Understanding a Crash [kernel panic]:

                        @bmeeks Just thought it could be a problem with my installation, but it doesn't seem to make sense...
                        Alright, disabled blocking, and gonna try the DEV branch when I have time.

                        Thanks!

                        If you are using Legacy Blocking Mode, then netmap is 100% completely and totally out of the picture. You should never see a netmap_ring_reinit() error in Legacy Mode. The only way to see that error is if something is still running with the netmap kernel device. If you are still getting kernel crashes and not using Inline IPS Mode on any interface, then Suricata is not the root cause of the problem.

                        Everything I mentioned above about the new Suricata in the Snapshots branch only applies when using Inline IPS Mode.

                          None 0 @bmeeks

                          @bmeeks
                          My mistake, there's no netmap_ring_reinit this time:

                          <6>arp: 192.168.0.30 moved from 00:1d:60:7d:8c:61 to 00:05:4b:04:5e:7c on igb1
                          <6>arp: 192.168.0.39 moved from 00:22:15:6c:eb:96 to 00:e0:53:0b:40:f8 on igb1
                          <6>pid 18785 (grep), jid 0, uid 0: exited on signal 10 (core dumped)
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          <6>arp: 192.168.0.30 moved from f4:6d:04:e4:84:54 to 00:05:4b:04:5e:7c on igb1
                          <6>igb1: promiscuous mode disabled
                          <6>igb1: promiscuous mode enabled
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          <6>igb1: link state changed to DOWN
                          <6>igb1: link state changed to UP
                          
                          
                          Fatal trap 12: page fault while in kernel mode
                          cpuid = 2; apic id = 02
                          fault virtual address	= 0xffff
                          fault code		= supervisor read data, page not present
                          instruction pointer	= 0x20:0xffffffff83cf7396
                          stack pointer	        = 0x28:0xfffffe0089906ac0
                          frame pointer	        = 0x28:0xfffffe0089906ac0
                          code segment		= base 0x0, limit 0xfffff, type 0x1b
                          			= DPL 0, pres 1, long 1, def32 0, gran 1
                          processor eflags	= interrupt enabled, resume, IOPL = 0
                          current process		= 19 (dp_zil_clean_taskq_)
                          trap number		= 12
                          panic: page fault
                          cpuid = 2
                          time = 1630329038
                          KDB: enter: panic
                          
                          

                          So as you said, this is not related to Suricata at all. netmap is out.

                          OK, so in this log there are "promiscuous mode" changes, and even if that is normal, I uninstalled darkstat anyway.

                          Now the only other package installed is pfBlockerNG-devel. Let's see if my system crashes again in the next few days...
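                          For anyone who wants to summarize a msgbuf excerpt like the one above instead of reading it line by line, here is a small Python sketch. It assumes the message formats are exactly those shown in the log (`arp: <ip> moved from <mac> to <mac> on <if>` and `<if>: link state changed to UP|DOWN`). The `summarize` helper is hypothetical and not part of pfSense.

```python
import re
from collections import Counter, defaultdict

# Patterns modeled on the kernel messages quoted in this thread.
ARP_MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>\S+) to (?P<new>\S+) on (?P<ifname>\S+)")
LINK_STATE = re.compile(
    r"(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)")


def summarize(lines):
    """Return (MAC addresses seen per IP, link-state change counts)."""
    macs_per_ip = defaultdict(set)
    link_changes = Counter()
    for line in lines:
        m = ARP_MOVED.search(line)
        if m:
            macs_per_ip[m["ip"]].update((m["old"], m["new"]))
            continue
        m = LINK_STATE.search(line)
        if m:
            link_changes[(m["ifname"], m["state"])] += 1
    return macs_per_ip, link_changes


# A few lines copied from the excerpt above.
SAMPLE = [
    "<6>arp: 192.168.0.30 moved from 00:1d:60:7d:8c:61 to 00:05:4b:04:5e:7c on igb1",
    "<6>igb1: link state changed to DOWN",
    "<6>igb1: link state changed to UP",
    "<6>arp: 192.168.0.30 moved from f4:6d:04:e4:84:54 to 00:05:4b:04:5e:7c on igb1",
]

macs, links = summarize(SAMPLE)
for ip, seen in macs.items():
    print(ip, sorted(seen))
print(dict(links))
```

                          In these sample lines 192.168.0.30 shows up with three different MAC addresses, which is the kind of pattern worth chasing on the LAN side regardless of what caused the panic.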

                          Thanks.

                            stephenw10 Netgate Administrator

                            Do you have the backtrace from the crash report?

                              None 0 @stephenw10

                              @stephenw10 Hi! Yes, here: ddb.txt

                                stephenw10 Netgate Administrator

                                Mmm, well very similar but not identical:

                                db:0:kdb.enter.default>  bt
                                Tracing pid 40766 tid 100593 td 0xfffff8013f829740
                                kdb_enter() at kdb_enter+0x37/frame 0xfffffe00910f4530
                                vpanic() at vpanic+0x197/frame 0xfffffe00910f4580
                                panic() at panic+0x43/frame 0xfffffe00910f45e0
                                trap_fatal() at trap_fatal+0x391/frame 0xfffffe00910f4640
                                trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00910f4690
                                trap() at trap+0x286/frame 0xfffffe00910f47a0
                                calltrap() at calltrap+0x8/frame 0xfffffe00910f47a0
                                --- trap 0xc, rip = 0xffffffff8120874b, rsp = 0xfffffe00910f4870, rbp = 0xfffffe00910f4880 ---
                                vm_radix_remove() at vm_radix_remove+0x1b/frame 0xfffffe00910f4880
                                vm_page_free_prep() at vm_page_free_prep+0x55/frame 0xfffffe00910f48a0
                                vm_page_free_toq() at vm_page_free_toq+0x12/frame 0xfffffe00910f48d0
                                vm_object_page_remove() at vm_object_page_remove+0x61/frame 0xfffffe00910f4930
                                vm_map_entry_delete() at vm_map_entry_delete+0x104/frame 0xfffffe00910f4980
                                vm_map_delete() at vm_map_delete+0x184/frame 0xfffffe00910f49e0
                                vm_map_remove() at vm_map_remove+0xab/frame 0xfffffe00910f4a10
                                vmspace_exit() at vmspace_exit+0xcb/frame 0xfffffe00910f4a50
                                exit1() at exit1+0x55b/frame 0xfffffe00910f4ab0
                                sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe00910f4ac0
                                amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe00910f4bf0
                                fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00910f4bf0
                                --- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x800c2a00a, rsp = 0x7fffffffec38, rbp = 0x7fffffffec50 ---
                                

                                What is that running on?
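                                To compare two backtraces like this one without eyeballing them, a short Python sketch can pull out the function names below the trap marker. It assumes the `bt` output format shown above (`func() at func+0x.../frame 0x...`, with a `--- trap` line separating the panic handlers from the faulting call chain). The helper names are hypothetical.

```python
import re

# Matches one frame line of the DDB "bt" output shown above.
FRAME = re.compile(r"^\s*(?P<func>\w+)\(\) at ")


def frames(trace_text):
    """Return the function names in a backtrace listing, innermost first."""
    names = []
    for line in trace_text.splitlines():
        m = FRAME.match(line)
        if m:
            names.append(m["func"])
    return names


def faulting_frames(trace_text):
    """Return only the frames after the '--- trap' marker."""
    parts = trace_text.split("--- trap", 1)
    return frames(parts[1]) if len(parts) == 2 else []


# Lines copied from the trace above (shortened).
SAMPLE = """\
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00910f4530
vpanic() at vpanic+0x197/frame 0xfffffe00910f4580
trap() at trap+0x286/frame 0xfffffe00910f47a0
calltrap() at calltrap+0x8/frame 0xfffffe00910f47a0
--- trap 0xc, rip = 0xffffffff8120874b, rsp = 0xfffffe00910f4870, rbp = 0xfffffe00910f4880 ---
vm_radix_remove() at vm_radix_remove+0x1b/frame 0xfffffe00910f4880
vm_page_free_prep() at vm_page_free_prep+0x55/frame 0xfffffe00910f48a0
"""

print(faulting_frames(SAMPLE))
```

                                Running `faulting_frames` on the ddb.txt from each crash and comparing the two lists shows quickly whether the crashes share the same faulting call chain.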

                                  None 0 @stephenw10

                                  @stephenw10

                                  What is that running on?

                                  Sorry, what do you mean?

                                    bmeeks @None 0

                                    @none-0 said in Help Understanding a Crash [kernel panic]:

                                    @stephenw10

                                    What is that running on?

                                    Sorry, what do you mean?

                                    I suspect he means what type of hardware -- Netgate appliance (and if so, which model, as different models have different CPU families) or generic Intel/AMD hardware.

                                      None 0 @bmeeks

                                      @bmeeks @stephenw10

                                      It's a generic Intel box:
                                      CPU: i3-4170
                                      RAM: Kingston 2x8GB
                                      Mobo: Asus (don't remember the model, but I can check)
                                      Network adp: Intel I350-T4V2

                                        stephenw10 Netgate Administrator

                                        Mmm, well, I would run a RAM test there when you can, to be sure it's not a hardware issue.
                                        Though the crashes seem far too similar to be RAM errors, which are usually pretty random.

                                          None 0 @stephenw10

                                          @stephenw10 I think I'll buy a new stick... Memory tests work sometimes, but for intermittent problems I would possibly need to run them for days...
                                          That way I can take more time testing the current modules, and use them elsewhere if they turn out to have no problem.

                                          Would a single module do the trick, or does pfSense benefit from dual channel? I mean, it won't use the bandwidth, but latency is better in dual channel too... What do you guys think?

                                            stephenw10 Netgate Administrator

                                            Unlikely to make much difference IMO. For a test it doesn't really matter anyway.

                                            Steve

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.