Netgate Discussion Forum

    2.4.5_p1 Upgrade - Kernel panic on boot when 2nd WAN plugged in

    General pfSense Questions
      trufflebutter

      Hi,

      Long-time lurker, first-time poster. I recently upgraded to 2.4.5_p1 and imported the config.xml from my 2.4.5 installation.

      On boot, I get a kernel panic but I can mitigate this by unplugging the 2nd WAN interface and plugging it in after the machine has booted. The closing lines of the crash dump output read as below.

      Any ideas where to start diagnosing this one?

      <118>Configuring WAN interface...
      <5>igb3: link state changed to UP
      <5>igb3.666: link state changed to UP
      <5>igb0: link state changed to UP
      <118>done.
      <118>Configuring LAN interface...done.
      <118>Configuring DMZ interface...done.
      <118>Configuring IOT interface...done.
      <118>Configuring WAN_PN interface...
      <6>ng0: changing name to 'pppoe0'
      <5>igb1: link state changed to UP
      <6>pflog0: promiscuous mode enabled
      <118>done.
      <118>Configuring CARP settings...done.
      <118>Syncing OpenVPN settings...done.
      <118>route: writing to routing socket: Invalid argument
      <118>route: writing to routing socket: Invalid argument
      <118>Configuring firewall.....
      <5>igb2: link state changed to UP
      <118>.done.
      <118>Starting PFLOG...done.
      <118>Setting up gateway monitors...done.
      <118>Setting up static routes...
      
      
      Fatal trap 12: page fault while in kernel mode
      cpuid = 2; apic id = 04
      fault virtual address	= 0x0
      fault code		= supervisor read data, page not present
      instruction pointer	= 0x20:0xffffffff80f9ea8e
      stack pointer	        = 0x28:0xfffffe0111d857d0
      frame pointer	        = 0x28:0xfffffe0111d857e0
      code segment		= base 0x0, limit 0xfffff, type 0x1b
      			= DPL 0, pres 1, long 1, def32 0, gran 1
      processor eflags	= interrupt enabled, resume, IOPL = 0
      current process		= 12 (swi1: pfsync)
      trap number		= 12
      panic: page fault
      cpuid = 2
      KDB: enter: panic
      panic.txt: page fault
      version.txt: FreeBSD 11.3-STABLE #243 abf8cba50ce(RELENG_2_4_5): Tue Jun  2 17:53:37 EDT 2020
          root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-245/obj/amd64/YNx4Qq3j/build/ce-crossbuild-245/sources/FreeBSD-src/sys/pfSense
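One place to start with a dump like this is the trap summary itself. The following is a minimal Python sketch, not a pfSense tool: the function names `parse_trap_summary` and `looks_like_null_dereference` are made up for illustration, and the "fault address inside the first page" test is only a heuristic. It reads the `key = value` lines and flags a fault address of 0x0 as a likely NULL pointer read.

```python
import re

def parse_trap_summary(text):
    """Collect 'key = value' lines from a fatal-trap summary into a dict."""
    fields = {}
    for line in text.splitlines():
        # A key starts with a letter; continuation lines that begin with '=' are skipped.
        match = re.match(r"\s*([A-Za-z][A-Za-z ]*?)\s*=\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields

def looks_like_null_dereference(fields, page_size=4096):
    """Heuristic (an assumption): a fault address in the first page suggests a NULL pointer."""
    address = int(fields["fault virtual address"], 16)
    return address < page_size

# Lines copied from the dump above, with the column whitespace simplified.
summary = """\
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f9ea8e
current process = 12 (swi1: pfsync)
"""

fields = parse_trap_summary(summary)
print(fields["current process"], looks_like_null_dereference(fields))
```

For this dump the sketch points at the `swi1: pfsync` thread reading through a NULL pointer, which narrows things down to the pfsync code path rather than the NIC driver itself.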
        stephenw10 Netgate Administrator

        Can we see the backtrace from the crash report? Assuming you see one.
        Everything between > bt and > ps.
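If the crash report is saved as text, pulling out that section can be sketched in a few lines of Python. This is only an illustration under assumptions: `extract_backtrace` is a hypothetical helper, and it assumes each debugger command sits alone at the end of a prompt line ending in `>` (the prompt may carry a prefix before the `>`).

```python
import re

def extract_backtrace(report):
    """Return the lines from the 'bt' prompt up to, but not including, the 'ps' prompt."""
    captured = []
    capturing = False
    for line in report.splitlines():
        # Assumed prompt shape: anything, then '>', then the command name at end of line.
        command = re.search(r">\s*(\w+)\s*$", line)
        if command and command.group(1) == "bt":
            capturing = True
        elif command and command.group(1) == "ps" and capturing:
            break
        if capturing:
            captured.append(line.strip())
    return captured

# Tiny stand-in for a crash report; the real one is much longer.
sample = "\n".join([
    "panic: page fault",
    ">  bt",
    "Tracing pid 12 tid 100139 td 0xfffff80006093620",
    "kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0111d85480",
    "> ps",
    "  pid  ppid  pgrp",
])

for line in extract_backtrace(sample):
    print(line)
```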

        Is the backtrace the same every time it crashes?

        Which interface is your second WAN? Is it a different NIC type?

        Steve

          trufflebutter

          Hey Steve,

          Hopefully this is what you're after. I've got the textdump.tar.0 as well if needed.

          >  bt
          Tracing pid 12 tid 100139 td 0xfffff80006093620
          kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0111d85480
          vpanic() at vpanic+0x19b/frame 0xfffffe0111d854e0
          panic() at panic+0x43/frame 0xfffffe0111d85540
          trap_pfault() at trap_pfault/frame 0xfffffe0111d85590
          trap_pfault() at trap_pfault+0x49/frame 0xfffffe0111d855f0
          trap() at trap+0x29d/frame 0xfffffe0111d85700
          calltrap() at calltrap+0x8/frame 0xfffffe0111d85700
          --- trap 0xc, rip = 0xffffffff80f9ea8e, rsp = 0xfffffe0111d857d0, rbp = 0xfffffe0111d857e0 ---
          pfsync_state_export() at pfsync_state_export+0x1e/frame 0xfffffe0111d857e0
          pfsync_sendout() at pfsync_sendout+0x1cf/frame 0xfffffe0111d85890
          pfsyncintr() at pfsyncintr+0xc6/frame 0xfffffe0111d858e0
          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0111d85920
          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0111d85970
          fork_exit() at fork_exit+0x83/frame 0xfffffe0111d859b0
          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0111d859b0
          --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
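For anyone reading a trace like this, the frame of interest is the first one after the `--- trap` marker, since the frames above it are just the panic and trap handling. A rough Python sketch of that rule follows; `faulting_function` and the `FRAME` pattern are illustrative names, and the pattern only assumes the `name() at name+0xNN/frame 0x...` layout seen in the trace above.

```python
import re

# Matches lines such as "pfsync_sendout() at pfsync_sendout+0x1cf/frame 0xfffffe0111d85890".
FRAME = re.compile(r"^\s*(\w+)\(\) at \w+(?:\+0x[0-9a-f]+)?/frame 0x[0-9a-f]+\s*$")

def faulting_function(backtrace_lines):
    """Return the function named in the first frame after the first '--- trap' marker."""
    seen_trap = False
    for line in backtrace_lines:
        if line.strip().startswith("--- trap"):
            seen_trap = True
            continue
        match = FRAME.match(line)
        if seen_trap and match:
            return match.group(1)
    return None

# Lines copied from the backtrace above.
trace = [
    "trap() at trap+0x29d/frame 0xfffffe0111d85700",
    "calltrap() at calltrap+0x8/frame 0xfffffe0111d85700",
    "--- trap 0xc, rip = 0xffffffff80f9ea8e, rsp = 0xfffffe0111d857d0, rbp = 0xfffffe0111d857e0 ---",
    "pfsync_state_export() at pfsync_state_export+0x1e/frame 0xfffffe0111d857e0",
    "pfsync_sendout() at pfsync_sendout+0x1cf/frame 0xfffffe0111d85890",
    "pfsyncintr() at pfsyncintr+0xc6/frame 0xfffffe0111d858e0",
]

print(faulting_function(trace))
```

Applied to this trace, the rule lands on `pfsync_state_export`, called from `pfsync_sendout`.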
          
            trufflebutter

            Yes, it's the same each time it crashes. That said, I think it recovered by itself once, but every reboot since has ended in the same panic.

            It's perfectly stable once it's up, so I'm 99.9% sure it's something in the config or from the import rather than a hardware or NIC issue.

            It's a 4-port Intel NIC, the same one that's been in there under 2.4.5, running rock-solid stable for the last 120 days.

              stephenw10 Netgate Administrator

              Hmm, not a crash I'm familiar with.

              It's in pfsync, so I assume this is an HA pair? Is the second WAN connected to both?

              Steve

                trufflebutter

                Hey Steve,

                No, nothing fancy. Single host, dual WAN with 2 LANs. The 2nd LAN has two VLANs.

                I was going to tear down the 2nd WAN connection/interface assignment tonight and see how it behaves, then rebuild it and see if that helps. I suspect it will still crater, but something to try.

                  stephenw10 Netgate Administrator

                  Hmm, do you have any HA settings enabled there? State sync?

                  You will have a pfsync interface but I would not expect it to be doing anything on a single firewall.

                  Steve

                    trufflebutter

                    Hi @stephenw10

                    Not that I've configured, nothing under CARP - anywhere else I can check?

                      stephenw10 Netgate Administrator

                      Does your sync interface have any config on it?

                      [2.4.5-RELEASE][admin@t70.stevew.lan]/root: ifconfig pfsync0
                      pfsync0: flags=0<> metric 0 mtu 1500
                      	groups: pfsync
                      

                      Steve

                        trufflebutter

                        [2.4.5-RELEASE][admin@incognito.local]/root: ifconfig pfsync0
                        pfsync0: flags=0<> metric 0 mtu 1500
                                groups: pfsync
                        
                        

                        That's what I have sir.
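Both outputs above show empty flags and nothing beyond the group line. For comparing this across boxes, here is a small Python sketch that summarises `ifconfig pfsync0` output. It is a guess at a useful check, not an official one: `pfsync_status` is a made-up helper, and it assumes a configured interface would show `UP` in its flags or print a `syncdev:` field.

```python
import re

def pfsync_status(ifconfig_output):
    """Summarise `ifconfig pfsync0` output: interface flags and sync device, if any."""
    flags_match = re.search(r"flags=[0-9a-f]+<([^>]*)>", ifconfig_output)
    flags = [f for f in flags_match.group(1).split(",") if f] if flags_match else []
    # Assumption: a configured interface prints a "syncdev: <name>" field.
    syncdev_match = re.search(r"syncdev:\s*(\S+)", ifconfig_output)
    return {
        "flags": flags,
        "syncdev": syncdev_match.group(1) if syncdev_match else None,
        "configured": "UP" in flags or syncdev_match is not None,
    }

# Output as posted in this thread.
posted = "pfsync0: flags=0<> metric 0 mtu 1500\n\tgroups: pfsync\n"
print(pfsync_status(posted))
```

With the output posted here, the sketch reports no flags and no sync device, which matches the expectation that pfsync should be idle on a single firewall.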

                          stephenw10 Netgate Administrator

                          Hmm, odd. And it does that with the same crash when you boot with WAN2 connected?

                          What if you boot with that NIC port linked up but not actually connected to the WAN2 modem? That might show whether it's a hardware/driver issue or a network stack problem. I could see pfsync being either.

                          Steve

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.