Netgate Discussion Forum

Crash report - Fatal trap 12: page fault while in kernel mode (on VMWARE)

General pfSense Questions
13 Posts 2 Posters 1.6k Views
    stephenw10 Netgate Administrator
    last edited by Jan 29, 2021, 1:05 AM

    Yes Avahi will have been updated.
    You are seeing some arp movement logs like:

    arp: 10.1.30.150 moved from 54:60:09:c0:d1:4e to 00:e0:4c:36:86:d2 on vmx0.30
    

    Since that's not Apple it could be an actual IP conflict.

    Check what those MACs (and the others shown) belong to.
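
One way to start checking those MACs is to pull the vendor prefix (the first three octets) out of each log line before looking it up in an OUI database. This is a minimal Python sketch, not part of pfSense; the helper names `parse_arp_moved`, `oui`, and `is_locally_administered` are made up for illustration, and the regular expression only assumes the message layout shown in the log line above.

```python
import re

# Matches messages shaped like the one quoted in this thread:
#   arp: <ip> moved from <old mac> to <new mac> on <interface>
ARP_MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<ifname>\S+)",
    re.IGNORECASE,
)


def oui(mac: str) -> str:
    """Return the first three octets (vendor prefix) of a MAC address."""
    return mac.lower()[:8]


def is_locally_administered(mac: str) -> bool:
    """True if the locally administered bit is set, as on randomized MACs."""
    return bool(int(mac[:2], 16) & 0x02)


def parse_arp_moved(line: str):
    """Parse one 'arp: ... moved from ... to ...' line, or return None."""
    m = ARP_MOVED.search(line)
    if m is None:
        return None
    event = m.groupdict()
    event["old_oui"] = oui(event["old"])
    event["new_oui"] = oui(event["new"])
    return event


line = ("arp: 10.1.30.150 moved from 54:60:09:c0:d1:4e "
        "to 00:e0:4c:36:86:d2 on vmx0.30")
event = parse_arp_moved(line)
print(event["ip"], event["old_oui"], "->", event["new_oui"], "on", event["ifname"])
```

The two prefixes can then be looked up in whatever OUI list you trust. Neither address here has the locally administered bit set, so both should resolve to a real vendor.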

    Steve

      fresnoboy @stephenw10
      last edited by Jan 29, 2021, 1:17 AM

      @stephenw10

      Stephen, they are all Chromecast Audio devices. The 00:e0 MAC addresses belong to the USB-connected Ethernet adapters, and the 54:60 addresses belong to the built-in Wi-Fi adapters. All of them end up on VLAN 30, which is the vmx0.30 VLAN interface.

      If the switch they are plugged into reboots, they can flip to Wi-Fi, and then back to Ethernet when the switch comes back online. The MAC addresses are different, but the Chromecast will want the same IP address via DHCP since it's the same device and network.

      But I update switches all the time (they are UniFi US-48s), and they don't crash pfSense when I do that. The last time it happened was at 1 AM local time, and no switch upgrade happened then.

      I guess if the chromecast did a software update and rebooted, that could cause such a transition as well. Not sure why that should cause avahi trouble, and even if avahi crashed, why would it cause a kernel panic?

        stephenw10 Netgate Administrator
        last edited by Jan 29, 2021, 7:23 PM

        It shouldn't, I agree. And that looks like legitimate use of the same IP.

        You may want to just stop logging those:
        https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html

        The two crashes shown have different backtraces:

        db:0:kdb.enter.default>  bt
        Tracing pid 0 tid 100079 td 0xfffff80006654620
        kdb_enter() at kdb_enter+0x3b/frame 0xfffffe02387ef640
        vpanic() at vpanic+0x19b/frame 0xfffffe02387ef6a0
        panic() at panic+0x43/frame 0xfffffe02387ef700
        bpf_buffer_append_mbuf() at bpf_buffer_append_mbuf+0x64/frame 0xfffffe02387ef730
        catchpacket() at catchpacket+0x4b9/frame 0xfffffe02387ef7e0
        bpf_mtap() at bpf_mtap+0x200/frame 0xfffffe02387ef850
        ether_nh_input() at ether_nh_input+0xe9/frame 0xfffffe02387ef8b0
        netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe02387ef900
        ether_input() at ether_input+0x26/frame 0xfffffe02387ef920
        if_input() at if_input+0xa/frame 0xfffffe02387ef930
        em_rxeof() at em_rxeof+0x2e1/frame 0xfffffe02387ef9a0
        em_handle_que() at em_handle_que+0x40/frame 0xfffffe02387ef9e0
        taskqueue_run_locked() at taskqueue_run_locked+0x185/frame 0xfffffe02387efa40
        taskqueue_thread_loop() at taskqueue_thread_loop+0xb8/frame 0xfffffe02387efa70
        fork_exit() at fork_exit+0x83/frame 0xfffffe02387efab0
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe02387efab0
        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
        
        db:0:kdb.enter.default>  bt
        Tracing pid 80764 tid 100760 td 0xfffff801e3d52620
        kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0238ba3200
        vpanic() at vpanic+0x19b/frame 0xfffffe0238ba3260
        panic() at panic+0x43/frame 0xfffffe0238ba32c0
        trap_pfault() at trap_pfault/frame 0xfffffe0238ba3310
        trap_pfault() at trap_pfault+0x49/frame 0xfffffe0238ba3370
        trap() at trap+0x29d/frame 0xfffffe0238ba3480
        calltrap() at calltrap+0x8/frame 0xfffffe0238ba3480
        --- trap 0xc, rip = 0xffffffff80d579c3, rsp = 0xfffffe0238ba3550, rbp = 0xfffffe0238ba3560 ---
        m_tag_delete_chain() at m_tag_delete_chain+0x83/frame 0xfffffe0238ba3560
        mb_dtor_pack() at mb_dtor_pack+0x11/frame 0xfffffe0238ba3570
        uma_zfree_arg() at uma_zfree_arg+0x41/frame 0xfffffe0238ba35d0
        mb_free_ext() at mb_free_ext+0x101/frame 0xfffffe0238ba3600
        m_freem() at m_freem+0x48/frame 0xfffffe0238ba3620
        vmxnet3_stop() at vmxnet3_stop+0x283/frame 0xfffffe0238ba3670
        vmxnet3_init_locked() at vmxnet3_init_locked+0x27/frame 0xfffffe0238ba3700
        vmxnet3_ioctl() at vmxnet3_ioctl+0x39c/frame 0xfffffe0238ba3740
        ifhwioctl() at ifhwioctl+0x5f3/frame 0xfffffe0238ba37a0
        ifioctl() at ifioctl+0x475/frame 0xfffffe0238ba3840
        kern_ioctl() at kern_ioctl+0x267/frame 0xfffffe0238ba38b0
        sys_ioctl() at sys_ioctl+0x15b/frame 0xfffffe0238ba3980
        amd64_syscall() at amd64_syscall+0xa86/frame 0xfffffe0238ba3ab0
        fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0238ba3ab0
        --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x804e69fca, rsp = 0x7fffdfffbd58, rbp = 0x7fffdfffc5b0 ---
        

        The first one seems to be in mbufs. There are no mbuf exhaustion messages, but make sure you have the mbuf limit set to 1M and that it is shown as such on the dashboard.
        It looks almost exactly like this: https://forum.netgate.com/topic/147078/pfsense-reboot-kernel-panic-bpf_mcopy-v2-4-4-p3 No specific cause was found there though.
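
To compare backtraces like the two above, it can help to reduce each one to its list of function names and skip the generic panic and trap handling frames. This is a minimal Python sketch, not a pfSense or FreeBSD tool; the names `frame_functions`, `first_interesting`, and `PANIC_PATH` are hypothetical, the regular expression only assumes the frame layout visible in the traces above, and the inputs are excerpts copied from those two traces.

```python
import re

# One ddb frame line: "func() at func+0x3b/frame 0x..." (offset is optional).
FRAME = re.compile(
    r"^\s*(?P<func>\w+)\(\) at (?P=func)(?:\+0x[0-9a-f]+)?/frame 0x[0-9a-f]+"
)

# Frames that only show the panic/trap machinery, not where the fault began.
PANIC_PATH = {"kdb_enter", "vpanic", "panic", "trap_pfault", "trap", "calltrap"}


def frame_functions(backtrace: str) -> list:
    """Return the function names of all frame lines, top of stack first."""
    funcs = []
    for line in backtrace.splitlines():
        m = FRAME.match(line)
        if m:
            funcs.append(m.group("func"))
    return funcs


def first_interesting(funcs, skip=PANIC_PATH):
    """Return the first function that is not part of the panic path."""
    for name in funcs:
        if name not in skip:
            return name
    return None


FIRST = """
kdb_enter() at kdb_enter+0x3b/frame 0xfffffe02387ef640
vpanic() at vpanic+0x19b/frame 0xfffffe02387ef6a0
panic() at panic+0x43/frame 0xfffffe02387ef700
bpf_buffer_append_mbuf() at bpf_buffer_append_mbuf+0x64/frame 0xfffffe02387ef730
catchpacket() at catchpacket+0x4b9/frame 0xfffffe02387ef7e0
"""

SECOND = """
panic() at panic+0x43/frame 0xfffffe0238ba32c0
trap_pfault() at trap_pfault/frame 0xfffffe0238ba3310
trap_pfault() at trap_pfault+0x49/frame 0xfffffe0238ba3370
trap() at trap+0x29d/frame 0xfffffe0238ba3480
calltrap() at calltrap+0x8/frame 0xfffffe0238ba3480
--- trap 0xc, rip = 0xffffffff80d579c3, rsp = 0xfffffe0238ba3550, rbp = 0xfffffe0238ba3560 ---
m_tag_delete_chain() at m_tag_delete_chain+0x83/frame 0xfffffe0238ba3560
mb_dtor_pack() at mb_dtor_pack+0x11/frame 0xfffffe0238ba3570
"""

first_fault = first_interesting(frame_functions(FIRST))
second_fault = first_interesting(frame_functions(SECOND))
print(first_fault, second_fault)
```

On these excerpts the first trace points at the bpf buffer code and the second at mbuf tag cleanup, which is the difference noted above.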

        Steve

          fresnoboy @stephenw10
          last edited by Jan 29, 2021, 8:33 PM

          @stephenw10 Thanks for looking into this. I do have 1M MBUFs set as reflected in the dashboard.
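
The same limit can also be cross-checked from a shell with `netstat -m`. This is a minimal Python sketch that parses the cluster line from that output; the helper name `cluster_usage` is hypothetical, the line layout is assumed to match what FreeBSD's `netstat -m` prints, and the sample numbers are made up for illustration rather than taken from this system.

```python
import re

# Assumed layout of the cluster line in FreeBSD `netstat -m` output.
CLUSTERS = re.compile(
    r"(?P<current>\d+)/(?P<cache>\d+)/(?P<total>\d+)/(?P<max>\d+) "
    r"mbuf clusters in use \(current/cache/total/max\)"
)


def cluster_usage(netstat_m_output: str):
    """Return (total, limit, fraction used), or None if the line is absent."""
    m = CLUSTERS.search(netstat_m_output)
    if m is None:
        return None
    total = int(m.group("total"))
    limit = int(m.group("max"))
    return total, limit, total / limit


# Illustrative sample only; these numbers are NOT from the system in this thread.
sample = "2542/1518/4060/1000000 mbuf clusters in use (current/cache/total/max)"
total, limit, fraction = cluster_usage(sample)
print(f"{total} of {limit} clusters ({fraction:.2%})")
```

A limit of 1000000 in the last field would correspond to the 1M setting, and a low fraction would be consistent with there being no exhaustion messages.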

          As per the thread, I have increased the frags limit to 10000 (it was set to 5000, which is the default), and will see if that helps.

          I do have a gigabit fiber connection. It's not clear to me that this should require changes to the defaults, but if there are things I need to change, I'm happy to try them.

          The system has been running fine for 2 days now. I'll keep an eye on it and see if it stays stable.

          I can't remember ever seeing a crash in this configuration under 2.4.4. Were there any changes in 2.4.5 that could have caused a problem?

          Also, I installed the latest set of critical VMware patches for 6.7U3 about a week before the first crash. Any chance that could have affected something? The system is using ECC memory, and I am not seeing errors, so I don't think the hardware is the cause.

            stephenw10 Netgate Administrator
            last edited by Jan 30, 2021, 2:46 PM

            Mmm, there are not any frags limit log entries in the message buffer so you probably don't need to increase that. It won't hurt though.

            There are no specific issues I'm aware of with VMware and 2.4.5/p1, nor with VMware updates.

            Steve

              fresnoboy @stephenw10
              last edited by Jan 31, 2021, 11:20 PM

              textdump.tar.2.zip @stephenw10

              Well, I just had another outage. It was the same mbuf panic, and this time with double the frags limit I had set before.

              Textdump attached. I would love some ideas, or maybe I should revert to 2.4.4? The config is backward compatible with 2.4.4, right?

              thx!

                stephenw10 Netgate Administrator
                last edited by Feb 1, 2021, 2:02 AM

                No, current pfSense versions can import and update older config file versions, but not the other way around. It might work OK.
                But the other thread showing this panic was running 2.4.4-p3 anyway, so I would suggest going to a 2.5 snapshot if you're going to do anything.

                Steve

                  fresnoboy @stephenw10
                  last edited by Feb 1, 2021, 5:13 AM

                  @stephenw10

                  OK. Since it's a VMware guest, it's easy enough to take a snapshot that I can revert to. I will try a 2.5 version and see if it is better. Do you think there are relevant changes in the 2.5 train that could address this, or is it just a matter of trying something newer?

                  Does this crash have any more helpful data than the other two?

                    stephenw10 Netgate Administrator
                    last edited by Feb 1, 2021, 2:22 PM

                    No, that crash looks pretty much identical.

                    There are a lot of changes in pfSense 2.5 due to the FreeBSD 12 base. There are a whole raft of NIC changes that could affect this.

                    Steve

                      fresnoboy @stephenw10
                      last edited by Feb 1, 2021, 5:16 PM

                      @stephenw10

                      That makes a ton of sense. Will try it out today.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.