Netgate Discussion Forum

    pfSense v.2.6 crashes and reboot

    General pfSense Questions
    • mauro.tridici

      Dear pfSense experts/developers,

      Over the last month, pfSense has suddenly crashed and rebooted twice.
      Nothing was changed during this period.

      I read that in this case I should share the crash report here.
      Could you please help me understand the cause of this issue?
      The dump files are attached.

      Many thanks in advance,
      Mauro

      info.0 textdump.tar.0

      • stephenw10 Netgate Administrator

        Backtrace:

        db:0:kdb.enter.default>  bt
        Tracing pid 84479 tid 100662 td 0xfffff80309ee7000
        kdb_enter() at kdb_enter+0x37/frame 0xfffffe0094f3b860
        vpanic() at vpanic+0x197/frame 0xfffffe0094f3b8b0
        panic() at panic+0x43/frame 0xfffffe0094f3b910
        pmap_remove_pages() at pmap_remove_pages+0xa1d/frame 0xfffffe0094f3ba10
        vmspace_exit() at vmspace_exit+0x9e/frame 0xfffffe0094f3ba50
        exit1() at exit1+0x55b/frame 0xfffffe0094f3bab0
        sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe0094f3bac0
        amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0094f3bbf0
        fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0094f3bbf0
        --- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x8004095fa, rsp = 0x7fffffffebe8, rbp = 0x7fffffffec00 ---
        

        Panics:

        panic: bad pte va 80c200000 pte 80000002be20ac28
        cpuid = 2
        time = 1690324290
        KDB: enter: panic
        
        panic: bad pte va 800f5c000 pte 0
        cpuid = 4
        time = 1692058926
        KDB: enter: panic
        

        Hmm, I would say that's likely a hardware error... except it's in VMware. Has the hypervisor been updated in that time? What version of ESXi is it?
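
When checking whether anything changed on the hypervisor around the crashes, it helps to have the panic times in readable form. The `time =` values in the panic messages above look like Unix epoch seconds (an assumption based on their magnitude), so a minimal Python sketch to convert them could be:

```python
from datetime import datetime, timezone

# "time =" values copied from the two panic messages above.
# Assumption: they are seconds since the Unix epoch.
PANIC_TIMES = [1690324290, 1692058926]


def to_utc(ts: int) -> datetime:
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def describe(times):
    """Return one ISO 8601 string per panic timestamp."""
    return [to_utc(ts).isoformat() for ts in times]


if __name__ == "__main__":
    for ts, iso in zip(PANIC_TIMES, describe(PANIC_TIMES)):
        print(ts, "->", iso)
    gap = to_utc(PANIC_TIMES[1]) - to_utc(PANIC_TIMES[0])
    print("gap between panics:", gap)
```

The resulting UTC times can then be compared against the ESXi host's own event and update history.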

        The only error shown other than the panic is this:

        (da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
        (da0:mpt0:0:0:0): UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00 
        (da0:mpt0:0:0:0): CAM status: SCSI Status Error
        (da0:mpt0:0:0:0): SCSI status: Check Condition
        (da0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
        (da0:mpt0:0:0:0): Command byte 7 is invalid
        (da0:mpt0:0:0:0): Error 22, Unretryable error
        

        That looks like a drive error, except that again it's in VMware...
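
To see what the rejected command was asking for, the CDB bytes from the log can be split into fields. This is only a sketch: the field layout (opcode in byte 0, a big-endian parameter list length in bytes 7-8, control in byte 9) is my reading of the SCSI UNMAP command format, and the helper name `parse_unmap_cdb` is made up for illustration.

```python
def parse_unmap_cdb(hex_bytes: str) -> dict:
    """Split a 10-byte UNMAP CDB (hex string from the log) into fields.

    Assumed layout: byte 0 = opcode, bytes 7-8 = parameter list
    length (big-endian), byte 9 = control.
    """
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 10:
        raise ValueError(f"expected 10 CDB bytes, got {len(cdb)}")
    return {
        "opcode": cdb[0],
        "param_list_length": int.from_bytes(cdb[7:9], "big"),
        "control": cdb[9],
    }


if __name__ == "__main__":
    fields = parse_unmap_cdb("42 00 00 00 00 00 00 00 08 00")
    print({k: hex(v) for k, v in fields.items()})
```

Under that layout, "Command byte 7 is invalid" points at the start of the parameter list length field. That would be consistent with the virtual disk simply not accepting UNMAP rather than with failing media, but that is an interpretation, not something the log states.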

        Steve

        • Stewart @stephenw10

          @stephenw10 A bad bit in hardware, if it is in the right place, could also affect the VMDK file. I would expect that bit to be unreadable in VMFS and the error to be passed on to the guest. It could still be a drive or controller error that is just being passed up the stack.

          • mauro.tridici @stephenw10

            Hi Stephen,

            thank you for your support.

            @stephenw10 said in pfSense v.2.6 crashes and reboot:

            Hmm, I would say that's likely a hardware error... except it's in VMware. Has the hypervisor been updated in that time? What version of ESXi is it?

            No, the hypervisor hasn't been updated during that period.
            The ESXi version is 6.7 U3.

            I'll check the status of the drives and controller and let you know.
            Thanks,
            Mauro

            • mauro.tridici @Stewart

              @Stewart thank you for the additional info.

              I just checked the status of the drives and controller from the server management GUI, and everything seems to be OK.
              No new entries have been added to the server's log page recently.

              It is very strange, and I don't know how to handle it.

              Mauro

              • stephenw10 Netgate Administrator

                Unfortunately none of that crash data is very revealing. Are those the only crashes it's seen?
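
For what it's worth, the two `bad pte` values from the first panics can be decoded against the architectural x86-64 page-table entry layout. The sketch below uses only the standard flag bits; the function name `decode_pte` is hypothetical, and the idea that the panic fires because the valid bit is clear is my reading of the message, not something confirmed in this thread.

```python
# Architectural x86-64 PTE flag bits (bit position -> short name).
PTE_FLAGS = {
    0: "P",    # present / valid
    1: "RW",   # writable
    2: "US",   # user accessible
    3: "PWT",  # page-level write-through
    4: "PCD",  # page-level cache disable
    5: "A",    # accessed
    6: "D",    # dirty
    7: "PAT",  # PAT (in a 4 KiB PTE)
    8: "G",    # global
    63: "NX",  # no-execute
}

# Bits 12..51 hold the physical frame address.
FRAME_MASK = 0x000FFFFFFFFFF000


def decode_pte(pte: int) -> dict:
    """Return the valid bit, set flag names, and frame address of a PTE."""
    flags = [name for bit, name in PTE_FLAGS.items() if (pte >> bit) & 1]
    return {"valid": bool(pte & 1), "flags": flags, "frame": pte & FRAME_MASK}


if __name__ == "__main__":
    # PTE values copied from the two "bad pte" panic messages.
    for pte in (0x80000002BE20AC28, 0x0):
        info = decode_pte(pte)
        print(hex(pte), info["valid"], info["flags"], hex(info["frame"]))
```

Both entries decode with the present bit clear, which fits a page table that was corrupted or changed underneath the kernel, but it does not by itself say whether the cause is hardware, the hypervisor, or software.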

                • mauro.tridici @stephenw10

                  @stephenw10 it happened again a few minutes ago.

                  No CPU overload, no hardware issues on the controller or drives...
                  I still can't work out what the cause is...

                  info.0 textdump.tar.0

                  • stephenw10 Netgate Administrator

                    Hmm, different crash but still nothing specific.

                    Fatal trap 9: general protection fault while in kernel mode
                    cpuid = 6; apic id = 0c
                    instruction pointer	= 0x20:0xffffffff80d6f3f7
                    stack pointer	        = 0x28:0xfffffe000455f680
                    frame pointer	        = 0x28:0xfffffe000455f700
                    code segment		= base 0x0, limit 0xfffff, type 0x1b
                    			= DPL 0, pres 1, long 1, def32 0, gran 1
                    processor eflags	= interrupt enabled, resume, IOPL = 0
                    current process		= 28 (dom0)
                    trap number		= 9
                    panic: general protection fault
                    cpuid = 6
                    time = 1692624634
                    KDB: enter: panic
                    
                    db:0:kdb.enter.default>  bt
                    Tracing pid 28 tid 100208 td 0xfffff800090ed000
                    kdb_enter() at kdb_enter+0x37/frame 0xfffffe000455f390
                    vpanic() at vpanic+0x197/frame 0xfffffe000455f3e0
                    panic() at panic+0x43/frame 0xfffffe000455f440
                    trap_fatal() at trap_fatal+0x391/frame 0xfffffe000455f4a0
                    trap() at trap+0x67/frame 0xfffffe000455f5b0
                    calltrap() at calltrap+0x8/frame 0xfffffe000455f5b0
                    --- trap 0x9, rip = 0xffffffff80d6f3f7, rsp = 0xfffffe000455f680, rbp = 0xfffffe000455f700 ---
                    __mtx_lock_sleep() at __mtx_lock_sleep+0xd7/frame 0xfffffe000455f700
                    pmap_ts_referenced() at pmap_ts_referenced+0xc63/frame 0xfffffe000455f7b0
                    vm_pageout_worker() at vm_pageout_worker+0xf88/frame 0xfffffe000455fb70
                    vm_pageout() at vm_pageout+0x193/frame 0xfffffe000455fbb0
                    fork_exit() at fork_exit+0x7e/frame 0xfffffe000455fbf0
                    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000455fbf0
                    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
                    

                    Is there some reason you're not on 2.7?

                    You should probably stop logging those ARP movements if those MACs are known.

                    • mauro.tridici @stephenw10

                      @stephenw10 thank you for the analysis.

                      I'm still on 2.6 because pfSense is in production and we need to be sure that the update will not cause any issues.
                      Do you think I can update to 2.7 without impacting the existing services (syslog-ng, snort, pfblocker-ng, iperf, and so on)?

                      In addition, I noticed that some installed package names are shown in yellow.

                      Screenshot 2023-08-21 at 17.51.00.png

                      Sorry, but I didn't understand your last sentence:
                      "You should probably stop logging those ARP movements if those MACs are known."

                      What do I need to do in this case?

                      Thank you in advance,
                      Mauro

                      • stephenw10 Netgate Administrator

                        https://docs.netgate.com/pfsense/en/latest/troubleshooting/logs-arp-moved.html

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.