Netgate Discussion Forum

    Firewall rebooted unexpectedly

    General pfSense Questions
    15 Posts 2 Posters 968 Views
    • M
      michmoor LAYER 8 Rebel Alliance
      last edited by

      Netgate 6100 rebooted unexpectedly.
      I have some crash dump files that I can upload.

      Crash report begins.  Anonymous machine information:
      
      amd64
      15.0-CURRENT
      FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024     root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBS
      
      Crash report details:
      
      No PHP errors found.
      
      Filename: /var/crash/info.0
      Dump header from device: /dev/nda0p3
        Architecture: amd64
        Architecture Version: 4
        Dump Length: 371712
        Blocksize: 512
        Compression: none
        Dumptime: 2024-09-05 15:16:17 -0400
        Hostname: GAFW
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024
          root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/j
        Panic String: page fault
        Dump Parity: 2857159027
        Bounds: 0
        Dump Status: good
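For anyone triaging several of these reports, the `info.N` header above is plain `Key: Value` text and is easy to read programmatically. The following is a minimal sketch, not a Netgate tool: the function name `parse_dump_header` is made up for illustration, the sample is a subset of the fields shown above, and the wrapped continuation of the Version String line is deliberately left out.

```python
# Subset of the /var/crash/info.0 header quoted in the post above.
SAMPLE = """\
Dump header from device: /dev/nda0p3
  Architecture: amd64
  Dumptime: 2024-09-05 15:16:17 -0400
  Hostname: GAFW
  Panic String: page fault
  Dump Status: good
"""


def parse_dump_header(text):
    """Split each 'Key: Value' line at the first ': ' (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip lines that are not in Key: Value form
            fields[key] = value
    return fields


header = parse_dump_header(SAMPLE)
print(header["Panic String"], "/", header["Dump Status"])
```

Comparing `Panic String` and `Dumptime` across info.0, info.1 and so on is a quick way to see whether repeated reboots share the same panic.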
      

      Firewall: NetGate,Palo Alto-VM,Juniper SRX
      Routing: Juniper, Arista, Cisco
      Switching: Juniper, Arista, Cisco
      Wireless: Unifi, Aruba IAP
      JNCIP,CCNP Enterprise

      • M
        michmoor LAYER 8 Rebel Alliance @michmoor
        last edited by michmoor

        Rebooted again... something's failing, I think.

        SSD is still in a good state

        === START OF SMART DATA SECTION ===
        SMART overall-health self-assessment test result: PASSED
        

        • stephenw10S
          stephenw10 Netgate Administrator
          last edited by

          Upload the crash report here: https://nc.netgate.com/nextcloud/s/mWWHieq9ZHL6seF

          • M
            michmoor LAYER 8 Rebel Alliance @stephenw10
            last edited by

            @stephenw10
            Files uploaded. I also have a TAC case open. I'm not seeing any signs of hardware failure as suggested, but I could be wrong.

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              Doesn't look like hardware, all those crashes are almost identical.

              Backtrace:

              db:1:pfs> bt
              Tracing pid 12 tid 100043 td 0xfffff80001688740
              kdb_enter() at kdb_enter+0x33/frame 0xfffffe00850ca270
              panic() at panic+0x43/frame 0xfffffe00850ca2d0
              trap_fatal() at trap_fatal+0x40f/frame 0xfffffe00850ca330
              trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ca390
              calltrap() at calltrap+0x8/frame 0xfffffe00850ca390
              --- trap 0xc, rip = 0xffffffff846626a7, rsp = 0xfffffe00850ca460, rbp = 0xfffffe00850ca490 ---
              export_pflow() at export_pflow+0x77/frame 0xfffffe00850ca490
              pf_detach_state() at pf_detach_state+0x45b/frame 0xfffffe00850ca4d0
              pf_state_insert() at pf_state_insert+0x854/frame 0xfffffe00850ca570
              pf_test_rule() at pf_test_rule+0x28f8/frame 0xfffffe00850ca9c0
              pf_test() at pf_test+0x1382/frame 0xfffffe00850cab90
              pf_check_out() at pf_check_out+0x22/frame 0xfffffe00850cabb0
              pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe00850cabe0
              ip_output() at ip_output+0xb60/frame 0xfffffe00850cace0
              ip_forward() at ip_forward+0x3c2/frame 0xfffffe00850cad90
              ip_input() at ip_input+0x705/frame 0xfffffe00850cadf0
              swi_net() at swi_net+0x138/frame 0xfffffe00850cae60
              ithread_loop() at ithread_loop+0x257/frame 0xfffffe00850caef0
              fork_exit() at fork_exit+0x7f/frame 0xfffffe00850caf30
              fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850caf30
              --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
              

              Looks like an issue in pflow, do you have that enabled?

              The only other thing I see is:
              <6>pid 67263 (pftop), jid 0, uid 0: exited on signal 6 (core dumped)
              That could just be a symptom of the panic though.
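The reading of the backtrace above can be reproduced mechanically: in a DDB backtrace the frames run from innermost to outermost, so the first frame listed after the `--- trap 0xc ...` line is the function that was executing when the page fault hit. The sketch below assumes that line format exactly as pasted in this thread; `faulting_frame` and the two regular expressions are illustrative names, not part of any FreeBSD or pfSense tooling.

```python
import re

# Excerpt of the backtrace quoted above (frames around the trap line).
BACKTRACE = """\
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ca390
calltrap() at calltrap+0x8/frame 0xfffffe00850ca390
--- trap 0xc, rip = 0xffffffff846626a7, rsp = 0xfffffe00850ca460, rbp = 0xfffffe00850ca490 ---
export_pflow() at export_pflow+0x77/frame 0xfffffe00850ca490
pf_detach_state() at pf_detach_state+0x45b/frame 0xfffffe00850ca4d0
pf_state_insert() at pf_state_insert+0x854/frame 0xfffffe00850ca570
"""

FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame")
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+|\d+),")


def faulting_frame(backtrace):
    """Return (trap number, function, offset) for the first frame after a trap line."""
    trap = None
    for line in backtrace.splitlines():
        line = line.strip()
        trap_match = TRAP_RE.match(line)
        if trap_match:
            trap = int(trap_match.group(1), 0)  # base 0 accepts "0xc" and "0"
            continue
        frame_match = FRAME_RE.match(line)
        if trap is not None and frame_match:
            return trap, frame_match.group(1), int(frame_match.group(2), 16)
    return None  # no trap line followed by a frame


print(faulting_frame(BACKTRACE))
```

For this excerpt the result names `export_pflow` with trap number 12 (0xc), which matches the "page fault" panic string in the dump header.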

              • M
                michmoor LAYER 8 Rebel Alliance @stephenw10
                last edited by

                @stephenw10
                I do have pflow enabled.
                It's been working great since the 24. update. Why is it acting up now?

                58b6c7a4-9111-4179-b7bf-649bfd5b011a-image.png

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Good question. And it's set to NetFlow v5, so it's not this: https://redmine.pfsense.org/issues/15446

                  What else has changed?

                  • M
                    michmoor LAYER 8 Rebel Alliance @stephenw10
                    last edited by

                    @stephenw10
                    I can't see the config history, as it's now flooded with (system): related messages.

                    b73a5a6f-f130-436c-bd7c-f94d941ffda0-image.png

                    The Auto Configuration Backup / Restore has no backups for the device. Is this normal?

                    63509ec2-81ce-47ae-b168-ac5705765ac0-image.png

                    This started yesterday during the work day, so there were definitely no changes at that point. Later that night I updated a pfBlocker DNSBL feed, but the reboots had already started, so it's not related to pfBlocker.

                    Anything else I can check? Any other clues in the crash dumps?

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Hmm, ACB not seeing backups is probably unrelated. But check general connectivity from the firewall itself. Check whether a different box using the same key can see the backups.

                      This looks like a bug in pflow to me; we are looking into it.

                      How often is it panicking? Can you test disabling pflow?

                      • M
                        michmoor LAYER 8 Rebel Alliance @stephenw10
                        last edited by

                        @stephenw10
                        I can disable pflow for now.

                        The restart events are below
                        9/5 - 3:20pm EDT
                        9/5 - 3:40pm EDT
                        9/5 - 11:50pm EDT
                        9/6 - 03:30am EDT
                        9/6 - 05:40am EDT
                        9/6 - 07:00am EDT
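To answer "how often is it panicking?" from the list above, the gaps between restarts can be worked out directly. This is a rough sketch: the year 2024 is an assumption taken from the Dumptime in the first crash report, the timestamps are the rounded ones listed above, and `parse_stamp` is a hypothetical helper.

```python
from datetime import datetime

# Restart times as listed above (all EDT, so no timezone math is needed).
REBOOTS = [
    "9/5 3:20pm", "9/5 3:40pm", "9/5 11:50pm",
    "9/6 03:30am", "9/6 05:40am", "9/6 07:00am",
]


def parse_stamp(stamp, year=2024):
    """Parse 'M/D H:MMam' into a datetime; the year is an assumption."""
    return datetime.strptime(f"{year} {stamp}", "%Y %m/%d %I:%M%p")


times = [parse_stamp(s) for s in REBOOTS]

# Minutes between each restart and the next one.
gaps = [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]

for stamp, gap in zip(REBOOTS[1:], gaps):
    print(f"{stamp}: {gap} min after the previous restart")
```

The gaps range from 20 minutes to a little over 8 hours with no fixed period, which fits a panic triggered by particular traffic rather than by a timer or scheduled job.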

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Hmm, OK it appears it probably is that bug. Or at least the same fix applies.

                          Something must have changed though for it to suddenly start hitting it.

                          • M
                            michmoor LAYER 8 Rebel Alliance @stephenw10
                            last edited by

                            @stephenw10 Even though the redmine points to it being related to IPFIX?

                            The only thing that "recently" changed was a NAT Port Forward rule and DHCP settings on 9/5 @ 09:32am EDT

                            I see there is a patch created.

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              There is a patch but it's a compile time patch. It's fixed in 24.08 but would need a rebuild for 24.03.

                              Yes, the original bug report only affected IPFIX, which is why I initially thought it could not be that. But Kristof believes the root cause is the same here, and the fix is the same.

                              It is odd that you were not hitting it before, though. Something must have changed. Hard to imagine a port forward would have done it.

                              • M
                                michmoor LAYER 8 Rebel Alliance @stephenw10
                                last edited by

                                @stephenw10
                                I honestly don't know what could've changed within 24 hours that relates specifically to pflow. I added an additional collector configuration a while back.

                                I reviewed my changes from yesterday and confirmed that the only changes made were the ones I stated. Considering the bulk of the reboots happened while I was asleep, and as far as I know I don't sleepwalk (maybe I do), it wasn't anything I did overnight that caused those reboots.

                                So the fix is ready but will only be released with 24.08?
                                And the workaround is to disable pflow?

                                • stephenw10S
                                  stephenw10 Netgate Administrator
                                  last edited by

                                  Well, the first thing is to confirm it really is pflow by disabling it and making sure the panic doesn't happen again.

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.