Netgate Discussion Forum

    pfsense crash 2.8.0

    General pfSense Questions · 12 Posts, 3 Posters, 375 Views
    • stephenw10 (Netgate Administrator)

      The backtrace is usually the important part of a crash report and isn't shown there. Did you get a full crash report?

      • cayossarian @stephenw10

        @stephenw10 I was using the memory-based file option for /var, set at 1GB, but I had never seen it get anywhere near that, usually 1% (right now at 55MB) at most. But then the crash dump ended up in there, so a page fault occurred.

        My suspicion is that the crash happened, but then the crash report caused a page fault when the dump filled up /var. But who knows, maybe something unusual filled /var up.

        I guess /var isn't the best place for a crash dump, or if that's the only option then I'll have to remove the memory-based option. I did increase the size to 4GB, but who knows how big the dump could be.

        Thanks,

        Bill
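
        For reference, a quick way to see how full a RAM-backed /var actually is, from the pfSense shell (console option 8 or Diagnostics > Command Prompt), is something like the sketch below; the commands are stock FreeBSD:

        # Size and current usage of the filesystems holding /var and /tmp
        # (memory-backed filesystems when the RAM disk option is enabled)
        df -h /var /tmp

        # Confirm what is actually mounted there, and of what type
        mount | grep -E '/var|/tmp'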

        • Gertjan @cayossarian

          @cayossarian said in pfsense crash 2.8.0:

          the crash report caused a page fault when the dump filled up /var

          Crash dumps are not stored 'somewhere' in /var/ but, afaik, in the swap space (partition).

          Obtaining Panic Information for Developers

          It starts by saying they are stored in /var/crash/

          and at the bottom you'll find: Install without Swap Space, which tells me something different. And actually, as you said, it's more logical: what happens when there is a file system issue? The system goes down with a trace.
          Also: the small Netgate appliances don't even have 4 GB for their /var/ ....

          Maybe (me guessing even more) /var/crash/ contains some sort of symlink, or just a filename or indication that a crash dump exists in the swap?
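
          For what it's worth, FreeBSD normally writes a kernel crash dump to the configured dump device (usually a swap partition), and savecore(8) then extracts it into /var/crash at the next boot. A rough way to check what this particular box is set up to do, from a shell:

          # Which device, if any, is configured to receive kernel dumps
          dumpon -l

          # Swap devices and their sizes
          swapinfo -h

          # Anything savecore has already extracted (info.*, vmcore.* or textdump.tar.*)
          ls -lh /var/crash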

          @cayossarian said in pfsense crash 2.8.0:

          But who knows maybe something unusual filled /var

          Your mission, as an admin: go have a look. Which folder contains gigabyte-sized files?
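
          A quick way to answer that from the shell, roughly:

          # Per-directory usage one level under /var, largest last
          # (-x stays on the filesystem /var lives on)
          du -hxd 1 /var | sort -h

          # Largest files in /var/log, since logs are the usual suspect
          ls -lhS /var/log | head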

          No "help me" PMs please. Use the forum, the community will thank you.
          Edit: and where are the logs??

          • stephenw10 (Netgate Administrator)

            You should still see the backtrace at the console if it panics even without SWAP to store it.

            • cayossarian @stephenw10

              This post is deleted!
              • stephenw10 (Netgate Administrator)

                Backtrace:

                db:1:pfs> bt
                Tracing pid 11 tid 100003 td 0xfffff8026f5fd740
                kdb_enter() at kdb_enter+0x33/frame 0xfffffe008e21eb20
                panic() at panic+0x43/frame 0xfffffe008e21eb80
                trap_fatal() at trap_fatal+0x40b/frame 0xfffffe008e21ebe0
                trap_pfault() at trap_pfault+0x46/frame 0xfffffe008e21ec30
                calltrap() at calltrap+0x8/frame 0xfffffe008e21ec30
                --- trap 0xc, rip = 0xffffffff80d15b8d, rsp = 0xfffffe008e21ed00, rbp = 0xfffffe008e21ed60 ---
                callout_process() at callout_process+0x1ad/frame 0xfffffe008e21ed60
                handleevents() at handleevents+0x186/frame 0xfffffe008e21eda0
                cpu_activeclock() at cpu_activeclock+0x6a/frame 0xfffffe008e21edd0
                cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008e21edf0
                sched_idletd() at sched_idletd+0x546/frame 0xfffffe008e21eef0
                fork_exit() at fork_exit+0x7b/frame 0xfffffe008e21ef30
                fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008e21ef30
                --- trap 0xf0db229f, rip = 0x2a49b49e2199f62b, rsp = 0x996070dacc2370c0, rbp = 0x468b9de920125c59 ---
                

                Unfortunately that's not very revealing. Doesn't really point to anything specific.

                The message buffer has some entries I would investigate though.

                <6>igc0: link state changed to DOWN
                <6>igc0: link state changed to UP
                <6>igc0: link state changed to DOWN
                <6>igc0: link state changed to UP
                <6>igc0: link state changed to DOWN
                <6>igc0: link state changed to UP
                

                What is igc0? Was the link intentionally being reconnected?
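
                If it's not obvious what that NIC is, a couple of quick checks from the shell (a sketch, using stock FreeBSD commands):

                # Link status, media type and negotiated speed for igc0
                ifconfig igc0

                # Driver attach messages and link flaps recorded by the kernel
                dmesg | grep igc0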

                <6>arp: 192.168.65.70 moved from 00:14:2d:e2:70:18 to 2c:3b:70:e9:08:61 on igc1.65
                <3>arp: 2c:3b:70:e9:08:61 attempts to modify permanent entry for 192.168.65.70 on igc1.65
                <6>arp: 192.168.65.70 moved from 00:14:2d:e2:70:18 to 2c:3b:70:e9:08:61 on igc1.65
                

                What are those devices, and are they something that should be sharing an IP address? Also, that permanent entry implies either it's a local NIC or you're using static ARP, which is almost always a bad idea.
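
                To see which MAC currently holds that address, and whether the permanent entry really is a local interface, something like:

                # Current ARP entry for the address; locally configured
                # addresses show up as "permanent"
                arp -an | grep 192.168.65.70

                # MAC address of the pfSense VLAN interface itself
                ifconfig igc1.65 | grep ether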

                <7>sonewconn: pcb 0xfffff801c6f85000 (127.0.0.1:853 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (1 occurrences), euid 0, rgid 0, jail 0
                <7>sonewconn: pcb 0xfffff801c6f85000 (127.0.0.1:853 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (6547 occurrences), euid 0, rgid 0, jail 0
                <7>sonewconn: pcb 0xfffff801c6f85000 (127.0.0.1:853 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (1234 occurrences), euid 0, rgid 0, jail 0
                

                It looks like Unbound is unable to answer queries over TLS fast enough and is exhausting the queue for some reason.
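
                One way to watch that backlog (127.0.0.1:853 is Unbound's DNS over TLS listener here), roughly:

                # Listen queue depth vs. maximum (qlen/incqlen/maxqlen) per listening socket
                netstat -Lan | grep '\.853'

                # Confirm which daemon owns the port 853 listener
                sockstat -4l | grep :853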

                • cayossarian @stephenw10

                  This post is deleted!
                  • stephenw10 (Netgate Administrator)

                    Hmm, so 192.168.65.70 is the pfSense interface in that VLAN? And 2c:3b:70:e9:08:61 should not be using it?

                    None of that should ever cause a panic but you should address it at least to clean up the logs so other more important events aren't hidden.

                    • cayossarian @cayossarian

                      This post is deleted!
                      • stephenw10 (Netgate Administrator)

                        Are both interfaces actually connected? Both on the same subnet? That's often asking for trouble. I would try to use only one interface there.

                        • cayossarian @stephenw10

                          @stephenw10 I don't have control of the panel, but thanks for asking, as I can open a support ticket with SPAN.
