Netgate Discussion Forum

    Wan periodic reset causes system reboot.

    General pfSense Questions
    152 Posts 6 Posters 34.3k Views
    • RobbieTTR
      RobbieTT @stephenw10
      last edited by

      @stephenw10
      Sorry Steve, this proved to be beyond me. I guess I will have to wait for the GUI button to be implemented, or for a genuinely idiot-proof step-by-step guide to be written, as this has eaten through way too many hours over too many days.

      I think I hit the assumed-knowledge barrier too often: steps were given, only to be belatedly supplemented with instructions like 'use console mode', 'use kernel debug mode option 6', 'did you edit some .conf file?', 'follow thread x' or 'install package x, but only by method y'.

      So what did work:

      • got console working from macOS (mislabeled as GNU screen in pfSense docs)
      • got the swap partition size changed via console
      • fresh install
      • installed pfSense-kernel-debug-pfSense pkg from the GUI command line
      • ran kdb.enter.default=capture on; (etc) script from regular CLI
      • reboots (many)
      • kdb.enter.default=capture shown under /root
      • reboot into kernel debug mode via console (option 6 etc)
      • trigger panic via CLI using sysctl debug.kdb.panic=1
      • console scrolls through something that looks like a core dump...
      • crash report in /var/crash with info and text dump files
      • no core dump offered in the GUI
      • no core dump file found in /var/crash
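
To tell a text dump from a full core dump after a crash, it may help to list what is in the crash directory by file name. A minimal sketch, assuming the usual FreeBSD savecore naming (`info.N`, `textdump.tar.N`, `vmcore.N`); `list_crash_artifacts` is a hypothetical helper name, not a pfSense command:

```sh
#!/bin/sh
# Hypothetical helper: classify files in a crash directory by name.
# Assumes savecore-style names: info.N, textdump.tar.N, vmcore.N.
list_crash_artifacts() {
    dir="${1:-/var/crash}"
    for f in "$dir"/*; do
        [ -e "$f" ] || continue          # skip the literal pattern if the directory is empty
        name=${f##*/}                    # strip the directory part
        case "$name" in
            vmcore.*)       echo "coredump: $name" ;;
            textdump.tar.*) echo "textdump: $name" ;;
            info.*)         echo "info: $name" ;;
        esac
    done
}
```

Running `list_crash_artifacts` with no argument looks in /var/crash. If only `info:` and `textdump:` lines appear, the system wrote a text dump rather than a full core dump, which matches the result described above.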

      Clearly I am typing with a little frustration (sorry about that) but perhaps you can spot something useful in the above.

      ☕️

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by stephenw10

        I'm sorry. Yes, it will be much better when there's a GUI option.

        You shouldn't need to add the debug kernel just to get the coredump.

        The important steps are:

        1. Make sure you have enough SWAP space (you do).
        2. Edit /etc/pfSense-ddb.conf so it contains the different default line like:
        # $FreeBSD$
        #
        #  This file is read when going to multi-user and its contents piped thru
        #  ``ddb'' to define debugging scripts.
        #
        # see ``man 4 ddb'' and ``man 8 ddb'' for details.
        #
        
        script lockinfo=show locks; show alllocks; show lockedvnods
        script pfs=bt ; show registers ; show pcpu ; run lockinfo ; acttrace ; ps ; alltrace
        
        # kdb.enter.panic       panic(9) was called.
        #script kdb.enter.default=textdump set; capture on; run pfs ; capture off; textdump dump; reset
        script kdb.enter.default=capture on; bt; show registers; show pcpu; capture off; dump; reset
        
        # kdb.enter.witness	witness(4) detected a locking error.
        script kdb.enter.witness=run lockinfo
        
        3. Reboot.
        4. (Optionally) Run sysctl debug.kdb.panic=1 to test the setup. You should see it writing out the coredump to swap in the console after all the backtraces scroll past.

        Steve
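
Before rebooting, the edit from step 2 can be sanity-checked from the shell. The sketch below reports which kind of dump the active (uncommented) `kdb.enter.default` script asks for. `check_ddb_dump` is a hypothetical helper name, and the sketch assumes that if several uncommented definitions exist, the last one is the one that takes effect:

```sh
#!/bin/sh
# Hypothetical helper: inspect the active kdb.enter.default script line.
# Prints "coredump", "textdump", "no-dump" or "none".
check_ddb_dump() {
    conf="${1:-/etc/pfSense-ddb.conf}"
    # Only uncommented "script kdb.enter.default=" lines count.
    # Assumption: the last such definition is the effective one.
    line=$(grep -E '^[[:space:]]*script[[:space:]]+kdb\.enter\.default=' "$conf" | tail -n 1)
    if [ -z "$line" ]; then
        echo "none"
        return 1
    fi
    case "$line" in
        *"textdump dump"*)    echo "textdump" ;;   # stock line: text dump only
        *"; dump"*|*"=dump"*) echo "coredump" ;;   # plain "dump": full core dump
        *)                    echo "no-dump" ;;
    esac
}
```

With the file shown above, `check_ddb_dump` should print `coredump`; with the stock line still active it prints `textdump`.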

        • RobbieTTR
          RobbieTT @stephenw10
          last edited by

          @stephenw10 said in Wan periodic reset causes system reboot.:

          1. Edit /etc/pSense-ddb.conf so it contains the different default line like:

          Hmmm, no such file found on this device. No idea why!

          ☕️

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Oh sorry, I typo'd that. 🤦

            Should be /etc/pfSense-ddb.conf

            • RobbieTTR
              RobbieTT @stephenw10
              last edited by

              @stephenw10

              Haha - should have spotted that.

              [23.09-BETA]/root: cat /etc/pfSense-ddb.conf
              # $FreeBSD$
              #
              #  This file is read when going to multi-user and its contents piped thru
              #  ``ddb'' to define debugging scripts.
              #
              # see ``man 4 ddb'' and ``man 8 ddb'' for details.
              #
              
              script lockinfo=show locks; show alllocks; show lockedvnods
              script pfs=bt ; show registers ; show pcpu ; run lockinfo ; acttrace ; ps ; alltrace
              
              # kdb.enter.panic       panic(9) was called.
              # script kdb.enter.default=textdump set; capture on; run pfs ; capture off; textdump dump; reset
              script kdb.enter.default=capture on; bt; show registers; show pcpu; capture off; dump; reset
              
              # kdb.enter.witness	witness(4) detected a locking error.
              script kdb.enter.witness=run lockinfo
              [23.09-BETA]/root: 
              

              Now, do I have a typo of my own?

              ☕️

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Looks fine to me. Reboot to apply it and then try a test panic.

                • RobbieTTR
                  RobbieTT @stephenw10
                  last edited by

                  @stephenw10

                  Wife watching Bake Off on catch-up; I would die a painful death.

                  I'll be brave when she is elsewhere.

                  ☕️

                  • RobbieTTR
                    RobbieTT @stephenw10
                    last edited by RobbieTT

                    @stephenw10
                    I think I have it. Now, how do I get this massive vmcore and info file to you?

                    <118>Netgate pfSense Plus 23.09-BETA amd64 20231020-0600
                    <118>Bootup complete
                    <6>ng0: changing name to 'pppoe0'
                    pf_test6: kif == NULL, if_xname pppoe0
                    <6>ng0: changing name to 'pppoe0'
                    
                    
                    Fatal trap 12: page fault while in kernel mode
                    
                    cpuid = 3; apic id = 18
                    fault virtual address	= 0x10
                    fault code		= supervisor read data, page not present
                    instruction pointer	= 0x20:0xffffffff80f4e116
                    stack pointer	        = 0x0:0xfffffe00850b6b60
                    frame pointer	        = 0x0:0xfffffe00850b6b90
                    code segment		= base 0x0, limit 0xfffff, type 0x1b
                    			= DPL 0, pres 1, long 1, def32 0, gran 1
                    processor eflags	= interrupt enabled, resume, IOPL = 0
                    current process		= 2 (clock (3))
                    rdi: fffff80203712800 rsi: 000000000000001c rdx: fffff8013760d878
                    rcx: fffff8013760d878  r8: 00000000ffffffbd  r9: 0000000000000018
                    rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00850b6b90
                    r10: fffff802033dd8c0 r11: fffff8016f5e5000 r12: 0000000000010300
                    r13: fffff80203676b98 r14: fffffe00850b6b68 r15: 0000000000000018
                    trap number		= 12
                    panic: page fault
                    cpuid = 3
                    time = 1697905286
                    KDB: enter: panic
                    

                    ☕️

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      You can upload it here: https://nc.netgate.com/nextcloud/index.php/s/ywzFPM3F8GZnRdb

                      Or I can download it from somewhere if that's easier, just send me a link in chat.

                      • RobbieTTR
                        RobbieTT @stephenw10
                        last edited by

                        @stephenw10 said in Wan periodic reset causes system reboot.:

                        https://nc.netgate.com/nextcloud/index.php/s/ywzFPM3F8GZnRdb

                        Uploaded to your link. Usual privacy request, or I'll come looking for you. 😎

                        If you can acknowledge they arrived ok, that would be great.

                        ☕️

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Great, I see them. 👍

                          • RobbieTTR
                            RobbieTT @stephenw10
                            last edited by

                            @stephenw10
                            TVM.
                            I just used the GUI to command the WAN connection down and up again to trigger the panic. Give me a shout if you need a repeat or a different method.

                            ☕️

                            (but quietly hoping that this is the last of it...🙃)

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              More can't hurt!

                              Are you able to get the backtrace from the console for that? Just to confirm it's the same crash. I'm pretty sure it is though.
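
One rough way to check whether two panics look alike is to pull the headline fields out of each saved console log and compare them. A minimal sketch, assuming the console output has been saved to a plain text file; `panic_signature` is a hypothetical helper name, and a matching instruction pointer is only a hint on the same kernel build, not proof of the same bug:

```sh
#!/bin/sh
# Hypothetical helper: print the trap line, faulting address,
# instruction pointer and panic message from a saved console log.
panic_signature() {
    grep -E '^[[:space:]]*(Fatal trap|fault virtual address|instruction pointer|panic:)' "$1" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*=[[:space:]]*/ = /'
}
```

Two saved logs (file names here are placeholders) can then be compared with `[ "$(panic_signature first.log)" = "$(panic_signature second.log)" ] && echo same`. The full backtrace is still the better evidence when it is available.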

                              • RobbieTTR
                                RobbieTT @stephenw10
                                last edited by

                                @stephenw10
                                I've since cleared the file and switched devices (to check that the qat_200xx revision was in place for the D-1736NT). It did look the same though.

                                ☕️

                                • A
                                  AlexanderK
                                  last edited by

                                  not possible to keep testing after the latest news. thanks everyone

                                  • RobbieTTR
                                    RobbieTT @AlexanderK
                                    last edited by

                                    @AlexanderK said in Wan periodic reset causes system reboot.:

                                    not possible to keep testing after the latest news. thanks everyone

                                    That is a shame but understandable.

                                    Hopefully this stupid mess is just for the short-term as it is hard to see why anyone testing could be expected to pay either a large sum of money, every year, or buy additional Netgate devices for testing purposes - especially if the end results are excluded from home or lab users.

                                    I've got 11.5 months on my current pfSense+ licence so will continue to test, at least in the short term, so that Netgate can think this through. I accept though that many will dump any and all dev/beta testing out of principle, even if they don't run from pfSense.

                                    ☕️

                                    • stephenw10S
                                      stephenw10 Netgate Administrator
                                      last edited by

                                      Updates coming....

                                      • RobbieTTR
                                        RobbieTT @stephenw10
                                        last edited by

                                        @stephenw10 said in Wan periodic reset causes system reboot.:

                                        Updates coming....

                                        ...and I remain optimistic. 👍

                                        ☕️

                                        • RobbieTTR
                                          RobbieTT @stephenw10
                                          last edited by

                                          @stephenw10 said in Wan periodic reset causes system reboot.:

                                          Updates coming....

                                          Optimism now dead.

                                          Current pfSense+ subscription that I use for dev/beta testing apparently dead too.

                                          Don't know what to say really.

                                          ☕️

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            It should not be. Existing subs should still function until they expire at least. Are you no longer able to see pkgs?

                                            Also further updates may still happen here.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.