Netgate Discussion Forum

    6100 Boot Loop w/ Traffic Shaper on PPPoE WAN

    Official Netgate® Hardware
    26 Posts 3 Posters 3.3k Views
    • stephenw10 Netgate Administrator

      Ok, great. And it's the same backtrace every time?

      The odd thing there is that it doesn't appear to be in the traffic shaper. 🤔

      • TheGrimPickler @stephenw10

        @stephenw10 I can't say for certain. I figured a crash log would be a crash log, so I didn't try it more than once.

        pfSense became unresponsive for quite some time after I enabled a few shaper queues, so I had to pull the plug to reboot it, let it boot without WAN plugged in, plug in WAN, and then I received that dump. So I suppose it could be related to me bringing the OS to an abrupt halt, but that still leaves me stuck on my traffic shaper woes. Afterwards, I rebooted, factory reset, then restored my backup that doesn't use the traffic shaper. And just in case there are any known issues I'm not aware of, I'm running the current versions of the following packages:

        • Netgate_Firmware_Upgrade
        • pfBlockerNG-devel
        • Service_Watchdog
        • WireGuard

        • stephenw10 Netgate Administrator

          Do you have any of the console output while it was looping? It would be good to see where it panics in the boot process and if it's the same panic as that in the crash report.

          • TheGrimPickler @stephenw10

            @stephenw10 DM'd more crash logs that were triggered by adding new queues. Unfortunately, it doesn't seem any of the changes are being committed to memory this time as things return to the most recent setting and boot normally after the crash.

            • stephenw10 Netgate Administrator

              Do you have any further details of the queues you enabled and how they were configured?

              Simply enabling the shaper with a few PRIQ queues on a PPPoE WAN is not triggering it here.

              • TheGrimPickler @stephenw10

                @stephenw10 I enabled three queues each on the WAN and LAN interfaces (the last forced crash happened specifically when adding the LAN ones, funnily enough). Priorities 3, 7 and 13 on each side, I think.

                All CoDel Active Queue, with one default queue on each interface. The queue limit was 50 most of the time, I believe; I did also try setting it to 1000 when things originally crashed. Bandwidth was set to 940 Mbps on each interface.

                • stephenw10 Netgate Administrator

                  Hmm, do you have the actual config queues section that was generated?

                  • TheGrimPickler @stephenw10

                    @stephenw10 Not on hand. The last crash reverted the save, so I don't have the full thing. I can try to grab something again in a few days.

                    • stephenw10 Netgate Administrator

                      OK, great. I haven't been able to replicate it here yet.

                      • TheGrimPickler @stephenw10

                        @stephenw10 Sorry for the delayed reply; I wish I had an easier way to test this without inconveniencing some people. Alright, I sent you a log file and a config file. I created the shaper config, saved a copy of it, then applied it. The router stalled for 15 minutes, at which point I disconnected the power and plugged it back in. The boot stalled at "Bootstrapping clock" for more than a handful of minutes, then I sent an Enter keystroke over PuTTY's console connection and the attached crash began. After collecting the evidence, I unplugged the SFP+ connection, rebooted the router again, let the console fully come up, removed the traffic shaper via the PHP shell, and plugged the SFP+ connection back in. Everything's back to normal.

                        I grabbed the full router config this time before forcing the crash, so please let me know if you need anything else.

                        • stephenw10 Netgate Administrator

                          Ok that looks like something we should be able to work with:

                          Bootstrapping clock...
                          codel_should_drop: could not found the packet mtag!
                          
                          
                          Fatal trap 12: page fault while in kernel mode
                          cpuid = 0; apic id = 04
                          fault virtual address	= 0x5010410
                          fault code		= supervisor read data, page not present
                          instruction pointer	= 0x20:0xffffffff80cd789d
                          stack pointer	        = 0x0:0xfffffe00c4c04ae0
                          frame pointer	        = 0x0:0xfffffe00c4c04b60
                          code segment		= base 0x0, limit 0xfffff, type 0x1b
                          			= DPL 0, pres 1, long 1, def32 0, gran 1
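
For anyone wanting to check whether repeated crashes share the same panic, the "key = value" lines in a dump like the one above can be pulled apart with a short script. This is only a minimal sketch: `parse_panic`, `signature`, and `SAMPLE` are hypothetical names made up for illustration, the field layout is assumed to match the excerpt above, and treating trap number, fault code, and instruction pointer as the "signature" is an assumption rather than an established method.

```python
import re

# "Fatal trap 12: page fault while in kernel mode"
TRAP_RE = re.compile(r"Fatal trap (\d+): (.+)")
# "fault code\t\t= supervisor read data, page not present"
# The key must start with a letter, so continuation lines that begin
# with "=" (such as the second "code segment" line) are skipped.
FIELD_RE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*=\s*(.+?)\s*$")


def parse_panic(text):
    """Collect the trap number, its description, and each key = value field."""
    info = {}
    for line in text.splitlines():
        trap = TRAP_RE.search(line)
        if trap:
            info["trap"] = int(trap.group(1))
            info["description"] = trap.group(2).strip()
            continue
        field = FIELD_RE.match(line)
        if field:
            info[field.group(1)] = field.group(2)
    return info


def signature(info):
    """Assumed comparison key: two dumps with equal tuples look like the same panic."""
    return (info.get("trap"), info.get("fault code"), info.get("instruction pointer"))


# Lines copied from the console output quoted in this thread.
SAMPLE = "\n".join([
    "Fatal trap 12: page fault while in kernel mode",
    "cpuid = 0; apic id = 04",
    "fault virtual address\t= 0x5010410",
    "fault code\t\t= supervisor read data, page not present",
    "instruction pointer\t= 0x20:0xffffffff80cd789d",
])

info = parse_panic(SAMPLE)
print(signature(info))
```

Running `parse_panic` over each saved console log and comparing the `signature` tuples would give a quick first indication of whether the panics match; the full backtrace is still needed for any real diagnosis.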
                          

                          • stephenw10 Netgate Administrator

                            Opened a bug to track it: https://redmine.pfsense.org/issues/14497

                            • TheGrimPickler @stephenw10

                              @stephenw10 awesome, glad to hear it. And thanks for tending to this and walking me through it as well.

                              • kprovost @TheGrimPickler

                                @TheGrimPickler I'm failing to reproduce this problem so far.

                                The backtrace suggests that the unmapped pages feature is in use. Can you confirm the value of sysctl kern.ipc.mb_use_ext_pgs ?

                                Also, I appear to have missed what version you're running here.

                                • TheGrimPickler @kprovost

                                  @kprovost via Diagnostics>Command Prompt:
                                  kern.ipc.mb_use_ext_pgs: 0

                                  Edit: I'm on pfSense Plus version 23.05.

                                  • kprovost @TheGrimPickler

                                    @TheGrimPickler I still cannot reproduce this issue.

                                    The 'could not found the packet mtag!' warning ought to be impossible, so it's not particularly helpful in debugging this.

                                    Can you share your full configuration?

                                    • TheGrimPickler @kprovost

                                      @kprovost sent

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.