Netgate Discussion Forum

    Netgate 6100 unstable since upgrade to 26.03

    General pfSense Questions
    21 Posts 4 Posters 545 Views 5 Watching
    • ChrisJenk

      Since upgrading my 6100 from 25.11 to 26.03 I've had three occasions where the router locked up (it stopped responding on all interfaces) and I had to power cycle it to recover. After the most recent instance, the system log shows the following, up to the moment I power cycled it (newest entries first):

      Apr 17 14:18:23	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:18:23	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:18:23	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:18:23	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:18:23	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:18:23	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:18:23	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:18:23	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:18:23	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:18:23	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:17:27	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:17:27	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:17:27	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:17:27	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:17:27	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:17:27	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:17:27	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:17:27	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:17:26	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:17:26	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:16:30	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:16:30	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:16:30	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:16:30	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:16:30	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:16:30	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:16:30	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:16:30	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:16:30	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:16:30	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:15:34	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:15:34	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:15:34	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:15:34	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe: received PADO but could not find request for it
      Apr 17 14:15:34	kernel		pppoe0: host unique tag found, but it belongs to a connection in state 3
      Apr 17 14:15:34	kernel		pppoe0: link state changed to DOWN
      Apr 17 14:15:34	kernel		if_pppoe: pppoe0: LCP keepalive timeout
      

      Could this be related to the lock up? Are there any known issues that could cause this instability? System was rock solid on 25.11.

      This is rather concerning.

      • Gertjan @ChrisJenk

        @ChrisJenk

        You're using 'pppoe'.
        I was told (see the xxxx post on the forum) that pppoe uses a new driver.
        Go here : System > Advanced > Networking at the bottom of the page, where you can pick 'the other' one.


        • ChrisJenk @Gertjan

          @Gertjan said in Netgate 6100 unstable since upgrade to 26.03:

          @ChrisJenk

          You're using 'pppoe'.
          I was told (see the xxxx post on the forum) that pppoe uses a new driver.
          Go here : System > Advanced > Networking at the bottom of the page, where you can pick 'the other' one.

          I am using the new driver (I guess the log messages do not differentiate).

          Screenshot 2026-04-17 at 15.04.10.png

          • stephenw10 Netgate Administrator

            They do log differently. Those logs are from the new if_pppoe driver. They usually show a connection where the remote side is not in the same connection state. So either the local client brought down the link but the server thinks it's still up or the other way around. Either way I'd expect it to be resolvable by reconnecting the WAN. A reboot should not be necessary.

            However, if it stopped responding on all interfaces, that's something else. Those PPPoE logs could just be a symptom.

            Are there any other errors logged? When it first hit the issue perhaps?
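One rough way to see the pattern in the log excerpt from the first post is to measure how far apart the stray PADO bursts are. The sketch below is only illustrative: the sample lines are copied from that excerpt (one line per timestamp), the year is an assumption because the log lines carry none, and the names `parse_line` and `pado_burst_gaps` are made up for this example.

```python
from datetime import datetime

# Assumption: the log lines carry no year, so one is supplied for parsing.
ASSUMED_YEAR = 2026

# Lines copied from the excerpt in the first post, one per timestamp.
# Columns are tab-separated: time, process, (empty), message.
SAMPLE_LINES = [
    "Apr 17 14:18:23\tkernel\t\tpppoe: received PADO but could not find request for it",
    "Apr 17 14:17:27\tkernel\t\tpppoe: received PADO but could not find request for it",
    "Apr 17 14:17:26\tkernel\t\tpppoe: received PADO but could not find request for it",
    "Apr 17 14:16:30\tkernel\t\tpppoe: received PADO but could not find request for it",
    "Apr 17 14:15:34\tkernel\t\tpppoe: received PADO but could not find request for it",
    "Apr 17 14:15:34\tkernel\t\tpppoe0: link state changed to DOWN",
    "Apr 17 14:15:34\tkernel\t\tif_pppoe: pppoe0: LCP keepalive timeout",
]


def parse_line(line):
    """Split one log line into (timestamp, process, message)."""
    fields = [f for f in line.strip().split("\t") if f]
    stamp = datetime.strptime(f"{ASSUMED_YEAR} {fields[0]}", "%Y %b %d %H:%M:%S")
    return stamp, fields[1], fields[2]


def pado_burst_gaps(lines):
    """Seconds between consecutive distinct timestamps that logged a stray PADO."""
    stamps = sorted({parse_line(l)[0] for l in lines if "received PADO" in l})
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(pado_burst_gaps(SAMPLE_LINES))
```

On those sample lines the gaps come out as 56, 56, 1 and 56 seconds, so the bursts repeat a little under once a minute after the LCP keepalive timeout.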

            • ChrisJenk @stephenw10

              @stephenw10 Nothing that looked out of the ordinary in the system log. The messages immediately before the snippet I posted were some SSH connect/disconnect from one of my monitoring hosts (which were successful). The issue seemed to start pretty much at the time those PPPoE messages were logged. The next message after that snippet was the system boot message after I had power cycled the device. If it happens again, which I suspect it will, I will try just unplugging and replugging the WAN connection to see if it recovers, but I'm not convinced that is the issue as I was using the same WAN connection, and if_pppoe, under 25.11 and never had this issue.

              Is there anything I can do to get more diagnostics if/when it happens again?

              • stephenw10 Netgate Administrator

                Was it still responding at the console?

                If so, you could grab the ifconfig output and try to ping out from pfSense itself to something on the LAN. Try to determine if anything is still passing any traffic.
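If typing commands during an outage is awkward, the two checks above could be bundled into one small script. This is a minimal sketch, assuming a Python 3 interpreter is available on the box (not confirmed in this thread); the LAN address `192.168.1.10` and the function names are placeholders. The same two commands, `ifconfig -a` and `ping -c 3 <host>`, can equally be typed at the console by hand.

```python
import subprocess


def build_commands(lan_host):
    """Commands to capture: interface state, then a short ping to a LAN host."""
    return [["ifconfig", "-a"], ["ping", "-c", "3", lan_host]]


def run(cmd, timeout=15):
    """Run one command and return its combined output, or a failure note."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return done.stdout + done.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed: {exc}"


def collect(lan_host, runner=run):
    """Return one text report with each command followed by its output."""
    sections = []
    for cmd in build_commands(lan_host):
        sections.append("$ " + " ".join(cmd) + "\n" + runner(cmd))
    return "\n".join(sections)


if __name__ == "__main__":
    # Placeholder address: replace with a host that is really on your LAN.
    print(collect("192.168.1.10"))
```

The `runner` parameter exists only so the report format can be checked without actually running anything.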

                • ChrisJenk @stephenw10

                  @stephenw10 No idea about the console. Where the unit is located makes it very hard to maintain a permanent console connection (though I will look into that for the future). The unit has 4 physical interfaces in use; 3x 2.5 Gbit igc and 1x 10 Gbit ix. One of the igc interfaces has two subnets on it via separate VLANs. One of the igc interfaces has a VLAN (911) underlying the PPPoE ISP connection.

                  When the issue started I tried pinging all of the internal (non WAN) interfaces and none of them were responding to pings. Sadly I wasn't able to try to ping the WAN externally as I had no connectivity and I was in a hurry to resurrect everything.

                  • stephenw10 Netgate Administrator

                    Hmm, well I would try to hook up the console to something if you can. That would show you just how unresponsive it is.

                    • ChrisJenk @stephenw10

                      @stephenw10 I've managed to rig up a permanent USB console connection. Let's see what happens next time...

                      • ChrisJenk @ChrisJenk

                        @stephenw10 Just to follow up on this. I may have identified the cause. I integrated my firewall with Home Assistant using the pfSense integration. This uses the pfSense XMLRPC service to query status, statistics etc. for display in Home Assistant. As part of my troubleshooting I disabled this integration in Home Assistant, and since then there have been no hangs (so far at least; many days now). That suggests to me that there may be some kind of issue in the handling of the XMLRPC API in 26.03. I am leaving it disabled for now, though I believe it is fairly widely used.

                        • stephenw10 Netgate Administrator

                          Ah interesting. That has caused some problems in the past but I'm not aware of anything in 26.03 specifically.

                          Without any errors to work with it's impossible to say really. I assume nothing was shown on the console?

                          • ChrisJenk @stephenw10

                            @stephenw10 Well, at the same time as I set up the permanent console I also disabled the HA pfSense integration. As there haven't been any issues since, there is nothing to see on the console (or in the logs). I'm not sure I want to risk enabling the integration again as stability is paramount for me.

                            • stephenw10 Netgate Administrator

                              Hmm, I don't use Home Assistant here so I can't test it. Maybe you can replicate it on a test instance?

                              • ChrisJenk @stephenw10

                                @stephenw10 The thing is that the issue affects my Netgate device, of which I only have one, not Home Assistant. So using a test HA instance won't really help (if that is what you meant). I don't really have any easy way to spin up a test pfSense+ instance. Personally I will live without the HA integration for now as, while nice, it is not critical for me. Maybe someone else will run into the issue and be able to capture diagnostics etc.

                                • dennypage

                                  I could probably test this. @ChrisJenk, can you give me an idea of where you got the integration (I don't see one in the standard distribution), and how it is configured?

                                  • ChrisJenk @dennypage

                                    @dennypage It's from the Home Assistant Community Store (HACS). Info here:

                                    https://github.com/travisghansen/hass-pfsense

                                    I created a dedicated pfSense user for it to use and gave it the credentials for that user.

                                    There isn't much to configure; I just had the default set of enabled sensors and metrics. I wasn't using any of the control functions.

                                    My suspicion is that over time (takes many days, perhaps weeks) the frequent polling either causes some kind of resource leak (though nothing was obvious; memory and CPU were fine) or some corner case concurrency issue causes some kind of lock up. Each time my Netgate unit locked up it was not even responding to pings, so it was a pretty hard lock up.

                                    Since disabling the integration and blocking its access to pfSense (I locked the user) I haven't had the issue. Not conclusive proof but somewhat suspicious.

                                    • dennypage @ChrisJenk

                                      @ChrisJenk said in Netgate 6100 unstable since upgrade to 26.03:

                                      I just had the default set of enabled sensors and metrics.

                                      Just to be sure I understand, you didn't change (enable/disable) any of the controls or sensors? Everything in Home Assistant is still at defaults?

                                      I have it up and running with defaults against one of my pfSense VMs.

                                      • stephenw10 Netgate Administrator

                                        Nice. I guess watch for memory leaks somewhere.
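For anyone logging a metric during the test, a simple trend check can make slow growth easier to spot than eyeballing a graph. The following is a minimal sketch, not something taken from pfSense or the integration: it fits a least-squares line through `(hours, value)` samples and reports the slope. The sample values, the threshold and the function names are all hypothetical.

```python
def slope_per_hour(samples):
    """Least-squares slope of value against time, for (hours, value) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    return cov / var_t


def looks_like_leak(samples, threshold_per_hour):
    """True when the fitted growth rate exceeds the chosen threshold."""
    return slope_per_hour(samples) > threshold_per_hour


if __name__ == "__main__":
    # Hypothetical readings: hours since start, metric value (e.g. percent used).
    growing = [(0, 10.0), (1, 10.5), (2, 11.0), (3, 11.5)]
    flat = [(0, 10.5), (24, 10.5), (48, 10.5)]
    print(looks_like_leak(growing, 0.1), looks_like_leak(flat, 0.1))
```

The same check works for any counter that can be sampled over time, not just memory usage.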

                                        • ChrisJenk @dennypage

                                          @dennypage I can't recall if I enabled any additional sensors, but here is a list of all the pfSense integration entities that were enabled on my system when the problem was occurring.

                                          I'd be surprised if it were a memory leak since one of the things the integration monitors is memory usage and it was always very stable at 10-11%.

                                          pfSense_HA_Entities_Small.png

                                          • dennypage @ChrisJenk

                                            @ChrisJenk said in Netgate 6100 unstable since upgrade to 26.03:

                                            I can't recall if I enabled any additional sensors, but here is a list of all the pfSense integration entities that were enabled on my system when the problem was occurring.

                                            Okay, thanks. Looks like you had all the excess filesystems turned off, but otherwise it looks like default.

                                            One thing: the last entity on the list (the one entitled "Update")... I assume that is a HACS-specific thing? I don't use HACS, so I just want to be sure.

                                            Copyright 2026 Rubicon Communications LLC (Netgate). All rights reserved.