Netgate Discussion Forum

    if_pppoe problems with php-fpm causing loops. (resolved)

    General pfSense Questions
    69 Posts 4 Posters 10.7k Views 6 Watching
    • chrcoluk

      A retry delay might be an idea? With the timing issue on IPv6 as well, I wonder if things just move too fast.
      I don't understand how the new PPP works. From what I can tell it doesn't even have a config file and the parameters are just supplied directly, so it's hard for me to try to patch something myself to find a solution. I am hoping Netgate are looking into it, as it's just me and ajtuk from one ISP.
      Just in case it might be relevant, the WAN connection does go through a WAN side VLAN, a requirement from CityFibre.

      This is also confirmed in the command output here.

      # pppcfg pppoe2
              dev: igc1.911 state: session
              sid: 0x44a8 PADI retries: 0 PADR retries: 0 time: 02:13:26
              sppp: phase network authproto auto authname "x@x" peerproto auto 
              dns: 217.x 217.x
      

      I will of course do as you said, Stephen. The incident is now closed, but if needed I will try to emulate it with a cable pull test. I have someone relying on the uplink almost around the clock, though, so it might be a little while before I get a chance to do it.

      I am also prepared to have a conversation with AAISP to see if they can assist from their side; maybe they can spot something wrong in their logs.

      pfSense CE 2.8.1

      • stephenw10 Netgate Administrator

        Mmm, the VLAN should not have any effect. Many ISPs require that.

        Can you replicate this manually by just reconnecting the WAN? That at least would be something we could try to replicate locally.

        • chrcoluk @stephenw10

          @stephenw10 I thought about how I would do the test. I think the best way is cutting power to the ONT, as the box will detect the carrier link going down if I pull the Ethernet cable. I will try to do this test as soon as I can, but to mimic the conditions I will probably need to keep it down for at least 15 minutes.

          I will do this in about 5 hours or so. I think either killing the ONT power or removing the cable will make the cable register as disconnected. I am not prepared to remove the fibre cable, as that leaves it exposed to dirt.

          AAISP offer a kill switch for the PPP session, and the last outage was caused by them doing an "admin reset"; it was logged as that in their logs, I guess to kick me off the LNS for rebalancing purposes. So I will get online via my phone and use the PPP kill switch for the test. This leaves the ONT powered up and the Ethernet cable connected, which mimics the failure conditions.

          • chrcoluk @stephenw10

            @stephenw10 OK, here is an update. Good news: a remote kill of the PPP session triggers it, so it is easy to debug. Bad news: cycling the interface removes the debug flag, so the successful connect lacks debug output.

            I can confirm the following: the interface is not going down and no counters reset. Checking it often in both the CLI and the Interfaces UI shows outbound packets accruing with no other changes.
            During this stuck state the UI shows it online with no IPs.
            During this stuck state the CLI shows it online with the IPs still bound.

            A snapshot of the debug log is below. From what I can see it is just 4 lines repeating, and looking at the timestamps there seem to be no pauses. The only thing I censored was what I think is the remote MAC from the ISP, playing safe on that.

            Since I can do the test quickly, I am a bit more open to repeating it in future.

            I have also captured another bit of logging with debug enabled whilst I kill PPP, so it includes info from before the loop starts, but it's a lot to post on this thread. Is there somewhere else I can post it more privately? If not, I might attach it somewhere.

            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x63 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x63 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x62 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x62 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x61 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x61 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x60 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x60 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5f len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5f len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5e len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5e len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5d len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5d len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5c len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5c len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5b len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5b len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x5a len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x5a len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x59 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x59 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x58 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x58 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x57 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x57 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x56 len=10
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2 (8864) state=3, session=0x580 output -> 9c:<blah>, len=30
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp output <conf-req id=0x56 len=22
            Jul 29 07:29:01 	kernel 	486498 	if_pppoe: pppoe2: ipcp nak opts:
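The repeating four-line pattern in the excerpt above (conf-req out, conf-nak in, many times within the same second) can be spotted mechanically. The following is a minimal sketch, assuming the log line layout shown in this post; the function names and the `threshold` value are made up for illustration and are not part of pfSense or if_pppoe.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... if_pppoe: pppoe2: ipcp input(ack-sent): <conf-nak id=0x63 len=10"
NAK_RE = re.compile(
    r"if_pppoe: (?P<ifname>\w+): ipcp input\((?P<state>[\w-]+)\): "
    r"<conf-nak id=0x(?P<id>[0-9a-f]+)"
)


def count_ipcp_naks(lines):
    """Count IPCP conf-nak messages per (timestamp, interface)."""
    counts = Counter()
    for line in lines:
        match = NAK_RE.search(line)
        if match:
            # The first 15 characters are the syslog timestamp,
            # for example "Jul 29 07:29:01".
            counts[(line.strip()[:15], match["ifname"])] += 1
    return counts


def looks_like_nak_loop(lines, threshold=5):
    """True if any interface logged `threshold` or more conf-naks in one second.

    `threshold` is an arbitrary illustrative value, not a documented limit.
    """
    return any(n >= threshold for n in count_ipcp_naks(lines).values())
```

Feeding it the lines from a saved system log (for example `looks_like_nak_loop(open(path))`, with `path` pointing at wherever the log was exported) would flag the stuck state without reading the log by eye.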

            • chrcoluk @ajtuk

              @ajtuk Hi, I have reported this to AAISP now. It makes sense to try to get them to work with the pfSense devs; if you can, please contact them as well. As I mentioned, there was another customer affected by the lack of auto reconnection.

              • ajtuk @chrcoluk

                @chrcoluk said in if_pppoe problems with php-fpm causing loops. (resolved):

                @ajtuk Hi, I have reported this to AAISP now. It makes sense to try to get them to work with the pfSense devs; if you can, please contact them as well. As I mentioned, there was another customer affected by the lack of auto reconnection.

                Will do. I can also do some more testing next week and see if I get the same results. It's been "stable" the last few days, but there has been no maintenance or issue on the AAISP end to cause it to drop.

                • stephenw10 Netgate Administrator

                  Yup you can upload here: https://nc.netgate.com/nextcloud/s/isZqc6dRLsXfqYg

                  How exactly are you killing PPP?

                  • chrcoluk @stephenw10

                    @stephenw10 There should be 2 logs there. I am not sure it worked, as it still says uploading, so please let me know.

                    On your question: AAISP has a button on their control panel that kills the PPP session from their side, so basically a server-side kill, not a client-side kill. I was logged in over my mobile phone connection to ensure I didn't lose access to the control panel as it happened.

                    • stephenw10 Netgate Administrator

                      Yup I see the files there, thanks.

                      • ajtuk @stephenw10

                        @stephenw10 My connection dropped tonight. ISP logged it as a "Planned PPP restart". I uploaded a log to the link here. Maybe it's helpful?

                        It was only my CityFibre connection which did not reconnect. FTTC reconnected OK. Both use PPPoE and both are with A&A.

                        Rebooting the appliance brought it back up.

                        • chrcoluk @stephenw10

                          @stephenw10 Thank you for providing these commands, and for confirming that more logging is coming as well. The ISP is still investigating. I did set up an auto recovery mechanism which involved rebooting pfSense after 3 failed responses from the gateway in a 3 minute period, but with the down/up commands this will be a quicker and cleaner process. Since cycling the PPP session is far less of an interruption than rebooting, I can also do it without waiting 3 minutes.

                          https://forum.netgate.com/post/1223518
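The auto recovery idea described above (count failed gateway responses, then cycle the interface instead of rebooting) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the intervals, and the gateway address are hypothetical; `ping -t` is used with its FreeBSD meaning (overall timeout in seconds); and the recovery step is simply `ifconfig <if> down` followed by `ifconfig <if> up`, the commands referred to in this thread.

```python
import subprocess
import time


def gateway_alive(gateway, timeout_s=2):
    """Send one ICMP echo request; True if the gateway answered."""
    # FreeBSD ping: -c is the packet count, -t the overall timeout in seconds.
    result = subprocess.run(
        ["ping", "-c", "1", "-t", str(timeout_s), gateway],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def cycle_interface(ifname):
    """Take the PPPoE interface down, pause briefly, and bring it back up."""
    subprocess.run(["ifconfig", ifname, "down"], check=True)
    time.sleep(2)  # arbitrary pause, an assumption rather than a known requirement
    subprocess.run(["ifconfig", ifname, "up"], check=True)


def watchdog(ifname, gateway, max_failures=3, interval_s=10,
             probe=gateway_alive, recover=cycle_interface,
             sleep=time.sleep, max_checks=None):
    """Cycle `ifname` after `max_failures` consecutive failed probes.

    Runs forever unless `max_checks` is given. Returns the number of
    recoveries performed, which is mainly useful for testing.
    """
    failures = 0
    recoveries = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe(gateway):
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                recover(ifname)
                recoveries += 1
                failures = 0
        sleep(interval_s)
    return recoveries
```

A call such as `watchdog("pppoe2", "192.0.2.1")` would run it against a placeholder gateway address. Note that later posts in this thread report the down command not dropping the session while a virtual IP was configured on the interface, so a sketch like this would not have helped in that state.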

                          • chrcoluk @stephenw10

                            @stephenw10 I have another update on 2.8.1, using the up/down commands manually.

                            It turns out running 'ifconfig pppoe2 down' has the same issue. Running 'ifconfig' after the down command reports this for pppoe2 (IPs censored):

                            pppoe2: flags=1008851<UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
                                    description: WAN
                                    options=0
                                    inet x.x.x.x --> x.x.x.x netmask 0xffffffff
                                    inet6 fe80::xxx:xxxx:xxxx:xxxx%pppoe2 prefixlen 64 scopeid 0x10
                                    groups: pppoec
                                    nd6 options=121<PERFORMNUD,AUTO_LINKLOCAL,NO_DAD>
                            

                            So the inet address is the VIP: the main IP is removed but the VIP remains, and the link-local IPv6 address remains. Note it still has 'UP' status as well, but with a dead connection.
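The state described above (interface still flagged UP, primary address gone, only the VIP left) can be checked from the `ifconfig` text. Below is a minimal sketch, assuming the point-to-point output layout shown in this post; `parse_ifconfig`, `session_looks_stale` and the `expected_local` parameter are hypothetical names, and the caller has to know what the primary address should be.

```python
import re

# "flags=1008851<UP,POINTOPOINT,...>" on the first line of the interface block.
FLAGS_RE = re.compile(r"flags=[0-9a-f]+<(?P<flags>[^>]*)>")
# "inet LOCAL --> PEER ..." lines; "inet6" lines do not match because of the space.
INET_RE = re.compile(r"^\s*inet (?P<local>\S+) --> (?P<peer>\S+)", re.MULTILINE)


def parse_ifconfig(text):
    """Return (set of flag names, list of (local, peer) IPv4 pairs)."""
    match = FLAGS_RE.search(text)
    flags = set(match["flags"].split(",")) if match else set()
    inet = [(m["local"], m["peer"]) for m in INET_RE.finditer(text)]
    return flags, inet


def session_looks_stale(text, expected_local):
    """True if the interface is flagged UP but the expected primary address is missing."""
    flags, inet = parse_ifconfig(text)
    has_primary = any(local == expected_local for local, _ in inet)
    return "UP" in flags and not has_primary
```

The text could come from `subprocess.run(["ifconfig", "pppoe2"], capture_output=True, text=True).stdout`. This only detects the symptom; it says nothing about why the session was not torn down.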

                            In the GUI, WAN shows a green up arrow but with blank IP information; there is no IP info at all.

                            However, as before, going into the GUI, disabling WAN, saving, enabling WAN, saving, then clicking Apply successfully does a full WAN cycle and brings it back online (as does rebooting).

                            I hope this new info helps: the down command fails to take the PPPoE session offline.

                            Sadly, although my ISP did start an investigation, no updates were provided afterwards.

                            I will run the down command again another time with debug enabled and upload that log to the link provided. I don't know when this will be, as I have people using this connection who are streaming almost around the clock. I hope the new, more verbose logging enhancements made it into 2.8.1.

                            I have a sneaky feeling the VIP may be the problem, acting as a blocker on if_pppoe termination. When I do the later test I will remove the VIP, then run the down command.

                            • chrcoluk @chrcoluk

                              Looks like the VIP is the culprit: after I removed it, pppoe2 down worked, and up then brought it back up. So I guess it wasn't just a VIP bug causing loops, but also one affecting PPPoE termination.

                              'if_pppoe: pppoe2: lcp close(initial)' is the only log entry with VIP removed.

                              • chrcoluk @ajtuk

                                @ajtuk Forgot to ask, do you have any additional IPs added as virtual IPs on your install?

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.