Netgate Discussion Forum

    New PPPoE backend, some feedback

    Development
    217 Posts 18 Posters 31.5k Views
    • RobbieTTR
      RobbieTT @stephenw10
      last edited by

      @stephenw10

      With the “Do not wait for a RA” box unchecked (my usual config is to have this box checked for reasons long since forgotten but sure to bite me at some point)

      Ok, un-forgotten quite quickly. Whilst leaving the RA box unchecked works for taking the PPPoE interface down and up again, it screws up a full reboot.

      With the “Do not wait for a RA” box unchecked, after a full reboot the interface and the PPPoE session appear to be up and running in the GUI, but no actual internet traffic passes for a further 4 or 5 minutes or more.

      Start Time:

      May 3 17:06:55	kernel		---<<BOOT>>---
      May 3 17:06:55	syslogd		kernel boot file is /boot/kernel/kernel
      May 3 17:05:09	syslogd		exiting on signal 15
      May 3 17:05:09	reboot	97088	rebooted by root
      

      To this point, when pfSense thinks it is ready (and where it would normally be up and running) but cannot reach outside:

      May 3 17:07:50	kernel		done.
      May 3 17:07:48	php-cgi	68067	notify_monitor.php: Could not send the message to xxxxxxx@xxxxxxx.me -- Error: Failed to connect to mail.haveworx.co.uk:587 [SMTP: Failed to connect socket: php_network_getaddresses: getaddrinfo for mail.haveworx.co.uk failed: Name does not resolve (code: -1, response: )]
      

      To this point, where traffic does actually flow:

      May 3 17:11:44	php-fpm	44318	/rc.newwanipv6: Resyncing OpenVPN instances for interface WAN.
      May 3 17:11:44	check_reload_status	680	Reloading filter
      May 3 17:11:35	php_pfb	5699	[pfBlockerNG] filterlog daemon started
      May 3 17:11:35	php_pfb	4074	[pfBlockerNG] filterlog daemon started
      May 3 17:11:35	php-fpm	44318	/rc.newwanipv6: rc.newwanipv6: on (IP address: 2a02:xxx:feed:xxxx:xxxx:xxxx:xxxx:xx06) (interface: wan) (real interface: pppoe0).
      May 3 17:11:35	php-fpm	44318	/rc.newwanipv6: rc.newwanipv6: Info: starting on pppoe0 due to REQUEST.
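
For reference, the gaps between those log points can be worked out directly from the timestamps. A minimal sketch in Python: the event labels are made up for illustration, and the year is an assumption needed only because syslog omits it.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above.
# The dictionary keys are illustrative labels, not pfSense terms.
EVENTS = {
    "boot": "May 3 17:06:55",
    "boot_done": "May 3 17:07:50",
    "newwanipv6_start": "May 3 17:11:35",
    "filter_reload": "May 3 17:11:44",
}


def parse(stamp: str) -> datetime:
    # Syslog omits the year; 2025 is assumed purely so the subtraction works.
    return datetime.strptime(f"2025 {stamp}", "%Y %b %d %H:%M:%S")


def gap_seconds(start: str, end: str) -> int:
    """Seconds elapsed between two labelled log events."""
    return int((parse(EVENTS[end]) - parse(EVENTS[start])).total_seconds())


if __name__ == "__main__":
    print(gap_seconds("boot", "boot_done"))              # 55
    print(gap_seconds("boot_done", "newwanipv6_start"))  # 225
    print(gap_seconds("boot", "filter_reload"))          # 289
```

So roughly 3 min 45 s pass between the kernel reporting "done." and rc.newwanipv6 starting on pppoe0, which matches the "4 or 5 minutes" seen from the keyboard.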
      

      So I guess we still have a problem but we can move the problem somewhere else.

      ☕️

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Hmm, interesting.

        I would expect that box to be left unchecked, because DHCP is set to go over PPPoE. It should only try to pull a lease once the PPPoE session is up and the remote server sends an RA over it. But that does depend on how often the ISP sends RAs. One of the other issues we are seeing is with ISPs that send RAs at high frequency, such as every 10 seconds, with each one triggering events.
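
To illustrate why RAs at 10s intervals are a problem when each one triggers events, here is a minimal Python sketch of a hold-down (debounce) around an RA-triggered action. It is not how pfSense handles this; `RaDebouncer`, `count_events` and the 60-second window are hypothetical choices for the example.

```python
from typing import Iterable, Optional


class RaDebouncer:
    """Let an RA trigger a reconfiguration at most once per hold-down window."""

    def __init__(self, holddown_s: float = 60.0) -> None:
        self.holddown_s = holddown_s
        self._last_fired: Optional[float] = None

    def should_fire(self, now_s: float) -> bool:
        # Fire on the first RA, then ignore RAs until the window has elapsed.
        if self._last_fired is None or now_s - self._last_fired >= self.holddown_s:
            self._last_fired = now_s
            return True
        return False


def count_events(ra_times: Iterable[float], holddown_s: float) -> int:
    """Count how many RAs would actually trigger an event."""
    debouncer = RaDebouncer(holddown_s)
    return sum(1 for t in ra_times if debouncer.should_fire(t))


if __name__ == "__main__":
    # An ISP sending an RA every 10 seconds for five minutes: 30 RAs.
    ra_times = [float(t) for t in range(0, 300, 10)]
    print(count_events(ra_times, holddown_s=0.0))   # 30: every RA triggers
    print(count_events(ra_times, holddown_s=60.0))  # 5: one per minute
```

Without any hold-down, every RA becomes an event; with one, the number of events is bounded by the window rather than by the ISP's RA interval.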

        But I suspect the difference here is that the old backend only marks the interface up once it's actually connected and if_pppoe is seen as UP as soon as it's created. If dhcp6c doesn't wait for an RA it will immediately try and fail and then.... get stuck in some fail-loop!
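
A toy model of that difference, in Python. This is only a sketch of the reasoning in this post, not the actual pfSense or dhcp6c logic; `WanState` and both functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class WanState:
    iface_exists: bool  # interface has been created (if_pppoe shows UP here)
    session_up: bool    # PPPoE session has actually been negotiated
    ra_seen: bool       # a Router Advertisement has arrived over the session
    no_wait_ra: bool    # the "Do not wait for a RA" checkbox


def start_on_iface_up(state: WanState) -> bool:
    """Start the DHCPv6 client as soon as the interface is reported UP."""
    return state.iface_exists and (state.no_wait_ra or state.ra_seen)


def start_on_session_up(state: WanState) -> bool:
    """Start the DHCPv6 client only once the PPPoE session is connected."""
    return state.session_up and (state.no_wait_ra or state.ra_seen)


if __name__ == "__main__":
    # Just after boot: interface created, session not yet negotiated.
    early = WanState(iface_exists=True, session_up=False,
                     ra_seen=False, no_wait_ra=True)
    print(start_on_iface_up(early))    # True: the client starts and fails
    print(start_on_session_up(early))  # False: the client waits
```

In this model, the premature start only happens when "UP" means "created" rather than "connected".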

        We are changing that behaviour now so it may be fixed in the next build anyway.

        • RobbieTTR
          RobbieTT @stephenw10
          last edited by

          @stephenw10 said in New PPPoE backend, some feedback:

          Hmm, interesting.

          I would expect that box to be left unchecked, because DHCP is set to go over PPPoE. It should only try to pull a lease once the PPPoE session is up and the remote server sends an RA over it.

          Looking forward to the changes. 👍

          My ISP's RAs are sent reasonably infrequently, so once the PPPoE session is up the client router (pfSense) should send an RS upstream and get the RA straight back. Occasionally an RA is captured first, but typically the RA used will be triggered by the RS.
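
For anyone unfamiliar with what the RS actually is, here is a minimal Python sketch that builds the 8-byte ICMPv6 Router Solicitation message (type 133, code 0, as defined in RFC 4861) with its checksum. It only constructs the bytes and does not send anything. The function names and the example link-local source address are assumptions for illustration, and the optional source link-layer address option is left out.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58    # IPv6 next-header value for ICMPv6
ROUTER_SOLICITATION = 133  # ICMPv6 type for an RS
ALL_ROUTERS = "ff02::2"    # link-local all-routers multicast group


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, length: int) -> bytes:
    """IPv6 pseudo-header that is included in the ICMPv6 checksum."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", length, ICMPV6_NEXT_HEADER)
    )


def build_rs(src: str, dst: str = ALL_ROUTERS) -> bytes:
    """Build a bare Router Solicitation: type, code, checksum, reserved."""
    unchecked = struct.pack("!BBHI", ROUTER_SOLICITATION, 0, 0, 0)
    header = pseudo_header(src, dst, len(unchecked))
    checksum = ~ones_complement_sum(header + unchecked) & 0xFFFF
    return struct.pack("!BBHI", ROUTER_SOLICITATION, 0, checksum, 0)


if __name__ == "__main__":
    rs = build_rs("fe80::1")  # example source address, purely illustrative
    print(len(rs), rs[0], rs.hex())
```

A router receiving this on the link is expected to answer with an RA, which is the "RS upstream, RA straight back" exchange described above.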

          The days of waiting obediently for an RA should be confined to history (well, to whenever the replacement RFC came out, which was a number of years ago now). ISPs that deliberately machine-gun out unsolicited RAs should be sent a burning copy of the standards.

          ☕️

          • RobbieTTR
            RobbieTT @stephenw10
            last edited by

            @stephenw10

            The 171.diff patch really improves things. New text file with logs, dmesg -a and my remaining comments sent direct.

            ☕️

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              Hmm, I can't replicate that FQDN access issue. How does it fail when you do that?

              • RobbieTTR
                RobbieTT @stephenw10
                last edited by

                @stephenw10

                The GUI stalls and I get this:

                [Screenshot attachment: 2025-04-30 at 17.02.53 copy.png]

                With the FQDN, it's as if pfSense forgets that local access is still available as soon as the WAN is lost. Perhaps Unbound is restarting and local look-ups are dropped, but I'm not really sure of the cause. I'm not using Kea, if that is a factor.

                If I use the GUI via the IP address instead and take the PPPoE interface down / up then the GUI stays alive.

                ☕️

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Hmm, yeah that does seem like it must be an Unbound issue. I guess I'm not seeing it here because I'm not using that box for DNS... 🤔

                  • patient0P
                    patient0 @stephenw10
                    last edited by

                    Just a reference: There is another thread where a user reports an issue with connecting to the webGUI (503 error) after upgrading from 24.11 to 25.03-BETA and then switching on the if_pppoe module:

                    Problems after enabling if_pppoe

                    • w0wW
                      w0w @patient0
                      last edited by

                      @patient0
                      This may be a manifestation of the same bug: https://forum.netgate.com/topic/197119/dns-resolver-exiting-when-loading-pfblocker-25-03-b-20250409-2208. Some ISPs send RA packets too aggressively, and due to a bug, pfSense starts endlessly restarting related services and daemons. On certain hardware, it's even possible that PHP hangs as a result.

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by stephenw10

                        Yup I'd bet it's that ^. Should be fixed in the next beta build.

                        • RobbieTTR
                          RobbieTT @w0w
                          last edited by

                          @w0w said in New PPPoE backend, some feedback:

                          @patient0
                          This may be a manifestation of the same bug: Some ISPs send RA packets too aggressively...

                          Thankfully my ISP is very mild with the RAs (and complies with the standards), so it is very rare for the process to be triggered by an RA and is almost exclusively an RS from pfSense kicking it all off.

                          The dns-resolver losing its mind when pfBlocker does its thing would probably explain why the FQDN gets tossed.

                          Whilst it doesn't solve everything, the 171.diff experimental patch has really calmed things down on boot and on interface status changes. Looking forward to all this being collected in a new beta. This is all looking positive.

                          ☕️

                          • w0wW
                            w0w @RobbieTT
                            last edited by

                            @RobbieTT said in New PPPoE backend, some feedback:

                            171.diff

                            Where can I get it?

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              Here's the file to test. This is not the final fix that will be in the build though.
                              171.diff

                              • w0wW
                                w0w @stephenw10
                                last edited by

                                @stephenw10
                                Oh yes, I tested an earlier version too, but this one at least works with the latest snapshot.
                                It looks promising. 👍

                                • RobbieTTR
                                  RobbieTT @stephenw10
                                  last edited by

                                  @stephenw10

                                  I'm not sure if my buffer bloat on 23.05 / PPPoE is truly related to all this so I have opened a different thread here.

                                  ☕️

                                  • C
                                    chrcoluk @w0w
                                    last edited by chrcoluk

                                    @w0w said in New PPPoE backend, some feedback:

                                    @patient0
                                    This may be a manifestation of the same bug: https://forum.netgate.com/topic/197119/dns-resolver-exiting-when-loading-pfblocker-25-03-b-20250409-2208. Some ISPs send RA packets too aggressively, and due to a bug, pfSense starts endlessly restarting related services and daemons. On certain hardware, it's even possible that PHP hangs as a result.

                                    I think the service-restarting code needs looking at. I will confess that on my personal install of pfSense I have disabled a lot of it, as I found it's way too aggressive at mass-restarting services.

                                    As an example, there is no need to restart the UPS daemon if a VPN cycles.

                                    I would either restrict the services that restart, instead of a blanket restart of all services, or make it an optional tick box in the advanced settings. Most people are probably only using services LAN side anyway, so restarting them because of a change in WAN conditions seems excessive.
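
As a rough sketch of the "restrict the services that restart" idea, in Python. The event labels, service names and `RESTART_POLICY` table are all hypothetical and do not correspond to pfSense identifiers or to how its restart code is structured.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical policy: which services care about which kind of change.
RESTART_POLICY: Dict[str, Set[str]] = {
    "wan_change": {"dns_resolver", "dyndns", "vpn_client"},
    "vpn_change": {"vpn_client"},
}


def services_to_restart(event: str, running: Iterable[str],
                        blanket: bool = False) -> List[str]:
    """Return the services to restart for an event.

    With blanket=True every running service is restarted (the behaviour
    being complained about); otherwise only those listed for the event.
    """
    running_set = set(running)
    if blanket:
        return sorted(running_set)
    return sorted(RESTART_POLICY.get(event, set()) & running_set)


if __name__ == "__main__":
    running = ["ups", "dns_resolver", "vpn_client", "ntp"]
    print(services_to_restart("vpn_change", running, blanket=True))
    print(services_to_restart("vpn_change", running))
```

With the selective policy, a VPN cycling leaves the UPS daemon alone; the `blanket` flag stands in for the optional tick box.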

                                    pfSense CE 2.8.0

                                    • w0wW
                                      w0w @chrcoluk
                                      last edited by

                                      @chrcoluk
                                      You may be right. Everything eventually needs to be reevaluated. We're dealing with a complex software construct that tries to account for all the situations that may arise from thousands of different user configurations. And as far as I remember, the whole reconfiguration/restart behavior has been around for a long time, even though some services might no longer need it.
                                      As the saying goes... if you want something done — submit a feature request, or maybe… a bugfix.

                                      • C
                                        chrcoluk @w0w
                                        last edited by chrcoluk

                                        @w0w That's the plan, but I need to be sure I am not submitting something that only works for me and breaks for others. This issue needs to be handled with care.

                                        pfSense CE 2.8.0

                                        • L
                                          louis2
                                          last edited by

                                          I did test the new PPPoE this morning trying multiple settings on a simple fiber connection.

                                          It did work for IPv4, but NOT for IPv6. So after a short test trying multiple settings I switched back to the old PPPoE.

                                          For info:

                                          • internet is arriving via vlan6
                                          • PPPoE is listening to that vlan

                                          The settings for the old PPPoE, and presumably also for the new version:

                                          • IPv4: PPPoE
                                          • IPv6: DHCP6
                                          • MTU: (default)
                                          • MSS: 1492
                                          • Use IPv4: YES
                                          • Request only IPv6 prefix: YES

                                          That did not work for IPv6, so I tried SLAAC and the default MSS etc.; that did not work either.

                                          So I switched back after this short test.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            If you can test again try enabling DHCP6 Debug in System > Advanced > Networking.

                                            I have that setup running here without issue. It can take a minute or so to connect depending on what your ISP does.

                                            Do you have “Do not wait for a RA” set in the DHCPv6 client setup?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.