Netgate Discussion Forum

    New PPPoE backend, some feedback

    222 Posts 18 Posters 31.7k Views
    • w0wW
      w0w @stephenw10
      last edited by w0w

      @stephenw10 said in New PPPoE backend, some feedback:

      the ppp linkup scripts are neither php nor bash; they use the FreeBSD standard shell, sh.
      

      Oh, well, missed that…

      @stephenw10 said in New PPPoE backend, some feedback:

      can't replicate the LAGG issue. It works fine for me in a test setup
      

      Hmm... Last time I tried it, the LAGG interface was just missing from the list.
      Now that you said you can't replicate it, I went to the GUI and tried to recreate the LAGG. I re-created it, pressed Save, the page loaded and then stopped — so I pressed Save again, just to get a "problem loading page" in FF, and found that the firewall had just crashed.

      amd64
      15.0-CURRENT
      FreeBSD 15.0-CURRENT #0 plus-RELENG_25_03-n256448-5d69d8519d49: Tue Feb  4 00:57:41 UTC 2025     root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/obj/amd64/DugkeSvO/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/sources/FreeB
      
      Crash report details:
      
      No PHP errors found.
      
      Filename: /var/crash/info.0
      Dump header from device: /dev/ada0p3
        Architecture: amd64
        Architecture Version: 4
        Dump Length: 456704
        Blocksize: 512
        Compression: none
        Dumptime: 2025-04-07 17:43:48 +0300
        Hostname: c_primary.ccccc
        Magic: FreeBSD Text Dump
        Version String: FreeBSD 15.0-CURRENT #0 plus-RELENG_25_03-n256483-08e0bace8aeb: Thu Mar  6 02:18:06 UTC 2025
          root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/obj/amd64/lpwib8GT/var/
        Panic String: page fault
        Dump Parity: 3854542699
        Bounds: 0
        Dump Status: good
      	
      db:0:kdb.enter.default>  run pfs
      db:1:pfs> bt
      Tracing pid 12 tid 100069 td 0xfffff80002c1f740
      kdb_enter() at kdb_enter+0x33/frame 0xfffffe00d71678e0
      panic() at panic+0x43/frame 0xfffffe00d7167940
      trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00d71679a0
      trap_pfault() at trap_pfault+0x46/frame 0xfffffe00d71679f0
      calltrap() at calltrap+0x8/frame 0xfffffe00d71679f0
      --- trap 0xc, rip = 0xffffffff80e50987, rsp = 0xfffffe00d7167ac8, rbp = 0xfffffe00d7167b50 ---
      lagg_port_output() at lagg_port_output+0x7/frame 0xfffffe00d7167b50
      pppoe_start() at pppoe_start+0xc2/frame 0xfffffe00d7167bc0
      sppp_output() at sppp_output+0x290/frame 0xfffffe00d7167c10
      ip6_forward() at ip6_forward+0x736/frame 0xfffffe00d7167d10
      ip6_input() at ip6_input+0xa5c/frame 0xfffffe00d7167df0
      swi_net() at swi_net+0x128/frame 0xfffffe00d7167e60
      ithread_loop() at ithread_loop+0x239/frame 0xfffffe00d7167ef0
      fork_exit() at fork_exit+0x7b/frame 0xfffffe00d7167f30
      fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00d7167f30
      --- trap 0x214cd131, rip = 0xc0c148f88948c701, rsp = 0xf18948c8314819c0, rbp = 0x4cf12148d90948c1 ---
      db:1:pfs>  show registers
      cs                        0x20
      ds                        0x3b
      es                        0x3b
      fs                        0x13
      gs                        0x1b
      ss                        0x28
      rax                       0x12
      rcx         0x98ba0a0988e4b5aa
      rdx         0xfffffe00d7167400
      rbx                      0x100
      rsp         0xfffffe00d71677b8
      rbp         0xfffffe00d71678e0
      rsi         0xfffffe00d7167670
      rdi         0xffffffff82741bf8  vt_conswindow+0x10
      r8                        0x30
      r9                        0x30
      r10                          0
      r11                          0
      r12                          0
      r13                          0
      r14         0xffffffff81468998
      r15         0xfffff80002c1f740
      rip         0xffffffff80d4e3d3  kdb_enter+0x33
      rflags                    0x86
      kdb_enter+0x33: movq    $0,0x1d70132(%rip)
      db:1:pfs>  show pcpu
      cpuid        = 4
      dynamic pcpu = 0xfffffe009b59c540
      curthread    = 0xfffff80002c1f740: pid 12 tid 100069 critnest 1 "swi1: netisr 3"
      curpcb       = 0xfffff80002c1fc60
      fpcurthread  = none
      idlethread   = 0xfffff800025ef740: tid 100007 "idle: cpu4"
      self         = 0xffffffff83a14000
      curpmap      = 0xffffffff82a62770
      tssp         = 0xffffffff83a14384
      rsp0         = 0xfffffe00d7168000
      kcr3         = 0x80000000c57ed002
      ucr3         = 0xffffffffffffffff
      scr3         = 0x2ed483ae3
      gs32p        = 0xffffffff83a14404
      ldt          = 0xffffffff83a14444
      tss          = 0xffffffff83a14434
      curvnet      = 0xfffff80001237040
      db:1:pfs>  run lockinfo
      db:2:lockinfo> show locks
      No such command; use "help" to list available commands
      db:2:lockinfo>  show alllocks
      No such command; use "help" to list available commands
      db:2:lockinfo>  show lockedvnods
      Locked vnodes
      db:1:pfs>  acttrace
      
      
      

      Looks like I’m lucky… again. This must be related to PPPoE being enabled on one of the interfaces I tried to assign to the LAGG. So I guess it's OK :-)

      Anyway... Now, after trial and error, I managed to create the LAGG again, went to Interfaces → PPPs to select my LAGG as parent interface for PPPoE, and — there are no LAGG interfaces at all. It only shows VIPs, VLANs, and no LAGGs.
      I really don't know what exactly I am “doing wrong” this time.

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Hmm, so that crash was due to having pppoe already running on an interface that you added to a new lagg?

        Still, that should not be possible.

        There does seem to be a 'quirk' here. As you say, LAGGs are excluded from the PPPoE parent interfaces list, but if you create the LAGG, assign it, and then change the IPv4 type to PPPoE, it will allow it.

        But that doesn't seem to be a recent regression.

        • w0wW
          w0w @stephenw10
          last edited by w0w

          @Phil2025 said in New PPPoE backend, some feedback:

          I hope they aren't releasing this soon, as PPPoE has regressed: it's slower to connect (old one and new), and the new if_pppoe doesn't support everything, as you and I have found. If someone has traffic shaping enabled and the new if_pppoe becomes the default, then people are going to find themselves unable to connect back to their ISP after upgrading, until all traffic shaping rules are removed. There is no mention of this caveat in the BETA release notes. Also, I want traffic shaping to avoid bufferbloat and to give VoIP priority.

          I’m sure they’ll fix most of the bugs before the release, or at least MPD will work the same way as before. It’s absolutely fine to have something broken at the beta stage, especially when such a major change is taking place.

          @stephenw10 said in New PPPoE backend, some feedback:

          Hmm, so that crash was due to having pppoe already running on an interface that you added to a new lagg?

          I just tested it again. Yes, it is possible to try to create a LAGG even if PPPoE is enabled on one of the interfaces that the LAGG consists of.

          @stephenw10 said in New PPPoE backend, some feedback:

          There does seem to be a 'quirk' here. As you say, LAGGs are excluded from the PPPoE parent interfaces list, but if you create the LAGG, assign it, and then change the IPv4 type to PPPoE, it will allow it.

          OK, just did it. I had simply forgotten about this 'quirk'. It works, yes.

          Now I remembered how the original configuration was set up — the one that works with the old backend using MPD, but doesn’t work with the new one.
          I should probably mention that this is that same “unsupported” CARP + PPPoE configuration that was once posted by someone on this forum. The idea is that it automatically brings up PPPoE on whichever firewall is currently the CARP master.
          I created a LAGG consisting of two ports from the same NIC — ixl0 and ixl1. Then I assigned it to an interface named WAN_ISP, gave it a static IP address of 10.0.110.2, and created a corresponding VIP 10.0.110.1. On the second firewall, the setup is roughly the same, except the WAN_ISP interface address is 10.0.110.3, accordingly.

          This setup “somehow works” with the MPD-based configuration but does not work with the new PPPoE stack; I just get:

          /interfaces_ppps_edit.php: Error configuring PPPoE interface pppoe0
          

          Maybe this setup should just be scrapped and forgotten altogether, as I’m not even sure it works properly or as intended.

          Still, the question remains open — why doesn’t it work with the virtual IP assigned to the LAGG, but does work when using the LAGG directly?

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by stephenw10

            Ah, yes, PPP connections are indeed not supported in HA setups. But as you say, it can be made to work (ish). What is the parent for the PPPoE there then? The CARP VIP? I don't think that's possible. 🤔

            • w0wW
              w0w @stephenw10
              last edited by

              @stephenw10 said in New PPPoE backend, some feedback:

              What is the parent for the PPPoE there then? The CARP VIP? I don't think that's possible.

              Yes, it's a CARP VIP. I think I'll just get rid of it.

              • w0wW
                w0w
                last edited by w0w

                I can't remember or find that thread, but I think someone already asked about this...
                Where exactly does the new PPPoE backend write the connection log?

                Status/System Logs/PPP contains only old mpd records.

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Hmm, well it looks like you can (or could) actually set a CARP VIP as a PPPoE parent. Which seems illogical but....

                  And I assume you can't with if_pppoe because that's not a physical interface....

                  There isn't anything like the same logging that mpd gives. Yet. I would run a pcap on the parent NIC and see what's actually happening. I would think it has to send from the CARP MAC, since it clearly doesn't use the actual VIP IP.
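
For the capture suggested above, here is a minimal sh sketch. It assumes the parent is `lagg0`, which is only a placeholder for the real parent NIC. `pppoed` and `pppoes` are the standard pcap filter primitives for PPPoE discovery and session frames, and `-e` prints the Ethernet header so the source MAC is visible. The helper name `pppoe_capture_cmd` is made up for illustration; it only prints the command so it can be reviewed before being run as root.

```sh
# Print a tcpdump command that captures PPPoE frames on a parent interface.
# "lagg0" is only a placeholder default; pass the real parent NIC name.
pppoe_capture_cmd() {
    ifname=${1:-lagg0}
    # -e: show source/destination MACs, -n: no name resolution
    # pppoed = PPPoE discovery frames, pppoes = PPPoE session frames
    printf "tcpdump -e -n -i %s 'pppoed or pppoes'\n" "$ifname"
}

pppoe_capture_cmd lagg0
```

If nothing at all shows up while the PPPoE interface is being configured, the backend is probably not getting as far as sending discovery frames on that interface.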

                  • w0wW
                    w0w @stephenw10
                    last edited by w0w

                    @stephenw10 said in New PPPoE backend, some feedback:

                    There isn't anything like the same logging that mpd gives. Yet. I would run a pcap on the parent NIC and see what's actually happening. I would think it has to send from the CARP MAC, since it clearly doesn't use the actual VIP IP.

                    Are you talking about the mpd backend or the new one? On the new one, when selecting the CARP VIP, the pcap on the parent interface naturally shows nothing — the new backend simply can't configure itself properly and doesn't start at all.

                    Interesting. I switched back to mpd, leaving the settings with the VIP that were configured for the new backend — and now PPPoE doesn't want to work even with mpd. Something is definitely wrong with the configuration conversion between the two backends.

                    In the log, it also looks like it's connecting through the wrong interface:

                    2025-04-08 18:55:45.407053+03:00 	ppp 	56619 	[wan] Bundle: Interface ng0 created
                    2025-04-08 18:55:45.406382+03:00 	ppp 	56619 	web: web is not running
                    2025-04-08 18:55:44.495307+03:00 	ppp 	36089 	process 36089 terminated
                    2025-04-08 18:55:44.446476+03:00 	ppp 	36089 	[wan] Bundle: Shutdown
                    2025-04-08 18:55:44.403502+03:00 	ppp 	56619 	waiting for process 36089 to die...
                    2025-04-08 18:55:43.401537+03:00 	ppp 	56619 	waiting for process 36089 to die...
                    2025-04-08 18:55:42.400289+03:00 	ppp 	36089 	[wan] IPV6CP: Close event
                    2025-04-08 18:55:42.400259+03:00 	ppp 	36089 	[wan] IPCP: Close event
                    2025-04-08 18:55:42.400219+03:00 	ppp 	36089 	[wan] IFACE: Close event
                    2025-04-08 18:55:42.400117+03:00 	ppp 	36089 	caught fatal signal TERM
                    2025-04-08 18:55:42.399979+03:00 	ppp 	56619 	waiting for process 36089 to die...
                    2025-04-08 18:55:42.399687+03:00 	ppp 	56619 	process 56619 started, version 5.9
                    2025-04-08 18:55:42.399135+03:00 	ppp 	56619 	Multi-link PPP daemon for FreeBSD
                    2025-04-08 18:54:43.826132+03:00 	ppp 	36089 	[wan] Bundle: Interface ng0 created
                    

                    but there is nothing on the LAGG.
                    OK, next step...
                    I booted into the previous snapshot from February, launched PPPoE and pcap there —
                    Here’s an example of one of the packets:
                    [Image: screenshot of one captured packet]

                    b4:96:91:c9:77:84 is just the MAC of ixl0, the active Ethernet port in the LAGG (FAILOVER) that I used for the CARP VIP.

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Hmm, well, I can certainly see why that might fail. Setting it on a VIP really makes no sense for an L2 protocol. It seems like it worked 'by accident'. I'm not sure that will ever work with if_pppoe. I'll see if Kristof has any other opinion...

                      • K
                        kprovost @stephenw10
                        last edited by

                        @stephenw10 I don't see how setting a carp IP on a PPPoE interface would make sense, no.

                        It doesn't make sense on the underlying Ethernet device (because it's not expected to have an address assigned at all), and also doesn't make sense on the PPPoE device itself, because there's no way to do the ARP dance that makes carp work.

                        • w0wW
                          w0w @kprovost
                          last edited by

                          @kprovost said in New PPPoE backend, some feedback:

                          It doesn't make sense on the underlying Ethernet device (because it's not expected to have an address assigned at all), and also doesn't make sense on the PPPoE device itself, because there's no way to do the ARP dance that makes carp work.

                          The whole point is to use the status of the parent interface to bring up the PPPoE interface. To determine the status of the parent (underlying) interface, the CARP VIP on the parent interface is exactly what's needed — to identify which node is the master and where to bring up PPPoE. Honestly, I have no idea why it even worked before. But if it's not supposed to work and never will, then of course I won't insist on this approach :)

                          Ideally, there would simply be a feature to bring up the PPPoE WAN session only if the firewall is the MASTER.
                          I doubt I'm the only one whose ISP doesn't appreciate users trying to initiate more than one PPPoE session.
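
The "only when MASTER" idea can be sketched in sh as a check on the CARP state of the parent interface. This is an illustration, not an existing pfSense feature. It assumes FreeBSD's `ifconfig` prints a line of the form `carp: MASTER vhid ...` for the parent. The function names and the `connect`/`disconnect` words are hypothetical, and how the PPPoE session would actually be started or stopped is deliberately left out.

```sh
# Read `ifconfig <parent>` output on stdin and print the first CARP state
# found (MASTER, BACKUP or INIT); print nothing if there is no CARP line.
carp_state() {
    awk '$1 == "carp:" { print $2; exit }'
}

# Decide what to do with the PPPoE session based on the CARP state.
# The printed words are placeholders for whatever really starts/stops PPPoE.
pppoe_action() {
    case "$(carp_state)" in
        MASTER) echo "connect" ;;
        BACKUP|INIT) echo "disconnect" ;;
        *) echo "unknown" ;;
    esac
}

# Hypothetical usage on a firewall: ifconfig lagg0 | pppoe_action
```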

                          • D
                            dsl.ottawa
                            last edited by

                            I've recently upgraded to the latest 2.8 beta and switched to the new PPPoE backend.

                            I really didn't have any issues with the previous one other than performance. I recently upgraded to 3gig fiber and have been struggling to get full speed when using PPPoE on the pfSense box.

                            I have found no difference between the old and the new backend. Performance still seems to be the same. The odd thing is that I get full speed on the upload, but only about half to two thirds on the download, i.e. I get 3000-3200Mbps upload, but download is usually around 1700-1900Mbps.

                            I've tried it with an Intel X520 card and an X710 card. No difference.

                            What I have noticed, and I'm not sure if this is the reason for the performance hit, is that on the upload (tx) side it seems to use all the queues available to it, but on the rx side it only uses the first queue. I tried tweaking the queues on the X710 and it didn't make any difference.

                            Here's an example

                            [2.8.0-BETA][root@router]/root: sysctl -a | grep '.ixl..*xq0' | grep packets
                            dev.ixl.0.pf.txq07.packets: 2550054
                            dev.ixl.0.pf.txq06.packets: 2444906
                            dev.ixl.0.pf.txq05.packets: 542271
                            dev.ixl.0.pf.txq04.packets: 781264
                            dev.ixl.0.pf.txq03.packets: 2216896
                            dev.ixl.0.pf.txq02.packets: 2738515
                            dev.ixl.0.pf.txq01.packets: 5394
                            dev.ixl.0.pf.txq00.packets: 8645
                            dev.ixl.0.pf.rxq07.packets: 0
                            dev.ixl.0.pf.rxq06.packets: 0
                            dev.ixl.0.pf.rxq05.packets: 0
                            dev.ixl.0.pf.rxq04.packets: 0
                            dev.ixl.0.pf.rxq03.packets: 0
                            dev.ixl.0.pf.rxq02.packets: 0
                            dev.ixl.0.pf.rxq01.packets: 0
                            dev.ixl.0.pf.rxq00.packets: 6262688
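
A small sh helper can condense counters like these into one line per direction. This is a sketch: the function name `summarize_queues` is made up, and it assumes the `dev.ixl.N.pf.{rx,tx}qNN.packets: COUNT` line format shown above. For the counters posted here it would report all eight tx queues as active but only one rx queue.

```sh
# Summarize per-queue packet counters read from stdin, e.g. the output of
# `sysctl dev.ixl.0`. Prints one line per direction: number of queues,
# number of queues that have seen traffic, and total packets.
summarize_queues() {
    awk -F': ' '
        /\.[rt]xq[0-9]+\.packets: / {
            dir = ($1 ~ /\.rxq[0-9]+\.packets$/) ? "rx" : "tx"
            queues[dir]++
            total[dir] += $2
            if ($2 + 0 > 0) active[dir]++
        }
        END {
            for (d in queues)
                printf "%s: queues=%d active=%d packets=%d\n", d, queues[d], active[d], total[d]
        }
    ' | sort
}

# Usage on the firewall: sysctl dev.ixl.0 | summarize_queues
```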

                            At the moment I have this in my loader.conf.local file:

                            net.tcp.tso="0"
                            net.inet.tcp.lro="0"
                            hw.ixl.flow_control="0"
                            hw.ix1.num_queues="8"
                            dev.ixl.0.iflib.override_qs_enable=1
                            dev.ixl.0.iflib.override_nrxqs=8
                            dev.ixl.0.iflib.override_ntxqs=8
                            dev.ixl.1.iflib.override_qs_enable=1
                            dev.ixl.1.iflib.override_nrxqs=8
                            dev.ixl.1.iflib.override_ntxqs=8
                            dev.ixl.0.iflib.override_nrxds=4096
                            dev.ixl.0.iflib.override_ntxds=4096
                            dev.ixl.1.iflib.override_nrxds=4096
                            dev.ixl.1.iflib.override_ntxds=4096

                            If I could fix this issue the rest seems to be rock solid.

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              How are you testing? What hardware are you running?

                              The upload speed is also unchanged from mpd5?

                              • D
                                dsl.ottawa @stephenw10
                                last edited by

                                @stephenw10

                                I'm running a Supermicro E300-8D with a Xeon D-1518 CPU and 16 GB of RAM.
                                The onboard 10gig NICs are LAGG'd into my LAN, and the add-on slot currently holds an X710, though initially I had tried an X520.

                                The switch from mpd5 made very little if any difference. Maybe 100Mbps, if that.

                                I'm testing from my PC, which has 10gig fiber into my core switch; the pfSense box is then fed by the 10gig LAGG group mentioned above.

                                When I run the speed test right from the modem itself using the provider interface, it comes back with 3.2gig up and down every time. When running it from my PC, I'm using the provider's speed test site. Upload is always 3-3.2gig, but the download always falls short.

                                The only difference I saw going from mpd5 to the new one is that CPU usage dropped. Previously on a test I'd see upwards of 60% CPU; now it's 38-43%: 39% on the download and upwards of 43% on the upload tests.

                                The only thing I haven't tried that I can think of is to remove the PPPoE from pfSense and just set it up as DHCP to the provider modem, to see if I get the full speed all the way through.

                                The part I find odd is that if it were the PPPoE, I wouldn't expect to get full speed on the upload. That's why I started looking at other things, like the queues, to see if I could find something off.

                                • stephenw10S
                                  stephenw10 Netgate Administrator
                                  last edited by stephenw10

                                  Does that speedtest use multiple connections?

                                  The reason you see a difference between up and down is that when you're downloading Receive Side Scaling applies to the PPPoE directly and that is what limits it.

                                  However if_pppoe is RSS enabled so should be able to spread the load across the queues/cores much better. But only if there are multiple streams to spread.

                                  And just to be clear the WAN here was either the X520 or X710 NIC?
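
The point about needing multiple streams can be illustrated with a toy model in sh. This is not how if_pppoe or the NIC is implemented: `cksum` merely stands in for the real hash (NICs typically use a Toeplitz hash), the addresses are documentation examples, and `queue_for_flow` is a made-up name. It only shows that a given flow always maps to the same queue, so a single-stream test cannot be spread over several queues or cores.

```sh
# Toy model only: map a flow (src addr, src port, dst addr, dst port) to one
# of N queues. cksum is a stand-in for the hash a real NIC or stack would use.
queue_for_flow() {
    nqueues=$1
    shift
    hash=$(printf '%s ' "$@" | cksum | cut -d' ' -f1)
    echo $(( hash % nqueues ))
}

# One stream always maps to the same queue, however fast it is:
queue_for_flow 8 203.0.113.10 443 198.51.100.7 50000
queue_for_flow 8 203.0.113.10 443 198.51.100.7 50000

# Several streams (different ports here) may land on different queues:
for port in 50001 50002 50003 50004; do
    queue_for_flow 8 203.0.113.10 443 198.51.100.7 "$port"
done
```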

                                  • w0wW
                                    w0w @dsl.ottawa
                                    last edited by

                                    @dsl-ottawa
                                    Did you remove the net.isr.dispatch=deferred? See PPPoE with multi-queue NICs

                                    • M
                                      mr_nets @w0w
                                      last edited by mr_nets

                                      @w0w said in New PPPoE backend, some feedback:

                                      @dsl-ottawa
                                      Did you remove the net.isr.dispatch=deferred? See PPPoE with multi-queue NICs

                                      @w0w I did remove this value. Shouldn't I have? I didn't see any change without it.

                                      • w0wW
                                        w0w @mr_nets
                                        last edited by

                                        @mr_nets said in New PPPoE backend, some feedback:

                                        @w0w I did remove this value. Shouldn't I have? I didn't see any change without it.

                                        AFAIK it should be removed on the new backend.
                                        Do you have any other tunables enabled: flow control, TCP segmentation offload, LRO?
                                        Just guessing... I have never seen anything over 1Gig running PPPoE...

                                        • M
                                          mr_nets @w0w
                                          last edited by

                                          @w0w said in New PPPoE backend, some feedback:

                                          @mr_nets said in New PPPoE backend, some feedback:

                                          @w0w I did remove this value. Shouldn't I have? I didn't see any change without it.

                                          AFAIK it should be removed on the new backend.
                                          Do you have any other tunables enabled: flow control, TCP segmentation offload, LRO?
                                          Just guessing... I have never seen anything over 1Gig running PPPoE...

                                          Fine, all offload settings are disabled as well. The only thing I didn't remove is jumbo frames on PPPoE (MTU 1500), since my ISP supports that.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            What CPU usage are you seeing when you test? What about per-core usage? Is one core still pegged at 100%?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.