Netgate Discussion Forum

    new if_pppoe Backend - getting HA/CARP to work like in MPD

      perrin @zjamali

      @zjamali great, thanks for figuring that out. I need to check why it's not working with vipref=0.

      Currently I have no idea why this happens. vipref 0 is just the first item in the VIP list, and config_get_path is a standard pfSense function.

      Maybe it has something to do with the way you configured that VIP? Is your first VIP configured differently from the one you were using later?

        zjamali @perrin

        @perrin It's the first VIP configured on both pfSense nodes, so it gets id = 0.

          perrin @zjamali

          @zjamali this is the way the dropdown in the PPPoE-HA GUI looks in my production firewall:

          2115036b-f0fa-4528-a8e0-b08059022544-grafik.png
          in my case VHID 18 is the one that is configured for failover.

          Can you check if yours maybe starts with VHID 0?

            zjamali @perrin

            @perrin Mine also starts with VHID 1, but if you go to Firewall -> Virtual IP, where all VIPs are listed, and hover over the edit button for the first VHID, the URL says id=0.

            id = 0 is the internal system ID of the record, whilst VHID 1 is what the user sees, if I am not mistaken.

              perrin @zjamali

              @zjamali yep, same here. I'll debug why it is not working on the first VIP later.
              Can you temporarily work with a different VIP?

                zjamali @perrin

                @perrin said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                @zjamali yep, same here. I'll debug why it is not working on the first VIP later.
                Can you temporarily work with a different VIP?

                Should be OK. no issue

                  perrin @zjamali

                  @zjamali said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                  @perrin Mine also starts with VHID 1, but if you go to Firewall -> Virtual IP, where all VIPs are listed, and hover over the edit button for the first VHID, the URL says id=0.

                  id = 0 is the internal system ID of the record, whilst VHID 1 is what the user sees, if I am not mistaken.

                  I tried reproducing the behaviour by removing all carp VIPs from my test firewall and adding just a single new one so it gets "id=0". In my test VM the script behaved as expected:

                  Sep 17 07:46:31 	pppoe-ha 	1464 	VHID 1 BACKUP -> DOWN wan (pppoe0)
                  Sep 17 07:46:31 	pppoe-ha 	1464 	Handle CARP command for 1@vtnet0.510 - BACKUP 
                  

                  and

                  Sep 17 07:46:35 	pppoe-ha 	52762 	VHID 1 MASTER -> UP wan (pppoe0)
                  Sep 17 07:46:35 	pppoe-ha 	52762 	Handle CARP command for 1@vtnet0.510 - MASTER 
                  

                  so, there must be some differences in the config between your and my firewall.

                  This is the Virtual IP definition in my config XML:

                  <vip>
                  			<mode>carp</mode>
                  			<interface>opt12</interface>
                  			<vhid>1</vhid>
                  			[remaining data omitted]
                  </vip>
                  

                  so, in my case it starts with vhid = 1 instead of zero. can you confirm this in your installation?

                    zjamali @perrin

                    @perrin

                    The record I am having the issue with is this one:

                    b0fb1eb6-070c-481c-bff5-e607d1f6631a-image.png

                    Check whether your opt12 VIP with VHID 1 shows id = 0 when you hover over the edit button. That is what causes the trouble when it is selected in the mapping. If I use another CARP VIP, reconcile works.

                      perrin @zjamali

                      @zjamali Yep, in my case my VHID18 is ID=0

                      but in my case this VIP also works:

                      Sep 17 08:08:42 	pppoe-ha 	41559 	VHID 18 MASTER -> UP wan (pppoe0)
                      Sep 17 08:08:42 	pppoe-ha 	41559 	Handle CARP command for 18@vtnet0.510 - MASTER
                      Sep 17 08:08:39 	pppoe-ha 	28341 	VHID 18 BACKUP -> DOWN wan (pppoe0)
                      Sep 17 08:08:39 	pppoe-ha 	28341 	Handle CARP command for 18@vtnet0.510 - BACKUP 
                      

                      Can you please confirm that your VIP correctly switches from MASTER to BACKUP?
                      Also your vhid 2 network overlaps with vhid 101 and vhid 3 with 102

                        w0w

                        Hmm...

                        if (empty($row['enabled']) || empty($row['vipref']) || empty($row['iface'])) continue;
                        $vip =
                        

                        empty($row['vipref']) will then exclude vipref=0, won't it? PHP's empty() treats both 0 and "0" as empty.

                        I think this must be changed to

                        if (empty($row['enabled']) || !isset($row['vipref']) || $row['vipref'] === '' || empty($row['iface'])) continue;
                        $vip = $vips[$row['vipref']] ?? null;
                        

                        That is, if we want to be able to use 0 (or any other non-empty value) as a vipref index.

                          perrin @w0w

                          @w0w said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                          empty($row['enabled']) || empty($row['vipref'])

                          you are right. there is a bug in the reconcile_all function.

                          i will fix that.

                            perrin

                            OK, I did a complete refactor of the logic within handle_carp_change and reconcile_all so that both functions now use the same logic. I tested it on my machine and it seems to work for me.
                            I uploaded a new pkg version 0.1.1 to GitHub.

                            Happy to get feedback from your tests.

                              zjamali @perrin

                              @perrin

                              Installed the new package; reconcile now works on VHID 1.

                              12dce092-e863-455c-ac7a-683781dea3bb-image.png

                                w0w @perrin

                                @perrin
                                If the PPPoE interface is already up and the selected VIP is MASTER, this is what I see in the logs when pressing "run reconcile now":

                                2025-09-19 19:04:02.698352+03:00 	rc.gateway_alarm 	94407 	>>> Gateway alarm: WAN_PPPOE (Addr:212xxxx Alarm:1 RTT:.537ms RTTsd:.089ms Loss:22%)
                                2025-09-19 19:03:53.092946+03:00 	kernel 	- 	pppoe0: link state changed to UP
                                2025-09-19 19:03:51.202925+03:00 	kernel 	- 	Limiting ICMPv6 destination unreachable output from 116 to 103 packets/sec
                                2025-09-19 19:03:50.152914+03:00 	kernel 	- 	Limiting ICMPv6 destination unreachable output from 111 to 98 packets/sec
                                2025-09-19 19:03:49.102945+03:00 	kernel 	- 	Limiting ICMPv6 destination unreachable output from 106 to 100 packets/sec
                                2025-09-19 19:03:48.052233+03:00 	kernel 	- 	pppoe0: link state changed to DOWN
                                2025-09-19 19:03:47.887051+03:00 	php-fpm 	4863 	/rc.interfaces_wan_configure: calling interface_dhcpv6_configure.
                                2025-09-19 19:03:47.846816+03:00 	kernel 	- 	pppoe0: link state changed to DOWN
                                2025-09-19 19:03:47.842883+03:00 	kernel 	- 	if_pppoe: pppoe0: failed to clear IP address: 49
                                2025-09-19 19:03:46.631320+03:00 	check_reload_status 	680 	Configuring interface wan
                                2025-09-19 19:03:46.626441+03:00 	pppoe-ha 	84394 	VHID 5 MASTER - UP wan (pppoe0)
                                2025-09-19 19:03:46.600614+03:00 	pppoe-ha 	84394 	Reconcile: evaluating 1 mapping(s) 
                                

                                I am not sure whether it is really necessary to break the connection at all? This ended up with
                                [screenshot: b2e798d2-e0fa-4fa7-8e69-818228197050-image.png] for both IPv4 and IPv6.

                                Overall, I can't say it's stable for me, and I'm not sure why. I also need to fix another bug. It looks like something is preventing the VIPs from starting when the firewall boots. That's why I used

                                 $PHP_BIN -r 'require_once "/etc/inc/interfaces.inc"; interfaces_vips_configure();'
                                
                                  perrin @w0w

                                  @w0w said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                                  I am not sure whether it is really necessary to break the connection at all?

                                  On a reconcile, all interfaces are forced into the state they should be in. For the UP command the script currently does an 'interface reload' (/usr/local/sbin/pfSctl -c 'interface reload <interface>'), whereas for the DOWN command it does an ifconfig <interface> down.

                                  My tests showed that using 'ifconfig <interface> up' as the UP command would not reliably connect the PPPoE interface in every case. That is why I opted to use the reload command. The downside is that it does break the connection. To fix this behaviour in reconcile, I could add a check that skips the reload if the interface is already connected. I might add that.
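
The state-to-command mapping described above can be sketched roughly as follows. This is only an illustration, not code from the package: carp_action is a hypothetical helper name, the split into a pfSense interface name (wan) and a device name (pppoe0) is an assumption based on the log lines in this thread, and the function prints the command instead of running it so the sketch can be tried on any machine.

```shell
#!/bin/sh
# Hypothetical helper (not from the pppoe-ha package): print the command the
# thread describes for a given CARP state. Nothing is executed here.
carp_action() {
    state="$1"      # CARP state: MASTER or BACKUP
    friendly="$2"   # pfSense interface name, e.g. wan (assumed naming)
    realif="$3"     # underlying device, e.g. pppoe0 (assumed naming)
    case "$state" in
        MASTER)
            # UP path: a full interface reload, as quoted in the thread
            printf '%s\n' "/usr/local/sbin/pfSctl -c 'interface reload ${friendly}'"
            ;;
        BACKUP)
            # DOWN path: plain ifconfig down
            printf '%s\n' "ifconfig ${realif} down"
            ;;
        *)
            printf 'unknown CARP state: %s\n' "$state" >&2
            return 1
            ;;
    esac
}
```

Calling carp_action MASTER wan pppoe0 prints the reload command, and carp_action BACKUP wan pppoe0 prints the ifconfig down command; any other state is rejected with a non-zero exit status.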

                                  @w0w said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                                  Overall, I can't say it's stable for me, and I'm not sure why. I also need to fix another bug. It looks like something is preventing the VIPs from starting when the firewall boots.

                                  Great to hear that it is stable! Regarding the VIPs not coming up: I have no idea where this is coming from; it probably has nothing to do with my script. On both of my firewalls the VIPs come up as expected, but I am only using VIPs as part of CARP.

                                    w0w @perrin

                                    @perrin said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                                    Great to hear that it is stable!

                                    Unfortunately I have to clarify: no, your variant is not stable on my systems. The bug described above, on an already running system with PPPoE up, seems to start as soon as your script initializes, and the link starts losing packets; I don't even need to press the button. Also, please look at the "Enable" checkbox and try adding several interfaces. For me it sometimes disappears entirely and sometimes moves around.

                                      perrin @w0w

                                      @w0w I created a new version 0.1.2 of the package, which now checks whether a PPPoE interface is up (IP address present). If it is, and the desired state of that interface is also up, the script will not reload the interface, so it should not break any legitimate connection.
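
A rough sketch of that kind of check, assuming "up" simply means that ifconfig reports an IPv4 address on the interface. The names has_ipv4 and reconcile_decision are hypothetical and not taken from the package; the ifconfig output is passed in as text so the decision logic can be tried without a PPPoE link.

```shell
#!/bin/sh
# Hypothetical helpers (not from the pppoe-ha package).

# has_ipv4: reads ifconfig-style output on stdin and succeeds if it contains
# an "inet" address line ("inet6" lines do not count).
has_ipv4() {
    grep -Eq '^[[:space:]]*inet[[:space:]]'
}

# reconcile_decision DESIRED IFCONFIG_OUTPUT
# Prints "skip" (already connected), "reload" (should be up but is not),
# or "down" (desired state is down).
reconcile_decision() {
    desired="$1"
    ifconfig_out="$2"
    if [ "$desired" = "up" ]; then
        if printf '%s\n' "$ifconfig_out" | has_ipv4; then
            echo skip
        else
            echo reload
        fi
    else
        echo down
    fi
}
```

On a live system the second argument could be filled with "$(ifconfig pppoe0)". An interface that only has an IPv6 link-local address still yields "reload" in this sketch, which may or may not match what the package does.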

                                      Regarding your issue with the GUI: I can't confirm the bug. I can add as many interfaces as I want and the GUI stays consistent. I am using the pfSense rowhelper in the GUI, so that is more or less standard functionality. Can you give me some more details on when it fails? Also, which browser are you using, and do you have any plugins in use that mangle the HTML of a page (e.g. adblock)?

                                        w0w @perrin

                                        @perrin said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                                        regarding your issue with the GUI: I can't confirm the bug.

                                        I can't replicate it on the new version. I will test its functionality soon, thanks!

                                          w0w

                                          Tested, no luck. WAN stays Pending and only acquires IPv4; IPv6 never comes up. It seems you can't just use ifconfig pppoe0 up; you need to run something like:
                                          /usr/local/sbin/pfSctl -c 'interface reload wan'
                                          to bring it up correctly.

                                          Here's my script's logic; see the "Bringing PPPoE up" and "Verifying PPPoE" items below.

                                          • Purpose & scope

                                            • A CARP-aware PPPoE watchdog for pfSense: it tracks the node’s CARP role (MASTER/BACKUP) and reacts by starting/validating or stopping PPPoE, and (on BACKUP) restoring missing VIPs (VHID 5).
                                            • All syslog lines use a uniform prefix PPPoE_script_message00X: with ascending IDs.
                                          • Tunables / constants

                                            • LOCKFILE=/var/run/run.sh.lock — stores PID (line 1) and last known CARP role (line 2).
                                            • PPPOE_IF=pppoe0, LAN_IF=lagg2.
                                            • Role detection is by grepping ifconfig ${LAN_IF} for MASTER vhid 5; VIP presence check greps for vhid 5.
                                            • PHP_BIN=/usr/local/bin/php.
                                            • Internal flag PPPOE_ALREADY_STARTED tracks whether the script has (re)started PPPoE in this run.
                                          • Singleton guard

                                            • On start, check_already_running() reads the lockfile; if the recorded PID is alive (ps -p), the script exits to avoid multiple instances.
                                          • Optional discovery

                                            • find_pppoe_info() grabs the first pppoeN interface and its IPv4 address (kept for parity with older versions; not strictly required elsewhere).
                                          • Main loop (role monitor)

                                            • start_monitoring():

                                              • Logs launch (001) and initializes CUR_ROLE from the lockfile if present.

                                              • Every 30 seconds:

                                                • Derives NEW_ROLE from ifconfig ${LAN_IF} (MASTER vhid 5 → MASTER; otherwise BACKUP).

                                                • If role unchanged → continue silently (no log spam).

                                                • If role changed:

                                                  • On MASTER: call handle_master_carp(); log 002.
                                                  • On BACKUP: call handle_non_master_carp(); log 003.
                                                • Update lockfile with current PID and the new role.

                                          • MASTER path (handle_master_carp)

                                            • If PPPoE hasn’t been (re)started in this run, call handle_pppoe_start().
                                            • Otherwise, log/verify link (004) via check_pppoe().
                                          • BACKUP path (handle_non_master_carp)

                                            1. Shut PPPoE down if any pppoeN exists:

                                              • Wait 10s, ifconfig ${PPPOE_IF} down, set PPPOE_ALREADY_STARTED=false, log 005.
                                            2. Ensure VIPs (VHID 5) exist on LAN_IF:

                                              • If vhid 5 is missing, log 006 and re-install VIPs by running:

                                                • php -r 'require_once "/etc/inc/interfaces.inc"; interfaces_vips_configure();'
                                          • Bringing PPPoE up (handle_pppoe_start)

                                            • Wait 130s to let CARP converge.

                                            • If no pppoeN exists:

                                              • Log 007, run pfSctl -c 'interface reload wan', set PPPOE_ALREADY_STARTED=true.
                                            • If pppoe0 exists and is UP:

                                              • Log 008, do nothing.
                                            • If it exists but is not UP:

                                              • ifconfig ${PPPOE_IF} up, log 009.
                                          • Verifying PPPoE (check_pppoe)

                                            • Wait 180s (grace period).

                                            • If no pppoeN is present:

                                              • Log 010, try pfSctl -c 'interface reload wan'.
                                              • On success: set PPPOE_ALREADY_STARTED=true; on failure log 011 and return error.
                                          • Logging policy

                                            • Routine role polls are quiet; logs emit only when the CARP role flips, plus the specific action logs (004–011) triggered by that transition.
                                          • Entry points

                                            • start: runs singleton check, optional discovery, then the monitoring loop.
                                            • stop: placeholder (prints a message; no teardown).
                                            • Otherwise: prints usage and exits non-zero.
                                          • Operational timings (summary)

                                            • Poll interval: 30s.
                                            • MASTER bring-up grace: 130s (CARP settle).
                                            • PPPoE verification grace: 180s.
                                            • BACKUP PPPoE down delay: 10s.
                                          • Key side effects

                                            • Keeps a persistent record of PID + last role in the lockfile.
                                            • Ensures PPPoE is up and stable on MASTER, down on BACKUP.
                                            • Auto-repairs missing VIPs (VHID 5) on BACKUP via pfSense PHP API.
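
The role detection and the "act only on a role change" step from the outline above could look roughly like this. It is a sketch under stated assumptions, not the actual script: detect_role and next_handler are hypothetical names, the handler names are the ones listed in the outline, and the ifconfig output is read from stdin so the logic can be exercised without a CARP setup. The pattern is slightly stricter than a plain grep for MASTER vhid 5 so that, for example, vhid 50 does not match.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the outline above.

# detect_role: reads ifconfig output for the LAN interface on stdin and prints
# MASTER if a "MASTER vhid 5" CARP line is present, otherwise BACKUP.
detect_role() {
    if grep -Eq 'MASTER vhid 5([^0-9]|$)'; then
        echo MASTER
    else
        echo BACKUP
    fi
}

# next_handler CUR_ROLE NEW_ROLE
# Prints the name of the handler to call on a role change and prints nothing
# when the role is unchanged (the quiet path of the polling loop).
next_handler() {
    cur_role="$1"
    new_role="$2"
    if [ "$cur_role" = "$new_role" ]; then
        return 0
    fi
    if [ "$new_role" = "MASTER" ]; then
        echo handle_master_carp
    else
        echo handle_non_master_carp
    fi
}
```

In a polling loop this would be used as NEW_ROLE=$(ifconfig lagg2 | detect_role), followed by next_handler "$CUR_ROLE" "$NEW_ROLE" every 30 seconds, with the lockfile updated after each role change.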

                                            perrin @w0w

                                            @w0w said in new if_pppoe Backend - getting HA/CARP to work like in MPD:

                                            /usr/local/sbin/pfSctl -c 'interface reload wan'

                                            this is exactly what the script is doing. See GitHub.

                                            Please note that the IPv6 issue has nothing to do with my script; it seems to be related to the general if_pppoe troubles.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.