Netgate Discussion Forum

    FRR BGP over IPsec , when HA happens (slave-> master, master ->slave)

    HA/CARP/VIPs
    32 Posts 3 Posters 2.4k Views
      vinns

      Hi everyone,

      I have a cluster of two pfSense nodes, with CARP set up on the LAN and WAN interfaces. On top of this cluster I run BGP over an IPsec tunnel to AWS. Everything looks fine until a failover happens and the slave becomes master (or vice versa): the BGP session gets stuck. I have to restart FRR manually before the secondary node (now master after the failover) can connect.

      Any idea what I can change in this setup so that BGP at least connects on the second node, without first having to kill that BGP connection remotely?
      I tried setting up OSPF, but either I have the wrong idea of what it does or it is not the right tool for this, because it did not improve anything.

      Any suggestion is welcome,

      cheers
      V.

        michmoor LAYER 8 Rebel Alliance @vinns

        @vinns

        Pretty sure this matches up to your issue.
        https://redmine.pfsense.org/issues/14633

        It boils down to: A) FRR upstream doesn't have an option enabled, and B) once upstream enables that option, Netgate would need to provide the functionality for FRR to work in a high availability setup.

        The request here has pretty much stalled.
        https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276534

        Unless the maintainer of the FreeBSD package modifies this, FRR with high availability is non-existent.

        Firewall: NetGate,Palo Alto-VM,Juniper SRX
        Routing: Juniper, Arista, Cisco
        Switching: Juniper, Arista, Cisco
        Wireless: Unifi, Aruba IAP
        JNCIP,CCNP Enterprise

          vinns @michmoor

          @michmoor thank you, I was going crazy these last months. I've tried everything I know, but what you point out is pretty much what happens. I'll keep an eye on those links in case there are updates. For the moment we'll deal with the manual clean-up.

            mcury @michmoor

            @michmoor Thanks for the info michmoor.

            I was about to implement HA with FRR; I guess I'll have to use policy routing instead (two VTIs), at least for the time being.

            Perhaps with a gateway group with a lower packet loss threshold to make convergence faster.

            dead on arrival, nowhere to be found.

              michmoor LAYER 8 Rebel Alliance @mcury

              @mcury
              When it comes to convergence, the best you can do, generally on any platform, is to enable BFD.
              On some vendors I am familiar with, such as Juniper and Palo Alto, you can enable BFD with static routing.
              If that's supported on pfSense, that would be my recommendation, rather than relying on gateway groups. Gateway groups are just a means of providing failover or load distribution, not the quick failover you get from BFD.

              What's your tolerance? If you are OK with an outage of a few seconds then your method may be acceptable, but if you need milliseconds then BFD all the way.
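
As a rough illustration of the tolerance question: BFD declares a neighbor down after a number of consecutive missed control packets, so the detection time is approximately the negotiated receive interval multiplied by the detect multiplier. A minimal Python sketch of that arithmetic follows; the function name is made up for this example, and the intervals and multiplier are example values only, not settings taken from this thread.

```python
def bfd_detection_time_ms(rx_interval_ms: int, detect_multiplier: int) -> int:
    """Approximate BFD detection time: receive interval x detect multiplier."""
    if rx_interval_ms <= 0 or detect_multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return rx_interval_ms * detect_multiplier


if __name__ == "__main__":
    # Example values only (assumptions, not taken from any real config).
    for interval, multiplier in [(300, 3), (50, 3)]:
        total = bfd_detection_time_ms(interval, multiplier)
        print(f"{interval} ms x {multiplier} = {total} ms")
```

With sub-second intervals the detection time stays well under a second, which is the difference from gateway monitoring that the post is pointing at.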

                michmoor LAYER 8 Rebel Alliance @vinns

                @vinns If you really need this feature, I urge you to join the FreeBSD mailing lists. According to the website, that seems to be the best way to get an issue addressed, or at the very least to get the community engaged enough to maybe offer some workarounds.

                https://docs.freebsd.org/en/articles/mailing-list-faq/
                https://lists.freebsd.org/

                  mcury @michmoor

                  @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                  by enabling BFD.

                  I have FRR OSPF with BFD enabled; it works perfectly, but not in an HA setup.

                  @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                  you can enable BFD with static routing.

                  I'm almost 100% sure that this is not possible with pfSense.

                  @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                  What's your tolerance? If you are OK with an outage of a few seconds then your method may be acceptable, but if you need milliseconds then BFD all the way.

                  Two VMs, one hosting AD (samba-ad and consequently DNS) and another acting as a fileserver.
                  Any minute that the connection isn't working will be a very big problem.
                  I was expecting to be able to use FRR with BFD, but that won't be possible anymore.

                  So, as I see it, the only option I have right now with an HA setup is policy routing.

                    michmoor LAYER 8 Rebel Alliance @mcury

                    @mcury
                    Without CARP supporting active/active, there is no other option for fast high availability with routing in any capacity, which granted is a very serious limitation.
                    For my deployments that's a big issue, so I tend not to use pfSense in more "complicated" routing setups.

                      michmoor LAYER 8 Rebel Alliance @mcury

                      @mcury
                      This may help. There is a solution, if you want to call it that, in this Redmine issue.

                      https://redmine.pfsense.org/issues/9141

                      The first statement there is nonsensical.

                      "AFAIR it was done deliberately since in nearly all cases it would be an error to run an identical configuration on two routers running a routing protocol. You'd want separate feeds/connections to neighbors and to work out the failover using priorities/cost/etc in the routing protocols."

                      This is obviously ridiculous and counter-intuitive to how high availability is supposed to work, but moving the saved configuration to the standby node looks to be a workable solution.

                        mcury @michmoor

                        @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                        This is obviously ridiculous and counter-intuitive to how high availability is supposed to work, but moving the saved configuration to the standby node looks to be a workable solution.

                        No progress here obviously, just wanted to add that in the meantime I'm using a workaround: every time I change something on the primary GUI I transfer the raw FRR running config onto the standby cluster (as saved config).

                        Oh, so it is possible.

                        I'll run some tests to see how that goes. Thanks a lot michmoor, I wasn't aware of any of this and I was about to jump into it.
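
A minimal sketch of how the manual copy step in that workaround could be checked, assuming the raw FRR configuration of each node has already been fetched as text (how it is fetched is left out, since it depends on the setup). The function name `config_drift`, the AS numbers, and the neighbor address are hypothetical placeholders; the function only reports whether the standby copy differs from the primary.

```python
import difflib
from typing import List


def config_drift(primary_cfg: str, standby_cfg: str) -> List[str]:
    """Return unified-diff lines from standby to primary (empty if in sync)."""
    return list(
        difflib.unified_diff(
            standby_cfg.splitlines(),
            primary_cfg.splitlines(),
            fromfile="standby",
            tofile="primary",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Placeholder configs for illustration only.
    primary = "router bgp 65000\n neighbor 192.0.2.1 remote-as 64512\n"
    standby = "router bgp 65000\n"
    drift = config_drift(primary, standby)
    if drift:
        print("standby is out of date:")
        print("\n".join(drift))
    else:
        print("configs match")
```

An empty result means the standby already holds the same saved config, so nothing needs to be copied.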

                          michmoor LAYER 8 Rebel Alliance @mcury

                          @mcury let us know if that works.

                          But if it does, I can't imagine why that can't be synced. It's a lot of maintenance on the admin's end to keep the configs in order.

                            mcury @michmoor

                            @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                            let us know if that works

                            I'll post my findings here.

                            @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                            But if it does, I can't imagine why that can't be synced. It's a lot of maintenance on the admin's end to keep the configs in order.

                            If it works, I'll try to build a script.

                            Updated the request: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276534

                              michmoor LAYER 8 Rebel Alliance @mcury

                              @mcury
                              curious...
                              Is FRR running on the standby firewall?

                              If it is, there needs to be a way to keep the process down and only start it when the node becomes active; otherwise the standby is going to attempt peering with upstream.

                              I'm not too familiar with FRR in HA mode.

                                mcury @michmoor

                                @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                                curious...
                                Is FRR running on the standby firewall?

                                Not at the moment. I'm about to build the slave to form the HA pair; only a single firewall is running right now, and I'm just waiting for two NICs to arrive.

                                @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                                If it is, there needs to be a way to keep the process down and only start it when the node becomes active; otherwise the standby is going to attempt peering with upstream.

                                If the state is slave, then something like "pfSsh.php playback disable frr"; perhaps good logic for a script that runs every second.
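
A minimal sketch of the "what state am I in?" part of that logic, assuming the CARP state is read from `ifconfig` output, where FreeBSD prints lines such as `carp: MASTER vhid 1 advbase 1 advskew 0`. The function names, interface names, and sample text are illustrative, and the exact output format should be checked on the pfSense version in use. Running `ifconfig` and disabling FRR are deliberately left out.

```python
import re
from typing import Dict

# Matches lines like: "carp: MASTER vhid 1 advbase 1 advskew 0"
CARP_LINE = re.compile(
    r"^[ \t]*carp:[ \t]+(MASTER|BACKUP|INIT)[ \t]+vhid[ \t]+(\d+)", re.MULTILINE
)


def carp_states(ifconfig_output: str) -> Dict[int, str]:
    """Map each CARP VHID found in the text to its reported state."""
    return {int(vhid): state for state, vhid in CARP_LINE.findall(ifconfig_output)}


def is_master(ifconfig_output: str) -> bool:
    """True only if at least one VHID exists and every VHID reports MASTER."""
    states = carp_states(ifconfig_output)
    return bool(states) and all(s == "MASTER" for s in states.values())


if __name__ == "__main__":
    # Made-up sample text for illustration, not output captured from a real node.
    sample = (
        "igb0: flags=8943<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
        "\tcarp: MASTER vhid 1 advbase 1 advskew 0\n"
        "igb1: flags=8943<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
        "\tcarp: BACKUP vhid 2 advbase 1 advskew 100\n"
    )
    print(carp_states(sample))
    print(is_master(sample))
```

Requiring every VHID to be MASTER is a design choice for this sketch: a node that is master on only some interfaces is in a split state and probably should not start peering.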

                                  michmoor LAYER 8 Rebel Alliance @mcury

                                  @mcury Curious. Got it working reliably?

                                    mcury @michmoor

                                    @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                                    @mcury Curious. Got it working reliably?

                                    Unfortunately no. I depend on someone else for access to the cluster, so I'm just waiting for him to call me for the tests.

                                      mcury @mcury

                                      I tested this yesterday: if both nodes in the HA pair have FRR enabled, no routes are exchanged between peers.
                                      I now have both nodes with the exact same configuration, but with FRR disabled on the backup node.

                                      If the primary node goes down, all I have to do is enable FRR on the backup node.

                                        michmoor LAYER 8 Rebel Alliance @mcury

                                        @mcury nice!
                                        It still requires an admin's interaction, BUT the concept works.
                                        I see no reason why it can't be automated.

                                          mcury @michmoor

                                          @michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):

                                          It still requires an admin's interaction, BUT the concept works.
                                          I see no reason why it can't be automated.

                                          Exactly, a little intervention but nothing that takes a lot of time: tick two things, save, and that is it. :)

                                          I'll start planning a script, something that checks "am I the primary?" and, if so, enables FRR, something like that.

                                            michmoor LAYER 8 Rebel Alliance @mcury

                                            @mcury maybe the script can check the CARP status? So check if I am master?
                                            Also add a secondary check: maybe ping the SYNC interface of the neighbor. If it's down and you are master, then bring up FRR.

                                            So, at a high level:
                                            Every GUI change in FRR needs to be synced to the standby.
                                            The standby needs to monitor CARP status.
                                            The standby needs a reliable detector to know it should take over routing, e.g. pinging the SYNC interface of the master.
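
The high-level plan above could be sketched as a small decision function. This is only an illustration under stated assumptions: the CARP state and peer reachability are passed in as plain values, the names `desired_frr_state` and `peer_reachable` are hypothetical, and actually starting or stopping FRR is left to whatever mechanism the firewall provides. The "hold" result for a master that can still reach its peer is a choice made for this sketch, since the thread does not say what should happen in that case.

```python
import subprocess


def peer_reachable(address: str, timeout_s: float = 2.0) -> bool:
    """Send one ICMP echo to the peer's SYNC address; False on any failure."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def desired_frr_state(carp_state: str, peer_up: bool) -> str:
    """Decide what to do with FRR: 'start', 'stop', or 'hold' (leave as is)."""
    if carp_state != "MASTER":
        return "stop"   # a standby node must not peer with upstream
    if not peer_up:
        return "start"  # master, and the other node's SYNC address is down
    return "hold"       # master but peer still answers: ambiguous, change nothing


if __name__ == "__main__":
    # Walk through the combinations without touching the network.
    for state in ("MASTER", "BACKUP"):
        for peer_up in (True, False):
            print(state, peer_up, "->", desired_frr_state(state, peer_up))
```

In a real script, `peer_reachable` would be called with the other node's SYNC interface address and its result passed as `peer_up`.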

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.