FRR BGP over IPsec , when HA happens (slave-> master, master ->slave)
-
@mcury
This may help. There is a solution, if you want to call it that, in this Redmine issue: https://redmine.pfsense.org/issues/9141
The first statement there is nonsensical:
"AFAIR it was done deliberately since in nearly all cases it would be an error to run an identical configuration on two routers running a routing protocol. You'd want separate feeds/connections to neighbors and to work out the failover using priorities/cost/etc in the routing protocols."
This is obviously ridiculous and counterintuitive to how high availability is supposed to work, but moving the saved configuration to the standby node looks to be a workable solution.
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
This is obviously ridiculous and counterintuitive to how high availability is supposed to work, but moving the saved configuration to the standby node looks to be a workable solution.
No progress here obviously, just wanted to add that in the meantime I'm using a workaround: every time I change something on the primary GUI I transfer the raw FRR running config onto the standby cluster (as saved config).
Oh, so it is possible..
I'll run some tests to see how that goes. Thanks a lot michmoor, I wasn't aware of any of this and I was about to jump into it.
-
@mcury let us know if that works
But if it does, I can't imagine why that can't be synced. That's a lot of maintenance on the admin's end to keep the configs in order.
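Until the sync exists, one way to cut down on that maintenance is to at least detect when the two nodes have drifted apart. A minimal sketch in Python, assuming you have already pulled the FRR config text from both nodes by whatever means you use; the function name `frr_config_drift` and the sample configs are made up for illustration and are not part of the FRR package:

```python
import difflib


def frr_config_drift(primary_cfg, standby_cfg):
    """Return unified-diff lines between two FRR config texts.

    An empty list means the configs match once blank lines, trailing
    whitespace and bare "!" separator lines are ignored.
    """
    def normalize(text):
        return [
            line.rstrip()
            for line in text.splitlines()
            if line.strip() and line.strip() != "!"
        ]

    return list(
        difflib.unified_diff(
            normalize(primary_cfg),
            normalize(standby_cfg),
            fromfile="primary",
            tofile="standby",
            lineterm="",
        )
    )


# Hypothetical sample configs, only to show the call.
primary = "router bgp 64512\n neighbor 192.0.2.2 remote-as 64513\n!\n"
standby = "router bgp 64512\n neighbor 192.0.2.2 remote-as 64514\n!\n"

for line in frr_config_drift(primary, standby):
    print(line)
```

If the list comes back non-empty, the standby needs the saved config copied over again.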
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
let us know if that works
I'll post here my findings.
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
But if it does, I can't imagine why that can't be synced. That's a lot of maintenance on the admin's end to keep the configs in order.
If it works, I'll try to build a script..
updated the request: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276534
-
@mcury
curious...
Is FRR running on the standby firewall? If it is, there needs to be a way to keep the process down and only start it when the node becomes active, otherwise the standby is going to attempt peering with upstream.
I'm not too familiar with FRR in HA mode.
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
curious...
Is FRR running on the standby firewall?
Not at the moment. I'm about to build the slave to form the HA pair; there is only a single firewall running right now, and I'm just waiting for two NICs to arrive.
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
If it is, there needs to be a way to keep the process down and only start it when the node becomes active, otherwise the standby is going to attempt peering with upstream.
If the state is slave, pfSsh.php playback disable frr.. perhaps that is good logic for a script that runs every second.
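For the "what state am I in" part, here is a rough sketch in Python. It assumes that `ifconfig` on the node prints a line of the form `carp: MASTER vhid 1 advbase 1 advskew 0` for each CARP VIP, which is the shape I would expect on FreeBSD but should be checked on your own box. The helper names are mine, and the step that actually stops or starts FRR is deliberately left out:

```python
import re
import subprocess

# Matches lines such as: "carp: BACKUP vhid 1 advbase 1 advskew 100"
CARP_LINE = re.compile(
    r"^\s*carp:\s+(MASTER|BACKUP|INIT)\s+vhid\s+(\d+)", re.MULTILINE
)


def carp_states(ifconfig_output):
    """Map every CARP vhid found in the ifconfig text to its state."""
    return {
        int(vhid): state
        for state, vhid in CARP_LINE.findall(ifconfig_output)
    }


def is_master(ifconfig_output):
    """True only if at least one vhid exists and all of them are MASTER."""
    states = carp_states(ifconfig_output)
    return bool(states) and all(s == "MASTER" for s in states.values())


def read_ifconfig():
    """Run ifconfig on the local node and return its output as text."""
    result = subprocess.run(
        ["ifconfig"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the node itself, `is_master(read_ifconfig())` would give the script its answer. Treating a mix of MASTER and BACKUP vhids as "not master" is a conservative choice on my part, not something pfSense prescribes.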
-
@mcury Curious. Got it working reliably?
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
@mcury Curious. Got it working reliably?
Unfortunately no, I depend on someone else to access the cluster, so I'm just waiting for him to call for the tests..
-
I tested this yesterday: if both nodes in the HA pair have FRR enabled, no routes are exchanged between peers.
I have both nodes with the exact same configuration, but the backup node has FRR disabled. If the primary node goes down, all I have to do is enable FRR on the backup node.
-
@mcury nice !
It still requires an admin's interaction, BUT the concept works.
I see no reason why it can't be automated.
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
It still requires an admin's interaction, BUT the concept works.
I see no reason why it can't be automated.
Exactly, a little intervention but nothing that takes a lot of time: tick two things, save, and that is it. :)
I'll start planning a script, something that checks "am I the primary?" and, if so, enables FRR, something like that.
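The "am I the primary, if so enable FRR" check could be sketched like this in Python. The class name `FrrGate` and the debounce (waiting for a few identical readings in a row before acting, so a brief CARP flap does not bounce FRR) are my own assumptions, not anything pfSense or FRR provides. How the CARP state is read and how FRR is actually started or stopped are left to the caller:

```python
class FrrGate:
    """Decide whether FRR should be started or stopped on this node."""

    def __init__(self, required_stable_reads=3):
        self.required = required_stable_reads
        self.last_state = None  # last CARP reading seen (True = master)
        self.count = 0          # how many times in a row it was seen

    def decide(self, carp_is_master, frr_running):
        """Return "start", "stop", "none" or "wait" for one polling cycle."""
        if carp_is_master == self.last_state:
            self.count += 1
        else:
            self.last_state = carp_is_master
            self.count = 1

        if self.count < self.required:
            return "wait"   # state has not been stable for long enough
        if carp_is_master and not frr_running:
            return "start"  # we are primary but FRR is down
        if not carp_is_master and frr_running:
            return "stop"   # we are backup but FRR is still up
        return "none"       # already in the desired state


gate = FrrGate(required_stable_reads=2)
print(gate.decide(carp_is_master=True, frr_running=False))
print(gate.decide(carp_is_master=True, frr_running=False))
```

Each call stands for one run of the script, so the gate object would have to live in a long-running loop, or its two fields would have to be saved to a file between cron runs.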
-
@mcury maybe the script can check the CARP status? So check "am I master?"
Also a secondary check. Maybe ping the SYNC interface of the neighbor. If it's down and you are master, then bring up FRR.
So, at a high level:
Every GUI change in FRR needs to be synced to the standby.
The standby needs to monitor CARP status.
The standby needs a reliable detector to know it should take over routing: it pings the SYNC interface of the master.
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
@mcury maybe the script can check the CARP status? So check "am I master?"
Also a secondary check. Maybe ping the SYNC interface of the neighbor. If it's down and you are master, then bring up FRR.
Yes, I'll have to learn the CARP CLI commands to check the status. Any help is much appreciated, because I'll probably need to parse the output to get what we need..
Then, set up some ifs and elses on the master and on the backup.
A ping test would also help with this check..
And lastly, a cron job on both nodes.
-
@mcury I got you. I'm researching now.
-
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
@mcury I got you. I'm researching now.
I'm stuck right now, unfortunately.
I'll be checking later today or perhaps during the weekend.
But I think we will nail it, it's only a matter of time.
-
Hey guys, I've been following this thread with much interest.
@michmoor said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
Every GUI change in FRR needs to be sync'd to the standby
The standby needs to monitor CARP status
The standby needs a reliable detector to know it should take over routing - pings the SYNC interface of the master.
I've been playing with the config options myself here. There is an option under FRR -> Global Settings -> CARP Status IP. By default it is set to none, but if it's set to the CARP IP, then per its description: "Used to determine the CARP status. When the CARP vhid is in BACKUP status, FRR will not be started."
Unfortunately I can't test it, because one of my nodes was fried (waiting on a replacement this week or next).
Hope that helps...
-
@vinns said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
but if it's set to the CARP IP, then per its description: "Used to determine the CARP status. When the CARP vhid is in BACKUP status, FRR will not be started."
Thanks for the insight. I actually tried that, but FRR remains active on the backup node.
-
I don't know what I did, but now it is working.
Routes, HA and everything... FRR is now not running on the secondary node.
My guess is that you need to reboot both nodes after configuring FRR in HA mode. I'm not sure yet what happened, but yes, it is working with that option (CARP Status IP).
Good news :)
-
@mcury I can confirm the same. Tested, and it seems okay after selecting that CARP Status IP option.
One more thing I could not get working: the FRR configs, even in HA mode, do not propagate to the slave. (My slave node was fried a couple of weeks ago, so I bought a new one.) I put them in a cluster, but the only thing that did not propagate over was the FRR config, which is strange. Any ideas?
-
@vinns said in FRR BGP over IPsec , when HA happens (slave-> master, master ->slave):
One more thing I could not get working: the FRR configs, even in HA mode, do not propagate to the slave. (My slave node was fried a couple of weeks ago, so I bought a new one.) I put them in a cluster, but the only thing that did not propagate over was the FRR config, which is strange. Any ideas?
Same problem here, it doesn't propagate the configuration to the slave.
Since this cluster only has one area and a few networks, I configured the slave with the same settings manually.