CARP/HA not working
-
What we want:
We want to make sure the servers always have internet access and are reachable from outside the building via RDP for clients. They will connect to RDP via an FQDN such as remoteserver.domain.com:55555
So we want the firewalls set up so that they sync all settings AND one of them is always up when the other goes down through failure or maintenance.
Our setup:
2 Netgate XG-7100 firewalls:
The WAN cable is connected to ETH1
The LAN cable is connected to ETH2 and to a switch.
In port ETH8 of DC-FW1, there is a LAN cable connected directly to port ETH8 of DC-FW2.
DC-FW1:
Static WAN IP: 183.41.27.34 on ETH1
Static LAN IP: 10.0.0.253 /24 on ETH2 and DHCP server enabled
DC-FW2:
Static WAN IP: 183.41.27.35 on ETH1
Static LAN IP: 10.0.0.254 /24 on ETH2 and DHCP server enabled
On both devices:
I renamed the OPT1 port to SYNC.
But since the OPT1 port is assigned to ix0 and I want the SYNC port to be on ETH8, I did the following:
I went into INTERFACES > SWITCHES > VLANs and created VLAN 8:
In PORTS I changed the port VID to 8:
In INTERFACES > VLANs I created a VLAN (tag 8):
And assigned SYNC to this VLAN:
I did this on both.
On DC-FW1 the SYNC ip is:
On DC-FW2 SYNC is:
On both devices the SYNC firewall rule is set to:
On DC-FW1 the “high availability sync” is set like:
On DC-FW2 it is set like:
Now we get this:
And so there is no sync working.
I also created 2 VIPs:
And changed the outbound NAT:
FYI, in the ARP table I see this on DC-FW1:
And this on DC-FW2:
The physical wiring is like:
Can anyone assist me with this?
Thank you
-
@nick-loenders said in CARP/HA not working:
On both devices the SYNC firewall rule is set to:
Allowing TCP is not sufficient for sync. You also have to allow the "PFSYNC" protocol.
-
@viragomann and where / how is this?
-
@viragomann Found it in the Netgate manual. But CARP on the second FW still says:
-
@nick-loenders
Presumably the CARP VIPs were not synced over to the backup, since the sync wasn't allowed before.
It may work if you edit the CARP VIPs and save them again.
Otherwise check the system log on both nodes for hints.
-
@viragomann I see this on one:
/rc.filter_synchronize: The Netgate pfSense Plus software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!
but both run on
Version 21.05-RELEASE (amd64)
built on Tue Jun 01 16:52:56 EDT 2021
FreeBSD 12.2-STABLE
-
@viragomann Is the rest OK, and also the cable on ETH8? Is this a good connection? It is not a crossover cable.
I also found this:
https://forum.netgate.com/topic/80092/resolved-carp-not-failing-back-and-other-weird-behaviour-on-pfsense-2-2/12
where it says:
To add closure to this issue, the problem went away by resetting sysctl net.inet.carp.demotion from 240 to 0 with:
sysctl net.inet.carp.demotion=-240
sysctl net.inet.carp.demotion is essentially a penalty against the advskew settings. Returning this to 0 made the VIPs stable and removed the warning from the CARP status page, though it would recur following a reboot.
According to https://forum.pfsense.org/index.php?topic=89132.msg496865#msg496865 the problem is caused when using CARP on a LAGG. When the LAGG is initialised it loses some CARP advertisements and causes net.inet.carp.demotion to be increased by the value of net.inet.carp.senderr_demotion_factor (240).
Setting:
net.inet.carp.senderr_demotion_factor=0
But where can I add this setting??
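To make the arithmetic in the quoted post concrete, here is a minimal Python sketch of how the demotion counter behaves as described there: a value written to net.inet.carp.demotion is added to the current value, and the result acts as a penalty on top of advskew. The function names, the example advskew values (0 and 100) and the cap of 240 are assumptions made for illustration; this is not pfSense or FreeBSD code.

```python
# Illustrative model only; names and the 240 cap are assumptions.
CARP_MAX_SKEW = 240  # assumed upper bound for the advertised skew


def adjust_demotion(current: int, delta: int) -> int:
    """Model a write to net.inet.carp.demotion: the value is added, not set."""
    return current + delta


def effective_advskew(advskew: int, demotion: int) -> int:
    """Advertised skew = configured advskew plus the demotion penalty, clamped."""
    return max(0, min(advskew + demotion, CARP_MAX_SKEW))


# A node stuck with demotion 240 advertises a worse skew than its peer.
demoted_primary = effective_advskew(advskew=0, demotion=240)
healthy_secondary = effective_advskew(advskew=100, demotion=0)
print(demoted_primary, healthy_secondary)

# Writing -240 brings the counter back to 0, as in the quoted fix.
print(adjust_demotion(240, -240))
```

In this model the node with the lower advertised skew wins the master election, which is why a lingering demotion of 240 on the primary keeps it from taking the VIPs back.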
-
@nick-loenders said in CARP/HA not working:
@viragomann I see this on one:
/rc.filter_synchronize: The Netgate pfSense Plus software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!
Ensure that you also allow access to the webConfigurator port (TCP) on the Sync interface.
That may prevent this issue.
Setting:
net.inet.carp.senderr_demotion_factor=0
But where can I add this setting??
I've never changed this setting. My sync is on a separate dedicated interface.
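The two requirements mentioned so far for the Sync interface (the PFSYNC protocol for state sync, and TCP to the webConfigurator port for config sync) can be sketched as a small checklist in Python. This is only a hypothetical model of a rule list, not a pfSense API; the `Rule` class, the function name and the GUI port of 443 are assumptions, so substitute whatever port your webConfigurator actually listens on.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a pass rule on the SYNC interface."""
    protocol: str                   # e.g. "pfsync", "tcp", "any"
    dst_port: Optional[int] = None  # None means any destination port


def missing_sync_rules(rules: List[Rule], gui_port: int = 443) -> List[str]:
    """Return which of the two sync requirements no rule covers."""
    missing = []
    # State sync: the pfsync protocol itself must be passed.
    if not any(r.protocol in ("pfsync", "any") for r in rules):
        missing.append("pfsync")
    # Config sync: TCP to the webConfigurator port must be passed.
    if not any(r.protocol in ("tcp", "any") and r.dst_port in (None, gui_port)
               for r in rules):
        missing.append("tcp/%d" % gui_port)
    return missing


# A TCP-only rule, as in the original post, still lacks pfsync.
print(missing_sync_rules([Rule("tcp")]))
print(missing_sync_rules([Rule("pfsync"), Rule("tcp", 443)]))
```

A single rule passing any protocol on the Sync interface satisfies both checks in this model.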
-
@nick-loenders If I'm reading your picture right the "Synchronize Config to IP" is set to your sync interface IP, and it should be the LAN IP of router2.
-
@steveits
Nope, it's fine to use the sync IP of the other box.
OP: I've always used a dedicated physical interface for sync. I'm not familiar with that switchport hardware, but could you use a straight VLAN interface instead of a LAG?
-
@dotdash said in CARP/HA not working:
fine to use the sync ip
Actually the docs say it "should use the Sync interface." Wonder why ours was set up otherwise, years ago.
@nick-loenders On router2 the Synchronize Config to IP field should be blank; you don't want the secondary syncing back to the primary. Also, did you find the troubleshooting doc?
-
@steveits Hi, I changed it on the 2nd device, but still no luck :(
-
I just allow IPv4 any to any on the sync interface. I have mine configured as a /30 plugged directly from one box to the other.
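As a quick illustration of why a /30 suits a direct box-to-box sync link, the sketch below uses Python's standard ipaddress module to list the usable addresses. The 172.16.1.0/30 network is a made-up example, not an address taken from this thread.

```python
import ipaddress

# Hypothetical sync subnet; pick any private /30 that is unused elsewhere.
sync_net = ipaddress.ip_network("172.16.1.0/30")

# A /30 leaves exactly two usable host addresses: one per firewall.
hosts = list(sync_net.hosts())
print(hosts)
```

One address goes on the SYNC interface of each node, and nothing else can join the subnet.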
-
@nick-loenders
I've never worked with a model with the switchport setup. There's a note in the manual about CARP limitations due to the switchport not going down. Do you have the expansion riser? I'd get a couple of quad-port Intel cards and use those ports.
-
There are a number of issues here:
The Sync VLAN on the switch is configured incorrectly.
You need to add ports 9 and 10 as tagged members to VLAN 8 so that it is passed from the LAGG.
Config sync should only ever be from the Primary to the Secondary (unless you have more than 2 nodes), otherwise you will create a loop. Remove all the settings from the XMLRPC Sync section on DC-FW2.
Leave the pfSync section though, as state sync needs to be in both directions.
You should not have any outbound NAT rules for the SYNC subnet; that should never connect to anything but the other node.
Most importantly though, when using the XG-7100 in an HA pair the failover interfaces should not be on the Eth ports. That is because you will not get full failover function using those.
In the event of the port losing link, a bad port or a bad cable or unintentional disconnect for example, it will not demote itself. This results in a split Master/Backup that will interrupt traffic.
It will still fail over correctly if the full device fails or is upgraded though.
To avoid that you should use the ix ports for WAN and LAN or add an expansion card with additional discrete interfaces and use those.
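The VLAN 8 membership described above can be written down as a small check. This is a rough Python sketch, not anything that talks to the switch: the dictionary layout and function name are made up, and treating ETH8 as an untagged member is an assumption based on the direct cable between the two firewalls. Only the requirement that ports 9 and 10 be tagged members comes from this post.

```python
from typing import Dict, List, Tuple


def vlan_problems(members: Dict[int, str], access_port: int,
                  uplinks: Tuple[int, ...] = (9, 10)) -> List[str]:
    """List what is missing from one VLAN's member table (hypothetical model)."""
    problems = []
    # Assumption: the external port carries the VLAN untagged.
    if members.get(access_port) != "untagged":
        problems.append("port %d should be an untagged member" % access_port)
    # From the post: the LAGG uplink ports must be tagged members.
    for port in uplinks:
        if members.get(port) != "tagged":
            problems.append("port %d should be a tagged member" % port)
    return problems


# VLAN 8 with only ETH8 added, roughly the state described in the first post.
print(vlan_problems({8: "untagged"}, access_port=8))

# VLAN 8 after adding the uplink ports as tagged members.
print(vlan_problems({8: "untagged", 9: "tagged", 10: "tagged"}, access_port=8))
```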
Steve
-
@stephenw10 Hi, we will use port ix0 as the OPT1/SYNC port. That should work, right?
Also, at this moment FW1 is connected to SWITCH1 and FW2 is connected to SWITCH2, but there is no link between the switches. Apparently that needs to be done as well, so we'll do that too.
Also now FW1 is connected to WAN ip 1 and FW2 is connected to WAN ip 2 like:
But I guess I need to add a switch for this as well, for the 3rd WAN IP ? like:
??
-
ix0 will work for the SYNC interface, yes, but since it doesn't use CARP, SYNC can be on one of the Eth ports. You just need to configure the internal switch correctly.
Using ix0 as either WAN or LAN is a much better use if you don't have an expansion card.
You need to have a layer 2 connection between the nodes on all interfaces that have CARP failover, yes. So, yes, you need a switch on the WAN side.
See: https://docs.netgate.com/pfsense/en/latest/recipes/high-availability.html
Steve
-
@stephenw10 So how do I configure ETH8 correctly for CARP SYNC, if I leave ETH1 for WAN and ETH2 for LAN?