CARP broken in nightly build
-
I installed four PCs with the nightly build from 21.8.2018.
I then set up two HA clusters with two members each. I can ping from cluster A members 1 and 2 to cluster B members 1 and 2 without any problem.
I then added a virtual IP with CARP. Both clusters recognize the master and backup state correctly. When I ping from cluster A member 2 to member 1 (CARP IP, master), I get packet loss.
When I ping from cluster A member 1 (master) to cluster B member 1 (master), I get packet loss. It is so extreme that the ping stalls when I run a ping the other way around at the same time.
Ping cluster A member 1 to cluster B member 1 (CARP master IP)
Ping cluster A member 1 to cluster B member 1 (interface IP)
As you can see, the cable, the switch, everything is the same.
-
I have now done a full reinstall with 2.4.3 on the same hardware.
To make it short: SAME PROBLEM! I can ping all hosts and the switches without a problem, but I get packet loss and timeouts when I ping from one side to the CARP IP on the other side.
Ping cluster B member 2 to cluster A member 1 (is master)
Tcpdump on cluster A member 1: as you can see, it stops suddenly and then I get the timeouts.
I have no clue where the problem is! I have other pfSense installations with HA and LAGG that work without a problem.
-
@Moderador-PfSense: please move this to the HA/CARP section.
-
Mystery solved!
The timeouts and packet loss on the CARP virtual IP were caused by my configuration.
In my test setup I built a VLAN for each WAN link between my offices. There is a pfSense cluster on each side, so the router interfaces of both clusters are in the same VLAN.
I set up the CARP interfaces on both sides with a different password but the same VHID.
That seems to be what caused the packet loss, up to no traffic passing at all.
Once the VHIDs in the VLAN were different, everything started working as expected.
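For anyone wondering why a different password does not help: as far as I know, the CARP virtual MAC for an IPv4 VIP is built only from the VHID, in the form 00:00:5e:00:01:<VHID in hex>. A minimal Python sketch of that mapping, where the function name and the example VHIDs are just my illustration and not anything taken from pfSense itself:

```python
def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC a CARP IPv4 VIP would use for this VHID.

    Assumes the usual 00:00:5e:00:01:<vhid> scheme; the CARP password
    plays no part in the MAC address.
    """
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"


# Both clusters in the same VLAN, both using VHID 1 (the broken setup).
cluster_a = carp_virtual_mac(1)
cluster_b = carp_virtual_mac(1)
print(cluster_a == cluster_b)  # same MAC claimed by two different masters

# After changing one side to a different VHID (2 is an arbitrary example).
fixed_b = carp_virtual_mac(2)
print(cluster_a, fixed_b)
```

With the same VHID, two masters in one VLAN claim the same MAC, so the switch keeps relearning it on different ports and frames end up at the wrong cluster.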
-
Just to mention it for others searching for CARP problems: Check the troubleshooting guides, that's what they are for. :)
https://www.netgate.com/docs/pfsense/highavailability/troubleshooting-high-availability-clusters.html#conflicting-vhids
The aforementioned problem is the first topic in this guide. And VHIDs in general are something to get familiar with when running HA setups, not only when there are pfSense clusters on both sides. You could just as easily have a couple of Juniper, Cisco or other L3 switches in a VRRP/HSRP/other HA setup on the other side. That's what I had a few years ago. Our upstream provider (an ISP in a datacenter) had our uplinks on an HSRP setup and told me multiple times that they were using an ID >10 in their setup, so I could run it with VHID 1. It turned out the tech was wrong and it defaulted to - yes - VHID 1 on their side, too. So the VIPs on both sides had the same virtual MAC address. Easy to see that such a thing will screw up L2 perfectly. ;)
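If you keep a list of which VHID is used in which VLAN, a quick check for this kind of clash could look like the sketch below. The inventory format, the cluster names and the VLAN numbers are all made up for illustration, and it assumes every entry uses the same virtual MAC scheme, so an equal ID in the same VLAN means an equal MAC:

```python
from collections import defaultdict


def find_vhid_conflicts(entries):
    """Return {(vlan, vhid): [clusters]} for IDs used by more than one cluster.

    entries is an iterable of (cluster_name, vlan_id, vhid) tuples.
    Members of one cluster sharing a VHID is normal, so entries are
    grouped by cluster name, not by individual node.
    """
    groups = defaultdict(set)
    for cluster, vlan, vhid in entries:
        groups[(vlan, vhid)].add(cluster)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical inventory: two clusters sharing VLAN 100 with the same VHID.
inventory = [
    ("cluster-a", 100, 1),
    ("cluster-b", 100, 1),
    ("cluster-a", 200, 2),
    ("cluster-b", 300, 2),  # same VHID as above but a different VLAN: fine
]

for (vlan, vhid), clusters in find_vhid_conflicts(inventory).items():
    print(f"VLAN {vlan}: VHID {vhid} is used by {', '.join(clusters)}")
```

It only knows about what you put in the list, of course. A provider's gear on the other end of the link has to be asked about, or checked with a packet capture.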
-
Moved to HA/Carp section.