Pfsense 2.4.5 CARP - Traffic dies when moving back to Master
-
My problem is returning traffic from the Slave back to the Master. If I disable CARP on the Master, all services and traffic are transferred to the Slave. When it's time to return to the original configuration and transfer the VIPs and IPsec tunnels back to the Master:
- On the Master, I hit "Enable CARP"
- All VIPs are now green on the Master and yellow on the Slave
- But no traffic reaches any of the servers behind the pfSense cluster anymore.
If I hit "Disable CARP" on the Master, traffic immediately starts to flow again via the Slave, and everyone but me is happy. The only way to move traffic back to the Master seems to be shutting down the Slave and rebooting the Master -> traffic flows again just fine. After this I can boot the Slave back online.
I'm puzzled. I have triple-checked (with help from my colleague) that:
- All the servers and NICs are identical and working fine.
- VLAN settings are the same on Master and Slave.
- Interface settings are the same on Master and Slave (each having its own individual IPs, of course).
- I can ping the Slave from the Master and vice versa over the SYNC interface.
- The switch ports connecting to the pfSense NICs have broadcast and multicast traffic allowed and not throttled.
- Settings, rules, aliases, IPsec settings etc. are synchronized to the Slave (when changed on the Master).
I don't know how to debug this further, and I would really appreciate any pointers to help me solve this mysterious problem. I can't tell whether this is a bug in version 2.4.5, whether I have missed or misconfigured something, or whether my hardware (servers, NICs and switches) is incompatible with this kind of setup.
A couple of things I have considered:
- On the Slave, under System -> High Avail. Sync, the pfsync Synchronize Peer IP is not set, so the Slave is using multicast. Should I consider changing to unicast and adding the Master's IP on the Slave?
- I have configured the IP aliases with a /32 subnet, but should I have used the actual subnet (/24 or /25) instead? The IP aliases have /32; the CARP IPs have a full subnet.
- I have reused VHIDs, but only in different VLANs/subnets. Should I give every CARP IP a different VHID, even if they are in separate VLANs? The manual says this should not matter, but can it cause problems for the switch's MAC table?
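To get an overview of the VHID reuse mentioned in the last point, something like the following could list every VHID that appears on more than one VLAN. This is only a minimal sketch in Python: the `vips` list and the helper `vhid_reuse` are hypothetical, and the VLAN/VHID pairs are made-up illustration values, not taken from the actual configuration.

```python
from collections import defaultdict

# Hypothetical CARP VIPs as (vlan_id, vhid) pairs; replace with real values.
vips = [
    (4078, 254),
    (4079, 254),
    (10, 1),
    (11, 1),
    (12, 2),
]

def vhid_reuse(vips):
    """Map each reused VHID to the sorted list of VLANs it appears on."""
    by_vhid = defaultdict(set)
    for vlan, vhid in vips:
        by_vhid[vhid].add(vlan)
    # Keep only VHIDs that show up on more than one VLAN.
    return {vhid: sorted(vlans) for vhid, vlans in by_vhid.items() if len(vlans) > 1}

for vhid, vlans in sorted(vhid_reuse(vips).items()):
    print(f"VHID {vhid} is used on VLANs {vlans}")
```

The output is just a worklist of VHIDs to renumber if unique VHIDs turn out to be the safer choice.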
Thank you already for taking the time to read all this. I do not know what other essential information I should provide, but I'm happy to provide more if needed.
Here is a brief summary of the networks:
- Two public WANs, separated into VLANs 4078 & 4079
- More than 200 IP aliases on the WAN CARP interfaces (VLANs 4078 & 4079).
- More than 50 private VLANs (VLAN 10-100)
- Sync traffic uses a dedicated NIC with a direct cable between the servers (not via a switch).
Status view (partial) at Master:
Status view at Slave:
-
Did you check "Synchronize states" in System > High Availability Sync on both nodes?
@jeppunen said in Pfsense 2.4.5 CARP - Traffic dies when moving back to Master:
At the Slave -> System -> High Avail. Sync -> pfsync Synchronize Peer IP is not set, so the slave is using multicast. Should I consider changing to unicast and add Master's IP to the Slave
Just give it a try.
I've set the respective other node's IP here and it fails over flawlessly in both directions.
@jeppunen said in Pfsense 2.4.5 CARP - Traffic dies when moving back to Master:
I have reused VHIDs, but only in different VLANs/subnets. Should I give every CARP IP a different VHID, even if they are in separate VLANs?
I'd rather go with unique VHIDs to be safe.
-
@viragomann said in Pfsense 2.4.5 CARP - Traffic dies when moving back to Master:
Did you check "Synchronize states" in System > High Availability Sync on both nodes?
Good tip and easy to miss, but this one was ok in settings.
@jeppunen said in Pfsense 2.4.5 CARP - Traffic dies when moving back to Master:
At the Slave -> System -> High Avail. Sync -> pfsync Synchronize Peer IP is not set, so the slave is using multicast. Should I consider changing to unicast and add Master's IP to the Slave
Just give it a try.
I've set the respective other node's IP here and it fails over flawlessly in both directions.
Your first tip gave me an idea, and you might be onto something with your state theory. The Master uses unicast to transfer states etc., but the Slave uses multicast (as there is no IP set). Even though the sync interfaces are connected with a direct cable, maybe states are missing from the Master when it resumes. Or they are missing for some other reason.
@jeppunen said in Pfsense 2.4.5 CARP - Traffic dies when moving back to Master:
I have reused VHIDs, but only in different VLANs/subnets. Should I give every CARP IP a different VHID, even if they are in separate VLANs?
I'd rather go with unique VHIDs to be safe.
The manual says: "The input validation in pfSense software will not permit using conflicting VHIDs on a single pair of systems". Since I have managed to use the same VHID again and again, presumably this mechanism would have stopped me if I were doing something crazy? But if I'm using 254 as the VHID, the MAC address is 00:00:5e:00:01:FE on all interfaces with that VHID. Even though transferring from Master to Slave succeeds, maybe my switch does not like the same MAC appearing on multiple interfaces. I don't know if this is an issue or not. I'll probably have to go through all interfaces and give each one a unique VHID.
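The MAC pattern described above can be sketched as a small function. This is only an illustration in Python, assuming the virtual MAC of an IPv4 CARP VIP is the fixed prefix 00:00:5e:00:01 followed by the VHID as one hex byte (which matches the 254 -> FE example); the function name `carp_mac` and the VLAN numbers in the loop are illustrative choices.

```python
def carp_mac(vhid: int) -> str:
    """Return the assumed virtual MAC for an IPv4 CARP VIP with this VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    # Fixed prefix, then the VHID as a two-digit hex byte.
    return f"00:00:5e:00:01:{vhid:02x}"

# Two interfaces on different VLANs sharing VHID 254 (illustrative values).
for vlan in (4078, 4079):
    print(f"VLAN {vlan}: {carp_mac(254)}")
```

Both lines print the same address, since the VLAN plays no part in the MAC. Whether that matters probably depends on whether the switch learns MAC addresses per VLAN or in one shared table.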
Thanks for the insights @viragomann