CARP VIP Failover
I just deployed a two-firewall pfSense solution and it works great so far, but during failover testing I noticed something I don't quite understand. I set up virtual IPs and gateways for both the LAN and the WAN. When I pull the WAN interface on the master, the WAN VIPs fail over but the LAN VIPs do not; likewise, when I pull the LAN from the master, only the LAN VIPs fail over. The problem is that the VIPs on the other side of the fence are still owned by the original master, so the path to the internet is broken. It would make sense to me that if any WAN or LAN interface fails, all CARP IPs should fail over so that the path stays intact. Is this not the case? Did I miss something? I have attached a diagram to show what I am seeing.
Thanks for the great software, the book is pretty good too :)
![CARP Failover Diagram.jpg](/public/imported_attachments/1/CARP Failover Diagram.jpg)
All CARP VIPs should fail over at the same time, assuming all of the VHIDs, passwords, etc. match up properly, which they normally do if you've let the config sync copy them from master to slave. Did you confirm in both GUIs that the CARP status really still shows 'master' on the master and 'backup' on the backup with the cable unplugged from the master?
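To make the "VHIDs, passwords, etc. match up" check concrete, here is a minimal Python sketch that compares the VIP settings of two nodes. It is only an illustration: the `CarpVip` class, its fields, and the sample values are hypothetical and are not read from pfSense or its config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarpVip:
    """One CARP virtual IP as configured on a node (hypothetical model)."""
    name: str        # e.g. "vip1"
    interface: str   # e.g. "lan" or "wan"
    vhid: int        # virtual host ID, must match on both nodes
    password: str    # shared secret, must match on both nodes
    advskew: int     # lower value wins the master election


def find_mismatches(master, backup):
    """Return human-readable problems where the two nodes disagree."""
    problems = []
    backup_by_name = {vip.name: vip for vip in backup}
    for vip in master:
        peer = backup_by_name.get(vip.name)
        if peer is None:
            problems.append(f"{vip.name}: missing on backup")
            continue
        if vip.vhid != peer.vhid:
            problems.append(f"{vip.name}: VHID {vip.vhid} != {peer.vhid}")
        if vip.password != peer.password:
            problems.append(f"{vip.name}: password differs")
        if vip.interface != peer.interface:
            problems.append(
                f"{vip.name}: interface {vip.interface} != {peer.interface}"
            )
        if peer.advskew <= vip.advskew:
            problems.append(
                f"{vip.name}: backup advskew {peer.advskew} "
                f"is not higher than master advskew {vip.advskew}"
            )
    return problems


# Made-up sample data: vip2 has a different VHID on the backup.
master_vips = [
    CarpVip("vip1", "lan", 1, "s3cret", 0),
    CarpVip("vip2", "wan", 2, "s3cret", 0),
]
backup_vips = [
    CarpVip("vip1", "lan", 1, "s3cret", 100),
    CarpVip("vip2", "wan", 3, "s3cret", 100),
]
for problem in find_mismatches(master_vips, backup_vips):
    print(problem)
```

An empty result only says the two lists agree with each other; it does not prove that the running CARP state on either box is healthy.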
Yes, on the master all VIPs show as master and on the backup all VIPs show as backup. Interestingly, if I pull the LAN and WAN cables simultaneously, all VIPs fail over and things work as expected, but pulling just one results in failover of only the VIPs on that interface. The sync interface is a dedicated port on each firewall, and I have the following VHIDs:
vip1/vip2 are the virtual gateways for LAN and WAN respectively, and vip11/vip12 are virtual IPs on the WAN side for port forwards, etc. When you say the VHIDs should match, do you mean they should be listed with the same CARP name on both firewalls? If so, then yes: the CARP:Status page shows the master with all four VIPs and the slave with all four VIPs, each matching in VHID. I did notice when these were first built that sync did not happen right away. I got a message saying the slave was running an older version of sync and automatic sync would not happen, so I had to press force sync for the initial sync to happen. Since then sync seems fine, and these firewalls are identical in hardware and software versions. Could this be related to the problem? At the end of the day I do have HA, but the master would have to die completely for it to work.
What snapshot are you running on those? There was a CARP sync issue quite a while ago, but that has been fixed. You might want to try a snapshot from last Thursday or early Friday. Some CARP patches that went in on Friday afternoon have caused instability that hasn't been fixed yet, so avoid those at the moment.
fw1: 2.0-BETA5 (i386)
built on Fri Jan 21 19:22:57 EST 2011
fw2: 2.0-BETA5 (i386)
built on Fri Jan 21 19:22:57 EST 2011
These are pretty recent (Friday). Do you think I should update?
Not yet; wait for the next new snapshot. It has some CARP patches that may not help with this specific problem, but if you get a build between yours and that one, it could panic at bootup.
Are snapshots from today onward ok?
They're supposed to be, but I saw a post from at least one other person who said they were still seeing a panic.
This problem still persists.
I have 7 VIPs and they switch over regularly between the master and the slave. In fact, the slave is master for most VIPs most of the time. As the OP states, when the VIPs are not all active on the same server, there is no network connectivity between the VIPs on the master and the ones on the slave.
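For anyone trying to spot this state quickly, a small Python sketch along these lines can flag it from the per-VIP states shown on each box. The node names, VIP names, and the dictionary layout are assumptions made for illustration; the states would have to be copied by hand from each firewall's CARP status page.

```python
def mastership(status_by_node):
    """Map each node to the sorted list of VIPs it currently holds as MASTER."""
    return {
        node: sorted(vip for vip, state in states.items() if state == "MASTER")
        for node, states in status_by_node.items()
    }


def is_split(status_by_node):
    """True when more than one node is MASTER for at least one VIP."""
    holders = [node for node, vips in mastership(status_by_node).items() if vips]
    return len(holders) > 1


# Made-up example: fw1 still owns the LAN VIP while fw2 owns the WAN VIPs.
status = {
    "fw1": {"vip1": "MASTER", "vip2": "BACKUP", "vip11": "BACKUP"},
    "fw2": {"vip1": "BACKUP", "vip2": "MASTER", "vip11": "MASTER"},
}
print(mastership(status))
print(is_split(status))
```

When `is_split` returns `True`, traffic entering through one box would need to leave through the other, which is the broken path described above.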
I see two solutions:
1: always switch over all VIPs
2: route the traffic between out-of-sync VIPs over the SYNC network
I'm running "2.0-RC1 (amd64) built on Thu Apr 7 21:38:24 EDT 2011" on both servers.
All VIPs should switch together the way things are right now. I haven't seen anyone else (with a proper setup) lately whose CARP VIPs are not all failing over as a group.
There isn't a way to route traffic over the sync interface that way.
I just tried on my CARP VM pair, and if I disconnect an interface on the master, all of the VIPs fail over at once to the backup box.
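The difference between the expected group failover and the per-interface failover reported earlier in the thread can be sketched with a toy election model in Python. This is a simplification and not the kernel's implementation: it assumes that, with preemption enabled, a node that loses any CARP-carrying link demotes all of its VIPs to a high advskew (240 here, as I understand the older carp(4) documentation), and that the lowest advskew wins. The node layout and the `elect_masters` helper are hypothetical.

```python
DEMOTED_ADVSKEW = 240  # assumed demotion value, see the note above


def elect_masters(nodes, group_failover=True):
    """Return {vip_name: node_name} for a simplified CARP election.

    nodes: {node: {"links": {iface: is_up}, "vips": {vip: (iface, advskew)}}}
    """
    best = {}
    for node_name, node in nodes.items():
        any_link_down = not all(node["links"].values())
        for vip, (iface, advskew) in node["vips"].items():
            if not node["links"][iface]:
                continue  # a node cannot advertise on a dead link
            if group_failover and any_link_down:
                advskew = DEMOTED_ADVSKEW  # demote every VIP on this node
            current = best.get(vip)
            if current is None or advskew < current[1]:
                best[vip] = (node_name, advskew)
    return {vip: owner for vip, (owner, _) in best.items()}


# WAN cable pulled on fw1, everything else up.
nodes = {
    "fw1": {
        "links": {"lan": True, "wan": False},
        "vips": {"vip1": ("lan", 0), "vip2": ("wan", 0)},
    },
    "fw2": {
        "links": {"lan": True, "wan": True},
        "vips": {"vip1": ("lan", 100), "vip2": ("wan", 100)},
    },
}
print(elect_masters(nodes, group_failover=True))   # {'vip1': 'fw2', 'vip2': 'fw2'}
print(elect_masters(nodes, group_failover=False))  # {'vip1': 'fw1', 'vip2': 'fw2'}
```

With group demotion the backup takes every VIP, which matches the VM test above; without it, only the VIP on the failed interface moves, which matches the symptom the OP described.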