2.4.5 HA with two WAN failover - two interfaces become master after approx. 2 hours
-
Hi Everyone
I started configuring a multi-WAN, multi-LAN FW with pfSense last week. It seems a gorgeous product, and I was able to build an HA FW with two LANs and two WANs (I started with HA and WAN balance / failover; everything else, like the firewall itself, DHCP or DNS, is pending until I can solve this issue).

What I got:
Two pfSense instances: Oracle virtual machines with 5 NICs (virtio, promiscuous mode enabled): one LAN, one DMZ, two WAN (fixed IPs), and a dedicated SYNC interface.

The problem:
I start both FWs and everything works fine (pfsync, CARP, XMLRPC). All HA / failover tests work fine (shutting down either node, disconnecting either WAN link)... but after around 2 hours the slave node becomes MASTER on two interfaces. I need to restart the slave node and everything returns to normal... until two hours later, and so on.

I've been reading many forums trying to fix the issue; the logs only show:
carp: 2@vtnet1: BACKUP -> MASTER (master timed out)
carp: 4@vtnet2: BACKUP -> MASTER (master timed out)

I also checked spanning tree and other blocking features on the switches; all are disabled. I've also tried changing the VIP base value (varying it from 1 to 7), but the behaviour persists.
When only one FW is on, it works OK... but when both FWs run for around 1 - 2 hours, the SLAVE shows the behaviour mentioned above.
I feel that my options are exhausted and I don't know where to look to fix this issue... any comments / help would be appreciated.
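
To see which VIPs flip and how often, the CARP lines can be tallied with a small script. This is only a rough sketch: it assumes every state change is logged with the same `carp: VHID@IFACE: OLD -> NEW (reason)` text as the two lines quoted above, and the function name `tally_carp_transitions` is made up for illustration.

```python
import re
from collections import Counter

# Matches the CARP state-change text quoted in the post, e.g.
# "carp: 2@vtnet1: BACKUP -> MASTER (master timed out)".
# Any prefix before "carp:" (timestamp, host name) is ignored.
CARP_RE = re.compile(
    r"carp: (?P<vhid>\d+)@(?P<iface>[^:\s]+): (?P<old>\w+) -> (?P<new>\w+)"
)


def tally_carp_transitions(lines):
    """Count CARP transitions per (vhid, interface, old state, new state)."""
    counts = Counter()
    for line in lines:
        match = CARP_RE.search(line)
        if match:
            key = (match["vhid"], match["iface"], match["old"], match["new"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # The two log lines from the post, used here as sample input.
    sample = [
        "carp: 2@vtnet1: BACKUP -> MASTER (master timed out)",
        "carp: 4@vtnet2: BACKUP -> MASTER (master timed out)",
    ]
    for (vhid, iface, old, new), n in sorted(tally_carp_transitions(sample).items()):
        print(f"vhid {vhid} on {iface}: {old} -> {new} x{n}")
```

Feeding it an open log file instead of `sample` works the same way, since it only iterates over lines.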
-
Does the log on the master (the FW which is switching to backup status) show a "reloading filter" message just prior to the CARP state change? This seems to be a known issue which causes CARP instability (for us on physical hardware, but apparently the issue is more common on VMs). It will hopefully be fixed in 2.4.5-p1. Some discussion and a possible temporary mitigation can be found here:
https://redmine.pfsense.org/issues/10414
https://forum.netgate.com/topic/153723/after-upgrade-to-2-4-5-primary-in-ha-pair-stops-sending-carp-adv-momentarily-after-firewall-rule-changes-are-applied
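
If it helps, a rough way to check for that pattern is to scan a copy of the master's log for CARP state changes that follow shortly after a filter reload line. The sketch below makes several assumptions: the log is plain text with one message per line, reload messages contain the text "reloading filter" (the exact wording in your log may differ), and "just prior" is approximated by a fixed number of preceding lines rather than by timestamps. The names `find_reload_then_carp` and `window` are hypothetical.

```python
import re
from collections import deque

# CARP state-change text in the form quoted earlier in this thread.
CARP_RE = re.compile(r"carp: \d+@[^:\s]+: \w+ -> \w+")

# Assumption: adjust this to whatever wording your log actually uses.
RELOAD_TEXT = "reloading filter"


def find_reload_then_carp(lines, window=20):
    """Return (reload_line, carp_line) pairs where a CARP state change
    appears within `window` lines after a filter reload message."""
    recent = deque(maxlen=window)  # the last `window` lines seen so far
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        if CARP_RE.search(line):
            # Walk backwards to find the closest preceding reload message.
            for prev in reversed(recent):
                if RELOAD_TEXT in prev.lower():
                    hits.append((prev, line))
                    break
        recent.append(line)
    return hits


if __name__ == "__main__":
    # Made-up lines for illustration only, not real pfSense output.
    sample = [
        "example: reloading filter",
        "example: something unrelated",
        "carp: 2@vtnet1: MASTER -> BACKUP",
    ]
    for reload_line, carp_line in find_reload_then_carp(sample, window=5):
        print(f"{carp_line!r} follows {reload_line!r}")
```

If every CARP state change on the master turns up as a hit, that points towards the issue discussed in the links above; no hits would suggest looking elsewhere.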