Unexpected CARP behavior
-
I am not sure whether this is related to 2.6 or whether it is simply designed to work like this.
I have a multi-WAN configuration with CARP on two firewalls.
WAN1 is PPPoE
WAN2 is DHCP
I have configured a failover gateway:
WAN1 is tier1
WAN2 is tier2
LAN and WAN2 have CARP VIP enabled and configured using https://docs.netgate.com/pfsense/en/latest/recipes/high-availability.html
When WAN2 is disconnected on both firewalls, this causes the primary firewall to go into the backup state:
Nov 28 18:18:52 check_reload_status 416 updating dyndns opt4
Nov 28 18:18:51 php-fpm 49072 /rc.carpbackup: HA cluster member "(10.0.100.99@igb0): (WAN2)" has resumed CARP state "BACKUP" for vhid 6
Nov 28 18:18:50 check_reload_status 416 Carp backup event
Nov 28 18:18:50 kernel carp: 6@igb0: MASTER -> BACKUP (more frequent advertisement received)
Nov 28 18:18:50 kernel carp: 6@igb0: BACKUP -> MASTER (master timed out)
Nov 28 18:18:50 check_reload_status 416 Carp master event
Is this expected behavior?
-
PPPoE is not compatible with CARP and that is not a valid HA configuration.
-
@jimp
I am not using PPPoE (WAN1) in the CARP configuration. Do you mean that this CARP configuration is invalid because CARP knows nothing about the PPPoE WAN and does a reload? What I don't understand is why the primary firewall changes its state to BACKUP for just a couple of seconds.
It's invalid because it has PPPoE at all -- it's not compatible with CARP, and since that WAN is not involved in CARP, it couldn't participate in HA properly. While some parts of that may seem OK on paper, in practice it's an unsupported configuration, so if it doesn't work, don't be surprised.
When you unplug an interface involved in CARP, a node will demote itself because of the hardware "failure". If it's unplugged on both, both nodes will be demoting themselves and they may have some issues figuring out the correct status because they both believe they have failed. Depending on the demotion counters they may have both maxed out their advskew and end up advertising at the same rate.
That isn't the kind of failure most people would ever see in production.
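To illustrate the demotion point above, here is a minimal sketch of how two nodes with different configured skews can end up advertising at the same rate. The constants are assumptions for illustration (a skew cap of 240, a link-down demotion of 240, and configured skews of 0 and 100 for primary and secondary); they are not read from this cluster, and the function names are hypothetical.

```python
# Illustrative sketch only: the constants below are assumed values,
# not taken from the firewalls discussed in this thread.
CARP_MAXSKEW = 240      # assumed cap on the effective advskew
IFDOWN_DEMOTION = 240   # assumed demotion added when a CARP interface loses link


def effective_advskew(configured_skew: int, demotion: int) -> int:
    """Clamp configured skew plus demotion into the range [0, CARP_MAXSKEW]."""
    return max(0, min(CARP_MAXSKEW, configured_skew + demotion))


def advert_interval(advbase: int, advskew: int) -> float:
    """Seconds between advertisements: advbase + advskew / 256."""
    return advbase + advskew / 256


# Hypothetical cluster: primary skew 0, secondary skew 100, advbase 1.
primary = effective_advskew(0, IFDOWN_DEMOTION)
secondary = effective_advskew(100, IFDOWN_DEMOTION)

print(primary, secondary)
print(advert_interval(1, primary) == advert_interval(1, secondary))
```

Under these assumptions both nodes hit the cap, so neither advertises more frequently than the other, which would fit the brief MASTER/BACKUP flapping in the log.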
-
Thanks for the explanation, @jimp.
So then I use CARP only for LAN, and it looks like this works pretty well.
@w0w said in Unexpected CARP behavior:
So then I use CARP only for LAN, and it looks like this works pretty well.
But isn't CARP for LAN only pretty much useless in your case? It doesn't really do anything other than provide a fallback in case your LAN is faulty or you do an upgrade. If that's all you need, that's fine :)
But why would you tear down a functioning CARP setup on WAN2 because of that? Isn't it an option to put a PPPoE dial-up device in front of your CARP cluster, let it handle the dial-up, and set up a nice little WAN1 CARP setup that way?
Just curious and wondering if that's not an option :)
-
@jegr said in Unexpected CARP behavior:
@w0w said in Unexpected CARP behavior:
So then I use CARP only for LAN, and it looks like this works pretty well.
But isn't CARP for LAN only pretty much useless in your case? It doesn't really do anything other than provide a fallback in case your LAN is faulty or you do an upgrade. If that's all you need, that's fine :)
But why would you tear down a functioning CARP setup on WAN2 because of that? Isn't it an option to put a PPPoE dial-up device in front of your CARP cluster, let it handle the dial-up, and set up a nice little WAN1 CARP setup that way?
Just curious and wondering if that's not an option :)
Yes, a faulty firewall or an upgrade is mostly my case.
If I put a device that handles PPPoE in front of the CARP cluster...
How do you imagine this? As another NAT device?