Primary does not auto fallback with pfsense 2.7.2
-
I run a HA pair of pfsense firewalls. Both are running v2.7.2-RELEASE.
I needed to shut down the primary firewall for some hardware maintenance. This is the first time that this was done with the cluster running 2.7.2. The secondary firewall took over as the main (as expected), but remained as the main after the primary came back up. The primary came up as the backup in this case.
I left the firewalls in this state for several hours, with no change. However, when I rebooted the main (secondary), the primary took over and the secondary came up as the backup.
In the past, the secondary would persist as the main only while the primary was unavailable; the primary would automatically take over as the main as soon as it became available again.
Nothing has changed in regard to CARP settings. I run the recommended setup with 3 x NICs, one each for WAN, LAN and sync. The primary's CARP advertisement settings are 1 sec with 0 skew (for both LAN and WAN interfaces) and the secondary is set to 1 sec with 100 skew.
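For reference, CARP derives each node's advertisement interval from these two values as advbase + advskew/256 seconds, and the node that advertises more frequently is preferred as main. A minimal Python sketch of that arithmetic for the settings above (the function name carp_interval is just an illustration, not part of pfSense):

```python
def carp_interval(advbase, advskew):
    """Seconds between CARP advertisements: advbase plus advskew/256."""
    return advbase + advskew / 256


# Values as configured in the UI for this cluster.
primary = carp_interval(advbase=1, advskew=0)
secondary = carp_interval(advbase=1, advskew=100)

print(f"primary advertises every {primary} s")
print(f"secondary advertises every {secondary} s")

# The primary should win the election because its interval is shorter.
print("primary preferred:", primary < secondary)
```

With these settings the primary's interval is shorter, so it should reclaim the main role whenever it is up.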
Has anyone else noticed this behaviour?
Having said this, the issue may in fact be a good thing, in that the cluster can be left in a failed-over state for long periods of time whilst maintaining high availability. This balances out the "wear and tear" on the hardware. However, any config changes would still need to be made with the primary as main (this has always been the case, but extra care is now needed to confirm it before making changes).
-
@jypsilantis I have not, but there is a button on the CARP status page to enter persistent maintenance mode, and I use that to force a node to backup.
Check https://docs.netgate.com/pfsense/en/latest/troubleshooting/high-availability.html#primary-node-is-stuck-as-backup
-
@SteveITS, thank you for the prompt reply.
I have delved into the issue a little further, and it appears to be a bug in the UI and supporting code for v2.7.2-RELEASE, specifically in the way that the UI sets up the low-level CARP parameters on the NICs.
Both the primary and secondary NICs are set up with advskew = 254 at the ifconfig level, regardless of what is specified via the UI.
Taking my LAN setups as an example, I use the following CARP parameters on the UI:
primary: advbase=1, advskew=0, base IP=10.0.0.1/16
secondary: advbase=1, advskew=100, base IP=10.0.0.2/16
However, when I look at the NICs via ifconfig, I see that advskew is set to 254. The same happens with my WAN NICs. Here is the output of ifconfig for each of the LAN NICs:
Primary:
em1: flags=1008943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
description: LAN
options=48120b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,HWSTATS,MEXTPG>
ether 00:0b:ab:86:50:fa
inet 10.0.0.1 netmask 0xffff0000 broadcast 10.0.255.255
inet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 2
inet6 fe80::20b:abff:fe86:50fa%em1 prefixlen 64 scopeid 0x2
carp: MASTER vhid 2 advbase 1 advskew 254
peer 224.0.0.18 peer6 ff02::12
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Secondary:
em1: flags=1008943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
description: LAN
options=48120b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,HWSTATS,MEXTPG>
ether 00:0b:ab:86:52:12
inet 10.0.0.2 netmask 0xffff0000 broadcast 10.0.255.255
inet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 2
inet6 fe80::20b:abff:fe86:5212%em1 prefixlen 64 scopeid 0x2
carp: BACKUP vhid 2 advbase 1 advskew 254
peer 224.0.0.18 peer6 ff02::12
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
This would explain what I am seeing. For example, with the primary as main and the secondary as backup, a reboot of the primary causes the secondary to assume main, but nothing happens when the primary comes back up, because both nodes are at the same priority. I think that it is pure chance that I don't get crossed-over states between the LAN and WAN interfaces.
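The comparison above can be automated. The following is a rough Python sketch, assuming the ifconfig output of each node has been saved as text; the helper names parse_carp and tied_vhids are made up for illustration, and the regular expression only covers the "carp:" line format shown in this post.

```python
import re

# Matches the "carp:" status line as it appears in the ifconfig output above.
CARP_LINE = re.compile(
    r"carp:\s+(?P<state>\w+)\s+vhid\s+(?P<vhid>\d+)"
    r"\s+advbase\s+(?P<advbase>\d+)\s+advskew\s+(?P<advskew>\d+)"
)


def parse_carp(ifconfig_output):
    """Return one dict per carp line found in the given ifconfig text."""
    entries = []
    for m in CARP_LINE.finditer(ifconfig_output):
        entries.append({
            "state": m.group("state"),
            "vhid": int(m.group("vhid")),
            "advbase": int(m.group("advbase")),
            "advskew": int(m.group("advskew")),
        })
    return entries


def tied_vhids(primary_output, secondary_output):
    """List VHIDs where both nodes have identical advbase and advskew."""
    secondary = {e["vhid"]: e for e in parse_carp(secondary_output)}
    tied = []
    for e in parse_carp(primary_output):
        other = secondary.get(e["vhid"])
        if other and (e["advbase"], e["advskew"]) == (
            other["advbase"], other["advskew"]
        ):
            tied.append(e["vhid"])
    return tied


# The carp lines from the two em1 outputs above.
primary_em1 = "carp: MASTER vhid 2 advbase 1 advskew 254"
secondary_em1 = "carp: BACKUP vhid 2 advbase 1 advskew 254"

print("VHIDs with no priority difference:", tied_vhids(primary_em1, secondary_em1))
```

Any VHID it reports has no priority difference between the nodes, so whichever node currently holds main will keep it.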
Is there a portal where I can report this to the developers?
-
@jypsilantis said in Primary does not auto fallback with pfsense 2.7.2:
Is there a portal where I can report this to the developers?
redmine.pfsense.org.
I am not seeing that on 23.09.1, but this setup was created long ago and has been upgraded many times:
carp: MASTER vhid 151 advbase 1 advskew 0
peer 224.0.0.18 peer6 ff02::12
carp: MASTER vhid 152 advbase 1 advskew 0
peer 224.0.0.18 peer6 ff02::12
carp: MASTER vhid 154 advbase 1 advskew 0
peer 224.0.0.18 peer6 ff02::12
Just to ask, you haven't set the backup to sync to the primary, have you? It should be one direction only...
-
@SteveITS Thank you for this. I expected to see something similar with my primary's NICs. I did however set up CARP a number of times with the UI in 2.7.2, which may have triggered the problem in my case.
I have bi-directional pfsync set up, but XMLRPC sync is only from the primary to the secondary.
I will report the issue to the developers.