Netgate Discussion Forum

    Primary does not auto fallback with pfsense 2.7.2

    HA/CARP/VIPs
    5 Posts 2 Posters 645 Views
    • J
      jypsilantis

      I run a HA pair of pfsense firewalls. Both are running v2.7.2-RELEASE.

      I needed to shut down the primary firewall for some hardware maintenance. This is the first time that this was done with the cluster running 2.7.2. The secondary firewall took over as the main (as expected), but remained as the main after the primary came back up. The primary came up as the backup in this case.

      I left the firewalls in this state for several hours, with no change. However, when I rebooted the main (secondary), the primary took over and the secondary came up as the backup.

      In the past, the secondary would persist as the main only while the primary was unavailable: the primary would automatically take over as the main as soon as it became available again.

      Nothing has changed in regard to CARP settings. I run the recommended setup with 3 x NICs, one each for WAN, LAN and sync. The primary's CARP advertisement settings are 1 sec with 0 skew (for both LAN and WAN interfaces) and the secondary is set to 1 sec with 100 skew.
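
For reference, CARP derives the advertisement interval from these two values as advbase + advskew/256 seconds, and the node that advertises most often is preferred as master. A minimal Python sketch of that arithmetic for the settings above; the function name `carp_advert_interval` is hypothetical and only illustrates the formula, it is not part of pfSense:

```python
def carp_advert_interval(advbase: int, advskew: int) -> float:
    """Return the seconds between CARP advertisements: advbase + advskew/256."""
    # ifconfig accepts advbase 1..255 and advskew 0..254.
    if not 1 <= advbase <= 255:
        raise ValueError("advbase must be in 1..255")
    if not 0 <= advskew <= 254:
        raise ValueError("advskew must be in 0..254")
    return advbase + advskew / 256


# Settings configured in the UI for this cluster.
primary = carp_advert_interval(advbase=1, advskew=0)
secondary = carp_advert_interval(advbase=1, advskew=100)

print(f"primary advertises every {primary:.3f}s, secondary every {secondary:.3f}s")
```

With these settings the primary advertises every 1.000 s and the secondary every 1.391 s, so the primary should be preferred whenever both nodes are healthy.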

      Has anyone else noticed this behaviour?

      Having said this, the behaviour may in fact be a good thing, in that the cluster can be left in a failed-over state for long periods of time whilst maintaining high availability. This balances out the "wear and tear" on the hardware. However, any config changes would still need to be made with the primary as main. This has always been the case, but extra care is now needed to confirm it before making changes.

      • S
        SteveITS Galactic Empire @jypsilantis

        @jypsilantis I have not, but there is a button on the CARP status page to enter persistent maintenance mode, and I use that to force a node to backup.

        Check https://docs.netgate.com/pfsense/en/latest/troubleshooting/high-availability.html#primary-node-is-stuck-as-backup

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

        • J
          jypsilantis @SteveITS

          @SteveITS, thank you for the prompt reply.

          I have delved into the issue a little further, and it appears to be a bug in the v2.7.2-RELEASE UI and its supporting code, specifically in the way the UI applies the low-level CARP parameters to the NICs.

          Both the primary and secondary NICs are set up with advskew = 254 at the ifconfig level, regardless of what is specified via the UI.

          Taking my LAN setups as an example, I use the following CARP parameters on the UI:

          primary: advbase=1, advskew=0, base IP=10.0.0.1/16
          secondary: advbase=1, advskew=100, base IP=10.0.0.2/16

          However, when I look at the NICs via ifconfig, I see that advskew is set to 254 on both nodes. The same happens with my WAN NICs. Here is the output of ifconfig for each of the LAN NICs:

          Primary:

          em1: flags=1008943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
          description: LAN
          options=48120b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,HWSTATS,MEXTPG>
          ether 00:0b:ab:86:50:fa
          inet 10.0.0.1 netmask 0xffff0000 broadcast 10.0.255.255
          inet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 2
          inet6 fe80::20b:abff:fe86:50fa%em1 prefixlen 64 scopeid 0x2
          carp: MASTER vhid 2 advbase 1 advskew 254
          peer 224.0.0.18 peer6 ff02::12
          media: Ethernet autoselect (1000baseT <full-duplex>)
          status: active
          nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

          Secondary:

          em1: flags=1008943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
          description: LAN
          options=48120b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,HWSTATS,MEXTPG>
          ether 00:0b:ab:86:52:12
          inet 10.0.0.2 netmask 0xffff0000 broadcast 10.0.255.255
          inet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 2
          inet6 fe80::20b:abff:fe86:5212%em1 prefixlen 64 scopeid 0x2
          carp: BACKUP vhid 2 advbase 1 advskew 254
          peer 224.0.0.18 peer6 ff02::12
          media: Ethernet autoselect (1000baseT <full-duplex>)
          status: active
          nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
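
To compare the effective values on both nodes without reading the full output by eye, the `carp:` lines can be pulled out with a short script. This is only a sketch, assuming the `carp:` line format shown in the output above; `parse_carp` and `CARP_LINE` are hypothetical names, not pfSense or FreeBSD tooling:

```python
import re

# Matches lines such as: "carp: MASTER vhid 2 advbase 1 advskew 254"
CARP_LINE = re.compile(
    r"carp:\s+(?P<state>\w+)\s+vhid\s+(?P<vhid>\d+)"
    r"\s+advbase\s+(?P<advbase>\d+)\s+advskew\s+(?P<advskew>\d+)"
)


def parse_carp(ifconfig_output: str) -> list[dict]:
    """Return one dict per carp: line found in the given ifconfig output."""
    results = []
    for m in CARP_LINE.finditer(ifconfig_output):
        results.append({
            "state": m["state"],
            "vhid": int(m["vhid"]),
            "advbase": int(m["advbase"]),
            "advskew": int(m["advskew"]),
        })
    return results


# Excerpt of the primary's output from this post.
sample = """\
inet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 2
carp: MASTER vhid 2 advbase 1 advskew 254
peer 224.0.0.18 peer6 ff02::12
"""

entries = parse_carp(sample)
print(entries)
```

Running the same function over the output from each node makes it easy to see whether the advskew applied at the interface level matches what was entered in the UI.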

          This would explain what I am seeing: with the primary as main and the secondary as backup, a reboot of the primary causes the secondary to become main, but nothing happens when the primary comes back up, because both nodes advertise at the same priority. I think that it is pure chance that I don't get crossed-over states between the LAN and WAN interfaces.
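
The reasoning can be illustrated with a deliberately simplified model, assuming preemption is enabled on both nodes: a backup only takes over from a live master if it advertises strictly more often. The sketch below is not the kernel's actual logic (it ignores demotion counters and any tie-breaking), and `backup_preempts` is a hypothetical name:

```python
def backup_preempts(master: tuple[int, int], backup: tuple[int, int]) -> bool:
    """Return True if the backup advertises strictly more often than the master.

    Each node is given as (advbase, advskew). Because advskew/256 is always
    less than one second, comparing the tuples orders nodes by interval.
    """
    return backup < master


configured_primary = (1, 0)      # values entered in the UI
configured_secondary = (1, 100)
observed = (1, 254)              # values ifconfig reports on both nodes

# Intended behaviour: the returning primary outranks the secondary and takes over.
intended = backup_preempts(master=configured_secondary, backup=configured_primary)

# Observed behaviour: identical values on both nodes, so nothing changes.
actual = backup_preempts(master=observed, backup=observed)

print(intended, actual)
```

Under this model the configured values give a takeover by the primary, while the observed values leave whichever node is already main in place, which matches the behaviour described above.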

          Is there a portal where I can report this to the developers?

          • S
            SteveITS Galactic Empire @jypsilantis

            @jypsilantis said in Primary does not auto fallback with pfsense 2.7.2:

            Is there a portal where I can report this to the developers?

            redmine.pfsense.org.

            I am not seeing that on 23.09.1, but it is an old setup that has been upgraded many times:

            carp: MASTER vhid 151 advbase 1 advskew 0
            	      peer 224.0.0.18 peer6 ff02::12
            carp: MASTER vhid 152 advbase 1 advskew 0
            	      peer 224.0.0.18 peer6 ff02::12
            carp: MASTER vhid 154 advbase 1 advskew 0
            	      peer 224.0.0.18 peer6 ff02::12
            

            Just to ask, you haven't set the backup to sync to the primary, have you? It should be one direction only...


            • J
              jypsilantis @SteveITS

              @SteveITS Thank you for this. I expected to see something similar with my primary's NICs. I did however set up CARP a number of times with the UI in 2.7.2, which may have triggered the problem in my case.

              I have bi-directional pfsync set up, but XMLRPC sync is only from the primary to the secondary.

              I will report the issue to the developers.

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.