Netgate Discussion Forum

    Backup node taking over CARP Virtual IP

    HA/CARP/VIPs
    11 Posts 2 Posters 1.4k Views
    • DerelictD
      Derelict LAYER 8 Netgate @jypsilantis
      last edited by Derelict

      @jypsilantis CARP/pfSense HA is incompatible with dynamic addresses like those obtained via DHCP. I would say if it ever worked it was a fluke.

      In a normal CARP setup the only reason a VIP in the BACKUP state would go MASTER is if that interface stopped receiving "better" advertisements from the MASTER node.
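
A rough way to picture that rule is the sketch below. It is a simplified model, not pfSense or FreeBSD code: it ignores timers and the preempt setting, and the `Advertisement` class, `next_state` function, and the skew values 0 and 100 are illustrative assumptions rather than values taken from this thread. It ranks advertisements by `advbase` and then `advskew`, with lower being better.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Advertisement:
    """The two fields used here to rank advertisements (lower is better)."""
    advbase: int  # base advertisement interval in seconds
    advskew: int  # skew; a higher value makes a node less preferred

    def rank(self) -> tuple:
        return (self.advbase, self.advskew)


def next_state(state: str, own: Advertisement,
               heard: Optional[Advertisement]) -> str:
    """Return the VIP state after one listening interval.

    `heard` is the best advertisement received from the peer during the
    interval, or None if nothing arrived (for example, multicast is blocked).
    """
    peer_is_better = heard is not None and heard.rank() < own.rank()
    if state == "BACKUP":
        # A BACKUP VIP stays BACKUP only while better advertisements arrive.
        return "BACKUP" if peer_is_better else "MASTER"
    # A MASTER VIP yields when it hears a better advertisement.
    return "BACKUP" if peer_is_better else "MASTER"


# Hypothetical values: primary with skew 0, secondary with skew 100.
primary = Advertisement(advbase=1, advskew=0)
secondary = Advertisement(advbase=1, advskew=100)
print(next_state("BACKUP", secondary, primary))  # advertisements arriving
print(next_state("BACKUP", secondary, None))     # advertisements lost
```

In this model the secondary only leaves BACKUP when the primary's advertisements stop reaching it, which is why a lost layer 2 path is the usual suspect.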

      Chattanooga, Tennessee, USA
      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
      Do Not Chat For Help! NO_WAN_EGRESS(TM)

      • J
        jypsilantis @Derelict
        last edited by jypsilantis

        @derelict thank you for the quick reply.

        My LAN NICs are set to static IP addresses and the same is happening on these as well. I can try changing over to statics on the WAN side as well, but I think it won't make much difference.

        The strange thing is that the backup is still showing "backup" even though it has control of the VIP.

        It is almost as if there is some kind of load balancing happening: for the most part, the backup appears to be slightly less loaded than the primary.

        Everything seems to be working properly so I am not too concerned on that part, just a bit confusing when you try to log onto the active master and end up on the backup.

        • DerelictD
          Derelict LAYER 8 Netgate
          last edited by

          @jypsilantis In general, unless the primary node is in maintenance mode, all CARP VIPs on the primary should be MASTER and all CARP VIPs on the secondary should be BACKUP.

          If that is not the case the problem is generally a layer 2 / multicast/broadcast domain problem in the path between the nodes on that network.

          There is a sticky at the top of this category in which I attempted to explain the various parts of an HA cluster.

          • J
            jypsilantis @Derelict
            last edited by

            @derelict thanks for this.

            I looked at the persistent article that you mentioned. The symptoms in my case are different - I do not have a master/master situation, so it looks like the nodes are correctly resolving and establishing priority orders.

            I have a managed switch on the LAN side, and the modem has handled CARP without missing a beat for at least 3 years now. The problem appears to have started with the upgrade to the latest version of pfSense, which I installed a few days ago.

            • DerelictD
              Derelict LAYER 8 Netgate @jypsilantis
              last edited by

              @jypsilantis Like I said, CARP is not compatible with interfaces that obtain their addressing from DHCP and never has been. I am probably misunderstanding what you actually have there.

              • J
                jypsilantis @Derelict
                last edited by

                @derelict, one pair of interfaces (WAN) is on DHCP and the others (LAN) are truly statically addressed (not DHCP pseudo-static). The problem occurs on both.

                If DHCP were the issue on the WAN, I would see, for example

                Interface          CARP state   Ownership of VIP
                Primary LAN NIC    PRIMARY      Yes
                Backup LAN NIC     BACKUP       No
                Primary WAN NIC    BACKUP       No
                Backup WAN NIC     PRIMARY      Yes

                What I am actually seeing:

                Interface          CARP state   Ownership of VIP
                Primary LAN NIC    PRIMARY      No
                Backup LAN NIC     BACKUP       Yes
                Primary WAN NIC    PRIMARY      No
                Backup WAN NIC     BACKUP       Yes

                • DerelictD
                  Derelict LAYER 8 Netgate @jypsilantis
                  last edited by

                  @jypsilantis That doesn't make much sense. You might want to just post screen shots of the CARP status pages or, better, output from both nodes of ifconfig -vvvvma

                  Some terminology so everyone's on the same page: Nodes are primary/secondary, VIPs are MASTER/BACKUP.
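
To compare the two nodes side by side, the CARP lines can be pulled out of saved ifconfig output with a short script. This is only a sketch: it assumes the FreeBSD-style line `carp: MASTER vhid 1 advbase 1 advskew 0`, and the interface name `igb0`, the VHID, and the sample text are hypothetical, not taken from either node in this thread.

```python
import re

# Assumed format of the per-VIP status line in FreeBSD ifconfig output.
CARP_LINE = re.compile(
    r"carp:\s+(?P<state>MASTER|BACKUP|INIT)\s+vhid\s+(?P<vhid>\d+)"
    r"\s+advbase\s+(?P<advbase>\d+)\s+advskew\s+(?P<advskew>\d+)"
)
# An interface block starts at column 0, e.g. "igb0: flags=...".
IFACE_LINE = re.compile(r"^(?P<name>[A-Za-z0-9_.]+):\s+flags=")


def carp_states(ifconfig_output: str) -> list:
    """Return one dict per CARP VIP found in the given ifconfig text."""
    rows = []
    iface = None
    for line in ifconfig_output.splitlines():
        header = IFACE_LINE.match(line)
        if header:
            iface = header.group("name")
            continue
        carp = CARP_LINE.search(line)
        if carp:
            rows.append({
                "iface": iface,
                "state": carp.group("state"),
                "vhid": int(carp.group("vhid")),
                "advbase": int(carp.group("advbase")),
                "advskew": int(carp.group("advskew")),
            })
    return rows


# Hypothetical sample, shaped like one interface block from a primary node.
SAMPLE = """\
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tinet 10.0.0.1 netmask 0xffff0000 broadcast 10.0.255.255
\tinet 10.0.0.3 netmask 0xffff0000 broadcast 10.0.255.255 vhid 1
\tcarp: MASTER vhid 1 advbase 1 advskew 0
"""

for row in carp_states(SAMPLE):
    print(row)
```

Running it against the output saved from each node gives one row per VIP, so a VHID that is MASTER on both nodes, or BACKUP on both, stands out immediately.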

                  • J
                    jypsilantis @Derelict
                    last edited by jypsilantis

                    @derelict thanks for this.

                    Here are some screenshots.

                    fw1.local is the primary member of the HA cluster, and fw2.local is the backup

                    The WAN-side NICs share address 10.1.0.10, which is presented by the active member to the broadband modem/router. The modem/router assigns "primary" IP addresses to each member via DHCP: 10.1.0.97 for fw1 and 10.1.0.85 for fw2

                    The LAN-side NICs share address 10.0.0.3. fw1 has an intrinsic static IP of 10.0.0.1 and fw2 has 10.0.0.2.

                    From the screenshots, you can see that fw2 is in the BACKUP state on both of its NICs, and fw1 is MASTER on both of its interfaces. As such, there is no split master/backup or dual master/master. These statuses appear to persist.

                    However, when I access the web interface via 10.0.0.3, I land on fw2. Similarly, the modem/router reports that fw2 has control of 10.1.0.10.

                    If I reboot the backup, then fw1 takes over 10.0.0.3 and I get to its web interface via this address. However, several minutes after fw2 comes back up, it resumes control of 10.0.0.3 and the status as per the screenshots returns.

                    This is counterintuitive, but strangely everything seems to be working fine in all other respects.

                    [edit: just noticed that the netmask for the LAN-side CARP VIP was wrong; it should be /16. I have corrected it, but it had no effect on the above behaviour: fw2 took over 10.0.0.3 shortly after reboot]
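
A netmask mismatch like that one can be checked with a few lines of Python. This is a sketch only: the helper name `vip_matches_interface` is made up for illustration, and the /24 in the second call is a hypothetical wrong value, since the original mask is not stated above.

```python
import ipaddress


def vip_matches_interface(vip_cidr: str, iface_cidr: str) -> bool:
    """True when the CARP VIP and the interface address share one subnet."""
    vip = ipaddress.ip_interface(vip_cidr)
    iface = ipaddress.ip_interface(iface_cidr)
    return vip.network == iface.network


# LAN addresses from the post above, with the corrected /16 mask.
print(vip_matches_interface("10.0.0.3/16", "10.0.0.1/16"))
# Hypothetical wrong mask on the VIP.
print(vip_matches_interface("10.0.0.3/24", "10.0.0.1/16"))
```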

                    Screen Shot 2021-03-17 at 3.30.16 pm.png Screen Shot 2021-03-17 at 3.30.41 pm.png Screen Shot 2021-03-17 at 3.31.09 pm.png

                    • DerelictD
                      Derelict LAYER 8 Netgate @jypsilantis
                      last edited by

                      @jypsilantis You'll need to look at layer 2 and see what is happening with the CARP MAC address. Everything there looks fine. Be sure you're also not doing something like port forwarding the webgui connections around.
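
As a starting point for that layer 2 check: a CARP VIP answers ARP with a virtual MAC of the form 00:00:5e:00:01:XX, where XX is the VHID in hex. The sketch below builds that address and compares it with whatever a LAN client has in its ARP table for the VIP. The VHID 1 and the sample MAC strings are hypothetical, since the thread does not state them.

```python
def carp_mac(vhid: int) -> str:
    """Return the virtual MAC address CARP uses for the given VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and zero-pad each octet (some tools print 0:0:5e:0:1:1)."""
    octets = mac.strip().replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(f"{int(octet, 16):02x}" for octet in octets)


def arp_entry_is_carp(arp_mac: str, vhid: int) -> bool:
    """True when the MAC a client learned for the VIP is the CARP virtual MAC."""
    return normalize_mac(arp_mac) == carp_mac(vhid)


# Hypothetical ARP entries seen by a client for the VIP, assuming VHID 1.
print(arp_entry_is_carp("0:0:5e:0:1:1", 1))
print(arp_entry_is_carp("00:11:22:33:44:55", 1))
```

If a client holds a node's physical MAC for the VIP instead of the virtual one, something other than CARP is answering for that address.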

                      • J
                        jypsilantis @Derelict
                        last edited by

                        @derelict I may have found the problem. Possibly a corrupt or failing disk.

                        I replaced the disk on the backup node today, rebuilt it, and restored the configs from a previous (recent) backup file. Everything looks fine now.

                        I will keep monitoring in case the problem reoccurs, but it may be something as simple as this.

                        A really strange symptom if it is in fact a failing disk. SMART status was OK, so perhaps some corruption from the recent power outage that took out my primary firewall disk.

                        For anyone else who may experience this issue, try rebooting with the disk repair option, and/or change out the disk and rebuild/restore.

                        Thanks for your help and guidance.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.