Netgate Discussion Forum

    Secondary takes over from functional master?

    HA/CARP/VIPs
      KimmoJ

      I have a new cluster that works great now except for CARP. The secondary wants to steal the CARP master role and I wind up with dual masters.

      I have them connected straight across with a cable on a dedicated Ethernet port I just labeled "hbeat", and have the first set up with an IP of 10.10.10.2, and the other as 10.10.10.3. The master is set up to replicate config across to .3 and that works great, the config is identical. However, I'm unsure as to how CARP actually communicates.

      As soon as I plug in the interfaces on the secondary, it takes over all or a few interfaces and creates a situation where both nodes claim to be master. This breaks communications, obviously.

      The one thing I don't control is our ISP's equipment. It's a switch with two ports active, and I just plug in there. Is that something that might contribute to this? I've set up an identical setup elsewhere to another ISP and had no such issues.

      Both ports are active, I can move the primary between 1 and 2 and have no issues (if the primary is the only one connected).

      Edit: I set the VHIDs up as 1 through 5. Perhaps the Cisco I'm connecting to is messing with that? Maybe I should hike that number up by tacking on a zero, just to be safe?

        Derelict LAYER 8 Netgate

        CARP does not have a "heartbeat" interface. The direct-connect cable often called SYNC is for XMLRPC and state sync. The status of that interface is irrelevant to the function of the CARP VIPs on the traffic interfaces.

        CARP requires good layer 2/multicast through the switching gear between the ports the CARP VIPs are on.

        If the secondary is going MASTER it is probably not seeing the advertisements from the primary arriving on its WAN port.

        Yes, that could certainly be something in the ISP switch.

        You can packet capture for just CARP on the ISP interface and see what other VHIDs might be out there. I generally use the last octet of the CARP IP address as the VHID. If everyone does that on a /24 or smaller, VHID collisions are avoided automatically.
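As a rough illustration of the last-octet convention, here is a minimal Python sketch. The function names `vhid_from_vip` and `find_vhid_collisions` are made up for this example and are not part of pfSense; the only fact assumed is that a VHID must fall in the range 1 to 255.

```python
import ipaddress
from collections import defaultdict


def vhid_from_vip(vip: str) -> int:
    """Return the last octet of an IPv4 CARP VIP, for use as its VHID."""
    last_octet = ipaddress.IPv4Address(vip).packed[-1]
    if last_octet == 0:
        # VHIDs run from 1 to 255, so a .0 address cannot use this convention.
        raise ValueError(f"{vip} ends in 0, which is not a valid VHID")
    return last_octet


def find_vhid_collisions(vips: list[str]) -> dict[int, list[str]]:
    """Group VIPs by derived VHID and keep only VHIDs used more than once."""
    by_vhid: dict[int, list[str]] = defaultdict(list)
    for vip in vips:
        by_vhid[vhid_from_vip(vip)].append(vip)
    return {vhid: ips for vhid, ips in by_vhid.items() if len(ips) > 1}
```

Within a single /24 or smaller every address has a distinct last octet, so `find_vhid_collisions` should come back empty there. Collisions only become possible when the broadcast domain carries a larger subnet, or several subnets.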

        You can also packet capture on the secondary interface to be sure it is seeing the advertisements from the primary. If it is not, CARP will not function and you will get dual-master.
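If you want to pull the VHIDs out of a capture yourself, the sketch below decodes the first bytes of a CARP advertisement. It assumes the header layout of FreeBSD's `struct carp_header` (version and type nibbles, then VHID, advskew, authlen, a reserved byte, advbase), so check that against your own capture before relying on it. `CarpAdvert` and `parse_carp_header` are hypothetical names, and getting the IP payload out of the capture file is left to whatever tool you use.

```python
import struct
from typing import NamedTuple

CARP_IP_PROTO = 112  # IP protocol number CARP shares with VRRP


class CarpAdvert(NamedTuple):
    version: int
    type: int
    vhid: int
    advskew: int
    advbase: int


def parse_carp_header(payload: bytes) -> CarpAdvert:
    """Decode the first six bytes of a CARP packet (the IP payload)."""
    if len(payload) < 6:
        raise ValueError("payload too short to hold a CARP header")
    # Assumed layout: version/type, vhid, advskew, authlen, reserved, advbase.
    ver_type, vhid, advskew, _authlen, _reserved, advbase = struct.unpack_from(
        "!6B", payload
    )
    return CarpAdvert(
        version=ver_type >> 4,   # high nibble
        type=ver_type & 0x0F,    # low nibble
        vhid=vhid,
        advskew=advskew,
        advbase=advbase,
    )
```

Collecting the `vhid` field from every protocol-112 packet seen on the WAN port gives the list of VHIDs already in use on that segment. Running the same thing on the secondary shows whether the primary's advertisements arrive at all.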

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

          KimmoJ

          So does every interface communicate via multicast on that VLAN? I have the machines set up on HP switches internally, with the separate VLANs running "untagged" for the VLAN I want to route through that interface.

          I.e., is it enough that the WAN port (VHID 1) isn't able to communicate with the slave (assuming that's the issue) to cause the other shared IPs to not sort themselves out either?

            Derelict LAYER 8 Netgate

            I added more to the post above. You might want to re-read it.

            It is possible to have just one VIP go dual-master if there is a layer 2 problem on that network.

            Each primary/secondary interface pair needs to be able to exchange multicast traffic to 224.0.0.18 for CARP to function.
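When checking switches for this, it can help to know which Ethernet destination address that multicast group maps to. The sketch below applies the standard IPv4 multicast mapping (the `01:00:5e` prefix followed by the low 23 bits of the group address). `multicast_mac` is a name invented for this example.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet destination MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF       # only the low 23 bits are carried over
    mac = (0x01005E << 24) | low23     # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_mac("224.0.0.18"))  # 01:00:5e:00:00:12
```

Frames to that destination MAC have to make it from the primary's switch port to the secondary's on every VLAN that carries a CARP VIP.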

              KimmoJ

              Thanks again.

              Yeah, the behavior is just plain odd. As soon as I enable CARP on the secondary, even with every network wire physically pulled except LAN, it starts shifting around the master roles. The WAN cable isn't even connected but it still set that to master, as well as DMZ. It happens without me having anything on the ISP switch except the primary. And the LAN port is in the default setting for an HP switch, VLAN 1 untagged.

              I shifted VHID 1-5 to 10-14 and it's still weird.

              I'm going to have to just leave the secondary physically disconnected and wait for the weekend to mess around with this some more, I think. I can't keep disrupting traffic and making the users livid. Should have tested better before going live I guess, just didn't expect these issues. :)

                Derelict LAYER 8 Netgate

                With an interface pulled the CARP status should be INIT, not MASTER. Not sure what you are seeing there.

                  KimmoJ

                  Me either. :) Oh well, I'll do some testing tomorrow, fortunately Saturdays are not very busy. Every other weekday people are working from 5 am to 2 am, so not a lot of maintenance windows.

                  One thing that may be an issue is that I connected the firewalls to separate switches. I have two switches in the main rack configured in such a way as to have redundant paths to every other switch, with spanning tree in RSTP mode, and the switches are trunked (HP's variant of trunked: multiple ports joined together with all VLANs tagged and the default VLAN untagged, i.e. the default). Maybe there's an issue with broadcasts due to that.

                  So step one is to reconfigure the switches, shuffle around some ports and get both firewalls on the same one just to eliminate that potential source of failure.

                  ISP confirmed the switch we connect to is pure layer 2, so there's nothing there that should interfere.

                    Derelict LAYER 8 Netgate

                    If you're bored you can just plug a laptop with Wireshark into the switchports that the secondary should be plugged into. You should be seeing CARP advertisements from the primary. If not, it's not going to work.

                      KimmoJ

                      Good tip, I'll try that as well.

                        bennyc

                        Do you sync the VIPs? If so that could be the cause…
                        Had some issues with that in the past, see: https://forum.pfsense.org/index.php?topic=102740.msg572905#msg572905

                        4x XG-7100 (2xHA), 1x SG-4860, 1x SG-2100
                        1x PC Engines APU2C4, 1x PC Engines APU1C4

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.