Netgate Discussion Forum

    Strange CARP Behavior

    HA/CARP/VIPs
      wheelz

      I see lots of people posting about various CARP issues, and I have read through a bunch of those threads.  They have already helped me solve most of my issues, but now I am seeing something in my test environment that I can't explain at all.

      I have 2 VMs (VMware ESXi 5.0u1) as master and slave.  I have already followed the howto about using a VDS to isolate my promiscuous traffic, and I also enabled Net.ReversePathFwdCheckPromisc on my hosts to get rid of the redundant adapter echoes.  What I can't explain is this:

      If I disconnect all vNICs on my master, the second pfSense becomes the master, so that is working perfectly:

      Jan 13 11:31:28 kernel: vip1: link state changed to UP

      Then I re-enable all the vNICs on my master and as expected, it fails back (though I'm not sure why it repeats so much in the log):

      Jan 13 11:33:58 kernel: vip1: link state changed to DOWN
      Jan 13 11:33:58 kernel: vip1: 2 link states coalesced
      Jan 13 11:33:58 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:33:54 kernel: vip1: link state changed to DOWN
      Jan 13 11:33:54 kernel: vip1: 2 link states coalesced
      Jan 13 11:33:54 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:33:51 kernel: vip1: link state changed to DOWN
      Jan 13 11:33:51 kernel: vip1: 2 link states coalesced
      Jan 13 11:33:51 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:33:48 kernel: vip1: link state changed to DOWN
      Jan 13 11:33:48 kernel: vip1: 2 link states coalesced
      Jan 13 11:33:48 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:33:03 kernel: vip1: link state changed to DOWN
      Jan 13 11:33:03 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)

      However, if I halt the master (which should effectively be the same as completely disconnecting the network, as I did above), I get this:

      Jan 13 11:34:15 kernel: vip1: link state changed to DOWN
      Jan 13 11:34:15 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:34:15 kernel: vip1: link state changed to UP
      Jan 13 11:34:11 kernel: vip1: link state changed to DOWN
      Jan 13 11:34:11 kernel: vip1: 2 link states coalesced
      Jan 13 11:34:11 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:34:08 kernel: vip1: link state changed to DOWN
      Jan 13 11:34:08 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:34:08 kernel: vip1: link state changed to UP
      Jan 13 11:34:05 kernel: vip1: link state changed to DOWN
      Jan 13 11:34:05 kernel: vip1: 2 link states coalesced
      Jan 13 11:34:05 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:34:01 kernel: vip1: link state changed to DOWN
      Jan 13 11:34:01 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
      Jan 13 11:34:01 kernel: vip1: link state changed to UP

      And it just repeats.  That has really got me puzzled, because I would think both scenarios should behave the same.  Any ideas?
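For anyone comparing logs like the ones above, here is a minimal Python sketch that counts the MASTER -> BACKUP lines per VIP and flags a VIP as flapping when several demotions land close together. The function names, the 30-second window, and the threshold of 3 are assumptions made for illustration; they are not values taken from pfSense or CARP. The regex only assumes the syslog line shape shown in this post.

```python
import re
from datetime import datetime, timedelta

# Matches lines shaped like the ones in this post, for example:
#   Jan 13 11:34:15 kernel: vip1: MASTER -> BACKUP (more frequent advertisement received)
DEMOTION_RE = re.compile(
    r"^\s*(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel: "
    r"(?P<vip>\w+): MASTER -> BACKUP"
)


def demotion_times(log_lines, year=2000):
    """Return {vip: sorted list of datetimes}, one entry per MASTER -> BACKUP line."""
    events = {}
    for line in log_lines:
        m = DEMOTION_RE.match(line)
        if not m:
            continue
        # Syslog timestamps carry no year, so an arbitrary one is assumed here.
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        events.setdefault(m.group("vip"), []).append(ts)
    return {vip: sorted(times) for vip, times in events.items()}


def is_flapping(times, window=timedelta(seconds=30), threshold=3):
    """True if at least `threshold` demotions fall inside any span of `window`.

    Both defaults are illustrative assumptions, not CARP-defined values.
    """
    times = sorted(times)
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= window:
            return True
    return False
```

Feeding it the lines of a saved system log (for example `demotion_times(open(path))`, with `path` being wherever the log was exported) and then calling `is_flapping` on each VIP's list gives a quick yes/no on whether a node keeps getting demoted, which makes it easier to compare the "disconnect vNICs" and "halt the master" cases side by side.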

        wheelz

        I think I figured it out.  In case anyone else sees strange behavior like this, keep in mind that you have to reboot your VMware hosts before advanced settings like Net.ReversePathFwdCheckPromisc take effect.  Duh, me...

          cmb

          Hm, you shouldn't have to. I don't recall having to do that on any of the many setups I've done that require it. When I was looking at this yesterday, it sort of sounded like what happens when Net.ReversePathFwdCheckPromisc isn't set, but that would cause it to never fail over correctly. In that regard, there's no difference between disconnecting NICs on the primary, disabling CARP manually on the primary, and shutting down the VM of the primary. Maybe it's some new VMware bug, specific to something in your environment, that a host reboot fixes. They seem to introduce bugs that impact CARP-related networking on a routine basis.

            wheelz

            Yeah... it just happened again, so I guess that wasn't it.  I had a power blip that caused the VMware host and a switch to go down and come back up.  This time I just tried rebooting the switch, and that seems to have stopped it...

              cmb

              That is caused by multicast traffic being looped back into the same interface it was sent out of. I guess there must be something on the physical network that's looping it back.

                wheelz

                I had a couple of physical boxes I was able to try this on, and it worked OK there.  So I am guessing it is an issue with the way VMware is configured...  I have 2 hosts, each with 2 physical NICs for VM guest networking, going to 2 physical switches (trunked between them).  I'm using a VDS with a separate port group that has promiscuous mode enabled on the VLAN that carries my CARP VIP, and I have configured the hosts with Net.ReversePathFwdCheckPromisc = 1.  Is there anything else I am missing?

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.