Netgate Discussion Forum

One VLAN is master on both HA's??? Strange networking issue

HA/CARP/VIPs
  • M
    MrPete
    last edited by Nov 3, 2022, 9:11 PM

    This is a strange one.

    Context: I have a number of VLANs. They are handled using a single trunk ethernet interfacing to a smart switch that breaks things out as needed. I have two hardware-identical boxes for HA/CARP.

    I've got an issue I have never seen before.

    My CARP is fine, except:

    • While VLAN 19 is 100% fine on the primary
    • The secondary thinks the primary host is down on that VLAN
    • And therefore it too is Primary CARP for that VLAN :(

    Running tcpdump at both ends of the link shows:

    • the secondary is sending but not receiving packets on ONLY that VLAN
    • all other VLANs are fine (and share the exact same cable)

    I down/up'd that interface
    Checked smart switch config (yes that VLAN is enabled)...
    I even checked cables ;).

    Not sure how long ago this changed. I don't see the issue in any log so far.

    Ideas MOST welcome!

    • A
      awebster @MrPete
      last edited by Nov 3, 2022, 9:52 PM

      @mrpete Sounds like the switches are dropping the CARP packets in one direction...
      Lots of reasons why this could be happening...

      CARP is very similar to VRRP; in fact it uses the same IP protocol number (112) as VRRP (insert big discussion about why the overlap here). Thus, if you have VRRP running on your switches and the CARP VHID matches a VRRP VRID, they will conflict with each other.
      So the CARP VHID must be unique and distinct from any VRRP VRID on the same segment.
      Use tcpdump -T carp to see protocol 112 decoded as CARP and not VRRP.
      CARP also uses the same multicast address as VRRP, 224.0.0.18. Make sure nothing else is using it.
      Next, if you are running IPv4 and IPv6, I've found it works best if each CARP Virtual IP uses a different VHID, i.e. different ones for IPv4 and IPv6.
      The CARP VHID is mapped into the source MAC address as 00:00:5e:00:01:xx, where xx is the VHID in hex, so these need to be kept separate to prevent mayhem. In CARP this is true for both IPv4 and IPv6.
      Check that you don't have any VLANs "short circuited" together, or you'll also have issues, because the IPv4 CARP packets are sent to the multicast MAC address 01:00:5e:00:00:12, which will be seen by all devices reachable in the L2 broadcast domain. IPv6 CARP packets go to 33:33:00:00:00:12, with the same effect.
      Finally, check that there isn't some sort of IGMP feature (such as IGMP snooping) configured on the switches that is filtering the multicast packets sent to 224.0.0.18 on the affected VLAN.
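
To make the address mapping above concrete, here is a minimal Python sketch. It assumes the 00:00:5e:00:01:xx source-MAC layout that appears in the tcpdump output later in this thread, plus the standard IPv4 (01:00:5e + low 23 bits) and IPv6 (33:33 + low 32 bits) multicast MAC mappings. The function names are made up for illustration and are not part of pfSense or any library.

```python
import ipaddress


def carp_source_mac(vhid: int) -> str:
    """Virtual source MAC assumed for CARP advertisements of a given VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be in the range 1..255")
    return f"00:00:5e:00:01:{vhid:02x}"


def multicast_mac(group: str) -> str:
    """Ethernet destination MAC for an IPv4 or IPv6 multicast group."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    raw = addr.packed
    if addr.version == 4:
        # 01:00:5e followed by the low 23 bits of the group address
        prefix = b"\x01\x00\x5e"
        tail = bytes([raw[1] & 0x7F, raw[2], raw[3]])
    else:
        # 33:33 followed by the low 32 bits of the group address
        prefix = b"\x33\x33"
        tail = raw[-4:]
    return ":".join(f"{b:02x}" for b in prefix + tail)


print(carp_source_mac(174))          # 00:00:5e:00:01:ae
print(multicast_mac("224.0.0.18"))   # 01:00:5e:00:00:12
print(multicast_mac("ff02::12"))     # 33:33:00:00:00:12
```

Two virtual IPs that share a VHID on the same L2 segment would share the same source MAC under this mapping, which is why the VHIDs need to be kept distinct.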

      –A.

      • M
        MrPete @awebster
        last edited by Nov 9, 2022, 10:15 PM

        @awebster Thanks for that good list.
        I've not solved the problem so far... mostly have new questions.

        Progress:

        • Confirmed the list doesn't reveal issues: not using VRRP; VHIDs are all unique; no shared use of 224.0.0.18; VLANs not interlinked; no IGMP filtering.
        • NOTE: underneath, on a (VLAN) interface where both Prim/Sec are Master, the primary sees secondary as Up, but secondary thinks primary is Down. Can't send packets from Secondary to Primary, period. :(

        Additional lessons learned:

        • This is a Very Dangerous problem. Any interface with two Masters means that both will receive and respond to LAN packets... thus destroying the integrity of various LAN communications. :(
        • While testing, a second interface suddenly went into this "mode" of both being Master. :(

        My temporary workaround for now: I've shut down my secondary HA machine until I have time and a strategy to diagnose or fully rebuild the setup.

        One QUESTION: @awebster you wrote "Next if you are running IPv4 and IPv6, I've found it works best if each CARP Virtual IP uses a different VID, ie: different ones for IPv4 and IPv6." -- where is the Vid for ipv6 separately configurable? I don't find this.

        • A
          awebster @MrPete
          last edited by Nov 10, 2022, 3:37 AM

          @mrpete Curious problem to be sure...
          Perhaps share output of ifconfig -a to have a look at what the underlying OS has actually got configured on the interfaces.

          Regarding the question about where you set the VHID: you do it under Firewall > Virtual IPs when defining a CARP-type virtual IP.
          For example:
          [two screenshots; visible only when logged in to the forum]

          I also use Base = 1, Skew = 0 on the primary, and Base = 1, Skew = 100 on the backup.

          You can use the command tcpdump -e -s0 -nn -i interface -T carp proto 112 to look at the actual packets you're receiving / sending, to ensure that everything is working as expected.
          You should see something similar to this:

          00:00:5e:00:01:xx > 01:00:5e:00:00:12,  ethertype IPv4 (0x0800), length 70: Primary_REAL_IP > 224.0.0.18: CARPv2-advertise 36: vhid=xx  advbase=1 advskew=0 authlen=7 counter=some_long_number
          
          

          IPv6 is a little less interesting, it looks like this, but note that the source MAC address should be different between the IPv4 and IPv6 versions based on differing VHID values:

          00:00:5e:00:01:yy > 33:33:00:00:00:12, ethertype IPv6 (0x86dd), length 90: fe80::link-local > ff02::12: ip-proto-112 36
          

          You should only see one sender of the CARPv2-advertise packets. Unfortunately, since the real MAC address is not revealed, you need to rely on the source IP address to determine whether the correct system is sending the packets. Similarly, with the IPv6 version you need to look at the link-local IP address on the actual interface (use ifconfig in the shell to see this).
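
As a rough aid for reading such captures, the following Python sketch parses an IPv4 advertisement line in the format shown above and pulls out the source IP, VHID, advbase and advskew. The sample line uses made-up values (192.0.2.174, VHID 174, a short counter), and the regular expression is an assumption based only on the example line in this post; other tcpdump versions may print the fields differently.

```python
import re

# Pattern modelled on the example IPv4 advertisement line in this post.
ADVERT_RE = re.compile(
    r"(?P<src_mac>[0-9a-f]{2}(?::[0-9a-f]{2}){5}) > "
    r"(?P<dst_mac>[0-9a-f]{2}(?::[0-9a-f]{2}){5}),\s+"
    r"ethertype IPv4 \(0x0800\), length \d+: "
    r"(?P<src_ip>\S+) > 224\.0\.0\.18: CARPv2-advertise \d+: "
    r"vhid=(?P<vhid>\d+)\s+advbase=(?P<advbase>\d+)\s+advskew=(?P<advskew>\d+)"
)


def parse_advert(line):
    """Return the advertisement fields as a dict, or None if the line does not match."""
    match = ADVERT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("vhid", "advbase", "advskew"):
        fields[key] = int(fields[key])
    # The source MAC should encode the VHID in its last byte.
    expected_mac = f"00:00:5e:00:01:{fields['vhid']:02x}"
    fields["mac_matches_vhid"] = fields["src_mac"] == expected_mac
    return fields


# Hypothetical sample line; the addresses and counter are invented.
SAMPLE = (
    "00:00:5e:00:01:ae > 01:00:5e:00:00:12,  ethertype IPv4 (0x0800), length 70: "
    "192.0.2.174 > 224.0.0.18: CARPv2-advertise 36: vhid=174  advbase=1 advskew=0 "
    "authlen=7 counter=1234567890"
)

info = parse_advert(SAMPLE)
print(info["src_ip"], info["vhid"], info["advskew"], info["mac_matches_vhid"])
```

Grouping parsed lines by vhid and collecting the distinct src_ip values would show at a glance whether more than one system is advertising for the same VHID.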

          Here are ifconfig dumps of one interface on my primary system (first block) and my backup system (second block); hopefully they provide some clues.
          In this setup:
          VIP: nnn.mmm.208.74/24 VHID 174 and xxxx:yyyy:zzzz:e0d0::74/64 VHID 175
          Primary: nnn.mmm.208.174/24 and xxxx:yyyy:zzzz:e0d0::174/64
          Backup: nnn.mmm.208.175/24 and xxxx:yyyy:zzzz:e0d0::175/64

          em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                  options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
                  ether 00:50:56:a6:89:3c
                  hwaddr 00:50:56:a6:89:3c
                  inet6 fe80::250:56ff:fea6:893c%em3 prefixlen 64 scopeid 0x4
                  inet6 xxxx:yyyy:zzzz:e0d0::174 prefixlen 64
                  inet6 xxxx:yyyy:zzzz:e0d0::74 prefixlen 64 vhid 175
                  inet nnn.mmm.208.174 netmask 0xffffff00 broadcast nnn.mmm.208.255
                  inet nnn.mmm.208.74 netmask 0xffffff00 broadcast nnn.mmm.208.255 vhid 174
                  nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                  media: Ethernet autoselect (1000baseT <full-duplex>)
                  status: active
                  carp: MASTER vhid 174 advbase 1 advskew 0
                  carp: MASTER vhid 175 advbase 1 advskew 0
          		
          		
          em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                  options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
                  ether 00:50:56:a6:1d:39
                  hwaddr 00:50:56:a6:1d:39
                  inet6 fe80::250:56ff:fea6:1d39%em3 prefixlen 64 scopeid 0x4
                  inet6 xxxx:yyyy:zzzz:e0d0::175 prefixlen 64
                  inet6 xxxx:yyyy:zzzz:e0d0::74 prefixlen 64 vhid 175
                  inet nnn.mmm.208.175 netmask 0xffffff00 broadcast nnn.mmm.208.255
                  inet nnn.mmm.208.74 netmask 0xffffff00 broadcast nnn.mmm.208.255 vhid 174
                  nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                  media: Ethernet autoselect (1000baseT <full-duplex>)
                  status: active
                  carp: BACKUP vhid 174 advbase 1 advskew 100
                  carp: BACKUP vhid 175 advbase 1 advskew 100	
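
For comparing the two nodes quickly, here is a small Python sketch that reads the "carp:" lines from ifconfig output like the dumps above and reports any VHID that is MASTER on both systems. It is only a sketch: the helper names are invented, it assumes the "carp: STATE vhid N advbase N advskew N" line format shown above, and it assumes each text covers a single interface (VHIDs may repeat across interfaces). The BROKEN_BACKUP text is a made-up example of the failure discussed in this thread.

```python
import re

CARP_LINE_RE = re.compile(
    r"carp: (MASTER|BACKUP|INIT) vhid (\d+) advbase (\d+) advskew (\d+)"
)


def carp_states(ifconfig_text):
    """Map VHID -> CARP state for one interface's ifconfig output."""
    return {
        int(vhid): state
        for state, vhid, _advbase, _advskew in CARP_LINE_RE.findall(ifconfig_text)
    }


def dual_master_vhids(primary_text, backup_text):
    """Return the VHIDs that both nodes report as MASTER."""
    primary = carp_states(primary_text)
    backup = carp_states(backup_text)
    return sorted(
        vhid
        for vhid, state in primary.items()
        if state == "MASTER" and backup.get(vhid) == "MASTER"
    )


PRIMARY = """
        carp: MASTER vhid 174 advbase 1 advskew 0
        carp: MASTER vhid 175 advbase 1 advskew 0
"""
BACKUP = """
        carp: BACKUP vhid 174 advbase 1 advskew 100
        carp: BACKUP vhid 175 advbase 1 advskew 100
"""
# Hypothetical faulty backup: it believes it is MASTER for VHID 174.
BROKEN_BACKUP = """
        carp: MASTER vhid 174 advbase 1 advskew 100
        carp: BACKUP vhid 175 advbase 1 advskew 100
"""

print(dual_master_vhids(PRIMARY, BACKUP))         # []
print(dual_master_vhids(PRIMARY, BROKEN_BACKUP))  # [174]
```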
          

          –A.

          • D
            Derelict LAYER 8 Netgate @MrPete
            last edited by Nov 10, 2022, 2:20 PM

            @mrpete This is invariably a switching issue. If the secondary does not receive the heartbeats sent from the primary it will think there is a failure and assume the MASTER role.

            Even if the primary receives the resulting heartbeats from the secondary, it will remain MASTER too since it is advskew 0 and the secondary is advskew 100.
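
The behaviour described above can be illustrated with a deliberately simplified Python model. It is not the real CARP state machine: it ignores timers, authentication and tie-breaking, and assumes only two rules (a BACKUP that hears no heartbeat becomes MASTER, and a MASTER that hears a heartbeat with a lower advskew steps down). All names here are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    advskew: int
    state: str = "BACKUP"


def settle(primary, secondary, primary_to_secondary_ok, secondary_to_primary_ok, rounds=5):
    """Run a few heartbeat rounds and return the final (primary, secondary) states."""
    nodes = (primary, secondary)
    delivers = {
        (primary.name, secondary.name): primary_to_secondary_ok,
        (secondary.name, primary.name): secondary_to_primary_ok,
    }
    for _ in range(rounds):
        # Only nodes that are MASTER at the start of the round advertise.
        senders = [n for n in nodes if n.state == "MASTER"]
        for node in nodes:
            heard = [
                s for s in senders
                if s is not node and delivers[(s.name, node.name)]
            ]
            if node.state == "BACKUP" and not heard:
                node.state = "MASTER"   # no heartbeat seen: assume the master failed
            elif node.state == "MASTER" and any(s.advskew < node.advskew for s in heard):
                node.state = "BACKUP"   # a more preferred master is alive
    return primary.state, secondary.state


# Healthy link in both directions.
print(settle(Node("primary", 0), Node("secondary", 100), True, True))
# ('MASTER', 'BACKUP')

# Heartbeats from primary to secondary are dropped somewhere in the switching.
print(settle(Node("primary", 0), Node("secondary", 100), False, True))
# ('MASTER', 'MASTER')
```

In the second case the primary does hear the secondary, but since the secondary's advskew is higher the primary has no reason to step down, so both end up MASTER.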

            Chattanooga, Tennessee, USA
            A comprehensive network diagram is worth 10,000 words and 15 conference calls.
            DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
            Do Not Chat For Help! NO_WAN_EGRESS(TM)

            • A
              awebster @Derelict
              last edited by Nov 10, 2022, 3:52 PM

              @derelict said in One VLAN is master on both HA's??? Strange networking issue:

              @mrpete This is invariably a switching issue. If the secondary does not receive the heartbeats sent from the primary it will think there is a failure and assume the MASTER role.

              Exactly. My thought is that there is MAC address confusion at the switching level, hence the verification to make sure there are no incorrect configs; they'd be very hard to spot given that the CARP packets aren't sent from the NIC's real MAC address.

              –A.

              • M
                MrPete
                last edited by Nov 10, 2022, 8:48 PM

                @awebster Ah HA! The key to IPv6 CARP is that you create TWO CARP Virtual IPs (one for IPv4, one for IPv6) :)

                • M
                  MrPete @Derelict
                  last edited by Nov 10, 2022, 8:50 PM

                  @derelict Understood. What's so strange is that most VLANs are working just fine and DO see the heartbeats.

                  I'm digging in on it further...

                  • A
                    awebster @MrPete
                    last edited by Nov 10, 2022, 9:03 PM

                    @mrpete Maybe try changing the VHID on the problematic VLAN on both sides to see if that makes a difference, since we know this will change the source MAC address.

                    –A.

                    • D
                      Derelict LAYER 8 Netgate @awebster
                      last edited by Nov 11, 2022, 11:11 AM

                      @awebster pfSense's tcpdump groks CARP. If you pcap for it you can generally tell primary from secondary advertisements by the advskew (0 and 100 respectively by default).


                      • M
                        MrPete @awebster
                        posted Nov 11, 2022, 11:38 PM, last edited by MrPete Nov 11, 2022, 11:39 PM

                        @awebster and @Derelict My problem: secondary does not see ANY packets from primary on that VLAN, period. This presumably has nothing to do with CARP??

                        Quite confusing to me, how a single VLAN on a trunked ethernet wire can be nonfunctional like that.

                        I'll soon rip into this at a more detailed level. Have a monitoring switch or two I can use to observe ... something... in the wire. ;)
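
One way to narrow down "which VLAN goes silent" is to capture on the parent (trunk) interface with tcpdump -e -nn and count tagged frames per VLAN and source MAC. The Python sketch below does that counting. It assumes tcpdump prints tagged frames as "ethertype 802.1Q (0x8100), length N: vlan N, ..." and that the tags are visible on the capture interface, which may not hold with some hardware VLAN offload settings. The MAC address and sample lines are invented, and the helper names are hypothetical.

```python
import re
from collections import Counter

TAGGED_RE = re.compile(
    r"(?P<src>[0-9a-f]{2}(?::[0-9a-f]{2}){5}) > [0-9a-f:]{17}, "
    r"ethertype 802\.1Q \(0x8100\), length \d+: vlan (?P<vlan>\d+),"
)


def frames_by_vlan_and_source(lines):
    """Count tagged frames keyed by (vlan id, source MAC)."""
    counts = Counter()
    for line in lines:
        match = TAGGED_RE.search(line)
        if match:
            counts[(int(match["vlan"]), match["src"])] += 1
    return counts


def silent_vlans(counts, src_mac, expected_vlans):
    """VLANs on which no frame from src_mac was captured."""
    return sorted(v for v in expected_vlans if counts[(v, src_mac)] == 0)


# Hypothetical capture taken on the secondary's trunk interface.
CAPTURE = [
    "12:00:00.000001 aa:bb:cc:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 64: vlan 10, p 0, ethertype ARP (0x0806), Request who-has 192.0.2.1 tell 192.0.2.2, length 46",
    "12:00:00.500000 aa:bb:cc:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 64: vlan 20, p 0, ethertype ARP (0x0806), Request who-has 198.51.100.1 tell 198.51.100.2, length 46",
    "12:00:01.000000 aa:bb:cc:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 64: vlan 10, p 0, ethertype ARP (0x0806), Request who-has 192.0.2.1 tell 192.0.2.2, length 46",
]

counts = frames_by_vlan_and_source(CAPTURE)
print(silent_vlans(counts, "aa:bb:cc:00:00:01", [10, 19, 20]))  # [19]
```

Running the same count on captures from both ends of the link would show whether the frames for the affected VLAN leave the primary but never arrive at the secondary.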

                        • D
                          Derelict LAYER 8 Netgate @MrPete
                          last edited by Nov 12, 2022, 1:13 AM

                          @mrpete It must be something on that VLAN. Blocking multicast. Something.


                          • R
                            RobertK 1 @Derelict
                            posted Jan 4, 2023, 8:04 PM, last edited by RobertK 1 Jan 4, 2023, 8:05 PM

                            Maybe your STP topology is different in that VLAN, so traffic goes on an unexpected path

                            • M
                              MrPete
                              last edited by Jan 9, 2023, 4:29 PM

                              Thanks all for the suggestions. Digging into it...

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.