Netgate Discussion Forum

Switches are not learning CARP HA MAC

HA/CARP/VIPs
18 Posts 2 Posters 4.8k Views
  • D
    dvancleef
    last edited by Jun 14, 2017, 2:19 AM

    First off, I've set up the canonical CARP HA architecture many, many times using Cisco or Foundry/Brocade switches but this time I'm using Penguin Arcticas (4804iq 1G and 4806xp 10G) with Cumulus Linux on the management side. The switches are running purely in L2 mode.

    The first symptom I encountered was traffic appearing on host ports where it shouldn't be. Cumulus support tracked this down to the switches not learning the MAC address of the CARP VIP, so traffic bound for the gateway is flooded out all ports on the VLAN.
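To illustrate the flooding behaviour described above, here is a minimal sketch of a MAC-learning switch. It is a toy model, not how Cumulus or any ASIC implements forwarding. The `LearningSwitch` class, the port numbers, and the host MAC are hypothetical; the CARP MAC assumes VHID 102, which is the value that shows up in the capture later in the thread.

```python
from typing import Dict, List, Set


class LearningSwitch:
    """Toy L2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports: List[int]) -> None:
        self.ports: Set[int] = set(ports)
        self.mac_table: Dict[str, int] = {}

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        # Learning only ever looks at the source address in the frame header.
        self.mac_table[src_mac.lower()] = in_port
        out_port = self.mac_table.get(dst_mac.lower())
        if out_port is None:
            # Unknown destination: flood to every port except the ingress port.
            return sorted(self.ports - {in_port})
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])
CARP_MAC = "00:00:5e:00:01:66"      # assumed virtual MAC (VHID 102)
FIREWALL_NIC = "0c:c4:7a:ea:4b:70"  # physical MAC from the captures
HOST = "aa:bb:cc:00:00:01"          # hypothetical VM host on port 2

# The firewall on port 1 transmits only with its physical MAC as source,
# so the CARP MAC never enters the table and host traffic to it is flooded.
switch.forward(1, FIREWALL_NIC, "01:00:5e:00:00:12")
flooded = switch.forward(2, HOST, CARP_MAC)

# After a single frame sourced from the CARP MAC, the flooding stops.
switch.forward(1, CARP_MAC, "ff:ff:ff:ff:ff:ff")
learned = switch.forward(2, HOST, CARP_MAC)

print(flooded, learned)
```

The point of the sketch is that the destination address of a frame never teaches the switch anything; only a frame whose source is the CARP MAC does.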

    I'm running some very large VM-hosting servers on this network, and 8 dedicated cores for handling network interrupts are needed to avoid performance issues with the load under normal circumstances; dedicating 4x that many to handle all this extra traffic is not an option.

    Any possibility this is an IGMP issue? Any suggestions on what to investigate?

    • D
      Derelict LAYER 8 Netgate
      last edited by Jun 14, 2017, 2:48 AM

      Switch problem. The switch should learn the CARP MAC from the CARP advertisements (which are multicast to 224.0.0.18, same as VRRP).

      If the switches are somehow blocking multicast they might not be learning the CARP MAC.
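As a side note on the addresses involved, the sketch below shows the standard mapping from an IPv4 multicast group to its Ethernet destination MAC (prefix 01:00:5e plus the low 23 bits of the group). The function name `multicast_mac` is just an illustration. It only explains where the destination MAC of the advertisements comes from; the switch learns from the source MAC, not from this address.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Only the low 23 bits of the group address are copied into the MAC.
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


carp_dst = multicast_mac("224.0.0.18")
print(carp_dst)
```

For 224.0.0.18 this gives the `01:00:5e:00:00:12` destination that Wireshark labels `IPv4mcast_12`.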

      https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting#Switch.2FLayer_2_Issues

      Chattanooga, Tennessee, USA
      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
      Do Not Chat For Help! NO_WAN_EGRESS(TM)

      • D
        dvancleef
        last edited by Jun 14, 2017, 3:04 AM

        Already been there; turned off IGMP snooping to no effect (the switches have no L3 configuration on the default VRF, so I doubt that would solve it).

        We were able to force the mac into the switches with the use of arping and eliminate the packet flooding, but getting this to work right in the event of a failover might be tricky.
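For readers wondering what "forcing the MAC in" amounts to on the wire, here is a rough sketch that builds (but does not send) a gratuitous ARP frame whose Ethernet source is the virtual MAC. This is not the arping invocation used in the thread. The VIP `192.0.2.1` is a documentation address standing in for the real one, and the MAC assumes VHID 102. Actually transmitting the frame would need a raw socket and root privileges, which is left out here.

```python
import socket
import struct


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def gratuitous_arp(src_mac: str, vip: str) -> bytes:
    """Build a broadcast ARP request announcing vip from src_mac."""
    sha = mac_bytes(src_mac)
    spa = socket.inet_aton(vip)
    # Ethernet header: broadcast destination, src_mac as source, ARP type.
    eth = b"\xff" * 6 + sha + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender = (src_mac, vip); target MAC zeroed; target IP = vip.
    arp += sha + spa + b"\x00" * 6 + spa
    return eth + arp


frame = gratuitous_arp("00:00:5e:00:01:66", "192.0.2.1")
print(len(frame), frame[6:12].hex())
```

Because bytes 6 to 11 of the frame carry the virtual MAC as the Ethernet source, any switch that sees this frame learns that MAC on the ingress port.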

        • D
          Derelict LAYER 8 Netgate
          last edited by Jun 14, 2017, 4:01 AM

          Use another (not broken) switch.

          • D
            dvancleef
            last edited by Jun 14, 2017, 4:10 AM

            OK, we'll get right on throwing out $75,000 worth of switches. Thanks…

            • D
              Derelict LAYER 8 Netgate
              last edited by Jun 14, 2017, 4:33 AM

              Well then they need to fix it. Broken is broken.

              • D
                dvancleef
                last edited by Jun 14, 2017, 5:09 AM

                We've got a ticket open with the switch-OS vendor. We had already tracked it down to the MAC not getting learned; now we're trying to figure out WHY it's not being learned.

                • D
                  dvancleef
                  last edited by Jun 14, 2017, 6:49 AM

                  @Derelict:

                  Switch problem. The switch should learn the CARP MAC from the CARP advertisements (which are multicast to 224.0.0.18, same as VRRP).

                  If the switches are somehow blocking multicast they might not be learning the CARP MAC.

                  https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting#Switch.2FLayer_2_Issues

                  The actual CARP protocol is kind of underdocumented, lacking an RFC-like packet flow definition, but I was led to understand that the MAC would be learned not from the actual advertisements but from a gratuitous ARP sent from the master after the election, just as is done in VRRP. And in our case the election does seem to be carried out correctly.

                  • D
                    Derelict LAYER 8 Netgate
                    last edited by Jun 14, 2017, 7:07 AM

                    The only time (that I know of) the CARP MAC address itself goes on the wire in the frame header is in the CARP advertisements. It is up to the switch to add it to the table since it received traffic from that MAC address on that port.
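For reference, the CARP MAC in question is derived from the VHID: it is the 00:00:5e:00:01 prefix followed by the VHID as the last octet, the same scheme VRRP uses for IPv4. A minimal sketch, with `carp_virtual_mac` as an illustrative name:

```python
def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC for a CARP VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    # Fixed prefix 00:00:5e:00:01, last octet is the VHID in hex.
    return f"00:00:5e:00:01:{vhid:02x}"


expected_src = carp_virtual_mac(102)
print(expected_src)
```

This is the source MAC a capture of the advertisements should show, and the entry the switch should end up with in its MAC table.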

                    The ARP response to a WHO HAS for the CARP VIP is sourced from the interface MAC address, but contains the CARP MAC in the IS AT response.

                    Same with the gratuitous ARP. Sourced from the interface MAC, contains the CARP MAC as the sender MAC.
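The distinction between the Ethernet source and the ARP sender field can be checked in a capture by decoding both. The sketch below does that for a synthetic ARP reply built to match the description above; the bytes are made up for illustration (VIP `192.0.2.1` is a documentation address, the CARP MAC assumes VHID 102), not taken from a real capture.

```python
import socket


def fmt_mac(raw: bytes) -> str:
    """Format 6 raw bytes as a colon-separated MAC string."""
    return ":".join(f"{b:02x}" for b in raw)


def arp_addresses(frame: bytes) -> dict:
    """Extract the Ethernet source and ARP sender fields from a frame."""
    if frame[12:14] != b"\x08\x06":
        raise ValueError("not an ARP frame")
    arp = frame[14:42]
    return {
        "eth_src": fmt_mac(frame[6:12]),          # what a switch learns from
        "arp_sender_mac": fmt_mac(arp[8:14]),     # what hosts put in their ARP cache
        "arp_sender_ip": socket.inet_ntoa(arp[14:18]),
    }


nic = bytes.fromhex("0cc47aea4b70")    # interface MAC
carp = bytes.fromhex("00005e000166")   # assumed CARP MAC
vip = socket.inet_aton("192.0.2.1")    # hypothetical VIP
sample = (b"\xff" * 6 + nic + b"\x08\x06"
          + bytes.fromhex("0001080006040002")  # Ethernet/IPv4, opcode 2 (reply)
          + carp + vip + b"\xff" * 6 + vip)

fields = arp_addresses(sample)
print(fields)
```

Hosts populate their ARP cache from `arp_sender_mac`, while the switch only ever sees `eth_src`, which is why ARP traffic of this shape does not help the switch learn the CARP MAC.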

                    • D
                      dvancleef
                      last edited by Jun 14, 2017, 7:45 AM

                      @Derelict:

                      The only time (that I know of) the CARP MAC address itself goes on the wire in the frame header is in the CARP advertisements. It is up to the switch to add it to the table since it received traffic from that MAC address on that port.

                      This is actually codified in RFC5798 7.2 for VRRP.

                      And here is what is strange here: I'm looking at tcpdumps in wireshark and seeing the MAC of the physical interface in the header, not the VIP's MAC:

                      Frame 53: 70 bytes on wire (560 bits), 70 bytes captured (560 bits)
                      Ethernet II, Src: SuperMic_ea:4b:70 (0c:c4:7a:ea:4b:70), Dst: IPv4mcast_12 (01:00:5e:00:00:12)
                          Destination: IPv4mcast_12 (01:00:5e:00:00:12)
                              Address: IPv4mcast_12 (01:00:5e:00:00:12)
                              .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
                              .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
                          Source: SuperMic_ea:4b:70 (0c:c4:7a:ea:4b:70)
                              Address: SuperMic_ea:4b:70 (0c:c4:7a:ea:4b:70)
                              .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
                              .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
                          Type: IPv4 (0x0800)
                      Internet Protocol Version 4, Src: X.X.X.X, Dst: 224.0.0.18
                      Common Address Redundancy Protocol
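The IG and LG lines in the dissection above come from the two low-order bits of the first octet of each MAC. A small sketch of that decoding, with `mac_flags` as an illustrative name:

```python
def mac_flags(mac: str) -> dict:
    """Decode the IG (group) and LG (locally administered) bits of a MAC."""
    first_octet = int(mac.split(":")[0], 16)
    return {
        "group": bool(first_octet & 0x01),  # IG bit: 1 = multicast/broadcast
        "local": bool(first_octet & 0x02),  # LG bit: 1 = locally administered
    }


src_flags = mac_flags("0c:c4:7a:ea:4b:70")  # source in the capture
dst_flags = mac_flags("01:00:5e:00:00:12")  # destination in the capture
print(src_flags, dst_flags)
```

The bits alone do not distinguish a physical MAC from the CARP MAC (both are unicast and globally assigned); what stands out in the capture is the Supermicro OUI `0c:c4:7a` where a `00:00:5e:00:01:xx` source would be expected.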
                      
                      
                      • D
                        Derelict LAYER 8 Netgate
                        last edited by Jun 14, 2017, 7:57 AM

                        Where is that capture taken? That is not what I see (or ever have seen).

                        • D
                          Derelict LAYER 8 Netgate
                          last edited by Jun 14, 2017, 8:02 AM

                          Anything else? Virtualization of pfSense?

                          • D
                            dvancleef
                            last edited by Jun 14, 2017, 8:18 AM Jun 14, 2017, 8:14 AM

                            All real hardware (Supermicro Xeon E3-1270 machines, Supermicro-branded Intel X520-SR2 10G cards being used for WAN/LAN, there's i210 onboard that's used for PFSYNC).

                            Captures were taken at one of the switches, I believe the one that the carp master is connected to (I have so many dumpfiles, I've been staring at Wireshark and RFCs all day)

                            • D
                              Derelict LAYER 8 Netgate
                              last edited by Jun 14, 2017, 8:38 AM

                              What does a capture filtered on CARP taken on pfSense show? What is the source MAC there?

                              Diagnostics > Packet Capture, Select interface, Protocol: CARP on the current master.

                              I'm starting to smell something like shared IPMI maybe. Just a hunch. Probably wrong.

                              • D
                                dvancleef
                                last edited by Jun 15, 2017, 12:55 AM

                                Should not be IPMI related, the first thing we do after powering on Supermicro boxes is to set the IPMI to dedicated.

                                Packet capture on the firewall also indicates carp packets sourced on the physical address:

                                Frame 1: 70 bytes on wire (560 bits), 70 bytes captured (560 bits)
                                Ethernet II, Src: SuperMic_ea:4b:70 (0c:c4:7a:ea:4b:70), Dst: IPv4mcast_12 (01:00:5e:00:00:12)
                                    Destination: IPv4mcast_12 (01:00:5e:00:00:12)
                                    Source: SuperMic_ea:4b:70 (0c:c4:7a:ea:4b:70)
                                    Type: IPv4 (0x0800)
                                Internet Protocol Version 4, Src: X.X.X.X, Dst: 224.0.0.18
                                    0100 .... = Version: 4
                                    .... 0101 = Header Length: 20 bytes (5)
                                    Differentiated Services Field: 0x10 (DSCP: Unknown, ECN: Not-ECT)
                                    Total Length: 56
                                    Identification: 0x984a (38986)
                                    Flags: 0x02 (Don't Fragment)
                                    Fragment offset: 0
                                    Time to live: 255
                                    Protocol: VRRP (112)
                                    Header checksum: 0x0000 [validation disabled]
                                    [Header checksum status: Unverified]
                                    Source: X.X.X.X
                                    Destination: 224.0.0.18
                                    [Source GeoIP: Unknown]
                                    [Destination GeoIP: Unknown]
                                Common Address Redundancy Protocol
                                    Version 2, Packet type 1 (Advertisement)
                                    Virtual Host ID: 102
                                    Advertisment Skew: 100
                                    Auth Len: 7
                                    Demotion indicator: 0
                                    Adver Int: 1
                                    Checksum: 0x5e00 [correct]
                                    Counter: 17876480869048372106
                                    HMAC: 8700fa73033777bc3da9ed687db21e9ca7ede6e4
                                
                                
                                • D
                                  dvancleef
                                  last edited by Jun 15, 2017, 2:29 AM Jun 15, 2017, 1:50 AM

                                  OK, worked out the cause.

                                  We are SPANning the LAN port to call recording servers (this installation is a massive-scale asterisk farm, 800 instances of asterisk both virtual and bare metal in 3 racks). Something in the bridge is rewriting the source mac. When I removed the SPAN ports the carp announcements became correct.

                                  The "broken" switch is totally innocent.

                                  This is a pfsense bug, but might be hard to fix.

                                  I've logged this as bug 7648 in redmine.

                                  • D
                                    Derelict LAYER 8 Netgate
                                    last edited by Jun 15, 2017, 2:43 AM Jun 15, 2017, 2:35 AM

                                    Right. Silly place for a SPAN port with a $75,000 switch at-the-ready.

                                    When you said the switch wasn't learning the CARP MACs I (incorrectly) assumed you had verified the MACs were arriving at the switch and not being learned. I'll try to do better next time.

                                    • D
                                      dvancleef
                                      last edited by Jun 15, 2017, 4:55 AM Jun 15, 2017, 2:44 AM

                                      There are some odd quirks in the Broadcom Trident II ASIC on the Arcticas that make it impractical to SPAN on the switches: they insert dot1q (802.1Q) tags on SPANned packets that originate on untagged ports but are destined for tagged ports, so you wind up with a mix of tagged and untagged packets coming out of your SPAN port.

                                      We do have a workaround for the moment: running a script on the firewalls that periodically does an arping (using the CARP mac) if that firewall is currently the carp master, and a trigger on a CARP state change that does an immediate arping on a BACKUP->MASTER transition.
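The control flow of such a workaround could look roughly like the sketch below. This is not the script from the thread. `get_state` and `announce` are hypothetical hooks: how the CARP state is read on the firewall and how the arping is actually issued are deliberately left out, and the demo at the bottom uses stand-in lambdas.

```python
import time
from typing import Callable, List


def on_state_change(previous: str, current: str,
                    announce: Callable[[], None]) -> bool:
    """Announce immediately on a BACKUP -> MASTER transition."""
    if previous == "BACKUP" and current == "MASTER":
        announce()
        return True
    return False


def periodic_announce(get_state: Callable[[], str],
                      announce: Callable[[], None],
                      interval_s: float,
                      iterations: int,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Re-announce every interval_s seconds, but only while MASTER."""
    sent = 0
    for _ in range(iterations):
        if get_state() == "MASTER":
            announce()
            sent += 1
        sleep(interval_s)
    return sent


# Demo with stand-in hooks; a real announce() would send the ARP.
log: List[str] = []
on_state_change("BACKUP", "MASTER", lambda: log.append("failover"))
on_state_change("MASTER", "BACKUP", lambda: log.append("should not fire"))
count = periodic_announce(lambda: "MASTER", lambda: log.append("periodic"),
                          interval_s=0, iterations=3, sleep=lambda s: None)
print(log, count)
```

The periodic interval would presumably need to be shorter than the switch's MAC aging time so the entry never expires, but the thread does not say what value was used.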

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.