HA Setup - gateway picking up wrong MAC in ARP Cache for CARP IP?
-
I've got two Netgate 4100s in a HA setup, which replaced a Netgate 3100 (BTW, the 4100s are a great step up compared to the 3100!). Currently they are running 23.05-RELEASE and internet is a Cox Business cable connection. The configuration was migrated over using the HA setup guide.
I've had a weird issue a few times recently where traffic to the CARP IP has just stopped working - rebooting the firewalls didn't help, but rebooting the cable modem did every time.
Finally got Cox tech support online to tell me that the CARP MAC is not being linked to the shared IP (or any IP at all), but the primary/master node has both its WAN IP and the CARP IP mapped to it. The secondary/slave node only has its WAN IP, as expected.
Any ideas why this would be happening?
This doesn't appear to be an issue on the internal networks that I have access to.
Related question:
Should I be able to ping the CARP IPs from the secondary/slave node? To get an ARP entry on the WAN side, I tried pinging the CARP WAN and LAN IPs from the secondary node to populate the ARP cache. The MAC address cached for the CARP WAN IP is indeed the primary's WAN MAC, but the MAC cached for the CARP LAN IP is the CARP MAC address as expected, so I am able to somewhat confirm what the cable modem is seeing.
-
@drees said in HA Setup - gateway picking up wrong MAC in ARP Cache for CARP IP?:
Finally got Cox tech support online to tell me that the CARP MAC is not being linked to the shared IP (or any IP at all), but the primary/master node has both its WAN IP and the CARP IP mapped to it. The secondary/slave node only has its WAN IP, as expected.
Any ideas why this would be happening?
It's on the modem to request the MAC for the VIP and store it in its ARP table.
If it does, doesn't it get a response?
Related question:
Should I be able to ping the CARP IPs from the secondary/slave node?
Yes, if the ping is allowed by firewall rules it should work.
-
@viragomann said in HA Setup - gateway picking up wrong MAC in ARP Cache for CARP IP?:
@drees said in HA Setup - gateway picking up wrong MAC in ARP Cache for CARP IP?:
It's on the modem to request the MAC for the VIP and store it in its ARP table.
If it does, doesn't it get a response?
It appears to be getting a response, but the response comes from the physical WAN MAC address of the primary instead of the CARP MAC address.
If I try to ping the CARP WAN IP from the secondary, it does not get replies (the appropriate rules appear to be in place, since I can ping the physical WAN IP of the primary), and the ARP cache table of the secondary gives me the primary WAN MAC address rather than the CARP WAN MAC address.
Here's a redacted version of the ARP diagnostic page from the secondary firewall.
The first column is showing the last octet of the IP address:
209 = gateway
216 = CARP IP
219 = primary firewall
220 = secondary firewall
The 2nd column is showing the last octet of the MAC address:
19 = gateway MAC
1d = primary WAN MAC address (should be the CARP MAC address ending with d8!)
1d = primary WAN MAC address
19 = secondary WAN MAC address
So why does it appear that the firewall is responding with the MAC address of the primary WAN interface instead of the CARP interface?
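For anyone who wants to run the same check against a text dump instead of the GUI page, here is a rough Python sketch. It assumes lines in the style of FreeBSD `arp -an` output; the addresses (from the 203.0.113.0/24 documentation range), the MACs, and the interface name are all invented, and only the last octets mirror the list above. It groups IPs by the MAC they resolve to, so an IP that shares a MAC with another host stands out.

```python
import re
from collections import defaultdict

# Hypothetical sample, NOT the real table from this thread.
# Only the last octets of the IPs and MACs mirror the redacted list.
SAMPLE = """\
? (203.0.113.209) at 02:00:00:aa:00:19 on wan0 expires in 1171 seconds [ethernet]
? (203.0.113.216) at 02:00:00:bb:00:1d on wan0 expires in 1195 seconds [ethernet]
? (203.0.113.219) at 02:00:00:bb:00:1d on wan0 expires in 1102 seconds [ethernet]
? (203.0.113.220) at 02:00:00:bb:00:19 on wan0 permanent [ethernet]
"""

# Matches "(ip) at mac" anywhere in a line; incomplete entries are skipped.
ENTRY = re.compile(
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\) at (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def parse_arp(text):
    """Return {ip: mac} for every complete entry in the text."""
    return {m["ip"]: m["mac"].lower() for m in ENTRY.finditer(text)}


def ips_by_mac(table):
    """Return only the MACs that more than one IP resolves to."""
    groups = defaultdict(list)
    for ip, mac in table.items():
        groups[mac].append(ip)
    return {mac: ips for mac, ips in groups.items() if len(ips) > 1}


table = parse_arp(SAMPLE)
for mac, ips in ips_by_mac(table).items():
    print(f"{mac} answers for {', '.join(ips)}")
```

With the sample data, the only shared MAC is the one ending in 1d, which is listed for both .216 (the CARP IP) and .219 (the primary), matching what the ARP page shows.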
-
@drees re ping,
https://redmine.pfsense.org/issues/14026
“CARP backup is unable to ping master via CARP IP.”
-
@SteveITS said in HA Setup - gateway picking up wrong MAC in ARP Cache for CARP IP?:
https://redmine.pfsense.org/issues/14026
“CARP backup is unable to ping master via CARP IP.”
Indeed this doesn't work anymore.
So why does it appear that the firewall is responding with the MAC address of the primary WAN interface instead of the CARP interface?
This is normal behavior.
If the gateway wants to communicate with the CARP VIP (or any other VIP hooked onto it), it sends an ARP request for the IP and the master tells it to use the CARP MAC. So the gateway sends its request packets to the CARP MAC.
However, when the master responds, it uses the hardware interface MAC together with the VIP.
Hence devices talking to an HA system have to accept the changing MAC for CARP to work properly.
See High Availability Switch/Layer 2 Concerns
I assume that your modem/router has a setting to configure this. I had a similar issue with a Cisco cable box; it took me some hours with support to get it set correctly.
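To tell the two kinds of MAC apart when reading an ARP table: the CARP virtual MAC has the fixed form 00:00:5e:00:01:XX, where XX is the VHID in hex. Below is a small Python sketch of that mapping. The function names are made up for illustration, and VHID 216 is only an assumption that would be consistent with a CARP MAC ending in d8 as mentioned earlier in the thread.

```python
# CARP shares the VRRP virtual MAC range: 00:00:5e:00:01:<VHID as two hex digits>.
CARP_PREFIX = "00:00:5e:00:01:"


def carp_mac(vhid):
    """Return the virtual MAC a CARP VIP with this VHID would use."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"{CARP_PREFIX}{vhid:02x}"


def vhid_from_mac(mac):
    """Return the VHID if mac is a CARP virtual MAC, otherwise None."""
    mac = mac.lower()
    if len(mac) != 17 or not mac.startswith(CARP_PREFIX):
        return None
    try:
        return int(mac[-2:], 16)
    except ValueError:
        return None


# VHID 216 is an assumption (0xd8 == 216), not a value confirmed in the thread.
print(carp_mac(216))
print(vhid_from_mac("00:00:5e:00:01:d8"))
# A hypothetical hardware MAC is not in the CARP range, so this gives None.
print(vhid_from_mac("02:00:00:bb:00:1d"))
```

Any MAC in an ARP table that does not start with 00:00:5e:00:01 is a hardware (or otherwise non-CARP) address, which is a quick way to see which one a neighbor has cached for the VIP.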
-
Thanks for all the replies and help.
Would changing the CARP mode to Unicast from Multicast potentially have any effect?
I guess I'm still confused as to why the ARP cache would show the active device's physical MAC address instead of the CARP interface's MAC address.
Also, it still seems totally weird that there would be no failover event that would cause traffic to start dropping. Absolutely nothing in the logs from the perspective of the pfSense boxes, but traffic coming from the CARP WAN IP just starts getting dropped.
-
FWIW, I found this older post from 2018 from @bw-linux who had the exact same issue as me.
https://forum.netgate.com/topic/134297/cox-and-the-carp-mac
Anyway, the short answer is that they weren't able to get it to work, and CARP/VRRP doesn't appear to be supported properly by the cable modems.
I think the only way we could get it to work would be to get pfSense to always respond/send traffic for the CARP IP using the same MAC instead of the MAC address of whatever device is primary.