Double Packet sending causing VIP to go Backup
-
Hi All,
We have an odd issue that started the other night. I'm not sure what's going on; maybe someone can give me an inkling.
Our setup is described in: https://forum.pfsense.org/index.php?topic=66905.msg365740#msg365740
We decided to go with the hook into dev.d for the handling of failover situations and all has been going well with no (major) issues.
The other night we noticed that we were losing the bridging for a few seconds (our debug lines are tagged "OurBridge"):
Mar 17 16:28:46 fibregate OurBridge: Switching Bridge off
Mar 17 16:28:46 fibregate kernel: lan_vip4: MASTER -> BACKUP (more frequent advertisement received)
Mar 17 16:28:46 fibregate kernel: lan_vip4: link state changed to DOWN
Mar 17 16:28:49 fibregate OurBridge: Switching Bridge on
Mar 17 16:28:49 fibregate kernel: lan_vip4: link state changed to UP
It looked obvious that something on the network was sending CARP advertisements that made the firewall think it was no longer the master. We ran a packet capture on the firewall to work out where they were coming from and, oddly, they were coming from the firewall itself.
16:28:44.431105 IP (tos 0x10, ttl 255, id 28808, offset 0, flags [DF], proto VRRP (112), length 56)
fibregate.example.com > vrrp.mcast.net: carp fibregate.example.com > vrrp.mcast.net: CARPv2-advertise 36: vhid=4 advbase=1 advskew=0 authlen=7 counter=6823359283637811870
E..8p.@..p.O R. ....!.......^.q...z..8..k.....Q.J....x..
16:28:45.432130 IP (tos 0x10, ttl 255, id 28953, offset 0, flags [DF], proto VRRP (112), length 56)
fibregate.example.com > vrrp.mcast.net: carp fibregate.example.com > vrrp.mcast.net: CARPv2-advertise 36: vhid=4 advbase=1 advskew=0 authlen=7 counter=6823359283637811870
E..8q.@..p.. R. ....!.......^.q...z..8..k.....Q.J....x..
16:28:46.433103 IP (tos 0x10, ttl 255, id 10329, offset 0, flags [DF], proto VRRP (112), length 56)
fibregate.example.com > vrrp.mcast.net: carp fibregate.example.com > vrrp.mcast.net: CARPv2-advertise 36: vhid=4 advbase=1 advskew=0 authlen=7 counter=6823359283637811870
E..8(Y@..p.~ R. ....!.......^.q...z..8..k.....Q.J....x..
16:28:46.433243 IP (tos 0x10, ttl 255, id 10329, offset 0, flags [DF], proto VRRP (112), length 56)
fibregate.example.com > vrrp.mcast.net: carp fibregate.example.com > vrrp.mcast.net: CARPv2-advertise 36: vhid=4 advbase=1 advskew=0 authlen=7 counter=6823359283637811870
E..8(Y@..p.~ R. ....!.......^.q...z..8..k.....Q.J....x..
16:28:49.434121 IP (tos 0x10, ttl 255, id 35552, offset 0, flags [DF], proto VRRP (112), length 56)
fibregate.example.com > vrrp.mcast.net: carp fibregate.example.com > vrrp.mcast.net: CARPv2-advertise 36: vhid=4 advbase=1 advskew=0 authlen=7 counter=6823359283637811870
E..8..@..p.. R. ....!.......^.q...z..8..k.....Q.J....x..
As you can see, there are two packets with id 10329 at almost the same time (16:28:46.433103 and 16:28:46.433243, about 140 microseconds apart). This is causing the VIP to go into BACKUP mode and the bridge to drop. When the VIP comes back a couple of seconds later, so does the bridge.
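For anyone who wants to hunt for the same pattern in a longer capture, here is a rough Python sketch that flags packets sharing an IP id within a very short time. It assumes raw text output from tcpdump -v with header lines like the ones above; the function name find_duplicate_ids and the 1 ms window are my own choices, not anything from pfSense or tcpdump.

```python
import re
from datetime import datetime, timedelta

# Matches a tcpdump -v header line such as:
# "16:28:46.433103 IP (tos 0x10, ttl 255, id 10329, offset 0, ..."
HEADER = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP \(.*?\bid (\d+),")


def find_duplicate_ids(lines, window=timedelta(milliseconds=1)):
    """Return (ip_id, first_timestamp, second_timestamp) tuples for packets
    that repeat an IP id within `window` of the previous packet with that id.

    Timestamps are returned as the strings tcpdump printed.
    The 1 ms default window is an arbitrary assumption.
    """
    duplicates = []
    last_seen = {}  # ip_id -> (parsed time, raw timestamp string)
    for line in lines:
        match = HEADER.match(line)
        if not match:
            continue  # payload and decode lines are skipped
        raw_ts = match.group(1)
        ts = datetime.strptime(raw_ts, "%H:%M:%S.%f")
        ip_id = int(match.group(2))
        previous = last_seen.get(ip_id)
        if previous is not None:
            gap = ts - previous[0]
            # Reject negative gaps (capture crossing midnight).
            if timedelta(0) <= gap <= window:
                duplicates.append((ip_id, previous[1], raw_ts))
        last_seen[ip_id] = (ts, raw_ts)
    return duplicates
```

Feeding it the saved text of a capture, for example find_duplicate_ids(open("capture.txt")) where capture.txt is a hypothetical file name, should list each doubled advertisement. It only compares the IP id and the timing, so it cannot tell whether the second copy was generated by the host or reflected back by something else.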
This is a little annoying, as the servers hosted on the bridged network lose all internet connectivity, which is not ideal for the voice/video services that we provide.
So, my questions are:
- What would cause a retransmission of a CARP advertisement?
- Why is it happening only on this one VLAN interface?
- How can I go about tracking it down?
I'd rather work out what's going on than reboot it and fail over just to see if that clears it.
-
We tried the whole reboot thing, and that hasn't solved it.
We thought it may be a switch echo… but adding a port mirror shows that there are indeed 2 packets being transmitted.
I don't know what's going on here... I'm open to any suggestions.
:-(