HA CARP - IPv6 Two masters
-
None of that is necessary in a "normal" environment. I reboot the VMs all the time. Just works.
I'm not really following you on this one. I'm not running these on VMs, but that doesn't matter.
I don't want to have to reboot a physical machine or a VM every time I make a configuration change. This is not an L2 issue. This smells like a bug.
-
Works great for me… (IPs masked to protect the innocent), but I did find that increasing ADVBASE to 10 on the backup, as opposed to the default of 1, helped a lot (maybe it's because these are running on ESXi). Anyway, that's my recipe and I'm sticking to it.
Consequently, on the backup uncheck "virtual IPs" in the System / High Availability Sync page.
MASTER
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:43:51:32
hwaddr 00:0c:29:43:51:32
inet6 fe80::20c:29ff:fe43:5132%em0 prefixlen 64 scopeid 0x1
inet AA.BB.CC.226 netmask 0xfffffff8 broadcast AA.BB.CC.231
inet6 xxxx:xxxx::1c prefixlen 125
** inet6 xxxx:xxxx::1e prefixlen 125 vhid 244**
** inet AA.BB.CC.225 netmask 0xfffffff8 broadcast AA.BB.CC.231 vhid 242**
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: MASTER vhid 244 advbase 1 advskew 0
carp: MASTER vhid 242 advbase 1 advskew 0
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:43:51:3c
hwaddr 00:0c:29:43:51:3c
inet6 fe80::20c:29ff:fe43:513c%em1 prefixlen 64 scopeid 0x2
inet XX.YY.ZZ.251 netmask 0xffffff00 broadcast XX.YY.ZZ.255
inet6 xxxx:xxxx:10:2800::2 prefixlen 64
** inet XX.YY.ZZ.254 netmask 0xffffff00 broadcast XX.YY.ZZ.255 vhid 240**
** inet6 xxxx:xxxx:10:2800::1 prefixlen 64 vhid 241**
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: MASTER vhid 240 advbase 1 advskew 0
carp: MASTER vhid 241 advbase 1 advskew 0
BACKUP
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:4c:da:30
hwaddr 00:0c:29:4c:da:30
inet6 fe80::20c:29ff:fe4c:da30%em0 prefixlen 64 scopeid 0x1
inet AA.BB.CC.227 netmask 0xfffffff8 broadcast AA.BB.CC.231
inet6 xxxx:xxxx::1d prefixlen 125
** inet6 xxxx:xxxx::1e prefixlen 125 vhid 244**
** inet AA.BB.CC.225 netmask 0xfffffff8 broadcast AA.BB.CC.231 vhid 242**
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: BACKUP vhid 244 advbase 10 advskew 100
carp: BACKUP vhid 242 advbase 10 advskew 100
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:4c:da:3a
hwaddr 00:0c:29:4c:da:3a
inet6 fe80::20c:29ff:fe4c:da3a%em1 prefixlen 64 scopeid 0x2
inet XX.YY.ZZ.252 netmask 0xffffff00 broadcast XX.YY.ZZ.255
inet6 xxxx:xxxx:10:2800::3 prefixlen 64
** inet XX.YY.ZZ.254 netmask 0xffffff00 broadcast XX.YY.ZZ.255 vhid 240**
** inet6 xxxx:xxxx:10:2800::1 prefixlen 64 vhid 241**
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: BACKUP vhid 240 advbase 10 advskew 100
carp: BACKUP vhid 241 advbase 10 advskew 100
-
Doesn't sound like a bug because far too many people are NOT seeing the issue. It is something specific to the way you have it configured or something in your environment.
-
Thank you all for helping.
I just factory reset both devices today and set them up from scratch again.
All the CARP interfaces were working as expected except the IPv6 ULA CARP for the LAN (fd57:187e:523f:715::f/64).
It was exhibiting the same issue I was seeing prior to the factory reset: primary and secondary both showing master.
The IPv6 GUA on the WAN was working as expected.
If I rebooted the secondary firewall, all the CARP interfaces would be in backup status. Any time I synced the config from the primary, it would cause the double master status.
I was able to find a solution based on what awebster said about unchecking the virtual IPs in the HA sync.
I unchecked this option and rebooted the secondary firewall, and now all the CARP interfaces are showing the correct status and config syncing doesn't affect them.
-
Just to be sure there wasn't something somewhere that misbehaved with ULA and CARP:
Primary:
xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=3<RXCSUM,TXCSUM>
ether ee:c2:d9:d8:55:46
hwaddr ee:c2:d9:d8:55:46
inet6 fe80::ecc2:d9ff:fed8:5546%xn5 prefixlen 64 scopeid 0xd
inet6 fda9:cfd8:f9f:1000::2 prefixlen 64
inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
inet 192.168.123.2 netmask 0xffffff00 broadcast 192.168.123.255
inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet manual
status: active
carp: MASTER vhid 242 advbase 1 advskew 0
carp: MASTER vhid 243 advbase 1 advskew 0
Secondary:
xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
ether 6e:24:e4:84:f5:f9
hwaddr 6e:24:e4:84:f5:f9
inet6 fe80::6c24:e4ff:fe84:f5f9%xn5 prefixlen 64 scopeid 0xa
inet6 fda9:cfd8:f9f:1000::3 prefixlen 64
inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
inet 192.168.123.3 netmask 0xffffff00 broadcast 192.168.123.255
inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet manual
status: active
carp: BACKUP vhid 242 advbase 1 advskew 100
carp: BACKUP vhid 243 advbase 1 advskew 100
Enter CARP Maintenance mode on Primary:
Primary:
xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=3<RXCSUM,TXCSUM>
ether ee:c2:d9:d8:55:46
hwaddr ee:c2:d9:d8:55:46
inet6 fe80::ecc2:d9ff:fed8:5546%xn5 prefixlen 64 scopeid 0xd
inet6 fda9:cfd8:f9f:1000::2 prefixlen 64
inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
inet 192.168.123.2 netmask 0xffffff00 broadcast 192.168.123.255
inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet manual
status: active
carp: BACKUP vhid 242 advbase 1 advskew 254
carp: BACKUP vhid 243 advbase 1 advskew 254
Secondary:
xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
ether 6e:24:e4:84:f5:f9
hwaddr 6e:24:e4:84:f5:f9
inet6 fe80::6c24:e4ff:fe84:f5f9%xn5 prefixlen 64 scopeid 0xa
inet6 fda9:cfd8:f9f:1000::3 prefixlen 64
inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
inet 192.168.123.3 netmask 0xffffff00 broadcast 192.168.123.255
inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet manual
status: active
carp: MASTER vhid 242 advbase 1 advskew 100
carp: MASTER vhid 243 advbase 1 advskew 100
Fails back fine, too.
-
I've been dealing with the same problem in my HA setup and it turned out to be related to bug #6579:
https://redmine.pfsense.org/issues/6579
The affected CARP IPv6 address was something like:
2001:aaaa:bbb:ccc:0d00:ffff:ffff:ffff
After removing leading zeros:
2001:aaaa:bbb:ccc:d00:ffff:ffff:ffff
CARP started to work reliably on that interface.
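If you want to check your own VIPs for this, one option is to compare the address as typed with its canonical compressed form. This is only a minimal sketch using Python's standard `ipaddress` module; the helper name `canonical_v6` is made up for illustration, and it only checks the notation, it does not talk to pfSense in any way.

```python
import ipaddress

def canonical_v6(text: str) -> str:
    """Return the compressed, lower-case form of an IPv6 address string."""
    return ipaddress.IPv6Address(text).compressed

# Address form taken from the post above.
entered = "2001:aaaa:bbb:ccc:0d00:ffff:ffff:ffff"
fixed = canonical_v6(entered)

print(fixed)             # 2001:aaaa:bbb:ccc:d00:ffff:ffff:ffff
print(entered != fixed)  # True: the entered form carries a leading zero
```

If the two strings differ, the VIP was entered in a non-canonical form (leading zeros, capitals, or an uncompressed zero run) and may be worth re-entering in the form the function returns.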
-
Nice catch.
LAN@213 fd57:187e:523f:0715::f MASTER
-
I have been seeing exactly the same thing since moving from 2.3.5 to 2.4.
Interestingly, it only affects 2 of the 4 IPv6 CARPs.
They were the only 2 that could use :: in their address.
I tried expanding :: to 0:0:0: but it did not help.
I have confirmed by packet capture that packets to ff02::12 are seen on both systems.
-------------
OK, I figured out how to get it to a normal state (all master on primary and all backup on secondary).
You need to reboot the backup firewall, and while it's rebooting clear the firewall states on the primary.
CARP failover works perfectly when it's like this, but there is still an issue.
ANY configuration sync (manual/auto) from the primary to the backup causes the backup to become master on the two IPv6 CARPs.
-
To reiterate, I did not have this issue until upgrading from 2.3.5 to 2.4.2-RELEASE-p1, or at least it seems to have gotten worse.
More testing:
Changed from x::1 (i.e. x:0:0:0:1) to x:1:1:1:1
on one of the CARP interfaces and the problem went away.
Did not change the real interface IP.
UPDATE: it worked for the first one, but broke both after I changed the second one.
Why are these 2 different than the other 2?
They connect to the same switch.
I found a difference: one set of addresses used all lower case for the hex in the address, while the non-working ones had capitals.
I have changed all to lower case and rebooted the B unit, and it came up all in backup; I did not have to reset states on the A firewall.
I'm not saying this is the issue, but giving people ideas of what I found.
So in summary: use all lower case for hex and change the addresses to ones that cannot condense to ::
-
Just want to add that I appear to have hit this "bug" in one of our SG-4860 clusters.
Our IPv6 addresses are in their shortened form with no leading zeros. I had to reboot the secondary to clear this out; will keep an eye on things.
-
I am also hitting something similar to this in our office/test system.
Both devices are connected to a Cisco 3560G switch. IGMP snooping and IPv6 MLD snooping are disabled. All ports are set to "portfast". There are no "loops" in the network. There are no topology changes.
You will notice that each one sees the other's advertisements as well as its own.
Primary:
16:42:40.428976 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:42.597228 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
16:42:50.886692 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:52.607533 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
16:43:01.382988 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:43:02.612549 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
Backup:
16:42:09.212760 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:12.573960 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
16:42:19.608720 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:22.578900 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
16:42:30.015028 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:32.585911 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
This only happens for IPv6 CARP IPs.
Here are the interfaces, just to confirm the vhid:
Primary:
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 0c:c4:7a:ac:82:1a
hwaddr 0c:c4:7a:ac:82:1a
inet6 fe80::ec4:7aff:feac:821a%igb0 prefixlen 64 scopeid 0x1
inet6 xxxx:xxxx:1:2::3 prefixlen 124
inet6 xxxx:xxxx:1:2::2 prefixlen 124 vhid 4
inet yyy.yyy.233.108 netmask 0xfffffff0 broadcast yyy.yyy.233.111
inet yyy.yyy.233.110 netmask 0xfffffff0 broadcast yyy.yyy.233.111 vhid 1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: MASTER vhid 1 advbase 10 advskew 1
carp: MASTER vhid 4 advbase 10 advskew 1
Backup:
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 0c:c4:7a:ab:37:24
hwaddr 0c:c4:7a:ab:37:24
inet6 fe80::ec4:7aff:feab:3724%igb0 prefixlen 64 scopeid 0x1
inet6 xxxx:xxxx:1:2::4 prefixlen 124
inet yyy.yyy.233.109 netmask 0xfffffff0 broadcast yyy.yyy.233.111
inet yyy.yyy.233.110 netmask 0xfffffff0 broadcast yyy.yyy.233.111 vhid 1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: MASTER vhid 4 advbase 10 advskew 101
carp: BACKUP vhid 1 advbase 10 advskew 101
So the CARP interface is correctly assigned to the primary node, but the backup still claims it's master in the dashboard and with "ifconfig igb0".
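For anyone comparing two nodes by eye, the same check can be scripted. The following is a rough sketch, not a pfSense tool: it assumes you have pasted the `ifconfig` output of each node into a string, and the function names `carp_states` and `dual_masters` are made up for illustration. It keys on the vhid only, so it assumes each vhid is used on a single interface.

```python
import re

# Matches lines such as "carp: MASTER vhid 4 advbase 10 advskew 101".
CARP_LINE = re.compile(r"carp: (MASTER|BACKUP|INIT) vhid (\d+) advbase (\d+) advskew (\d+)")

def carp_states(ifconfig_text: str) -> dict:
    """Map vhid -> state for every 'carp:' line found in ifconfig output."""
    return {int(m.group(2)): m.group(1) for m in CARP_LINE.finditer(ifconfig_text)}

def dual_masters(node_a: str, node_b: str) -> list:
    """Return the vhids that both nodes report as MASTER."""
    a, b = carp_states(node_a), carp_states(node_b)
    return sorted(v for v in a if a[v] == "MASTER" and b.get(v) == "MASTER")

# The carp lines from the two igb0 listings above.
primary = """
carp: MASTER vhid 1 advbase 10 advskew 1
carp: MASTER vhid 4 advbase 10 advskew 1
"""
backup = """
carp: MASTER vhid 4 advbase 10 advskew 101
carp: BACKUP vhid 1 advbase 10 advskew 101
"""

print(dual_masters(primary, backup))  # [4]
```

With the listings from this post it flags vhid 4, the IPv6 VIP, and leaves the IPv4 vhid 1 alone.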
-
Why did you play with advbase/advskew?
Use 1/0 on the primary that will sync 1/100 to the secondary. Then just leave it alone.
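For reference, a rough sketch of what those two numbers mean for timing. As far as I understand the FreeBSD CARP implementation, a master advertises every advbase + advskew/256 seconds, and a backup waits roughly 3 × advbase + advskew/256 seconds without hearing an advertisement before it takes over. Treat both formulas as assumptions to verify against carp(4); the function names are made up for illustration.

```python
def advert_interval(advbase: int, advskew: int) -> float:
    """Seconds between advertisements sent by a master (assumed formula)."""
    return advbase + advskew / 256

def master_down_timeout(advbase: int, advskew: int) -> float:
    """Seconds a backup waits before promoting itself (assumed formula)."""
    return 3 * advbase + advskew / 256

print(advert_interval(1, 0))         # 1.0       primary at 1/0
print(advert_interval(1, 100))       # 1.390625  secondary at 1/100
print(master_down_timeout(1, 100))   # 3.390625
```

The lower the result, the more the node is preferred as master, which is why the secondary gets the higher skew.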
-
Yes. I did try multiple values between 0 and 20 for the base, and 0 and 1 for skew. The settings are correctly (+100 for skew) transferred to the backup unit. Still, the backup thinks it's primary for IPv6.
-
Are you 100% certain the case described in reply #15 ^ is not present?
Use 1/0 on the primary that will sync 1/100 to the secondary. Then just leave it alone.
Just do that. If changing it didn't correct it, it is not the problem.
Packet capture on both nodes and see if you see the CARP going out the interface or in the interface. You can filter on CARP only in Diagnostics > Packet Capture.
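If the capture is saved as text, a few lines of Python can summarise who is advertising and how often. This is only a sketch under assumptions: it expects tcpdump lines in the same format as the capture posted earlier in this thread, and `advert_gaps` is a made-up name. The plain `ip-proto-112` lines do not show the vhid, so this tells you which nodes are advertising, not for which VIP.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as
# "16:42:40.428976 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36"
LINE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) IP6 (\S+) > ff02::12: ip-proto-112")

def advert_gaps(capture: str) -> dict:
    """Map source address -> seconds between its consecutive advertisements."""
    seen = defaultdict(list)
    for line in capture.splitlines():
        m = LINE.match(line.strip())
        if m:
            seen[m.group(2)].append(datetime.strptime(m.group(1), "%H:%M:%S.%f"))
    return {
        src: [round((b - a).total_seconds(), 3) for a, b in zip(ts, ts[1:])]
        for src, ts in seen.items()
    }

# First four lines of the capture taken on the primary.
capture = """
16:42:40.428976 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:42.597228 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
16:42:50.886692 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
16:42:52.607533 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
"""

for src, gaps in advert_gaps(capture).items():
    print(src, gaps)
```

Two different source addresses both advertising at a steady interval on the same segment is what a dual-master condition looks like on the wire.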
-
1. Regarding the post #15 solution: I tried both shorthand (no leading zeroes) and full notation with nothing omitted.
2. I included a tcpdump in my first post. It looks to me that they are both receiving each other's updates.
-
Have you tried changing to addresses that CAN NOT be shortened to have a :: ?
-
Have you tried changing to addresses that CAN NOT be shortened to have a :: ?
Yes I did. No difference.
-
Did you put base/skew back to the default or not?
-
-
Well, cut loose with more. Screen shots, pcaps, whatever. IPv6 CARP works.