HA CARP - IPv6 Two masters



  • Hello,

    I just set up two devices running pfSense 2.4.2 in HA mode.

    I have several CARP interfaces. The IPv6 CARP interfaces show MASTER on both devices, while the IPv4 CARP interfaces work properly.

    I have checked the broadcast domain for other VRRP devices, and the VHIDs that the CARP interfaces use are not in use anywhere else.

    I'm really not sure why it's not working. Again, this only affects IPv6.

    Any help would be greatly appreciated.

    --primary device--
    CARP Interface IP Address Status
    WAN@210 66.X.X.30 MASTER
    WAN@211 2001:X:X:X::F MASTER
    LAN@212 172.26.8.65 MASTER
    LAN@213 fd57:187e:523f:0715::f MASTER
    RFC_BACKEND@214 172.26.8.30 MASTER

    --backup device--
    CARP Interface IP Address Status
    WAN@210 66.X.X.30 BACKUP
    WAN@211 2001:X:X:X::F MASTER
    LAN@212 172.26.8.65 BACKUP
    LAN@213 fd57:187e:523f:0715::f MASTER
    RFC_BACKEND@214 172.26.8.30 BACKUP



  • I'm seeing the following in the logs on the backup firewall:

    Dec 4 17:40:20 php-fpm 58958 /xmlrpc.php: The command '/sbin/ifconfig 'bce0.715' inet6 'fd57:187e:523f:0715::f' delete' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address'
    Dec 4 17:40:20 php-fpm 58958 /xmlrpc.php: The command '/sbin/ifconfig 'bce0.210' inet6 '2001:X:X:X::F' delete' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address'
    Dec 4 17:40:20 kernel ifa_maintain_loopback_route: insertion failed for interface bce0.715: 17
    Dec 4 17:40:20 php-fpm 58958 /xmlrpc.php: The command '/sbin/ifconfig bce0.715 inet6 'fd57:187e:523f:0715::f' prefixlen '64' alias vhid '213'' returned exit code '1', the output was 'ifconfig: ioctl (SIOCAIFADDR): File exists'
    Dec 4 17:40:20 kernel ifa_maintain_loopback_route: insertion failed for interface bce0.210: 17
    Dec 4 17:40:20 php-fpm 58958 /xmlrpc.php: The command '/sbin/ifconfig bce0.210 inet6 '2001:X:X:X::F' prefixlen '64' alias vhid '211'' returned exit code '1', the output was 'ifconfig: ioctl (SIOCAIFADDR): File exists'
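
    For readers decoding these messages: the trailing 17 in the kernel lines appears to be a plain errno value, and it matches the "File exists" text that ifconfig prints for the failed add. The short sketch below only looks the number up with Python's standard library. Treating the kernel's 17 as an errno is an assumption based on how FreeBSD usually logs errors, not something confirmed in this thread.

```python
import errno
import os

# Assumption: the number at the end of
# "ifa_maintain_loopback_route: insertion failed for interface bce0.715: 17"
# is a standard errno value.
KERNEL_ERROR_CODE = 17

# Symbolic name and human-readable text for that errno value.
print(errno.errorcode[KERNEL_ERROR_CODE])
print(os.strerror(KERNEL_ERROR_CODE))
```

    Read that way, the log shows a delete that cannot find the address string it was given, followed by an add that fails because the address is already present on the interface.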



  • What are the actual real IPv6 IPs?
    Are you assigning fd57 from the same subnet as the actual interfaces?
    Why would you want to use unique local addresses on IPv6?  That's not the design philosophy of IPv6.


  • Netgate

    Your switching is probably not properly passing traffic to multicast destination ff02::12.

    Diagnostics > Packet Capture on the primary:

    Interface: One with an IPv6 CARP VIP
    Address Family: IPv6-Only
    Protocol: any (Capturing CARP here doesn't seem to work.. Problem for another day.)
    Host Address: ff02::12
    Count: 5

    You should get something like this. Your source address will be different but should also start with fe80:

    02:45:01.595176 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:45:02.601844 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:45:03.645118 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:45:04.652798 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:45:05.668150 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36

    Do the same capture on the Secondary. You should see the same thing:

    02:46:12.490962 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:46:13.550945 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:46:14.611020 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:46:15.670940 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
    02:46:16.728002 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36

    You will probably not see that. You will probably see the secondary transmitting from its own link-local address because it is not receiving the multicasts from the primary and is, properly, treating that CARP VIP as down. If that is the case you need to fix your layer 2.
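
    The check described above can be sketched in a few lines of Python: paste the text output of the capture into a string and count how many distinct source addresses are advertising to ff02::12. The function names are hypothetical, and the regular expression assumes the tcpdump line format shown in this post.

```python
import re
from collections import Counter

# Matches capture lines such as:
# 02:45:01.595176 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
CARP6_LINE = re.compile(r"IP6\s+(\S+)\s+>\s+ff02::12:\s+ip-proto-112")


def carp6_senders(capture_text):
    """Count IPv6 CARP advertisements per source address."""
    senders = Counter()
    for line in capture_text.splitlines():
        match = CARP6_LINE.search(line)
        if match:
            senders[match.group(1)] += 1
    return senders


def looks_like_dual_master(capture_text):
    """More than one advertising source suggests both nodes act as MASTER."""
    return len(carp6_senders(capture_text)) > 1


healthy = """\
02:45:01.595176 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
02:45:02.601844 IP6 fe80::f092:faff:fe6a:3279 > ff02::12: ip-proto-112 36
"""
print(carp6_senders(healthy))
print(looks_like_dual_master(healthy))
```

    This text format does not decode the VHID, so the sketch can only say that two nodes are advertising on the segment, not which VIP is affected.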



  • I didn't include the actual IP addresses because I didn't want to expose the firewall but it's locked down so the point is moot.

    This pair of firewalls will be the gateway for VPN users. I have another VPN appliance to handle that.
    The VPN will give users RFC 1918 / RFC 4193 (ULA) addresses, and the firewall pair, which is the gateway for those users, will perform NAT / NPT to globally routed addresses. I don't know if this is best practice, but this is the solution I am trying to implement.

    -- carp --
    WAN@210 66.133.130.30 MASTER
    WAN@211 2001:1960:20:D2::F MASTER
    LAN@212 172.26.8.65 MASTER
    LAN@213 fd57:187e:523f:0715::f MASTER
    RFC_BACKEND@214 172.26.8.30 MASTER

    -- primary --
    bce0.210: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:89:11:f6
            inet6 fe80::224:81ff:fe89:11f6%bce0.210 prefixlen 64 scopeid 0xd
            inet6 2001:1960:20:d2::a prefixlen 64
            inet6 2001:1960:20:d2::f prefixlen 64 vhid 211
            inet 66.133.130.28 netmask 0xfffffff8 broadcast 66.133.130.31
            inet 66.133.130.30 netmask 0xfffffff8 broadcast 66.133.130.31 vhid 210
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 210 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 210 advbase 1 advskew 0
            carp: MASTER vhid 211 advbase 1 advskew 0
            groups: vlan
    bce0.710: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:89:11:f6
            inet6 fe80::224:81ff:fe89:11f6%bce0.710 prefixlen 64 scopeid 0xe
            inet 172.26.8.28 netmask 0xfffffff8 broadcast 172.26.8.31
            inet 172.26.8.30 netmask 0xfffffff8 broadcast 172.26.8.31 vhid 214
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 710 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 214 advbase 1 advskew 0
            groups: vlan
    bce0.715: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:89:11:f6
            inet6 fe80::224:81ff:fe89:11f6%bce0.715 prefixlen 64 scopeid 0xf
            inet6 fd57:187e:523f:715::a prefixlen 64
            inet6 fd57:187e:523f:715::f prefixlen 64 vhid 213
            inet 172.26.8.66 netmask 0xffffffc0 broadcast 172.26.8.127
            inet 172.26.8.65 netmask 0xffffffc0 broadcast 172.26.8.127 vhid 212
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 715 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 212 advbase 1 advskew 0
            carp: MASTER vhid 213 advbase 1 advskew 0
            groups: vlan

    -- backup --

    bce0.210: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:88:f1:06
            inet6 fe80::224:81ff:fe88:f106%bce0.210 prefixlen 64 scopeid 0xd
            inet6 2001:1960:20:d2::b prefixlen 64
            inet 66.133.130.29 netmask 0xfffffff8 broadcast 66.133.130.31
            inet 66.133.130.30 netmask 0xfffffff8 broadcast 66.133.130.31 vhid 210
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 210 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 211 advbase 1 advskew 100
            carp: BACKUP vhid 210 advbase 1 advskew 100
            groups: vlan
    bce0.710: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:88:f1:06
            inet6 fe80::224:81ff:fe88:f106%bce0.710 prefixlen 64 scopeid 0xe
            inet 172.26.8.29 netmask 0xfffffff8 broadcast 172.26.8.31
            inet 172.26.8.30 netmask 0xfffffff8 broadcast 172.26.8.31 vhid 214
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 710 vlanpcp: 0 parent interface: bce0
            carp: BACKUP vhid 214 advbase 1 advskew 100
            groups: vlan
    bce0.715: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:88:f1:06
            inet6 fe80::224:81ff:fe88:f106%bce0.715 prefixlen 64 scopeid 0xf
            inet6 fd57:187e:523f:715::b prefixlen 64
            inet 172.26.8.67 netmask 0xffffffc0 broadcast 172.26.8.127
            inet 172.26.8.65 netmask 0xffffffc0 broadcast 172.26.8.127 vhid 212
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 715 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 213 advbase 1 advskew 100
            carp: BACKUP vhid 212 advbase 1 advskew 100
            groups: vlan

    -- primary --
    15:29:54.265266 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36
    15:29:55.088217 IP6 fe80::224:81ff:fe88:f106 > ff02::12: ip-proto-112 36
    15:29:55.325010 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36
    15:29:56.374974 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36
    15:29:56.485201 IP6 fe80::224:81ff:fe88:f106 > ff02::12: ip-proto-112 36

    -- backup --
    15:34:50.696588 IP6 fe80::224:81ff:fe88:f106 > ff02::12: ip-proto-112 36
    15:34:50.939315 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36
    15:34:51.943702 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36
    15:34:52.128312 IP6 fe80::224:81ff:fe88:f106 > ff02::12: ip-proto-112 36
    15:34:52.953321 IP6 fe80::224:81ff:fe89:11f6 > ff02::12: ip-proto-112 36


  • Netgate

    My secondary:

    xn2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 4e:f6:47:4c:0e:df
    hwaddr 4e:f6:47:4c:0e:df
    inet6 fe80::4cf6:47ff:fe4c:edf%xn2 prefixlen 64 scopeid 0x7
    inet6 2001:beef:cafe:7e02::3 prefixlen 64
    inet6 2001:beef:cafe:7e02::1 prefixlen 64 vhid 240
    inet 172.25.237.3 netmask 0xffffff00 broadcast 172.25.237.255
    inet 172.25.237.1 netmask 0xffffff00 broadcast 172.25.237.255 vhid 237
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet manual
    status: active
    carp: BACKUP vhid 237 advbase 1 advskew 100
    carp: BACKUP vhid 240 advbase 1 advskew 100

    Your secondary:

    bce0.715: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=80003<RXCSUM,TXCSUM,LINKSTATE>
            ether 00:24:81:88:f1:06
            inet6 fe80::224:81ff:fe88:f106%bce0.715 prefixlen 64 scopeid 0xf
            inet6 fd57:187e:523f:715::b prefixlen 64
            inet 172.26.8.67 netmask 0xffffffc0 broadcast 172.26.8.127
            inet 172.26.8.65 netmask 0xffffffc0 broadcast 172.26.8.127 vhid 212
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 715 vlanpcp: 0 parent interface: bce0
            carp: MASTER vhid 213 advbase 1 advskew 100
            carp: BACKUP vhid 212 advbase 1 advskew 100
            groups: vlan

    Note the absence of the CARP VIP on the interface itself.
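
    That comparison can be automated. The sketch below is a rough Python helper (the function name is hypothetical) that reads the ifconfig output for one interface and lists any VHID that has a "carp:" status line but no inet/inet6 address line carrying the same VHID. It assumes the line layout shown in the output above.

```python
import re


def vhids_without_address(ifconfig_text):
    """Return VHIDs that have a 'carp:' line but no inet/inet6 address line."""
    with_address = set()
    with_status = set()
    for line in ifconfig_text.splitlines():
        fields = line.split()
        match = re.search(r"\bvhid (\d+)", line)
        if not fields or not match:
            continue
        vhid = int(match.group(1))
        if fields[0] in ("inet", "inet6"):
            with_address.add(vhid)
        elif fields[0] == "carp:":
            with_status.add(vhid)
    return sorted(with_status - with_address)


# Lines taken from the secondary's bce0.715 output quoted above.
sample = """\
    inet6 fd57:187e:523f:715::b prefixlen 64
    inet 172.26.8.67 netmask 0xffffffc0 broadcast 172.26.8.127
    inet 172.26.8.65 netmask 0xffffffc0 broadcast 172.26.8.127 vhid 212
    carp: MASTER vhid 213 advbase 1 advskew 100
    carp: BACKUP vhid 212 advbase 1 advskew 100
"""
print(vhids_without_address(sample))  # [213]
```

    An empty list would mean every CARP state on the interface has a matching VIP address.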

    It looks like the interface is confused. Not sure. Have you rebooted the secondary?

    I was able to get mine into a strange state but only by manually issuing ifconfig commands in the shell. You probably want to make sure everything looks good in the VIP settings and reboot the secondary.



  • Yeah, I noticed that as well when I was replying earlier. I have rebooted the device and that doesn't clear the issue.
    I'm pretty sure it's not an L2 issue because the v4 CARP VIP works and is on the same VLANs (210/715).

    I am seeing this in the logs on the backup device.

    Dec 5 20:18:45 php-fpm 71243 /xmlrpc.php: The command '/sbin/ifconfig 'bce0.715' inet6 'fd57:187e:523f:0715::f' delete' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address'
    Dec 5 20:18:45 php-fpm 71243 /xmlrpc.php: The command '/sbin/ifconfig 'bce0.210' inet6 '2001:1960:20:D2::F' delete' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address'
    Dec 5 20:18:45 kernel ifa_maintain_loopback_route: insertion failed for interface bce0.715: 17
    Dec 5 20:18:45 php-fpm 71243 /xmlrpc.php: The command '/sbin/ifconfig bce0.715 inet6 'fd57:187e:523f:0715::f' prefixlen '64' alias vhid '213'' returned exit code '1', the output was 'ifconfig: ioctl (SIOCAIFADDR): File exists'
    Dec 5 20:18:45 kernel ifa_maintain_loopback_route: insertion failed for interface bce0.210: 17
    Dec 5 20:18:45 php-fpm 71243 /xmlrpc.php: The command '/sbin/ifconfig bce0.210 inet6 '2001:1960:20:D2::F' prefixlen '64' alias vhid '211'' returned exit code '1', the output was 'ifconfig: ioctl (SIOCAIFADDR): File exists'


  • Netgate

    Yeah. I get that is what you are seeing.

    There is nothing systemic regarding IPv6 and CARP. It has to be something in your config.

    Make sure all the interfaces match exactly in order and in name.

    Make sure all the IP aliases and other VIPs match exactly, except for the advskew.



  • OK, I figured out how to get it to a normal state (all MASTER on the primary and all BACKUP on the secondary).
    You need to reboot the backup firewall and, while it's rebooting, clear the firewall states on the primary.
    CARP failover works perfectly when it's like this, but there is still an issue.

    ANY configuration sync (manual/auto) from the primary to the backup causes the backup to become MASTER on the two IPv6 CARP VIPs.


  • Netgate

    None of that is necessary in a "normal" environment. I reboot the VMs all the time. Just works.



  • @Derelict:

    None of that is necessary in a "normal" environment. I reboot the VMs all the time. Just works.

    I'm not really following you on this one. I'm not running these on VMs, but that doesn't matter.
    I don't want to have to reboot a physical machine or a VM every time I make a configuration change.

    This is not a L2 issue. This smells like a bug.



  • Works great for me… (IPs masked to protect the innocent), but I did find that increasing advbase to 10 on the backup, as opposed to the default of 1, helped a lot (maybe it's because these are running on ESXi). Anyway, that's my recipe and I'm sticking to it.
    Consequently, on the backup, uncheck "Virtual IPs" in the System / High Availability Sync page.

    MASTER

    em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
            ether 00:0c:29:43:51:32
            hwaddr 00:0c:29:43:51:32
            inet6 fe80::20c:29ff:fe43:5132%em0 prefixlen 64 scopeid 0x1
            inet AA.BB.CC.226 netmask 0xfffffff8 broadcast AA.BB.CC.231
            inet6 xxxx:xxxx::1c prefixlen 125
    **        inet6 xxxx:xxxx::1e prefixlen 125 vhid 244**
    **        inet AA.BB.CC.225 netmask 0xfffffff8 broadcast AA.BB.CC.231 vhid 242**
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            carp: MASTER vhid 244 advbase 1 advskew 0
            carp: MASTER vhid 242 advbase 1 advskew 0
    em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
            ether 00:0c:29:43:51:3c
            hwaddr 00:0c:29:43:51:3c
            inet6 fe80::20c:29ff:fe43:513c%em1 prefixlen 64 scopeid 0x2
            inet XX.YY.ZZ.251 netmask 0xffffff00 broadcast XX.YY.ZZ.255
            inet6 xxxx:xxxx:10:2800::2 prefixlen 64
    **        inet XX.YY.ZZ.254 netmask 0xffffff00 broadcast XX.YY.ZZ.255 vhid 240**
    **        inet6 xxxx:xxxx:10:2800::1 prefixlen 64 vhid 241**
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            carp: MASTER vhid 240 advbase 1 advskew 0
            carp: MASTER vhid 241 advbase 1 advskew 0

    BACKUP

    em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
            ether 00:0c:29:4c:da:30
            hwaddr 00:0c:29:4c:da:30
            inet6 fe80::20c:29ff:fe4c:da30%em0 prefixlen 64 scopeid 0x1
            inet AA.BB.CC.227 netmask 0xfffffff8 broadcast AA.BB.CC.231
            inet6 xxxx:xxxx::1d prefixlen 125
    **        inet6 xxxx:xxxx::1e prefixlen 125 vhid 244**
    **        inet AA.BB.CC.225 netmask 0xfffffff8 broadcast AA.BB.CC.231 vhid 242**
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            carp: BACKUP vhid 244 advbase 10 advskew 100
            carp: BACKUP vhid 242 advbase 10 advskew 100
    em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
            ether 00:0c:29:4c:da:3a
            hwaddr 00:0c:29:4c:da:3a
            inet6 fe80::20c:29ff:fe4c:da3a%em1 prefixlen 64 scopeid 0x2
            inet XX.YY.ZZ.252 netmask 0xffffff00 broadcast XX.YY.ZZ.255
            inet6 xxxx:xxxx:10:2800::3 prefixlen 64
    **        inet XX.YY.ZZ.254 netmask 0xffffff00 broadcast XX.YY.ZZ.255 vhid 240**
    **        inet6 xxxx:xxxx:10:2800::1 prefixlen 64 vhid 241**
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            carp: BACKUP vhid 240 advbase 10 advskew 100
            carp: BACKUP vhid 241 advbase 10 advskew 100


  • Netgate

    Doesn't sound like a bug because far too many people are NOT seeing the issue. It is something specific to the way you have it configured or something in your environment.



  • Thank you all for helping.

    I just factory reset both devices today and set them up from scratch again.
    All the CARP interfaces were working as expected except the IPv6 ULA CARP VIP for the LAN (fd57:187e:523f:715::f/64).
    It exhibited the same issue I was seeing prior to the factory reset: both primary and secondary showed MASTER.
    The IPv6 GUA on the WAN was working as expected.
    If I rebooted the secondary firewall, all the CARP interfaces would be in BACKUP status. Any time I synced the config from the primary, it would cause the dual-MASTER status.

    I was able to find a solution based on what awebster said about unchecking the virtual IPs option in the HA sync settings.
    I unchecked this option and rebooted the secondary firewall, and now all the CARP interfaces show the correct status and config syncing doesn't affect them.


  • Netgate

    Just to be sure there wasn't something somewhere that misbehaved with ULA and CARP:

    Primary:

    xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=3<RXCSUM,TXCSUM>
    ether ee:c2:d9:d8:55:46
    hwaddr ee:c2:d9:d8:55:46
    inet6 fe80::ecc2:d9ff:fed8:5546%xn5 prefixlen 64 scopeid 0xd
    inet6 fda9:cfd8:f9f:1000::2 prefixlen 64
    inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
    inet 192.168.123.2 netmask 0xffffff00 broadcast 192.168.123.255
    inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet manual
    status: active
    carp: MASTER vhid 242 advbase 1 advskew 0
    carp: MASTER vhid 243 advbase 1 advskew 0

    Secondary:

    xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 6e:24:e4:84:f5:f9
    hwaddr 6e:24:e4:84:f5:f9
    inet6 fe80::6c24:e4ff:fe84:f5f9%xn5 prefixlen 64 scopeid 0xa
    inet6 fda9:cfd8:f9f:1000::3 prefixlen 64
    inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
    inet 192.168.123.3 netmask 0xffffff00 broadcast 192.168.123.255
    inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet manual
    status: active
    carp: BACKUP vhid 242 advbase 1 advskew 100
    carp: BACKUP vhid 243 advbase 1 advskew 100

    Enter CARP Maintenance mode on Primary:

    Primary:

    xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=3<RXCSUM,TXCSUM>
    ether ee:c2:d9:d8:55:46
    hwaddr ee:c2:d9:d8:55:46
    inet6 fe80::ecc2:d9ff:fed8:5546%xn5 prefixlen 64 scopeid 0xd
    inet6 fda9:cfd8:f9f:1000::2 prefixlen 64
    inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
    inet 192.168.123.2 netmask 0xffffff00 broadcast 192.168.123.255
    inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet manual
    status: active
    carp: BACKUP vhid 242 advbase 1 advskew 254
    carp: BACKUP vhid 243 advbase 1 advskew 254

    Secondary:

    xn5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 6e:24:e4:84:f5:f9
    hwaddr 6e:24:e4:84:f5:f9
    inet6 fe80::6c24:e4ff:fe84:f5f9%xn5 prefixlen 64 scopeid 0xa
    inet6 fda9:cfd8:f9f:1000::3 prefixlen 64
    inet6 fda9:cfd8:f9f:1000::1 prefixlen 64 vhid 243
    inet 192.168.123.3 netmask 0xffffff00 broadcast 192.168.123.255
    inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255 vhid 242
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet manual
    status: active
    carp: MASTER vhid 242 advbase 1 advskew 100
    carp: MASTER vhid 243 advbase 1 advskew 100

    Fails back fine, too.



  • I've been dealing with the same problem in my HA setup and it turned out to be related to bug #6579
    https://redmine.pfsense.org/issues/6579

    The affected CARP IPv6 address was something like:
    2001:aaaa:bbb:ccc:0d00:ffff:ffff:ffff
    After removing leading zeros:
    2001:aaaa:bbb:ccc:d00:ffff:ffff:ffff

    CARP started to work reliably on that interface
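
    A quick way to check whether a configured VIP is written differently from the compressed, lower-case form can be sketched with Python's standard ipaddress module. The helper names are hypothetical. Whether such a spelling difference is what triggers the dual-MASTER state is only what this post and the linked bug report suggest; the sketch just detects the difference.

```python
import ipaddress


def canonical_vip(address):
    """Return the compressed, lower-case text form of an IPv6 address."""
    return ipaddress.IPv6Address(address).compressed


def needs_rewrite(address):
    """True when the configured string differs from its canonical form."""
    return address != canonical_vip(address)


# Address spellings that appear earlier in this thread.
for configured in (
    "fd57:187e:523f:0715::f",
    "2001:1960:20:D2::F",
    "fd57:187e:523f:715::f",
):
    print(configured, "->", canonical_vip(configured), needs_rewrite(configured))
```

    The first two spellings are flagged because of a leading zero and upper-case hex digits respectively; the third is already in canonical form.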


  • Netgate

    Nice catch.

    LAN@213  fd57:187e:523f:0715::f    MASTER



  • I have been having exactly the same problem since moving from 2.3.5 to 2.4.
    Interestingly, it affects only 2 of the 4 IPv6 CARP VIPs.
    They were the only 2 that could use :: in their address.
    I tried expanding :: to 0:0:0:
    It did not help.

    I have confirmed by packet capture that packets to ff02::12 are seen on both systems.

    --------------
    OK, I figured out how to get it to a normal state (all MASTER on the primary and all BACKUP on the secondary).
    You need to reboot the backup firewall and, while it's rebooting, clear the firewall states on the primary.
    CARP failover works perfectly when it's like this, but there is still an issue.

    ANY configuration sync (manual/auto) from the primary to the backup causes the backup to become MASTER on the two IPv6 CARP VIPs.



  • To reiterate, I did not have this issue until upgrading from 2.3.5 to 2.4.2-RELEASE-p1, or at least it seems to have gotten worse.

    More testing:
    I changed one of the CARP VIPs from x::1 (i.e. x:0:0:0:1) to x:1:1:1:1
    and the problem went away.
    I did not change the real interface IP.

    UPDATE: It worked for the first one, but both broke after I changed the second one.
    Why are these 2 different from the other 2?
    They connect to the same switch.
    I found a difference: one set of addresses used all lower case for the hex digits in the address, while the non-working ones had capitals.
    I changed them all to lower case and rebooted the B unit, and it came up with everything in BACKUP; I did not have to reset states on the A firewall.
    I'm not saying this is the issue, just giving people ideas based on what I found.
    So in summary: I used all lower case for hex and changed the addresses to ones that cannot be condensed with ::



  • Just want to add that I appear to have hit this "bug" in one of our SG-4860 clusters.

    Our IPv6 addresses are in their shortened form with no leading zeros. I had to reboot the secondary to clear this out and will keep an eye on things.



  • I am also hitting something similar to this in our office/test system.

    Both devices are connected to a Cisco 3560G switch. IGMP snooping and IPv6 MLD snooping are disabled. All ports are set to "portfast". There are no loops in the network and no topology changes.

    You will notice that each one sees the other's advertisements as well as its own.

    Primary:
    16:42:40.428976 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:42:42.597228 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    16:42:50.886692 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:42:52.607533 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    16:43:01.382988 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:43:02.612549 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36

    Backup:
    16:42:09.212760 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:42:12.573960 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    16:42:19.608720 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:42:22.578900 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    16:42:30.015028 IP6 fe80::ec4:7aff:feab:3724 > ff02::12: ip-proto-112 36
    16:42:32.585911 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36

    This only happens for IPv6 CARP IPs.

    Here are the interfaces, just to confirm the vhid:

    Primary:
    igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 0c:c4:7a:ac:82:1a
    hwaddr 0c:c4:7a:ac:82:1a
    inet6 fe80::ec4:7aff:feac:821a%igb0 prefixlen 64 scopeid 0x1
    inet6 xxxx:xxxx:1:2::3 prefixlen 124
    inet6 xxxx:xxxx:1:2::2 prefixlen 124 vhid 4
    inet yyy.yyy.233.108 netmask 0xfffffff0 broadcast yyy.yyy.233.111
    inet yyy.yyy.233.110 netmask 0xfffffff0 broadcast yyy.yyy.233.111 vhid 1
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    carp: MASTER vhid 1 advbase 10 advskew 1
    carp: MASTER vhid 4 advbase 10 advskew 1

    Backup:
    igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 0c:c4:7a:ab:37:24
    hwaddr 0c:c4:7a:ab:37:24
    inet6 fe80::ec4:7aff:feab:3724%igb0 prefixlen 64 scopeid 0x1
    inet6 xxxx:xxxx:1:2::4 prefixlen 124
    inet yyy.yyy.233.109 netmask 0xfffffff0 broadcast yyy.yyy.233.111
    inet yyy.yyy.233.110 netmask 0xfffffff0 broadcast yyy.yyy.233.111 vhid 1
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    carp: MASTER vhid 4 advbase 10 advskew 101
    carp: BACKUP vhid 1 advbase 10 advskew 101

    So the CARP interface is correctly assigned to the primary node, but the backup one still claims it's master in the dashboard and with "ifconfig igb0".


  • Netgate

    Why did you play with advbase/advskew?

    Use 1/0 on the primary that will sync 1/100 to the secondary. Then just leave it alone.
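
    As a rough illustration of why 1/0 and 1/100 work together: as far as I understand carp(4), advskew is measured in 1/256 of a second and is added to advbase, so the node with the smaller sum advertises more often and is preferred as MASTER. The sketch below only does that arithmetic; the function name is hypothetical and the election rule is stated here as an assumption rather than taken from this thread.

```python
def advertisement_interval(advbase, advskew):
    """Seconds between CARP advertisements: advbase plus advskew/256."""
    return advbase + advskew / 256


# Default pfSense pair: primary 1/0, secondary 1/100.
primary = advertisement_interval(1, 0)
secondary = advertisement_interval(1, 100)
print(primary, secondary)  # 1.0 1.390625

# The pair posted above with advbase 10: primary 10/1, backup 10/101.
print(advertisement_interval(10, 1), advertisement_interval(10, 101))
```

    The roughly ten-second spacing between advertisements in the capture posted above is consistent with an advbase of 10.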



  • Yes. I did try multiple values between 0 and 20 for the base, and 0 and 1 for the skew. The settings are correctly transferred (+100 for skew) to the backup unit. Still, the backup thinks it's primary for IPv6.


  • Netgate

    Are you 100% certain the case described in reply #15 ^ is not present?

    Use 1/0 on the primary that will sync 1/100 to the secondary. Then just leave it alone.

    Just do that. If changing it didn't correct it, it is not the problem.

    Packet capture on both nodes and see if you see the CARP going out the interface or in the interface. You can filter on CARP only in Diagnostics > Packet Capture.



  • 1. Regarding the solution in post #15: I tried both shorthand (no leading zeroes) and full notation with nothing omitted.
    2. I included a tcpdump in my first post. It looks to me that they are both receiving each other's updates.



  • Have you tried changing to addresses that CAN NOT be shortened to have a :: ?



  • @IcePick:

    Have you tried changing to addresses that CAN NOT be shortened to have a :: ?

    Yes I did. No difference.


  • Netgate

    Did you put base/skew back to the default or not?



  • @Derelict:

    Did you put base/skew back to the default or not?

    Yes, I did.


  • Netgate

    Well, cut loose with more. Screen shots, pcaps, whatever. IPv6 CARP works.



  • I disabled "DHCP Snooping" on the directly connected switch. That was somehow blocking stuff. Seems to be working OK now. I can no longer reproduce the issue. Will post if I can.


  • Netgate

    Amazing. It was a setting on the switch. Simply amazing.

    Glad you found it.



  • So the problem is kind of back.

    Same situation. The secondary pfSense becomes master for both IPv6 CARP groups, and both nodes report as master. The weird thing now is that if I shut down the secondary pfSense box, IPv6 stops working completely. The primary box reports CARP status "Master" (as it always does), but the address is not reachable on the local LAN.

    IGMP / DHCP snooping is disabled on the two switches between test PC and firewalls. The IPv4 CARP works fine.


  • Netgate

    Again, it sounds like something at layer 2.

    Either of the nodes will show MASTER if it does not receive the heartbeats from the other node. Solving dual MASTER is generally as simple as fixing the reason(s) that one node is not seeing the heartbeats from the other node.



    It is most definitely not an L2 issue. The devices can see each other, as confirmed with tcpdump (tcpdump -i igb0 -ttt -n proto CARP). There are 2 VHIDs on this interface.

    Master:
    00:00:00.000094 IP 217.117.yyy.xxx > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 1, authtype none, intvl 1s, length 36
    00:00:01.004961 IP 217.117.yyy.xxx > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 1, authtype none, intvl 1s, length 36
    00:00:00.000103 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    00:00:01.005069 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36

    Backup:
    00:00:00.000064 IP 217.117.yyy.xxx > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 1, authtype none, intvl 1s, length 36
    00:00:01.004781 IP 217.117.yyy.xxx > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 1, authtype none, intvl 1s, length 36
    00:00:00.000062 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36
    00:00:01.004907 IP6 fe80::ec4:7aff:feac:821a > ff02::12: ip-proto-112 36

    What I did was download a config backup from each unit and do a restore from config. That fixed the issue for me. There were no changes made to the underlying switching network.

    I also hit this bug with 2.4.3-Release-P1. That left me with one extra VHID on each interface stuck in "INIT" state. Rebooting the firewall is the only way I found to fix it.

    P.S. - Yesterday I also tried shutting down the "Backup" unit and fully un-plugging it from the network. While the IPv6 CARP interface on the LAN was showing as "up" and "master" on the only firewall left, IPv6 connectivity was not working until I rebooted the firewall.




  • This happens on other interfaces too (igb2.12):

    Master:
    00:00:00.000000 IP 172.28.0.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 1, authtype none, intvl 1s, length 36
    00:00:01.009252 IP 172.28.0.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 1, authtype none, intvl 1s, length 36

    Backup:
    00:00:00.000000 IP 172.28.0.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 1, authtype none, intvl 1s, length 36
    00:00:01.010086 IP 172.28.0.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 1, authtype none, intvl 1s, length 36

    Interface shows "Master" on both devices.

    igb1:
    Master:
    00:00:00.431700 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36
    00:00:00.000072 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
    00:00:00.964265 IP6 fe80::ec4:7aff:feab:3725 > ff02::12: ip-proto-112 36
    00:00:00.040245 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
    00:00:00.000067 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36

    Backup:
    00:00:01.004330 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36
    00:00:00.000053 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
    00:00:00.185346 IP6 fe80::ec4:7aff:feab:3725 > ff02::12: ip-proto-112 36
    00:00:00.819555 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
    00:00:00.000135 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36

    For some reason the "Backup" unit is also receiving its own advertisements, but only on IPv6.

    It seems that setting "advskew" to 100 on the primary, waiting for the backup unit to receive the config, and then rebooting the backup unit fixes the issue with the advertisements. Pending further testing, of course.


  • Netgate

    You should be decoding those as CARP, not VRRP, so we can see more clearly what is going on. You can:

    • Set the protocol to CARP then view the capture in Diagnostics > Packet Capture. That will result in tcpdump decoding as CARP.

    • Set wireshark to decode as CARP by right-clicking a VRRP packet and using Decode As to decode protocol 112 as CARP instead of VRRP.

    You should never have to touch advbase/advskew. They should be 1/0 on the primary which should sync to 1/100 on the secondary.
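
    As a rough illustration of why 1/0 and 1/100 give a stable primary: per the FreeBSD ifconfig(8) description of CARP, advskew is specified in 1/256ths of a second and is added to advbase, so the node with the shorter interval advertises more often and is preferred as MASTER. The helper name below is hypothetical and this is only a sketch of that arithmetic, not pfSense code.

```python
def advert_interval(advbase: int, advskew: int) -> float:
    """Seconds between CARP advertisements: advbase + advskew/256.

    Ranges follow the ifconfig(8) text (advbase 1-255, advskew 0-254).
    """
    if not 1 <= advbase <= 255:
        raise ValueError("advbase must be between 1 and 255")
    if not 0 <= advskew <= 254:
        raise ValueError("advskew must be between 0 and 254")
    return advbase + advskew / 256


primary = advert_interval(1, 0)      # 1/0 on the primary
secondary = advert_interval(1, 100)  # 1/100 after sync to the secondary
print(primary, secondary)            # 1.0 1.390625
```

    With the defaults the primary always advertises faster than the secondary, which is why there is normally no reason to change either value.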

    I do recall one issue with IPv6 CARP and the way the VIPs are defined. I cannot remember if it was leading zeroes, capital hex digits or what. How are you specifying your CARP VIPs?
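
    If you want to rule out notation differences, one option is to compare each VIP as entered against its canonical form. The sketch below uses Python's standard ipaddress module; the function name is hypothetical, and 2001:DB8::F is a documentation-prefix stand-in for the masked WAN address, not the real one.

```python
import ipaddress


def describe_vip(text: str) -> dict:
    """Show how an IPv6 VIP string compares with its canonical forms."""
    addr = ipaddress.IPv6Address(text)
    return {
        "as_entered": text,
        "compressed": addr.compressed,  # lowercase, no leading zeroes, '::' used
        "exploded": addr.exploded,      # all eight groups, four hex digits each
        "differs": text != addr.compressed,
    }


for vip in ("fd57:187e:523f:0715::f", "2001:DB8::F", "fd57:187e:523f:715::f"):
    print(describe_vip(vip))
```

    A VIP where "differs" is true contains leading zeroes, capital hex digits, or an uncompressed run of zeroes, so it is a candidate for re-entering in compressed form on both nodes.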

    For some reason the “Backup” unit is also receiving its own advertisements, but only on IPv6.

    If they both think they are MASTER on a VIP they will both be advertising. If you look at the MAC addresses in the capture, you will likely see that the secondary is not receiving its own advertisements, but that it is sending them along with the primary.
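
    The captures posted so far do not include link-level headers, so MAC addresses are not visible in them. As a stand-in, here is a minimal sketch that groups the advertisement lines by source IP address per address family; the regular expression and function name are my own assumptions about that tcpdump output format, and the sample is the igb1 capture from the post above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   00:00:00.000072 IP6 fe80::1 > ff02::12: ip-proto-112 36
LINE = re.compile(r"^\S+\s+(IP6?)\s+(\S+)\s+>\s+(\S+?):\s")
CARP_GROUPS = ("224.0.0.18", "ff02::12")  # multicast destinations seen in the captures

SAMPLE = """\
00:00:00.431700 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36
00:00:00.000072 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
00:00:00.964265 IP6 fe80::ec4:7aff:feab:3725 > ff02::12: ip-proto-112 36
00:00:00.040245 IP6 fe80::ec4:7aff:feac:821b > ff02::12: ip-proto-112 36
00:00:00.000067 IP 172.29.100.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 1, authtype none, intvl 1s, length 36
"""


def advertisers(capture: str) -> dict:
    """Return {family: set of source addresses} for CARP advertisement lines."""
    seen = defaultdict(set)
    for line in capture.splitlines():
        match = LINE.match(line.strip())
        if match and match.group(3) in CARP_GROUPS:
            seen[match.group(1)].add(match.group(2))
    return dict(seen)


for family, sources in advertisers(SAMPLE).items():
    print(family, sorted(sources))
```

    On that sample the IPv6 family has two different link-local sources, which fits both nodes advertising rather than one node hearing itself. A capture taken with -e would let you do the same grouping on the Ethernet source address instead.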



  • Yes, I agree. I saw a suggestion about that. You need to add "-T carp" to the tcpdump command for it to work:

    tcpdump -npi igb1 -T carp -e | egrep "224.0.0.18|ff02::12:"
    

    for example. The "egrep" is there because if I just use "expression" carp, it does not dump the IPv6 traffic.

    Back to my problem. I've reverted to 1/0 (and, via config sync, 1/100 on the backup).

    How I am able to reproduce the problem:

    • Make a change on any CARP VIP
    • Reboot primary

    However, if I reboot the backup unit after I've made a change on CARP, everything works as advertised. I'm going to do some more dumping to try to figure it out.


  • Netgate

    I don't know of any fixes regarding this, but if you are rebooting these units you should be on 2.4.3_1.

    https://www.netgate.com/docs/pfsense/highavailability/redundant-firewalls-upgrade-guide.html