Too many 'has resumed CARP state "BACKUP" for vhid' messages in the log



  • Hello Netgate forum,

    I am seeing far too many log lines (a few hundred within 10 minutes) like

    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XX.XXX.XXX.145@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21
    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX::145@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21
    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XX.XXX.XXX.170@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21
    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX::170@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21
    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XX.XXX.XXX.171@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21
    Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX::171@vtnet2.3064): (MC_EDMZ)" has resumed CARP state "BACKUP" for vhid 21

    when booting pfSense.

    This prevents pfSense from starting Quagga and HAProxy, so I start them both manually once CARP goes quiet.

    How can I fix this?

    pfSense 2.4.4_3

    Best Regards
    Arr


  • LAYER 8 Moderator

    Hi,

    1. Is there a reason why you aren't running the latest stable version, 2.4.5_1?
    2. All those messages are for different(!) IPs but on the same interface. Could you show your Virtual IP table? It seems to me you are unnecessarily running CARP on a large number of IPs on the same interface when you could simply use IP Aliases.

    Especially because of 2), I think you're either running a slightly misconfigured CARP setup or using CARP-style VIPs the wrong way, if it shows hundreds(!) of those messages over ~10 minutes. That smells like you've created a whole /24 or even bigger space as individual CARP IPs. :)

    Greets
    \jegr
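To see how many of these messages belong to each address and vhid, the log lines can be tallied. The following is only a rough sketch: the regular expression is written against the sample lines quoted in the first post, the function name `tally_carp_events` is made up for illustration, and how the system log is exported to text is left out.

```python
import re
from collections import Counter

# Pattern written against the /rc.carpbackup lines quoted in this thread.
# Other pfSense versions may word the message differently (assumption).
CARP_LINE = re.compile(
    r'HA cluster member "\((?P<addr>[^@]+)@(?P<iface>[^)]+)\): \((?P<desc>[^)]*)\)" '
    r'has resumed CARP state "(?P<state>[A-Z]+)" for vhid (?P<vhid>\d+)'
)


def tally_carp_events(lines):
    """Count CARP state messages per (interface, vhid, state) and per address."""
    per_vhid = Counter()
    per_addr = Counter()
    for line in lines:
        m = CARP_LINE.search(line)
        if m is None:
            continue  # not a CARP state message
        per_vhid[(m["iface"], int(m["vhid"]), m["state"])] += 1
        per_addr[m["addr"]] += 1
    return per_vhid, per_addr


if __name__ == "__main__":
    # Two of the (masked) lines from the first post, used as sample input.
    sample = [
        'Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: '
        'HA cluster member "(XX.XXX.XXX.145@vtnet2.3064): (MC_EDMZ)" '
        'has resumed CARP state "BACKUP" for vhid 21',
        'Jul 17 03:02:23 mc-fw-prd-prd-c-fwint-01-p php-fpm[14390]: /rc.carpbackup: '
        'HA cluster member "(XXXX:XXXX:XXXX::145@vtnet2.3064): (MC_EDMZ)" '
        'has resumed CARP state "BACKUP" for vhid 21',
    ]
    per_vhid, per_addr = tally_carp_events(sample)
    for key, count in per_vhid.most_common():
        print(key, count)
    for addr, count in per_addr.most_common():
        print(addr, count)
```

If one vhid accounts for nearly all messages while dozens of addresses each appear a few times, that would fit the picture of many addresses hanging off the same vhid on one interface.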



  • Hi JeGr,

    It is not on 2.4.5_1 yet because, with this problem, a reboot means a longer downtime. However, an upgrade or re-install to this version is planned.

    This is the interface config; the ...231 is "direct" to CARP and the others are IP Aliases to the .231:



    vtnet2.3064: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=80000<LINKSTATE>
    ether 26:52:80:fd:0b:79
    inet6 fe80::2452:80ff:fefd:b79%vtnet2.3064 prefixlen 64 scopeid 0x9
    inet6 XXXX:XXXX:XXXX::231 prefixlen 64
    inet6 XXXX:XXXX:XXXX::130 prefixlen 64 vhid 22
    inet6 XXXX:XXXX:XXXX::245 prefixlen 64 vhid 22
    inet6 XXXX:XXXX:XXXX::131 prefixlen 64 vhid 22
    [...]
    inet XX.XXX.XXX.231 netmask 0xffffff80 broadcast XX.XXX.XXX.255
    inet XX.XXX.XXX.130 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    inet XX.XXX.XXX.245 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    inet XX.XXX.XXX.160 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    inet XX.XXX.XXX.131 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    [...]
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet 10Gbase-T <full-duplex>
    status: active
    vlan: 3064 vlanpcp: 0 parent interface: vtnet2
    carp: MASTER vhid 21 advbase 1 advskew 0
    carp: MASTER vhid 22 advbase 1 advskew 0
    groups: vlan
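For a long address list like this, it can help to group the addresses by vhid instead of reading the output by eye. A minimal sketch, assuming `ifconfig` output shaped like the excerpt above; the helper name `addresses_by_vhid` is hypothetical:

```python
from collections import defaultdict


def addresses_by_vhid(ifconfig_text):
    """Group inet/inet6 addresses from ifconfig output by CARP vhid.

    Addresses without a vhid (the interface's own addresses) are
    collected under the key None.
    """
    groups = defaultdict(list)
    for line in ifconfig_text.splitlines():
        tokens = line.split()
        # Only address lines are of interest; "carp: ..." lines are skipped.
        if len(tokens) < 2 or tokens[0] not in ("inet", "inet6"):
            continue
        vhid = None
        if "vhid" in tokens[:-1]:
            vhid = int(tokens[tokens.index("vhid") + 1])
        groups[vhid].append(tokens[1])
    return dict(groups)


if __name__ == "__main__":
    # Shortened excerpt of the (masked) output from this post.
    excerpt = """
    inet6 XXXX:XXXX:XXXX::231 prefixlen 64
    inet6 XXXX:XXXX:XXXX::130 prefixlen 64 vhid 22
    inet XX.XXX.XXX.231 netmask 0xffffff80 broadcast XX.XXX.XXX.255
    inet XX.XXX.XXX.130 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    inet XX.XXX.XXX.245 netmask 0xffffff80 broadcast XX.XXX.XXX.255 vhid 21
    carp: MASTER vhid 21 advbase 1 advskew 0
    """
    for vhid, addrs in addresses_by_vhid(excerpt).items():
        print(vhid, len(addrs), addrs)
```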



    Best Regards



  • Correction:

    "This is the interface config; the ...231 is "direct" to CARP and the others are IP Aliases to the .231:"
    ==>
    This is the interface config; the ...231 is the physical IP of interface vtnet2.3064 (...232 on the secondary), the ...130 is "direct" to CARP, and the others are IP Aliases to the .130:


  • LAYER 8 Moderator

    That's a bit chaotic. Could you show the Virtual IP page as a screenshot (and blank out the middle)?

    I assume all IPs are in the same prefix/subnet (i.e. the XXes are all the same)?
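The subnet question can also be checked mechanically. The sketch below is only an illustration: since the real prefix is masked, it uses the documentation prefix 192.0.2.x as a stand-in, combined with the host parts and the netmask 0xffffff80 from the posted output. Both function names are made up.

```python
import ipaddress


def prefix_len_from_hex_mask(hex_mask):
    """Convert an ifconfig-style hex netmask such as '0xffffff80' to a prefix length.

    Counting the set bits assumes a contiguous netmask.
    """
    return bin(int(hex_mask, 16)).count("1")


def addresses_outside_subnet(base_addr, hex_mask, others):
    """Return the network of base_addr and every address in others outside it."""
    prefix_len = prefix_len_from_hex_mask(hex_mask)
    net = ipaddress.ip_network(f"{base_addr}/{prefix_len}", strict=False)
    outside = [a for a in others if ipaddress.ip_address(a) not in net]
    return net, outside


if __name__ == "__main__":
    # 192.0.2.x is a placeholder for the masked XX.XXX.XXX prefix (assumption).
    net, outside = addresses_outside_subnet(
        "192.0.2.231",
        "0xffffff80",
        ["192.0.2.130", "192.0.2.245", "192.0.2.160", "192.0.2.131"],
    )
    print("network:", net, "broadcast:", net.broadcast_address)
    print("outside the network:", outside)
```

With a /25 mask, all of the host parts shown in the output (.130 to .245) fall into the upper half of the /24, which matches the .255 broadcast address in the listing.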

    "It is not on 2.4.5_1 yet because, with this problem, a reboot means a longer downtime. However, an upgrade or re-install to this version is planned."

    Why is updating an issue because of how long it takes to boot? Simply update the secondary node and check that it works. Then put the primary into CARP maintenance mode ("carp maint mode") so it doesn't take back its master role after booting, update and check it too, and if it works, switch back over to the primary. No downtime necessary! :)



  • Hi JeGr,

    Yes, the XXs are all the same.

    And yes, it is with downtime... but I have never tried the CARP maintenance mode. In the next maintenance window I will try it this way.

    Let me try to show a screenshot later.

    Best Regards

