pfSense 2.2.2: CARP backup becomes master



  • Hi,

    I have installed two new pfSense 2.2.2 nodes and restored their configuration from a backup of the former 2.1.5 nodes. The new nodes are Dell R220s, each with an onboard dual-port and an additional quad-port Broadcom NIC. Since the upgrade, although the nodes use exactly the same switch ports as before, both nodes sometimes consider themselves master. This did not happen with the previous systems, so the switches (Cisco Catalyst 3750 with PortFast enabled) should not be the cause.
    When I reboot both nodes, things seem to be all right for a few minutes; after that the secondary node switches to master. I have no clue where to start looking; currently my only workaround is to permanently disable CARP on the secondary node. dmesg on the first node shows:

    bge0: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5719001> mem 0xa2a90000-0xa2a9ffff,0xa2aa0000-0xa2aaffff,0xa2ab0000-0xa2abffff irq 16 at device 0.0 on pci1
    bge0: APE FW version: NCSI v1.3.7.0
    bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
    miibus0: <MII bus> on bge0
    brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
    brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    bge1: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5719001> mem 0xa2a60000-0xa2a6ffff,0xa2a70000-0xa2a7ffff,0xa2a80000-0xa2a8ffff irq 17 at device 0.1 on pci1
    bge1: APE FW version: NCSI v1.3.7.0
    bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
    miibus1: <MII bus> on bge1
    brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
    brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5719001> mem 0xa2a30000-0xa2a3ffff,0xa2a40000-0xa2a4ffff,0xa2a50000-0xa2a5ffff irq 16 at device 0.2 on pci1
    bge2: APE FW version: NCSI v1.3.7.0
    bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
    miibus2: <MII bus> on bge2
    brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
    brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    bge3: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5719001> mem 0xa2a00000-0xa2a0ffff,0xa2a10000-0xa2a1ffff,0xa2a20000-0xa2a2ffff irq 17 at device 0.3 on pci1
    bge3: APE FW version: NCSI v1.3.7.0
    bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
    miibus3: <MII bus> on bge3
    brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
    brgphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xa2c00000-0xa2c0ffff irq 16 at device 20.0 on pci0
    usbus0: waiting for BIOS to give up control
    xhci0: 32 byte context size.
    xhci0: Port routing mask set to 0xffffffff
    usbus0 on xhci0
    ehci0: <Intel Lynx Point USB 2.0 controller USB-B> mem 0xa2c12000-0xa2c123ff irq 16 at device 26.0 on pci0
    usbus1: EHCI version 1.0
    usbus1 on ehci0
    pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
    pci2: <ACPI PCI bus> on pcib2
    bge4: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xa2b30000-0xa2b3ffff,0xa2b40000-0xa2b4ffff,0xa2b50000-0xa2b5ffff irq 16 at device 0.0 on pci2
    bge4: APE FW version: NCSI v1.3.7.0
    bge4: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
    miibus4: <MII bus> on bge4
    brgphy4: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus4
    brgphy4:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    bge5: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xa2b00000-0xa2b0ffff,0xa2b10000-0xa2b1ffff,0xa2b20000-0xa2b2ffff irq 17 at device 0.1 on pci2
    bge5: APE FW version: NCSI v1.3.7.0
    bge5: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
    miibus5: <MII bus> on bge5
    brgphy5: <BCM5720C 1000BASE-T media interface> PHY 2 on miibus5
    brgphy5:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
    pcib3: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
    pci3: <ACPI PCI bus> on pcib3
    pcib4: <ACPI PCI-PCI bridge> at device 0.0 on pci3
    pcib5: <PCI-PCI bridge> at device 0.0 on pci4
    pci5: <PCI bus> on pcib5
    pcib6: <PCI-PCI bridge> at device 0.0 on pci5
    pci6: <PCI bus> on pcib6
    vgapci0: <VGA-compatible display> mem 0xa1000000-0xa1ffffff,0xa2800000-0xa2803fff,0xa2000000-0xa27fffff irq 18 at device 0.0 on pci6
    vgapci0: Boot video device
    pcib7: <PCI-PCI bridge> at device 1.0 on pci4
    pci7: <PCI bus> on pcib7
    pci7: <memory, RAM> at device 0.0 (no driver attached)
    ehci1: <Intel Lynx Point USB 2.0 controller USB-A> mem 0xa2c11000-0xa2c113ff irq 23 at device 29.0 on pci0
    usbus2: EHCI version 1.0
    usbus2 on ehci1
    isab0: <PCI-ISA bridge> at device 31.0 on pci0
    isa0: <ISA bus> on isab0
    ahci0: <Intel Lynx Point AHCI SATA controller> port 0x3048-0x304f,0x3054-0x3057,0x3040-0x3047,0x3050-0x3053,0x3020-0x303f mem 0xa2c10000-0xa2c107ff irq 19 at device 31.2 on pci0
    ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
    ahcich0: <AHCI channel> at channel 0 on ahci0
    ahcich4: <AHCI channel> at channel 4 on ahci0
    ahcich5: <AHCI channel> at channel 5 on ahci0
    ahciem0: <AHCI enclosure management bridge> on ahci0
    acpi_tz0: <Thermal Zone> on acpi0
    acpi_tz1: <Thermal Zone> on acpi0
    battery0: <ACPI Control Method Battery> on acpi0
    battery1: <ACPI Control Method Battery> on acpi0
    battery2: <ACPI Control Method Battery> on acpi0
    ppc1: cannot reserve I/O port range
    uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
    orm0: <ISA Option ROM> at iomem 0xc0000-0xc7fff on isa0
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x300>
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    ppc0: cannot reserve I/O port range
    est0: <Enhanced SpeedStep Frequency Control> on cpu0
    p4tcc0: <CPU Frequency Thermal Control> on cpu0
    est1: <Enhanced SpeedStep Frequency Control> on cpu1
    p4tcc1: <CPU Frequency Thermal Control> on cpu1
    est2: <Enhanced SpeedStep Frequency Control> on cpu2
    p4tcc2: <CPU Frequency Thermal Control> on cpu2
    est3: <Enhanced SpeedStep Frequency Control> on cpu3
    p4tcc3: <CPU Frequency Thermal Control> on cpu3
    Timecounters tick every 1.000 msec
    IPsec: Initialized Security Association Processing.
    random: unblocking device.
    usbus0: 5.0Gbps Super Speed USB v3.0
    usbus1: 480Mbps High Speed USB v2.0
    usbus2: 480Mbps High Speed USB v2.0
    ugen0.1: <0x8086> at usbus0
    uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
    ugen1.1: <Intel> at usbus1
    uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
    ugen2.1: <Intel> at usbus2
    uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
    ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
    ada0: <ST500NM0003-9ZM172 GA0A> ATA-9 SATA 2.x device
    ada0: Serial Number Z1W3G20R
    ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada0: Command Queueing enabled
    ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
    ada0: Previously was known as ad4
    ses0 at ahciem0 bus 0 scbus3 target 0 lun 0
    ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
    ses0: SEMB SES Device
    cd0 at ahcich4 bus 0 scbus1 target 0 lun 0
    cd0: <HL-DT-ST DVD+-RW GTA0N A3B0> Removable CD-ROM SCSI-0 device
    cd0: Serial Number KL3F17M0730
    cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
    cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
    SMP: AP CPU #1 Launched!
    SMP: AP CPU #2 Launched!
    SMP: AP CPU #3 Launched!
    Timecounter "TSC-low" frequency 1696110642 Hz quality 1000
    Root mount waiting for: usbus2 usbus1 usbus0
    uhub0: 17 ports with 17 removable, self powered
    uhub1: 2 ports with 2 removable, self powered
    uhub2: 2 ports with 2 removable, self powered
    Root mount waiting for: usbus2 usbus1 usbus0
    ugen0.2: <Avocent> at usbus0
    ukbd0: <EP1 Interrupt> on usbus0
    kbd0 at ukbd0
    ugen1.2: <vendor 0x8087> at usbus1
    uhub3: <vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2> on usbus1
    ugen2.2: <vendor 0x8087> at usbus2
    uhub4: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus2
    uhub3: 4 ports with 4 removable, self powered
    ugen0.3: <no manufacturer> at usbus0
    uhub5: <no manufacturer Gadget USB HUB, class 9/0, rev 2.00/0.00, addr 2> on usbus0
    uhub4: 6 ports with 6 removable, self powered
    Root mount waiting for: usbus0
    uhub5: 6 ports with 6 removable, self powered
    Root mount waiting for: usbus0
    ugen0.4: <Avocent> at usbus0
    ukbd1: <Keyboard> on usbus0
    kbd2 at ukbd1
    Trying to mount root from ufs:/dev/ufsid/556747db4a12b13e [rw]...
    bge4: link state changed to DOWN
    bge5: link state changed to DOWN
    bge3: link state changed to DOWN
    bge2: link state changed to DOWN
    bge1: link state changed to DOWN
    bge0: link state changed to DOWN
    bge0: promiscuous mode enabled
    carp: demoted by 240 to 240 (interface down)
    bge2: promiscuous mode enabled
    carp: demoted by 240 to 480 (interface down)
    bge1: promiscuous mode enabled
    carp: demoted by 240 to 720 (interface down)
    bge5: promiscuous mode enabled
    carp: demoted by 240 to 960 (interface down)
    bge3: promiscuous mode enabled
    carp: demoted by 240 to 1200 (interface down)
    bge4: promiscuous mode enabled
    carp: demoted by 240 to 1440 (interface down)
    carp: demoted by 0 to 1440 (pfsync bulk start)
    carp: VHID 5@bge4: INIT -> BACKUP
    carp: demoted by -240 to 1200 (interface up)
    bge4: link state changed to UP
    carp: VHID 6@bge3: INIT -> BACKUP
    carp: demoted by -240 to 960 (interface up)
    bge3: link state changed to UP
    tun1: changing name to 'ovpns1'
    tun2: changing name to 'ovpns2'
    tun3: changing name to 'ovpns3'
    tun4: changing name to 'ovpns4'
    ovpns1: link state changed to UP
    ovpns2: link state changed to UP
    ovpns3: link state changed to UP
    tun5: changing name to 'ovpns5'
    ovpns4: link state changed to UP
    tun6: changing name to 'ovpns6'
    ovpns5: link state changed to UP
    tun7: changing name to 'ovpns7'
    ovpns6: link state changed to UP
    tun8: changing name to 'ovpns8'
    ovpns7: link state changed to UP
    ovpns8: link state changed to UP
    pflog0: promiscuous mode enabled
    carp: VHID 1@bge5: INIT -> BACKUP
    carp: demoted by -240 to 720 (interface up)
    bge5: link state changed to UP
    carp: VHID 3@bge2: INIT -> BACKUP
    carp: demoted by -240 to 480 (interface up)
    bge2: link state changed to UP
    carp: VHID 2@bge1: INIT -> BACKUP
    carp: demoted by -240 to 240 (interface up)
    bge1: link state changed to UP
    carp: VHID 4@bge0: INIT -> BACKUP
    carp: demoted by -240 to 0 (interface up)
    bge0: link state changed to UP
    carp: VHID 4@bge0: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 6@bge3: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 3@bge2: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 2@bge1: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 5@bge4: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 1@bge5: BACKUP -> MASTER (preempting a slower master)
    ovpns4: link state changed to DOWN
    ovpns5: link state changed to DOWN
    ovpns4: link state changed to UP
    ovpns6: link state changed to DOWN
    ovpns5: link state changed to UP
    ovpns7: link state changed to DOWN
    ovpns6: link state changed to UP
    ovpns1: link state changed to DOWN
    ovpns8: link state changed to DOWN
    ovpns7: link state changed to UP
    ovpns2: link state changed to DOWN
    ovpns1: link state changed to UP
    ovpns8: link state changed to UP
    ovpns3: link state changed to DOWN
    ovpns2: link state changed to UP
    ovpns3: link state changed to UP
    carp: demoted by 0 to 0 (pfsync bulk fail)
    carp: demoted by 240 to 240 (interface down)
    bge3: link state changed to DOWN
    carp: VHID 6@bge3: INIT -> BACKUP
    carp: demoted by -240 to 0 (interface up)
    bge3: link state changed to UP
    carp: demoted by 0 to 0 (pfsync bulk start)
    carp: VHID 6@bge3: BACKUP -> MASTER (preempting a slower master)
    ifa_add_loopback_route: insertion failed: 17
    ovpns4: link state changed to DOWN
    ovpns5: link state changed to DOWN
    ovpns4: link state changed to UP
    ovpns6: link state changed to DOWN
    ovpns5: link state changed to UP
    ovpns7: link state changed to DOWN
    ovpns6: link state changed to UP
    ovpns8: link state changed to DOWN
    ovpns7: link state changed to UP
    ovpns8: link state changed to UP
    carp: demoted by 240 to 240 (interface down)
    bge3: link state changed to DOWN
    carp: VHID 3@bge2: MASTER -> BACKUP (more frequent advertisement received)
    carp: VHID 2@bge1: MASTER -> BACKUP (more frequent advertisement received)
    carp: VHID 4@bge0: MASTER -> BACKUP (more frequent advertisement received)
    carp: VHID 1@bge5: MASTER -> BACKUP (more frequent advertisement received)
    carp: VHID 5@bge4: MASTER -> BACKUP (more frequent advertisement received)
    arp: 192.168.191.2 moved from 54:9f:35:25:55:24 to 00:00:5e:00:01:05 on bge4
    carp: VHID 6@bge3: INIT -> BACKUP
    carp: demoted by -240 to 0 (interface up)
    bge3: link state changed to UP
    carp: VHID 1@bge5: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 3@bge2: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 5@bge4: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 2@bge1: BACKUP -> MASTER (preempting a slower master)
    carp: VHID 4@bge0: BACKUP -> MASTER (preempting a slower master)
    arp: 192.168.191.2 moved from 00:00:5e:00:01:05 to 54:9f:35:25:55:24 on bge4
    ovpns1: link state changed to DOWN
    ovpns2: link state changed to DOWN
    ovpns1: link state changed to UP
    ovpns3: link state changed to DOWN
    ovpns2: link state changed to UP
    ovpns3: link state changed to UP
    carp: VHID 6@bge3: BACKUP -> MASTER (master down)
    ifa_add_loopback_route: insertion failed: 17
    ovpns4: link state changed to DOWN
    ovpns5: link state changed to DOWN
    ovpns4: link state changed to UP
    ovpns6: link state changed to DOWN
    ovpns5: link state changed to UP
    ovpns7: link state changed to DOWN
    ovpns6: link state changed to UP
    ovpns8: link state changed to DOWN
    ovpns7: link state changed to UP
    ovpns8: link state changed to UP
    carp: demoted by 0 to 0 (pfsync bulk fail)
    

    Does anyone have an idea how I could diagnose the problem?

    Kind regards,

    Jens
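
One way to start on a log like this is to reduce it to the CARP state changes per VHID, which shows at a glance which interface keeps flapping. The following is only a minimal sketch: the regular expression is modelled on the `carp: VHID n@ifname: OLD -> NEW (reason)` lines in the dmesg output above, and the names `carp_transitions`, `summarize` and `SAMPLE` are made up for this illustration.

```python
import re

# Modelled on kernel lines from the posted dmesg, for example:
#   carp: VHID 6@bge3: BACKUP -> MASTER (master down)
CARP_RE = re.compile(
    r"carp: VHID (?P<vhid>\d+)@(?P<ifname>\w+): "
    r"(?P<old>\w+) -> (?P<new>\w+)(?: \((?P<reason>[^)]*)\))?"
)


def carp_transitions(lines):
    """Yield (vhid, ifname, old, new, reason) for every CARP state change."""
    for line in lines:
        m = CARP_RE.search(line)
        if m:
            yield (int(m["vhid"]), m["ifname"], m["old"], m["new"], m["reason"])


def summarize(lines):
    """Return {(vhid, ifname): {"state": last state seen, "changes": count}}."""
    summary = {}
    for vhid, ifname, _old, new, _reason in carp_transitions(lines):
        entry = summary.setdefault((vhid, ifname), {"state": new, "changes": 0})
        entry["state"] = new
        entry["changes"] += 1
    return summary


# A few lines copied from the log above (hypothetical selection, not the full log).
SAMPLE = [
    "carp: VHID 6@bge3: INIT -> BACKUP",
    "carp: VHID 6@bge3: BACKUP -> MASTER (preempting a slower master)",
    "carp: VHID 3@bge2: MASTER -> BACKUP (more frequent advertisement received)",
    "bge3: link state changed to DOWN",
    "carp: VHID 6@bge3: INIT -> BACKUP",
    "carp: VHID 6@bge3: BACKUP -> MASTER (master down)",
]

for (vhid, ifname), info in sorted(summarize(SAMPLE).items()):
    print(f"VHID {vhid}@{ifname}: {info['state']} after {info['changes']} change(s)")
```

Feeding it the complete dmesg output from both nodes and comparing the results would show whether the flapping is limited to one interface (bge3 stands out in the log above) or affects all of them.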



  • I have a similar problem that I've been digging into. I'm also on 2.2.2, though I've never run an earlier version.

    What I've noticed:

    • Once the primary fails and the secondary takes over, the secondary stops receiving CARP advertisements via multicast, which is to be expected while the primary is down.

    • Once the primary comes back online, though, it sees the secondary's CARP advertisements, but the secondary does not see the primary's, so both get stuck as master on my LAN interface. (The WAN interface has no problem switching back to backup.)

    • If I go to the VIPs, change the VHID on the LAN and apply the changes, the backup receives CARP advertisements again and everything goes back to normal. (Restarting the LAN interface on the secondary also restores normal operation.)

    I also have Broadcom network ports, and I've read elsewhere that people have had trouble with CARP on the Broadcom drivers. I'm hoping the only resolution isn't that I have to get different network adapters.
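
The "both stuck as master" condition described above can be checked mechanically once the CARP state of each VHID has been collected from both nodes. This is a rough sketch only: how the states are gathered is left open, and the function name `dual_master` and the VHID numbers in the example (1 for WAN, 2 for LAN) are assumptions made for illustration.

```python
def dual_master(primary_states, secondary_states):
    """Return the sorted VHIDs that are MASTER on both nodes at the same time.

    Both arguments map a VHID (int) to its CARP state string,
    for example {1: "MASTER", 2: "BACKUP"}.
    """
    return sorted(
        vhid
        for vhid, state in primary_states.items()
        if state == "MASTER" and secondary_states.get(vhid) == "MASTER"
    )


# Hypothetical snapshot matching the description: WAN fails back, LAN does not.
primary = {1: "MASTER", 2: "MASTER"}
secondary = {1: "BACKUP", 2: "MASTER"}

print(dual_master(primary, secondary))
```

Any VHID this reports is one where the backup is apparently not hearing the master's advertisements, which narrows the search to that interface and its switch port.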



  • Figured out my issue.

    Port security was enabled and set to restrict on the switch port that the LAN interfaces were connected to, and I could see in the switch logs that it was being tripped.

    I disabled port security and now all is well.
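
A plausible reason port security interferes with CARP (an assumption; the post does not spell out the mechanism) is that each CARP VIP uses its own virtual MAC address in addition to the NIC's hardware MAC, and that virtual MAC moves to the other node's switch port on failover. For IPv4 the virtual MAC is `00:00:5e:00:01:XX`, where `XX` is the VHID in hex, which matches the `00:00:5e:00:01:05` seen for VHID 5 on bge4 in the dmesg output earlier in the thread. The sketch below derives these addresses; the function names are made up for illustration.

```python
def carp_virtual_mac(vhid):
    """Return the IPv4 CARP virtual MAC (00:00:5e:00:01:XX) for a VHID."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be in the range 1..255")
    return "00:00:5e:00:01:{:02x}".format(vhid)


def macs_seen_on_port(hardware_mac, vhids):
    """List the source MACs a switch port may learn while the node is master.

    hardware_mac is the NIC's own address; vhids are the CARP VHIDs
    configured on that interface.
    """
    return [hardware_mac.lower()] + [carp_virtual_mac(v) for v in vhids]


# Hardware MAC taken from the arp line in the log; VHID 5 is the one on bge4.
print(macs_seen_on_port("54:9f:35:25:55:24", [5]))
```

If port security is kept rather than disabled, the number of secure addresses allowed on each firewall-facing port would have to cover the hardware MAC plus every virtual MAC on that interface, and the virtual MACs must be allowed to appear on either node's port.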

