Netgate Discussion Forum

    Adding CARP VIPs causes Pair to start Crashing

    HA/CARP/VIPs
      cthomas

      I currently manage four sets of pfSense firewalls in HA, plus a number of standalone firewalls.  I've been trying to get a pair set up at one site for a few months now, and I just can't get it working right.

      Basically, as soon as I start adding CARP VIPs to the pair, it becomes unstable and starts crashing.

      For example, about an hour ago I added a single CARP VIP to vlan229 on lagg1, which replicated to the secondary without issues.  To test stability, I disabled CARP on the primary fw to force a failover, which succeeded; then I re-enabled CARP on the primary and the VIPs failed back with no problem.  I then rebooted the secondary fw and it never came back up.  After ten minutes I remotely power-cycled the server, and five minutes later the secondary fw was up and running.  I then successfully rebooted the primary, followed by the secondary.  It seemed stable.  Then I added another CARP VIP to vlan3 on lagg0, and as soon as the VIP replicated to the secondary fw, the secondary crashed.  Approximately five minutes later, when the secondary finished booting, the primary crashed; as soon as the primary finished booting, the secondary crashed.
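A minimal sketch, in Python, of how the CARP state on each node could be recorded between the steps of a test like the one above. It looks for status lines of the form `carp: MASTER vhid 1 advbase 1 advskew 0` in `ifconfig` output. The sample text, the interface name in it, and the function name `carp_states` are illustrative assumptions, not output captured from these firewalls.

```python
import re

# Matches the status line ifconfig prints for a CARP VIP, e.g.
#   carp: MASTER vhid 1 advbase 1 advskew 0
CARP_LINE = re.compile(r"carp:\s+(MASTER|BACKUP|INIT)\s+vhid\s+(\d+)")


def carp_states(ifconfig_output):
    """Return {vhid: state} for every CARP status line found in the text."""
    states = {}
    for line in ifconfig_output.splitlines():
        match = CARP_LINE.search(line)
        if match:
            states[int(match.group(2))] = match.group(1)
    return states


# Hypothetical sample text (assumption), not real output from this pair.
SAMPLE = """\
lagg1_vlan229_vip1: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
\tinet 192.0.2.1 netmask 0xffffff00
\tcarp: BACKUP vhid 1 advbase 1 advskew 100
"""

if __name__ == "__main__":
    print(carp_states(SAMPLE))
```

Running this against saved `ifconfig` output from both nodes before and after each change would show whether a VIP ever ends up MASTER on both sides or on neither.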

      This is technically the third pair of physical servers I've tried to set this up on.  One of them actually had a flaky NIC, which caused quite a few problems while I was trying to implement this config.  So I think I've ruled out hardware issues on the FWs; pfSense just doesn't like something I'm trying to do.  I've submitted a crap-ton of crash reports, and I can provide the hostnames of these fws to the devs if they PM me.

      A high-level overview of the config is as follows:

      2x Dell R410s with a 6-core Intel CPU, 16GB RAM, 500GB HDD.
      2x Onboard Broadcom NICs, 2x Intel Quad-Port Adapters
      pfSense v2.1.4 w/ CARP+IPAlias patch applied

      /boot/loader.conf.local
        kern.ipc.nmbclusters="131072"
        hw.bce.tso_enable="0"
        hw.pci.enable_msix="0"
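For scale, a rough calculation of what the `kern.ipc.nmbclusters` value above permits. It assumes the standard FreeBSD mbuf cluster size of 2048 bytes (`MCLBYTES`), which is not stated in this post, and it gives an upper bound on cluster memory rather than what the system actually uses.

```python
MCLBYTES = 2048        # assumed standard FreeBSD mbuf cluster size, in bytes
NMBCLUSTERS = 131072   # value from /boot/loader.conf.local above


def max_cluster_memory_mib(clusters, cluster_bytes=MCLBYTES):
    """Upper bound on memory consumed by mbuf clusters, in MiB."""
    return clusters * cluster_bytes / (1024 * 1024)


if __name__ == "__main__":
    print(max_cluster_memory_mib(NMBCLUSTERS))  # 256.0
```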

      igb7 = WAN
      igb6 = lagg1
      igb5 = lagg1
      igb4 = Network1
      igb3 = lagg0
      igb2 = lagg0
      igb1 = lagg0
      igb0 = lagg0
      bce0 = Network2
      bce1 = pfSync
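The port assignments above can be written down as data and sanity-checked, for example to confirm that the port counts match the hardware list (two quad-port Intel adapters and two onboard Broadcom ports). This is only a sketch; the helper names `members_by_role` and `ports_by_driver` are made up for illustration.

```python
from collections import Counter, defaultdict

# Physical port -> role, copied from the list above.
ASSIGNMENTS = {
    "igb7": "WAN",
    "igb6": "lagg1",
    "igb5": "lagg1",
    "igb4": "Network1",
    "igb3": "lagg0",
    "igb2": "lagg0",
    "igb1": "lagg0",
    "igb0": "lagg0",
    "bce0": "Network2",
    "bce1": "pfSync",
}


def members_by_role(assignments):
    """Group physical ports by the role (or lagg) they are assigned to."""
    groups = defaultdict(list)
    for port, role in sorted(assignments.items()):
        groups[role].append(port)
    return dict(groups)


def ports_by_driver(assignments):
    """Count ports per driver by stripping the unit number (igb3 -> igb)."""
    return dict(Counter(port.rstrip("0123456789") for port in assignments))


if __name__ == "__main__":
    print(members_by_role(ASSIGNMENTS))
    print(ports_by_driver(ASSIGNMENTS))
```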

      lagg0 is LACP, and has 10 vlans plus a network directly assigned to lagg0 (untagged traffic)
          lagg0_untagged >----------< There are a few Windows NLBs on this subnet
          lagg0_vlan3
          lagg0_vlan12
          lagg0_vlan16 >----------< Another FW Pair using CARP VHID 34
          lagg0_vlan22
          lagg0_vlan32
          lagg0_vlan185
          lagg0_vlan186
          lagg0_vlan228
          lagg0_vlan230 >----------< There are a few Windows NLBs, and a Pair of Barracudas on this subnet
          lagg0_vlan320

      lagg1 is LACP, and has 7 vlans plus a network directly assigned to lagg1 (untagged traffic)
          lagg1_untagged
          lagg1_vlan4
          lagg1_vlan5
          lagg1_vlan6
          lagg1_vlan14
          lagg1_vlan229
          lagg1_vlan261
          lagg1_vlan262
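Because CARP derives its virtual MAC address from the VHID (00:00:5e:00:01:XX, where XX is the VHID in hex), two CARP groups sharing a VHID on the same broadcast domain will conflict. A minimal sketch of a check against the one foreign VHID noted above (34 on lagg0_vlan16) follows. The VHIDs in `PLANNED` are hypothetical, since the VHIDs actually assigned on this pair are not stated here.

```python
def carp_virtual_mac(vhid):
    """Virtual MAC address that CARP uses for a given VHID (1..255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be in 1..255")
    return "00:00:5e:00:01:%02x" % vhid


# VHIDs known to be used by other devices, from the layout above.
IN_USE = {"lagg0_vlan16": {34}}


def vhid_conflicts(planned, in_use=IN_USE):
    """Return the (segment, vhid) pairs in `planned` that are already taken.

    A pair counts as taken if another device uses that VHID on the segment
    or if it appears earlier in `planned` on the same segment.
    """
    seen = {segment: set(vhids) for segment, vhids in in_use.items()}
    conflicts = []
    for segment, vhid in planned:
        taken = seen.setdefault(segment, set())
        if vhid in taken:
            conflicts.append((segment, vhid))
        taken.add(vhid)
    return conflicts


# Hypothetical plan (assumption): real VHIDs for this pair are not given.
PLANNED = [("lagg1_vlan229", 1), ("lagg0_vlan3", 2), ("lagg0_vlan16", 34)]

if __name__ == "__main__":
    print(carp_virtual_mac(34))
    print(vhid_conflicts(PLANNED))
```

With the hypothetical plan above, only the lagg0_vlan16 entry is flagged, because VHID 34 is already in use there by the other firewall pair.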

      lagg0 is connected to a pair of Netgear GS728TS Switches (v1h1 B5.2.0.2 V5.3.0.17)
        fw1 is connected to sw1 g1/g2/g3/g4
        fw2 is connected to sw2 g1/g2/g3/g4

      lagg1 is connected to a pair of Netgear GS752TS Switches (H00.00.01 B1.0.2.0 V5.1.0.2)
        fw1 is connected to sw1/g1 and sw2/g1
        fw2 is connected to sw1/g2 and sw2/g2

      This pair is supposed to replace an aging pfSense fw at a data center (I inherited this stuff).  The old FW is starting to become unstable.  It is set up very similarly to the pair above, except that it does not have an HA interface, there is only a single port in lagg1, and all of the VIPs are ProxyARPs.

      Questions and suggestions are welcome.  I would really like to get this pair up and running before my old FW finally craps out on me.

      -ct

        cthomas

        While these two firewalls were going up and down as a result of the crashes, I started getting reports that folks couldn't access a website that uses a Windows NLB and resides on vlan230.  There were three separate incidents in which these firewalls were up and running with active CARP VIPs and the website became inaccessible.

        I don't understand it, because the only CARP VIPs I added were on lagg0_vlan3 and lagg1_vlan229.  But I definitely think that the two bouncing firewalls caused the issue: during the last incident, I immediately powered off the two firewalls and the issue went away.

        The resource(s) sitting behind the Barracuda NLBs on the same vlan do not appear to have been affected.

        -ct

        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.