Dual pfsense CARP Multi-WAN problems…

  • So I am in the midst of setting up a dual pfSense CARP environment for the first time, which will replace my current single pfSense setup. While I have followed the pfSense Docs Howto with generally good results, my particular setup is more complicated than what the Howto covers, so it took some educated guesses and some trial and error before I finally got it 95% working.  Just about everything appears to sync properly when doing test failovers in my test environment.  The remaining 5% has got me stumped, however.

    My current pfSense box has been working successfully as a multi-WAN failover for quite some time now, so when I started creating the CARP setup, I took a config backup from my current setup and restored it to each node of the CARP pair.  This gave me my multi-WAN, my VPN configs (it acts as a PPTP VPN server on one WAN only, and has an IPsec site-to-site VPN on the other), custom NAT mappings, etc.

    My remaining 5% problem is that when I simulate a failure of the CH WAN in my test environment, it never fails over to the BR WAN.  The gateway monitor on the dashboard of both pfSense boxes correctly shows the CH WAN down, but I cannot get it to fail over to the BR WAN.  It just sits there with a downed connection, using the same config as my current setup.  And yet, from the pfSense Diagnostics menu, a ping of the test destination succeeds while the CH WAN is down.  I've checked my Firewall & NAT rules, and I've got some pretty wide open rules in there that should allow all traffic from the LAN through both WANs.  So I'm a bit stumped.  Anyone have any ideas?

    I've taken a screenshot of a Visio worksheet that shows both my current setup, as well as my test CARP setup, and attached them below to give a better illustration of what I've got.  I've partially masked my IP so that it can't be abused, but there should be enough to show how it's all laid out.

    Thanks in advance for your help!
    ![pfsense CARP Test Environment.jpg](/public/imported_attachments/1/pfsense CARP Test Environment.jpg)
    ![pfSense Current Environment.jpg](/public/imported_attachments/1/pfSense Current Environment.jpg)

  • OK, so new/revised symptom.  It turns out that the failover IS occurring, but at a ridiculously slow rate.  To the tune of 10 minutes.  I haven't the foggiest why it's taking so long.  Anyone got any ideas for that?  My CARP VIPs are all set to Base/Skew 1/0 on the master and Base/Skew 1/100 on the slave, so it shouldn't be taking more than a few seconds…
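
For reference, here is a rough sketch of what those base/skew values work out to. It assumes the commonly documented CARP advertisement interval of `advbase + advskew/256` seconds, and it assumes (from my reading of the FreeBSD implementation, so treat it as unverified for any given pfSense release) that a backup declares the master dead after `3 * advbase + advskew/256` seconds of silence. The function names are made up for illustration. Note that these timers only cover failover between the two CARP nodes; switching from one WAN gateway to another is handled by gateway monitoring, which has its own timing.

```python
def carp_advert_interval(advbase: int, advskew: int) -> float:
    """Seconds between CARP advertisements: advbase + advskew/256."""
    return advbase + advskew / 256


def carp_master_down_timeout(advbase: int, advskew: int) -> float:
    """Assumed backup takeover delay: 3 * advbase + advskew/256 seconds."""
    return 3 * advbase + advskew / 256


if __name__ == "__main__":
    # Values from the post: master 1/0, slave 1/100.
    for role, base, skew in [("master", 1, 0), ("slave", 1, 100)]:
        interval = carp_advert_interval(base, skew)
        timeout = carp_master_down_timeout(base, skew)
        print(f"{role}: advertises every {interval:.3f}s, "
              f"would take over after {timeout:.3f}s of silence")
```

Under those assumptions the slave should take over in a little over three seconds, which is consistent with the expectation that node failover takes "a few seconds" rather than minutes.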

  • Well, as an update, it turns out my problem wasn't really a problem, per se, it was my method of testing that was skewing my results.  I was trying a non-stop ping of an external source, and killing the connections.  This apparently doesn't work well as a test, because if I left the ping running, it would never recover, but if I killed the ping, and refreshed the network adapter on the test box, it switched over very quickly.

    Anyway, this CARP cluster is now in production, and has already gone through some failures on the multi-WAN that showcased its redundancy capabilities.

  • Actually, now I'm noticing another weird issue with this setup.  On pfsense02, I'm finding that from the outside, I cannot ping 71.x.x.20, while I can ping 71.x.x.19 on pfsense01 just fine.  I checked the firewall rules on pfsense02, and verified that I have a rule that allows ICMP from any to any, same as on pfsense01.  The rules on these two boxes are identical.  Also, when I go into Diagnostics -> Ping, and ping google.com via the CH interface (not the VIP for that interface), I get replies, so the connection does indeed appear to be working.

    I am at a loss.  Does anyone have any idea why it wouldn't be pingable?  I'm also bringing this up because I'm seeing a lot of packet loss in the gateway monitoring for the CH interface on pfsense02, but not as much (though the logs say there is some) on pfsense01.

    I could really use your help…  Thanks in advance!

  • I could really use some help here, this issue is still unresolved, and I'm at a loss as to how to troubleshoot it…

  • Hi mcampbell,

    I'm planning on implementing the same thing, multiple WAN with CARP.
    Is your problem solved, or are you still having some problems?
    Could you kindly share some information on how you solved this issue?
    Hope to hear from you soon.



  • Nope, never solved it.  No one has ever seemed to have any interest in lending a hand to this problem.  The problem continues to crop up at random intervals spanning from once a week to once every couple of months.  Haven't found any rhyme or reason to its sudden switch over, and I don't see anything obvious in the logs.

    My workaround is to go into Status -> CARP (failover) on the primary, disable CARP, and then re-enable it.  This forces CARP to revert back to the primary for MASTER status.  It's unfortunately not perfectly seamless: there is a second or two of dropped packets while the interfaces switch back to the primary box, so it may be best to do it during off hours.

    If you ever hear of any solution to this, please share!  :)
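
Since nothing obvious shows up in the logs, one way to at least pin down when the switchover happens is to record the CARP state at regular intervals. Below is a minimal sketch that parses `carp:` status lines of the form `ifconfig` prints on FreeBSD (`carp: MASTER vhid 1 advbase 1 advskew 0`). The sample text, its addresses, and the function names are all made up for illustration; on a real box the input would be captured `ifconfig` output, and the exact layout may differ between pfSense versions.

```python
import re

# Matches lines such as: "carp: MASTER vhid 1 advbase 1 advskew 0"
CARP_LINE = re.compile(
    r"carp:\s+(MASTER|BACKUP|INIT)\s+vhid\s+(\d+)\s+advbase\s+(\d+)\s+advskew\s+(\d+)"
)


def carp_states(ifconfig_output: str) -> dict:
    """Map each VHID found in the text to its reported CARP state."""
    return {int(m.group(2)): m.group(1) for m in CARP_LINE.finditer(ifconfig_output)}


def unexpected_vhids(states: dict, expected: str = "MASTER") -> list:
    """Return the VHIDs whose state differs from what this node should hold."""
    return sorted(vhid for vhid, state in states.items() if state != expected)


# Hypothetical sample, not taken from the poster's firewalls.
SAMPLE = """\
carp0: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
        inet 192.0.2.1 netmask 0xffffff00
        carp: MASTER vhid 1 advbase 1 advskew 0
carp1: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
        inet 198.51.100.1 netmask 0xffffff00
        carp: BACKUP vhid 2 advbase 1 advskew 0
"""

if __name__ == "__main__":
    states = carp_states(SAMPLE)
    print("states:", states)
    print("not MASTER on this node:", unexpected_vhids(states))
```

Run periodically on the primary with a timestamp added to each line of output, something like this would show exactly when a VIP stops being MASTER, which can then be lined up against the system and gateway logs.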

  • It's not necessarily that there is no interest in helping you; it's much more likely that nobody who has read your problem so far has a clue.  I help here all the time, yet I have zero experience with dual-node pfSense setups with CARP, failover, etc.  You only posted this a few hours ago.  Give it some time for others to read and comment.  Do some forum searching and Googling while you wait.

  • I apologize if I sounded impatient.  But, if you look at the timestamp of the original post, this thread is 10 months old, with not a single reply before network.novice today.  A few times, I would post updates in an effort to get as much info in the hands of those who might help me (not to mention sending my post back to the top), but I never heard back.  After a while, I gave up trying to elicit a response.

    I can appreciate that this is an unusual setup, and the people on this board who have this exact setup are probably few and far between.  But I have done plenty of googling in the meantime, and I have found others with this problem without the multi-WAN, but never found any solutions. The closest I got was the workaround I mentioned in my last post. So I believe that this is more of a general CARP issue, rather than a problem with my specific combo of multi-WAN and CARP.  I would hope we have a few more general CARP experts here than people who've got my exact setup.  Maybe this bit of information might bring them out of the woodwork?

  • Most people don't go scrolling through months of backposts, so unless someone saw it during the week it was posted or via a forum search, it gets buried.

    I wish I could do more to help you with this but I don't have the experience.  If you're doing this on behalf of your company then you might want to consider purchasing support.  It's fast and reasonably priced.  I've done it and have received fast, critical help from both JimP and ChrisB.  If you're a one-man gang then I hope you get a helpful reply.

  • Hi guys - you have exactly the problem I have! And you've found the same fix. Forcing failover / failback works as a work around.
    I'm using the latest 2.1.5 with the same results.

    Here's a "sort of" solution too: don't use the main CARP VIP for your services; use another IP (an alias?) instead. Those don't have the same issue, even when the problem occurs on the main IP.

    I think these articles are related: