CARP master breaking the Internet
I have two pfSense 1.2.3 routers that I set up a few weeks ago, after this problem happened for the first time. Now it has happened a second time.
The two pfSense routers run happily in VMware ESXi 4.0 until the master breaks; whenever the master is online, the Internet stops working.
If I disable CARP on the master (I have to do it quickly at bootup, before pfSense becomes too slow), the secondary takes over and the Internet works. As soon as I re-enable CARP on the master ("Enable CARP" on the status page), the Internet stops working.
My network has a WAN with 3 static IPs set up in CARP, plus a LAN and a wireless network, each with a CARP gateway address. I run the captive portal on the master router for the wireless network.
I created a whole new install of pfSense and restored the config to it, and it is doing the same thing.
Your vSwitches probably aren't configured to allow multiple MACs. See #4 here.
It looks like the Net.ReversePathFwdCheckPromisc setting was added to the instructions after I originally set up the system. I changed that and all seems back to normal.
OK, I spoke too soon; that did not solve the problem.
So I have the two pfSense routers running on ESXi with CARP.
This is not a new setup: it was working for over a year, then the primary crashed. When the primary is put online, the Internet stops working after a few seconds.
If I have a command prompt open on a computer behind the setup pinging google.com, the pings do not fail, but I am unable to connect to a website in a new browser window. Also, when the primary stops working, its web interface is unresponsive until CARP is disabled on the router.
I have not found anything helpful in the logs, and the CPU on the router never goes above a few percent while this is happening.
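For anyone trying to reproduce the "ping keeps working but new connections fail" symptom, a small check like the one below might help separate ICMP reachability from the ability to open a new TCP connection. This is only a rough sketch: the function name `tcp_reachable` and the choice of port 80 are my own assumptions, not something from the setup described above.

```python
import socket


def tcp_reachable(host, port=80, timeout=3.0):
    """Return True if a brand-new TCP connection to host:port can be opened.

    Hypothetical helper: a running ping only shows that ICMP still passes,
    while this forces a fresh TCP handshake, much like opening a new
    browser window does.
    """
    try:
        # create_connection resolves the name and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

Running `tcp_reachable("google.com")` from a machine behind the routers while the ping is still succeeding would show whether only new connections are affected.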
OK, here is how I resolved it.
First, I tried creating a new router from the 1.2.3 VM and restoring the configuration from the original router. This router showed the same behavior as the failed router.
Second, I copied the running secondary router and made a few adjustments to the config so that it became the primary, and it worked. It has now been running for 12+ hours with no problems.
I would guess that I had some config issues that I missed. User error again!
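Since the fix came down to the working secondary's config versus the broken primary's, one way to hunt for the missed setting might be to diff the two exported config.xml backups field by field. The sketch below is a generic XML comparison, not a pfSense tool; the helper names `flatten` and `diff_configs` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Map every leaf element's path (with sibling index) to its text."""
    result = {}

    def walk(node, path):
        children = list(node)
        if not children:
            result[path] = (node.text or "").strip()
            return
        seen = {}  # counts repeated sibling tags so paths stay unique
        for child in children:
            idx = seen.get(child.tag, 0)
            seen[child.tag] = idx + 1
            walk(child, f"{path}/{child.tag}[{idx}]")

    root = ET.fromstring(xml_text)
    walk(root, root.tag)
    return result


def diff_configs(a_text, b_text):
    """Return {path: (value_in_a, value_in_b)} for every differing leaf."""
    a, b = flatten(a_text), flatten(b_text)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(set(a) | set(b))
        if a.get(path) != b.get(path)
    }
```

Reading both backup files into strings and passing them to `diff_configs` gives a list of every setting that differs. Differences that are expected between a primary and a secondary (interface addresses, sync settings, and so on) can be ignored, and whatever is left is a candidate for the config issue.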