CARP master breaking the Internet
-
I have two pfSense 1.2.3 routers, set up a few weeks ago after this problem happened for the first time. Now it has happened a second time.
Both run happily in VMware ESXi 4.0 until the master breaks; whenever it is online, the Internet stops working.
If I disable CARP on the master (this has to be done quickly at boot, before pfSense becomes too slow to use), the secondary takes over and the Internet works. As soon as I click "Enable CARP" on the master's status page, the Internet stops working again.
My network has a WAN with 3 static IPs set up in CARP, plus a LAN and a wireless network that each use a CARP gateway address. I run captive portal on the master router for the wireless network.
I created a whole new install of pfSense and restored the config to it, and it is doing the same thing.
Any thoughts?
-
Your vswitches probably aren't configured to allow multiple MACs. See #4 here.
http://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting
-
Looks like the setting Net.ReversePathFwdCheckPromisc was added to the instructions after I originally set up the system. I changed that and all seems back to normal.
Thanks!
-
OK, I spoke too soon; that did not solve the problem.
So I have the two pfSense routers running on ESXi with CARP.
This is not a new setup: it was working for over a year, then the primary crashed. When the primary is put online, the Internet stops working after a few seconds.
If I have a command prompt open on a computer behind the setup pinging google.com, the ping does not fail, but I am unable to connect to a website in a new browser window. Also, when the primary stops working, the web interface is unresponsive until CARP is disabled on the router.
I have not found anything in the logs that looks helpful, and the CPU on the router never goes above a few percent while this is happening.
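To make the "ping works but new connections fail" symptom easier to reproduce, something like the following could be run from a machine behind the routers while the primary is brought online. It is only a rough sketch: the helper name tcp_connect_ok and the 3 second timeout are my own choices, not anything from pfSense.

```python
import socket


def tcp_connect_ok(host, port, timeout=3.0):
    """Return True if a brand-new TCP connection to host:port succeeds.

    Name resolution failures, refusals and timeouts all count as failure,
    since each of them would also stop a browser from loading a page.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling tcp_connect_ok("google.com", 80) in a loop next to the running ping would show whether only new TCP connections break when CARP is enabled on the primary.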
-
OK, here is how I resolved it.
First, I tried creating a new router from the 1.2.3 VM and restoring the configuration from the original router. This router showed the same behavior as the failed router.
Second, I copied the running secondary router and made a few adjustments to the config so that it became the primary, and it worked. It has now been running for 12+ hours with no problems.
I would guess that I had some config issues that I missed. User error again!
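For anyone who wants to track down which config difference was responsible, comparing the config backup from the failing primary with the one from the working copy is probably the quickest route. Below is a minimal sketch using Python's standard difflib; the function names and the file names are made up for illustration, and it does a plain line-by-line comparison rather than anything XML-aware.

```python
import difflib


def diff_configs(old_text, new_text,
                 old_name="failed-primary.xml", new_name="working-primary.xml"):
    """Return unified-diff lines between two config exports (plain text compare)."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,   # labels only; hypothetical file names
        tofile=new_name,
        lineterm="",         # input lines carry no newline, so add none
        n=1,                 # one line of context around each change
    ))


def diff_config_files(old_path, new_path):
    """Read two saved config backups from disk and diff them."""
    with open(old_path) as f_old, open(new_path) as f_new:
        return diff_configs(f_old.read(), f_new.read(), old_path, new_path)
```

Printing each line returned by diff_config_files for the two backups lists the settings that differ, with "-" marking the failing config and "+" the working one.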