High CPU load and packet loss after 2-3 days with HA setup
Hi there. This is my first post in this forum.
I have a setup of two pfSense machines on ESXi, a number of attached networks, and three gateways (two of which are PPPoE).
Since failover didn't work for some reason, I had been running only one machine for quite some time. After the last update (v. 2.4.4) I wanted to try the failover scenario once again and, hurray, it worked!
So I was happy having both machines running again.
The problem: after some time, or due to some as yet unknown trigger or event, the CPU load of the active machine suddenly spikes, which causes packet loss on most interfaces. The gateways then get flagged as down because of the heavy packet loss.
The workaround: the second I shut down the second machine (the passive one), the packet drops stop and everything runs smoothly again.
I have no idea what causes this issue. I double- and triple-checked the setup in pfSense, ESXi, the switch config, etc. The only way I can reproduce the issue is to start both machines and wait a few days, which is not very wise, since this is a company network with about 100 employees attached.
I would be very thankful for any advice on how I can figure out what's causing this issue or how to solve it. I suspect it has something to do with CARP, because a single instance on its own runs flawlessly.