Unstable OpenVPN
-
Hi all,
we have been using pfSense at our 2 main offices for more than a year, successfully for the most part, but we are struggling with an OpenVPN problem which seems to have been around for years (although apparently intermittent and hard to replicate).

Our setup is: 2 CARP pfSenses at each location.
Main HQ has 2 WANs (different ISP, VLAN and interface).
Secondary HQ has 1 WAN only.

Between the two offices there are 2 OpenVPN connections, with a client and a server on each side.
On both sides, the connections do not set any remote network in the OpenVPN configuration, to avoid static routing trouble when the second link comes up.
Instead we are using gateway groups attached to the relevant rules to allow traffic to use both links (when available).
Everything works fine and all seems good, at least initially.

The real problem comes when, for some reason, connectivity is lost and has to be restored. One link works as expected; the second does not, and our tests show what is (in our opinion) clearly buggy behavior.
On both sides, the second link's gateway is reported as down with 100% packet loss.
The client OpenVPN status shows 'waiting', while the server reports OpenVPN correctly connected and exchanging traffic with the client.

Manually pinging the counterpart does not work from either side. However, the state tables of the two firewalls show an impossible situation: one side reports traffic back and forth (while still being unable to ping the other side, which is part of the bug), while the other side has no record at all of such a ping.
Meanwhile, this second firewall apparently gets no response to its own attempts to ping the first, which instead reports that state correctly.

So why do we not think it's our fault and that our configuration is broken?
Simple: if you reboot one end (sometimes both), everything works as expected without any issue. Both links are up and the traffic gets balanced over the two.

To be noted:
- The two configurations (SERVER+CLIENT) are identical in each direction (clearly not sharing the same VPN network ;)
- The failing link is always the same (server at Secondary HQ, client at Main HQ).
- Manually stopping/starting or restarting the client and server OpenVPN instances, in any order, has no effect; a full reboot is required.
- When this problem occurs, some specific devices have trouble with some services, and again this is resolved just by rebooting pfSense.
Thanks for the time spent reading this and the time you will invest in helping resolve this really really annoying issue!