Large amount of data over IPSec breaks network/NAT
-
Hi,
We have a strange and quite serious problem with our IPsec setup that impairs our whole network: NAT seems to stop working altogether during heavy data transfers over the IPsec tunnel. We have two servers on each side of a tunnel that sync about 10 GB of data each night. All had been well for several months until last week, when we started experiencing strange connectivity problems, which we have since narrowed down to this sync procedure. The connection breaks a few minutes into the transfer, and NAT for all of our internal networks goes down with it. The only way to restore connectivity is to reboot the whole firewall. The logs show nothing, and the switches etc. are working fine and show no trace of errors. Traffic between internal networks works fine, as does traffic from the firewall itself to the internet, but traffic from internal networks to the internet stops at the firewall.
Our setup is a CARP-redundant pfSense 2.0.2 installation.
-
…about ~10GB each night. All has been well for several months until last week when we started experiencing strange connectivity problems and the problems have been narrowed down to this sync procedure.
So you've made no changes whatsoever in software or hardware, yet it just stopped working after several months?
The connections also cannot be corrected any other way than rebooting the whole firewall. The logs show nothing and switches etc. are working fine and show no traces of error. Traffic between internal networks works fine as well as traffic from the firewall to the internet, traffic from internal networks to the internet stops at the firewall.
Have you done any packet captures (with tcpdump) on LAN and WAN when the problem occurs?
So normal routing (without NAT) between e.g. LAN and OPT1 continues to work, but NATting stops?
What happens if you disable pf (pfctl -d, re-enable with pfctl -e)?
-
That sounds a lot like what would happen if your sync process started going nuts with huge numbers of connections and maxed out the state table. Check your RRD States graph against your states limit.
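
If you'd rather check from the shell than from the graph, something along these lines should work. It is only a rough sketch: it assumes pfctl -si prints a "current entries" line and pfctl -sm prints a "states hard limit" line, which is what I'd expect on a pf-based box, and state_usage is just a made-up helper name. Run it under /bin/sh, since root's default shell on pfSense may not be sh.

```shell
#!/bin/sh
# state_usage is a hypothetical helper, not part of pf or pfSense.
# $1 = text output of `pfctl -si`, $2 = text output of `pfctl -sm`.
state_usage() {
    # Current number of states: third field of the "current entries" line.
    current=$(printf '%s\n' "$1" | awk '/current entries/ {print $3; exit}')
    # Configured maximum: last field of the "states hard limit" line.
    limit=$(printf '%s\n' "$2" | awk '$1 == "states" {print $NF; exit}')
    if [ -z "$current" ] || [ -z "$limit" ] || [ "$limit" -eq 0 ]; then
        echo "could not parse pfctl output" >&2
        return 1
    fi
    # Integer percentage of the state table in use.
    echo "states: $current/$limit ($(( current * 100 / limit ))%)"
}

# Usage on the firewall (as root), ideally repeated while the sync is running:
#   state_usage "$(pfctl -si)" "$(pfctl -sm)"
```

If the percentage climbs towards 100 a few minutes into the transfer, that would fit the symptoms: new connections from the internal networks can't get a state, while existing traffic keeps flowing.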