2.2.6 IPSEC ReKey and Hardware Hang
I hope someone can help. I have recently upgraded my pfSense cluster from 2.0.5 to 2.2.6, which included a hardware swap (same vendor) as the old boxes were out of warranty. Our platform has around 35 IPsec tunnels terminating on it, typically from Draytek hardware.
We had very few issues with IPsec on the old platform, and tunnels stayed connected without problems. Since migrating to 2.2.6 we have noticed a lot more tunnel restarts. The logs tend to show a re-key operation happening just before the tunnel restarts.
I've tried disabling (and subsequently re-enabling) DPD. I've also set the tunnels at the pfSense end to responder only (as the Drayteks are set to dial out), and I've configured the pfSense end to disable re-key, but I'm still seeing the issue.
The re-keys don't seem to happen after any specific interval; I've seen some after 1 hour and some after 36 hours. Looking at the connected time in the pfSense IPsec Status page, most of them seem to happen at the same moment across tunnels. I have also noticed that sometimes the pfSense tunnel uptime suggests a reconnection while the Draytek end reports the tunnel as having stayed up. Typically the tunnels pass data with no issues once they re-establish.
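For anyone wanting to check the "same time" pattern rather than eyeball it, here is a rough Python sketch. It assumes you have copied each tunnel's connected time (in seconds) off the status page by hand; the function name, the tunnel names and the 60-second window are all made up for illustration and nothing here talks to pfSense itself.

```python
from collections import defaultdict


def cluster_restarts(uptimes, now_epoch, window_seconds=60):
    """Group tunnels whose establish times fall in the same time window.

    uptimes: dict of tunnel name -> connected time in seconds (hypothetical input,
             typed in manually from the status page).
    now_epoch: the time the uptimes were read, as Unix epoch seconds.
    Returns a list of groups (each a sorted list of names) with more than one tunnel.
    """
    buckets = defaultdict(list)
    for name, uptime_seconds in uptimes.items():
        established = now_epoch - uptime_seconds      # when the tunnel last came up
        buckets[established // window_seconds].append(name)
    # Only windows containing several tunnels hint at a shared trigger.
    return [sorted(names) for _, names in sorted(buckets.items()) if len(names) > 1]


if __name__ == "__main__":
    sample = {"site-a": 3600, "site-b": 3610, "site-c": 129600}  # made-up values
    print(cluster_restarts(sample, now_epoch=1_000_000))
```

Fixed windows are crude: two tunnels that restarted a few seconds apart can land in adjacent windows, so it is worth trying a couple of window sizes before reading much into the result.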
I've got two 2.2.6 clusters in two different DCs and have moved some tunnels from one to the other so I can change some settings. I notice in other posts that "Prefer older SAs" is a suggested fix for the re-key issue. The closest option I can find in 2.2.6 is "Configure Unique IDs as", which I have set to Never (hopefully I've understood the description) to see if this helps.
Any thoughts would be appreciated.
We also experienced a hardware hang on our primary router this morning. It didn't offer me the chance to upload a crash dump, but I am concerned it is the 2.2.5(6) crash issue others have reported. Fingers crossed it doesn't happen again! Any thoughts on what else I can check would be appreciated. I have disabled AES-NI on this box in case it was part of the issue.
I had a similar issue with connections to an ASA. What fixed it for me was checking the disable rekey box in the Phase 1 settings. I also had issues with Unique IDs at some point, so I configure my boxes with "Configure Unique IDs as:" set to No under Advanced IPsec settings.