AES-GCM and stalling IPsec
-
Hi all.
I discovered an IPsec issue today when deploying a bunch of SG-2100 boxes (ARM64 CPU) to do IPsec site-to-site to an XG-1537.
If I use AES-GCM for encryption (both 128 and 256 bit), as recommended by Netgate, the SG-2100 boxes stall/become unresponsive after a while if more than one Phase 2 is active in the tunnel. Boxes with only one Phase 2 do not seem to suffer from the issue (so far at least). The XG-1537 does not seem to be affected; it has QAT enabled.
Disabling SafeXcel (HW acceleration) does not mitigate the issue.
But changing the cipher on the tunnel to AES256 (not GCM; I believe it is really AES256-CBC) resolves the issue. I still have a lot of testing to do, but it’s quite evident that the change of cipher resolves it. The SG-2100s are running 22.01, and the XG-1537 is running 21.05. I will try to upgrade them all to 22.05 and see if it changes anything.
Is this a known issue with multiple Phase 2s?
-
-
Have a look:
But I tried GCM with 22.05 and I can't reproduce it at the moment. Looks like GCM is usable now.
-
@nocling said in AES-GCM and stalling IPsec:
Have a look:
But I tried GCM with 22.05 and I can't reproduce it at the moment. Looks like GCM is usable now.
Yeah, I know that Redmine report, but that is not the issue I’m suffering from here. Disabling SafeXcel makes no difference on my setup. But then again, I need to test 22.05 on both ends and try with asynchronous cryptography disabled, and so on.
But right now it definitely manifests itself when using multiple Phase 2 tunnels in a site-to-site. I seem unable to recreate the same issue when using only one Phase 2.
-
I use 2 P2 and async crypto now with AESGCM256, no impact at the moment.
-
@nocling said in AES-GCM and stalling IPsec:
I use 2 P2 and async crypto now with AESGCM256, no impact at the moment.
But you had the issue before 22.05?
-
Yes, with 22.01 my 2100 hung a few times a day before I could find out what was happening. I reproduced it, and so we got the bug report once other 2100s were affected too.
-
@nocling said in AES-GCM and stalling IPsec:
Yes, with 22.01 my 2100 hung a few times a day before I could find out what was happening. I reproduced it, and so we got the bug report once other 2100s were affected too.
Yeah, I saw your posts on the issue, but in your case you could see the mbuf_clusters grow in diagnostics. The mbuf_clusters graph shows no change when I’m suffering from my issue.
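For anyone who wants to compare the two symptoms from a shell instead of the graph, here is a minimal sketch that pulls the cluster counters out of `netstat -m` output and checks whether a series of samples keeps climbing. The line format in the comment is my assumption of what FreeBSD prints, and the names `parse_mbuf_clusters` and `is_growing` are made up for illustration, so check it against your own box first.

```python
import re

# Assumed shape of the relevant `netstat -m` line (verify on your own system):
#   "2542/3012/5554/1000000 mbuf clusters in use (current/cache/total/max)"
CLUSTER_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", re.MULTILINE
)


def parse_mbuf_clusters(netstat_output):
    """Return the mbuf cluster counters as a dict, or None if the line is missing."""
    match = CLUSTER_RE.search(netstat_output)
    if match is None:
        return None
    current, cache, total, maximum = (int(g) for g in match.groups())
    return {"current": current, "cache": cache, "total": total, "max": maximum}


def is_growing(samples, min_increase=1):
    """True if each sample exceeds the previous one by at least min_increase."""
    if len(samples) < 2:
        return False
    return all(b - a >= min_increase for a, b in zip(samples, samples[1:]))
```

Feeding it the "current" value from a few samples taken some minutes apart should show whether the clusters are leaking (steady growth) or flat, as they are in my case.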
-