IPSEC Dropouts



  • Hi,

    I have a pair of pfSense servers that have multiple IPSEC tunnels established to various devices. For testing, I upgraded the standby server to 2.3 to check if everything works. The only issue I have come across is that, a short time after boot, all of my IPSEC tunnels drop out and stop working.

    There are 13 tunnels in total: 9 are IKEv2 and 4 are IKEv1. The tunnels terminate on various devices (about half on pfSense servers, 4 on Juniper devices, 4 on VyOS servers). 9 of the tunnels are transport mode and 4 are tunnel mode. pfSense is set to initiate and respond on 9 of the tunnels; on the other 4 it is initiate only. pfSense is running in an ESX VM. All tunnels use a PSK for authentication.

    For testing, I took a snapshot before upgrading to 2.3. After upgrading to 2.3, with no settings changed, I experience the issues with IPSEC. If I revert my snapshot so it is back to 2.2.6, everything works normally again.

    The strange thing is that on the remote devices I cannot see any reason for it to stop working; the logs don't show anything unusual.

    If I reboot the pfSense server, the tunnels all come up and work for about 30 minutes to 2 hours, then they drop out separately at different times. After they drop out I can see from the IPSEC status pages that the IPSEC connections are up but there are no security associations at all. When I check the remote ends, some of them show that there is a security association and others show that there are none. If I restart the IPSEC service in pfSense I still get the same result: no security associations, but all of the tunnels are up. The only way I can get it to work for a little while is to reboot the server again.

    I enabled raw logging for everything in the IPSEC logging area, but I am not sure what I should be looking at in those.

    Has anyone else run into issues with their tunnels on 2.3? If it is helpful I can attach some of the debug logs I have…



  • Haven't seen any issues here, but strongswan 5.4.0 release just came out yesterday and was added. Not much different from what was there before. But if you aren't yet up to the most recent available snapshot, that'd be the first thing to do.



  • Hi cmb,

    I first upgraded to 2.3 just over a week ago, and I have been doing the updates every day since then to see if there is any change. I noticed the update for strongswan today, updated the server earlier and rebooted just in case that was required, but I still have the same issue.

    Thanks



  • What's in the IPsec logs at the time it stops working? Sounds like your P2s expire and aren't rekeyed for some reason.



  • After some more testing on the latest 2.3 snapshot, it looks like some connections don't come up at all, even straight after boot.

    As an example for one of the peers I see these logs over and over again:

    
     11[IKE] <40> 203.170.xx.xx is initiating an IKE_SA
     08[IKE] <con14|39>initiating IKE_SA con14[39] to 203.170.xx.xx
    

    I will turn up the logging and disable all tunnels except one so I can get some better details about the issue.



  • If you can get me remote access, or do a screen share, I'd like to take a look. Can PM me here to arrange, or /msg me if you're on IRC (cmb on Freenode).



  • With debug logs on I got some more useful info. I just rebooted the server to make sure everything starts from scratch.

    First, only some of the tunnels came up. Of those tunnels that came up only some of them got their child security associations. This is what is logged for the transport mode security associations that didn't come up (I can see this same thing in the log for each):

    Mar 28 11:57:36 gateway2 charon: 15[IKE] traffic selectors 222.127.xx.xx/32|/0 222.127.xx.xx/32|/0 === 193.239.xx.xx/32|/0 193.239.xx.xx/32|/0 inacceptable
    Mar 28 11:57:36 gateway2 charon: 15[IKE] <con12|19>traffic selectors 222.127.xx.xx/32|/0 222.127.xx.xx/32|/0 === 193.239.xx.xx/32|/0 193.239.xx.xx/32|/0 inacceptable
    Mar 28 11:57:36 gateway2 charon: 15[IKE] failed to establish CHILD_SA, keeping IKE_SA
    Mar 28 11:57:36 gateway2 charon: 15[IKE] <con12|19>failed to establish CHILD_SA, keeping IKE_SA

    For the tunnel mode connections, if I grep for the IP of the other end I can't see anything like this in the log. I can only see the send/receive packet entries, so I might need to disable all the other tunnels to be able to get the logs just for that specific host…
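
    One way to separate the log per tunnel without disabling anything might be to group the charon lines by the `<conN|id>` tag that already appears in them. Below is a rough Python sketch, assuming the lines look like the excerpts above. The function name `group_by_connection` and the regular expression are only an illustration written for this log layout, not something pfSense or strongSwan provides.

```python
import re
from collections import defaultdict

# Matches the charon line layout seen in the excerpts above, e.g.
# "Mar 28 11:57:36 gateway2 charon: 15[IKE] <con12|19>failed to establish CHILD_SA, keeping IKE_SA"
# The "<conN|id>" tag is optional because some lines are logged without it.
LINE_RE = re.compile(
    r"charon: (?P<thread>\d+)\[(?P<subsys>[A-Z]+)\] "
    r"(?:<(?P<conn>[^|>]+)\|(?P<sa>\d+)>\s*)?(?P<msg>.*)$"
)


def group_by_connection(lines):
    """Return {connection name: [messages]} for lines carrying a <conN|id> tag.

    Hypothetical helper: lines without a connection tag are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line.rstrip("\n"))
        if match and match.group("conn"):
            groups[match.group("conn")].append(match.group("msg"))
    return dict(groups)


if __name__ == "__main__":
    import sys

    # Usage (file name is an assumption): python group_log.py < ipsec.log
    for conn, messages in sorted(group_by_connection(sys.stdin).items()):
        print(conn)
        for msg in messages:
            print("   ", msg)
```

    Lines without a tag are ignored, so the untagged duplicates in the excerpt above do not get counted twice. Entries that only carry the peer IP (such as the send/receive packet lines) would still need a separate grep.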

