More than one IPSec tunnel phase1 is fine, but adding another phase1 prevents an existing tunnel from re-establishing a connection
-
I have 5 different tunnels set up with IPsec in host-to-host config, all of which run stably and without obvious problems.
When I add a new tunnel phase 1 (con10), all the other phase 1s stay connected, but as soon as I drop the con5 connection and try to re-establish it, it keeps attempting to connect and never succeeds. I can drop any other tunnel and it will immediately reconnect on the first try, but the one that was added last before the new tunnel does not connect again.
If I disable the phase 2 connections and also the phase 1, then I can enable the tunnel in the config and all is well. It's when I enable the phase 1 that the problem occurs with the con5 connection.
This is weird and I'm at a loss as to why it happens. I'm not even at the stage where I actually establish a connection with the far site on this link yet!
How can this be?
-
I have attached ipsec.log
It records what happens when I do the following:
- con10's status is disabled.
- con5's status is enabled and connected.
- I enable con10, and con5 stays connected.
- I then disconnect con5. It immediately attempts to reconnect, but fails, and just shows "connecting" in the UI IPsec status.
- I then disable con10 again and con5 connects immediately. (The same sequence can be driven from a shell with swanctl, as sketched below.)
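For reference, the drop-and-reconnect cycle above can be exercised from the pfSense shell with the swanctl CLI. This is only a sketch; it assumes the connection names match those in the generated /var/etc/ipsec/swanctl.conf:

    # show currently established IKE and child SAs
    swanctl --list-sas

    # drop the con5 IKE SA (equivalent to "Disconnect" in the UI)
    swanctl --terminate --ike con5

    # try to bring the con5 child SA back up; with con10 enabled this
    # is where it sits in "connecting" and never completes
    swanctl --initiate --child con5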
BTW: Where is a disabled IPsec tunnel's config stored? Even a grep across the pfSense filesystem is unable to locate it. When I enable the tunnel it's added to /var/etc/ipsec/swanctl.conf, but from where?
The config of both con5 and con10 are below:
con5 {
    # P1 (ikeid 5): Client5
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes256-sha256-modp2048
    dpd_delay = 10s
    rekey_time = 25920s
    reauth_time = 0s
    over_time = 2880s
    rand_time = 2880s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 196.250.xxx.yyy
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
    children {
        con5 {
            # P2 (reqid 3): RC01 network
            mode = tunnel
            policies = yes
            life_time = 3600s
            rekey_time = 3240s
            rand_time = 360s
            start_action = trap
            remote_ts = 192.168.0.0/24
            local_ts = 192.168.152.0/29
            esp_proposals = aes256-sha256-modp2048
            dpd_action = trap
        }
    }
}
con10 {
    # P1 (ikeid 10): Client10
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes256gcm128-sha256-modp2048,aes256-sha256-modp2048
    dpd_delay = 10s
    rekey_time = 25920s
    reauth_time = 0s
    over_time = 2880s
    rand_time = 2880s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 165.165.xxx.yyy
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
}
-
-
Has anyone here ever experienced something like this? It seems @Gblenn is experiencing the same (or very similar) behaviour as posted here.
@jimp, is it possible that a recent upgrade introduced some bug?
-
Interestingly: with con10 enabled I can disconnect any of the other connections and they will reconnect immediately. Only when I disconnect con5 is it not able to reconnect.
The other connections are configured to use AES_GCM_16 (128), whereas con5 uses AES_CBC (256). Could that prevent this link from re-establishing itself?
-
I've figured out what is happening, but this seems like a bug that gets triggered by something we don't know yet.
Looking at the /var/etc/ipsec/swanctl.conf, I note the following behaviour:
- There are multiple active connections configured; they are listed in order, with con10 being the last.
- con10 connects fine in this scenario.
- If I enable con9 (in the pfSense UI), it is inserted into the swanctl.conf file after the con10 configuration.
- Once con10 is no longer the last connection configured in swanctl.conf, it doesn't connect anymore.
- I can work around the issue by enabling con9, establishing the connection, and then disabling it again. The connection stays up, and then I can connect con10.
This is not normal, but I don't know how to resolve this.
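As a quick way to verify the ordering claim above, the generated file can be inspected directly from the shell. A sketch, assuming the stock pfSense path:

    # print each connection block (and its same-named child block)
    # with its line number, in file order
    grep -nE 'con[0-9]+ \{' /var/etc/ipsec/swanctl.conf

    # compare against what strongSwan has actually loaded
    swanctl --list-conns | grep -E '^con'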
Below is the swanctl.conf file without con9 active (I have cut the secrets and public IPs):
# This file is automatically generated. Do not edit
connections {
    bypass {
        remote_addrs = 127.0.0.1
        children {
            bypasslan {
                local_ts = 192.168.131.0/24
                remote_ts = 192.168.131.0/24
                mode = pass
                start_action = trap
            }
        }
    }
    con3 {
        # P1 (ikeid 3): customer1 - link1
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048,aes256gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 77760s
        reauth_time = 0s
        over_time = 8640s
        rand_time = 8640s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 105.27.aaa.bbb
        local {
            id = 197.214.119.130
            auth = psk
        }
        remote {
            id = 192.168.0.2
            auth = psk
        }
        children {
            con3 {
                # P2 (reqid 9): M/Monit to Office LAN
                # P2 (reqid 7): Unify to Office LAN
                # P2 (reqid 6): GTS1 to Office LAN
                mode = tunnel
                policies = yes
                life_time = 43196s
                rekey_time = 38876s
                rand_time = 4320s
                start_action = trap
                remote_ts = 172.16.3.0/24,172.16.3.0/24,172.16.3.0/24
                local_ts = 192.168.131.191/32,192.168.131.177,192.168.131.174
                esp_proposals = aes256gcm128-modp2048,aes256gcm96-modp2048,aes256gcm64-modp2048,aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
                dpd_action = trap
            }
        }
    }
    con4 {
        # P1 (ikeid 4): customer1 - link2
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048,aes256gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 77760s
        reauth_time = 0s
        over_time = 8640s
        rand_time = 8640s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 41.164.aaa.bbb
        local {
            id = 197.214.xxx.yyy
            auth = psk
        }
        remote {
            id = 41.164.fff.ccc
            auth = psk
        }
        children {
            con4 {
                # P2 (reqid 10): M/Monit to Office LAN backup
                # P2 (reqid 8): Unify to Office LAN
                # P2 (reqid 5): GTS1 to Office LAN
                mode = tunnel
                policies = yes
                life_time = 43196s
                rekey_time = 38876s
                rand_time = 4320s
                start_action = trap
                remote_ts = 172.16.3.0/24,172.16.3.0/24,172.16.3.0/24
                local_ts = 192.168.131.191/32,192.168.131.177/32,192.168.131.174/32
                esp_proposals = aes256gcm128-modp2048,aes256gcm96-modp2048,aes256gcm64-modp2048,aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
                dpd_action = trap
            }
        }
    }
    con10 {
        # P1 (ikeid 10): Greenway Farm
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 544320s
        reauth_time = 0s
        over_time = 60480s
        rand_time = 60480s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 165.165.bbb.ddd
        local {
            id = 197.214.xxx.yyy
            auth = psk
        }
        remote {
            id = %any
            auth = psk
        }
        children {
            con10 {
                # P2 (reqid 17): Greenway server
                # P2 (reqid 16): Greenway server
                # P2 (reqid 15): Greenway server
                # P2 (reqid 14): Greenway server
                # P2 (reqid 11): Greenway server
                mode = tunnel
                policies = yes
                life_time = 604800s
                rekey_time = 544320s
                rand_time = 60480s
                start_action = trap
                remote_ts = 10.10.3.0/24,10.10.4.0/24,192.168.3.0/24,10.10.2.0/24,192.168.1.0/24
                local_ts = 192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24
                esp_proposals = aes128gcm128,aes128gcm96,aes128gcm64
                dpd_action = trap
            }
        }
    }
}
secrets {
    ike-0 {
        secret = <cut>
        id-0 = %any
        id-1 = 192.168.0.2
    }
    ike-1 {
        secret = <cut>
        id-0 = %any
        id-1 = 41.164.ccc.ddd
    }
    ike-2 {
        secret = <cut>
        id-0 = %any
        id-1 = %any
    }
}
-
Here is the swanctl.conf file with con9 enabled:
# This file is automatically generated. Do not edit
connections {
    bypass {
        remote_addrs = 127.0.0.1
        children {
            bypasslan {
                local_ts = 192.168.131.0/24
                remote_ts = 192.168.131.0/24
                mode = pass
                start_action = trap
            }
        }
    }
    con3 {
        # P1 (ikeid 3): customer1 - link1
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048,aes256gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 77760s
        reauth_time = 0s
        over_time = 8640s
        rand_time = 8640s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 105.27.aaa.bbb
        local {
            id = 197.214.xxx.yyy
            auth = psk
        }
        remote {
            id = 192.168.0.2
            auth = psk
        }
        children {
            con3 {
                # P2 (reqid 9): M/Monit to Office LAN
                # P2 (reqid 7): Unify to Office LAN
                # P2 (reqid 6): GTS1 to Office LAN
                mode = tunnel
                policies = yes
                life_time = 43196s
                rekey_time = 38876s
                rand_time = 4320s
                start_action = trap
                remote_ts = 172.16.3.0/24,172.16.3.0/24,172.16.3.0/24
                local_ts = 192.168.131.191/32,192.168.131.177,192.168.131.174
                esp_proposals = aes256gcm128-modp2048,aes256gcm96-modp2048,aes256gcm64-modp2048,aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
                dpd_action = trap
            }
        }
    }
    con4 {
        # P1 (ikeid 4): customer1 - link2
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048,aes256gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 77760s
        reauth_time = 0s
        over_time = 8640s
        rand_time = 8640s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 41.164.aaa.bbb
        local {
            id = 197.214.xxx.yyy
            auth = psk
        }
        remote {
            id = 41.164.aaa.bbb
            auth = psk
        }
        children {
            con4 {
                # P2 (reqid 10): M/Monit to Office LAN backup
                # P2 (reqid 8): Unify to Office LAN
                # P2 (reqid 5): GTS1 to Office LAN
                mode = tunnel
                policies = yes
                life_time = 43196s
                rekey_time = 38876s
                rand_time = 4320s
                start_action = trap
                remote_ts = 172.16.3.0/24,172.16.3.0/24,172.16.3.0/24
                local_ts = 192.168.131.191/32,192.168.131.177/32,192.168.131.174/32
                esp_proposals = aes256gcm128-modp2048,aes256gcm96-modp2048,aes256gcm64-modp2048,aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
                dpd_action = trap
            }
        }
    }
    con10 {
        # P1 (ikeid 10): Greenway Farm
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 544320s
        reauth_time = 0s
        over_time = 60480s
        rand_time = 60480s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 165.165.bbb.ddd
        local {
            id = 197.214.xxx.yyy
            auth = psk
        }
        remote {
            id = %any
            auth = psk
        }
        children {
            con10 {
                # P2 (reqid 17): Greenway server
                # P2 (reqid 16): Greenway server
                # P2 (reqid 15): Greenway server
                # P2 (reqid 14): Greenway server
                # P2 (reqid 11): Greenway server
                mode = tunnel
                policies = yes
                life_time = 604800s
                rekey_time = 544320s
                rand_time = 60480s
                start_action = trap
                remote_ts = 10.10.3.0/24,10.10.4.0/24,192.168.3.0/24,10.10.2.0/24,192.168.1.0/24
                local_ts = 192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24
                esp_proposals = aes128gcm128,aes128gcm96,aes128gcm64
                dpd_action = trap
            }
        }
    }
    con9 {
        # P1 (ikeid 9): Reliance Compost
        fragmentation = yes
        unique = replace
        version = 2
        proposals = aes128gcm128-sha256-modp2048,aes128-sha256-modp2048
        dpd_delay = 10s
        rekey_time = 544320s
        reauth_time = 0s
        over_time = 60480s
        rand_time = 60480s
        encap = no
        mobike = no
        local_addrs = 197.214.xxx.yyy
        remote_addrs = 196.250.eee.fff
        local {
            id = 197.214.xxx.yyy
            # P2 (reqid 8): Unify to Office LAN
            # P2 (reqid 8): Unify to Office LAN
            # P2 (reqid 5): GTS1 to Office LAN
            mode = tunnel
            policies = yes
            life_time = 43196s
            rekey_time = 38876s
            rand_time = 4320s
            start_action = trap
            remote_ts = 172.16.3.0/24,172.16.3.0/24,172.16.3.0/24
            local_ts = 192.168.131.191/32,192.168.131.177/32,192.168.131.174/32
            esp_proposals = aes256gcm128-modp2048,aes256gcm96-modp2048,aes256gcm64-modp2048,aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
            dpd_action = trap
        }
    }
}
con10 {
    # P1 (ikeid 10): Greenway Farm
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes128gcm128-sha256-modp2048
    dpd_delay = 10s
    rekey_time = 544320s
    reauth_time = 0s
    over_time = 60480s
    rand_time = 60480s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 165.165.bbb.ddd
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
    children {
        con10 {
            # P2 (reqid 17): Greenway server
            # P2 (reqid 16): Greenway server
            # P2 (reqid 15): Greenway server
            # P2 (reqid 14): Greenway server
            # P2 (reqid 11): Greenway server
            mode = tunnel
            policies = yes
            life_time = 604800s
            rekey_time = 544320s
            rand_time = 60480s
            start_action = trap
            remote_ts = 10.10.3.0/24,10.10.4.0/24,192.168.3.0/24,10.10.2.0/24,192.168.1.0/24
            local_ts = 192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24
            esp_proposals = aes128gcm128,aes128gcm96,aes128gcm64
            dpd_action = trap
        }
    }
}
con9 {
    # P1 (ikeid 9): Reliance Compost
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes128gcm128-sha256-modp2048,aes128-sha256-modp2048
    dpd_delay = 10s
    rekey_time = 544320s
    reauth_time = 0s
    over_time = 60480s
    rand_time = 60480s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 196.250.eee.fff
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
    children {
        con9 {
            # P2 (reqid 13): RC Subnet
            mode = tunnel
            policies = yes
            life_time = 3600s
            rekey_time = 3240s
            rand_time = 360s
            start_action = trap
            remote_ts = 192.168.0.0/24
            local_ts = 192.168.152.0/29
            esp_proposals = aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
            dpd_action = trap
        }
    }
}
}
secrets {
    ike-0 {
        secret = <cut>
        id-0 = %any
        id-1 = 192.168.0.2
    }
    ike-1 {
        secret = <cut>
        id-0 = %any
        id-1 = 41.164.68.170
    }
    ike-2 {
        secret = <cut>
        id-0 = %any
        id-1 = %any
    }
    ike-3 {
        secret = <cut>
        id-0 = %any
        id-1 = %any
    }
}
So I see now that my synopsis was not quite correct. The con9 connection now appears twice in the file. I suspect this causes problems?
This seems to be a pfSense bug rather than something in strongSwan, then?
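For anyone checking for the same symptom, here is a sketch of a quick duplicate check on the generated file (stock path assumed):

    # count occurrences of each connection name; anything above 2
    # (one P1 block plus one same-named child block) is suspect
    grep -oE 'con[0-9]+ \{' /var/etc/ipsec/swanctl.conf | sort | uniq -c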
-
After some enabling and disabling, I now have only one config per configured connection in swanctl.conf, but the behaviour has not changed. I don't know why, since I didn't really make any significant changes, but that's how it is now.
The connection for con10 cannot be established when the config for con9 is present (after the config for con10 in the file). If con10 is the last, then it can be established.
con9 is site-to-site with a RouterOS device.
con10 is site-to-site with a Sophos device.
The other 2 (con3 and con4) are to a Fortigate and can be disconnected and reconnected on demand.
@jimp, should I open a bug report for this? I don't want to make noise in Redmine if it's a false alarm, so please advise if you can.
-
When something like this happens it tends to be one of two things:
- There is something in the two configurations which conflicts so only one of the two is unique from strongSwan's point of view.
- Something internal to strongSwan has gotten confused and you need to stop it and start it again (not restart, as that functions differently).
In either of those cases it's not a bug per se, so unless you can narrow it down further it likely doesn't warrant a Redmine issue since there wouldn't be anything actionable on our part.
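For completeness, a full stop followed by a start can be done from the pfSense shell roughly like this (a sketch; the svc playback script ships with pfSense, but verify the service name on your version):

    # stop strongSwan completely, then start it fresh
    pfSsh.php playback svc stop ipsec
    pfSsh.php playback svc start ipsec

    # confirm the daemon is back and has reloaded its configuration
    swanctl --stats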
-
UPDATE: After some more experimentation it turns out that the position of the connection in the config file is not relevant. If both con9 and con10 are enabled, then con9 cannot connect. When con10 is disabled (i.e., removed from the config file), con9 can connect.
When con10 alone is in the config file, it connects successfully, but if both con9 and con10 are present, then neither connects. This is driving me a little crazy... especially since I cannot see from the logs what the cause may be.
-
@jimp The stop-then-start is something I will try after hours tonight to see if it makes a difference. Up to this point I have only used restart. I did notice that restart does not drop established connections.
Regarding the connection configs and a possible conflict: I have only used the UI to create these connection parameters and didn't hack around in the swanctl.conf file. Where are the "disabled" connections stored? I have not been able to find them, even with a grep search.
-
The configuration data is all stored in config.xml, which on the installation is held in /conf/config.xml.
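As an illustration (a sketch of the layout with hypothetical values, element names as used by pfSense): a disabled phase 1 keeps its full definition in config.xml with a disabled flag and is simply skipped when /var/etc/ipsec/swanctl.conf is regenerated, which is why grepping the generated file finds nothing.

    <ipsec>
        <phase1>
            <ikeid>10</ikeid>
            <!-- the disabled element is present only while the tunnel is disabled -->
            <disabled></disabled>
            <iketype>ikev2</iketype>
            <interface>wan</interface>
            <remote-gateway>165.165.bbb.ddd</remote-gateway>
            <descr>Greenway Farm</descr>
            <!-- ... remaining phase 1 settings ... -->
        </phase1>
    </ipsec>
-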
We stopped and started the IPsec service, but the behaviour is the same.
If only one config is in the file, the service starts it fine:
- con9 present alone: con9 connects automatically (far side is RouterOS).
- con10 present alone: con10 connects if the far side is restarted manually, but not automatically (far side is Sophos).
If both con9 and con10 are present, neither connection starts.
Below are the two configurations from the swanctl.conf file. There's no overlap or conflict that I can see, but maybe I'm missing something?
con9 {
    # P1 (ikeid 9): RC
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes128gcm128-sha256-modp2048,aes128-sha256-modp2048
    dpd_delay = 10s
    rekey_time = 544320s
    reauth_time = 0s
    over_time = 60480s
    rand_time = 60480s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 196.250.abc.def
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
    children {
        con9 {
            # P2 (reqid 13): RC Subnet
            mode = tunnel
            policies = yes
            life_time = 3600s
            rekey_time = 3240s
            rand_time = 360s
            start_action = trap
            remote_ts = 192.168.0.0/24
            local_ts = 192.168.152.0/29
            esp_proposals = aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
            dpd_action = trap
        }
    }
}
con10 {
    # P1 (ikeid 10): G
    fragmentation = yes
    unique = replace
    version = 2
    proposals = aes128gcm128-sha256-modp2048
    rekey_time = 544320s
    reauth_time = 0s
    over_time = 60480s
    rand_time = 60480s
    encap = no
    mobike = no
    local_addrs = 197.214.xxx.yyy
    remote_addrs = 165.165.cde.fgh
    local {
        id = 197.214.xxx.yyy
        auth = psk
    }
    remote {
        id = %any
        auth = psk
    }
    children {
        con10 {
            # P2 (reqid 17): G server
            # P2 (reqid 16): G server
            # P2 (reqid 15): G server
            # P2 (reqid 14): G server
            # P2 (reqid 11): G server
            mode = tunnel
            policies = yes
            life_time = 604800s
            rekey_time = 544320s
            rand_time = 60480s
            start_action = trap
            remote_ts = 10.10.3.0/24,10.10.4.0/24,192.168.3.0/24,10.10.2.0/24,192.168.1.0/24
            local_ts = 192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24,192.168.153.0/24
            esp_proposals = aes128gcm128-modp2048,aes128gcm96-modp2048,aes128gcm64-modp2048
            dpd_action = clear
        }
    }
}
-
Your problem is almost certainly because both of those have a remote ID of %any and are using pre-shared keys.
strongSwan references pre-shared keys by associating them with a remote identifier. With both of those tunnels enabled, it would have both keys listed for the %any ID, so it's unpredictable which (if either) would ever match.
Fix the remote IDs and both will probably work.
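To make the ambiguity concrete: the secrets section of the generated file quoted earlier ends up with two wildcard entries, so strongSwan cannot tell which PSK belongs to which peer:

    secrets {
        # both con9 and con10 leave the remote ID as %any, so both PSKs
        # are registered under the same wildcard identity
        ike-2 {
            secret = <cut>
            id-0 = %any
            id-1 = %any
        }
        ike-3 {
            secret = <cut>
            id-0 = %any
            id-1 = %any
        }
    }

Giving each phase 1 a distinct remote identifier (for example the peer's public address) gives each PSK an unambiguous lookup key; a sketch of what the generated secrets would then look like:

    secrets {
        ike-2 {
            # con9 peer
            secret = <cut>
            id-0 = %any
            id-1 = 196.250.eee.fff
        }
        ike-3 {
            # con10 peer
            secret = <cut>
            id-0 = %any
            id-1 = 165.165.bbb.ddd
        }
    }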
-
@jimp nice.
I have been following this out of curiosity. I have been a bit worried about the stability of IPsec on the platform based on my experience so far, so this has been an interesting post to follow. I would never have thought about the remote ID being a problem. Makes sense. -
@jimp, you're quite correct about the peer identifier. (No surprise there!)
Thanks for pointing that out. Although we still have to test the one link after hours tonight, the "G" link is already working with both configs enabled.
-
@michmoor said in More than one IPSec tunnel phase1 is fine, but adding another phase1 prevents an existing tunnel from re-establishing a connection:
@jimp nice.
I have been following this out of curiosity. I have been a bit worried about the stability of IPsec on the platform based on my experience so far, so this has been an interesting post to follow. I would never have thought about the remote ID being a problem. Makes sense.

Indeed an interesting finding, and definitely something to investigate to see if it resolves my issue...