2.5 upgrade broke some, not all, IPSEC
-
Updated to 2.5 on a handful of platforms, including:
pfSense Community under VMware
pfSense Community on bare metal
pfSense+ on Netgate SG-1100
Experiencing VPN trouble on all. Let's take a single instance: IPSEC from pfSense 2.5.0-RELEASE (amd64) on bare metal to pfSense+ 21.02-RELEASE (arm64) on SG-1100.
I have tried the edit - save - stop - start suggestion as well as removing the tunnel entirely from both ends and starting from scratch.
Output of "swanctl --list-conns"
con200000: IKEv1, reauthentication every 25920s, dpd delay 10s
  local: 208.77.nn.nn
  remote: 67.79.nn.nn
  local pre-shared key authentication:
    id: 208.77.nn.nn
  remote pre-shared key authentication:
    id: 67.79.nn.nn
  con0: TUNNEL, rekeying every 3240s, dpd action is hold
    local: 192.168.0.0/24|/0
    remote: 192.168.1.0/24|/0
output of "swanctl --load-all --file /var/etc/ipsec/swanctl.conf --debug 1"
no authorities found, 0 unloaded
no pools found, 0 unloaded
loaded ike secret 'ike-0'
loaded ike secret 'ike-1'
loaded ike secret 'ike-2'
loaded ike secret 'ike-3'
loaded ike secret 'ike-4'
loaded ike secret 'ike-5'
loaded ike secret 'ike-6'
loaded ike secret 'ike-7'
loaded ike secret 'ike-8'
loaded ike secret 'ike-9'
loaded ike secret 'ike-10'
loaded connection 'bypass'
loaded connection 'con300000'
loaded connection 'con400000'
loaded connection 'con600000'
loaded connection 'con700000'
loaded connection 'con8000'
loaded connection 'con5000'
loaded connection 'con1000000'
loaded connection 'con1100000'
loaded connection 'con200000'
successfully loaded 10 connections, 0 unloaded
relevant output of "/var/etc/ipsec/swanctl.conf"
con200000 {
    fragmentation = yes
    unique = replace
    version = 1
    aggressive = no
    proposals = aes128-sha256-modp2048
    dpd_delay = 10s
    dpd_timeout = 60s
    reauth_time = 25920s
    over_time = 2880s
    rand_time = 2880s
    encap = no
    mobike = no
    local_addrs = 208.77.nn.nn
    remote_addrs = 67.79.nn.nn
    pools =
    local {
        id = 208.77.nn.nn
        auth = psk
    }
    remote {
        id = 67.79.nn.nn
        auth = psk
    }
    children {
        con0 {
            dpd_action = trap
            mode = tunnel
            policies = yes
            life_time = 3600s
            rekey_time = 3240s
            rand_time = 360s
            start_action = trap
            local_ts = 192.168.0.0/24
            remote_ts = 192.168.1.0/24
            esp_proposals = aes128gcm128-modp2048,aes128-sha256-modp2048
        }
    }
}
Log entries:
Feb 19 09:40:53 charon 65414 07[NET] <200> received packet: from 67.79.nn.nn[500] to 208.77.nn.nn[500] (180 bytes)
Feb 19 09:40:53 charon 65414 07[ENC] <200> parsed ID_PROT request 0 [ SA V V V V V ]
Feb 19 09:40:53 charon 65414 07[IKE] <200> no IKE config found for 208.77.nn.nn...67.79.nn.nn, sending NO_PROPOSAL_CHOSEN
Feb 19 09:40:53 charon 65414 07[ENC] <200> generating INFORMATIONAL_V1 request 2737378764 [ N(NO_PROP) ]
Feb 19 09:40:53 charon 65414 07[NET] <200> sending packet: from 208.77.nn.nn[500] to 67.79.nn.nn[500] (40 bytes)
As mentioned, we have other examples of tunnels that aren't working after updates, but let's start with this and see if there are any suggestions that work for it and might be applied to others.
-
From the log message it looks like it can't match the tunnel for some reason.
Can you show the same output from the other side as well?
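For anyone scanning a longer log for the same symptom, here is a minimal Python sketch that pulls the address pair out of each "no IKE config found" line. It is not part of pfSense or strongSwan; the function name `unmatched_peers` is made up for illustration, and it assumes the message format matches the charon log entry quoted above.

```python
import re

# Pattern for the charon message logged when an incoming IKE request
# matches no configured connection (format taken from the log above).
NO_CONFIG = re.compile(r"no IKE config found for (\S+?)\.\.\.(\S+?),")


def unmatched_peers(log_text):
    """Return (local, remote) address pairs charon could not match."""
    return [m.groups() for m in NO_CONFIG.finditer(log_text)]


sample = (
    "Feb 19 09:40:53 charon 65414 07[IKE] <200> no IKE config found for "
    "208.77.nn.nn...67.79.nn.nn, sending NO_PROPOSAL_CHOSEN"
)
print(unmatched_peers(sample))
```

The pair it reports can then be compared against local_addrs and remote_addrs in swanctl.conf to see which connection should have matched.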
-
Thanks @jimp . Yes, I agree with the analysis. However, I've been through it several times and I don't find a discrepancy. In case it's just my eyeballs not working, here's the requested output from the other side:
"swanctl --list-conns"
con100000: IKEv1, reauthentication every 25920s, dpd delay 10s
  local: 67.79.nn.nn
  remote: 208.77.nn.nn
  local pre-shared key authentication:
    id: 67.79.nn.nn
  remote pre-shared key authentication:
    id: 208.77.nn.nn
  con0: TUNNEL, rekeying every 3240s, dpd action is hold
    local: 192.168.1.0/24|/0
    remote: 192.168.0.0/24|/0
output of "swanctl --load-all --file /var/etc/ipsec/swanctl.conf --debug 1"
no authorities found, 0 unloaded
no pools found, 0 unloaded
loaded ike secret 'ike-0'
loaded connection 'bypass'
loaded connection 'con100000'
successfully loaded 2 connections, 0 unloaded
relevant output of "/var/etc/ipsec/swanctl.conf"
con100000 {
    fragmentation = yes
    unique = replace
    version = 1
    aggressive = no
    proposals = aes128-sha256-modp2048
    dpd_delay = 10s
    dpd_timeout = 60s
    reauth_time = 25920s
    over_time = 2880s
    rand_time = 2880s
    encap = no
    mobike = no
    local_addrs = 67.79.nn.nn
    remote_addrs = 208.77.nn.nn
    pools =
    local {
        id = 67.79.nn.nn
        auth = psk
    }
    remote {
        id = 208.77.nn.nn
        auth = psk
    }
    children {
        con0 {
            dpd_action = trap
            mode = tunnel
            policies = yes
            life_time = 3600s
            rekey_time = 3240s
            rand_time = 360s
            start_action = trap
            local_ts = 192.168.1.0/24
            remote_ts = 192.168.0.0/24
            esp_proposals = aes128gcm128-modp2048,aes128-sha256-modp2048
        }
    }
}
Log entries:
Feb 19 11:09:24 charon 50154 09[KNL] creating acquire job for policy 67.77.nn.nn/32|/0 === 208.77.nn.nn/32|/0 with reqid {1}
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing ISAKMP_VENDOR task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing ISAKMP_CERT_PRE task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing MAIN_MODE task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing ISAKMP_CERT_POST task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing ISAKMP_NATD task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> queueing QUICK_MODE task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating new tasks
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating ISAKMP_VENDOR task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating ISAKMP_CERT_PRE task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating MAIN_MODE task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating ISAKMP_CERT_POST task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> activating ISAKMP_NATD task
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> sending XAuth vendor ID
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> sending DPD vendor ID
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> sending FRAGMENTATION vendor ID
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> sending NAT-T (RFC 3947) vendor ID
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> sending draft-ietf-ipsec-nat-t-ike-02\n vendor ID
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> initiating Main Mode IKE_SA con100000[367] to 208.77.nn.nn
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> IKE_SA con100000[367] state change: CREATED => CONNECTING
Feb 19 11:09:24 charon 50154 06[CFG] <con100000|367> configured proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_2048
Feb 19 11:09:24 charon 50154 06[ENC] <con100000|367> generating ID_PROT request 0 [ SA V V V V V ]
Feb 19 11:09:24 charon 50154 06[NET] <con100000|367> sending packet: from 67.79.nn.nn[500] to 208.77.nn.nn[500] (180 bytes)
Feb 19 11:09:24 charon 50154 06[NET] <con100000|367> received packet: from 208.77.nn.nn[500] to 67.79.nn.nn[500] (40 bytes)
Feb 19 11:09:24 charon 50154 06[ENC] <con100000|367> parsed INFORMATIONAL_V1 request 2873590687 [ N(NO_PROP) ]
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> received NO_PROPOSAL_CHOSEN error notify
Feb 19 11:09:24 charon 50154 06[IKE] <con100000|367> IKE_SA con100000[367] state change: CONNECTING => DESTROYING
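The by-eye comparison of the two ends can also be done mechanically. The following is a rough Python sketch, not a pfSense or strongSwan tool: the dictionaries are hand-copied from the two swanctl.conf excerpts in this thread, the key names `local_id` and `remote_id` and the function `mirror_mismatches` are invented for illustration, and it assumes the proposal strings must be identical on both ends (a simplification, since the peers only need one proposal in common).

```python
def mirror_mismatches(a, b):
    """List settings on side a that are not mirrored correctly on side b."""
    problems = []
    # These pairs must be swapped between the two ends of the tunnel.
    swapped = [("local_addrs", "remote_addrs"),
               ("local_id", "remote_id"),
               ("local_ts", "remote_ts")]
    for left, right in swapped:
        if a[left] != b[right]:
            problems.append(f"{left}={a[left]} but peer {right}={b[right]}")
        if a[right] != b[left]:
            problems.append(f"{right}={a[right]} but peer {left}={b[left]}")
    # These are treated as needing to be identical (simplifying assumption).
    for key in ("version", "proposals", "esp_proposals"):
        if a[key] != b[key]:
            problems.append(f"{key}: {a[key]} != {b[key]}")
    return problems


common = {
    "version": "1",
    "proposals": "aes128-sha256-modp2048",
    "esp_proposals": "aes128gcm128-modp2048,aes128-sha256-modp2048",
}
# con200000 on the bare-metal box.
side_a = dict(common, local_addrs="208.77.nn.nn", remote_addrs="67.79.nn.nn",
              local_id="208.77.nn.nn", remote_id="67.79.nn.nn",
              local_ts="192.168.0.0/24", remote_ts="192.168.1.0/24")
# con100000 on the SG-1100.
side_b = dict(common, local_addrs="67.79.nn.nn", remote_addrs="208.77.nn.nn",
              local_id="67.79.nn.nn", remote_id="208.77.nn.nn",
              local_ts="192.168.1.0/24", remote_ts="192.168.0.0/24")

print(mirror_mismatches(side_a, side_b))
```

With the values posted here the list comes back empty, which agrees with the observation that the two configurations show no visible discrepancy.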
-
Nothing immediately stands out there either. Can you, temporarily, disable all tunnels on the first box so that only this one is enabled? I'm curious if something else in one of the other tunnels could be conflicting or causing this one to fail.
I'd need to see the full config from that one to know for sure though.
-
I disabled all the other tunnels on the system from VPN > IPsec > Tunnels and clicked the green Apply button.
Now in status, all of the tunnels (9 total) that I disabled are showing "Established" still, while the #10 tunnel (this one in question, and the only one marked as enabled) shows "Disconnected".
Restarting the IPsec service via the GUI has no effect. If I click to stop the service, the GUI page reloads but does not indicate that the service is stopped; i.e., it displays the "stop" icon button again and not the "play" icon as I would expect.
Is there something wrong at a deeper level? I might note that this was previously a 2.4.5_1 system that was upgraded.
LMK if you need the full config on the system. It might take me a bit to obfuscate the relevant parts.
-
You might have to manually stop IPsec and then start it again from Status > Services to ensure the disabled parts are fully deactivated for this kind of test.
There is a known problem with the status page displaying incorrectly; a fix for that is already in (https://redmine.pfsense.org/issues/11435).
-
@jimp The service doesn't stop, even when I do it manually from Status > Services. I click on the stop icon, and it just refreshes to another stop icon. The service never actually stops.
The tunnels are verified to still be up as they're passing traffic (I can reach the private IPs on the other ends).
Re the status page not appearing correctly, I actually have applied the following patches already:
ead6515637a34ce6e170e2d2b0802e4fa1e63a00
57beb9ad8ca11703778fc483c7cba0f6770657ac
c09137ab4726dc492c658c27b6c46e25f0fbb55b
-
Do you have something like Service Watchdog setup which might be restarting it when it shouldn't be?
-
Nope. Nothing like that. It's pretty much a stock setup.
I've been informed that in addition to this IPSEC issue, SIP traffic is not passing. Unrelated items, yes. But both issues came after the update.
My concern is that there are things that have been mangled in the upgrade process, especially considering this box started as a MUCH earlier version of pfSense several years ago. We may have to simply export the config, spin up a fresh install and import the config across.
-
Before doing that you might want to reset your browser cache to make sure it isn't using outdated JS/CSS. Maybe something there is tripping up the service stop/start buttons.
-
@jimp Thanks. Yeah, that wasn't it. I even switched browsers. Something is, I'm afraid, really wrong with this thing.
-
@gtoger Make a config backup and then reinstall from scratch. Then try restoring the config and see if that helps.
-
@hescominsoon It's not what I wanted to do, but I did it.
Did it solve the problem? Nope. Still have a failure to connect this tunnel.
Could it be that we're going between a pfSense CE and a pfSense+ on a Netgate device? Would seem awfully dang strange. But I'm convinced there's a bug here someplace.
-
Hello,
I can report the same problems with my VM and hardware pfSense tunnels.
BR
Martin
-
Try to resave/reapply the Phase 1 parameters for your tunnels; this could be related to https://redmine.pfsense.org/issues/11455
-
This thread is getting out of hand like the previous one. We need to keep each thread for ONE issue only, not for multiple unrelated things that happen to be in IPsec.
See my previous response at https://forum.netgate.com/post/964752
Before reporting any issues, please look at the list of recent IPsec issues and apply fixes/workarounds from there to eliminate known causes.
You can install the System Patches package and then create entries for the following commit IDs to apply the fixes:
ead6515637a34ce6e170e2d2b0802e4fa1e63a00 #11435
57beb9ad8ca11703778fc483c7cba0f6770657ac #11435
10eb04259fd139c62e08df8de877b71fdd0eedc8 #11442
ded7970ba57a99767e08243103e55d8a58edfc35 #11486
afffe759c4fd19fe6b8311196f4b6d5e288ea4fb #11487
2fe5cc52bd881ed26723a81e0eed848fd505fba6 #11488
Please refrain from replying to someone else's thread with a "me too" until there is confirmation that your issues are really the same and not just similar.
I'll split some of these off into their own threads if they don't already have them, but for now, this one is locked.