IPSEC failed to run after 2.4.4 upgrade
-
No. Status is the same. Log errors are the same. Let me run fsck
-
What kind of hardware is this on?
-
Hyper-V. Running fsck -y / in single-user mode didn't help. The hardware is an IBM server bought 5-6 years ago.
-
The only references I can find to the error are from a 5-year-old MIPS bug report in strongSwan about not wiping secure memory as expected.
That would seem to imply that it's having an issue with manipulating memory in some way, which doesn't sound good. Though I can't find any recent reference to say for sure.
Can you maybe try provisioning a new VM to see if the same thing happens there? Or trying on a different Hyper-V host if you have one?
Or snapshot the VM and upgrade to 2.4.5 and see if the problem persists there.
-
I also found this memory issue, and it is probably the problem: the dashboard shows 31% memory usage out of 1 GB without any active connection to the VM.
I can provision a new VM to test, and I can also move the VM to another host.
Finally, I can't find a way to upgrade to 2.4.5. According to the dashboard, the system is up to date (2.4.4). The only available option is to switch to "Latest development snapshots Experimental 2.4.X devel". Any hint on how to upgrade?
-
The development snapshots choice should get you there. 2.4.5 is still under development; it is not released or stable.
-
Neither the upgrade to 2.4.5 nor the relocation of the VM resolved the issue. I'm going to provision a new VM, upload the current configuration, and see what happens. Thanks a lot for your assistance.
-
Hello.
Exact same problem here. IPsec refused to start after the upgrade from 2.4.3-p1 to 2.4.4, with identical error messages. It was working fine just before. I tried the above step (reinstalling strongswan) without success. pfSense is also virtualized here, but on a different platform (VMware 6.5 / Dell PowerEdge R630), so it might not be a virtualization-specific issue. I reverted to the snapshot taken just before the upgrade and everything works fine under 2.4.3-p1. I retried the update with the same result (IPsec not starting), so this is not an upgrade glitch but a reproducible issue. I went back to the 2.4.3-p1 snapshot again, as it is a production firewall. Any idea what else to test/try?
-
It's working OK for me here on ESX 6.7, so it's not likely to be specific to virtualization in general, but maybe something in your environment or configuration.
Can you share the contents of /var/etc/ipsec/strongswan.conf? If there is anything private in there, you can mask/redact it.
There is a way to disable the integrity tests, but I'd rather find out why it's failing first.
The way those tests are described, it's a simple file checksum test, but if that was the case it should be happening for everyone consistently or be flagged by pkg check -s strongswan.
-
To anyone that can still reproduce this:
- Go to VPN > IPsec, Advanced tab.
- Under IPsec Logging Controls set strongSwan Lib to Highest, then Save
- Try to restart IPsec
- Look in Status > System Logs, IPsec tab for a message about why it failed. Alternatively, check clog /var/log/ipsec.log from the shell.
The strongSwan source seems to imply that it could be a file/filesystem issue: the checksum is missing, the file size is wrong, or the checksum doesn't match. It could also be that somehow it can't find the library (maybe run ldconfig in the shell and then try starting it again).
The debug logs will hopefully tell us more.
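To make the three failure modes above concrete (checksum missing, file size wrong, checksum mismatch), here is a minimal Python sketch of that kind of check. It is only an illustration: the function name, the lookup table, and the use of SHA-256 are all assumptions, and strongSwan's real integrity test uses its own checksum data and format.

```python
import hashlib


def integrity_failure(name, data, expected):
    """Return the reason `data` fails the check for `name`, or None if it passes.

    `expected` is a hypothetical table mapping a name to (size, sha256_hex).
    SHA-256 is a stand-in; it is not what strongSwan itself uses.
    """
    if name not in expected:
        return "checksum missing"
    size, digest = expected[name]
    if len(data) != size:
        return "file size wrong"
    if hashlib.sha256(data).hexdigest() != digest:
        return "checksum does not match"
    return None


if __name__ == "__main__":
    blob = b"example library bytes"  # stand-in for the library file contents
    table = {"libexample": (len(blob), hashlib.sha256(blob).hexdigest())}
    print(integrity_failure("libexample", blob, table))
    print(integrity_failure("libother", blob, table))
    print(integrity_failure("libexample", blob + b"!", table))
```

The point of ordering the checks this way is that each failure gets its own message, which is the kind of detail the debug log would need to show to narrow this down.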
-
Here is the strongswan.conf (0_1541782634617_strongswan.conf.txt) from our running 2.4.3 install that fails after the upgrade.
I cannot reproduce it right now, but I will try the steps above out of hours if nobody else has been able to provide the logs before then.
-
One thing I did when upgrading was to refresh the dashboard, because I thought the upgrade process had frozen. It is probably related to the issue.
-
@dtrandov said in IPSEC failed to run after 2.4.4 upgrade:
One thing I did when upgrading was to refresh the dashboard, because I thought the upgrade process had frozen. It is probably related to the issue.
Probably not if the same thing happened after updating to 2.4.5.
Can you try the log changes I posted about a few replies up?
-
Sure, but I can do it on Monday and get back to you with the result.
-
I have opened https://redmine.pfsense.org/issues/9106 to track this, but we can't do anything until we can either reproduce it locally, or get the debug log messages stating exactly what part of the test failed.
After gathering the above info, I do have a hunch as to what might help. Install the System Patches package and then try the following patch:
diff --git a/src/etc/inc/vpn.inc b/src/etc/inc/vpn.inc
index d12eb986c2..c055a04d66 100644
--- a/src/etc/inc/vpn.inc
+++ b/src/etc/inc/vpn.inc
@@ -1556,6 +1556,7 @@ EOD;
 	}
 
 	/* manage process */
+	mwexec("/etc/rc.d/ldconfig start", false);
 	if ($restart === true) {
 		mwexec("/usr/local/sbin/ipsec restart", false);
 	} else {
If that doesn't help, revert the patch.
-
Thanks for opening the issue for tracking.
Retried the update with the same problem. I changed the log level to highest as requested and started IPsec, but it unfortunately did not report more information:
Nov 11 23:56:33 ipsec_starter 19939 ipsec starter stopped
Nov 11 23:56:33 ipsec_starter 19939 charon refused to be started
Nov 11 23:56:33 ipsec_starter 19939 charon has quit: integrity test of libstrongswan failed
Nov 11 23:56:33 ipsec_starter 19637 no known IPsec stack detected, ignoring!
Nov 11 23:56:33 ipsec_starter 19637 no KLIPS IPsec stack detected
Nov 11 23:56:33 ipsec_starter 19637 no netkey IPsec stack detected
Nov 11 23:56:33 ipsec_starter 19637 Starting strongSwan 5.7.1 IPsec [starter]...
I also tried the patch but it did not change anything.
Reverted back to 2.4.3 for now.
Let's see if dtrandov has better results.
-
I made the required changes, but there is no additional info in the log files and IPsec still refuses to start.
Just as a test, I ran pkg check -s strongswan again, and some files are missing:
pkg check -s strongswan
Checking strongswan: 0%
strongswan-5.7.1: missing file /usr/local/man/man1/pki---acert.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---dn.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---gen.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---issue.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---keyid.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---pkcs7.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---print.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---pub.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---req.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---self.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---signcrl.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki---verify.1.gz
strongswan-5.7.1: missing file /usr/local/man/man1/pki.1.gz
strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.conf.5.gz
strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.secrets.5.gz
strongswan-5.7.1: missing file /usr/local/man/man5/strongswan.conf.5.gz
strongswan-5.7.1: missing file /usr/local/man/man5/swanctl.conf.5.gz
strongswan-5.7.1: missing file /usr/local/man/man8/charon-cmd.8.gz
strongswan-5.7.1: missing file /usr/local/man/man8/ipsec.8.gz
strongswan-5.7.1: missing file /usr/local/man/man8/swanctl.8.gz
I'm sure that when the issue first arose, I reinstalled strongswan and no errors appeared.
Also:
cat /var/etc/ipsec/strongswan.conf
# Automatically generated config file - DO NOT MODIFY. Changes will be overwritten.
starter {
	load_warning = no
	config_file = /var/etc/ipsec/ipsec.conf
}
charon {
	# number of worker threads in charon
	threads = 16
	ikesa_table_size = 32
	ikesa_table_segments = 4
	init_limit_half_open = 1000
	install_routes = no
	load_modular = yes
	ignore_acquire_ts = yes
	cisco_unity = no
	syslog {
		identifier = charon
		# log everything under daemon since it ends up in the same place regardless with our syslog.conf
		daemon {
			ike_name = yes
			dmn = 1
			mgr = 1
			ike = 1
			chd = 1
			job = 1
			cfg = 1
			knl = 1
			net = 1
			asn = 1
			enc = 1
			imc = 1
			imv = 1
			pts = 1
			tls = 1
			esp = 1
			lib = 4
		}
		# disable logging under auth so logs aren't duplicated
		auth {
			default = -1
		}
	}
	plugins {
		# Load defaults
		include /var/etc/ipsec/strongswan.d/charon/*.conf
		stroke {
			secrets_file = /var/etc/ipsec/ipsec.secrets
		}
		unity {
			load = no
		}
		eap-radius {
			class_group = yes
			eap_start = no
			accounting = yes
			servers {
				dc1.mydomain.com-radius {
					address = 192.168.111.101
					secret = "asdasd"
					auth_port = 1812
					acct_port = 1813
				}
			}
		}
		attr {
			subnet = 172.16.15.0/24,192.168.111.0/24
			split-include = 172.16.15.0/24,192.168.111.0/24
		}
		xauth-generic {
			script = /etc/inc/ipsec.auth-user.php
			authcfg = dc1.mydomain.com-radius
		}
	}
}
-
The man pages being missing is expected, as we normally will strip out the man pages and other docs from the host itself to save space.
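Since the man page entries are just noise, one way to read that pkg check output is to filter them out and see whether anything else is reported missing. A rough Python sketch follows; the function name is made up for illustration, and it assumes the "missing file <path>" line format shown in the output posted above.

```python
def non_man_missing(pkg_check_output):
    """Return paths reported as missing that are not under a man directory.

    Assumes lines of the form "<pkg>: missing file <path>", as in the
    output posted in this thread. The helper name is hypothetical.
    """
    marker = "missing file "
    missing = []
    for line in pkg_check_output.splitlines():
        idx = line.find(marker)
        if idx == -1:
            continue  # progress lines such as "Checking strongswan: 0%"
        path = line[idx + len(marker):].strip()
        if "/man/" not in path:
            missing.append(path)
    return missing


if __name__ == "__main__":
    sample = (
        "Checking strongswan: 0%\n"
        "strongswan-5.7.1: missing file /usr/local/man/man8/ipsec.8.gz\n"
        "strongswan-5.7.1: missing file /usr/local/man/man8/swanctl.8.gz\n"
    )
    print(non_man_missing(sample))  # [] because only man pages are missing
```

An empty list here would mean that pkg sees nothing wrong with the installed files apart from the stripped docs.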
I was hoping that little patch would help, but at least we know it isn't the library path.
If you can get a host back into the failed state, try going to an ssh or console shell prompt and then run this command:
ipsec stop
ipsec start --debug-all
And then check the console output and the IPsec log.
-
Console output:
ipsec stop:
Stopping strongSwan IPsec failed: starter is not running
ipsec start --debug-all:
/usr/local/etc/strongswan.conf:68: syntax error, unexpected ., expecting : or '{' or '=' [.]
invalid config file '/usr/local/etc/strongswan.conf'
abort initialization due to invalid configuration
Starting strongSwan 5.7.1 IPsec [starter]...
Loading config setup
  uniqueids=yes
Loading conn 'bypasslan'
  authby=never
  auto=route
  leftsubnet=192.168.111.0/24
  rightsubnet=192.168.111.0/24
  type=passthrough
Loading conn 'con-mobile'
  auto=add
  dpdaction=clear
  dpddelay=10s
  dpdtimeout=60s
  eap_identity=%identity
  esp=aes256-sha256,aes192-sha256,aes128-sha256,aes256-sha256,aes192-sha256,aes128-sha256!
  forceencaps=no
  fragmentation=yes
  ike=aes256-sha256-modp1024!
  ikelifetime=28800s
  installpolicy=yes
  keyexchange=ikev2
  left=172.16.15.160
  leftauth=pubkey
  leftcert=/var/etc/ipsec/ipsec.d/certs/cert.crt
  leftid=fqdn:vpn2.mydomain.com
  leftsendcert=always
  leftsubnet=172.16.15.0/24,192.168.111.0/24
  lifetime=3600s
  mobike=yes
  reauth=yes
  rekey=yes
  right=%any
  rightauth=eap-radius
  rightsourceip=192.168.89.0/24
  type=tunnel
kernel appears to lack the native netkey IPsec stack
no netkey IPsec stack detected
kernel appears to lack the KLIPS IPsec stack
no KLIPS IPsec stack detected
no known IPsec stack detected, ignoring!
Line 68 is where dc1.mydomain.com-radius is defined:
dc1.mydomain.com-radius {
	address = 192.168.111.101
	secret = "masked"
	auth_port = 1812
	acct_port = 1813
}
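For anyone else wanting to check whether their own config has the same kind of section name, a simplified scan like the Python sketch below can list section headers that contain a dot, with their line numbers. This is not strongSwan's parser: it assumes one "name {" header per line, and the function and pattern names are invented for the example.

```python
import re

# A section header is taken to be a bare name followed by "{" on the same line.
# Comment lines (starting with "#") and "key = value" lines do not match.
SECTION_RE = re.compile(r"^\s*([^\s#{}=]+)\s*\{")


def dotted_sections(conf_text):
    """Return (line_number, section_name) for section names containing a dot."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = SECTION_RE.match(line)
        if match and "." in match.group(1):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = (
        "servers {\n"
        "\tdc1.mydomain.com-radius {\n"
        "\t\taddress = 192.168.111.101\n"
        "\t}\n"
        "}\n"
    )
    print(dotted_sections(sample))  # [(2, 'dc1.mydomain.com-radius')]
```

Values such as IP addresses contain dots too, which is why the scan only looks at names that are followed by an opening brace.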
-
OK, that is a different condition. It doesn't like a . in the RADIUS server name. I didn't have any set that way, but now that I have set one up, I see the same error. Curious that the error is shown on the console but not in the logs. But now I do see the same integrity test error!
So we're getting closer! Let me find a fix for this; most likely it will involve removing the dots or swapping them for some other character.
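As a rough idea of what "swapping the dots for some other character" could look like, here is a short Python sketch. It is not the actual pfSense fix: the function name, the underscore replacement, and the conservative set of allowed characters are all assumptions made for illustration.

```python
import re


def safe_section_name(name, replacement="_"):
    """Replace every character outside [A-Za-z0-9_-] with `replacement`.

    The allowed set is a deliberately conservative assumption, not a
    statement of what the strongswan.conf parser accepts.
    """
    return re.sub(r"[^A-Za-z0-9_-]", replacement, name)


if __name__ == "__main__":
    print(safe_section_name("dc1.mydomain.com-radius"))  # dc1_mydomain_com-radius
```

Whatever mapping is chosen would have to be applied consistently wherever the section name is generated, otherwise two differently named servers could end up with the same section name.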