IPSEC failed to run after 2.4.4 upgrade



  • After upgrading to 2.4.4, IPsec failed to start. The log shows this:

    Nov 7 08:50:15	ipsec_starter	92574	Starting strongSwan 5.7.1 IPsec [starter]...
    Nov 7 08:50:15	ipsec_starter	92574	no netkey IPsec stack detected
    Nov 7 08:50:15	ipsec_starter	92574	no KLIPS IPsec stack detected
    Nov 7 08:50:15	ipsec_starter	92574	no known IPsec stack detected, ignoring!
    Nov 7 08:50:15	ipsec_starter	92821	charon has quit: integrity test of libstrongswan failed
    Nov 7 08:50:15	ipsec_starter	92821	charon refused to be started
    Nov 7 08:50:15	ipsec_starter	92821	ipsec starter stopped
    

    On the dashboard, the IPsec/Overview status shows awaiting connections.
    Before the upgrade everything was just fine. No configuration changes at all.

    Any help will be appreciated.


  • Rebel Alliance Developer Netgate

    The first four lines are normal, but that fifth line might be the issue. It's saying that the library may be corrupted. It's not related to the three "no ... IPsec stack detected" messages before it.

    Does it work after a reboot?

    Does pkg check -s strongswan show any errors?

    You could force a reinstall of just that with pkg upgrade -f strongswan



  • I rebooted twice and the result is the same. Tomorrow I will check the package and reinstall if needed. Thanks.



  • Well, no luck so far.

    pkg check -s strongswan:

    Checking strongswan:
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---acert.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---dn.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---gen.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---issue.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---keyid.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---pkcs7.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---print.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---pub.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---req.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---self.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---signcrl.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---verify.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.secrets.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/strongswan.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/swanctl.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/charon-cmd.8.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/ipsec.8.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/swanctl.8.gz
    Checking strongswan... done
    

    Then I reinstalled. It showed no errors, but warned me to remove /usr/local/etc/ipsec.conf and /usr/local/etc/strongswan.conf. A reboot didn't help. The IPsec log contains the same errors. Should I rename these files?

    I renamed the files, but after a reboot the system created them again.


  • Rebel Alliance Developer Netgate

    The man pages being missing is fine. The warnings about /usr/local/etc/ipsec.conf and strongswan.conf are expected as well, that's all normal.

    Try this next:

    pkg delete -f strongswan
    pkg clean -ay
    pkg-static install -fy strongswan
    

    You might also try selecting the reboot option at the ssh or console menu and then choosing the option to force a disk check.
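
    To tell the expected man-page omissions apart from anything else in the pkg check -s output, a small filter along these lines might help. It is only a sketch: the function name non_man_missing is made up for illustration, and it assumes the output format shown earlier in this thread.

```shell
# Hypothetical helper: reads `pkg check -s` output on stdin and drops
#   - "missing file" lines under /usr/local/man/ (described above as fine),
#   - the "Checking ..." progress lines,
#   - blank lines.
# Whatever is left over is worth a closer look.
non_man_missing() {
    grep -v -e 'missing file /usr/local/man/' \
            -e '^Checking ' \
            -e '^[[:space:]]*$' || true
}

# Example (on the firewall):
#   pkg check -s strongswan 2>&1 | non_man_missing
```

    If this prints nothing, the only complaints were about man pages.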



  • The pkg commands went well. I rebooted via ssh and fsck gave this:

    ** /dev/ufsid/58e37092fef9beb3 (NO WRITE)
    
    USE JOURNAL? no
    
    ** Skipping journal, falling through to full fsck
    
    SETTING DIRTY FLAG IN READ_ONLY MODE
    
    UNEXPECTED SOFT UPDATE INCONSISTENCY
    ** Last Mounted on /
    ** Root file system
    ** Phase 1 - Check Blocks and Sizes
    ** Phase 2 - Check Pathnames
    ** Phase 3 - Check Connectivity
    ** Phase 4 - Check Reference Counts
    UNREF FILE I=1043331  OWNER=root MODE=100666
    SIZE=0 MTIME=Nov  8 15:41 2018
    CLEAR? no
    
    ** Phase 5 - Check Cyl groups
    SUMMARY BLK COUNT(S) WRONG IN SUPERBLK
    SALVAGE? no
    
    18226 files, 198717 used, 7413296 free (2128 frags, 926396 blocks, 0.0% fragmentation)
    

  • Rebel Alliance Developer Netgate

    Did you run that manually? If you did, you need to run fsck -y / a few more times, until it doesn't find or fix any problems.

    If you ran the automatic disk check then it should have done 5 runs of it which should hopefully have been sufficient.

    Any change in behavior?
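
    For the manual case, repeating the check could be wrapped in a loop like the sketch below. The function name fsck_until_clean is hypothetical, and it assumes that a run which changed something prints the "FILE SYSTEM WAS MODIFIED" marker. It cannot tell whether problems remain that fsck was unable to fix, so the output still needs to be read.

```shell
# Hypothetical helper: run the given command up to $1 times, stopping as soon
# as a run's output no longer contains the "FILE SYSTEM WAS MODIFIED" marker.
# Returns 0 if a run made no changes, 1 if every allowed run modified something.
fsck_until_clean() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        # fsck may exit non-zero after making repairs; keep going regardless.
        out=$("$@" 2>&1) || true
        printf '%s\n' "$out"
        case $out in
            *"FILE SYSTEM WAS MODIFIED"*) run=$((run + 1)) ;;
            *) return 0 ;;
        esac
    done
    return 1
}

# Example (from single user mode):
#   fsck_until_clean 5 fsck -y /
```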



  • No. Status is the same. Log errors are the same. Let me run fsck


  • Rebel Alliance Developer Netgate

    What kind of hardware is this on?



  • Hyper-V. Running fsck -y / in single user mode didn't help. The hardware is an IBM server bought 5-6 years ago.


  • Rebel Alliance Developer Netgate

    The only references I can find to the error were from a 5 year old MIPS bug report in strongSwan about not wiping secure memory as expected.

    That would seem to imply that it's having an issue with manipulating memory in some way, which doesn't sound good. Though I can't find any recent reference to say for sure.

    Can you maybe try provisioning a new VM to see if the same thing happens there? Or trying on a different Hyper-V host if you have one?

    Or snapshot the VM and upgrade to 2.4.5 and see if the problem persists there.



  • I also found this memory issue, and it is probably the problem: the dashboard shows 31% memory usage out of 1G, without any active connection to the VM.

    I can provision a new VM to test, and I can also move the VM to another host.
    Finally, I can't find a way to upgrade to 2.4.5. According to the dashboard the system is up to date (2.4.4). The only available option is to switch to Latest development snapshots Experimental 2.4.X devel. Any hint on how to upgrade?


  • Rebel Alliance Developer Netgate

    The development snapshots choice should get you there, 2.4.5 is under development, not released or stable.



  • Neither the upgrade to 2.4.5 nor relocating the VM resolved the issue. I'm going to provision a new VM, upload the current configuration and see what happens. Thanks a lot for your assistance.



  • Hello.
    Exact same problem here. IPsec refused to start after the upgrade from 2.4.3-p1 to 2.4.4, with identical error messages. It was working fine just before. I tried the above step (reinstalling strongswan) without success. pfSense is also virtualized here, but on a different platform (VMware 6.5 / Dell PowerEdge R630), so it might not be a virtualization-specific issue. I reverted to a snapshot taken just before the upgrade and everything works fine under 2.4.3-p1. I retried the update with the same result (IPsec not starting), so this is not an upgrade glitch but a reproducible issue. I went back to the 2.4.3-p1 snapshot again, as it is a production firewall. Any idea what else to test or try?


  • Rebel Alliance Developer Netgate

    It's working OK for me here on ESX 6.7, so it's not likely to be specific to virtualization in general, but maybe something in your environment or configuration.

    Can you share the contents of /var/etc/ipsec/strongswan.conf? If there is anything private in there, you can mask/redact it.

    There is a way to disable the integrity tests but I'd rather find out why it's failing first.

    The way those tests are described it's a simple file checksum test, but if that was the case it should be happening for everyone consistently or flagged by pkg check -s strongswan.


  • Rebel Alliance Developer Netgate

    To anyone that can still reproduce this:

    • Go to VPN > IPsec, Advanced tab.
    • Under IPsec Logging Controls set strongSwan Lib to Highest, then Save
    • Try to restart IPsec
    • Look in Status > System Logs, IPsec tab for a message about why it failed. Alternately, run clog /var/log/ipsec.log from the shell.

    The strongSwan source seems to imply that it could be a file/filesystem issue. The checksum is missing, the file size is wrong, or the checksum doesn't match. It could also be that somehow it can't find the library (Maybe run ldconfig in the shell and then try starting it again).

    The debug logs will hopefully tell us more.
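
    When reading the log from the shell, it may help to hide the starter lines that were described earlier in this thread as normal. This is just a sketch, and the function name ipsec_log_interesting is made up for illustration:

```shell
# Hypothetical helper: reads IPsec log lines on stdin and drops the starter
# messages that are expected on every start, leaving the lines that might
# explain a failure.
ipsec_log_interesting() {
    grep -v -e 'Starting strongSwan' \
            -e 'no netkey IPsec stack detected' \
            -e 'no KLIPS IPsec stack detected' \
            -e 'no known IPsec stack detected' || true
}

# Example (on the firewall):
#   clog /var/log/ipsec.log | ipsec_log_interesting
```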



  • Here is the strongswan.conf (0_1541782634617_strongswan.conf.txt) from our running 2.4.3 system, which fails after the upgrade.
    I cannot reproduce it right now, but I will try the above off hours if nobody has been able to provide the logs before then.



  • One thing I did when upgrading was to refresh the dashboard, because I thought that the upgrade process had frozen. Probably related to the issue.


  • Rebel Alliance Developer Netgate

    @dtrandov said in IPSEC failed to run after 2.4.4 upgrade:

    One thing I did when upgrading was to refresh the dashboard, because I thought that the upgrade process had frozen. Probably related to the issue.

    Probably not if the same thing happened after updating to 2.4.5.

    Can you try the log changes I posted about a few replies up?



  • Sure, but I can only do it on Monday. I'll get back with the result.


  • Rebel Alliance Developer Netgate

    I have opened https://redmine.pfsense.org/issues/9106 to track this, but we can't do anything until we can either reproduce it locally, or get the debug log messages stating exactly what part of the test failed.

    After gathering the above info, I do have a hunch as to what might help. Install the system patches package and then try the following patch:

    diff --git a/src/etc/inc/vpn.inc b/src/etc/inc/vpn.inc
    index d12eb986c2..c055a04d66 100644
    --- a/src/etc/inc/vpn.inc
    +++ b/src/etc/inc/vpn.inc
    @@ -1556,6 +1556,7 @@ EOD;
     	}
     
     	/* manage process */
    +	mwexec("/etc/rc.d/ldconfig start", false);
     	if ($restart === true) {
     		mwexec("/usr/local/sbin/ipsec restart", false);
     	} else {
    

    If that doesn't help, revert the patch.



  • Thanks for opening the issue for tracking.
    Retried the update with the same problem. I changed the log level to highest as requested and started IPsec but it unfortunately did not report more information:

    Nov 11 23:56:33 	ipsec_starter 	19939 	ipsec starter stopped
    Nov 11 23:56:33 	ipsec_starter 	19939 	charon refused to be started
    Nov 11 23:56:33 	ipsec_starter 	19939 	charon has quit: integrity test of libstrongswan failed
    Nov 11 23:56:33 	ipsec_starter 	19637 	no known IPsec stack detected, ignoring!
    Nov 11 23:56:33 	ipsec_starter 	19637 	no KLIPS IPsec stack detected
    Nov 11 23:56:33 	ipsec_starter 	19637 	no netkey IPsec stack detected
    Nov 11 23:56:33 	ipsec_starter 	19637 	Starting strongSwan 5.7.1 IPsec [starter]...
    

    I also tried the patch but it did not change anything.
    Reverted back to 2.4.3 for now.
    Let's see if dtrandov has better results.



  • I made the required changes, but there is no additional info in the log files and IPsec still refuses to start.
    Just as a test I ran pkg check -s strongswan again, and some files are missing:
    pkg check -s strongswan

    Checking strongswan:   0%
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---acert.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---dn.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---gen.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---issue.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---keyid.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---pkcs7.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---print.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---pub.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---req.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---self.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---signcrl.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki---verify.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man1/pki.1.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/ipsec.secrets.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/strongswan.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man5/swanctl.conf.5.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/charon-cmd.8.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/ipsec.8.gz
    strongswan-5.7.1: missing file /usr/local/man/man8/swanctl.8.gz
    

    I'm sure that when the issue arose, I reinstalled strongswan and no errors appeared.

    Also:

    cat /var/etc/ipsec/strongswan.conf
    
    # Automatically generated config file - DO NOT MODIFY. Changes will be overwritten.
    starter {
           load_warning = no
           config_file = /var/etc/ipsec/ipsec.conf
    }
    
    charon {
    # number of worker threads in charon
           threads = 16
           ikesa_table_size = 32
           ikesa_table_segments = 4
           init_limit_half_open = 1000
           install_routes = no
           load_modular = yes
           ignore_acquire_ts = yes
    
    
           cisco_unity = no
    
    
    
           syslog {
                   identifier = charon
                   # log everything under daemon since it ends up in the same place regardless with our syslog.conf
                   daemon {
                           ike_name = yes
                           dmn = 1
                           mgr = 1
                           ike = 1
                           chd = 1
                           job = 1
                           cfg = 1
                           knl = 1
                           net = 1
                           asn = 1
                           enc = 1
                           imc = 1
                           imv = 1
                           pts = 1
                           tls = 1
                           esp = 1
                           lib = 4
    
                   }
                   # disable logging under auth so logs aren't duplicated
                   auth {
                           default = -1
                   }
           }
    
           plugins {
                   # Load defaults
                   include /var/etc/ipsec/strongswan.d/charon/*.conf
    
                   stroke {
                           secrets_file = /var/etc/ipsec/ipsec.secrets
                   }
    
                   unity {
                           load = no
                   }
                   eap-radius {
                           class_group = yes
                           eap_start = no
                           accounting = yes
                           servers {
                                   dc1.mydomain.com-radius {
                                           address = 192.168.111.101
                                           secret = "asdasd"
                                           auth_port = 1812
                                           acct_port = 1813
                                   }
    
                           }
                   }
                   attr {
                           subnet = 172.16.15.0/24,192.168.111.0/24
                           split-include = 172.16.15.0/24,192.168.111.0/24
                   }
                   xauth-generic {
                           script = /etc/inc/ipsec.auth-user.php
                           authcfg = dc1.mydomain.com-radius
                   }
    
           }
    }
    

  • Rebel Alliance Developer Netgate

    The man pages being missing is expected, as we normally will strip out the man pages and other docs from the host itself to save space.

    I was hoping that little patch would help, but at least we know it isn't the library path.

    If you can get a host back into the failed state, try going to an ssh or console shell prompt and then run these commands:

    ipsec stop
    ipsec start --debug-all
    

    And then check the console output and IPsec log



  • Console output:

    ipsec stop:

    Stopping strongSwan IPsec failed: starter is not running
    

    ipsec start --debug-all:

    /usr/local/etc/strongswan.conf:68: syntax error, unexpected ., expecting : or '{' or '=' [.]
    invalid config file '/usr/local/etc/strongswan.conf'
    abort initialization due to invalid configuration
    Starting strongSwan 5.7.1 IPsec [starter]...
    Loading config setup
      uniqueids=yes
    Loading conn 'bypasslan'
      authby=never
      auto=route
      leftsubnet=192.168.111.0/24
      rightsubnet=192.168.111.0/24
      type=passthrough
    Loading conn 'con-mobile'
      auto=add
      dpdaction=clear
      dpddelay=10s
      dpdtimeout=60s
      eap_identity=%identity
      esp=aes256-sha256,aes192-sha256,aes128-sha256,aes256-sha256,aes192-sha256,aes128-sha256!
      forceencaps=no
      fragmentation=yes
      ike=aes256-sha256-modp1024!
      ikelifetime=28800s
      installpolicy=yes
      keyexchange=ikev2
      left=172.16.15.160
      leftauth=pubkey
      leftcert=/var/etc/ipsec/ipsec.d/certs/cert.crt
      leftid=fqdn:vpn2.mydomain.com
      leftsendcert=always
      leftsubnet=172.16.15.0/24,192.168.111.0/24
      lifetime=3600s
      mobike=yes
      reauth=yes
      rekey=yes
      right=%any
      rightauth=eap-radius
      rightsourceip=192.168.89.0/24
      type=tunnel
    kernel appears to lack the native netkey IPsec stack
    no netkey IPsec stack detected
    kernel appears to lack the KLIPS IPsec stack
    no KLIPS IPsec stack detected
    no known IPsec stack detected, ignoring!
    

    Line 68 is where dc1.mydomain.com-radius is defined:

    dc1.mydomain.com-radius {
            address = 192.168.111.101
            secret = "masked"
            auth_port = 1812
            acct_port = 1813
    }
    
    
    

  • Rebel Alliance Developer Netgate

    OK, that is a different condition. It doesn't like a . in the RADIUS server name. I didn't have any set that way, but now that I set one up I see the same error. Curious that the error is shown on the console but not in the logs. But now I do see the same integrity test error!

    So we're getting closer! Let me find a fix for this, most likely it will involve removing the dots or swapping them for some other character.
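
    To check whether a generated config is affected, something like the following sketch could list section headers whose name contains a dot. The function name find_dotted_sections is hypothetical and the pattern is a rough heuristic, not the strongSwan parser's actual grammar.

```shell
# Hypothetical helper: print (with line numbers) lines that look like a
# section header "name {" where the name contains a dot.
# Comments, "key = value" lines and "include" lines are not matched.
find_dotted_sections() {
    grep -nE '^[[:space:]]*[^[:space:]#{}=]*\.[^[:space:]#{}=]*[[:space:]]*\{' "$1" || true
}

# Example (on the firewall):
#   find_dotted_sections /var/etc/ipsec/strongswan.conf
```

    On the config posted above, this would be expected to flag the dc1.mydomain.com-radius section.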



  • Great.

    Well, I've just defined a new RADIUS server entry (without dots in the name) in System/User Manager/Authentication Servers and IPsec is up now. Dots in the name are definitely the problem.


  • Rebel Alliance Developer Netgate

    Thankfully that is a very simple fix (assuming you don't fudge the regex like I did on my first commit), so if you want to keep a RADIUS server entry with . in the name, you can if you apply the following patch:

    diff --git a/src/etc/inc/vpn.inc b/src/etc/inc/vpn.inc
    index 2458a224f5..e3ccb4f3c4 100644
    --- a/src/etc/inc/vpn.inc
    +++ b/src/etc/inc/vpn.inc
    @@ -478,7 +478,7 @@ EOD;
     	$user_sources = explode(',', $config['ipsec']['client']['user_source']);
     	foreach ($user_sources as $user_source) {
     		$auth_server = auth_get_authserver($user_source);
    -		$nice_user_source = strtolower(preg_replace('/\s+/', '_', $user_source));
    +		$nice_user_source = strtolower(preg_replace('/[\s\.]+/', '_', $user_source));
     		if ($auth_server && $auth_server['type'] === 'radius') {
     			$radius_server_txt .= <<<EOD
     				{$nice_user_source} {
    
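
    To preview what a given server name turns into with the changed pattern, the same transformation can be approximated in the shell. This is a sketch that mimics the PHP expression in the patch (lowercase, then replace each run of whitespace or dots with an underscore); it is not the code pfSense runs.

```shell
# Approximates strtolower(preg_replace('/[\s\.]+/', '_', $name)) from the patch:
# lowercase the name, then collapse each run of whitespace or dots into "_".
nice_user_source() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[[:space:].]+/_/g'
}

# Example:
#   nice_user_source 'dc1.mydomain.com-radius'
```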


  • Thanks a lot. Really appreciate your assistance :)



  • Thanks jimp!
    I'll try this also tonight and post results.
    Quick question (I'm no expert with diff): shouldn't the path in the above patches be /etc/inc/vpn.inc? There is no /src folder on our system.
    Also, will this be corrected in a future release, or will we need to keep the patch as long as we have names with dots?


  • Rebel Alliance Developer Netgate

    It will be corrected in 2.4.4-p1 and 2.4.5.

    As for the path, the system patches package defaults to a path strip level of 2, so it will do the right thing here.

    The source in github has a src/ directory prefix, which is why it shows in patches made from git commits like the above.
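
    As a rough illustration of what a strip level of 2 does to the path in the diff header, here is a small sketch. The function name strip_path is made up, and it only imitates the path handling, not the patching itself. The remaining relative path is then resolved against the base directory the patch is applied in, which is assumed here to be /.

```shell
# Hypothetical helper: drop the first N slash-separated components of a path,
# similar in spirit to what a strip level of N does to file names in a diff.
strip_path() {
    printf '%s\n' "$1" | cut -d/ -f"$(( $2 + 1 ))"-
}

# Example:
#   strip_path 'a/src/etc/inc/vpn.inc' 2
```

    With the header path a/src/etc/inc/vpn.inc and a level of 2, both the a/ and src/ components are dropped.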



  • Understood for the default strip level of the system patches package.
    This worked also for us and we now have a working 2.4.4 system.
    Thanks jimp!