IPSec to Azure Causing StrongSwan Crashes



  • Site to Site IPSec tunnel to Azure.  Connection works fine, right until it doesn't.

    This is the crash report:
    Crash report begins.  Anonymous machine information:

    amd64
    10.0-STABLE
    FreeBSD 10.0-STABLE #46 e852cd6(HEAD)-dirty: Wed Jul  2 16:10:52 CDT 2014    root@pf22-amd64-snap:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10

    Crash report details:

    PHP Errors:
    [03-Jul-2014 20:13:28 Canada/Eastern] PHP Fatal error:  Maximum execution time of 900 seconds exceeded in /etc/inc/ipsec.inc on line 490



  • That code is looping around reading through some status data:

    while (!strstr($sread, "")) {
      $sread = fgets($fd);
      $response .= $sread;
    }
    

    I am guessing it is getting into an endless loop, never getting out of the "while".
    Then it is getting terminated by the PHP interpreter when it uses up the 900 second time limit.
    You could try putting a higher time limit in the code just above:

    set_time_limit(1800);
    

    and see if it still dies - that would indicate that it is really getting stuck in the loop forever.

    set_time_limit(0);
    

    will let the code run without a time limit, but this does not look like a place that should need this!
    Ermal introduced this function ipsec_smp_dump_status() in Feb 2014, so I guess he will see this and have a look at what the problem is with the while loop test.
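
For illustration, here is a minimal sketch of how a read loop like this could be bounded so it cannot spin forever. It is not the actual pfSense fix: the helper name read_until_marker(), its parameters, and the 10-second default are assumptions made up for this example. The idea is to stop on EOF, on a failed read, or when a deadline passes, rather than only when the terminator shows up.

```php
<?php
// Hypothetical helper, not part of pfSense: read lines from $fd until
// $terminator is seen. Gives up on EOF, on a read error, or after
// $timeout seconds, so the caller can never loop forever.
function read_until_marker($fd, $terminator, $timeout = 10) {
    $response = "";
    $deadline = time() + $timeout;

    // Ask the stream to give up on a blocked read after $timeout seconds.
    // This applies to sockets; other stream types simply ignore it.
    stream_set_timeout($fd, $timeout);

    while (!feof($fd) && time() < $deadline) {
        $sread = fgets($fd);
        if ($sread === false) {
            break; // EOF, read error, or read timeout
        }
        $response .= $sread;
        if (strpos($sread, $terminator) !== false) {
            return $response; // complete reply received
        }
    }
    return false; // terminator never arrived
}
```

The caller would then check for a false return and log an error instead of waiting for the 900 second PHP limit to kill the script.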



  • I am working on it, though it is still strange that you get this.

    Is strongswan running at all during this time?



  • I can email detailed logs if you like.  StrongSwan goes down hard and restarts.



  • I'm not sure which fixed the issue since I've made a variety of changes recently.  But the problem is gone… here is the list of what changed.

    • I'm running the July 6 build.
    • I've fixed an overlapping IP range issue (the one side had 10.10.1.0/24, 10.10.3.0, 10.10.4.0 and Azure was 10.10.2.0, which I think confused both sides of the tunnel).  Azure is now 10.11.0.0, so it's a nice, simple, single phase 2 entry.
    • Switched from AES256 to 3DES.  The AES256 implementation here and Azure's implementation don't seem to like each other.
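
A quick way to sanity-check phase 2 networks for overlap before configuring a tunnel is sketched below. This is only an illustration: the function names are made up, it assumes IPv4 and a 64-bit PHP build, and the prefix lengths in the usage lines are assumed for the example since the list above does not give all of them.

```php
<?php
// Hypothetical helpers for illustration only (IPv4, 64-bit PHP assumed).

// Return array(first, last) addresses of a CIDR block as integers.
function cidr_range($cidr) {
    list($addr, $bits) = explode('/', $cidr);
    $bits = (int)$bits;
    // Build the netmask; a /0 prefix gets an all-zero mask.
    $mask = ($bits === 0) ? 0 : ((~0 << (32 - $bits)) & 0xFFFFFFFF);
    $start = ip2long($addr) & $mask;
    return array($start, $start | (~$mask & 0xFFFFFFFF));
}

// Two blocks overlap when each one starts before the other ends.
function cidrs_overlap($a, $b) {
    list($a_start, $a_end) = cidr_range($a);
    list($b_start, $b_end) = cidr_range($b);
    return $a_start <= $b_end && $b_start <= $a_end;
}

// Example with assumed prefix lengths:
var_dump(cidrs_overlap('10.10.0.0/16', '10.10.2.0/24')); // a /16 covering the /24
var_dump(cidrs_overlap('10.10.1.0/24', '10.11.0.0/16')); // separate ranges
```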


  • I spoke too soon, though it's a much different error now.  When StrongSwan coughs up a lung, it now says:

    Jul 7 21:03:51 charon: 15[LIB] dumping 2 stack frame addresses:
    Jul 7 21:03:51 charon: 15[LIB] /lib/libthr.so.3 @ 0x801337000 (_swapcontext+0x15b) [0x80134545b]
    Jul 7 21:03:51 charon: 15[LIB] ->
    Jul 7 21:03:51 charon: 15[LIB] /lib/libthr.so.3 @ 0x801337000 (sigaction+0x343) [0x801345043]
    Jul 7 21:03:51 charon: 15[LIB] ->
    Jul 7 21:03:51 charon: 15[DMN] killing ourself, received critical signal



  • Can you put that in debug mode and send me the trace?

    Seems like something wrong is happening with some library there!



  • Please let me know which options to select and where to send the trace.  I'd really love to get IPSec tunnels stable.