IPsec won't pass data after a client disconnects and reconnects



  • Originally, pinoyboy had pasted an observation that IPsec on pfSense is still exhibiting this behavior.
    I just wanted to ask about the feasibility of a workaround until it's fixed.

    IPsec fails to pass traffic when a client disconnects and then reconnects, until the racoon service is restarted:

    I went through my logs and noticed the same pair of errors whenever this happens:

    racoon: ERROR: no configuration found for (I deleted the IP)
    racoon: ERROR: failed to begin ipsec sa negotication

    I also noticed that restarting the racoon process, without requiring a logout/login cycle on the client, corrects the issue until the next disconnect.

    I also ran across this script to restart racoon in the event of a WAN IP change.
    http://wiki.openwrt.org/doc/howto/vpn.ipsec.basics.racoon

    #!/bin/sh

    ListenInterface() {
      local iface="$1"
      if [ "$INTERFACE" = "$iface" ]; then
        /etc/init.d/racoon restart
      fi
    }

    RacoonInstance() {
      config_list_foreach "$1" listen ListenInterface
    }

    if [ "$ACTION" = "ifup" ]; then
      config_load racoon
      config_foreach RacoonInstance racoon
    fi

    So, what I was wondering about is killing two birds with one stone.
    Rewrite this script a bit to work on pfSense, and use it both to check for WAN IP changes that break IPsec and to
    check the IPsec log file for instances of:
    "racoon: ERROR: no configuration found for"
    "racoon: ERROR: failed to begin ipsec sa negotication"

    and then empty the log and restart racoon if found.
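
    A minimal sketch of that log check, in the same sh style as the script above. The function names has_racoon_error and restart_if_broken are made up for illustration, and the commands that print the log and restart racoon are passed in by the caller, since the right ones for pfSense are not given in this thread:

```shell
#!/bin/sh

# Return 0 when the log text on stdin contains either racoon error quoted above.
# (The misspelling "negotication" is how racoon itself logs the message.)
has_racoon_error() {
  grep -q -e 'racoon: ERROR: no configuration found for' \
          -e 'racoon: ERROR: failed to begin ipsec sa negotication'
}

# Usage: restart_if_broken "<command that prints the IPsec log>" "<command that restarts racoon>"
# Both commands are hypothetical placeholders supplied by the caller.
restart_if_broken() {
  log_cmd="$1"
  restart_cmd="$2"
  if sh -c "$log_cmd" | has_racoon_error; then
    sh -c "$restart_cmd"
  fi
}
```

    It could be run from cron as something like restart_if_broken "clog /var/log/ipsec.log" "<your racoon restart command>". Reading the log with clog is an assumption about pfSense 2.x, not something confirmed here. The sketch does not empty the log, so an old error would trigger another restart on the next run unless the log is cleared first.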

    What do you think?  Anyone?


  • Rebel Alliance Developer Netgate



  • That logic, generically speaking, will definitely do it, but the condition it checks for is different.
    It seems the feature you implemented would completely solve the failover case (WAN IP change), but it
    probably needs an added condition that also checks for the error after a client has recently disconnected and reconnected, to solve both problems.

    P.S.  Nice feature you added.  Will definitely matter a lot to those with multiple ISPs.


  • Rebel Alliance Developer Netgate

    rc.newipsecdns should also get called if a WAN fails and comes back. It should, in theory, catch both cases.

    The "failover" part only happens if IPsec is set to use a gateway group. It still gets restarted even if it's not part of failover. Give it a try, it should solve both problems.



  • I agree with you, it will work as is if the WAN going down and coming up is the issue.

    But if the issue is a client logging out and then logging back in, it won't help, I imagine.
    That's a separate issue with a separate condition but the same cure: restart IPsec.


  • Rebel Alliance Developer Netgate

    I didn't see anywhere in this thread that you mentioned "client" as in mobile; I had assumed you meant "disconnect and reconnect" as the WAN getting disconnected.

    If it's a mobile client, restarting racoon is not needed and very harsh. Make sure the Mobile IPsec on 2.0 page is followed exactly and clients can disconnect and reconnect as needed.



  • I would say that's not exactly true, although it would be nice if it were.
    This traffic-not-passing condition is constant and 100% reproducible for me.

    Disconnect a mobile client, wait anywhere from a few seconds to a couple of minutes, and after reconnecting it will consistently be unable to reach the web or the site behind pfSense.  Wait a good while longer before reconnecting and it will work again, or restart racoon and it will work again.

    I don't use IPsec much, but this is always the case with a quick disconnect and reconnect of a mobile client.
    One would think it wouldn't be that big of an issue, but with mobile clients you disconnect and reconnect often.


  • Rebel Alliance Developer Netgate

    On 2.0.3 or 2.1?

    We have several confirmed reports of it working fine on 2.1. There is an open ticket for that problem.
    https://redmine.pfsense.org/issues/1351



  • I'm using 2.0.3, but people are still reporting the same behaviour in 2.1

    As of yesterday.  Still time to get it worked into the 2.1 release?  (-:


  • Rebel Alliance Developer Netgate

    If it's actually broken on 2.1, yes.

    The people I saw that are still broken on 2.1 are not using the required settings to make it work as defined on http://doc.pfsense.org/index.php/Mobile_IPsec_on_2.0

    Anyone I'm aware of using the proper settings on 2.1 is working.



  • Why would you prefer 24 hours in phase 1 but 8 hours in phase 2?  The reason for Lifetime: 86400 and Lifetime: 28800 is unclear to me, but I've set them that way.
    No difference in behaviour for me compared with both set to 28800.
    The page makes no mention of NAT-T?  Mine is enabled.
    Unchecking "Prefer Old IPsec SA" makes no difference for me.
    Strict vs. obey also seems to make no difference for me, except perhaps in the timeout (many hours).

    With all settings checked, rechecked, triple checked, checked to death, a disconnect followed by a reconnect within minutes is not reliable.
    If I disconnect for a few seconds, then reconnect, it will pass traffic.  (sometimes)
    If I disconnect, wait 5 minutes, and reconnect, it won't pass traffic.  (always)
    If I disconnect and wait an hour, it will work.  (I say 1 hour, but it could be a little longer or shorter)
    I seriously doubt this issue is fixed in 2.1 either, since people are still mentioning it and saying they are on 2.1.

    That said, it looks like you have addressed a lot in IPsec for 2.1 to make it more reliable,
    mainly changes and drops in the SERVER's IP, not so much client drops and logins/logouts.


  • Rebel Alliance Developer Netgate

    The settings have been determined after a ton of work and tested to work on Android and iOS both. These are the settings that work, use them, you can't change them without breaking one platform or another. The settings aren't really a guide or suggestions, you must use them exactly and not deviate, or you will have problems.

    It mentions NAT-T ("NAT Traversal: Force")

    And yes, it works on 2.1 with current snapshots. Don't make assumptions until you have tested it personally on 2.1 and have verified every single setting on that page.

    2.0.3 is known to be broken, as was 2.1 a few weeks ago, but it has been fixed since then (see the ticket I linked earlier).



  • haha.  I like your attitude.  No taking crap from a newbie.
    I will do.  I'll set up a 2.1 box and test it with exactly these settings.
    I don't use IPsec much myself, so my interest is purely to improve pfSense, which I really like.



  • Unfortunately, I have to confirm that 2.1-RC1 (amd64) built on Sat Aug 24 12:23:49 EDT 2013, connected with the Cisco VPN client (IPsec), is still affected  :'(. My IPsec log contains:

    Aug 25 18:23:24: racoon: ERROR: no configuration found for 77.253.100.xxx.
    Aug 25 18:23:24: racoon: ERROR: failed to begin ipsec sa negotication.
    

    SAD table shows:

    Source                  Destination             Protocol  SPI       Enc. alg.  Auth. alg.  Data
    xxx.xxx.xxx.xxx[4500]   77.253.100.xxx[62460]   ESP-UDP   7074425e  aes-cbc    hmac-md5    0 B
    77.253.100.xxx[62460]   xxx.xxx.xxx.xxx[4500]   ESP-UDP   09007124  aes-cbc    hmac-md5    27524 B
    
    

    It looks like no traffic is being passed to the mobile client. Moreover, the traffic stays blocked permanently and I cannot find any way to fix it.


  • Rebel Alliance Developer Netgate

    Make sure you review every single setting here:
    http://doc.pfsense.org/index.php/Mobile_IPsec_on_2.0

    If it still doesn't work, blame the Cisco IPsec Client. Especially the Windows one, from Cisco, will not play well with anything but Cisco. If you connect to anything but a Cisco device with the Windows client, you're actually violating its license. (AFAIK, iOS, Android, and OS X are all OK)

    If you must use IPsec from Windows, use the Shrew Soft client.



  • That reads more like there is a problem with either the phase 1 or phase 2 config than the error I was talking about.

    The error I was concerned with did allow connections, it just didn't reliably reconnect in a timely manner after a short disconnect.

