IPSec Tunnel dies but shows as up still



  • Hi,

    I have a problem with an IPsec site-to-site tunnel. I have been running 1.2.3 for the last couple of days and have noticed that after a few hours the tunnel still shows as being up, but nothing is routing between local and remote. I haven't had time to look into this as I'm on holiday, but I thought I'd mention it in case anyone else has had the issue, and I'll try to fill in the blanks upon my return.

    The workaround is simple: disable IPsec, then re-enable it, and all is back to normal. This has only occurred in 1.2.3; all previous versions (1.2.0, 1.2.1 and 1.2.2) have held this tunnel fine.

    Does anyone have any ideas with almost no info? :-)

    J



  • This is probably our NAT-T patch.



  • Weird that it stays up for so long, though (a minimum of about 8 hours so far).

    Is there any easy workaround, or shall I get 1.2.2 put back in place (just a swap of machines, so not much hassle)?

    J



  • I have the same problem with NAT-T turned on. If I turn it off and restart racoon, I can keep the connection up. I don't have a site-to-site tunnel, though. This is a mobile IPsec client (Netscreen Remote). For now I am going to disable NAT-T. I will test it on each new build I load.

    Cheers guys, thanks for everything so far.



  • I have the same issue.  It started after the update from 1.2.2 to 1.2.3.  I have included various debugging messages.

    This is a LAN to LAN tunnel between pfSense 1.2.3 and a Cisco 515 PIX running 6.3(5).

    NAT-T is disabled on all tunnels.

    pfSense:

    Mar 12 10:02:24 last message repeated 12 times
    Mar 12 10:00:18 last message repeated 3 times
    Mar 12 09:59:40 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
    Mar 12 09:55:38 racoon: [515 PIX]: INFO: IPsec-SA expired: ESP/Tunnel 1.1.1.1[0]->2.2.2.2[0] spi=239703863(0xe499737)

    PIX:
    ISAKMP (0): sending NOTIFY message 11 protocol 3
    ISAKMP (0): sending NOTIFY message 11 protocol 3

    I went in and deleted the SADs in pfSense and the tunnel recovered after a minute or two.

    Here are the full debugging messages from the PIX as the tunnel recovered:

    ISAKMP (0): retransmitting phase 2 (9/1)... mess_id 0x9cbb7a01
    crypto_isakmp_process_block:src:1.1.1.1, dest:2.2.2.2 spt:500 dpt:500
    OAK_QM exchange
    oakley_process_quick_mode:
    OAK_QM_IDLE
    ISAKMP (0): processing SA payload. message ID = 3704443265
    
    ISAKMP : Checking IPSec proposal 1
    
    ISAKMP: transform 1, ESP_3DES
    ISAKMP:   attributes in transform:
    ISAKMP:      SA life type in seconds
    ISAKMP:      SA life duration (VPI) of  0x0 0x1 0x51 0x80
    ISAKMP:      encaps is 1
    ISAKMP:      authenticator is HMAC-SHA
    ISAKMP:      group is 2
    ISAKMP (0): atts are acceptable.
    ISAKMP (0): processing NONCE payload. message ID = 3704443265
    
    ISAKMP (0): processing KE payload. message ID = 3704443265
    
    ISAKMP (0): processing ID payload. message ID = 3704443265
    ISAKMP (0): ID_IPV4_ADDR_SUBNET src home/255.255.255.0 prot 0 port 0
    ISAKMP (0): processing ID payload. message ID = 3704443265
    ISAKMP (0): ID_IPV4_ADDR_SUBNET dst work/255.255.255.0 prot 0 port 0
    return status is IKMP_NO_ERROR
    crypto_isakmp_process_block:src:1.1.1.1, dest:2.2.2.2 spt:500 dpt:500
    OAK_QM exchange
    oakley_process_quick_mode:
    OAK_QM_AUTH_AWAIT
    ISAKMP (0): Creating IPSec SAs
            inbound SA from   1.1.1.1 to  2.2.2.2 (proxy         home to           work)
            has spi 1675689223 and conn_id 16 and flags 25
            lifetime of 86400 seconds
            outbound SA from  2.2.2.2 to   1.1.1.1 (proxy           work to         home)
            has spi 255200531 and conn_id 15 and flags 25
            lifetime of 86400 seconds
    VPN Peer: IPSEC: Peer ip:1.1.1.1/500 Ref cnt incremented to:2 Total VPN Peers:3
    VPN Peer: IPSEC: Peer ip:1.1.1.1/500 Ref cnt incremented to:3 Total VPN Peers:3
    return status is IKMP_NO_ERROR
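
The `SA life duration (VPI)` attribute in the PIX debug output above is printed as four separate bytes. A minimal sketch, assuming those bytes form a big-endian unsigned integer (which agrees with the `lifetime of 86400 seconds` lines in the same output), decodes them. The function name `decode_vpi` is only illustrative:

```python
def decode_vpi(text):
    """Decode space-separated hex bytes, e.g. '0x0 0x1 0x51 0x80',
    as a big-endian unsigned integer (assumed byte order)."""
    raw = bytes(int(tok, 16) for tok in text.split())
    return int.from_bytes(raw, "big")

lifetime = decode_vpi("0x0 0x1 0x51 0x80")
print(lifetime)         # 86400
print(lifetime / 3600)  # 24.0 (hours)
```

So the proposal being negotiated carries a 24-hour phase 2 lifetime.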
    
    

    Here is pfSense:

    Mar 12 11:36:09	racoon: [515 PIX]: INFO: initiate new phase 2 negotiation: 1.1.1.1[500]<=>2.2.2.2[500]
    Mar 12 11:36:08	racoon: ERROR: failed to pre-process packet.
    Mar 12 11:36:08	racoon: ERROR: failed to get sainfo.
    Mar 12 11:36:08	racoon: ERROR: failed to get sainfo.
    Mar 12 11:36:08	racoon: [515 PIX]: INFO: respond new phase 2 negotiation: 1.1.1.1[500]<=>2.2.2.2[500]
    Mar 12 11:36:04	racoon: [515 PIX]: INFO: IPsec-SA established: ESP 1.1.1.1[500]->2.2.2.2[500] spi=3554398685(0xd3dbd1dd)
    Mar 12 11:36:04	racoon: [515 PIX]: INFO: IPsec-SA established: ESP 2.2.2.2[0]->1.1.1.1[0] spi=53898054(0x3366b46)
    Mar 12 11:36:04	racoon: WARNING: attribute has been modified.
    Mar 12 11:36:04	racoon: WARNING: ignore RESPONDER-LIFETIME notification.
    Mar 12 11:36:03	racoon: [515 PIX]: INFO: initiate new phase 2 negotiation: 1.1.1.1[500]<=>2.2.2.2[500]
    Mar 12 11:36:02	racoon: [Chicago PIX]: ERROR: pfkey DELETE received: ESP 1.1.1.1[500]->63.87.80.5[500] spi=2796219124(0xa6aaeaf4)
    Mar 12 11:36:02	racoon: INFO: unsupported PF_KEY message REGISTER
    Mar 12 11:35:59	racoon: [515 PIX]: INFO: IPsec-SA established: ESP 1.1.1.1[500]->2.2.2.2[500] spi=1675689223(0x63e0fd07)
    Mar 12 11:35:59	racoon: [515 PIX]: INFO: IPsec-SA established: ESP 2.2.2.2[0]->1.1.1.1[0] spi=255200531(0xf360d13)
    Mar 12 11:35:59	racoon: WARNING: attribute has been modified.
    Mar 12 11:35:59	racoon: WARNING: ignore RESPONDER-LIFETIME notification.
    Mar 12 11:35:58	racoon: [515 PIX]: INFO: initiate new phase 2 negotiation: 1.1.1.1[500]<=>2.2.2.2[500]
    Mar 12 11:35:56	racoon: ERROR: failed to pre-process packet.
    Mar 12 11:35:56	racoon: ERROR: failed to get sainfo.
    Mar 12 11:35:56	racoon: ERROR: failed to get sainfo.
    Mar 12 11:35:56	racoon: [515 PIX]: INFO: respond new phase 2 negotiation: 1.1.1.1[500]<=>2.2.2.2[500]
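
The pfSense log above prints each SPI in both decimal and hex, while the PIX debug output prints decimal only. A rough sketch, assuming the `spi=<decimal>(0x<hex>)` format seen in these racoon lines, extracts the SPIs so they can be matched against the PIX `has spi ...` lines. The helper name `extract_spis` and the sample lists are only illustrative:

```python
import re

# Matches e.g. "spi=1675689223(0x63e0fd07)" as printed by racoon above.
SPI_RE = re.compile(r"spi=(\d+)\(0x([0-9a-fA-F]+)\)")

def extract_spis(lines):
    """Return the set of SPIs (as ints) found in racoon log lines."""
    spis = set()
    for line in lines:
        m = SPI_RE.search(line)
        if m:
            dec, hx = int(m.group(1)), int(m.group(2), 16)
            # Both fields should encode the same value; skip lines where they differ.
            if dec == hx:
                spis.add(dec)
    return spis

racoon_lines = [
    "racoon: [515 PIX]: INFO: IPsec-SA established: ESP 1.1.1.1[500]->2.2.2.2[500] spi=1675689223(0x63e0fd07)",
    "racoon: [515 PIX]: INFO: IPsec-SA established: ESP 2.2.2.2[0]->1.1.1.1[0] spi=255200531(0xf360d13)",
]
pix_spis = {1675689223, 255200531}  # copied from the PIX "has spi ..." lines

print(extract_spis(racoon_lines) == pix_spis)  # True
```

For the 11:35:59 pair, both ends agree on the SPIs, which suggests that particular negotiation completed on both sides.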
    


  • Hi

    Sorry for the delay. We had to pull the firewall and go back to 1.2.2, as 99% of one department's work is done via the site-to-site VPN, so I can't help with any further debugging on this, but the above certainly seems to be what I remember seeing.

    Is that a permanent fix, or does that just re-establish the tunnel, which will then fail again? (I would guess it's just the same as disabling and re-enabling, so it will fail again.)

    Great product though, hopefully if we can get this sorted I can go back up to 1.2.3 and carry on testing.

    J



  • The tunnel promptly dies again after what I assume is the expiration of the current SA.



  • Hey Guys,
    NAT-T is still not working for me in 1.2.3 RC-1 built on Fri Apr 24 19:37:18 EDT 2009. It connects with no problem when NAT-T is disabled. It looks like a timeout at phase 1. Is anyone else seeing this behavior? Even better, does anyone have this working with NAT-T?



  • Is anyone having a good experience with IPSEC lan to LAN with 1.2.3?



  • I updated to RC1 and all of my IPSec site-to-site tunnels die and can't recover without restarting IPSec.  Please see my above post for more details.

    I am not using NAT-T.

    I tried enabling dead peer detection.  It didn't help.



  • I have only one site-to-site tunnel. I have to stop and restart IPsec after a reboot, but once I do this, it stays up until the next reboot. I think that might have been due to another problem with my provider and a router port problem, though, which I have only had once. I will find out next time it's down (which might be a very long time).

    Cheers guys.



  • I finally gave up on the 1.2.3 snapshots.  I rolled back to 1.2.2 and my tunnels came back up and have been rock solid since.  I made zero configuration changes.  I simply pointed the console update URL to the 1.2.2 full upgrade download.

    1.2.3 had fixed some issues I'd had with some network cards, but those were pretty minor compared to non-functional IPSec.

    If any developer wants help testing or debugging this issue I'd be happy to reinstall the latest snapshot and work with them on it.  I've got access to PIXs running 6.3.5 and 8.0.4 and a Concentrator 3005.



  • @kapara:

    Is anyone having a good experience with IPSEC lan to LAN with 1.2.3?

    No issues here with 1.2.3 RC1 and Astaro 7.4 with IPsec VPNs.  The Astaro box is even behind a NAT machine (a pfSense box ;-) so NAT-T appears to be working well, too.

    We just got this particular setup up and running a couple days ago, no issues the entire time so far.



  • Everyone that has posted above needs to update whether they are using parallel tunnels or not (meaning you have more than one network so a tunnel for each network).  This problem existed in 1.2.2 and I am curious whether it has gotten worse or better.

    Thanks,
    Roy



  • @rwalker:

    Everyone that has posted above needs to update whether they are using parallel tunnels or not (meaning you have more than one network so a tunnel for each network).  This problem existed in 1.2.2 and I am curious whether it has gotten worse or better.

    I'm using 1.2.3 RC1 in a multi-tunnel setup with no problems for about a week now.

    The pfSense side has two subnets, the other side (Astaro) has one.

    The only issue I see is that the IPsec status screen shows the VPNs in state yellow instead of green, and one of the tunnels is missing the source network, yet the tunnels are working fine.  They have been yellow for at least a couple days.

    See the attached picture.

    Green is the pfSense endpoint IP.
    Red is the Astaro endpoint IP.
    Pink is the Astaro LAN network
    Blue is the pfSense LAN network

    Note the missing LAN network in the first listed VPN and how the status is yellow on both.  But both VPNs are functioning properly.




  • I am connecting to 3 Cisco PIXs with 6 tunnels.

    PIX 1: 3 tunnels
    PIX 2: 2 tunnels
    PIX 3: 1 tunnel

    PIX config:
    crypto ipsec transform-set myset esp-3des esp-sha-hmac
    crypto map outside_map 40 ipsec-isakmp
    crypto map outside_map 40 match address outside_cryptomap_40
    crypto map outside_map 40 set pfs group2
    crypto map outside_map 40 set peer 1.1.1.1
    crypto map outside_map 40 set transform-set myset
    isakmp key ******** address 1.1.1.1 netmask 255.255.255.255
    isakmp identity address
    isakmp policy 42 authentication pre-share
    isakmp policy 42 encryption 3des
    isakmp policy 42 hash md5
    isakmp policy 42 group 2
    isakmp policy 42 lifetime 86400
    isakmp policy 62 authentication pre-share
    isakmp policy 62 encryption 3des
    isakmp policy 62 hash sha
    isakmp policy 62 group 2
    isakmp policy 62 lifetime 86400

    pfSense:

    Phase 1:
    Negotiation mode: MAIN
    My identifier: MY IP
    Encryption algorithm: 3DES
    Hash algorithm: SHA1
    DH key group: 2
    Lifetime: 28800
    Authentication method: pre-shared

    Phase 2:
    Protocol: ESP
    Encryption algorithm: 3DES
    Hash algorithms: SHA1 , MD5
    PFS key group: 2
    Lifetime: 86400

    Works perfectly on 1.2.2.
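
With the two configurations listed side by side, a small diff can help rule out proposal mismatches. The sketch below hand-copies a few values from the listings above into dictionaries and reports which ones differ. The key names are made up for illustration, and PIX `isakmp policy 62` is assumed to be the one that matches, since it is the policy using SHA:

```python
# Values copied by hand from the PIX config above (isakmp policy 62, crypto map 40).
pix = {
    "p1_encryption": "3des", "p1_hash": "sha", "p1_dh_group": 2, "p1_lifetime": 86400,
    "p2_encryption": "3des", "p2_pfs_group": 2,
}
# Values copied by hand from the pfSense settings above ("SHA1" normalized to "sha").
pfsense = {
    "p1_encryption": "3des", "p1_hash": "sha", "p1_dh_group": 2, "p1_lifetime": 28800,
    "p2_encryption": "3des", "p2_pfs_group": 2,
}

def diff_params(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

print(diff_params(pix, pfsense))  # {'p1_lifetime': (86400, 28800)}
```

Of the values compared here, only the phase 1 lifetime differs (86400 on the PIX, 28800 on pfSense). Whether that matters for this problem is not established in the thread, and the same settings are reported to work on 1.2.2.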



  • The issue is only between pfSense <-> pfSense.  Between any other device I have tested it works without a problem (parallel tunnels or not).  I have now upgraded a few of our firewalls to 1.2.3RC1, so will see what happens.

    Roy



  • Well just tested with NAT-T enabled and mine is now working!!! Thanks guys, truly wonderful.



  • At the moment I use 1.2.3 RC2 and also have a similar problem, though not the same as described here. I only have trouble with access over the VPN when I'm on the WLAN port. No problems at all on the LAN port. On the other side an old ZyWall 30W is working.

    I have now disabled NAT-T because it's the only hint I found here, so I can tell more about that tomorrow or so.

    After a reboot I again have access over the VPN from WLAN for a few hours until it's broken again. And as I said, after it is broken on WLAN it still works fine on LAN.

    Sigma



  • We were trying ipsec-tools 0.8 during that timeframe to fix a number of DPD issues with tunnels not re-negotiating.

    This proved to cause lots of issues with parallel tunnels.

    So we backed out the change and went back to 0.7.2. We will soon be merging 0.7.3, which was just released.

    It has a few small fixes but is unlikely to break things.

