IPSec tunnel dropping and re-negotiating every couple of minutes
-
Hi All,
I have an IPSEC tunnel to a Sonicwall which has been stable for the last year.
Out of the blue it is now disconnecting and reconnecting every few minutes.
I am still on 2.1.5 (I know it's not the latest version). I can't upgrade because of the way we use captive portal, which cannot be done with 2.2.
I believe it disconnects saying "DPD: remote seems to be dead".
The log is below.
My pfsense WAN address is: 192.168.11.253. The Sonicwall WAN address is: 192.168.20.253
Please don't get confused by the WAN addresses looking like internal addresses. This is a private IP network.
Jul 3 08:26:12 racoon: [WG-PER]: [192.168.20.253] INFO: DPD: remote (ISAKMP-SA spi=9cb77d59ed2471f1:629ca1f3f93c4039) seems to be dead.
Jul 3 08:26:12 racoon: INFO: purging ISAKMP-SA spi=9cb77d59ed2471f1:629ca1f3f93c4039.
Jul 3 08:26:12 racoon: INFO: purged IPsec-SA spi=2739960925.
Jul 3 08:26:12 racoon: INFO: purged IPsec-SA spi=72463484.
Jul 3 08:26:12 racoon: INFO: purged ISAKMP-SA spi=9cb77d59ed2471f1:629ca1f3f93c4039.
Jul 3 08:26:12 racoon: [WG-PER]: INFO: ISAKMP-SA deleted 192.168.11.253[500]-192.168.20.253[500] spi:9cb77d59ed2471f1:629ca1f3f93c4039
Jul 3 08:26:12 racoon: [WG-PER]: INFO: IPsec-SA request for 192.168.20.253 queued due to no phase1 found.
Jul 3 08:26:12 racoon: [WG-PER]: INFO: initiate new phase 1 negotiation: 192.168.11.253[500]<=>192.168.20.253[500]
Jul 3 08:26:12 racoon: INFO: begin Identity Protection mode.
Jul 3 08:26:44 racoon: [WG-PER]: [192.168.20.253] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.20.253[0]->192.168.11.253[0]
Jul 3 08:26:44 racoon: INFO: delete phase 2 handler.
Jul 3 08:26:48 racoon: [WG-PER]: [192.168.20.253] INFO: request for establishing IPsec-SA was queued due to no phase1 found.
Jul 3 08:27:02 racoon: ERROR: phase1 negotiation failed due to time up. 978df3efa937f74d:0000000000000000
Jul 3 08:27:12 racoon: [WG-PER]: [192.168.20.253] ERROR: unknown Informational exchange received.
Jul 3 08:27:19 racoon: [WG-PER]: [192.168.20.253] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.20.253[0]->192.168.11.253[0]
Jul 3 08:27:19 racoon: INFO: delete phase 2 handler.
Jul 3 08:27:20 racoon: [WG-PER]: INFO: IPsec-SA request for 192.168.20.253 queued due to no phase1 found.
Jul 3 08:27:20 racoon: [WG-PER]: INFO: initiate new phase 1 negotiation: 192.168.11.253[500]<=>192.168.20.253[500]
Jul 3 08:27:20 racoon: INFO: begin Identity Protection mode.
Jul 3 08:27:20 racoon: INFO: received Vendor ID: RFC 3947
Jul 3 08:27:20 racoon: [WG-PER]: [192.168.20.253] INFO: Selected NAT-T version: RFC 3947
Jul 3 08:27:20 racoon: [WG-PER]: [192.168.20.253] INFO: Hashing 192.168.20.253[500] with algo #2
Jul 3 08:27:20 racoon: [Self]: [192.168.11.253] INFO: Hashing 192.168.11.253[500] with algo #2
Jul 3 08:27:20 racoon: INFO: Adding remote and local NAT-D payloads.
Jul 3 08:27:20 racoon: [Self]: [192.168.11.253] INFO: Hashing 192.168.11.253[500] with algo #2
Jul 3 08:27:20 racoon: INFO: NAT-D payload #0 verified
Jul 3 08:27:20 racoon: [WG-PER]: [192.168.20.253] INFO: Hashing 192.168.20.253[500] with algo #2
Jul 3 08:27:20 racoon: INFO: NAT-D payload #1 verified
Jul 3 08:27:20 racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
Jul 3 08:27:20 racoon: INFO: received Vendor ID: DPD
Jul 3 08:27:20 racoon: INFO: NAT not detected
Jul 3 08:27:20 racoon: [WG-PER]: INFO: ISAKMP-SA established 192.168.11.253[500]-192.168.20.253[500] spi:b7bb59743c81f91e:659cfae4695cd91a
Jul 3 08:27:21 racoon: [WG-PER]: INFO: initiate new phase 2 negotiation: 192.168.11.253[500]<=>192.168.20.253[500]
Jul 3 08:27:21 racoon: WARNING: attribute has been modified.
Jul 3 08:27:21 racoon: [WG-PER]: INFO: IPsec-SA established: ESP 192.168.11.253[500]->192.168.20.253[500] spi=34172402(0x2096df2)
Jul 3 08:27:21 racoon: [WG-PER]: INFO: IPsec-SA established: ESP 192.168.11.253[500]->192.168.20.253[500] spi=2344647773(0x8bc07c5d)
-
What's "the way you use captive portal" that can't be done with 2.2.x?
DPD tearing down the connection means it's died for some reason.
Some Sonicwall firmware versions have significant issues with DPD, and some really old ones used a proprietary DPD that isn't compatible with any standard DPD implementation. Given it's worked for the past year, assuming no changes on either side, it's not the latter. Could possibly be the former, but could be any number of other causes. Given the fact that side is trying to rekey but failing, I doubt that's the issue either.
The Sonicwall side's logs will probably be more telling in that case, since the logs above don't really show a cause and that end probably will. What do they show?
-
Hi,
Thanks for the quick answer.
I have just deleted the setup on the pfSense, rebooted and added it all back. So far it has been up for 15 minutes, longer than before. Maybe deleting it helped?
The problem is that I have 2 LAN networks and they both have a captive portal.
They are both routed through an IPSEC VPN tunnel (all traffic).
With 2.2, if you set up an IPSEC tunnel and route 0.0.0.0/0 through it, your LAN interface is not reachable any more, so the captive portal does not work.
I have found a workaround in 2.1.5: adding an SPD entry so I can still communicate with the LAN interface address. That can't be done with 2.2, as it uses Strongswan now and I haven't found a way of adding these SPD entries.
Our Sonicwall is a 5600, one of the bigger and newer models (1 year old).
The Sonicwall logs state that the remote site is trying to re-negotiate (see below) (the log reads from bottom to top).
I will call Sonicwall for tech support in a bit, but I fear they will say that the remote site keeps re-negotiating, so it's not a Sonicwall problem.
Info VPN IKE IKE negotiation complete. Adding IPSec SA. (Phase 2) 192.168.11.253, 500 192.168.20.253, 500 VPN Policy: PER-WG; ESP:AES-128; HMAC_SHA1; Group 5; Lifetime=28800 secs ; inSPI:0x69087c5d; outSPI:0x9aed5a6
Info VPN IKE IKE Responder: Accepting IPSec proposal (Phase 2) 192.168.11.253, 500 192.168.20.253, 500 VPN Policy: PER-WG; Local network 0.0.0.0 / 0.0.0.0; Remote network 10.11.15.0/255.255.255.0
Info VPN IKE IKE Responder: Received Quick Mode Request (Phase 2) 192.168.11.253, 500 192.168.20.253, 500 VPN Policy: PER-WG
Info VPN IKE IKE Responder: Main Mode complete (Phase 1) 192.168.11.253, 500 192.168.20.253, 500 VPN Policy: PER-WG;AES-128; SHA1; DH Group 5; lifetime=28800 secs
Info VPN IKE NAT Discovery : No NAT/NAPT device detected between IPSec Security gateways 192.168.11.253, 500 192.168.20.253, 500 VPN Policy: PER-WG
Info VPN IKE IKE Responder: Remote party timeout - Retransmitting IKE request. 192.168.20.253, 500 192.168.11.253, 500 VPN Policy: PER-WG
Info VPN IKE IKE Responder: Received Main Mode request (Phase 1) 192.168.11.253, 500 192.168.20.253, 500
-
The exclusion of the LAN subnet is the same in 2.2.2 and newer as it was in 2.1.5, the LAN subnet to the LAN IP is automatically excluded. Should be fine there now.
From the looks of that, recreating the VPN had nothing to do with it starting to work again. Re-creating it would leave it not trying to connect for long enough that the Sonicwall could have sorted itself out.
If it's still an issue, try disabling DPD on both sides. It's possibly not helpful at all in that circumstance, and it might just be one side or the combination of the two having a DPD issue, in which case that'd fix it.
Sonicwall support should be willing to help troubleshoot why DPD is timing out. That's the root problem; the renegotiation is just a consequence of it. One side goes without DPD replies for long enough to consider the connection dead and delete it.
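To put a number on how long each drop lasts, the racoon log can be scanned for the DPD "seems to be dead" line and the next "ISAKMP-SA established" line. Below is a minimal Python sketch, assuming the syslog format shown in the first post. The function name `outages` and the `year` parameter are made up for illustration (syslog timestamps carry no year, so one has to be supplied), and the sample lines are copied from that log.

```python
import re
from datetime import datetime

# Matches the syslog timestamp at the start of a racoon log line.
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+racoon:")


def outages(lines, year=2015):
    """Return (time_declared_dead, seconds_until_phase1_back) pairs.

    `year` is an assumption: the log itself does not record one.
    """
    results = []
    dead_at = None
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        if "DPD: remote" in line and "seems to be dead" in line:
            # Keep the first "dead" event of an outage.
            if dead_at is None:
                dead_at = when
        elif "ISAKMP-SA established" in line and dead_at is not None:
            results.append((dead_at, int((when - dead_at).total_seconds())))
            dead_at = None
    return results


sample = [
    "Jul 3 08:26:12 racoon: [WG-PER]: [192.168.20.253] INFO: DPD: remote "
    "(ISAKMP-SA spi=9cb77d59ed2471f1:629ca1f3f93c4039) seems to be dead.",
    "Jul 3 08:27:02 racoon: ERROR: phase1 negotiation failed due to time up. "
    "978df3efa937f74d:0000000000000000",
    "Jul 3 08:27:20 racoon: [WG-PER]: INFO: ISAKMP-SA established "
    "192.168.11.253[500]-192.168.20.253[500] spi:b7bb59743c81f91e:659cfae4695cd91a",
]

for dead_at, secs in outages(sample):
    print(f"{dead_at:%H:%M:%S} DPD declared peer dead; phase 1 back after {secs}s")
```

Run against a full log, a steady gap between successive "dead" events would point at a timer (DPD or lifetime) rather than a flaky link.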
-
Hi,
Recreating fixed the issue in the end. I had the tunnel deactivated for over 30 minutes on both ends beforehand as well but it still had the issue.
Either way, I am happy it works again.
Regarding the exclusion of the LAN subnet, you are correct, it is the same in 2.1.5 and 2.2.2.
But I managed to put a workaround in place for 2.1.5 so the LAN subnet is not excluded and therefore my captive portal works on both LAN interfaces.
This workaround however is not possible on 2.2.2, otherwise I would love to upgrade.
Not sure if anyone is interested in the workaround, but I'll post it anyway, maybe it helps someone.
I modified vpn.inc to automatically add an SPD entry if there is a VPN tunnel:
if ($config['interfaces']['lan']) {
    // Bypass IPsec (policy "none") for traffic between the LAN address and the LAN subnet.
    $lanip = get_interface_ip("lan");
    if (!empty($lanip) && is_ipaddrv4($lanip)) {
        $lansn = get_interface_subnet("lan");
        $lansa = gen_subnet($lanip, $lansn);
        $spdconf .= "spdadd -4 {$lanip}/32 {$lansa}/{$lansn} any -P out none;\n";
        $spdconf .= "spdadd -4 {$lansa}/{$lansn} {$lanip}/32 any -P in none;\n";
    }
    // Same bypass for the LAN IPv6 address, if one is configured.
    $lanipv6 = get_interface_ipv6("lan");
    if (!empty($lanipv6) && is_ipaddrv6($lanipv6)) {
        $lansnv6 = get_interface_subnetv6("lan");
        $lansav6 = gen_subnetv6($lanipv6, $lansnv6);
        $spdconf .= "spdadd -6 {$lanipv6}/128 {$lansav6}/{$lansnv6} any -P out none;\n";
        $spdconf .= "spdadd -6 {$lansav6}/{$lansnv6} {$lanipv6}/128 any -P in none;\n";
    }
}
// Added part: the same bypass for the second LAN (opt1).
if ($config['interfaces']['opt1']) {
    $opt1ip = get_interface_ip("opt1");
    if (!empty($opt1ip) && is_ipaddrv4($opt1ip)) {
        $opt1sn = get_interface_subnet("opt1");
        $opt1sa = gen_subnet($opt1ip, $opt1sn);
        $spdconf .= "spdadd -4 {$opt1ip}/32 {$opt1sa}/{$opt1sn} any -P out none;\n";
        $spdconf .= "spdadd -4 {$opt1sa}/{$opt1sn} {$opt1ip}/32 any -P in none;\n";
    }
    // The posted snippet is cut off here; the closing braces were added so it parses.
}
-
I know this post has been inactive for a while, but I just wanted to say that I ran into the same issue. Roughly every 160 seconds (2 minutes 40 seconds), the IPsec tunnel would drop and reconnect. Some differences for me were that there were no issues reported in the log, the other end is not a Sonicwall, and my version is 2.3.2-RELEASE-p1.
I fixed it by deleting my configuration and recreating it from scratch. There must be some subtle bug in the ipsec back-end.