Fix the VPN IPSEC Dead Peer Detection in 1.2.2 or 1.2.3 {$200}
-
What I find odd is that none of the devs have responded to this post, or to the other posts I have seen regarding this issue, to at least tell us whether it is a bug or whether it cannot be fixed.
I will probably end up buying Cisco Linksys RV042s instead of implementing pfSense at remote locations until this can be fixed. I need to deploy about 7 locations with remote VPNs, and having to intervene manually each time is just a little too much for me.
-
Actually, if you use the term "dev" loosely, I am one :-)
(I am a committer on 2.0/HEAD and for packages, but I'm not a part of the core team)
This still feels more like a racoon bug than a pfSense bug. If it does turn out to be a pfSense bug, it may just be in terms of how DPD is being configured by the WebGUI. There isn't much to go wrong there, though.
I can try to build a tunnel to a 2.0 box and see if it behaves the same way.
-
As far as I know both should be able to initiate DPD. I think this is part of the problem as DPD is not available in 1.2.2. I have tried modifying the file in pfsense to enable it but it does nothing once I have edited the file.
In 1.2.3 people are having the same problem. I wish someone would shed some light on the issue.
-
I should mention that I am testing this with 1.2.3, so DPD should be working, in theory anyhow.
-
Could it be something to do with the version of racoon? I could not find much info about it besides a project called Kame..but that did not tell me much.
-
I have another tunnel between this 1.2.2 and another 1.2.2 and the tunnel has no problem coming back up. But others have complained about it so there is no consistency.
-
Could it be something to do with the version of racoon? I could not find much info about it besides a project called Kame..but that did not tell me much.
That's possible, which is why I want to try it against 2.0 as well. They are both running the same version of ipsec-tools (http://ipsec-tools.sourceforge.net/) though:
from 1.2.3:
May 11 12:33:07 pfsense-123test racoon: INFO: @(#)ipsec-tools 0.7.1 (http://ipsec-tools.sourceforge.net)
from 2.0:
May 11 11:32:58 pfSense-20test racoon: INFO: @(#)ipsec-tools 0.7.1 (http://ipsec-tools.sourceforge.net)
There is a slightly newer version of ipsec-tools (0.7.2) out, but I don't see any detailed release notes on the web site that state what bugs were fixed.
-
It appears that what I thought was the DPD packet may have been Cisco's IPSec keep-alive. I disabled that, and left DPD enabled on the pfSense side, and I see no regular traffic on the tunnel.
That makes me think that either DPD isn't being turned on, or it isn't being negotiated properly for the tunnel.
With Cisco's Keep-Alive turned on, I get:
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: DPD
With it disabled, I get:
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
That makes me wonder if, perhaps, the Cisco side really doesn't support DPD as well as it claims to. I'm not sure if there is some other mechanism to signal racoon when something else fails (i.e. the keepalive ping) to reload a specific tunnel. That would probably be more reliable than DPD since that must be supported by both sides.
Why this works for some vendors/devices and not others is also puzzling…
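For anyone who wants to repeat the comparison above on their own logs, here is a minimal Python sketch that checks whether the peer advertised the DPD Vendor ID during phase 1. The function names are made up for illustration, and the pattern is based only on the racoon lines quoted in this post, so other racoon versions may word the line differently.

```python
import re

# Pattern taken from the log lines quoted in this thread, e.g.
#   "... racoon: INFO: received Vendor ID: DPD"
# Assumption: racoon logs one such line per Vendor ID payload received.
VENDOR_ID_RE = re.compile(r"racoon: INFO: received Vendor ID: (\S+)")


def vendor_ids(log_text):
    """Return the set of Vendor IDs that racoon reported receiving."""
    return set(VENDOR_ID_RE.findall(log_text))


def peer_advertised_dpd(log_text):
    """True if the peer sent the DPD Vendor ID in the given log excerpt."""
    return "DPD" in vendor_ids(log_text)


# The two excerpts from the post above.
keepalive_on = """\
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: DPD
"""

keepalive_off = """\
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
"""

print(peer_advertised_dpd(keepalive_on))   # True
print(peer_advertised_dpd(keepalive_off))  # False
```

If the second case prints False for your peer as well, the remote end never offered DPD for that negotiation, so racoon would have nothing to probe with regardless of what is set on the pfSense side.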
-
Then why is it that when I reboot my concentrator the pfSense does not respond as seeing the tunnel go down at all but stays green?
-
Then why is it that when I reboot my concentrator the pfSense does not respond as seeing the tunnel go down at all but stays green?
That is what I'm trying to figure out… :-)
As you said, when both sides are pfSense, it seems to work. I'm wondering if the "invalid SPI" reply generated by the Cisco is broken in some way (or racoon's parsing thereof) such that it doesn't pick up on the fact that the tunnel traffic is being rejected.
-
Version history:
----------------
0.7.1 - 23 July 2008
o Fixes a memory leak when invalid proposal received
o Some fixes in DPD
o do not set default gss id if xauth is used
o fixed hybrid enabled builds
o fixed compilation on FreeBSD8
o cleanup in network port value manipulation
o gets ports from SADB_X_EXT_NAT_T_[SD]PORT if present in purge_ipsec_spi()
o Generates a log if cert validation has been disabled by configuration
o better handling for pfkey socket read errors
o Fixes in yacc / bison stuff
o new plog() macro (reduced CPU usage when logging is disabled)
o Try to works better with huge SPD/SAD
o Corrected modecfg option syntax
o Many other various fixes…
-
That is all foreign to me but maybe it has something to do with the problem.
-
I'll have to look at it more tomorrow, but I might be able to see if bumping ipsec-tools to 0.7.2 might help things along.
Can't promise anything though.
-
Those aren't the changes in 0.7.2, but the changes in 0.7.1. Here are the changes in 0.7.2:
0.7.2 - 22 April 2009
o Fix a remote crash in fragmentation code
o Phase2 message identities are phase1 specific (Vista compatibility)
o Autogenerate ChangeLog from cvs metadata
o Fix mode config pool resizing
o NAT-T fixes related to purging of IPsec SA:s and retransmission
o Remove phase1 handler immediately if first exchange is bad
o A bunch of memory leak and possible memory corruptions (triggerable by bad configuration or startup parameters)
Seems like an update that is worth upgrading to given how many crash fixes there are in it.
-
But this is not likely to be applied to 1.2.2….... :'( Or at least in 1.2.3 unless it is already there. Just concerned about stability.
-
But this is not likely to be applied to 1.2.2….... :'( Or at least in 1.2.3 unless it is already there. Just concerned about stability.
The devs have to consider what is worse:
1. Potential instability due to a new version of ipsec-tools or even an increase in stability due to bugs being fixed.
2. Shipping with a remote DoS attack vulnerability that has been known for 3 weeks now.
-
I am testing a build of 1.2.3-RC with ipsec-tools 0.7.2 and it may be my slightly weird test environment, but it didn't fix the issue so far.
That said, if I switch both ends of the IPSec tunnel to Aggressive Mode instead of Main Mode, then DPD seems to want to work, but doesn't actually get all the way.
May 14 10:17:55 pfsense-123test racoon: INFO: IPsec-SA request for x.x.x.49 queued due to no phase1 found.
May 14 10:17:55 pfsense-123test racoon: INFO: initiate new phase 1 negotiation: x.x.x.40[500]<=>x.x.x.49[500]
May 14 10:17:55 pfsense-123test racoon: INFO: begin Aggressive mode.
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: DPD
May 14 10:17:55 pfsense-123test racoon: NOTIFY: couldn't find the proper pskey, try to get one by the peer's address.
May 14 10:17:55 pfsense-123test racoon: INFO: ISAKMP-SA established x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
May 14 10:17:56 pfsense-123test racoon: INFO: initiate new phase 2 negotiation: x.x.x.40[500]<=>x.x.x.49[500]
May 14 10:17:56 pfsense-123test racoon: WARNING: ignore RESPONDER-LIFETIME notification.
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.49[0]->x.x.x.40[0] spi=165272301(0x9d9daed)
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.40[500]->x.x.x.49[500] spi=463118085(0x1b9a9f05)
[Power off Cisco VPN Concentrator]
May 14 10:19:10 pfsense-123test racoon: INFO: DPD: remote (ISAKMP-SA spi=5570f421d746e391:6a94f32073b1ed3f) seems to be dead.
May 14 10:19:11 pfsense-123test racoon: INFO: ISAKMP-SA deleted x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
The log message is all well and good except it didn't actually delete the SAs. They're still there.
I'm making a new build right now to see if things behave any differently. Before that, I'm going to go back to a stock 1.2.3 snapshot and see if Aggressive mode behaves the same way.
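In case it helps anyone reproduce this, below is a rough Python sketch that pulls the relevant events out of a racoon log: the DPD "seems to be dead" notice, the ISAKMP-SA deletion, and the SPIs of the IPsec-SAs that were established. The function name and the regular expressions are my own and are based only on the lines pasted above, so treat them as assumptions. The sketch does not look at the kernel at all; it only lists which SPIs you would then look for in the SAD (for example with `setkey -D`) to see whether they were left behind.

```python
import re

# Patterns derived only from the racoon lines quoted in this post (assumption:
# other racoon versions may format these messages differently).
DPD_DEAD_RE = re.compile(
    r"DPD: remote \(ISAKMP-SA spi=([0-9a-f]+:[0-9a-f]+)\) seems to be dead"
)
ISAKMP_DEL_RE = re.compile(r"ISAKMP-SA deleted \S+ spi:([0-9a-f]+:[0-9a-f]+)")
IPSEC_EST_RE = re.compile(
    r"IPsec-SA established: ESP (\S+)->(\S+) spi=(\d+)\(0x([0-9a-f]+)\)"
)


def summarize(log_text):
    """Collect DPD/SA events from a racoon log excerpt (hypothetical helper)."""
    return {
        # Phase 1 cookies that DPD declared dead
        "dpd_dead": DPD_DEAD_RE.findall(log_text),
        # Phase 1 cookies that racoon says it deleted
        "isakmp_deleted": ISAKMP_DEL_RE.findall(log_text),
        # Phase 2 SPIs (hex) that were established and may still be in the SAD
        "ipsec_spis": ["0x" + m.group(4) for m in IPSEC_EST_RE.finditer(log_text)],
    }


# A subset of the log lines from the post above.
log = """\
May 14 10:17:55 pfsense-123test racoon: INFO: ISAKMP-SA established x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.49[0]->x.x.x.40[0] spi=165272301(0x9d9daed)
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.40[500]->x.x.x.49[500] spi=463118085(0x1b9a9f05)
May 14 10:19:10 pfsense-123test racoon: INFO: DPD: remote (ISAKMP-SA spi=5570f421d746e391:6a94f32073b1ed3f) seems to be dead.
May 14 10:19:11 pfsense-123test racoon: INFO: ISAKMP-SA deleted x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
"""

summary = summarize(log)
for cookie in summary["dpd_dead"]:
    deleted = cookie in summary["isakmp_deleted"]
    print(f"phase 1 {cookie}: declared dead, deleted in log: {deleted}")
for spi in summary["ipsec_spis"]:
    print(f"phase 2 SPI to look for in the SAD: {spi}")
```

For the log above this reports the phase 1 SA as both dead and deleted, plus the two phase 2 SPIs, which matches what I'm seeing: racoon drops the ISAKMP-SA but the IPsec-SAs are a separate matter.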
-
All is now working! I removed the Concentrator from the DMZ of my local pfSense and connected it directly to a public IP. I noticed in the firewall logs that my firewall was blocking port 500 traffic and all other traffic which originated from the remote sites to my local site. Odd since I created a rule on my DMZ allowing all traffic to pass to the public interface of my concentrator.
Lan 10.20.30.1
DMZ 10.20.20.1
Concentrator Private: 10.20.30.2
Concentrator Public: 10.20.20.2
I was seeing my customers IP's being blocked:
10.0.0.0/24
192.168.127.0/24
172.20.30.0/16
These were being blocked both on the LAN interface and on the DMZ. I will post some of the logs. I just need to reconfigure to internal again rather than direct connect to public IP on Concentrator Public interface.
-
So your tunnel reestablishes OK now after a power cycle?
That is strange, since my concentrator is already on a public IP and has no filtering in front of it. Its public is on the same switch as my pfSense test box.
And yet if I power cycle the concentrator, the tunnel never comes back up.
Are you sure nothing else changed in all that?
-
I did change the DPD setting on the pfSense 1.2.2 box by editing the conf file. I think DPD was enabled for phase 1 or phase 2 and I enabled it for the other one in the conf file. If I can remember which file it was, I will post the changes I made.