Fix the VPN IPSEC Dead Peer Detection in 1.2.2 or 1.2.3 {$200}
-
Exactly!!!!!!!!!!!!!!!! ??? ??? ??? :'( :'( :'(
-
I did notice something weird, if you look at the times on the DPD packets in my tcpdump, they were being initiated by the VPN concentrator, not pfSense.
Perhaps that is why it isn't working? Even though pfSense is set for DPD, it has negotiated it with Cisco but is only replying and not initiating DPD checks?
Just a guess… probably needs more experimentation.
-
What I find odd is that none of the devs have responded to this post, or to the other posts I have seen about this issue, even just to tell us whether it is a bug or whether it cannot be fixed.
I will probably end up buying Cisco Linksys RV042s instead of implementing pfSense at remote locations until this is fixed. I need to deploy about 7 locations with remote VPNs, and having to intervene manually each time is a little too much for me.
-
Actually, if you use the term "dev" loosely, I am one :-)
(I am a committer on 2.0/HEAD and for packages, but I'm not a part of the core team)
This still feels more like a racoon bug than a pfSense bug. If it does turn out to be a pfSense bug, it may just be in terms of how DPD is being configured by the WebGUI. There isn't much to go wrong there, though.
I can try to build a tunnel to a 2.0 box and see if it behaves the same way.
-
As far as I know both should be able to initiate DPD. I think this is part of the problem as DPD is not available in 1.2.2. I have tried modifying the file in pfsense to enable it but it does nothing once I have edited the file.
In 1.2.3 people are having the same problem. I wish someone would shed some light on the issue.
-
I should mention that I am testing this with 1.2.3, so DPD should be working, in theory anyhow.
-
Could it be something to do with the version of racoon? I could not find much info about it besides a project called Kame..but that did not tell me much.
-
I have another tunnel between this 1.2.2 and another 1.2.2 and the tunnel has no problem coming back up. But others have complained about it so there is no consistency.
-
Could it be something to do with the version of racoon? I could not find much info about it besides a project called Kame..but that did not tell me much.
That's possible, which is why I want to try it against 2.0 as well. They are both running the same version of ipsec-tools (http://ipsec-tools.sourceforge.net/) though:
from 1.2.3:
May 11 12:33:07 pfsense-123test racoon: INFO: @(#)ipsec-tools 0.7.1 (http://ipsec-tools.sourceforge.net)
from 2.0:
May 11 11:32:58 pfSense-20test racoon: INFO: @(#)ipsec-tools 0.7.1 (http://ipsec-tools.sourceforge.net)
There is a slightly newer version of ipsec-tools (0.7.2) out, but I don't see any detailed release notes on the web site that state what bugs were fixed.
-
It appears that what I thought was the DPD packet may have been Cisco's IPSec keep-alive. I disabled that, and left DPD enabled on the pfSense side, and I see no regular traffic on the tunnel.
That makes me think that either DPD isn't being turned on, or it isn't being negotiated properly for the tunnel.
With Cisco's Keep-Alive turned on, I get:
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: DPD
With it disabled, I get:
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
That makes me wonder if, perhaps, the Cisco side really doesn't support DPD as well as it claims to. I'm not sure if there is some other mechanism to signal racoon to reload a specific tunnel when something else fails (i.e. the keepalive ping). That would probably be more reliable than DPD, since DPD must be supported by both sides.
Why this works for some vendors/devices and not others is also puzzling…
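To make the comparison between the two log excerpts above easier to repeat, here is a minimal Python sketch that scans racoon log lines and reports whether the peer advertised the DPD Vendor ID. The function names are made up for illustration, and the regular expression assumes the log line format shown in this thread; it has not been checked against other racoon versions.

```python
import re

# Matches lines such as:
#   "... racoon: INFO: received Vendor ID: DPD"
# The format is taken from the log excerpts in this thread (assumption:
# other racoon versions log Vendor IDs the same way).
VENDOR_ID = re.compile(r"racoon: INFO: received Vendor ID: (?P<vid>\S+)")


def advertised_vendor_ids(log_lines):
    """Return the set of Vendor ID strings racoon logged as received."""
    ids = set()
    for line in log_lines:
        match = VENDOR_ID.search(line)
        if match:
            ids.add(match.group("vid"))
    return ids


def peer_advertises_dpd(log_lines):
    """True if any line shows the peer sending the DPD Vendor ID."""
    return "DPD" in advertised_vendor_ids(log_lines)


# Log lines copied from the two tests above.
keepalive_on = [
    "May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY",
    "May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt",
    "May 11 15:41:44 pfsense-123test racoon: INFO: received Vendor ID: DPD",
]
keepalive_off = [
    "May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY",
    "May 11 16:26:21 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt",
]

print("keep-alive on :", peer_advertises_dpd(keepalive_on))
print("keep-alive off:", peer_advertises_dpd(keepalive_off))
```

This only tells you what the peer advertised during phase 1. It says nothing about whether DPD probes are actually being sent afterwards.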
-
Then why is it that when I reboot my concentrator the pfSense does not respond as seeing the tunnel go down at all but stays green?
-
Then why is it that when I reboot my concentrator the pfSense does not respond as seeing the tunnel go down at all but stays green?
That is what I'm trying to figure out… :-)
As you said, when both sides are pfSense, it seems to work. I'm wondering if the "invalid SPI" reply generated by the Cisco is broken in some way (or racoon's parsing thereof) such that it doesn't pick up on the fact that the tunnel traffic is being rejected.
-
Version history:
----------------
0.7.1 - 23 July 2008
o Fixes a memory leak when invalid proposal received
o Some fixes in DPD
o do not set default gss id if xauth is used
o fixed hybrid enabled builds
o fixed compilation on FreeBSD8
o cleanup in network port value manipulation
o gets ports from SADB_X_EXT_NAT_T_[SD]PORT if present in purge_ipsec_spi()
o Generates a log if cert validation has been disabled by configuration
o better handling for pfkey socket read errors
o Fixes in yacc / bison stuff
o new plog() macro (reduced CPU usage when logging is disabled)
o Try to works better with huge SPD/SAD
o Corrected modecfg option syntax
o Many other various fixes…
-
That is all foreign to me but maybe it has something to do with the problem.
-
I'll have to look at it more tomorrow, but I might be able to see if bumping ipsec-tools to 0.7.2 might help things along.
Can't promise anything though.
-
Those aren't the changes in 0.7.2, but the changes in 0.7.1. Here are the changes in 0.7.2:
0.7.2 - 22 April 2009
- Fix a remote crash in fragmentation code
- Phase2 message identities are phase1 specific (Vista compatibility)
- Autogenerate ChangeLog from cvs metadata
- Fix mode config pool resizing
- NAT-T fixes related to purging of IPsec SA:s and retransmission
- Remove phase1 handler immediately if first exchange is bad
- A bunch of memory leak and possible memory corruptions (triggerable by bad configuration or startup parameters)
Seems like an update that is worth upgrading to given how many crash fixes there are in it.
-
But this is not likely to be applied to 1.2.2….... :'( Or at least in 1.2.3 unless it is already there. Just concerned about stability.
-
But this is not likely to be applied to 1.2.2….... :'( Or at least in 1.2.3 unless it is already there. Just concerned about stability.
The devs have to consider what is worse:
1. Potential instability due to a new version of ipsec-tools or even an increase in stability due to bugs being fixed.
2. Shipping with a remote DoS attack vulnerability that has been known for 3 weeks now.
-
I am testing a build of 1.2.3-RC with ipsec-tools 0.7.2 and it may be my slightly weird test environment, but it didn't fix the issue so far.
That said, if I switch both ends of the IPSec tunnel to Aggressive Mode instead of Main Mode, then DPD seems to want to work, but doesn't actually get all the way.
May 14 10:17:55 pfsense-123test racoon: INFO: IPsec-SA request for x.x.x.49 queued due to no phase1 found.
May 14 10:17:55 pfsense-123test racoon: INFO: initiate new phase 1 negotiation: x.x.x.40[500]<=>x.x.x.49[500]
May 14 10:17:55 pfsense-123test racoon: INFO: begin Aggressive mode.
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: CISCO-UNITY
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: draft-ietf-ipsra-isakmp-xauth-06.txt
May 14 10:17:55 pfsense-123test racoon: INFO: received Vendor ID: DPD
May 14 10:17:55 pfsense-123test racoon: NOTIFY: couldn't find the proper pskey, try to get one by the peer's address.
May 14 10:17:55 pfsense-123test racoon: INFO: ISAKMP-SA established x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
May 14 10:17:56 pfsense-123test racoon: INFO: initiate new phase 2 negotiation: x.x.x.40[500]<=>x.x.x.49[500]
May 14 10:17:56 pfsense-123test racoon: WARNING: ignore RESPONDER-LIFETIME notification.
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.49[0]->x.x.x.40[0] spi=165272301(0x9d9daed)
May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.40[500]->x.x.x.49[500] spi=463118085(0x1b9a9f05)
[Power off Cisco VPN Concentrator]
May 14 10:19:10 pfsense-123test racoon: INFO: DPD: remote (ISAKMP-SA spi=5570f421d746e391:6a94f32073b1ed3f) seems to be dead.
May 14 10:19:11 pfsense-123test racoon: INFO: ISAKMP-SA deleted x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f
The log message is all well and good except it didn't actually delete the SAs. They're still there.
I'm making a new build right now to see if things behave any differently. Before that, I'm going to go back to a stock 1.2.3 snapshot and see if Aggressive mode behaves the same way.
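For anyone who wants to repeat this check on their own logs, here is a rough Python sketch that pulls out the phase 1 SA that DPD declared dead, confirms racoon logged its deletion, and lists the ESP SPIs that were established, so they can be compared by hand against the SAD (for example with setkey -D). The function name and the dictionary keys are made up for illustration, and the patterns assume the log format shown in this thread only.

```python
import re

# Patterns based on the racoon log lines in this thread (assumption:
# the format is the same on your build).
DEAD = re.compile(
    r"DPD: remote \(ISAKMP-SA spi=(?P<spi>[0-9a-f]+:[0-9a-f]+)\) seems to be dead"
)
DELETED = re.compile(r"ISAKMP-SA deleted .* spi:(?P<spi>[0-9a-f]+:[0-9a-f]+)")
ESP_UP = re.compile(
    r"IPsec-SA established: ESP .* spi=\d+\((?P<hex>0x[0-9a-f]+)\)"
)


def summarize_dpd_events(log_lines):
    """Collect DPD-dead and deleted phase 1 SPIs plus established ESP SPIs."""
    dead, deleted, esp_spis = set(), set(), []
    for line in log_lines:
        match = DEAD.search(line)
        if match:
            dead.add(match.group("spi"))
            continue
        match = DELETED.search(line)
        if match:
            deleted.add(match.group("spi"))
            continue
        match = ESP_UP.search(line)
        if match:
            esp_spis.append(match.group("hex"))
    return {
        "dead_phase1": dead,
        "deleted_phase1": deleted,
        # The log alone cannot tell us whether these are still in the SAD;
        # they are listed only so they can be checked manually.
        "esp_spis_to_check": esp_spis,
    }


# Relevant lines copied from the log above.
sample = [
    "May 14 10:17:55 pfsense-123test racoon: INFO: ISAKMP-SA established x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f",
    "May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.49[0]->x.x.x.40[0] spi=165272301(0x9d9daed)",
    "May 14 10:17:56 pfsense-123test racoon: INFO: IPsec-SA established: ESP x.x.x.40[500]->x.x.x.49[500] spi=463118085(0x1b9a9f05)",
    "May 14 10:19:10 pfsense-123test racoon: INFO: DPD: remote (ISAKMP-SA spi=5570f421d746e391:6a94f32073b1ed3f) seems to be dead.",
    "May 14 10:19:11 pfsense-123test racoon: INFO: ISAKMP-SA deleted x.x.x.40[500]-x.x.x.49[500] spi:5570f421d746e391:6a94f32073b1ed3f",
]

summary = summarize_dpd_events(sample)
print(summary)
```

If the ESP SPIs listed under esp_spis_to_check still show up in the SAD after the phase 1 SA is reported deleted, you are seeing the same behavior described above.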
-
All is now working! I removed the Concentrator from the DMZ of my local pfSense and connected it directly to a public IP. I noticed in the firewall logs that my firewall was blocking port 500 traffic and all other traffic which originated from the remote sites to my local site. Odd since I created a rule on my DMZ allowing all traffic to pass to the public interface of my concentrator.
LAN: 10.20.30.1
DMZ: 10.20.20.1
Concentrator Private: 10.20.30.2
Concentrator Public: 10.20.20.2
I was seeing my customers' IPs being blocked:
10.0.0.0/24
192.168.127.0/24
172.20.30.0/16
These were being blocked both on the LAN interface and on the DMZ. I will post some of the logs. I just need to move the concentrator back to the internal setup again, rather than connecting its public interface directly to a public IP.
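When turning the blocked sources into firewall rules, it may help to normalize them first. Here is a small Python sketch using the standard ipaddress module. Note that 172.20.30.0/16 as written has host bits set, so it is parsed non-strictly here. The /24 masks on the LAN and DMZ networks are an assumption, since the masks were not given above.

```python
import ipaddress

# Blocked source networks exactly as listed above.
blocked_sources = ["10.0.0.0/24", "192.168.127.0/24", "172.20.30.0/16"]


def normalize(cidr):
    """Parse a CIDR string, masking off any host bits (strict=False)."""
    return ipaddress.ip_network(cidr, strict=False)


normalized = [normalize(cidr) for cidr in blocked_sources]

# ASSUMPTION: the LAN and DMZ are /24 networks; only the interface
# addresses 10.20.30.1 and 10.20.20.1 were given, not the masks.
local_nets = [
    ipaddress.ip_network("10.20.30.0/24"),  # LAN
    ipaddress.ip_network("10.20.20.0/24"),  # DMZ
]

# Any remote network that overlaps a local one would make rule
# matching (and routing through the tunnel) ambiguous.
overlaps = [
    (str(remote), str(local))
    for remote in normalized
    for local in local_nets
    if remote.overlaps(local)
]

for original, net in zip(blocked_sources, normalized):
    print(original, "->", net)
print("overlaps with local networks:", overlaps)
```

Under the /24 assumption, none of the remote networks overlap the LAN or DMZ, so an address conflict would not explain the blocks.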