IKEv2 / ISAKMP from iOS device behind pfSense / NAT-T not working
-
Hi all,
I'm working on getting IKEv2 VPNs working from iOS 9.1 devices running behind a pfSense 2.2.5 firewall to a VPN gateway on the Internet (not behind NAT).
When I attempt to start the VPN connection, it tries for a few seconds and then fails. If I drop the iOS device off my network and onto the cellular network, it works straight away. After taking a few packet captures, I noticed that the outgoing IKE (500/udp) traffic is being NATed properly but the ISAKMP traffic (4500/udp) is not being NATed at all.
Here is a summary of the packet capture (taken from the WAN side of pfSense) that illustrates the problem:
1. [pfSense WAN IP] -- IKE 500/udp --> [VPN Gateway]
2. [VPN Gateway] -- IKE 500/udp --> [pfSense WAN IP]
3. [iOS device *internal* RFC1918 IP] -- ISAKMP 4500/udp --> [VPN Gateway] (*timeout*)
4. [iOS device *internal* RFC1918 IP] -- ISAKMP 4500/udp --> [VPN Gateway] (*timeout*)
5. [iOS device *internal* RFC1918 IP] -- ISAKMP 4500/udp --> [VPN Gateway] (*timeout*)
As you can see, the initial IKE negotiation is being NATed correctly, with the iOS device internal IP being replaced with the pfSense WAN IP - as expected - but the subsequent ISAKMP / NAT-T traffic on 4500/udp is not being translated at all.
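For anyone wanting to check their own WAN-side capture for the same symptom, here is a minimal Python sketch. It assumes the capture has already been reduced to source, destination and destination port per packet. The addresses are documentation placeholders rather than values from my capture, and `find_untranslated` is just a name made up for illustration. Any packet leaving the WAN with an RFC1918 source address has bypassed outbound NAT.

```python
import ipaddress

# The three RFC1918 private ranges. Checked explicitly rather than with
# ipaddress.is_private, which also matches documentation ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address falls inside an RFC1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


def find_untranslated(packets):
    """Return WAN-side packets whose source is still a private address."""
    return [p for p in packets if is_rfc1918(p["src"])]


# Placeholder addresses (assumptions, not taken from the real capture).
WAN_IP = "203.0.113.5"
VPN_GATEWAY = "198.51.100.20"
IOS_DEVICE = "192.168.1.50"

capture = [
    {"src": WAN_IP, "dst": VPN_GATEWAY, "dport": 500},
    {"src": VPN_GATEWAY, "dst": WAN_IP, "dport": 500},
    {"src": IOS_DEVICE, "dst": VPN_GATEWAY, "dport": 4500},
    {"src": IOS_DEVICE, "dst": VPN_GATEWAY, "dport": 4500},
    {"src": IOS_DEVICE, "dst": VPN_GATEWAY, "dport": 4500},
]

leaked = find_untranslated(capture)
for pkt in leaked:
    print("not NATed:", pkt["src"], "->", pkt["dst"], "port", pkt["dport"])
```

With the five packets summarised above, only the three 4500/udp packets are flagged.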
Outbound NAT configuration is set to "Automatic outbound NAT rule generation (IPSec passthrough included)" with no Port Forward, 1:1 or NPt configuration. IPSec is also not enabled on pfSense itself.
Has anyone else come across it before?
Many thanks in advance,
Ed
-
Filter for :4500 under Diag>States, what do the matching states look like?
Only way I can think of that happening is if you had manual outbound NAT and weren't NATing 4500, or maybe if the firewall itself had a connection out with the same source and dest ports to the same remote IP already and the outbound was matching that state.
-
@cmb:
Filter for :4500 under Diag>States, what do the matching states look like?
Here's a representation of what states look like when I filter for [VPN Gateway] under Diags > States:
INT  PROTO  Source -> Router -> Destination  State
LAN  udp  [VPN Gateway]:500 <- [iOS device internal IP]  MULTIPLE:MULTIPLE
WAN  udp  [pfSense WAN IP]:500 ([iOS device internal IP]:500) -> [VPN Gateway]:500  MULTIPLE:MULTIPLE
So the initial IKE (500/udp) traffic is there, but no mention at all of the ISAKMP 4500/udp traffic. It also shows there isn't an existing NAT on that port to the same destination server.
Very odd :-\
-
Where/how are you seeing the 4500 traffic leaving without NAT?
You have some no state pass firewall rules defined?
-
@cmb:
Where/how are you seeing the 4500 traffic leaving without NAT?
From a packet capture that I took on pfSense of the WAN interface, filtering on all traffic going to the VPN server. I can send it to you privately if you like.
@cmb:
You have some no state pass firewall rules defined?
I assume that's a rule where "State type" is set to "none"? If so, no.
The only thing I have that mentions 4500/udp are some traffic shaping rules on the floating tab:
-
If you disable that 4500 floating rule, does it not happen?
-
@cmb:
If you disable that 4500 floating rule, does it not happen?
Disabled all floating rules and reset states, still no go.
-
What if you completely remove your shaper config?
-
I had this same problem with my AT&T microcell. Turn off packet scrubbing. Set your NAT rules to manual, and delete the default rule that is created for IPSec.
-
I had this same problem with my AT&T microcell. Turn off packet scrubbing. Set your NAT rules to manual, and delete the default rule that is created for IPSec.
Thanks, have tried the following:
- "Disable Firewall Scrub" is checked under Advanced > Firewall/NAT
- Set NAT mode to manual: no go
- Deleted default IPSec rules: no go
- Recreated default IPSec rules: still no go
Here are the rules I currently have. This results in exactly the same packet capture / behaviour I documented initially (IKE 500/udp requests are being translated, ESP 4500/udp packets are not translated):
http://d.pr/i/17jSV
-
Some encouraging progress today.
One thing I should have mentioned initially is that the ISAKMP (4500/udp) packets are quite large (2032 bytes) and thus were being fragmented. After reading of some similar cases like mine on the forums, one of the suggestions was to set MSS clamping on the interfaces.
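To make the fragmentation concrete, here is a rough Python sketch of how a packet of that size splits on a standard Ethernet link. It assumes the 2032 bytes is the total IPv4 packet length, a 1500-byte MTU, and a 20-byte IP header with no options. None of those were confirmed from the capture, and `fragment` is just an illustrative helper. The point is that only the first fragment carries the UDP header, so later fragments have no port numbers of their own.

```python
IP_HEADER = 20  # bytes, assuming an IPv4 header with no options


def fragment(total_length, mtu=1500):
    """Split an IPv4 packet of total_length bytes into MTU-sized fragments."""
    payload = total_length - IP_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - IP_HEADER) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_chunk, payload - offset)
        fragments.append({
            "offset": offset,                    # byte offset into the payload
            "length": size + IP_HEADER,          # on-the-wire IP length
            "more_fragments": offset + size < payload,
            "has_udp_header": offset == 0,       # ports only in first fragment
        })
        offset += size
    return fragments


frags = fragment(2032)
for f in frags:
    print(f)
```

Under those assumptions the packet becomes two fragments of 1500 and 552 bytes, and the second one carries no UDP header.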
I've set the MSS on both the LAN and WAN to 1500 bytes and it now works!
There is still one problem though: it only works for one device. If I try connecting from a different device, it fails. Looking at the packet captures, traffic from the second device isn't being NATed at all. Traffic from the original / first device is NATed perfectly.
This appears to be NAT-related because if I reset states and initiate from the other device, it works while the first one doesn't. Essentially it's "first in, best dressed": every device after the first fails to NAT.
Any ideas on why it can only seem to handle / NAT one connection at a time?
-
Take the MSS out (it's not doing anything useful other than apparently forcing scrub on), and enable scrub. Then reset your states, and it should work after. Found what's most likely the source issue here, where you don't have scrub enabled, and have fragmented UDP traffic, it bypasses NAT on the egress interface. Gathering some more details to get a bug ticket opened.
-
@cmb:
Take the MSS out (it's not doing anything useful other than apparently forcing scrub on), and enable scrub. Then reset your states, and it should work after. Found what's most likely the source issue here, where you don't have scrub enabled, and have fragmented UDP traffic, it bypasses NAT on the egress interface. Gathering some more details to get a bug ticket opened.
I've removed the MSS clamping and switched on scrubbing. That seems to work as well as before; that is, I can establish a session from a single device but a second device fails to connect. Packet trace shows packets from the first device being NATed properly but packets from the second device still bypass NAT and vice-versa.
Let me know if there are any details I can provide.
Thanks so much for all your help so far.
-
Are you back to default auto outbound NAT as well?
Could you capture that traffic to a file and get me the pcap file? Can email that to me (cmb at pfsense dot org) with a link to this thread.
-
Bump, I'm seeing this too. Sadly I need this working ASAP, so I'm reverting to a full 2.2.5 backup (taken pre-2.2.6 on 2015/12/22 19:23:56) to see if this fixes it…
Will report back
-
Was a ticket ever opened on this issue?
@cmb:
Found what's most likely the source issue here, where you don't have scrub enabled, and have fragmented UDP traffic, it bypasses NAT on the egress interface. Gathering some more details to get a bug ticket opened.
-
Sounds like it could possibly be https://redmine.pfsense.org/issues/5819 which is fixed on 2.3. I kind of doubt the referenced commit would apply cleanly against 2.2.x (again, assuming it's related) but it's worth checking for someone hitting the issue.
-
Sounds like it could possibly be https://redmine.pfsense.org/issues/5819 which is fixed on 2.3. I kind of doubt the referenced commit would apply cleanly against 2.2.x (again, assuming it's related) but it's worth checking for someone hitting the issue.
Nope, that's not it. I don't have two WANs.
-
Did you try it? Don't dismiss it outright because of that one difference.