pfSense IPsec Microsoft Azure MTU
-
Hi All
I've been chasing down a problem with an IPsec S2S VPN to Microsoft Azure for a few days now. The basic issue is that whatever I've tried in pfSense (MSS clamping, explicitly setting the MTU of the LAN/WAN interfaces), pfSense does not seem to participate in PMTUD, so from my client LAN I end up with an MTU black hole between 1420 and 1492 bytes (1492 being the MTU of my PPPoE link).
If I manually set my client NIC's MTU to 1420, then the problem disappears. If I swap the pfSense box for a Draytek 2860, I do not experience any issues (and I have about 6 or 7 locations using Draytek 2860s without any issues).
Microsoft actually recommends setting the MTU of the IPsec VPN to 1400, or, if setting the MTU is not possible, setting the MSS to 1350 instead. I have tried setting the MSS of the pfSense VPN to 1350, but obviously this has no impact on UDP traffic.
Is it possible to explicitly set the MTU of an IPSec VPN tunnel? Or otherwise force pfSense to participate in PMTUD, as clearly the Draytek router is able to do.
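For anyone wanting to reproduce this: a don't-fragment ping from a LAN client is a quick way to map out a black hole like this (a sketch; the remote address is a placeholder, and the ping lines are shown commented since they need a live tunnel):

```shell
# ICMP payload size = target MTU - 20 (IP header) - 8 (ICMP header)
echo $((1492 - 28))   # 1464: payload that should fit the 1492-byte PPPoE link
echo $((1400 - 28))   # 1372: payload that should fit Azure's recommended 1400

# Linux client, DF bit set ("172.Y.Y.Y.10" is a placeholder remote host):
# ping -M do -s 1464 172.Y.Y.Y.10   # silently lost inside the black hole
# ping -M do -s 1372 172.Y.Y.Y.10   # should succeed
# Windows equivalent:
# ping -f -l 1464 172.Y.Y.Y.10
```

If the large probe is dropped with no "fragmentation needed" ICMP coming back, PMTUD is broken on the path.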
Thanks
Mark -
I had the same issue with UDP traffic, especially large packets. To solve it, I had to disable scrubbing. See:
https://docs.netgate.com/pfsense/en/latest/config/advanced-firewall-nat.html
Another option is to use baby jumbo (RFC 4638) frames for PPPoE, which raises the PPPoE MTU to 1500. See:
https://forum.netgate.com/topic/78754/rfc4638-baby-jumbo-frames-for-pppoe-connections-mtu-1508/5
-
@rai80 Thanks - but what benefit would baby jumbo frames have here? I need pfSense to drop the MTU of the VPN tunnel to 1400 bytes in order to be compatible with Azure.
I've also tried disabling scrubbing, but it doesn't appear to have any effect. (The problem I am trying to troubleshoot is EAP-TLS WPA2-Enterprise certificate authentication. This uses large UDP packets, but I feel I need to resolve this MTU black hole issue first.)
-
@rolytheflycatcher Baby jumbo frames will not help in this case. They solve some MTU issues with IPv6.
A month ago I dug into the same issue: EAP-TLS with certificates and large UDP packets. In my case it was solved when I disabled scrubbing. I have the same scenario: W10 client -> pfSense box -> IPsec VPN -> Azure -> W2019 Server with NPS/PKI.
Did you set Framed-MTU to 1344 in NPS?
The default MTU for the IPsec interface is 1400. MSS clamping is disabled by default; you can enable it, but in my case it did not help.
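For reference, the usual clamp value follows from the tunnel MTU minus the TCP/IP headers (a quick sketch, using the 1400-byte default IPsec MTU and Microsoft's suggested MSS of 1350):

```shell
# MSS = MTU - 20 (IP header) - 20 (TCP header)
echo $((1400 - 40))   # 1360: MSS matching the default 1400-byte IPsec MTU
echo $((1350 + 40))   # 1390: the MTU implied by Microsoft's suggested MSS of 1350
```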
To make sure it's an MTU issue and packets are getting dropped, do a packet trace on the pfSense incoming interface and a packet trace on the RADIUS/NPS server interface, and compare the packets. In my case only a few packets came through; most were missing.
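A minimal capture recipe for comparing the two sides (a sketch; "igb0" is a placeholder for the pfSense LAN interface, adjust to your hardware):

```shell
# On the pfSense box (Diagnostics > Command Prompt, or SSH),
# capture RADIUS traffic arriving on the LAN interface:
tcpdump -ni igb0 -w /tmp/lan-radius.pcap 'udp port 1812'

# On the Windows NPS server, run Wireshark with the capture filter:
#   udp port 1812
# Then compare the two captures packet-by-packet for missing datagrams.
```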
-
@rai80 I have exactly the same setup as you, except I have NPS on a Win2k16 server in Azure.
Framed-MTU is actually set to 1000 in NPS - I can't remember why that number, but I had trouble when I first set up EAP-TLS and I think I might have set it to 1000 to be well clear of any MTU limits.
I have EAP-TLS/cert auth working fine at several locations using Draytek 2860 routers as the VPN endpoint, so this problem is specific to pfSense - when I swap my pfSense box for a spare Draytek 2860, the problem disappears.
Good idea about looking at packet traces on both sides.
-
@rai80 just following up on this - did disabling scrubbing really fix this issue for you? I have tried it myself and it doesn't seem to have any effect on RADIUS EAP-TLS packets.
-
@rolytheflycatcher Yes, in my case it did. Did you compare packet captures at both sides?
-
@rai80 Yes - even with scrubbing disabled, I am not seeing all of the RADIUS messages pass. Lines 1 to 6 are identical at both ends. However, lines 7, 8 and 9 do not reach Wireshark on the RADIUS server.
- 22:29:39.739546 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, length 286
- 22:29:39.774041 IP 172.Y.Y.Y.1812 > 172.X.X.X.41168: UDP, length 90
- 22:29:39.793664 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, length 456
- 22:29:39.824191 IP 172.Y.Y.Y.1812 > 172.X.X.X.41168: UDP, length 1086
- 22:29:39.830957 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, length 290
- 22:29:39.863551 IP 172.Y.Y.Y.1812 > 172.X.X.X.41168: UDP, length 1018
- 22:29:39.900166 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, bad length 1786 > 1472
- 22:29:42.902720 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, bad length 1786 > 1472
- 22:29:48.912688 IP 172.X.X.X.41168 > 172.Y.Y.Y.1812: UDP, bad length 1786 > 1472
After this failure, the sequence repeats.
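For what it's worth, tcpdump's "bad length 1786 > 1472" just means the capture only saw the first fragment of a larger UDP datagram, which is consistent with fragments being dropped somewhere on the path:

```shell
# Max UDP payload in a single 1500-byte Ethernet frame:
echo $((1500 - 20 - 8))   # 1472 (1500 - IP header - UDP header)
# The UDP header claims 1786 bytes of payload, so the remainder
# travels in a follow-up IP fragment:
echo $((1786 - 1472))     # 314 bytes in the second fragment
```

If that second fragment never arrives (or is scrubbed), the server can't reassemble the datagram and the whole RADIUS message is lost.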
-
Is there some other device in the network which blocks packet fragmentation? What do you see when you enable scrubbing?
-
@rai80 no, the only devices are:
Unifi AP -> pfSense -> IPSec -> MS Azure -> RADIUS server (VM in Azure)
I also experienced exactly the same problem with a Cisco Mobility Express AP. When I replace the pfSense router with a Draytek router, the problem disappears.
Disabling/Enabling scrubbing does not seem to make any difference to the logs. This is with scrubbing enabled:
23:03:44.443059 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, length 286
23:03:44.476094 IP Y.Y.Y.Y.1812 > X.X.X.X.41168: UDP, length 90
23:03:44.482902 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, length 456
23:03:44.515550 IP Y.Y.Y.Y.1812 > X.X.X.X.41168: UDP, length 886
23:03:44.574340 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, length 290
23:03:44.605656 IP Y.Y.Y.Y.1812 > X.X.X.X.41168: UDP, length 886
23:03:44.611526 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, length 290
23:03:44.644070 IP Y.Y.Y.Y.1812 > X.X.X.X.41168: UDP, length 537
23:03:44.670406 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, bad length 1786 > 1472
23:03:47.679874 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, bad length 1786 > 1472
23:03:53.689705 IP X.X.X.X.41168 > Y.Y.Y.Y.1812: UDP, bad length 1786 > 1472
-
I think the messages are smaller this time because I reduced the Framed-MTU on the RADIUS server down to around 800 as a test.
-
@rolytheflycatcher There is a known bug: https://redmine.pfsense.org/issues/7801. Some people found a workaround; see the comments.
-
@rai80 Thank you - I saw that bug report but still couldn't get things working.
However, on this thread @stephenw10 answered a general query I had about PMTUD not appearing to work. It seems that PMTUD does not work with policy-based IPsec, but it does work with route-based IPsec. In my case I had been using a policy-based IPsec tunnel. As soon as I set up route-based IPsec (with static routes for now, though I'm sure BGP will work too), my RADIUS/EAP-TLS issue disappeared - and with scrubbing enabled (i.e. default pfSense settings).
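For anyone landing here later: a route-based phase 2 gives pfSense an ipsecN interface you can inspect, and whose MTU can be pinned from the shell if needed (a sketch; the interface number is illustrative, and a shell change does not persist - set it under Interfaces once the VTI is assigned):

```shell
# Show the VTI interface created by the route-based phase 2
# (the interface number varies per tunnel)
ifconfig ipsec1

# Temporarily pin the tunnel MTU to Azure's recommended 1400
ifconfig ipsec1 mtu 1400
```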