Upgrade to 2.4.5 broke 802.1x RADIUS WiFi over VPN
-
Hello,
I'm so glad to have found this thread!! I have been investigating the same issue for a few days and for some reason did not find this thread until now!!
I have a very similar environment:
- Main site connecting to multiple sites using OpenVPN site-to-site (Peer to Peer (SSL/TLS) || UDP || TUN) tunnels
- Main site hosting Windows NPS (RADIUS) server
- Branch sites with UniFi UAP access points, authenticating domain joined Windows clients against RADIUS server at main site, using WPA2-Enterprise (computer authentication)
As in the initial post, it is possible to connect to servers/PCs at main site and to servers/PCs between sites and firewalls are set to allow all traffic over OpenVPN tunnels.
I first experienced a problem after upgrading sites from pfSense 2.4.4-p3 to 2.4.5: after the upgrades, wireless clients were unable to authenticate against the RADIUS server over the S2S OpenVPN tunnels.
At the time I was able to get clients to authenticate by changing Framed-MTU on Windows NPS server as per this article:
https://support.microsoft.com/en-us/help/883389/how-to-reduce-the-eap-packet-size-by-using-the-framed-mtu-attribute-in
I used the suggested Framed-MTU = 1344
pfSense on the main site is hosted on Hyper-V and was not upgraded from 2.4.4-p3 due to performance problems reported for pfSense 2.4.5 running on hypervisors.
With performance issues resolved with 2.4.5-p1 release, I upgraded main site (and all branch sites) to 2.4.5-p1, and now wireless clients are not able to authenticate.
I should add, I had OpenVPN custom options in place:
OpenVPN custom options used on pfSense 2.4.4:
fragment 1400; mssfix;
OpenVPN custom options on pfSense 2.4.5:
tun-mtu 1500; tun-mtu-extra 32; mssfix 1450;
I have tried several things, including different OpenVPN custom options for the S2S server/clients and lower values for Framed-MTU on the Windows NPS server, but have not been able to get wireless clients to authenticate. Clients on the same site as the RADIUS server continue to work as expected, and the only change to the environment was the pfSense upgrades to 2.4.5-p1.
I will assign OpenVPN interfaces and revert back.
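For readers weighing the MTU values mentioned in this post, here is a rough sketch of how standard IPv4 fragmentation splits one UDP datagram (such as a RADIUS request carrying a large EAP message) for a given path MTU. The function name `ipv4_fragment_sizes` is made up for illustration, and the sketch assumes a 20-byte IPv4 header with no options; it does not model OpenVPN's own `fragment` or `mssfix` behaviour.

```python
from typing import List


def ipv4_fragment_sizes(udp_payload_len: int, mtu: int, ip_header_len: int = 20) -> List[int]:
    """Return the size of each IPv4 packet needed to carry one UDP datagram.

    Assumes a fixed IPv4 header length (20 bytes by default, i.e. no options).
    """
    ip_payload = 8 + udp_payload_len       # 8-byte UDP header + data
    max_payload = mtu - ip_header_len      # room left after the IP header
    if max_payload < 8:
        raise ValueError("MTU too small to carry any fragment")

    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (max_payload // 8) * 8

    sizes = []
    remaining = ip_payload
    while remaining > max_payload:
        sizes.append(ip_header_len + per_fragment)
        remaining -= per_fragment
    sizes.append(ip_header_len + remaining)
    return sizes


if __name__ == "__main__":
    # 1472 bytes of UDP data just fits in a 1500-byte MTU; one more byte does not.
    print(ipv4_fragment_sizes(1472, 1500))  # [1500]
    print(ipv4_fragment_sizes(1473, 1500))  # [1500, 21]
```

A result with more than one entry means the request would need to be fragmented on that path, which is the situation discussed in this thread.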
-
@wdup said in Upgrade to 2.4.5 broke 802.1x RADIUS WiFi over VPN:
...
I will assign OpenVPN interfaces and revert back.
You can either assign OpenVPN interfaces or revert back. No need to do both.
Actually, from what I have seen, you only need to assign OpenVPN an interface on the main site (with the NPS), not on the remote sites. I am running pfSense 2.4.5_p1 on HyperV, so it should work the same way for you.
IMO Netgate should mention this in the Assigning OpenVPN Interfaces documentation, as there is no indication it is necessary for proper fragmentation handling. Also IMO it should not be necessary for proper fragmentation handling; proper handling of fragmented packets should be a baseline for the VPN to be considered working. But at least a note in the docs would be nice.
-
@DAVe3283 LOL ... apologies, my previous comment was ambiguous, I meant I will revert back with feedback after assigning the OpenVPN interfaces
I have indeed assigned the OpenVPN interfaces and I'm happy to confirm the problem is resolved.
To confirm, I have also only assigned OpenVPN interfaces on the main site where the Windows NPS server is hosted, and clients can authenticate again.
However, my concern is we may be in a unique situation where we experience this "problem", but I would like to understand what has in fact changed from previous pfSense versions and what the underlying cause of the "problem" is?
Even though the "problem" is solved by only assigning OpenVPN interfaces at the main site, I feel it might be best to assign ALL OpenVPN interfaces at ALL sites to avoid similar "problems" going forward - what do you think?
I agree a note in the documentation would be great!
-
It's probably this you're hitting: https://redmine.pfsense.org/issues/7779
You could confirm it by checking the packet sizes and whether the packets are fragmented in a packet capture.
If you are using RADIUS with UDP this is more likely to be an issue. If it's using TLS, and therefore TCP, I expect it to detect the route MTU and use packets that do not fragment. If it is not doing so you should investigate that.
Steve
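As a sketch of the check suggested above, the fragmentation fields can be read straight from the IPv4 header bytes of a captured packet. The helper name `ipv4_fragment_info` is hypothetical, and the code assumes you already have the raw bytes starting at the IPv4 header (for example, exported from a capture).

```python
import struct


def ipv4_fragment_info(packet: bytes) -> dict:
    """Read the length and fragmentation fields from a raw IPv4 header."""
    if len(packet) < 20:
        raise ValueError("need at least the 20-byte IPv4 header")

    # First 8 bytes: version/IHL, TOS, total length, identification, flags+offset.
    version_ihl, _tos, total_length, _ident, flags_frag = struct.unpack("!BBHHH", packet[:8])
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")

    more_fragments = bool(flags_frag & 0x2000)    # MF flag
    fragment_offset = (flags_frag & 0x1FFF) * 8   # offset is stored in 8-byte units
    return {
        "total_length": total_length,
        "dont_fragment": bool(flags_frag & 0x4000),  # DF flag
        "more_fragments": more_fragments,
        "fragment_offset": fragment_offset,
        "is_fragment": more_fragments or fragment_offset > 0,
    }
```

A packet with `more_fragments` set, or with a non-zero `fragment_offset`, is part of a fragmented datagram.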
-
@stephenw10 Thank you for the reply.
If I may ask the question differently, is there any harm in assigning ALL OpenVPN interfaces?
-
Not really. You will need to restart any OpenVPN servers after assigning them as an interface though.
Also, to actually make use of it, make sure traffic is passed by the firewall rules on the assigned interface and not by the 'OpenVPN' rules.
Steve
-
Hi there, I am running 2.5.1 on 2 sites, with a site2site openvpn.
I would like to get radius to work in both directions in order to have fall-back NPS for Wifi.
Right now there is a rule in the openvpn interface which allows all.
There is also one for opt3 which is not handling any traffic though.
Would it be enough to disable the rule on the openvpn interface to get the traffic handled by the opt3 interface?
I would need to do that on both sites I guess, to have it working in both directions?
-
Yes, if you disable a rule on the group OpenVPN interface traffic will hit rules on the assigned interfaces and get the required reply-to tags.
Steve
-
@stephenw10 Hey there, I was brave and tested to change those settings from remote :)
Nothing broke. Traffic is being handled by the interface specific rule now.
But still I don't get any request on the RADIUS server on the other tunnel end. Always bad UDP checksum...
-
@ogghi said in Upgrade to 2.4.5 broke 802.1x RADIUS WiFi over VPN:
Always bad UDP checksum...
In a packet capture?
That's expected if you have checksum offloading enabled on the capture interface.
You're not seeing the radius traffic arrive at the server at all?
Steve
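For anyone who wants to see what a capture tool is comparing against when it reports a bad UDP checksum, here is a minimal sketch of the standard IPv4 UDP checksum (pseudo-header plus UDP segment, one's-complement sum). The function names are made up for this example. With checksum offloading, the NIC fills in the checksum after the capture point, so a mismatch in a capture taken on the sending host does not by itself mean the packet is bad on the wire.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length data with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Compute the checksum for a UDP segment (header + data) sent over IPv4."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, len(udp_segment))   # zero, protocol 17 (UDP), UDP length
    )
    # The checksum field (bytes 6-7 of the UDP header) counts as zero while summing.
    zeroed = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    checksum = ~ones_complement_sum(pseudo_header + zeroed) & 0xFFFF
    return checksum or 0xFFFF                # a computed 0 is transmitted as 0xFFFF


def udp_checksum_ok(src_ip: str, dst_ip: str, udp_segment: bytes) -> bool:
    """Compare the checksum stored in the segment with the computed one."""
    (stored,) = struct.unpack("!H", udp_segment[6:8])
    return stored == udp_checksum(src_ip, dst_ip, udp_segment)
```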
-
@stephenw10 nothing arrives on the radius server from over the vpn connection.
That's the weird thing. At least nothing is logged in the windows service...
-
Hmm, well I'd pcap on the server to be sure. I'd also pcap at each interface in the route to see where it's failing.
We have seen issues with large UDP packets not fragmenting correctly across the tunnel. You would see that in a pcap if you are hitting that or something similar.
Steve
-
@stephenw10 Just did some packet capture. On the ADC on the other tunnel side:
On the one where it's working:
I am wondering why the length seems to be capped at 190 bytes for the one going through the tunnel...?
-
190B may just be the size of that request.
Where, specifically did you capture there?
I would check on the OpenVPN and internal interfaces at both ends of the tunnel. The traffic should appear in all 4 places, but since something is failing it may not. You need to determine where it's failing.
Steve
-
@stephenw10 thanks for your help! :)
So I did capture traffic. Seems there is just no reply from the RADIUS server. Traffic gets to the server, but there is never any packet being sent back.
So it seems like debugging this Windows NPS is due here!
EDIT: Seems it must be some issue on the Windows firewall? The NPS server logs nothing at all. If I run the NTRadPing tool locally, it shows at least some log entries. But other than opening port 1812 UDP on the firewall... what else could I do here?
-
Does it log something if there is a bad request? Incorrect shared secret for example.
You might be able to see some difference in the radius requests that fail in Wireshark. They are smaller packets, as you noted.
I don't think that's a problem in the pfSense config though if traffic arrives at the server and looks the same as when it arrives in the remote firewall.
Steve
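If it helps to compare a working and a failing request outside Wireshark, the RADIUS packet layout (code, identifier, length, 16-byte authenticator, then type/length/value attributes) is simple enough to walk by hand. This is only a sketch: `parse_radius` is a made-up helper, it expects the UDP payload bytes of a single RADIUS packet, and it returns attribute values as raw bytes without decoding them.

```python
import struct
from typing import List, Tuple

# Packet codes relevant to authentication; anything else is shown by number.
RADIUS_CODES = {1: "Access-Request", 2: "Access-Accept", 3: "Access-Reject", 11: "Access-Challenge"}


def parse_radius(payload: bytes) -> Tuple[str, int, bytes, List[Tuple[int, bytes]]]:
    """Split a RADIUS packet into (code name, identifier, authenticator, attributes)."""
    if len(payload) < 20:
        raise ValueError("shorter than the 20-byte RADIUS header")

    code, identifier, length = struct.unpack("!BBH", payload[:4])
    if length < 20 or length > len(payload):
        raise ValueError("length field does not match the captured data")
    authenticator = payload[4:20]

    attributes = []
    offset = 20
    while offset < length:
        if offset + 2 > length:
            raise ValueError("truncated attribute header")
        attr_type, attr_len = payload[offset], payload[offset + 1]
        # attr_len counts the two header bytes as well as the value.
        if attr_len < 2 or offset + attr_len > length:
            raise ValueError("malformed attribute")
        attributes.append((attr_type, payload[offset + 2:offset + attr_len]))
        offset += attr_len

    return RADIUS_CODES.get(code, f"code {code}"), identifier, authenticator, attributes
```

Running this on both requests and comparing the attribute lists (for example type 12, Framed-MTU, or type 79, EAP-Message) is one way to spot what differs between them.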
-
@stephenw10 Hi there!
I just checked again the radius config for the auth servers in the pfSense. Actually I reconfigured it. Now the packet sizes are identical.
I get the bad UDP checksum also for the radius on the ADC without VPN where it's working.
So my current thought is that there might be an issue with the NPS itself. I'll try to uninstall/reinstall the role there. Who knows...
-
@ogghi I think I'll try and debug on the windows server/NPS side. The packets arrive at the windows server as seen on Wireshark. But nothing is ever logged on NPS. So it might be some really stupid bug here..