captive portal with radius, ACCEPTing username even if NPS has triggered a "deny" policy

jimmychoosshoes

pfSense 2.5.2 (2.6.0 was painfully slow for me, even after a rebuild and restore, so I rolled back to 2.5.2 until I have time to fault-find). A Windows server provides RADIUS via NPS. The captive portal runs on a VLAN, with pfSense running the DNS forwarder for the captive portal and a DHCP helper forwarding to Windows DHCP. I only have one zone at the moment.

The NPS connection request policy (CRP) is configured to accept the client based on the IP and NAS identifier of pfSense. This NPS also has VPN policies. Under Network Policies, the policy for pfSense also matches on the NAS identifier and RADIUS client IP. There is also a global DENY policy with time restrictions, which is meant to apply to all RADIUS requests to this NPS server.

The captive portal is set up with vouchers as primary and RADIUS as secondary. Vouchers work perfectly. RADIUS appears to work but needs tweaking:

If I input "incorrect domain details" into the captive portal:

Captive portal says "invalid credentials specified". Opening a new tab and trying to go to a webpage elicits a 204 redirect option to the captive portal (as expected).
pfSense Status > System Logs > Authentication > Captive Portal Auth shows FAILURE with the username I tried.
NPS server logs error code 16 (credential mismatch); the correct NAS identifier and IP are logged for the attempt.

If I input "correct domain details":

Captive portal works but takes around 30 seconds to redirect; meanwhile I can open a new tab and it browses instantly.
pfSense Status > System Logs > Authentication > Captive Portal Auth shows ACCEPT with the username I tried.
NPS server logs the attempt with no error codes and records the correct Network Policy as triggered: the "accept" network policy for pfSense.

If I input "correct domain details" at a time when the NPS "deny policy" is in effect:

Captive portal works but takes around 30 seconds to redirect; meanwhile I can open a new tab and it browses instantly.
pfSense Status > System Logs > Authentication > Captive Portal Auth shows ACCEPT with the username I tried.
NPS server logs the attempt with no error codes and records the "deny policy" Network Policy as triggered.

As you can see, even when my NPS deny policy is triggered and returned, I still get an ACCEPT logged in pfSense. I assume that I need to add some logic to pfSense so that it only acts on a "grant access" policy returned by NPS. How can I do this? Do I need to add a specific "vendor attribute" so that NPS tells pfSense "this is a deny access policy"?

Gertjan

@jimmychoosshoes

The thing is : when you use the pfSense FreeRadius package, you could stop this service in the GUI, and then, on the command line :

radiusd -X

Now you'll see the entire detailed FreeRadius log: the decisions made, the info received, and the info sent back to pfSense.

You are using another Radius system. So up to you to debug.

@jimmychoosshoes said in captive portal with radius, ACCEPTing username even if NPS has triggered a "deny" policy:

captive portal works but takes around 30 seconds to redirect

Doesn't work like that.
The captive portal itself isn't some code or script.
It's a firewall rule that has two states :
a) blocks your clients IP/MAC,
b) doesn't block your IP/MAC.

You can inspect the firewall rules at any moment to see in what state the firewall is.
The time it takes to go from "blocking" to "non-blocking" is about a microsecond (?). The PHP captive portal login page keeps on executing and will, at the end, redirect the browser the client is using. Subsequent http/https requests will flow through the firewall as if the portal wasn't there any more.

When you see "pfSense Status > System Logs > Authentication > Captive Portal Auth shows ACCEPT", the firewall rule is already in place. The code that logs the ACCEPT line runs just after.

@jimmychoosshoes said in captive portal with radius, ACCEPTing username even if NPS has triggered a "deny" policy:

takes around 30 seconds to redirect

As usual : I place all my bets on : you have a "DNS mess".

IMHO : the pfSense captive portal "client" might be somewhat hard coded to use the pfSense FreeRadius server package.
I know many use their own Radius server, so it might work. The thing is, Radius servers are not simple processes with 'some' settings. They have thousands of settings. It's up to you to see what pfSense can supply, how you deal with it within your Radius server, and what to send back (and what pfSense accepts).

jimmychoosshoes

@gertjan said in captive portal with radius, ACCEPTing username even if NPS has triggered a "deny" policy:

you are using another Radius system. So up to you to debug

This I understand, which is why I am here to look at options other people have tried. Hopefully someone who reads this has implemented Windows Server NPS and has seen something similar. My issue is that RADIUS is sending a "deny access" policy result back to pfSense and this is being interpreted as a "grant access".

"The captive portal itself isn't some code or script."

Surely the captive portal issues a response.redirect to the client after authentication? Is there a log of this action? I'm curious to know what redirect is being attempted or sent to the client. I suspect the client is requesting a captive portal detection destination (forgive me, I do not know the correct term). This is not a true web page or destination, and if the pfSense captive portal tries to return the user to this 204 request then a loop will occur. Without knowing the correct term I cannot fully describe what I mean.

For example, I can request a proper web page such as www.google.com. This will be intercepted by the captive portal, which redirects back to www.google.com when finished (assuming we are authenticated). However, with the Edge browser, an unauthenticated user browsing to a web page can get an Edge page stating "The network you are using may require you to go to its sign in page" with a "connect" button. I suspect this happens when the DNS name is already known, so the request is not necessarily intercepted by the captive portal but is refused by the firewall rule (again, a suspicion). Clicking this Edge-generated connect button leads to the pfSense captive portal login page as expected.

I suspect the delay is down to the captive portal being unable to redirect to this "check for captive portal" request that was generated by the client. I fixed this by cheekily adding a company policy page on our intranet and using the pfSense captive portal setting "After authentication Redirection URL".

jimmychoosshoes

I will add a reply rather than edit. I was correct: the reason for the 30 second loop is simply that the return URL is a generated 204 page. Edge attempts to go to http://edge-http.microsoft.com/captiveportal/generate_204, which results in a lengthy loop. Adding a forced redirect has solved it.

By manually altering the captive portal URL so that &redirurl goes to something more suitable (such as www.google.com) this 30 second loop is eliminated.
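To illustrate the manual edit described above, here is a minimal Python sketch that swaps the redirurl query parameter in a portal login URL. The hostname, port and zone name in the example URL are hypothetical placeholders, not taken from my setup, and the helper name set_redirurl is made up for illustration. It only rewrites a string; it does not talk to pfSense.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def set_redirurl(portal_url: str, target: str) -> str:
    """Return portal_url with its redirurl query parameter set to target."""
    parts = urlsplit(portal_url)
    # Keep every existing parameter except any previous redirurl.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "redirurl"
    ]
    query.append(("redirurl", target))
    # urlencode percent-encodes the target so it survives as one parameter.
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical login URL as a browser might show it after interception.
portal = (
    "https://portal.example.com:8003/index.php?zone=guest"
    "&redirurl=http%3A%2F%2Fedge-http.microsoft.com%2Fcaptiveportal%2Fgenerate_204"
)
print(set_redirurl(portal, "https://www.google.com/"))
```

The "After authentication Redirection URL" setting achieves the same result for every client without touching the URL by hand.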

Now I just need to find out why an NPS "deny access" policy returned to PFSENSE is not honored. I suspect I need to add something else to the policy so that PFSENSE knows that this is a "deny" policy not an "accept" policy.
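For reference while debugging this: in the RADIUS protocol (RFC 2865) the accept/deny outcome is carried in the Code field of the reply packet (2 = Access-Accept, 3 = Access-Reject), not in a vendor attribute. A packet capture between pfSense and NPS would therefore show which one is actually being sent. The sketch below is only an illustration of reading that field from raw packet bytes; it is not pfSense's code, the function names are made up, and it does not validate the Response Authenticator or parse any attributes.

```python
import struct

# Packet codes defined by RFC 2865.
RADIUS_CODES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}


def parse_header(packet: bytes):
    """Return (code, identifier, length) from a raw RADIUS packet."""
    if len(packet) < 20:
        raise ValueError("a RADIUS packet is at least 20 bytes long")
    # Header layout: 1-byte code, 1-byte identifier, 2-byte length (big-endian),
    # followed by a 16-byte authenticator.
    return struct.unpack("!BBH", packet[:4])


def describe(packet: bytes) -> str:
    """Name the packet type, or report the raw code if it is not listed."""
    code, _identifier, _length = parse_header(packet)
    return RADIUS_CODES.get(code, f"code {code}")


def is_accept(packet: bytes) -> bool:
    """True only when the server replied with Access-Accept."""
    return parse_header(packet)[0] == 2


# Two synthetic replies with an all-zero authenticator, for illustration only.
accept_reply = struct.pack("!BBH", 2, 42, 20) + bytes(16)
reject_reply = struct.pack("!BBH", 3, 42, 20) + bytes(16)
print(describe(accept_reply), is_accept(accept_reply))
print(describe(reject_reply), is_accept(reject_reply))
```

If the capture shows code 2 while the NPS log names the deny policy, the question moves to the NPS side; if it shows code 3, it moves to how pfSense handles the reply.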

jimmychoosshoes

Interestingly I did all my testing with a Windows 10 and Windows 11 laptop until I was happy with my captive portal. I tested at each stage:

set up captive portal with defaults, no authentication.
set up voucher roll and voucher authentication.
add SSL certificate (I already use an ACME letsencrypt with pfsense so I added another URL to the SAN for the captive portal)
set up radius
customise the logon HTML and the "error" HTML

I was happy that this all worked. Only the Edge browser seems to have an oddity with the captive portal (force redirect sorted that, and I was going to force redirect to my "company landing page" anyway); Chrome and Firefox have no issue sending their captive portal check and redirecting back. Now to test with other devices:

* iPad worked fine.
* Android did not. Android was convinced that it was connected: it attempted a www.gstatic.com/generate_204 request which, according to the device, succeeded before authentication! There was no traffic flow though (good). However, I could not get the captive portal page to trigger on the Android device. It was convinced that it needed to "sign in" but then would simply say that it was connected.
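As a rough mental model of what the generate_204 check is doing, the sketch below classifies a probe response the way such connectivity checks commonly work: an HTTP 204 with no content means "online", while a redirect or an unexpected page with content suggests a captive portal. This is a simplified assumption for illustration, not Android's actual implementation, and the function name and return labels are made up.

```python
def classify_probe(status: int, body_length: int = 0) -> str:
    """Classify the response to a generate_204 style connectivity probe."""
    if status == 204:
        # The expected empty reply: the device believes it has internet access.
        return "online"
    if 300 <= status < 400:
        # A redirect usually means something intercepted the request.
        return "portal"
    if status == 200 and body_length > 0:
        # A page was served where an empty 204 was expected.
        return "portal"
    return "unknown"


print(classify_probe(204))        # what the Android device reported seeing
print(classify_probe(302))        # what an intercepting portal typically returns
print(classify_probe(200, 1500))  # a login page served in place of the probe
```

Under this model, a device that receives a 204 before authentication has no reason to show the login page, which matches the behaviour I saw.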

I spent quite a lot of time looking at firewall logs and device logs, trying to fathom why the Android device was convinced it had connectivity and why I was never shown the captive portal page. I checked everything, including DNS (I use the pfSense forwarder and there is only one "exception", which is a simple URL for a "disclaimer landing page" on a local webserver).

In the end I found that if I "disconnected all users" then this would work. After digging it seems that if I make a change to the pfsense portal settings I need to disconnect all users for my android device to see the captive portal. Most odd.

The Android device is version 12. I have no idea what disconnecting will do to the people who have vouchers (RADIUS auth is not a concern, of course, as those users can sign in again).