What is the best way to troubleshoot login issues?
-
pfSense 2.1.3
Captive Portal
Authentication: RADIUS with Active Directory
I have a recurring problem. A few users cannot log in through the captive portal. They enter their details correctly, but the login screen just flickers and asks them to log in again.
Normally a user enters their Active Directory username, the pfSense box sends the details to a Windows RADIUS server, and they are granted access. The odd thing is that the users who cannot get past the CP login screen are actually authenticating according to the RADIUS logs. I can see from the timestamps that the RADIUS server grants them access, but the client is not then sent to Google (the redirect set up in CP). The user also does not show up in the CP status. I can try an array of usernames and passwords and still get no further. Restarting pfSense usually fixes the problem, but I have hundreds of users who get booted off if I do this.
It affects random users, with no fixed pattern that I can see.
What is the best way to diagnose what my problem is?
Summary:
The user hits the captive portal login screen.
They enter their credentials.
The Windows RADIUS server grants them access.
The captive portal just seems to refresh and asks them to log in again.
-
Hi,
Your DHCP server on the portal interface didn't run out of IPs, right?
To debug and see what pfSense is getting back from the RADIUS server, add logging to this file:
/usr/local/captiveportal/index.php
Check out line 98 (or close to it):
captiveportal_logportalauth("unauthenticated","noclientmac",$clientip,"ERROR");
Put the same kind of line wherever you want, to show variables in your captive portal log.
Like this:
captiveportal_logportalauth("Test on line xx", "$variable_i_want_to_see", $another_variable, "ETC.");
Once they are in place, hit your portal interface, test, and see what gets logged in a normal login situation, when you get authorized.
When a client comes by and says "I can't get in", have a look in your logs and you will see why.
Example: pfSense didn't get a valid return value from RADIUS …
Or whatever.
The concept is: log as much as possible, and look for what is strange, abnormal, or illogical.
Btw:
Look again in
/usr/local/captiveportal/
You will find two more "inc" files. These are PHP files, so you can do the same with them.
For some more serious stuff:
/etc/inc/captiveportal.inc - all the internal captive portal functions are there.
-
"Captive portal just seems to refresh and asks them to login again."
Try entering a URL like www.google.com in After authentication Redirection URL …
Btw, which guide did you use for your AD/RADIUS + pfSense setup? Can you point me to it? Thanks.
-
I am going to test this today. My DHCP server is on a Windows box that is also acting as the RADIUS server.
-
Hi Lynx
I have tried browsing to a site after trying to authenticate, but I just keep getting redirected back to the CP login screen.
I cannot remember exactly which guide I used. It was more of a collection. It was very easy to do, though. I already had Active Directory in place. I just installed Network Policy Server on a Windows box and set up a RADIUS client with the IP address of the WAN interface on my pfSense box. Then, in pfSense, I turned on RADIUS authentication and entered the RADIUS secret.
-
How are they handling cookies? Do they block them, or not accept them by default?
It's not in the security settings, but in how the browser handles login parameters…
-
Hi Supermule
I am not sure what you mean. I will turn the security settings down the next time it happens.
I have set up a syslog server and I am now sending the portal auth logs to it. I will have to wait until a user cannot log in before I can monitor their login process.
-
Now I have the same problem.
I followed a tutorial, set up a network policy in my AD, and made it a RADIUS server.
When I try to log in to the portal, it does not redirect to Google, which I set as the redirect URL.
It keeps asking me to log in... :(
-
Lynx
I would first check the logs on your NPS/RADIUS server to see what's happening. If there is a RADIUS issue, you should get an error on the login screen saying 'no radius response' (or something along those lines).
-
Out of curiosity, what does the following log indicate?
logportalauth[43261]: TIMEOUT: username, mac, ip address
TIMEOUT is what I am curious about. I am seeing a lot of it in my logs.
-
/etc/inc/captiveportal.inc, in the function captiveportal_prune_old(), shows this:
TIMEOUT is logged when a timeout is reached: idle timeout, hard timeout, session timeout, voucher, RADIUS, etc.
The function captiveportal_prune_old() is run every minute by a cron task.
It removes the portal connections that were established and have reached their end.
-
More on TIMEOUT:
It is what happens after a user has logged in.
(So there is a valid session, the portal firewall rules are activated to let the user out, the hard timeout and soft timeout are being counted, etc. The user should be able to surf the net.)
Jun 9 08:27:05 logportalauth[20106]: CONCURRENT LOGIN - REUSING OLD SESSION: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 9 08:27:05 logportalauth[20106]: LOGIN: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 9 08:26:01 logportalauth[55814]: LOGIN: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 9 05:54:21 logportalauth[20106]: LOGIN: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 9 05:44:16 logportalauth[14947]: TIMEOUT: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 9 00:44:15 logportalauth[55814]: LOGIN: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 8 20:47:52 logportalauth[49541]: TIMEOUT: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 19:33:22 logportalauth[55814]: CONCURRENT LOGIN - REUSING OLD SESSION: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 19:33:22 logportalauth[55814]: LOGIN: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 19:32:59 logportalauth[55814]: CONCURRENT LOGIN - REUSING OLD SESSION: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 19:32:59 logportalauth[55814]: LOGIN: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 19:25:16 logportalauth[20106]: LOGIN: 201, 90:xx:7c:8a:3f:8c, 192.168.2.158
Jun 8 13:27:54 logportalauth[60793]: TIMEOUT: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 8 10:59:14 logportalauth[93838]: TIMEOUT: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 8 10:50:11 logportalauth[98826]: TIMEOUT: 206, 3c:xx:f4:44:b8:58, 192.168.2.156
Jun 8 10:47:10 logportalauth[10666]: TIMEOUT: 206, b0:xx:94:99:a1:da, 192.168.2.155
Jun 8 10:35:07 logportalauth[17403]: TIMEOUT: 212, 3c:xx:72:14:6e:6f, 192.168.2.152
Jun 8 10:29:34 logportalauth[20106]: LOGIN: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 8 09:45:53 logportalauth[45180]: TIMEOUT: 212, 18:xx:a2:6a:fc:14, 192.168.2.157
Jun 8 09:45:53 logportalauth[45180]: TIMEOUT: 205, 30:xx:c9:cf:7c:1e, 192.168.2.127
Jun 8 09:06:42 logportalauth[4923]: TIMEOUT: 211, a4:xx:d2:48:20:f3, 192.168.2.151
Jun 8 08:51:38 logportalauth[55487]: TIMEOUT: 102, 00:xx:5b:c3:6c:91, 192.168.2.150
Jun 8 08:50:38 logportalauth[38730]: TIMEOUT: 205, 1c:xx:94:9c:3e:9a, 192.168.2.130
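If it helps, a log like the one above can be sifted with a short script once it has been exported (for example from a syslog server). This is only a sketch in Python, assuming every line keeps the "logportalauth[pid]: EVENT: user, mac, ip" layout shown above. The function names (parse_portal_log, events_by_ip, macs_per_ip) are made up for illustration and are not part of pfSense.

```python
import re
from collections import defaultdict

# Assumed line layout, copied from the log excerpt in this thread:
#   Jun 9 05:44:16 logportalauth[14947]: TIMEOUT: 211, a4:xx:d2:48:20:f3, 192.168.2.151
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+logportalauth\[\d+\]:\s+"
    r"(?P<event>[A-Z][A-Z \-]*?):\s+"
    r"(?P<user>[^,]+),\s*(?P<mac>[^,]+),\s*(?P<ip>\S+)\s*$"
)


def parse_portal_log(text):
    """Return one dict per recognised log line; other lines are skipped."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def events_by_ip(events):
    """Group (stamp, event, user, mac) tuples by client IP, in log order."""
    grouped = defaultdict(list)
    for e in events:
        grouped[e["ip"]].append((e["stamp"], e["event"], e["user"], e["mac"]))
    return dict(grouped)


def macs_per_ip(events):
    """Return only the IPs that were seen with more than one MAC address."""
    seen = defaultdict(set)
    for e in events:
        seen[e["ip"]].add(e["mac"])
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


# Three lines taken verbatim from the excerpt above.
SAMPLE = """\
Jun 9 08:27:05 logportalauth[20106]: CONCURRENT LOGIN - REUSING OLD SESSION: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 9 08:27:05 logportalauth[20106]: LOGIN: 104, 24:xx:64:21:2e:a1, 192.168.2.149
Jun 9 05:44:16 logportalauth[14947]: TIMEOUT: 211, a4:xx:d2:48:20:f3, 192.168.2.151
"""

if __name__ == "__main__":
    parsed = parse_portal_log(SAMPLE)
    for ip, history in events_by_ip(parsed).items():
        print(ip, history)
    print("IPs seen with more than one MAC:", macs_per_ip(parsed))
```

An IP that shows up with two different MACs, or a LOGIN that never gets a matching TIMEOUT, is the kind of "strange, not normal" entry worth looking at first.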
Every minute
(run
ps ax | grep 'prune'
to see the task)
the hard timeout and idle timeout are checked for every logged-in user.
You are using RADIUS, so other checks might happen.
When the time is up, the firewall rules are removed, the user session is destroyed, the user is cut off from the net, and a TIMEOUT message is logged.
-
Update
I had the same issue yesterday. A user was trying to log in through the captive portal. After they entered their credentials, the screen just refreshed and asked them to log in again, although RADIUS was authenticating them. I watched the portal auth log and nothing new appeared, and the user was not shown on the Captive Portal status page.
Next I looked at the IP address the user had picked up. Let's say it was 192.168.1.40. I searched the CP status page and saw another user logged in with that IP address. As soon as I deleted the session for 192.168.1.40, this user could log in without any issues.
I need to look into this in more detail the next time it happens.
I have the idle and hard timeouts set to 6 hours. My DHCP lease is also set to 6 hours.
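The manual check described above (look up the failing client's IP on the CP status page and see whether someone else already holds a session on it) could be sketched like this. It is a rough Python illustration only: the PortalSession record, the conflicting_sessions helper, and the MAC addresses and usernames in the example are hypothetical, and the session list is assumed to have been copied by hand from the status page. It does not claim to show how pfSense tracks sessions internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalSession:
    """One row as read off the Captive Portal status page (hypothetical shape)."""
    ip: str
    mac: str
    username: str


def conflicting_sessions(sessions, client_ip, client_mac):
    """Return existing sessions that hold client_ip under a different MAC."""
    client_mac = client_mac.lower()
    return [
        s for s in sessions
        if s.ip == client_ip and s.mac.lower() != client_mac
    ]


if __name__ == "__main__":
    # Made-up example data; only the IP 192.168.1.40 comes from the thread.
    active = [
        PortalSession("192.168.1.40", "00:11:22:33:44:55", "olduser"),
        PortalSession("192.168.1.41", "00:11:22:33:44:66", "otheruser"),
    ]
    for stale in conflicting_sessions(active, "192.168.1.40", "aa:bb:cc:dd:ee:ff"):
        print("Session to inspect before the new login can work:", stale)
```

Any session this returns is a candidate for the "delete that session" step that fixed the login in my case.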
-
"Next I looked at the IP address the user had picked up. Let's say it was 192.168.1.40. I searched the CP status page and saw another user logged in with that IP address. As soon as I deleted the session for 192.168.1.40, this user could log in without any issues."
Analyse this situation.
A DHCP server that hands out an IP that is still actively used by another user: it shouldn't do that, and big trouble will follow.
"I have the idle and hard timeouts set to 6 hours. My DHCP lease is also set to 6 hours."
Idle timeout: 30 minutes
=> Because if the user doesn't do anything on the network, the session should be removed.
Hard timeout: 360 minutes (6 hours)
=> It is always good to have a hard timeout.
DHCP lease: a little more than the hard timeout: 400 minutes or 24000 seconds.
=> Lease time is expressed in seconds on the DHCP server page.
-
Hi Gertan
I will try those settings out.
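For anyone following along, the suggested timers come down to simple arithmetic: the DHCP lease should outlast the hard timeout, and the idle timeout should be shorter than the hard timeout. A small Python sketch, where the helper names and the 40-minute margin are my own assumptions (the margin is just the gap between the 360 and 400 minutes suggested above):

```python
def suggested_lease_seconds(hard_timeout_minutes, margin_minutes=40):
    """Lease a little longer than the hard timeout, expressed in seconds."""
    return (hard_timeout_minutes + margin_minutes) * 60


def timers_consistent(idle_minutes, hard_timeout_minutes, lease_seconds):
    """True if idle <= hard timeout and the lease outlasts the hard timeout."""
    return (
        idle_minutes <= hard_timeout_minutes
        and lease_seconds > hard_timeout_minutes * 60
    )


if __name__ == "__main__":
    # Suggested values: idle 30 min, hard timeout 360 min, lease 400 min.
    lease = suggested_lease_seconds(360)
    print("Suggested lease in seconds:", lease)
    print("Suggested settings consistent:", timers_consistent(30, 360, lease))
    # My previous settings: everything at 6 hours (lease = 6 * 3600 seconds).
    print("Previous settings consistent:", timers_consistent(360, 360, 6 * 3600))
```

With everything set to 6 hours, the lease and the hard timeout expire at the same moment, which is what the suggested settings avoid.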
Just to confirm: my Windows DHCP server was not issuing the same IP address to two different clients. I think it was more that pfSense/captive portal already had a session on that IP address for another user, and when the new user tried to authenticate from the same IP address it caused CP to fail. This may be fixed by the settings you have suggested, so I will implement them now.
cheers
-
"… I think it was more that pfSense/captive portal already had a session on that IP address for another user, and when the new user tried to authenticate from the same IP address it caused CP to fail."
Let me rephrase.
A user (with an IP obtained from your DHCP server) has a device with a MAC address.
He connects to the portal interface and a session is opened with its
IP
MAC
Start time
End time (the end time will be 'Start time' + 'hard timeout')
Session ID
Etc.
This user will NOT be redirected to the portal login web interface anymore.
This user has to LOG OUT manually (using the popup, so that the portal session is destroyed) if he wants to see the portal login web interface again.
His IP stays the same all this time.
If another user connects, he should NOT obtain the same IP (IP conflict! Note that a user CAN hard-code the IP; you had better ignore these users ;)).
This other user of course has another MAC ….
I cannot imagine how it could be possible that two users have the SAME IP ..... your DHCP server will never allow that.
A unique user with his unique MAC will receive a unique IP.
This is how things work in DHCP land :)