Captive Portal Password-only Authentication Loop

tempaccount325

Hello,

I have recently installed a pfSense based captive portal solution at one of my family's hotels. I replaced the previous solution, which used an OpenWRT based OS for the gateway and was giving them some issues (freezing/boot loops).
I will try to be as descriptive as possible, but if I have forgotten something, feel free to ask.

The errors that I am getting:
logportalauth[1657]: CONCURRENT LOGIN - REUSING IP 10.10.0.144 WITH DIFFERENT MAC ADDRESS 40:b0:fa:7b:d2:b7: Guest, 40:b0:fa:7b:d2:b7, 10.10.0.144

Aug 6 13:12:09 logportalauth[8762]: LOGIN: Guest, 40:0e:85:2e:9b:4c, 10.10.0.223
Aug 6 13:12:22 logportalauth[8762]: LOGIN: Guest, 40:0e:85:2e:9b:4c, 10.10.0.223
Aug 6 13:12:22 logportalauth[8762]: CONCURRENT LOGIN - REUSING OLD SESSION: Guest, 40:0e:85:2e:9b:4c, 10.10.0.223

Network Topology

The pfSense box has the IP address 10.10.0.254 and is the gateway of the 10.10.0.0/24 network.
The DHCP server (provided by pfSense) is offering IP Addresses in the 10.10.0.1-10.10.0.250 range.
The Access Points are all on the 10.10.10.0/24 network. Due to the infrastructure, groups of APs (12 total) connect to 3 different switches, which then connect to another switch that houses the pfSense box. The APs are all acting as transparent bridges (ethernet cable to LAN port) with their DHCP servers disabled.
The guests are all connected over Wi-Fi.
There are only two interfaces on the pfSense box so far, WAN and LAN, and there are no VLANs.

User Base

There are approximately 50 new users every day and a permanent network camera connected to one of the APs.

Captive Portal Authentication

Currently, the Captive Portal page is configured to only accept a password field. The user field is hidden and it is the same for every guest. The pfSense box is managing the authentication locally and there is no RADIUS server.
Idle Time Out: 2160 minutes (1 + 1/2 days)
Hard Time Out: 5760 minutes (4 days)

DHCP

(In addition to the Network Topology chapter)
Default Lease Time: 348000 seconds
Maximum Lease Time: 349000 seconds

Transparent Proxy

I have Squid running as a transparent proxy

System Uptime

(at the time of the first erratic entry in the screenshot)
1 Day 18 Hours 30 Minutes

This is the proposed solution to the issue that I am getting, but my system has not even been up long enough for a DHCP lease to expire:
https://redmine.pfsense.org/issues/3352 and https://forum.pfsense.org/index.php?topic=68325.0

Logs

In the first attached image you can see two different examples of one of the errors I am getting.
The 10.10.0.239 address involves the 88:… and d0:... MACs.
The 10.10.0.144 address involves the c8:... and 40:... MACs.
Neither c8 nor 88 managed to log in to the Captive Portal, and there are no entries for them in the Captive Portal table.
At 20:52 on 7 August there was a DHCP OFFER to c8 of the IP address 10.10.0.144.
At 21:05 on 7 August there was a DHCP OFFER to 88 of the IP address 10.10.0.239.
There are no DHCP log entries for d0 and 40 between 20:13 and 22:00 on 7 August.
The Captive Portal has the following information about these MAC addresses:
40
IP Address: 10.10.0.144
Session Start: 09:37 on 6 August
Last Activity: 18:57 on 6 August

IP Address: 10.10.0.48
Session Start: 19:54 on 6 August
Last Activity: 08:38 on 7 August

d0
IP Address: 10.10.0.239
Session Start: 16:45 on 6 August
Last Activity: 18:01 on 7 August

IP Address: 10.10.0.242
Session Start: 19:54 on 7 August
Last Activity: 20:52 on 7 August

Two things seem very strange to me here. First, the Captive Portal client abandons its current IP address and gets another one before the Idle, Hard or Lease timeouts have elapsed. Second, from the Captive Portal's perspective the IP address is still "borrowed", even though the MACs have moved on to some other address.

Lastly, is the "CONCURRENT LOGIN - REUSING OLD SESSION" error something not to worry about? I can't tell whether it is a short-term authentication loop (never longer than 2 login attempts in the Captive Portal auth logs) or whether it could represent a situation where the user is no longer allowed to log in.

Note:
I have individually logged in from every access point and I encountered no issues while authenticating.
The "CONCURRENT LOGIN - REUSING IP <ip_address> WITH DIFFERENT MAC ADDRESS <mac_address>" error only showed up after the uptime mentioned above, and there were 3 separate occurrences in the space of 2 hours.
The "CONCURRENT LOGIN - REUSING OLD SESSION" error seems to be a lot more frequent and random: about 15 entries across the 174 logged-in users. It first appeared about 2 hours in and has not stopped.

Thank you for your time.

[Attachments: Screenshot.png, Screenshot-1.png]

Derelict

I think you're seeing DHCP leases expire and IPs getting assigned to new users before the captive portal session times out. When they browse, the IP/MAC pair doesn't match so they get forwarded to the portal. When they log in, you see the log entry when the CP makes a rule for the new IP/MAC pair and deletes the old.

I believe the way you have it configured is harmless. Are users saying they can't get on?

I think the DHCP lease/portal timeout problem manifests itself when you have noconcurrentlogins set.

A 4-day timeout with ~50 new per day on a 250-IP pool is pretty aggressive. I'm not surprised you're seeing some DHCP re-leasing. Have you considered just setting an idle timeout of like a day and no hard timeout? That should keep people who are still on the property active, but kill the sessions a day after the guest leaves. I mean if they're still active and are just going to sign right back in, why inconvenience them with a hard timeout? You could shorten the DHCP lease to something more reasonable like a day or 12 hours.
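As a rough sanity check on those numbers, the worst case can be sketched in Python. This assumes (a simplification, not something measured in this thread) that every new device holds one address for the full timeout, and it ignores passers-by who take a lease without logging in. The function name is made up for illustration.

```python
MINUTES_PER_DAY = 1440

def worst_case_held_addresses(new_devices_per_day, timeout_minutes):
    """Addresses tied up if every new device keeps one for the whole timeout."""
    return new_devices_per_day * timeout_minutes / MINUTES_PER_DAY

POOL_SIZE = 250    # 10.10.0.1-10.10.0.250, from the first post
NEW_PER_DAY = 50   # approximate figure from the first post

# Timeouts in minutes: the two configured values plus the suggested one-day idle timeout.
for label, minutes in [("hard timeout", 5760),
                       ("idle timeout", 2160),
                       ("one-day idle timeout", 1440)]:
    held = worst_case_held_addresses(NEW_PER_DAY, minutes)
    print(f"{label}: up to {held:.0f} of {POOL_SIZE} addresses held")
```

With the 4-day hard timeout the estimate already reaches 200 of the 250 addresses before any stray devices are counted, which is why the pool feels tight.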

I'm pretty sure the concurrent login reusing old session happens when the user manages to go back to the captive portal URL (http://lan_address:8000/ probably) and logs in again even though they have an active session. I'm not quite sure how they manage to do that. It's harmless.

tempaccount325

I think you're seeing DHCP leases expire and IPs getting assigned to new users before the captive portal session times out. When they browse, the IP/MAC pair doesn't match so they get forwarded to the portal. When they log in, you see the log entry when the CP makes a rule for the new IP/MAC pair and deletes the old.

This is the strangest thing, because no DHCP leases should have expired: given the uptime (1 day 18 hours), I had not hit any of the limits yet (4 days).

I think the DHCP lease/portal timeout problem manifests itself when you have noconcurrentlogins set.

I do have concurrent logins set; the user that logs in is always the same.

A 4-day timeout with ~50 new per day on a 250-IP pool is pretty aggressive.

Yes, I have just confirmed this. I am not sure what is happening, but the 250 leases expired when the concurrent users had peaked at 178, and there had not been enough time for many Captive Portal sessions to expire with the Idle Timeout (since I turned the system on at about 01:00 AM local time). I will probably take up your suggestion, but I was asked to do it like this, so I will look into expanding the available subnets (so far I have only managed to make the system unresponsive).

Thank you for the quick answer.

Derelict

You can control the maximum lease time granted, but you cannot control a device asking for a shorter lease. Take a look at your dhcp leases file. I'll bet you see a lot of one hour leases.
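One way to look for those short leases is to parse the leases file and compute each lease's duration. The sketch below assumes the file is in the usual ISC dhcpd `dhcpd.leases` format with `starts` and `ends` lines; the sample text, its addresses and its MACs are made up for illustration, and the file location on the firewall is not given in this thread.

```python
import re
from datetime import datetime

# Matches one "lease <ip> { ... }" block; lease bodies contain no nested braces.
LEASE_RE = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.S)
# Matches "starts <weekday> YYYY/MM/DD HH:MM:SS;" and the same for "ends".
TIME_RE = re.compile(r"\b(starts|ends)\s+\d\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});")

def lease_durations(text):
    """Return (ip, seconds) for every lease block that has both timestamps."""
    results = []
    for ip, body in LEASE_RE.findall(text):
        times = {key: datetime.strptime(value, "%Y/%m/%d %H:%M:%S")
                 for key, value in TIME_RE.findall(body)}
        if "starts" in times and "ends" in times:
            results.append((ip, (times["ends"] - times["starts"]).total_seconds()))
    return results

# Made-up sample: one one-hour lease and one lease of 348000 seconds.
SAMPLE = """
lease 10.10.0.144 {
  starts 4 2014/08/07 20:52:00;
  ends 4 2014/08/07 21:52:00;
  hardware ethernet 00:00:5e:00:53:01;
}
lease 10.10.0.48 {
  starts 3 2014/08/06 09:37:00;
  ends 0 2014/08/10 10:17:00;
  hardware ethernet 00:00:5e:00:53:02;
}
"""

short_leases = [(ip, s) for ip, s in lease_durations(SAMPLE) if s <= 3600]
print(short_leases)
```

The same address can appear several times in a real leases file, since entries are appended over time, so each block is reported separately rather than merged.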

The checkbox is to "Disable concurrent logins" You have that unchecked right?

I also don't understand why you are using a username/password hardcoded in hidden fields. Just use "No authentication."

tempaccount325

You can control the maximum lease time granted, but you cannot control a device asking for a shorter lease. Take a look at your dhcp leases file. I'll bet you see a lot of one hour leases.

I didn't know there was an alternative to the default and maximum dhcp leases. I can't see it anymore, since there were a few electrical issues.

The checkbox is to "Disable concurrent logins" You have that unchecked right?
I also don't understand why you are using a username/password hardcoded in hidden fields. Just use "No authentication."

It is checked.
It allows us to have some control over who authenticates without complicating the login process, since we are in a very densely populated area (only the username is hardcoded).

Derelict

Ok. I get it now. The username is hardcoded and they have to enter the password.

I thought that with the "Disable concurrent logins" box checked every time someone logged on with that username it would kick the last guy off.

Derelict

I am completely wrong regarding the DHCP Lease timeout and the CP timeout.

logportalauth[1657]: CONCURRENT LOGIN - REUSING IP 10.10.0.144 WITH DIFFERENT MAC ADDRESS 40:b0:fa:7b:d2:b7: Guest, 40:b0:fa:7b:d2:b7, 10.10.0.144

That error is telling you that an existing session exists for 10.10.0.144/40:b0:fa:7b:d2:b7. That session remains active and the new logon attempting to also use 10.10.0.144 will just receive the logon page again.

I believe there should be a LOGIN attempt logged right before that detailing the same IP with the different MAC address. Yes. In your first screenshot the user with MAC address c8:bc:c8:34:26:96 was receiving the login page over and over again.
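The behaviour described here can be illustrated with a toy model of a session table keyed by IP address. This is only a sketch of the logic as explained in this post, not the actual pfSense code; the function and constant names are invented, and the messages are shortened versions of the log lines quoted above.

```python
REUSE_SESSION = "CONCURRENT LOGIN - REUSING OLD SESSION"
REUSE_IP = "CONCURRENT LOGIN - REUSING IP WITH DIFFERENT MAC ADDRESS"

def classify_login(sessions, ip, mac):
    """Classify a login attempt against a dict mapping IP -> MAC of the active session."""
    existing_mac = sessions.get(ip)
    if existing_mac is None:
        sessions[ip] = mac          # no session for this IP yet: create one
        return "LOGIN"
    if existing_mac == mac:
        return REUSE_SESSION        # same IP/MAC pair logs in again: harmless
    # Same IP, different MAC: the old session stays and the newcomer is not admitted.
    return REUSE_IP

sessions = {}
print(classify_login(sessions, "10.10.0.144", "40:b0:fa:7b:d2:b7"))
print(classify_login(sessions, "10.10.0.144", "40:b0:fa:7b:d2:b7"))
print(classify_login(sessions, "10.10.0.144", "c8:bc:c8:34:26:96"))
print(sessions)
```

In this model the third attempt leaves the table unchanged, which corresponds to the c8 device being shown the login page repeatedly while the 40 session still holds the address.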

tempaccount325

I believe there should be a LOGIN attempt logged right before that detailing the same IP with the different MAC address. Yes. In your first screenshot the user with MAC address c8:bc:c8:34:26:96 was receiving the login page over and over again.

Yes, I have come to the same conclusion. I have been counting the DHCP leases and they exceed the number of newly logged-in users (I measured this after a system reboot, so there was not enough time for leases to expire and consequently be reused). There were about 50 new users and 70 DHCP leases used up within a day. My guess is that these are devices foreign to the hotel that just connect to any open network.

I did try to solve this lack of DHCP leases by extending the network to 10.10.0.0/23, but I ran into issues trying to communicate and authenticate in the 10.10.0.0/24 range. Therefore, I have given up that solution and will just try to work it out with lower lease and timeout times.

-- UPDATE --
The current counts of DHCP leases and Captive Portal logins are:

204 active Leases (out of 250);
158 Logged in users;
Problem: I am getting the "logportalauth[1657]: CONCURRENT LOGIN - REUSING IP 10.10.0.144 WITH DIFFERENT MAC ADDRESS 40:b0:fa:7b:d2:b7: Guest, 40:b0:fa:7b:d2:b7, 10.10.0.144" error again (this one is just a template and there was only one occurrence so far).
What I have gathered is:
The MAC 58:…, which had IP 10.10.0.132 and wanted to log in, was not able to (the IP lease was assigned one hour before the login attempt at 2 AM, and this is recorded in the DHCP Lease table). I checked the Captive Portal entry for that IP address and it was assigned to MAC b4:... at 8 AM on the previous day. What is strange this time is that the MAC that is logged in on the Captive Portal, b4:... (with the IP 10.10.0.132), has no entry in the DHCP table.
The DHCP default and maximum lease times are 90000 (25 hours) and 90100 seconds, respectively.
The Captive Portal timeout is 1440 minutes (24 hours).
At the time of the event the system had an uptime of 20 hours.
The Captive Portal table was empty at the time of boot.
The DHCP table had no more than 10 entries at the time of boot.

I find this all very puzzling. I will wait to confirm that this was due to a partial use of the DHCP table at the time of boot. Otherwise, I will have to again try to make it into a /23 sized network.

One other thing... there seem to be 5 leases at the bottom of the DHCP table that have expired. None of them has more than 40 minutes between Start and End, and 2 of them have a difference of only 3 minutes. Is this to be expected?

This is the actual OS version: "2.1.4-RELEASE (amd64)
built on Fri Jun 20 12:59:50 EDT 2014
FreeBSD 8.3-RELEASE-p16"

Thank you.

Derelict

Don't forget that everyone walking by MIGHT get a DHCP lease or two or three even if they don't go through the portal.

Also remember that the DHCP server can enforce a MAXIMUM lease time. Client devices can request and receive a shorter lease.

I have come to the conclusion that it really isn't the lease timeout that matters. It's a matter of having a large enough DHCP pool that the same address is not given out until after the portal timeout occurs.

I think you have two choices: Increase the pool size or decrease the portal timeout. And best practices say the DHCP default lease and max lease time should be longer than the portal timeout.
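The lease-versus-timeout rule of thumb is easy to check mechanically. The sketch below is just that comparison with the values posted in this thread, using the longer (hard) timeout for the original configuration; the function name is hypothetical and the third case is an invented counter-example.

```python
def lease_outlasts_portal(default_lease_s, max_lease_s, portal_timeout_min):
    """True if both DHCP lease times are longer than the portal timeout."""
    portal_timeout_s = portal_timeout_min * 60
    return default_lease_s > portal_timeout_s and max_lease_s > portal_timeout_s

# Original settings: 348000 s / 349000 s leases, 5760-minute hard timeout.
print(lease_outlasts_portal(348000, 349000, 5760))
# Settings from the update: 90000 s / 90100 s leases, 1440-minute timeout.
print(lease_outlasts_portal(90000, 90100, 1440))
# Invented counter-example: one-hour default lease against a one-day timeout.
print(lease_outlasts_portal(3600, 7200, 1440))
```

Both posted configurations pass this check, which is consistent with the point above that the lease time alone is not the deciding factor and the pool size matters too.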

Also, you're probably going to need external logging of DHCP and CP to see what's really going on. My

tempaccount325

The issue persists. I have replaced the older hardware, so all the Access Points are now the same model.
I have also successfully expanded the network into 10.10.0.0/23 and the DHCP server is issuing addresses between 1 and 250 on both intervals.

I have noticed a pattern in the first "CONCURRENT LOGIN - REUSING IP <ip_address> WITH DIFFERENT MAC ADDRESS <mac_address>". This error has appeared on 3 out of the 6 days the system has been up. The error always occurred after an uptime of 18 to 19 hours.

The values for DHCP and Captive Portal timeouts have stayed the same as the above posts. This time there were about 130 registered users through the Captive Portal and 150 DHCP leases at the time of the error.

Since this is my first experience with pfSense I am wondering if I should file a bug report or not.

Derelict

It's not a bug. There's a CP entry for the same MAC with a different IP. That is, by design, an error.

You need a good DHCP log that doesn't wrap around as quickly as the one on the firewall. You'll see the IP being reassigned.

Sounds like you've made two DHCP pools. I wouldn't. I'd make one from, say, 10.10.0.17 - 10.10.1.254
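To see what that single range gives, Python's `ipaddress` module can count the addresses and confirm they sit inside 10.10.0.0/23. The helper names are made up for this sketch. It also checks whether the gateway address from the first post (10.10.0.254) falls inside the range; how the DHCP server treats that is not covered in this thread, so it is only flagged as something to verify.

```python
import ipaddress

def pool_size(first, last):
    """Number of addresses from first to last, inclusive."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1

def in_range(addr, first, last):
    """True if addr lies between first and last, inclusive."""
    a = ipaddress.ip_address(addr)
    return ipaddress.ip_address(first) <= a <= ipaddress.ip_address(last)

NETWORK = ipaddress.ip_network("10.10.0.0/23")
FIRST, LAST = "10.10.0.17", "10.10.1.254"   # range suggested above
GATEWAY = "10.10.0.254"                      # pfSense LAN address from the first post

both_ends_in_network = (ipaddress.ip_address(FIRST) in NETWORK
                        and ipaddress.ip_address(LAST) in NETWORK)
print(both_ends_in_network)
print(pool_size(FIRST, LAST))
print(in_range(GATEWAY, FIRST, LAST))
```

For comparison, two separate pools of .1 to .250 would total 500 addresses, so the single range is about the same size while being simpler to reason about.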

tempaccount325

It's not a bug. There's a CP entry for the same MAC with a different IP. That is, by design, an error.

I see, but that is not immediately troublesome. The strangest part seems to be the reassignment of the leased IP to another device. Or does the lease expire? Is it normal for a lease to expire before its default timeout? I have not understood this yet. I guess the solution would then imply greatly reducing the Captive Portal's timeout, which I was requested to keep reasonably high.

Alternatively, would the issue be resolved by enabling the "Disable MAC filtering" option in the Captive Portal's configuration page?

Derelict

Personally, I think your timeouts are (or at least were) far too long for your DHCP pool size. Take, for example, your hard timeout. Do you really care if someone has to re-navigate the portal after they have been idle for a day and a half? And idle means idle - zero packets to the internet. Basically they'd have to be off property or have the device turned off. I would think for a hotel, something like an idle timeout of 12-18 hours would be reasonable. Remember, worst case they get the portal and enter the password again. A far sight better than not being able to get on at all.

And if they are staying for a week and not triggering the idle timeout, what's the point of kicking them off after a set time?

The trick is making the portal session timeout before the DHCP lease address is reissued to a new device.

Eliminating MAC filtering will, I believe, fix this.

Makes it a lot easier for someone to hijack a dormant session. All they have to do is assume the IP addresses in sequence and try to get out.

Also, given the same circumstances producing the error, the user in question will, instead, be through without the portal.

tempaccount325

Eliminating MAC filtering will, I believe, fix this.

Makes it a lot easier for someone to hijack a dormant session. All they have to do is assume the IP addresses in sequence and try to get out.

Thank you for reminding me of this.

I will have a go with less conservative timeout values during this week.

tempaccount325

I have done some testing and I had complete success with the following timeout values:
13 hours default DHCP lease (and 13 hours + 100 seconds maximum)
12 hours Captive Portal Timeout (idle)

17 hours default DHCP lease (and 17 hours + 100 seconds maximum)
16 hours Captive Portal Timeout (idle)

The first combination was running for 3 days and the second one for 2 days. I will keep pushing the values up without rebooting until I get to where I was asked to and see if it works.

Thank you for all your help.

Derelict

Good to hear.

Those pushing for a higher timeout know they're talking about absolutely zero internet traffic for 16 hours right? It means the device is either powered off or is off the property. All it takes is one internet packet to reset the 16-hour timer.

That setting should allow the VAST majority of multi-day guests to only have to navigate the portal once during their stay. And, worst case, they have to navigate it again.

tempaccount325

Those pushing for a higher timeout know they're talking about absolutely zero internet traffic for 16 hours right? It means the device is either powered off or is off the property. All it takes is one internet packet to reset the 16-hour timer.

Oh, I see how I was not clear enough. I meant the management.

That setting should allow the VAST majority of multi-day guests to only have to navigate the portal once during their stay. And, worst case, they have to navigate it again.

Yes, this was what I was aiming for. I see far fewer logins during the morning period.

Everyone is satisfied.