Captive Portal Redirect Loop, cleared by disable/enable of CP service

communig8

Hi all

We are running a couple of pfSense 2.0.1 Servers as Gateways from a Public Wireless Network to the Internet.
We are using CP to authenticate users and set their bandwidth profile.
We use our own hotspot login page rather than the pfSense internal one.
So CP will redirect to our hotspot page to get login credentials which then in turn redirects back to CP to authenticate.
This all works fine the vast majority of the time.
However, every now and again, it all goes wrong. I have no idea as yet what triggers the problem.

We see (using Packet Capture) the following:

HTTP GET from client browser.
redirect from CP to our hotspot page (which is out on the Internet and is included as an "Allowed Hostname" in the CP config).
CP then intercepts this and redirects again to the hotspot page.
This goes on until the browser says "Too many redirects".
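
To spot the loop without a packet capture, the sequence above can be reproduced by following the redirects by hand. This is only a minimal Python sketch: `find_redirect_loop`, `http_fetch` and the `hotspot.example` URL are made-up names for illustration, and `http_fetch` handles plain HTTP only.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def http_fetch(url):
    """Fetch one URL over plain HTTP without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPConnection(parts.netloc, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def find_redirect_loop(start_url, fetch=http_fetch, max_hops=20):
    """Follow redirects one hop at a time and report the first repeated URL.

    Returns (chain, looping_url). looping_url is None when the chain ends
    in a non-redirect response, which is the healthy case.
    """
    chain = [start_url]
    seen = {start_url}
    url = start_url
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return chain, None          # reached a real page
        url = urljoin(url, location)    # Location may be relative
        chain.append(url)
        if url in seen:
            return chain, url           # same URL handed out twice: a loop
        seen.add(url)
    return chain, url                   # hop limit reached, treat as a loop
```

Run from a client behind the portal, a healthy CP should give a chain that ends at the hotspot page with `looping_url` set to `None`. In the broken state the hotspot URL should come back as the looping URL.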

All user attempts then get the above treatment indefinitely.
We have to disable and then re-enable the CP service to restore normal operation.

Has anyone seen this, or can anyone offer any suggestions?
(Don't forget this is working fine the vast majority of the time).

Thanks, Richard

communig8

Nothing still ??

communig8

I suspect that the difference between my configuration and most folks' is that I'm redirecting from the uploaded Portal Page content to an external web server.
I'm doing this so I can use the same Portal page for multiple pfSense boxes.
The site I redirect to is in the Allowed Hostnames list and, as I posted before, works fine most of the time.
Something happens to start the redirect loop, which I can only recover from by disabling and re-enabling the CP service.
Any thoughts anyone?

communig8

No ideas anyone?

communig8

Here's a thought, please comment.

Because I use an external web server to handle the login page, its DNS name is in the "Allowed Hosts" list.
When the CP service is started it creates rules that contain DNS names.
There is a process that does DNS lookups for these names to put the IP address in the rule.
What if the DNS lookup fails for some reason? Will this invalidate the rule?
This would explain why it stays broken until CP is restarted and re-writes the rules.
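
One way to test this idea is to check, from a machine using the same DNS servers, whether every allowed hostname resolves at the moment the problem appears. Below is a minimal Python sketch of such a check. It is not how pfSense does its lookups; `resolve_allowed_hosts` and the `resolver` parameter are made up for illustration.

```python
import socket

def resolve_allowed_hosts(hostnames, resolver=socket.getaddrinfo):
    """Resolve each hostname and separate successes from failures.

    Returns (resolved, failed):
      resolved maps hostname -> sorted list of unique IP addresses
      failed   maps hostname -> text describing why the lookup failed
    """
    resolved, failed = {}, {}
    for name in hostnames:
        try:
            infos = resolver(name, None)
        except OSError as exc:          # socket.gaierror is an OSError
            failed[name] = str(exc)
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        ips = sorted({info[4][0] for info in infos})
        if ips:
            resolved[name] = ips
        else:
            failed[name] = "no addresses returned"
    return resolved, failed
```

Calling this periodically with the names from the allowed list and logging anything in `failed` would show whether lookup failures line up with the start of the redirect loop.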

Anyone?? Anyone??

dhatz

I seem to remember that a lot of CP fixes were put into 2.0.3, so my first suggestion would be to try a 2.0.1 -> 2.0.3 upgrade.

Or, even better, upgrade to 2.1-RC1

communig8

Thanks for that. Do you know where I can find the fix list for 2.0.3 so I can check?
I can't risk an RC as it's a live system on a Public Wireless network, and I can't
just upgrade as it's a 500-mile round trip to the site! I can't risk a remote upgrade just in case it goes wrong.
So I really need to be certain beforehand.
Thanks, Richard

communig8

Started looking at the impact of changing CP "Allowed Hosts" on ipfw tables.
ipfw tables contain the unique IPs of the CP "Allowed Hosts".
So there must be a process that does DNS lookups and populates these tables.
It seems that the tables are emptied before they are updated, which could leave
entries missing if something goes wrong with the DNS lookup.
I can see the reason for emptying a table: it may contain out-of-date entries
and you can't tell which ones, so you delete them all and re-populate. But this is open to
problems.
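
The difference between "empty, then fill" and "resolve first, then swap" can be shown with a small Python sketch. This is not pfSense's code. The table is modelled as a plain set of IPs, and `flush_then_fill`, `refresh_table` and `resolve` are hypothetical names. `resolve(name)` is assumed to return a list of IPs or raise `OSError` when the lookup fails.

```python
def flush_then_fill(table, hostnames, resolve):
    """The suspected pattern: empty the table, then re-populate it.

    If a lookup fails, that host is simply missing afterwards.
    """
    table.clear()
    for name in hostnames:
        try:
            table.update(resolve(name))
        except OSError:
            pass  # failed lookup: nothing is added for this host

def refresh_table(current_ips, hostnames, resolve):
    """A more defensive pattern: resolve everything before touching the table.

    Returns (ips, updated). If any lookup fails or returns nothing, the
    previous contents are returned unchanged and updated is False.
    """
    new_ips = set()
    for name in hostnames:
        try:
            ips = resolve(name)
        except OSError:
            return set(current_ips), False
        if not ips:
            return set(current_ips), False
        new_ips.update(ips)
    return new_ips, True
```

With a failing resolver, `flush_then_fill` leaves the table empty, so the hotspot page would no longer be allowed through. `refresh_table` keeps the old entries until a later refresh succeeds. The trade-off is that stale addresses can stay in place a little longer.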

communig8

An Update….

I've now been running using allowed IP rather than allowed host for some time without any problems at all.
So there is an issue with the DNS lookup process that updates the ipfw tables.