Captive portal problem under high load
Under high load, devices stop connecting to the internet and come back with various errors. Restarting the captive portal fixes it, but the problem comes back within a day.
I'm not sure what is causing this. DHCP and DNS seem fine. Looking through the logs, the only thing that stands out is under System -> General logs, where I see a bunch of messages similar to the ones below while the error is happening.
nginx: 2017/12/06 07:46:20 [error] 89653#100253: *35901 limiting connections by zone "addr", client: 172.16.182.78, server: , request: "GET /index.php?zone=chromebook&redirurl=http%3A%2F%2Fcdn1.securly.com%2Fiwf-encode.txt HTTP/1.1", host: "172.16.182.1:8002"
nginx: 2017/12/06 07:45:01 [error] 90181#100231: *33152 connect() to unix:/var/run/php-fpm.socket failed (61: Connection refused) while connecting to upstream, client: 172.16.172.239, server: , request: "GET /generate_204 HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: "www.gstatic.com"
Any suggestions on how to fix or troubleshoot this?
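To see whether the two errors come from a handful of clients or from everyone at once, the log lines can be tallied. The following is only a minimal sketch: it assumes the log has been exported as plain text in the format quoted above, and the function name `tally` and the category labels are made up for illustration.

```python
import re
from collections import Counter

# The two nginx error types quoted in the post (labels are illustrative).
PATTERNS = {
    "conn_limit": re.compile(r'limiting connections by zone "[^"]+"'),
    "php_fpm_refused": re.compile(
        r'connect\(\) to unix:\S+ failed \(61: Connection refused\)'
    ),
}
CLIENT_RE = re.compile(r'client: (?P<ip>\d+\.\d+\.\d+\.\d+)')
# Capture the timestamp down to the minute, dropping the seconds.
MINUTE_RE = re.compile(r'nginx: (?P<minute>\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}')

SAMPLE = [
    'nginx: 2017/12/06 07:46:20 [error] 89653#100253: *35901 limiting connections by zone "addr", client: 172.16.182.78, server: , request: "GET /index.php?zone=chromebook&redirurl=http%3A%2F%2Fcdn1.securly.com%2Fiwf-encode.txt HTTP/1.1", host: "172.16.182.1:8002"',
    'nginx: 2017/12/06 07:45:01 [error] 90181#100231: *33152 connect() to unix:/var/run/php-fpm.socket failed (61: Connection refused) while connecting to upstream, client: 172.16.172.239, server: , request: "GET /generate_204 HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: "www.gstatic.com"',
]

def tally(lines):
    """Count matching errors by kind, by client IP, and by (minute, kind)."""
    by_kind, by_client, by_minute = Counter(), Counter(), Counter()
    for line in lines:
        kind = next((k for k, p in PATTERNS.items() if p.search(line)), None)
        if kind is None:
            continue  # not one of the two errors of interest
        by_kind[kind] += 1
        client = CLIENT_RE.search(line)
        if client:
            by_client[client.group("ip")] += 1
        minute = MINUTE_RE.search(line)
        if minute:
            by_minute[(minute.group("minute"), kind)] += 1
    return by_kind, by_client, by_minute

if __name__ == "__main__":
    kinds, clients, minutes = tally(SAMPLE)
    print(kinds.most_common())
    print(clients.most_common(10))
    print(sorted(minutes.items()))
```

If the per-minute counts spike at the same time every morning, that would point at a login rush rather than at a few misbehaving devices.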
Gertjan last edited by
Can you give some numbers?
I would say problems start occurring after 800 devices. Our APs spread the devices evenly across 8 VLANs that are /24 each.
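As a rough sanity check on those numbers, the address space per VLAN can be worked out. This is a sketch under assumptions: the subnet 172.16.182.0/24 is inferred from the client and host addresses in the log above, one address per VLAN is assumed to be taken by the gateway, and the real DHCP pool range and lease time are not known here.

```python
import ipaddress

VLANS = 8
DEVICES = 800

# Assumed example subnet, inferred from the log lines (172.16.182.1 as host).
net = ipaddress.ip_network("172.16.182.0/24")

# Exclude network and broadcast addresses, then one address for the gateway.
usable_per_vlan = net.num_addresses - 2 - 1
devices_per_vlan = DEVICES // VLANS
total_capacity = usable_per_vlan * VLANS

print(f"usable addresses per VLAN: {usable_per_vlan}")
print(f"devices per VLAN at 800 total: {devices_per_vlan}")
print(f"total capacity across {VLANS} VLANs: {total_capacity}")
print(f"utilisation: {devices_per_vlan / usable_per_vlan:.0%}")
```

At roughly 100 devices per /24 the address space itself is far from full, which fits the observation that DHCP seems fine, although long leases or a smaller configured pool could change that picture.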
Gertjan last edited by
And let me guess, they all try to log in between 08h30 and 09h00?
I presume that your problem is related to the authentication phase.
Once the client is connected, its IP and MAC are loaded into one of the first tables in ipfw (see https://doc.pfsense.org/index.php/Captive_Portal_Troubleshooting and remember: in 2.4.x there is no more "-x" parameter).
When you look at the ipfw rules and tables, and at /etc/inc/captiveportal.inc (where the rules are created and injected into ipfw), it is easy to add a pass-all rule somewhere in the middle. Put one in and see if the "load" problem still exists. If so, it's not the portal or pfSense but your routing capacity, and it's time to upgrade the hardware.
If the problem is the authentication phase, or more precisely the web server that handles the login pages, the insertion of the rules into the tables, and the housekeeping of a mini database (two of them, in fact: the 'nasty' PHP built-in SQLite, which tends to create a huge file that is read and written often, so you had better have fast media or put it in RAM), then you should look up the several threads in this forum that discuss heavy-load portals. I have read about installations that have several thousand clients at the same moment.
Also: do not set the soft and hard timeouts too low, since that means people have to log in again more often.
Btw: I presume you have some PHP knowledge (it is accessible once one can read it; it's the world's simplest language, only BASIC was more ….) and some general "system" knowledge about things like "ipfw" (all the documentation is on the net already).
You want to tune your system, which is OK of course, so the question is: are you a tuner? If not, have it tuned ;)
See my reply not as a "do this and you will be fine", but more as an "I would take these steps to see where the bottleneck is".
Btw: what are you running the VLANs over? One 1 Gbit interface? A 100 Mbit interface?
Think about ditching the VLANs and using real physical LANs (a 1 Gb link does NOT deliver 1 gigabit per second; it will be far less ...)
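To put a number on the shared-interface concern, here is a back-of-the-envelope sketch. It assumes all 800 devices are active at once on a single trunk and share it evenly; the helper name `fair_share_mbit` and the `efficiency` factor are hypothetical, and real throughput depends on the hardware and traffic mix.

```python
def fair_share_mbit(link_mbit, devices, efficiency=1.0):
    """Even per-device share of one link, in Mbit/s.

    efficiency is a hypothetical factor (0..1) for protocol overhead and
    other losses; 1.0 means the nominal line rate is fully usable.
    """
    if devices <= 0:
        raise ValueError("devices must be positive")
    return link_mbit * efficiency / devices

if __name__ == "__main__":
    for link in (1000, 100):
        share = fair_share_mbit(link, 800)
        print(f"{link} Mbit/s trunk, 800 devices: {share} Mbit/s each")
```

Even at the nominal rate, a single 1 Gbit trunk leaves 1.25 Mbit/s per device when everyone is active, and a 100 Mbit interface leaves 0.125 Mbit/s, so the interface carrying the VLANs is worth checking alongside the portal itself.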
Hi dboe732, did you resolve your problem?