100,000 users (Captive portal only) with pfSense. Is it possible?
-
I plan to deploy pfSense for captive portal only. I estimate about 100,000 total users, with around 30,000 concurrent users.
What is the best way to deploy this configuration? Can pfSense support this requirement? I'm not sure how to calculate and size hardware for this many users. Please advise.
-
There are some that have a few thousand simultaneous captive portal users. Haven't heard of anyone with the requirement of 100,000 on a single system.
You're in uncharted territory there. I can't think of a reason it shouldn't work, but this may be one of those things you don't know until you try. I suspect you won't be able to design the network so that 30,000+ simultaneous users all go through one box, simply because of the volume of traffic that many users would probably generate (several Gbps?) and the limits of general-purpose PC hardware in pushing packets, especially with captive portal, since that pushes the traffic through two packet filters (PF and ipfw). If those users generate nowhere near the traffic I'm imagining, it may be feasible on a single box. All the session data is handled in a flat text file, which scales OK, but this is roughly 6 times the size of any CP deployment I've worked on or am aware of, so that may be an issue. I would definitely go with the fastest new server CPU you can get; don't try to recycle an old machine, as you're going to need the CPU power for the packet filters, CP session management, logins, etc.
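To put a rough number on the "several Gbps?" guess, here is a back-of-envelope sketch. The 30,000 concurrent users figure is from this thread; the per-user average rates are purely assumed values for illustration, not measurements, and `aggregate_gbps` is a hypothetical helper name.

```python
def aggregate_gbps(concurrent_users: int, avg_kbps_per_user: float) -> float:
    """Aggregate throughput in Gbps, given an average rate per user in kbps."""
    # 1 Gbps = 1,000,000 kbps
    return concurrent_users * avg_kbps_per_user / 1_000_000


CONCURRENT_USERS = 30_000  # estimate given in the thread

# Assumed average per-user rates (kbps); replace with measured values if available.
for avg_kbps in (50, 100, 250):
    total = aggregate_gbps(CONCURRENT_USERS, avg_kbps)
    print(f"{avg_kbps} kbps/user -> {total:.1f} Gbps aggregate")
```

Under these assumptions, even a modest 100 kbps average per user adds up to about 3 Gbps through the portal.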
Hopefully you can either simulate that many users or scale up gradually, as that kind of load is going to be a serious stress on any captive portal. Ours should be better than most because once you're authenticated, everything stays in the kernel, whereas other open source CP implementations push all packets through userland, which is a drag on performance, especially at scale.
-
Consider a plan B:
Use several pfSense boxes, each covering a physical surface/segment.
You could focus on the portal user access procedure, e.g. have all boxes use a shared RADIUS server. This will make roaming possible.
If simple authentication is used, a MySQL server could be used (some changes to the core PHP code will be needed).
Btw: what are you covering? Kennedy Airport?
-
We plan to implement public WiFi in a city, which is why the estimate is around 100,000 users.
I think one box can't handle 100K users, so I think we should use several boxes with a load balancer like F5 to balance the traffic. The F5 should recognize each user and send the same user to the same box, and then I could have a centralized RADIUS server.
Is this a good design?
One thing I don't know: although I can find high-spec hardware like dual quad-core Xeon 3.0 GHz, 16 GB RAM, etc., I'm not sure how to calculate the performance or throughput of a server. How many servers should I use? If I use 32 GB of RAM, is it better than 16 GB?
So I can't estimate the size and number of servers needed to handle 100K users. For simulating users, can you recommend software we can use for testing?
-
You can't load balance a captive portal with anything; the way it works makes that impossible. What you should do, and what is typical for a city-wide network like that, is split the network into multiple broadcast domains, usually segregated geographically in the type of network you're describing. That's desirable for numerous reasons, the primary two being performance (30,000 active devices on a single broadcast domain will have serious performance problems, especially on wireless) and security (a single compromised host or malicious user with an ARP poisoning tool or malware can only affect a smaller subset of the network, among other possibilities for abuse). It also eliminates any concerns about scalability of your portal, since a well-designed network for that kind of scenario isn't going to have more than a thousand or a couple thousand hosts on the same broadcast domain, and you'll have one captive portal install per broadcast domain (or maybe per 3 or 4, since you can do CP on multiple interfaces).
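The figures above (1,000 to 2,000 hosts per broadcast domain, one CP install per 1 to 4 domains) translate into a rough count of domains and installs. This is only a sizing sketch under those figures: `portal_sizing` is a hypothetical helper, and the subnet prefix it reports is an added assumption (IPv4, with one address reserved for the gateway plus the network and broadcast addresses).

```python
import math


def portal_sizing(active_devices: int, hosts_per_domain: int, domains_per_box: int):
    """Return (broadcast domains, CP installs, IPv4 prefix length per domain)."""
    domains = math.ceil(active_devices / hosts_per_domain)
    boxes = math.ceil(domains / domains_per_box)
    # Addresses needed per domain: hosts + network + broadcast + gateway (assumed).
    needed = hosts_per_domain + 3
    host_bits = (needed - 1).bit_length()  # smallest n with 2**n >= needed
    return domains, boxes, 32 - host_bits


ACTIVE_DEVICES = 30_000  # concurrent estimate from the thread

for hosts, per_box in ((1_000, 1), (1_000, 3), (2_000, 4)):
    domains, boxes, prefix = portal_sizing(ACTIVE_DEVICES, hosts, per_box)
    print(f"{hosts} hosts/domain, {per_box} domains/box: "
          f"{domains} domains (/{prefix} each), {boxes} CP installs")
```

Depending on which end of those ranges you pick, the result lands anywhere from a handful of installs to a few dozen, which is why the per-domain host limit is the main design decision here.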
RADIUS would definitely be centralized, and set up in a redundant fashion.
I really doubt you'd need more than 4 GB RAM.
A lot of work goes into design, sizing, etc. for a network of this scale, more than you can get thorough assistance with on the forum, I expect, just due to time constraints. We'd be glad to assist in much more detail under our commercial support; see the link in my signature for info.
-
You might want to check the discussion in the captiveportal max users thread.
If it's going to be a public hotspot, one should also think about ways to mitigate possible abuse (either unintentional, due to malware-infected PCs, or intentional, e.g. roaming spammers), and how to deal with possible DHCP DoS attacks, rogue APs used for MITM attacks, DoS attacks against the CP itself, etc.
The underlying tools in pfSense (pf+ipfw) offer some relevant features, but afaik those aren't yet available from the webGUI.
A city-wide public WiFi network for 30,000 active devices is a very big project that will require a great deal of design work. You might want to read the material at http://www.muniwireless.com/category/city-county-wifi-networks/