Captiveportal db not updating?
-
We're using pfSense in our school network to validate students' internet access. This is done using the captive portal to validate users against a RADIUS server. The system is also configured to do periodic checks of whether students have been removed from the allow group, as the teachers would on occasion like the ability to deny some students internet access, in an attempt to divert attention away from Facebook and MSN to the lectures at hand.
We're encountering a bit of a hiccup, though, with users removed from the group still having internet access while not being visible in the captive portal view. Doing a bit of testing, I collated information from the ipfw tables with the data from /var/db/captiveportal.db: roughly 10% of the allowed MAC/IPs in the firewall table are not listed in the database. I'm guessing this also means they are no longer checked in the routine to ensure they're still allowed online. The machine shows up just fine in the DHCP leases table, and I can spot the successful login in the /var/log/portalauth.log file.
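For reference, here's roughly how I did the collation, as a standalone sketch. The field layout is an assumption on my part (one comma-separated session per line in captiveportal.db, with the IP and MAC in fields 2 and 3, and the ipfw entries dumped to a plain text list of "ip mac" pairs), and the function name find_missing_sessions is just mine:

```php
<?php
// Hypothetical collation script: lists ipfw entries missing from captiveportal.db.
// Assumes captiveportal.db lines are CSV with the IP at index 2 and the MAC at
// index 3, and that $ipfwFile holds one "ip mac" pair per line.
function find_missing_sessions($dbFile, $ipfwFile) {
    $known = array();
    foreach (file($dbFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $f = explode(",", $line);
        $known[$f[2] . "-" . $f[3]] = true; // key on ip-mac, like the portal code does
    }
    $missing = array();
    foreach (file($ipfwFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        list($ip, $mac) = explode(" ", $line);
        if (!isset($known[$ip . "-" . $mac]))
            $missing[] = $ip . " " . $mac; // allowed in the firewall, absent from the db
    }
    return $missing;
}
```

Running that over a snapshot of both sources is what gave me the roughly-10% figure.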
My guess is that something goes wrong when trying to write to the file, but I can't see where the error would occur. Can anyone help me troubleshoot this issue?
Just updated to RC1 yesterday in hopes of fixing the problem, as the stable version seemed unable to handle the number of concurrent logins we have.
-
How are you disconnecting people?
-
I wasn't the one who initially set up the system; I've just been handed the task of figuring out why it isn't working as expected, so please excuse me if my rudimentary understanding of the inner workings of pfSense seems stupid.
Our users log on with their domain user/pass, which seems to be validated through RADIUS authentication as set up under the captive portal service configuration. The checkbox for "Reauthenticate connected users every minute" is checked, which, unless I'm mistaken, should query the RADIUS server as to whether a connected user is still allowed internet access.
The RADIUS server itself will only say you're allowed if you're a member of the AD group allowed wireless internet access. For the users who can be seen on the status page of the captive portal, removing them from the allowed group in Active Directory will result in them getting disconnected and met by the captive portal page.
I'm guessing the reauthentication iterates through all the users in the captiveportal.db file and disconnects those who are no longer allowed. But as some users are not recorded in the database, it doesn't seem to check their membership again.
-
Well, then check your RADIUS server to see whether it does any caching or some such.
You can also trace the traffic between pfSense and the RADIUS server with tcpdump/wireshark and see if the RADIUS server tells pfSense, during the periodic checks, that it should disconnect the user. pfSense expects a response of 3 (Access-Reject) from the RADIUS server to actually disconnect the user. -
Both the RADIUS log file, which shows authentication requests and responses, and a tcpdump on the pfSense system monitoring traffic to and from the RADIUS server show that users in the captive portal status list get re-authenticated every minute, while those who don't figure in the captive portal database but have still been entered into the ipfw table are never re-authenticated.
Past the initial authentication there seems to be no further traffic between the RADIUS server and the pfSense box regarding such a user/MAC address.
I've given the problem a bit of thought and figure it might be related to the way the db is handled when re-authenticating. I haven't run through all the source code yet, and my PHP knowledge is rudimentary, but I could imagine it's a lost-update problem, with multiple processes working on the same file.
One example being captiveportal_prune_old(), which calls captiveportal_read_db(). It seems that once the db is read into memory, the file is unlocked again.
The prune function runs through all sessions, removing timed-out sessions, but also seems to handle the re-authentication. With a large number of users this might take a while, during which other login calls and such might add rows to the session database.
When the function has finished re-authenticating, it writes its memory-cached version of the DB to file again using captiveportal_write_db(). Any updates made while the re-authentication loop was running appear to be lost. Not sure if this is the actual case, but it's what I can gather from the code I've read.
The rows going missing will not figure in the database pruning run, and thus it won't re-authenticate them against the RADIUS server; so if their permissions in the directory are removed, they won't be kicked offline, since the pruning function doesn't remember them. Is it completely misguided of me to think along this path? From my knowledge it seems plausible, but I haven't got the whole picture yet.
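To make the suspected lost update concrete, here's a sequential simulation of the interleaving I have in mind (plain PHP, not pfSense code; file name and contents are made up):

```php
<?php
// Simulate the suspected interleaving: the pruner reads the db, a login
// appends a row while the pruner is busy, then the pruner writes back its
// stale in-memory copy, clobbering the new row.
$db = tempnam(sys_get_temp_dir(), "cpdb");
file_put_contents($db, "sess1\nsess2\n");

// 1. Pruner reads the whole db into memory (and, per my reading, unlocks the file).
$pruner_copy = file($db, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// 2. Meanwhile a login process appends a new session.
file_put_contents($db, "sess3\n", FILE_APPEND);

// 3. Pruner finishes its (slow) re-auth loop and writes its stale copy back.
file_put_contents($db, implode("\n", $pruner_copy) . "\n");

// sess3 is now gone from the db: present in the firewall table, invisible to pruning.
$final = file($db, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
```

If the write at step 3 instead happened under an exclusive lock held since the read at step 1, the login at step 2 would have had to wait and its row would survive.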
If this could indeed be the cause of users not being listed in the captive portal, maybe introducing functions to add and remove entries from the captive portal db would be in order.
The pruning function would read the DB as usual, but instead of making changes directly in its own in-memory data set, it would store a list of entries to remove in one array and entries to update in another. Once done, it should lock the database file exclusively, read a fresh copy into memory (which would include new elements) without releasing the lock, then apply its change queues (update/delete/insert) to the memory copy, and then rewrite the database file before releasing the lock. Not sure if this is viable, considering it'll hold the exclusive lock on the file for a longer time, but if my earlier assumption is correct, it should remedy the problem of rows added during the pruning process disappearing. I'd imagine a function along these lines:

```
function update_captiveportal_db(&$addedElements, &$removedElements, &$updatedElements) {
    //Lock database file
    //Read into memory
    foreach ($removedElements as $toRemove) {
        //Find and remove element from memory-cached db
    }
    foreach ($updatedElements as $toUpdate) {
        //Find the updated element, update the memory-cached db row
    }
    foreach ($addedElements as $toAdd) {
        //Add element to memory-cached DB
    }
    //Write db file again
    //Release lock
}
```
-
Yeah there is a race there and i am aware of.
Give me some time and i will get back to you with a fix. -
Thank you very much. If there's something I can do to help, I'd be happy to.
-
Would something akin to this work, do you think? I've copied the contents of read_db and write_db together.
The idea is to send along a list of the entries you want to remove (which could be built instead of):

```
/* This is a kludge to overcome some php weirdness */
foreach ($unsetindexes as $unsetindex)
    unset($cpdb[$unsetindex]);
```

Also sending along an array of db-style entries which were added during the run, and a reference to the complete data table as read initially; this is done to allow any updates to be written.
```
function captiveportal_update_db(&$removeEntries, &$addEntries, &$allEntries) {
    global $g;
    //Create lookup arrays with references to the passed data.
    $refTable = array(); //Reference table for all the entries, for quick lookup
    $cntTable = count($allEntries);
    for ($i = 0; $i < $cntTable; $i++) {
        $refTable[$allEntries[$i][2] . "-" . $allEntries[$i][3]] = &$allEntries[$i];
    }
    $delRefTable = array();
    $cntTable = count($removeEntries);
    for ($i = 0; $i < $cntTable; $i++) {
        $delRefTable[$removeEntries[$i][2] . "-" . $removeEntries[$i][3]] = &$removeEntries[$i];
    }

    //Create an exclusive lock; we won't unlock till we're done updating.
    $cpdblck = lock('captiveportaldb', LOCK_EX);
    $cpdb = array();
    //Read the database much like read_db.
    $fd = @fopen("{$g['vardb_path']}/captiveportal.db", "r");
    if ($fd) {
        while (!feof($fd)) {
            $line = trim(fgets($fd));
            if ($line) {
                $lineArr = explode(",", $line);
                //This is the index to use. I don't know if mac-IP is likely to change,
                //or is unusable as a unique index.
                $index = $lineArr[2] . "-" . $lineArr[3];
                if (isset($refTable[$index])) {
                    //The currently read row from the .db file already exists in the data
                    //table. Disregard what's in the .db file and use the data we've
                    //potentially altered instead.
                    if (!isset($delRefTable[$index])) {
                        //This line has not been tagged for deletion.
                        $cpdb[] = $refTable[$index];
                    }
                    //If it is tagged for deletion, omit reading it into memory.
                } else {
                    //This line was added since we read the data last time. Copy it directly
                    //without altering.
                    $cpdb[] = $lineArr;
                }
            }
        }
        fclose($fd);
        //Add the new entries we might've added during the process.
        foreach ($addEntries as $entry) {
            $cpdb[] = $entry;
        }
    }
    //End of the copy from _read. Now open the file for write access as in write_db.
    //Since we've held exclusive access since we started reading, nothing more can have
    //been added: $cpdb now includes anything written since the first read, with deleted
    //entries removed and added entries appended.
    $fd = @fopen("{$g['vardb_path']}/captiveportal.db", "w");
    if ($fd) {
        foreach ($cpdb as $cpent) {
            fwrite($fd, join(",", $cpent) . "\n");
        }
        fclose($fd);
    }
    unlock($cpdblck);
}
```
Note that I haven't tested this code at all; it should work fairly well but might have syntax errors. I've done a test using the &$ references in the indexing table to ensure it's valid code and able to make proper references, so we don't waste too much memory.
-
Try this one or grab a snapshot that will have that in it.
https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/006802ab988a6fd7be75d09f00464fd19c903ab7
https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/328c1def40513f3c39e18016e63ed58f5c59a78b
https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/ce1942d6a3097915948eee340bdc2564ae84ea6b
It's almost the same idea, but a little more complex/clean.
-
Nice work there. I'll try to get the update on this afternoon once our users have left the building.
I'll of course report back on how it works; it definitely looks like it should do the job. Thank you very much for the quick response. -
The update seems to fix the problem, though now it seems the captiveportal is no longer re-authenticating against the RADIUS server.
Apart from the initial logons there are no further entries in the RADIUS server log for any given user, where before they'd get logged once every minute. I'm not sure where the entry to run the prune_old() function once a minute lives; according to grep it seems to be in the php.core file, but I'm assuming there is a task somewhere, perhaps in a scheduler, supposed to run this. I believe I saw a scheduler entry about the pruning earlier, but can't remember where I found it.
The closest I've found is the Schedules menu under Firewall, but that list is empty at present.
Is it safe for me to manually make a PHP file which includes captiveportal.inc and call the prune function by hand? Or would any errors get logged somewhere? I've checked /var/log but can't find anything relating to pruning.
Doing a cat /var/run/cp_pruneold.pid and using that pid number in ps -aux | grep <pid> shows it's running with minicron. When trying to run /etc/rc.prunecaptiveportal manually I get
Fatal error: Error converting Address in /etc/inc/radius.inc on line 210
Corresponding line is
return radius_put_addr($this->res, $attrib, $value);
On a bit of closer poking, I think the problem lies in captiveportal.inc.
Using $no_users = count($cpdb); was fine when you used a regular indexed array, but the change to read_db means that where before $cpdb[$i] used a number from 0 to x, the array is now indexed by keys, not numbers, so calling $cpdb[0] (the first element) gets you a reference to a non-existent element. I need to read through the read and write code to figure out how the keys are handled, so I don't break anything in an attempted fix. A quick and dirty fix would be:
```
$indexes = array();
foreach ($cpdb as $cpKey => $cpVal) {
    $indexes[] = $cpKey;
}
$no_users = count($indexes);
```

and then replace $cpdb[$i] with $cpdb[$indexes[$i]].
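For what it's worth, PHP's built-in array_keys() does the same job as collecting the keys by hand, so an equivalent (and slightly shorter) form of that quick-and-dirty fix would be (the sample $cpdb contents here are made up):

```php
<?php
// Equivalent to the manual $indexes loop: array_keys() returns all keys of the
// (now key-indexed) $cpdb in order, so $cpdb[$keys[$i]] can stand in wherever
// the old code used $cpdb[$i].
$cpdb = array(
    "10.0.0.1-aa:bb" => array("sess1"),
    "10.0.0.2-cc:dd" => array("sess2"),
);
$keys = array_keys($cpdb);
$no_users = count($keys);
for ($i = 0; $i < $no_users; $i++) {
    $cpent = $cpdb[$keys[$i]]; // the row the old $cpdb[$i] was meant to fetch
}
```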
-
Further pondering
A quick, not necessarily correct, fix would be:

```
$keyArr = array();
foreach ($cpdb as $key => $value) {
    $keyArr[] = $key;
}
$no_users = count($keyArr);
//for ($i = 0; $i < $no_users; $i++) {
for ($j = 0; $j < $no_users; $j++) {
    $i = $keyArr[$j];
```

This should remove the need for any further alterations in the prune function, as $i will again contain the key to the data.
I am not certain things will work correctly with the unsetindexes. Also, looking over the altered write_db function, I'm assuming the prune_old function doesn't alter any database rows? If it does, the data won't get written, as write_db seems to favor the data from the read_db done during its own call. Is this a completely wrong assumption?
-
I've modified the captiveportal.inc file, fixing up the code as I suggested, and implemented a small optimization to avoid multiple dictionary lookups for each reference to the current row.
Since the forum doesn't allow .inc uploads I've renamed it to .txt
I've put it on our server and it now seems to prune automatically again without problems.
-
It should not do that, so please check again.
I put this fix in for the referenced entries https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/3e5c0ab797c65ac4833dfb75049a3e5dd396db74 -
Will try the updated snapshot this afternoon then.
My temporarily implemented fix, as attached, did get reauthentication running again, but I'm again seeing entries in the firewall which aren't in the .db file. Will report back after testing whether the new snapshot fixes the problem. -
Right, I tried to do an update with the auto-updater, which seemed to go well, except the captiveportal.inc file is now back to before that last edit again (i.e. using ints to try to access the data).
I guess I'm a bit stupid, but I haven't been able to find a download link for that version of the captiveportal.inc file, so I quickly cobbled my kludge back together.
Also, with a bit of a look-over, I think I might've found why we're still experiencing users getting dropped out of the database.
It seems captiveportal_write_db only re-reads the .db file (and thus picks up any changes other processes might've made) if $unsetindexes isn't an empty array. If the pruning function doesn't remove anyone (which is quite likely, considering it runs once every minute), it seems to behave the same way as before, overwriting any changes made in the meantime with the memory-cached version.
To combat this I've made a small change to the prune function on our server again, so it does this:

```
if (!empty($unsetindexes)) {
    captiveportal_write_db($cpdb, false, $unsetindexes);
}
```
I'm still not entirely sure nothing is altered during the prune function other than removals, but the way write_db works seems to discard any changes if there are deleted indexes, so skipping the write entirely when there are no deleted rows shouldn't prove much more of a problem.
I'll report back how things go with this approach.
Any pointers on how to retrieve the captiveportal.inc file with your changes would be nice too, since any small hacks I make in an attempt to narrow down the root of the problem get overwritten when autoupdate is invoked.
-
From the console you can use the pfSense shell, or just go to rcs.pfsense.org and browse the repo.
-
The if condition I mentioned, wrapped around the captiveportal_write_db call, seems to work. Halfway through a normal day we've got 1402 users online without anyone having dropped out of the database. A good difference from yesterday, with 1200 users online and 130 missing from the database file.
The edit was made on the same captiveportal.inc file I uploaded before, as I get an error message when trying to download captiveportal.inc from the tree.
-
Well, it is now on the latest snapshots, so you can upgrade or just download an update file.
Extract it and manually upload the file to the pfSense install. Good to hear it works as it should now.
-
Updated to the snapshot last night and reauthentication seems to be running again with the snapshot code.
Unfortunately it's also started dropping logins out of the database again.
Can I once again suggest you alter captiveportal_prune_old() from ending with

```
captiveportal_write_db($cpdb, false, $unsetindexes);
```

to ending with

```
//Write the database again only if we have deleted anything, as an empty
//$unsetindexes will cause the database to roll back to the state read when
//this function was called
if (!empty($unsetindexes)) {
    captiveportal_write_db($cpdb, false, $unsetindexes);
}
```
-
I actually put another solution in.
Please test that.