captive portal: nginx 504 GW timeout & 'dnctl: need a pipe/flowset/sched number' => MAC addr cleanup job needed
-
Hello all,
after a fresh installation of pfSense (2.7 on Intel hardware) to switch to a ZFS mirror, and after restoring a backup, I had some trouble with the captive portal.
When I rebooted the system, it hung while loading the captive portal. I had to log in via SSH and restart php-fpm with /etc/rc.php-fpm_restart.
After that, the boot process finally continued.
Q: How can I change the boot order of the services? The IPsec VPN has to start before the captive portal, so that you are not locked out in a situation like this.
If you are not on site, it is no longer possible to access the system. This is very unfortunate if the pfSense box is 270 km away and it is midnight.
In the system log there are many lines like this:
Mar 14 21:41:11 pfsense1 php-fpm[17515]: /rc.captiveportal_configure_mac: The command '/sbin/dnctl pipe 0 config bw 0Kbit/s queue 100 buckets 16' returned exit code '64', the output was 'dnctl: need a pipe/flowset/sched number'
While this is happening, it keeps writing the same data into /var/db/captiveportaldn.rules again and again.
It seems to me that this comes from the traffic quota ("Per-user bandwidth restriction" enabled) together with the large number of MACs, so I disabled it in the captive portal config.
When I disabled the captive portal, enabled it again and saved the config, it hung and I got a 504 nginx timeout.
There were 8245 MAC addresses in the config ("Auto-added for voucher ******") because of an enabled "Pass-through MAC Auto Entry".
Saving the config with that many entries takes so long that nginx runs into a timeout...
To get out of it, I disabled the "Per-user bandwidth restriction", stopped the captive portal, and deleted the /var/db/captiveportal* DBs as well as the /tmp/captiveportal* files.
Then I opened /cf/conf/config.xml with vi, jumped to the first "Auto-added..." entry, moved up to the opening <passthrumac> line of that first match, and saw that every entry spans nine lines. I typed '74205dd' (9 lines * 8245 MAC addresses) to delete all of these entries and saved with ':x'.
Then I rebooted the firewall.
After that it was working again.
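For anyone who would rather not count lines in vi, the same bulk removal can be sketched in Python. This is only a sketch under assumptions: the <passthrumac> element name and the "Auto-added for voucher" text come from the post above, but the name of the child element holding that text (`descr` here) is a guess and should be checked against your own config.xml. Run it on a copy of the file, not on the live config.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

AUTO_PREFIX = "Auto-added for voucher"  # description text quoted in the post


def drop_auto_added_macs(xml_text: str, descr_tag: str = "descr") -> Tuple[str, int]:
    """Remove every <passthrumac> whose description starts with AUTO_PREFIX.

    descr_tag is an ASSUMPTION about which child element holds the text.
    Returns the new XML text and the number of removed entries.
    """
    root = ET.fromstring(xml_text)
    removed = 0
    # Walk a snapshot of all elements so children can be removed while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "passthrumac":
                continue
            descr = (child.findtext(descr_tag) or "").strip()
            if descr.startswith(AUTO_PREFIX):
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Note that ElementTree does not preserve CDATA sections or comments when writing the tree back, so the output is best diffed against a backup before it goes anywhere near /cf/conf/config.xml.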
Q: Is there a way for a cron cleanup job to delete MAC addresses from the DB and config for outdated/invalid vouchers?
Regards,
Ralf
-
@getcom said in captive portal: nginx 504 GW timeout & 'dnctl: need a pipe/flowset/sched number' => MAC addr cleanup job needed:
The command '/sbin/dnctl pipe 0 config bw 0Kbit/s queue 100 buckets 16
/sbin/dnctl pipe 0 config bw 0Kbit/s queue 100 buckets 16
The pipe number can't be 0.
Not sure about the bandwidth, maybe "0" here just means: no limit.
The maximum number of pipes is known up front:
https://github.com/pfsense/pfsense/blob/89b927199c8b2c50b13bc368139d4a81ccb17889/src/etc/inc/captiveportal.inc#L1597
so, 62000 / 2 = 31000 indexes.
Note: $ruleno is set to 0 at the start. There is no error condition checking (!!), and if no free index is found, "0" is returned. This (I'm pretty sure) explains your 'fail' situation.
The allocated pipes are stored and maintained in the file you've mentioned: "{$g['vardb_path']}/captiveportaldn.rules" = /var/db/captiveportaldn.rules
It looks to me that "unserializing" that file (line 1603), searching for a free position, adding it in, then "serializing" the array and writing it back out, 8245 times during portal start, is just too much. It takes too much time, and nginx times out.
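To make the two points above concrete, here is a simplified Python model. It is not the pfSense code: the bounds, the file format (pickle standing in for PHP's serialize) and every function name are made up for illustration, and reserving two numbers per entry is only how I read the "/ 2" above. It shows an allocator that raises an error instead of returning 0, and a batch variant that reads and writes the state file once instead of once per MAC.

```python
import pickle
from pathlib import Path
from typing import Dict, List

RULENO_START = 2000  # hypothetical bounds, chosen only for illustration
RULENO_END = 2010


class NoFreeRule(RuntimeError):
    """Raised instead of silently handing back rule number 0."""


def load_rules(path: Path) -> Dict[int, str]:
    # One full read + deserialize of the state file.
    return pickle.loads(path.read_bytes()) if path.exists() else {}


def save_rules(path: Path, rules: Dict[int, str]) -> None:
    # One full serialize + write of the state file.
    path.write_bytes(pickle.dumps(rules))


def next_free_pair(rules: Dict[int, str], owner: str) -> int:
    """Reserve two consecutive rule numbers and return the first one."""
    for idx in range(RULENO_START, RULENO_END, 2):
        if idx not in rules:
            rules[idx] = rules[idx + 1] = owner
            return idx
    raise NoFreeRule("no free pipe number left")


def allocate_all(path: Path, owners: List[str]) -> Dict[str, int]:
    """Load once, allocate for every owner in memory, save once."""
    rules = load_rules(path)
    result = {owner: next_free_pair(rules, owner) for owner in owners}
    save_rules(path, rules)
    return result
```

With the per-entry approach described above, the file would be read and rewritten once for each of the 8245 entries; in this model that happens once per batch, and a full table surfaces as an exception rather than as "pipe 0".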
If you already have an SSD (or better) and a 'big iron' processor (not some small ARM device or an Intel Atom), I'm not sure what you can do, except avoid creating the situation where you wind up with a "large count of MAC".
@getcom said in captive portal: nginx 504 GW timeout & 'dnctl: need a pipe/flowset/sched number' => MAC addr cleanup job needed:
Q: Is there a way for a cron cleanup job to delete MAC addresses from the DB and config for outdated/invalid vouchers?
Answer: by not auto-adding them in the first place.
Or: by checking the GUI regularly and removing older ones (I know, tedious).
Or: a script that parses all 'auto MAC' entries, reads the comment line, isolates the voucher ID, checks whether that voucher is still valid, and if not: removes the auto .....
Wait .... isn't this what the "captiveportal_prune_old_automac()" function (line 810) is all about? It is called every 5 minutes or so.
It's doing something comparable, with concurrent logins. Easy to rebuild as "if voucher is expired, then ditch the auto added MAC".
Btw, it is said:
When enabled, a MAC passthrough entry is automatically added after the user has successfully authenticated. Users of that MAC address will never have to authenticate again. To remove the passthrough MAC entry either log in and remove it manually from the MAC tab or send a POST from another system. If this is enabled, the logout window will not be shown.
-
Hi Gertjan,
thank you for your answer.
As I understand from your comments, and this is also my experience, it is problematic when lots of MAC addresses are registered. There is a general problem with that.
First of all, the VPN has to start BEFORE the captive portal.
How can I change the order of services?
I found a commit from last week:
https://github.com/pfsense/pfsense/commit/8bfe17dae7ab15b7af802f69dbb7c421d098d38c
"Prune old Captive Portal sessions for autoadded MAC. Fix #15299
Use the correct function to delete passthrumac entries. Remove the pipe
check since it's already handled by the function."
To me this means that deleting auto-added MACs did not work before, because the wrong function was called.
Additionally, the same is not implemented for vouchers. This is essential, because I think it is the main reason why a customer wants to have this feature.
You said "Easy to rebuild as "if voucher is expired, then ditch the auto added MAC"". Should we implement that and commit a fix?
-
@getcom said in captive portal: nginx 504 GW timeout & 'dnctl: need a pipe/flowset/sched number' => MAC addr cleanup job needed:
https://github.com/pfsense/pfsense/commit/8bfe17dae7ab15b7af802f69dbb7c421d098d38c
Looks like that's related.
It's an easy edit, go ahead!
@getcom said in captive portal: nginx 504 GW timeout & 'dnctl: need a pipe/flowset/sched number' => MAC addr cleanup job needed:
You said "Easy to rebuild as "if voucher is expired, then ditch the auto added MAC"". Should we implement that and commit a fix?
The easiest solution would be: don't "auto add", as this is only a comfort option for your portal users. In the long run, it is not one for you!
They, the portal users, log in once using the voucher code, and from then on they stay logged in forever. It's up to you to remove the 'old' MACs manually. Seems tedious to me.
Is there a comment added to the auto-added MAC entry? If so, and it contains the voucher ID, it's easy to parse all the MAC entries, isolate the voucher code, test it for validity (still time left) and, if it has expired, delete the MAC entry altogether (thus doing an auto clean-up ^^).
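As a rough sketch of that clean-up logic (in Python, just to show the idea, not pfSense code): the "Auto-added for voucher ..." comment format is taken from the first post, while the `descr` key, the shape of the entries and the `voucher_is_valid` callback are assumptions that would have to be mapped onto whatever the portal really stores.

```python
import re
from typing import Callable, Dict, List, Tuple

# Comment format taken from the first post; the voucher code is whatever follows.
AUTO_RE = re.compile(r"^Auto-added for voucher\s+(\S+)")

Entry = Dict[str, str]


def split_expired(
    entries: List[Entry],
    voucher_is_valid: Callable[[str], bool],  # hypothetical lookup, supplied by the caller
) -> Tuple[List[Entry], List[Entry]]:
    """Return (keep, drop).

    Only auto-added entries whose voucher is no longer valid end up in drop;
    manually added MACs are never touched.
    """
    keep: List[Entry] = []
    drop: List[Entry] = []
    for entry in entries:
        match = AUTO_RE.match(entry.get("descr", ""))
        if match and not voucher_is_valid(match.group(1)):
            drop.append(entry)
        else:
            keep.append(entry)
    return keep, drop
```

Called with the list of pass-through MAC entries and a function that answers "does this voucher still have time left?", it hands back the entries to delete, which is the part a periodic job would then act on.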
I'm not using vouchers at all on my portal, but I'll have some spare time next week, and I'll see what I can come up with.