CP Hard / Idle Timeout stops working when reloading squid 2 / squidguard !?

Nachtfalke

Hi everybody,

I have been running pfSense since 2.0 BETA, but I have only used Captive Portal + squid on pfSense 2.0.1, 2.0.3 and 2.1.

On all these versions I have had the problem that the idle timeout (30 min) and the hard timeout (180 min) do not always work. I never tried too hard to find the reason and just restarted CP once a week to disconnect the "unwanted" and old user logins.

Now it looks like I have found out when the timeouts stop working and when they do work:
If I reconfigure the CP, the timeouts work. If I make any changes to squid 2.x or squidguard, the timeouts seem to stop working. But if I reconfigure the CP after changing squid, all users are disconnected with TIMEOUT, and users who log in afterwards are kicked when the timeouts are reached.

Does anybody else have this problem?
Any workarounds?
Is it my misconfiguration?

Any suggestions are welcome :-)

insurin

I came across the same issue last night. I had my idle and hard set to 360. I left work last night, no users on the network from 9pm. I came in this morning and CP status was still showing users connected from yesterday.

Gertjan

Hi there.

Take 5 minutes, and read this : Shouldn't expired sessions be removed?
Can you reproduce ? Is this what you mean ?
In that thread, the faulty process was found ….

Nachtfalke

@Gertjan:

Hi there.

Take 5 minutes, and read this : Shouldn't expired sessions be removed?
Can you reproduce ? Is this what you mean ?
In that thread, the faulty process was found ….

Hi,

I checked my minicron and found this:


root   23597  0.0  0.0  5784  1108  ??  Is   12:39AM   0:00.00 /usr/local/bin/minicron 60 /var/run/cp_prunedb_gaestehaeuser.pid /etc/rc.prunecaptive
root   23742  0.0  0.0  5784  1156  ??  I    12:39AM   0:00.00 minicron: helper /etc/rc.prunecaptiveportal gaestehaeuser (minicron)

This process disappears when I reconfigure something in squid or squidguard.
After reconfiguring Captive Portal it is back.

Changing firewall rules does not make this minicron disappear.

So something that squid does kills the minicron for CP.
I am using squid2.
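
A quick way to confirm this after a squid or squidguard change is to look for the pruning minicron in the process list. The following is only a minimal sketch: the function name cp_pruner_running is made up for illustration, and the pid file pattern cp_prunedb_<zone>.pid is assumed from the ps output above.

```sh
# Sketch: report whether the Captive Portal pruning minicron for a zone
# appears in a process listing read from stdin.
# cp_pruner_running is a hypothetical helper, not part of pfSense.
cp_pruner_running() {
    zone=$1
    # Look for the minicron parent whose arguments name this zone's pid file
    # (naming assumed from the ps output: cp_prunedb_<zone>.pid).
    grep -q "/usr/local/bin/minicron .*cp_prunedb_${zone}\.pid"
}

# Possible use on the firewall (ww asks ps not to truncate long lines):
#   ps axww | cp_pruner_running gaestehaeuser || echo "pruner missing"
```

If the check reports the pruner as missing, reconfiguring the Captive Portal brings it back, as noted above.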

Gertjan

"squid 2 / squidguard"
And when one of these two dies, it takes with it the minicron that runs at that very moment (every minute) and is (was) responsible for disconnecting portal users.
I'm pretty sure you are seeing the same thing.
The side effect is: people aren't disconnected from the portal anymore until you restart it (=restart the portal logic).

Let me guess: you are not using Radius authentication, right ?

Nachtfalke

Hi,

Your guess is wrong. I authenticate CP users by username and password against freeradius2 package.

mikenl

Same problem as the thread Gertjan mentioned, we're using radius also.

neo_X

I have the same problem, someone found the solution?

johnjohn

Same problem, pfSense 2.1.4, with local User Manager authentication, no squid/snort etc installed.

Gertjan

A double problem:
The subject:

when reloading squid 2 / squidguard

Let's say: "squid 2 / squidguard " are installed.

You're saying:
@johnjohn:

no squid/snort etc installed.

So, IF it's related to the same problem, it isn't related to squid 2 / squidguard ….

But this thread IS squid 2 / squidguard related - Nachtfalke clearly shows a relationship.

You get the point ?

Anyway, some advice was given so that you are all able to get some more information about what happens and when.

Just posting: "there is a problem" is a waste of time.
For example: I'm using the Portal with the Local User Manager and NO "squid 2 / squidguard", and this http://www.papy-team.org/munin/dyndns.org/brithotelfumel.dyndns.org/index.html#portalusers (my pfSense) shows me that users get disconnected ;)

johnjohn

I bit the bullet and did a clean install of 2.1.4. Now the Captive Portal works OK and the minicrons keep running (3 days so far). The only item I had installed previously that may have been causing the problem was the cron package; this time I did not install it.

Gertjan

:-\

http://www.papy-team.org/munin/dyndns.org/brithotelfumel.dyndns.org/index.html#portalusers (graphs starting at July 9 and lasting for 3 days: connects but no more disconnects)

Clearly enough, connections stopped getting disconnected.
A

ps ax | grep 'minicron'

listed them all, except these two:

10802 ?? Is 0:00.00 /usr/local/bin/minicron 60 /var/run/cp_prunedb_cpzone.pid /etc/rc.prunecaptiveportal cpzone
11129 ?? I 0:00.22 minicron: helper /etc/rc.prunecaptiveportal cpzone (minicron)

My idle time out is 30 minutes - hard time out is 300 minutes (5 hours), so I knew approximately when the minicron instance died.
Sadly enough, nothing special in the logs.

Knowing that I have only one captive portal ‘zone’ and that it is called ‘cpzone’, I launched the minicron manually:

/usr/local/bin/minicron 60 /var/run/cp_prunedb_cpzone.pid /etc/rc.prunecaptiveportal cpzone

Good: function captiveportal_prune_old() in /etc/inc/captiveportal.inc got called again each minute, but …

 .... count($cpdb);

after a

$cpdb = captiveportal_read_db();

said: no or empty client database! (== no connected clients), yet the Captive Portal Status page DID show the connected clients (info coming from the same client database).

demco

Running the command 'pkill -HUP cron' can terminate the minicron process. pfSense uses this command when it configures cron jobs.

Packages like squid call this library code and unintentionally kill the minicron process.

Also, the command does not kill minicrons started during bootup. Hence the other 3 are not affected.

I have already opened a ticket (#3757) to see if a developer can rectify the issue.

Gertjan

@demco:

Running the command 'pkill -HUP cron' can terminate the minicron process.

Can ?
Is there some random factor, so that it only does so once in a while ?

@demco:

pfSense uses this command when it configures cron jobs.

That configures the 'real' cron jobs listed in /etc/crontab, not the cron look-alikes such as the one started by the captive portal.
Basically, it rewrites /etc/crontab and sends a signal to 'cron', not to the executable /usr/local/bin/minicron.

I remember seeing the minicron source code.
It forks in the background.
It will sleep for param1 seconds
If the PID param2 exists, then it will run the 'command' that was given as param3.
And reloop.
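
As a rough illustration of the loop described above, here is a shell sketch. It is not the real minicron source: the function name minicron_sketch is hypothetical, and ending the loop once the pid file disappears is my own simplification so that the example can terminate.

```sh
# minicron_sketch INTERVAL PIDFILE COMMAND [ARGS...]
# Hypothetical imitation of the behaviour recalled above, not the real binary.
minicron_sketch() {
    interval=$1
    pidfile=$2
    shift 2
    while :; do
        # Sleep for param1 seconds.
        sleep "$interval"
        # Assumption for this sketch only: stop when the pid file is gone.
        [ -e "$pidfile" ] || break
        # The pid file (param2) exists, so run the command (param3 onwards).
        "$@"
    done
}

# Started in the background, mirroring the "forks in the background" step:
#   minicron_sketch 60 /var/run/cp_prunedb_cpzone.pid /etc/rc.prunecaptiveportal cpzone &
```

The point of the sketch is that nothing in such a loop restarts itself: once the process is killed, the command is simply never run again until something launches a new instance.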

@demco:

Packages like squid call this library code and unintentionally kill the minicron process.

Possible.
Btw: I'm not using any packages.
But I will trace this function, to see when its called.

@demco:

Also, the command does not kill minicrons started during bootup. Hence the other 3 are not affected.

I just ran several times:

/bin/pkill -HUP cron

The list with my minicrons:

[2.1.4-RELEASE][admin@my-domaine.]/etc/inc(15): ps ax | grep 'minicron'
11941  ??  Is     0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/bin/ping_hosts.sh
12140  ??  I      0:00.23 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
12237  ??  Is     0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /etc/rc.expireaccounts
12536  ??  I      0:00.02 minicron: helper /etc/rc.expireaccounts  (minicron)
12956  ??  Is     0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pid /etc/rc.update_alias_url_data
13034  ??  I      0:00.00 minicron: helper /etc/rc.update_alias_url_data  (minicron)
38556  ??  Is     0:00.00 /usr/local/bin/minicron 60 /var/run/cp_prunedb_cpzone.pid /etc/rc.prunecaptiveportal cpzone
39174  ??  S      0:00.95 minicron: helper /etc/rc.prunecaptiveportal cpzone (minicron)
42092   0  S+     0:00.00 grep minicron

The list didn't change. No minicrons were killed.
In particular, this one:
38556 ?? Is 0:00.00 /usr/local/bin/minicron 60 /var/run/cp_prunedb_cpzone.pid /etc/rc.prunecaptiveportal cpzone
It's still running.

/bin/pkill -HUP cron

Didn't stop mine …
Btw: I have a line sent to my remote syslog every time "/etc/rc.prunecaptiveportal" is executed. It lists the number of clients connected and some major values like timeout, soft-timeout, etc.

@demco:

I have already opened a ticket (#3757) to see if a developer can rectify the issue.

What about adding some more details ?
(edit: here ticket #3757 ;))

The nasty thing is: we no longer have (direct) access to the source of the /usr/local/bin/minicron binary.
With the source, it would be easy to see if this executable listens or reacts to /bin/pkill -HUP cron signals.
But I doubt it …..

demco

Gertjan,

As mentioned before, the command only seems to affect a minicron that has been restarted since system bootup.

You need to restart the minicron before testing. Just click the save button on the captive portal page; this should restart minicron with a different PID.

Now run pkill -HUP cron. Minicron should be killed.

Gertjan

:o

Thanks for the extra details.

I confirm, I ran 'pkill -HUP cron' several times, my 'captive portal minicron' wasn't killed.

Then, as advised above, I restarted the captive portal (using the small icon on the top right side).
'pkill -HUP cron' still wouldn't kill my 'captive portal minicron'.

I stopped (deactivated) the portal interface - and enabled it again.

Now, running 'pkill -HUP cron' DID KILL MY 'captive portal minicron' - every time.

So, after reboot, all is well.
But, if I deactivate and reactivate the portal interface, the function configure_cron() in /services.inc (when executed) will blow away the portal interface pruning process.

I guess you found a nasty bug !! :)

edit:

What's being mentioned here matches the issue : http://unix.stackexchange.com/questions/23930/how-to-pkill-by-command-name
I don't know why an initial (after boot) 'minicron' stays alive, while all the others, created when the portal is re-activated, get killed.
The FreeBSD command pkill does some matching, so this:

/bin/pkill -HUP -f "cron -s"

might be the solution (if the minicrons shouldn't be restarted, but cron should).
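
To make the matching problem concrete: pkill treats its pattern as an extended regular expression and, without -f, compares it against the process name, so the unanchored pattern cron also fits minicron. The sketch below imitates that selection with grep on a plain list of names. The helper names are hypothetical and grep only stands in for pkill's matching here; the sketch does not explain why a minicron started at boot survives.

```sh
# names_matching PATTERN
# Read process names on stdin and print those that an unanchored extended
# regular expression selects (hypothetical stand-in for pkill's name match).
names_matching() {
    grep -E -- "$1"
}

# names_matching_exact NAME
# Print only the names that are exactly NAME, which is the effect an
# exact-match option aims for.
names_matching_exact() {
    grep -Fx -- "$1"
}

# Illustration with made-up input:
#   printf 'cron\nminicron\nsyslogd\n' | names_matching cron        # cron, minicron
#   printf 'cron\nminicron\nsyslogd\n' | names_matching_exact cron  # cron
```

On the real system, pgrep -l cron lists what a pattern would hit without sending any signal, and the -x option restricts pkill and pgrep to exact matches.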

demco

Looks like the developer resolved it by using this command instead.

sigkillbypid("{$g['varrun_path']}/cron.pid", "HUP");

mendilli

@demco:

Looks like the developer resolved it by using this command instead.

sigkillbypid("{$g['varrun_path']}/cron.pid", "HUP");

Could you please tell us where (in which file, on which line) this code must be replaced?

demco

@mendilli:

Could you please tell us where (in which file, on which line) this code must be replaced?

It was mentioned in reply #15.

Function configure_cron() in services.inc, replace the pkill line with the new code.
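
The idea behind the fix is to signal only the process recorded in cron's pid file instead of every process whose name matches a pattern. A minimal shell sketch of that idea follows; hup_by_pidfile is a hypothetical helper, not the pfSense sigkillbypid() function, and the path in the usage line assumes that $g['varrun_path'] resolves to /var/run.

```sh
# hup_by_pidfile PIDFILE
# Send SIGHUP to exactly the process whose pid is stored in PIDFILE.
# Hypothetical helper for illustration; it returns non-zero if the file is
# missing or does not contain a plain number.
hup_by_pidfile() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Refuse anything that is not purely digits.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -HUP "$pid"
}

# Possible use (path is an assumption, see above):
#   hup_by_pidfile /var/run/cron.pid
```

Because the target is a single pid rather than a name pattern, a process called minicron can no longer be caught by accident.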