Hanging webGUI fix
-
Something is regularly STOPPING your webserver.
Aug 26 13:51:06 xxx.xxx.xxx.xxx lighttpd[63368]: (server.c.1546) server stopped by UID = 0 PID = 54459
Aug 26 13:51:06 xxx.xxx.xxx.xxx lighttpd[63368]: (server.c.1546) server stopped by UID = 0 PID = 54459
So you need to find out which process has the PID that appears in those log lines.
Also, from that log, I cannot see how you disabled the captive portal; the log suggests pretty clearly that it is NOT disabled at all.
Aug 26 13:48:49 xxx.xxx.xxx.xxx logportalauth[27062]: Restarting captive portal.
Aug 26 13:49:16 xxx.xxx.xxx.xxx logportalauth[1184]: LOGIN: orbit, 00:0c:29:ca:be:91, 172.16.0.100
Aug 26 13:50:44 xxx.xxx.xxx.xxx logportalauth[1184]: FAILURE: orbit, 00:0c:29:ca:be:91, 172.16.0.100
Aug 26 13:51:07 xxx.xxx.xxx.xxx minicron: (/etc/rc.prunecaptiveportal) terminated by signal 15 (Terminated: 15)
Aug 26 13:51:13 xxx.xxx.xxx.xxx logportalauth[27062]: Restarting captive portal.
You also apparently have some networking issues:
Aug 26 13:51:15 xxx.xxx.xxx.xxx php: : MONITOR: GW3G is down, removing from routing group
before the lighttpd closes the connection:
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (network_writev.c.112) writev failed: Operation not permitted 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (network_writev.c.112) writev failed: Operation not permitted 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (connections.c.637) connection closed: write failed on fd 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (connections.c.637) connection closed: write failed on fd 14
(Well, no wonder it seems that it "crashes" when your network is down.)
-
Something is regularly STOPPING your webserver.
Aug 26 13:51:06 xxx.xxx.xxx.xxx lighttpd[63368]: (server.c.1546) server stopped by UID = 0 PID = 54459
Aug 26 13:51:06 xxx.xxx.xxx.xxx lighttpd[63368]: (server.c.1546) server stopped by UID = 0 PID = 54459
So you need to find out which process has the PID that appears in those log lines.
Hm, how could I do that? I tried running top and looking for the PID on the screen when the crash occurs (just did), but the PID was not there. Either the killer process had such low usage that it didn't show in top, or it was an ad-hoc process that didn't exist before. Can I somehow send PID info to the syslog server when a process is created? Simply running "ps -aux" from a cron job will probably not be effective.
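One way to approach the lookup, sketched in Python under stated assumptions: pull the UID and PID out of the lighttpd "server stopped" line, then search a saved process listing for that PID. The function names are made up for illustration, the listing is assumed to come from something like `ps -axo pid,ppid,command`, and the sample snapshot below is invented, not taken from the affected box.

```python
import re

# Matches the lighttpd shutdown message quoted in this thread.
STOP_RE = re.compile(r"server stopped by UID = (\d+) PID = (\d+)")


def parse_stop_event(line):
    """Return (uid, pid) of the process that stopped lighttpd, or None."""
    match = STOP_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def find_command(ps_snapshot, pid):
    """Look up pid in text assumed to come from `ps -axo pid,ppid,command`.

    Returns (ppid, command), or None if the PID is not in the snapshot.
    """
    for row in ps_snapshot.splitlines():
        fields = row.split(None, 2)  # pid, ppid, rest of line as command
        if len(fields) == 3 and fields[0] == str(pid):
            return int(fields[1]), fields[2]
    return None


if __name__ == "__main__":
    log_line = ("Aug 26 13:51:06 xxx.xxx.xxx.xxx lighttpd[63368]: "
                "(server.c.1546) server stopped by UID = 0 PID = 54459")
    # Hypothetical snapshot: the commands and paths are invented.
    snapshot = """  PID  PPID COMMAND
54459     1 /bin/sh /hypothetical/script
63368     1 /usr/local/sbin/lighttpd -f /hypothetical/lighttpd.conf"""
    uid, pid = parse_stop_event(log_line)
    print(uid, pid, find_command(snapshot, pid))
```

This only helps if a snapshot exists from the moment the process was alive. A short-lived process would probably need snapshots taken every second or so and stored with a timestamp, which is much more frequent than a typical cron job.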
BTW, since there is no point in using an Alix board if I cannot use Darkstat or Captive Portal, I turned both of them on and am looking for the killer PID.
Also, from that log, I cannot see how you disabled the captive portal; the log suggests pretty clearly that it is NOT disabled at all.
Sorry for not being clear, the log was made BEFORE I turned the captive portal off; that explains the CP entries in the log. Just for clarification, an Alix board should be able to cope with captive portal, right?
You also apparently have some networking issues:
Aug 26 13:51:15 xxx.xxx.xxx.xxx php: : MONITOR: GW3G is down, removing from routing group
I could of course remove the Gateway Group entry, but the line is there every time I have a crash. I will remove it and watch.
before the lighttpd closes the connection:
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (network_writev.c.112) writev failed: Operation not permitted 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (network_writev.c.112) writev failed: Operation not permitted 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (connections.c.637) connection closed: write failed on fd 14
Aug 26 13:51:18 xxx.xxx.xxx.xxx lighttpd[25214]: (connections.c.637) connection closed: write failed on fd 14
(Well, no wonder it seems that it "crashes" when your network is down.)
It was a dirty log, I don't remember what I was doing at that exact moment. But thanks for the hint, I will try to rule out the networking issue by taking the pfSense box and attaching it to my workstation directly. That said, we have some 60 workstations on the switch with no problems, and there are no firewall rules or limiter entries on the pfSense side that could cause problems.
Thanx for helping me out ;)
-
Please, post something useful, not "dirty" logs. The log shows that your network crashes and that's pretty much it. I'd suggest wiping the config and restarting from scratch.
-
Please, post something useful, not "dirty" logs. The log shows that your network crashes and that's pretty much it. I'd suggest wiping the config and restarting from scratch.
By "dirty" I meant that I was constantly restarting services (darkstat, captive portal…), so I could not tell whether the log entries were caused by my own actions.
The installation is from scratch. I only had Darkstat running for a few days and tried to get captive portal running today. I also have some basic port forwards. No additional services or weird settings. I've tried the failover route configuration; it was running on WAN and it was no rocket science, so I excluded it as a cause of my WebGUI problems, but in order to eliminate causes I removed it too (as you indirectly suggested). I am out of the office now, but I have another CF I will copy pfSense to and will give it a go from scratch tomorrow, will let you know how it goes.
For me it would be important to know whether I am overstretching the hardware. Should an Alix board be able to handle a 12 Mbit internet connection, some firewall rules, RRD graphs (default ones), captive portal, DHCP server, DNS forwarder, Darkstat monitoring and gateway failover, maybe 5 VLANs? An Alix board should be able to handle that easily, right?
Anyway, thnx a lot for helping. My main suspect is the gateway I removed, will test tomorrow and do a clean install if necessary. Will let you know how it goes.
-
Perhaps you have a hardware problem? Something about to fail?
-
Perhaps you have a hardware problem? Something about to fail?
Most likely. Though, with statements like "The installation is from scratch" and "My main suspect is the gateway I removed"… ::)
-
doktornotor - You missed your calling as depression counselor… :D
-
Perhaps you have a hardware problem? Something about to fail?
Most likely. Though, with statements like "The installation is from scratch" and "My main suspect is the gateway I removed"… ::)
Hardware… I hope not, because it would be difficult to diagnose. I hope it's a topology configuration problem (outside pfSense), but I still don't know. We have a 3-line DSL connection with one line down, and we are waiting for the ISP to fix that. But it's on the WAN side, so I don't think it could crash the WebGUI just like that. The thing with the gateway is that I had the main gateway on the WAN side and the failover gateway on the LAN side. I tested the failover and it worked, but I might have concluded too easily that this was the right way to do it. Anyway, I bought another managed switch, so I will be able to get 2 WAN ports and have a clean installation on that side. I think there is not much point in pursuing the WebGUI issues before making sure the environment is the right one. But I'll repeat once more: I've really done nothing "unusual" to pfSense. I wouldn't expect much from an install from scratch; the few settings I've made shouldn't be crashing the WebGUI.
I hope it was the failover gateway configuration causing problems.
-
I think having all your gateways on WAN is a good idea…
Hope it goes well when you get your new equipment.
-
I think having all your gateways on WAN is a good idea…
Hope it goes well when you get your new equipment.
The ISP fixed the DSL line and I've managed to put the gateway on the WAN side of the pfSense box. 3-4 days running… no problems, so I am optimistic. I am not sure if it was the faulty DSL connection or having the WAN failover gateway on the LAN side, but the WebGUI doesn't seem to crash any more. Thanx for your help everybody!
-
Good stuff - I'm glad it's working.