WebGUI works only for a few days
-
hi all, i have configured the webGUI for remote access. https worked perfectly, the certificate too.
one day i was not able to reach the webGUI; even the login window did not appear. i had changed absolutely NOTHING. all the routing etc. was still working. i thought ok, let's restart the whole box and, what a surprise, everything was working again.
3 days later: the same problem. restarted again.
a few days later: same process.
a few days later: …
... and so on. that's annoying. this pfsense box is mounted in a rack in a computing center, and every restart costs me 3 hours.
at first i thought it was a one-off problem, but obviously it's not. any ideas? thanks in advance!
-
A lot more detail is needed before any kind of useful answer can be provided.
1. What pfSense version?
2. What platform? (Full install, embedded, etc) on what kind of hardware?
3. What packages do you have installed?
4. If you enable SSH, are you still able to connect to SSH when the WebGUI has failed?
5. If you choose the console option to restart the WebGUI, rather than reboot, does that also restore access?
6. From the console, check the system log (clog /var/log/system.log) for errors before restarting next time, and report what they are.
That should help to start out with, anyhow.
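For questions 4 and 5, a quick way to tell "only the WebGUI died" apart from "the whole box stopped answering" is to probe the ports from another machine. Here is a minimal Python sketch, assuming the WebGUI listens on HTTPS port 443 and SSH on port 22 (the defaults; adjust if you changed them). The function names and the address 192.0.2.1 are made up for illustration and are not part of pfSense.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_services(host: str, ports: dict) -> dict:
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: port_is_open(host, port) for name, port in ports.items()}


# Assumed default ports; change them if the WebGUI or SSH were moved.
DEFAULT_PORTS = {"webgui-https": 443, "ssh": 22}
```

Calling `check_services("192.0.2.1", DEFAULT_PORTS)` with the firewall's real address gives one True/False per service. If both are False while routing still works, the problem is likely deeper than the web server alone.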
-
hi, thank you for your reply. here the answers:
1. What pfSense version?
1.2.2
2. What platform? (Full install, embedded, etc) on what kind of hardware?
full install, server HP ProLiant DL120 G5, 5 ethernet ports (1 onboard, 4 on extra NIC)
3. What packages do you have installed?
just the basic 1.2.2 install
4. If you enable SSH, are you still able to connect to SSH when the WebGUI has failed?
i'm not able to connect to SSH, neither remotely nor from within the same LAN (the one the webGUI runs on). i have to mention that VPN is also gone, but routing still works.
I'm sorry, I have no possibility to try from console directly at the router at the moment.
5. If you choose the console option to restart the WebGUI, rather than reboot, does that also restore access?
no possibility to do so…
6. From the console, check the system log (clog /var/log/system.log) for errors before restarting next time, and report what they are.
see 5.
thx again!
-
From that I can only guess that the system may be locked up somehow. It might be too busy passing other traffic to actually respond on the WebGUI. Especially with 5 NICs, it could be too busy with interrupts to handle other work if you have a substantial amount of traffic.
You might try turning on device polling under advanced options, see if it keeps running that way.
You may also want to try a 1.2.3-RC3 snapshot to see if it is any better.
NB: Just looked up the specs on that server, and based on what I see that probably is not the case, but it can't hurt to try.
-
Second vote for 1.2.3 which I am fairly confident will solve your problem.
Choose one of the full updates from:
http://snapshots.pfsense.org/FreeBSD_RELENG_7_2/pfSense_RELENG_1_2/updates/
pfSense 1.2.3 is really close to full release and is more stable than 1.2.2.
-
hey, bringing up this thread again, sorry for not answering for such a long time and thank you for your answers!
had a few restarts meanwhile and looking forward to install 1.2.3 tomorrow…
i had a look at the console directly at the box and saw a lot of messages from the time the box blew up:
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: FAILURE - WRITE_DMA timed out on LBA=[…]
g_vfs_done(): ad4s1a[WRITE(offset=[…], length=[…])error=5
these messages keep repeating and nothing stops them but a hard reset. the […] are changing numbers like 2465118723…
although this is no longer just about the web interface, could anybody please explain what that means?
-
That is a sign of hard drive failure.
Or possibly cabling or controller failure, but those are less likely.
Boot from something like ubcd and run a diagnostic on the hdd, see what results you get.
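If you can capture the console output to a file before resetting, counting the messages per device gives a rough picture of how often the disk is timing out. A small Python sketch follows, assuming the lines look exactly like the ones quoted above; the regular expression and the function name are my own and only cover that "adN: WARNING/FAILURE - ..." pattern, not every ATA driver message.

```python
import re
from collections import Counter

# Matches lines such as "ad4: FAILURE - WRITE_DMA timed out on LBA=[…]".
ATA_LINE = re.compile(
    r"^(?P<dev>ad\d+): (?P<level>WARNING|FAILURE) - (?P<detail>.+)$"
)

# The lines quoted in this thread, with the elided numbers left as posted.
SAMPLE = [
    "ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly",
    "ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly",
    "ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly",
    "ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly",
    "ad4: FAILURE - WRITE_DMA timed out on LBA=[…]",
    "g_vfs_done(): ad4s1a[WRITE(offset=[…], length=[…])error=5",
]


def tally_ata_messages(lines):
    """Count WARNING/FAILURE messages per disk device, e.g. ('ad4', 'FAILURE')."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.match(line.strip())
        if match:
            counts[(match["dev"], match["level"])] += 1
    return counts
```

`tally_ata_messages(SAMPLE)` counts four WARNING lines and one FAILURE line for ad4; the g_vfs_done line is skipped because it does not match the pattern. The counts only show how often the device complains. They do not replace a proper disk diagnostic.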
-
hey… as you might have noticed, i need a little time to answer ;)
thank you for your interpretation. upgraded to 1.2.3 RC3 now, same problem. the hardware tests found nothing (i'm not sure if the raid controller was tested too...).
long story short, i called hp support and they ordered a new mainboard today. let's see what that brings...
-
wacko… changed the mainboard, the same problem remains. i can't imagine that the hard disks are the problem, because it's raid 1...
will change the disks and post the results here...
-
hi all, i've changed the hard disks, reconfigured raid 1 and did a new full install. restored settings with backup-xml and everything is working again. i hope that it will last for a long time now and that there will be no further problems…! thank you for your support!