Help to debug a crash please :)



  • Hi,

    My pfSense box has frozen for the second time (currently running RC3e; I'm going to upgrade to the release).

    My issue is the following: after a reboot, no logs are kept from the previous boot, and no log rotation seems to take place. Is this a misconfiguration somewhere?
    Because of this I have a hard time diagnosing what is going on.

    Here are more details about the last two crashes:

    • Impossible to ping the box or SSH into it.

    • USB Mouse/Keyboard plugged in but impossible to do anything on the console.

    • Pressing the reset button displays an ACPI kernel log message (saying the message was received), but nothing seems to happen.

    • Holding the button for a few seconds to force a hard reboot works (thankfully), but afterwards no logs from before the crash are available.

    Anyway, I'm actually not lucky …
    I have 2 Dell SC1425 running pfSense, and the first one had hardware issues (kernel messages related to memory issues when crashing, memory test failing on the next reboot, Dell's test CDs reporting hardware issues).
    As soon as I can, I'm going to run the same test CD on the second box, which crashed again yesterday. I would actually prefer an explainable hardware issue to something more obscure.



  • Set up a remote syslog server and send your logs to it. This way you will have the last messages before the crash.
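In case no syslog server is at hand, here is a minimal sketch of a UDP receiver in Python. It assumes the box sends classic BSD-style syslog datagrams over UDP; the function names, the default port 514 and the output file name `remote-syslog.log` are illustrative choices, not anything pfSense requires.

```python
import socket
from datetime import datetime


def parse_pri(message):
    """Split a BSD-syslog datagram into (facility, severity, text).

    The leading "<N>" encodes facility * 8 + severity; facility 0 is the kernel.
    Messages without a valid "<N>" prefix are returned unchanged.
    """
    if message.startswith("<") and ">" in message[:6]:
        end = message.index(">")
        digits = message[1:end]
        if digits.isdigit():
            pri = int(digits)
            return pri >> 3, pri & 7, message[end + 1:]
    return None, None, message


def serve(host="0.0.0.0", port=514, path="remote-syslog.log"):
    """Append every received datagram to `path` (hypothetical file name)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))  # ports below 1024 usually need root/admin rights
    # Line buffering, so the last messages before a hang reach the disk.
    with open(path, "a", buffering=1) as out:
        while True:
            data, (src, _) = sock.recvfrom(8192)
            text = data.decode("utf-8", "replace").rstrip()
            facility, severity, body = parse_pri(text)
            stamp = datetime.now().isoformat(timespec="seconds")
            out.write(f"{stamp} {src} facility={facility} severity={severity} {body}\n")
```

Calling `serve()` on the receiving machine starts the listener. Lines logged with `facility=0` are the kernel messages, which are the ones most likely to matter for a hang like this.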



  • Too easy to be true :p
    Thanks.



  • Bad memory could cause panics or complete hangs. As could overheating CPUs or underpowered CPUs. If you ordered both those boxes at the same time, it wouldn't surprise me in the least if both of them ended up with the same hardware error.

    –Bill



  • Well, I just had some time (and not too many people in) to switch to the second pfSense box running on a second link.

    Testing went well; no hardware issues were detected.
    I took the opportunity to disable hyperthreading and ethernet bootp, reinstall a fresh 1.0 release and set up software RAID1.

    Now … let's wait and see, I have a syslog server receiving messages.



  • Well …

    Just had the exact same crash.
    I couldn't ping or SSH to the box anymore, and plugging in a USB keyboard doesn't load the driver.
    Pressing the power button makes pfSense tell me it's going to shut down, but then nothing happens. Pressing it again just displays a message saying that ... I cannot press it again :)

    Syslog didn't reveal anything interesting, just regular packets dropped by the firewall and then nothing else.

    Does anyone have other hints on how to debug the issue?



  • Did you enable all syslog options?
    The firewall log is not interesting;
    you need the kernel messages.



  • Everything is checked except "Show raw filter logs".
    I see racoon, ssh, pf and pftpx messages, but nothing system related … /etc/syslog.conf doesn't look strange.

    After checking /var/log/system.log, I can see that I didn't receive all the messages I should have. For example, I didn't receive:

    Oct 16 12:11:43 vpn kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 finished.
    Oct 16 12:11:43 vpn kernel: GEOM_MIRROR: Device gm0: provider ad4 activated.
    Oct 16 13:19:04 vpn pftpx[461]: #79 server refused connection
    Oct 16 14:16:48 vpn sshd[49969]: Accepted keyboard-interactive/pam for admin from 172.16.19.69 port 60786 ssh2

    Time to check whether I have a shitty syslog server (which is possible, since in the hurry I used a Windows GUI server called Bubble syslog Server).
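For anyone wanting to do the same comparison without eyeballing it, here is a rough sketch that lists lines present in the local /var/log/system.log but absent from what the remote server stored. It assumes the local lines use the classic "Mon DD HH:MM:SS host" prefix shown above and that the remote server keeps the message text unchanged somewhere in each line; the function names are hypothetical.

```python
import re

# Classic BSD syslog prefix, e.g. "Oct 16 12:11:43 vpn " before the message.
PREFIX = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s\S+\s")


def message_body(line):
    """Drop the timestamp and host name so only 'tag: text' remains."""
    return PREFIX.sub("", line.strip(), count=1)


def missing_on_remote(local_lines, remote_lines):
    """Return local lines whose message body never appears in the remote log."""
    remote_text = "\n".join(remote_lines)
    missing = []
    for line in local_lines:
        body = message_body(line)
        if body and body not in remote_text:
            missing.append(line.rstrip("\n"))
    return missing
```

Feeding it the lines of both files (for example via `open(...).readlines()`) prints nothing by itself; it returns the list of local lines that never made it to the remote side, which makes a lossy receiver obvious.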



  • Now got a real syslog server receiving everything.

    Since then I noticed high load on the box: ntop was taking 100% of one CPU.
    I've now disabled and uninstalled it. This could be the source of my issues (even though in that case it shouldn't crash the box).

    Now waiting to see if everything is stable again.

