Random failures to fully boot



  • Last week I installed pfSense on another machine, and it is having intermittent problems starting up. At random the system fails to boot, throwing a ton of errors such as "PID (###) exited on signal 11 (core dumped)", "unexpected character in input", syntax errors, "no such file or directory", and "unexpected operator". Other times it works fine. If I make it wait longer at the pfSense boot menu, it seems to always fail once I let it proceed. Quickly skipping past that screen boots successfully more often (it fails maybe one boot in three). I spent hours looking on my own and searching here and the forums for a way to get the startup log without using a null modem cable, but found nothing. "clog -f /var/log/system.log" doesn't show me what I expected. Is there a way to get the startup log?

    How the system is set up:
    -AMD X2 CPU
    -MSI K9N SLI Platinum motherboard
    -2GB RAM (tried one stick, two different slots, etc)
    -SATA hard drive. No IDE drive to test. Fails to boot from pfSense flashed to the drive.

    What I've done:

    1. Ran Memtest for 48 hours
    2. Stress-tested the system using IBT and Prime95
    3. Replaced the power supply
    4. Turned off all features in the BIOS, underclocked the RAM, turned off UDMA for the hard drive, turned off Cool'n'Quiet; pretty much all settings are disabled except the RS-232 port
    5. Installed i386 4GB to HDD, AMD64 4GB to HDD, and i386 512MB to HDD. All are VGA images.
    6. Updated the BIOS
    7. Perhaps some other things I can't think of right now
    8. Booted with the network cables detached
    9. Left only the computer, VGA, and PS/2 keyboard attached

    I've never had problems with pfSense on any computer; I usually run it without VGA on embedded systems booting from CF cards. Can anyone give me a hint as to the next step for troubleshooting the inconsistent boots? Is there any way to get the startup log from the shell? Each time I do get to the pfSense menu, but many options fail due to the core dump errors.
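
On the startup-log question above: on FreeBSD-based systems the kernel message buffer can usually be read from the shell with `dmesg` (`dmesg -a` also includes console output), and a boot-time copy is typically kept in /var/log/dmesg.boot. Whether either one captures the rc-script errors quoted in the post is not established in this thread. As a minimal sketch, assuming the messages have been saved to a plain-text file, the hypothetical helper `count_boot_errors` below counts the lines that match the error strings from the post. The function name, the example path, and the pattern list are illustrative assumptions, not part of pfSense.

```shell
# Hypothetical helper: count lines in a saved boot log that contain
# any of the error strings quoted in the original post.
# $1 is the path to a plain-text capture, for example one made with
#   dmesg -a > /tmp/boot.txt        (assumed capture method and path)
count_boot_errors() {
    # -c prints the number of matching lines, -i ignores case,
    # -E enables the alternation syntax used in the pattern.
    grep -c -i -E 'exited on signal 11|core dumped|unexpected character in input|no such file or directory|unexpected operator' "$1"
}
```

Comparing the count from a failed boot with the count from a clean boot gives a quick indication of whether the capture contains the failures at all.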



  • Have you tried using a non-embedded version? Since you have a HDD, you can load the ISO and install pfSense to the drive. You can even take the memstick installer and install from there (make sure you choose option 3 … boot from usb ... to have a successful boot). That boot-from-USB option includes some timing changes to help it boot.



  • Check your motherboard for bulging capacitors, capacitors oozing their guts, capacitors which have "blown their tops" etc. Some years ago there was a bunch of motherboards which used poor quality electrolytic (and cheap) capacitors. When these things failed they produced all sorts of weird symptoms including those you listed. Most of the usual motherboard suppliers used the bad capacitors in at least some of their motherboards. I suspect if your motherboard was made in the last four or five years it is probably not affected by this problem.



  • @wallabybob:

    Check your motherboard for bulging capacitors, capacitors oozing their guts, capacitors which have "blown their tops" etc. Some years ago there was a bunch of motherboards which used poor quality electrolytic (and cheap) capacitors. When these things failed they produced all sorts of weird symptoms including those you listed. Most of the usual motherboard suppliers used the bad capacitors in at least some of their motherboards. I suspect if your motherboard was made in the last four or five years it is probably not affected by this problem.

    I forgot to mention, I did LOOK at them and they are fine (no bulging, leaks, etc). They are mostly electrolytic on this board so it was one of the first things I looked at. The stability tests running memtest, IntelBurnTest (from a WinPE build), and Prime95 (also WinPE) make me believe that the caps SHOULD be fine as is, but you never know with those things.

    Either way, caps also age naturally. I'm not sure they're at fault, but it's a consideration.

    I'm tempted to try a different (SATA) drive next. It seems to fix issues for some people. All I have is a 500GB here to test with…

    Edit:
    I should add, I appear to be able to boot consistently from a 512MB flash drive that I created with the memstick image. I've never had much luck getting pfSense to install from the LiveCD/memsticks though, on any PC. I think virtualization was the only place where I had it working some years ago.

    Edit2:
    I will say though, the caps on the video card ARE bulging. I've seen those die from bad caps much more often recently than motherboards. MBs have been fine since about the P4 era. SO many failed Dell P4 machines...



  • Well, I cannot get a good install using a LiveCD. With the automated installation, I get kernel issues on the first pfSense boot. Doing it custom, I get stuck in the same place as the people here: http://forum.pfsense.org/index.php?action=printpage;topic=31907.0 , but none of the applicable workarounds in that thread help. The only success I've had to date has been booting pfSense from a USB thumb drive with the memstick image. I cannot get pfSense to fail when booting with that image/device combination.

    I wish I had a CompactFlash card to try, as I have some CF adapters at home. I've never had problems using quality CF cards, but I've always had issues with hard drives. I don't want to install VMware ESXi just to get around this problem…


  • Netgate Administrator

    Everything you have mentioned points at a HD problem to me. Especially because you're using the nano install that only really accesses the drive at boot.
    You can install the nano image to a USB stick and boot from that.

    Steve



  • I brought an IDE hard drive and a CF card with a 40-pin adapter in today, and both of those work flawlessly. I did not have any success with either SATA hard drive yesterday, a 160GB Maxtor and 500GB Seagate, so I don't know if the drives themselves are at fault, or more likely the controller on the motherboard. Different ports, SATA cables, and BIOS settings (there isn't much for this one) didn't help.

    I will likely stick with the CF install, as it is what I've always used successfully in the past.



  • @Masejoer:

    I did not have any success with either SATA hard drive yesterday, a 160GB Maxtor and 500GB Seagate,

    FreeBSD boot/startup can be sensitive to BIOS settings for SATA drives: "Legacy" or "IDE" can work much better than "AHCI" or "Raid" but there doesn't seem to be any "standard" for the option names.



  • @wallabybob:

    FreeBSD boot/startup can be sensitive to BIOS settings for SATA drives: "Legacy" or "IDE" can work much better than "AHCI" or "Raid" but there doesn't seem to be any "standard" for the option names.

    Yeah, this BIOS has none of that. It's the first that I can remember that didn't provide me options for the SATA channels. They were the first thing I looked for…


  • Netgate Administrator

    You may want to try the 2.1 snapshots which will likely have better SATA support. Of course if you're happy with CF then I'd stick with that.

    Steve

