System keeps breaking down



  • Hello Community,

    I have a problem with a pfSense box. Every time our customer has an unexpected power cut, the pfSense box breaks down: it just doesn't start anymore. I replaced the HDD twice and the error still occurred. The next step was to install a UPS, so that power stays on during a grid failure and the system keeps running. However, a strange problem occurs.

    After a clean shutdown the system won't start anymore and, guess what, it's the same issue as before!

    Any advice on what to do next? Could this be malfunctioning hardware, or does it have to do with the software? Version 2.0.1 is installed. The problem has already occurred with versions 1.2.3, 2.0 and 2.0.1.

    There are 2 error messages when the system won't start anymore.

    First: Cannot find /bin/sh

    Second: elf32_loadimage: read failed / Unable to load a kernel!

    Hardware the box is running on:
    0256 Mbyte Memory                        CPU Geode LX 434 Mhz
    Pri Sla  KINGSTON SS100S28G              LBA Xlt 974-255-63  7824 Mbyte

    Any help is appreciated!

    regards,
    pfsensedummie


  • Netgate Administrator

    More information needed!  ;)

    Is this a Soekris box? Which one? What bios version?

    How are you installing pfSense? Which install type are you running?

    Steve



  • Hello Steve,

    pfSense box that is crashing
    Config:
    Soekris 5501
    Bios 1.33c
    Disk: SSD 8GB
    Install Single Processor ACPI enabled
    Version 2.00

    As you can see, it is indeed a Soekris box. I should add that I have rebuilt a pfSense box with a different install type to test at the moment, running on exactly the same hardware.

    Still running after resets, power unplugs, etc.:
    Soekris 5501
    Bios 1.33c
    Disk: SSD 8GB
    Install Embedded / ACPI disabled
    Version 2.00

    Regards,
    pfsensedummie


  • Netgate Administrator

    Ok.
    What is the reason for not running embedded on both boxes?

    When the box doesn't boot are you seeing those errors on the console? After the bios?

    Are you saying that this box will boot up correctly the first time and never again?

    How are you installing this, on another machine and then moving the drive?

    Steve


  • Netgate Administrator

    It looks likely to be a bootloader problem. See:
    http://doc.pfsense.org/index.php/Boot_Troubleshooting

    You could try changing the bios disk settings, particularly if it's a SATA drive.

    Ultimately you can probably get it to boot using an alternative bootloader. If it is a bootloader problem.  ;)

    Steve



  • Hello Steve,

    Thank you!

    Sorry for my incomplete info  ;)

    OK.

    I have a SATA disk, which I connect to an old HP desktop PC. From there I install pfSense from a CD that I downloaded. Afterwards I build the pfSense box. Then I have to change the fstab settings because the PC runs different settings.

    The non-embedded install lets me change all the settings (fstab/serial console/ssh) from the PC I installed it on; then I switch the disk over to the Soekris box.

    However, I always have to change the fstab settings, even when I choose the embedded install.
    I connect the serial cable, watch the pfSense console, go to the shell, change the fstab settings with vi, reboot the firewall, and all is fine.

    The problem occurs mostly after a power failure: the box won't boot anymore. Last time it was a clean shutdown and it just wouldn't start anymore. I connected the serial cable and saw: elf32_loadimage: read failed / Unable to load a kernel!

    I'll read through the documentation you linked above to see if it fixes my problem.

    Btw: the box was tested before it went out. I pulled the power a couple of times during startup/shutdown and every time it booted fine. After a few weeks, the same problem occurs at the customer's office: the machine won't boot. Strange...

    When the box doesn't boot are you seeing those errors on the console? After the bios?

    Yes, the errors appear after the BIOS settings and after the F1 boot. The machine boots until it cannot find the kernel or /bin/sh.

    Are you saying that this box will boot up correctly the first time and never again?

    It boots normally during the testing period. Simulating power failures etc. doesn't affect booting. The problem starts to occur after several weeks in production (mostly between 7 and 14 days).

    How are you installing this, on another machine and then moving the drive?

    Disk in HP desktop -> Move disk to Soekris -> Change fstab settings -> Boot normally.

    Regards,
    pfsensedummie


  • Netgate Administrator

    Ah, well that's probably not a bootloader issue then, since it boots fine during testing.
    It looks more like an actual data corruption problem. The reviews of that drive on Newegg do not look good:
    http://www.newegg.com/Product/Product.aspx?Item=N82E16820139427
    Maybe a drive firmware update could help? There don't appear to be any though.  :(
    Is it possible to recover the drive by re-installing?
    You haven't said why you don't use an embedded install.

    Steve



  • @stephenw10:

    Is it possible to recover the drive by re-installing?
    You haven't said why you don't use an embedded install.

    Steve

    Good morning  ;)

    Yes, when the system is corrupted it reinstalls normally. No issues found.

    About the non-embedded install: because I do the installation on a desktop PC, I can test whether pfSense runs correctly. That's why I haven't chosen the embedded install. I wanted to check that the installation went all right and that pfSense boots. Then I switch the serial port on within the web GUI and move the disk over to the closed system. This way I know for sure the box is running correctly. That's the reason for the non-embedded install.

    Hmm, that disk could be a problem though, if I read your document...

    regards,
    pfsensedummie


  • Netgate Administrator

    If that's the only reason then perhaps it's time to consider switching to an embedded install or a different SSD.

    Steve



  • Thanks for your help Steve :)



  • Reinstall:

    1. Formatted the disk
    2. ACPI disabled
    3. Custom install
    4. Remove SWAP
    5. Embedded install
    6. Change FSTAB (ad4 to ad1)
    7. System boots
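
    Step 6 (changing ad4 to ad1 in fstab) could also be scripted rather than edited by hand in vi. Below is a minimal sketch in Python, not a tested pfSense procedure. The device names ad4 and ad1 come from the list above; the function name rewrite_fstab_devices and the sample fstab layout are assumptions made for illustration.

```python
import re


def rewrite_fstab_devices(fstab_text, old="ad4", new="ad1"):
    """Return fstab_text with /dev/<old>... renamed to /dev/<new>...

    Only the device field at the start of a line is touched, so comment
    lines and mount points are left alone. The negative lookahead keeps
    a longer name such as ad40 from being rewritten by mistake.
    """
    pattern = re.compile(
        r"^(\s*/dev/)" + re.escape(old) + r"(?!\d)",
        re.MULTILINE,
    )
    # A function is used as the replacement so that a digit in `new`
    # can never be misread as part of a group reference.
    return pattern.sub(lambda m: m.group(1) + new, fstab_text)


if __name__ == "__main__":
    # Hypothetical fstab contents, as they might look after installing
    # on the desktop PC where the disk showed up as ad4.
    sample = (
        "# Device\tMountpoint\tFStype\tOptions\tDump\tPass#\n"
        "/dev/ad4s1a\t/\tufs\trw\t1\t1\n"
    )
    print(rewrite_fstab_devices(sample), end="")
```

    The function only transforms text; reading the real /etc/fstab, keeping a backup copy and writing the result back are left to the caller.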

    Power failure, reboot, reset during startup, reset during shutdown, reset while loading firewall rules -> the system came back up nicely every time. I loaded the customer's config and repeated the failure tests. The system kept running without any problems. SMART status shows that everything is in perfect condition.

    Seems my issue has been totally resolved!

    Regards,
    pfsensedummie


  • Netgate Administrator

    Excellent!  :)

    Steve

