Multiples crashes, error on different equipment



  • Hi everyone :)

    We have about 50 physical pfSense boxes across our customer base, plus a lot of virtualized ones, with no problems so far. But at one site, we are seeing repeated crashes on pfSense 2.4.4. We first suspected a power problem, because it's a rackmount box connected to a UPS. We took it off the UPS and powered it from a regular outlet, but it still crashes. I couldn't even boot into the OS.

    The output I have on the console cable:

    "can't exec /bin/sh for single user: Exec format error
    can't exec /bin/sh for single user: Exec format error
    can't exec /bin/sh for /etc/rc: Exec format error
    Enter full pathname of shell or RETURN for /bin/sh: random: unblocking device "
    

    For me the system was wrecked. So I swapped the hardware: a fresh pfSense install on a new APU.4D4 motherboard (never had any problem with these boards so far), updated it, re-imported the configuration ... everything was OK and functional. We cut off all provider access so that only we have access. We plugged this new pfSense box into a non-UPS outlet and 24 hours later, another crash, and pfSense was broken again. I'm going to go on site to get the crash logs and analyze them.

    We have a fairly basic configuration, with some VLANs and firewall rules. When importing the configuration, no errors are displayed.

    I've recovered the crash logs from the first pfsense, if it helps, available as a .zip archive: https://share.ozerim.fr/index.php/s/aSJ54LHs5j0YjBX

    Has anyone ever had a case like this before? Because right now, I don't know where to look! I will try recreating the configuration by hand on a new box instead of importing it, to see what happens.

    Thomas



  • I'd say it's disk failure / filesystem corruption



  • @Ozer_im said in Multiples crashes, error on different equipment:

    So I changed my hardware, putting a new ......

    But you kept the disk.
    As @heper said: it's the disk.
    It's fsck time.



  • @Gertjan said in Multiples crashes, error on different equipment:

    @Ozer_im said in Multiples crashes, error on different equipment:

    So I changed my hardware, putting a new ......

    But you kept the disk.
    As @heper said: it's the disk.
    It's fsck time.

    Thank you for your answers.
    I forgot to mention in my post that I also replaced the drive along with the motherboard, so the second box had a new motherboard, a new drive and an updated system.


  • Rebel Alliance

    crash_logs.txt shows:

    "mmcsd0: Error indicated: 1 Timeout"

    Which might indicate that

    "mmcsd0: 8GB <SDHC USD 1.0 SN 416A**** MFG 11/2018 by 116 J`> at mmc0 50.0MHz/4bit/65535-block"

    is broken and disconnects as soon as defective blocks are accessed.
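
For anyone who wants to check their own crash logs for the same symptom, here is a rough Python sketch. The regular expression is an assumption modeled only on the single "Error indicated" line quoted above, and the function name is made up for illustration; real logs may word storage errors differently.

```python
import re

# Assumed pattern, modeled on the one line quoted in this thread:
#   "mmcsd0: Error indicated: 1 Timeout"
# An optional "<NNN>" priority prefix is tolerated in front of it.
STORAGE_ERROR = re.compile(
    r"^(?:<\d+>)?\s*(?P<dev>[a-z]+\d+): Error indicated: "
    r"(?P<code>\d+) (?P<reason>.+)$"
)


def find_storage_errors(log_text):
    """Return a list of (line_number, device, reason) for matching lines."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = STORAGE_ERROR.match(line.strip())
        if match:
            hits.append((number, match.group("dev"), match.group("reason").strip()))
    return hits


if __name__ == "__main__":
    sample = "boot ok\nmmcsd0: Error indicated: 1 Timeout\n"
    for number, dev, reason in find_storage_errors(sample):
        print(f"line {number}: {dev} reported {reason}")
```

A hit on the device that holds the root filesystem would be consistent with the disk/card theory, but it does not prove it on its own.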


  • Netgate Administrator

    This looks like an APU2 which is not Netgate hardware. Moved the thread.

    It's running 2.4.4p1; you should upgrade to the current pfSense version.

    It's running a config created in a newer version which is not necessarily valid in 2.4.4p1:

    <118>Loading configuration......
    <118>
    <118>*******************************************************************************
    <118>* WARNING!                                                                    *
    <118>* The current configuration has been created with a newer version of pfSense  *
    <118>* than this one! This can lead to serious misbehavior and even security       *
    <118>* holes! You are urged to either upgrade to a newer version of pfSense or     *
    <118>* revert to the default configuration immediately!                            *
    <118>*******************************************************************************
    <118>
    

    And yes, in the first case it looks like it crashed out because it's running from an SD card and the card or controller stopped responding.

    Steve
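
The version mismatch behind that warning can be checked before importing a backup. Below is a minimal Python sketch, assuming the backup is a pfSense config.xml whose root element carries a `<version>` child holding the configuration revision. The function names, and the idea of passing in the revision your installed release supports, are assumptions for illustration; the revision numbers in the demo are made up and do not correspond to any real pfSense release.

```python
import xml.etree.ElementTree as ET


def parse_revision(text):
    """Turn a dotted revision such as '2.10' into a comparable tuple (2, 10)."""
    return tuple(int(part) for part in text.strip().split("."))


def config_is_newer(config_xml, supported_revision):
    """Return True if the config revision is newer than the supported one.

    supported_revision is supplied by the caller (hypothetical parameter);
    this sketch does not look it up from the running system.
    """
    root = ET.fromstring(config_xml)
    found = root.findtext("version")
    if found is None:
        raise ValueError("no <version> element found in the configuration")
    return parse_revision(found) > parse_revision(supported_revision)


if __name__ == "__main__":
    # Made-up revision numbers, for demonstration only.
    backup = "<pfsense><version>2.0</version></pfsense>"
    if config_is_newer(backup, "1.5"):
        print("Config was created by a newer version; do not import it here.")
```

Comparing the revisions as integer tuples rather than as strings avoids ordering mistakes such as "2.10" sorting before "2.9".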



  • Thank you all for your answers.

    @stephenw10 said in Multiples crashes, error on different equipment:

    This looks like an APU2 which is not Netgate hardware. Moved the thread.

    I'm sorry, I hadn't noticed that.

    It's running 2.4.4p1; you should upgrade to the current pfSense version.

    It's running a config created in a newer version which is not necessarily valid in 2.4.4p1:

    Indeed, it is not the same version. When I re-imported the configuration on a new motherboard running a more recent pfSense version, I got multiple errors, so I preferred to stay on the version of my current configuration, which is 2.4.4-RELEASE-p1 (amd64).

    A new unit was installed this morning, with a new motherboard and a new SD card. The configuration has been redone entirely from scratch and by hand, without using the import/export tool provided by pfSense. Fingers crossed ...



  • The new router has been in place for two weeks now and everything is working properly. It seems that starting from scratch and manually reconfiguring the router, without going through the import/export tool, has solved the problem!

