CF card corruption when power fails



  • Hi,

    I have been having a problem for a little while with pfSense on WRAP installations.  We have been providing these installs to people with the embedded version of 1.0.1 on WRAP 1E-2 boards, and SanDisk 128MB or 512MB CF cards.

    The problem we are having appears to relate to power outages, but the information I have read elsewhere in this forum seems to indicate that the embedded version is mounted read-only (ro) almost all of the time.

    The cards we are using are all brand new, and the units are test booted before being shipped to the clients, so we know that when they are shipped, they were not corrupt.

    If this had only happened once or twice it would not have been as prominent, but it has happened about 10 times all on different boxes in different locations.  Discussions with the clients indicate that sometimes they are aware of a power outage, and sometimes there is no sign of one, but we have no other clues as to what might be causing it.  There is no good indication that the config was being edited in any case.

    We also ship boxes with m0n0wall and have never had the same issue there, so it seems to be peculiar to pfSense.

    Any clues anyone?



  • Never heard of it,
    but upgrading to the embedded snapshot of 15-03-07 would not hurt.
    Many things have been fixed after 1.0.1.



  • @jeroen234:

    never heard of it

    Well, now you have heard of it.

    Since I sense skepticism in your reply, I decided to do some tests.  I am fortunate enough to have access to plenty of new WRAP boards and SanDisk CF cards, so I have set up a unit the same way I set all the ones we have supplied to clients.

    I have a WRAP 1E-2 with a 128MB SanDisk CF card.  I used physdiskwrite to flash the embedded version of 1.0.1 onto it, then attached my terminal emulator to it and did some power cycling.

    The boot up takes about 90 secs, and I wait until it is complete before shutting power off again.

    On restart number 7, I saw the first unusual behaviour:

    blah, blah…
    Executing rc.d items...
    Stopping /usr/local/etc/rc.d/.sh...done.
    Starting /usr/local/etc/rc.d/
    .sh...done.
    Bootup complete

    FreeBSD/i386 (pfSense.local) (console)

    *** Welcome to pfSense 1.0.1-embedded on pfSense ***

    LAN                      ->  sis0    ->      192.168.1.1
      WAN*                    ->  sis1    ->      10.10.0.101(DHCP)
      OPT1(OPT1)              ->  sis2    ->      NONE

    pfSense console setup


    0)  Logout (SSH only)
    1)  Assign Interfaces
    2)  Set LAN IP address
    3)  Reset webConfigurator password
    4)  Reset to factory defaults
    5)  Reboot system
    6)  Halt system
    7)  Ping host
    8)  Shell
    9)  PFtop
    10)  Filter Logs
    11)  Restart webConfigurator

    Enter an option: Terminated

    FreeBSD/i386 (pfSense.local) (console)

    *** Welcome to pfSense 1.0.1-embedded on pfSense ***

    LAN                      ->  sis0    ->      192.168.1.1
      WAN*                    ->  sis1    ->      10.10.0.101(DHCP)
      OPT1(OPT1)              ->  sis2    ->      NONE

    pfSense console setup


    0)  Logout (SSH only)
    1)  Assign Interfaces
    2)  Set LAN IP address
    3)  Reset webConfigurator password
    4)  Reset to factory defaults
    5)  Reboot system
    6)  Halt system
    7)  Ping host
    8)  Shell
    9)  PFtop
    10)  Filter Logs
    11)  Restart webConfigurator

    Enter an option:

    That is, for no apparent reason, the console menu appears, is terminated, then comes back.  This happens within a few seconds of the bootup process completing and takes 15 secs to get back to the menu.  It happens once for each boot cycle.

    So, at this stage, the bootup process takes 105 secs.  I continued my testing and on restart number 27, this happened:

    Trying to mount root from ufs:/dev/ufs/pfSense

    [pfSense ASCII-art logo]

    Welcome to pfSense 1.0.1 on the 'embedded' platform…

    Setting up embedded specific environment... done.
    Mounting filesystems...WARNING: /cf was not properly dismounted
    WARNING: R/W mount of /cf denied.  Filesystem is not clean - run fsck
    mount: /dev/ufs/pfSenseCfg: Operation not permitted
    ** /dev/ufs/pfSenseCfg
    ** Last Mounted on /cf
    ** Phase 1 - Check Blocks and Sizes
    ** Phase 2 - Check Pathnames
    ** Phase 3 - Check Connectivity
    ** Phase 4 - Check Reference Counts
    ** Phase 5 - Check Cyl groups
    7 files, 10 used, 1861 free (21 frags, 230 blocks, 1.1% fragmentation)

    ***** FILE SYSTEM MARKED CLEAN *****
    done.
    Creating symlinks......done.
    Launching PHP init system... done.
    Initializing................. done.

    So, the filesystem was not cleanly unmounted, which suggests that it was actually mounted r/w and thus quite susceptible to corruption on a power outage.

    Now, at no stage since the first restart have any configuration changes been made.

    Yet, here we have a filesystem complaining that it was not cleanly unmounted, when it is supposed to be (from what I have read and been told, anyway) read-only.  Perhaps I have missed something obvious, but how can a read-only filesystem not be cleanly unmounted?

    Also, can someone explain to me why the system is attempting to mount the cf card r/w in the first place and at what point it is remounted r/o, if at all?

    I understand that config changes can only be made with the card mounted r/w, but in this case I was not attempting to make any changes.

    Under what circumstances is the embedded version mounted r/w, and when (if at all) does it revert to r/o?

    If I am using it incorrectly, I am more than happy to be told what changes I need to make for it to be more robust.

    I will repeat this test with the March 15 snapshot, as suggested, but I strongly suspect that I will get the same results, especially since this doesn't appear to be a known issue so I doubt it has been specifically addressed in any updates.

    Best regards,

    Paul McGowan
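
When repeating a power-cycle test like the one above, it can help to capture the serial console to a file and scan it for the unclean-mount warnings automatically. The following is a minimal sketch in Python, assuming the console text has been saved as a string; the function name `unclean_mounts` is hypothetical, and the two patterns are taken only from the boot messages quoted in this post.

```python
import re

# Patterns copied from the warnings in the console output quoted above.
UNCLEAN_PATTERNS = [
    re.compile(r"WARNING: (\S+) was not properly dismounted"),
    re.compile(r"WARNING: R/W mount of (\S+) denied"),
]


def unclean_mounts(console_log: str) -> list[str]:
    """Return mount points reported as unclean, in first-seen order."""
    found = []
    for line in console_log.splitlines():
        for pattern in UNCLEAN_PATTERNS:
            match = pattern.search(line)
            if match and match.group(1) not in found:
                found.append(match.group(1))
    return found


if __name__ == "__main__":
    sample = (
        "Mounting filesystems...WARNING: /cf was not properly dismounted\n"
        "WARNING: R/W mount of /cf denied.  Filesystem is not clean - run fsck\n"
    )
    print(unclean_mounts(sample))
```

Running this against the log of each restart would flag the boots (such as restart number 27) where the filesystem needed fsck.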



  • / and /cf are mounted rw on bootup (to later support packages but pretend I just did not say that).

    After final bootup and before the console menu appears both / and /cf should then be mounted ro.

    Run mount on the console (or over SSH) after bootup to see whether the system is indeed remounting ro as it should.

    Here is my system after final bootup.

    mount

    /dev/ufs/pfSense on / (ufs, local, read-only)
    devfs on /dev (devfs, local)
    /dev/md0 on /tmp (ufs, local)
    /dev/md1 on /var (ufs, local)
    /dev/ufs/pfSenseCfg on /cf (ufs, local, read-only)
    devfs on /var/dhcpd/dev (devfs, local)

    Note, we fixed a bug recently that was affecting this.
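
The check described above (run mount and confirm that / and /cf show read-only) can be sketched as a small parser. This is only an illustration in Python, assuming the output of mount has been captured as text in the format shown in this post; the function name `writable_mounts` and the default list of watched mount points are assumptions, not part of pfSense.

```python
import re

# Matches lines such as "/dev/ufs/pfSense on / (ufs, local, read-only)".
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) \(([^)]*)\)$")


def writable_mounts(mount_output: str, watch=("/", "/cf")) -> list[str]:
    """Return the watched mount points that are NOT flagged read-only."""
    writable = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if not match:
            continue
        _device, mount_point, options = match.groups()
        flags = [flag.strip() for flag in options.split(",")]
        if mount_point in watch and "read-only" not in flags:
            writable.append(mount_point)
    return writable


if __name__ == "__main__":
    sample = """\
/dev/ufs/pfSense on / (ufs, local, read-only)
devfs on /dev (devfs, local)
/dev/md0 on /tmp (ufs, local)
/dev/md1 on /var (ufs, local)
/dev/ufs/pfSenseCfg on /cf (ufs, local, read-only)
devfs on /var/dhcpd/dev (devfs, local)
"""
    # An empty list means both / and /cf are read-only.
    print(writable_mounts(sample))
```

The memory-disk mounts (/tmp and /var) are deliberately not watched, since they are expected to be writable.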



  • Hi Scott,

    I do indeed get:

    mount

    /dev/ufs/pfSense on / (ufs, local, read-only)
    devfs on /dev (devfs, local)
    /dev/md0 on /tmp (ufs, local)
    /dev/md1 on /var (ufs, local)
    /dev/ufs/pfSenseCfg on /cf (ufs, local, read-only)
    devfs on /var/dhcpd/dev (devfs, local)
    /dev/md2 on /var/db/rrd (ufs, local, soft-updates)

    When you say you fixed a bug recently that was affecting this, does that mean it sometimes didn't remount r/o or that you now don't mount it r/w at all?

    Is there any (practical) way to not mount r/w during bootup?

    Given that it takes about 40 secs to get from mounting the filesystem to the bootup complete, that gives a fairly big window for problems and it is entirely possible that this is what is happening in some cases.  If the power fails for some reason, and is restored for a period between 50 secs and 90 secs, the filesystem would (if I read your post correctly) be mounted r/w and so if it is interrupted again during that time, corruption can occur.

    How would I go about creating a version that didn't mount r/w for future unmentioned expansions?

    Best regards,

    Paul McGowan
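
The vulnerable window described in this post can be written down as a simple check. This is a rough sketch in Python using only the approximate figures quoted above (filesystems mounted roughly 50 secs after power-on, bootup complete at roughly 90 secs); the constant and function names are hypothetical, and the real timings will vary by board and image.

```python
# Approximate figures from the post: the r/w mount happens about 50 s after
# power-on and the remount to read-only at bootup complete, about 90 s.
RW_WINDOW_START_S = 50
RW_WINDOW_END_S = 90


def outage_hits_rw_window(seconds_after_power_on: float) -> bool:
    """True if a second power loss at this time falls inside the r/w window."""
    return RW_WINDOW_START_S <= seconds_after_power_on < RW_WINDOW_END_S


if __name__ == "__main__":
    for t in (10, 60, 95):
        print(t, outage_hits_rw_window(t))
```

A brown-out that restores power for about a minute and then drops it again would land inside this window, which matches the scenario suggested in the post.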



  • The bug was doing strange things and was only in the tree for about a week.

    To modify the behavior you will want to modify /etc/rc and remove all of the mount related calls.

    I really don't know what this will break and it surely will have cascading consequences.
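
Before editing /etc/rc as suggested, it may be useful to list every line that invokes mount or umount so each one can be reviewed individually. The following is a minimal sketch in Python; the function name `mount_calls` is hypothetical, and the sample script in the example is an invented excerpt for illustration, not the actual pfSense /etc/rc.

```python
import re

# Matches "mount" or "umount" used as a command, with or without a path prefix.
MOUNT_CALL = re.compile(r"(?:^|[\s;|&/])(u?mount)\b")


def mount_calls(script_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for non-comment lines calling mount/umount."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if MOUNT_CALL.search(stripped):
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    # Hypothetical excerpt, NOT the real pfSense /etc/rc.
    sample = """\
#!/bin/sh
# remount root read-write
/sbin/mount -u -w /
echo "Starting..."
/sbin/umount -f /cf
"""
    for number, line in mount_calls(sample):
        print(number, line)
```

This only locates candidate lines; as noted above, what removing them would break can only be found by testing.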



  • Hi Scott,

    I was not talking about never mounting r/w, just not during bootup.  I understand that it is necessary whenever config changes are made.  Does that change anything, or is it still likely to have unexpected results?

    Can you tell me if anything actually writes to the partition during bootup on the embedded version?

    Best regards,

    Paul McGowan



  • @yawarra_paul:

    I was not talking about never mounting r/w, just not during bootup.  I understand that it is necessary whenever config changes are made.  Does that change anything, or is it still likely to have unexpected results?

    Only for future packages.

    @yawarra_paul:

    Can you tell me if anything actually writes to the partition during bootup on the embedded version?

    Packages for the most part.  Not aware of anything else off the top of my head but the only way to tell is to test it.



  • Hi Scott,

    I'm going to give creating a "read-only during bootup" version a try since the versions we ship don't have a lot of need (or space) for installable packages.  As I understand it, when a new version comes out the whole CF usually gets reflashed.

    If it seems to work I will post the image on our website at: http://www.yawarra.com.au/sw-osimages.php for others to play with.

    Thanks for all your help, it is much appreciated.

    Best regards,

    Paul McGowan



  • @yawarra_paul:

    If it seems to work I will post the image on our website at: http://www.yawarra.com.au/sw-osimages.php for others to play with.

    No problem, please let me know how it works out and thanks again for the xmas present!

