File System Corruption on 2.2.x



  • Hi.

    All of our pfSense servers have a secondary disk used to store some big backups on a UFS partition. I have detected frequent file system corruption on systems running version 2.2.2, but we don't see this problem on our 2.1.4 systems (half of our servers are still on 2.1.4).

    After reading this https://blog.pfsense.org/?p=1815 we are worried about UFS stability on 2.2.x.

    We are considering upgrading to 2.2.5, mounting the partition with the sync option, or using ZFS on that partition, but we are not sure which would work best.

    Any help is welcome.

    Regards.


  • Banned

    You should have upgraded many months ago. (ZFS is of course an option as well… though I'm not sure what the use of a firewall is for storing "big backups".)



  • Thanks. We have a very limited budget and it's not always possible to do things the best way  :-[

    We will upgrade from 2.2.2 to 2.2.5 soon; I hope that solves the problems.



  • Upgrading to 2.2.5 doesn't solve the problem. We are still getting filesystem problems when deleting large numbers of files. The "rm" process goes into uninterruptible sleep (D) and access to the partition is blocked; a soft reboot doesn't work and a power-off is needed to recover.

    UFS support on 2.2.x is horrible.


  • Banned

    @colunga:

    We are still getting filesystem problems when deleting large numbers of files.

    And your "big backups" STILL do NOT belong on your firewall…



  • When you have hundreds of firewalls with big disks for proxy caching, it's not so crazy to use them to store other data  :-\



  • If you have filesystem corruption, then an in-place upgrade probably won't address the problem. You might, of course, have a problem with the physical disk, in which case you're looking at a full hardware refresh - or at least a new hard disk.

    You might be best off initially just backing up your pfSense config, wiping the drive, reinstalling the OS and restoring the original config to the fresh install. As I say, if your physical drive is to blame, then you'll find out soon enough whether you need to replace your hardware.

    And Dok is absolutely right - firewalls are not the place to be storing data from other parts of your network.



  • It's probably not something you want to do on the firewall, but it certainly shouldn't hurt anything. I haven't heard of any issues along those lines. All of the relevant code, and the options we use for the filesystem, are the same as in stock FreeBSD 10.1.

    Could you provide a replicable test case for the problem you're seeing?



  • It can't be hardware-related, because it has happened on more than 100 devices and always with 2.2.x.

    Each system has a 2TB disk for caching and storing data. We rsync some deduplicated backups every night, with heavy use of hard links.
    The problem always happens when deleting old files from the disk. The rm process goes into the uninterruptible D state and blocks access to the data disk. We then need to run shutdown -o -n -r to reboot the system. After the reboot, fsck_ufs always reports a lot of errors, and sometimes 2 or 3 fsck runs are needed before the filesystem is marked clean. Trying to delete the same content again reproduces the problem most of the time.
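
    To catch the hang as it happens, the process list can be scanned for entries in the D state. The following is only a minimal sketch, assuming Python is available on the machine where it runs (a stock pfSense install may not include it); the function names are made up for illustration. Note that on FreeBSD many kernel threads normally sit in the D state too, so the interesting entries are userland commands such as rm.

```python
import subprocess


def parse_uninterruptible(ps_output):
    """Return (pid, stat, command) for each process whose state starts with 'D'.

    Expects the text produced by `ps -axo pid,stat,command`, header included.
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)  # pid, state flags, full command line
        if len(parts) < 3:
            continue
        pid, stat, command = parts
        if stat.startswith("D"):
            stuck.append((int(pid), stat, command))
    return stuck


def find_uninterruptible():
    """Run ps and return the processes currently in uninterruptible wait."""
    result = subprocess.run(
        ["ps", "-axo", "pid,stat,command"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_uninterruptible(result.stdout)
```

    Calling find_uninterruptible() from a periodic job and logging the result with a timestamp would show which command was stuck and roughly when it stopped making progress.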

    We know that this is not the best place to store data, but we have 800+ systems and this way we get very cheap 800TB storage for non-critical data.

    Thank you.



  • Again:

    @cmb:

    Could you provide a replicable test case for the problem you're seeing?

    We can get it reported and get attention upstream if you can provide a means of replicating.
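
    As a starting point for such a test case, the workload described earlier in the thread (nightly snapshots that share files through hard links, followed by deletion of the oldest ones) can be imitated with a small script. This is a sketch under assumptions, not a confirmed reproducer: the actual trigger is unknown, and the directory layout, file sizes, counts and function names below are all hypothetical.

```python
import os
import shutil


def make_base(root, n_dirs, files_per_dir, size):
    """Create the first snapshot: n_dirs directories of random-content files."""
    base = os.path.join(root, "snap.0")
    for d in range(n_dirs):
        dpath = os.path.join(base, f"dir{d:04d}")
        os.makedirs(dpath)
        for f in range(files_per_dir):
            with open(os.path.join(dpath, f"file{f:04d}"), "wb") as fh:
                fh.write(os.urandom(size))
    return base


def link_snapshot(src, dst):
    """Recreate src's directory tree at dst, hard-linking every file."""
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name), os.path.join(target, name))


def exercise_hardlink_delete(root, snapshots=5, n_dirs=10,
                             files_per_dir=20, size=1024):
    """Build a chain of hard-linked snapshots, then delete all but the newest.

    Returns the path of the snapshot that is left on disk.
    """
    prev = make_base(root, n_dirs, files_per_dir, size)
    paths = [prev]
    for i in range(1, snapshots):
        cur = os.path.join(root, f"snap.{i}")
        link_snapshot(prev, cur)
        paths.append(cur)
        prev = cur
    # Delete oldest first, the way a retention job would.
    for path in paths[:-1]:
        shutil.rmtree(path)
    return paths[-1]
```

    Pointing root at an empty directory on the affected UFS data disk and scaling the counts up towards the real backup size would show whether this pattern alone is enough to make the delete hang.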



  • We don't know how to reproduce it yet.

    Thanks.



  • I just stopped in to ask why you have 800+ firewalls. Are you running a country? Even an ISP only needs 2-3 firewalls for HA.



  • 800+ public schools, each one with its own internet access.

