Guidance regarding switching to ZFS



  • What is the resource usage like with ZFS on pfSense?  (Running pfBlocker, and thinking about Snort or Suricata)

    I'll no doubt be forced to upgrade this in the next 12-18 months, but for now, here's the setup I have:

    CPU Type Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
    Current: 1826 MHz, Max: 1993 MHz
    4 CPUs: 1 package(s) x 4 core(s)
    AES-NI CPU Crypto: No
    State table size: 0% (1312/1000000)
    MBUF Usage: 2% (5252/245060)
    Load average: 0.67, 0.55, 0.46
    CPU usage: 11%
    Memory usage: 26% of 3960 MiB
    SWAP usage: 0% of 8191 MiB
    Disk usage:
        /         85% of 101GiB - ufs
        /var/run  6% of 3.4MiB - ufs in RAM

    Installed packages:

    • arping

    • darkstat

    • iftop

    • iperf

    • mailreport

    • nmap

    • ntopng

    • nut

    • openvpn-client-export

    • pfBlockerNG

    • RRD_Summary

    • softflowd

    • snort

    • Status_Traffic_Totals

    • sudo

    • syslog-ng

    I use ZFS on FreeNAS, and I was thinking about switching to ZFS for pfSense as I'd really love to be able to roll back an update if I don't like it.

    How much of a drag would a ZFS boot drive be on this system?  The FreeNAS project says don't even think about FreeNAS (ZFS Only-No UFS anymore) below 8GB.

    Given how full the SSD is, I either need to drastically cut ntopng's disk usage (it's sucking up about 60GB), stop using it altogether, or buy a 256GB SSD.  I'm not willing to spend more money on this box, so if it won't pass muster, I'll have to wait for my next box before going ZFS.
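    To see exactly where that 60GB is going before deciding, something like the following would work from a shell.  The data directory path is an assumption (the pfSense ntopng package typically keeps its history under /var/db/ntopng), so adjust it if your install differs:

    ```shell
    # Where is ntopng's history actually going?  /var/db/ntopng is an
    # assumed default; override NTOPNG_DB if your package stores it elsewhere.
    NTOPNG_DB="${NTOPNG_DB:-/var/db/ntopng}"

    if [ -d "$NTOPNG_DB" ]; then
        # total space consumed by ntopng's history
        du -sh "$NTOPNG_DB"
        # largest subdirectories last, to see what is worth pruning
        du -h -d 1 "$NTOPNG_DB" | sort -h | tail
    else
        echo "no ntopng data directory at $NTOPNG_DB"
    fi
    ```

    The ntopng package also has a GUI setting to limit how many days of history it keeps, which is usually the cleaner fix than pruning by hand.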

    Any Guidance/Suggestions/Comments/Advice is much appreciated.



  • I wouldn't worry about it at all with 4GB. I saw maybe a 5% increase in memory use on my APU2C4 running Snort and pfBlockerNG, and it sits pretty consistently at 20% use.

    Save config, fresh install choosing ZFS guided install, upload config, done.



  • The FreeNAS project says don't even think about FreeNAS (ZFS Only-No UFS anymore) below 8GB.

    I suspect that is meant for those using deduplication, which is RAM-intensive.



  • @KOM:

    The FreeNAS project says don't even think about FreeNAS (ZFS Only-No UFS anymore) below 8GB.

    I suspect that is meant for those using deduplication, which is RAM-intensive.

    It's needed for ARC. Apparently guys were losing their pools and having other odd things going on with less than 8GB so they upped the minimum requirement to 8. If people stick to that the weird issues seem to go away.



  • @Jailer:

    @KOM:

    The FreeNAS project says don't even think about FreeNAS (ZFS Only-No UFS anymore) below 8GB.

    I suspect that is meant for those using deduplication, which is RAM-intensive.

    It's needed for ARC. Apparently guys were losing their pools and having other odd things going on with less than 8GB so they upped the minimum requirement to 8. If people stick to that the weird issues seem to go away.

    I'm wondering whether any similar problems have been experienced with pfSense?

    A pool of 8 x 8TB (or larger) drives with some heavy file I/O is certainly going to need a lot more ARC than a pfSense box.

    Have any problems been reported that are attributable to ZFS?

    I would think that ZFS might kick the crap out of a USB drive a lot faster than UFS - not my problem, because I'm using an SSD, but something to be mindful of.



  • At this point I'm not really seeing the advantage of ZFS in this use case, considering the resources it seems to require.



  • The quoted values seem very wrong to me, but there's some information missing. You don't need any certain amount of ARC cache for a pool of a certain size for ZFS to operate properly; it will still work fine if you allocate too little ARC cache, but you won't get the maximum read performance that you could get out of the pool.

    On top of that, large amounts of ARC cache are for heavy fileserver use; in a firewall/router/proxy role I don't see where the system would be doing large amounts of read operations simultaneously.



  • @kpa:

    The quoted values seem very wrong to me but there's some information missing.

    My reply was in reference to FreeNAS needing lots of RAM; I guess I could have made that a bit more clear. Its use case is much different from a firewall's, and the two shouldn't be compared.

    @KOM:

    At this point I'm not really seeing the advantage of ZFS in this use case, considering the resources it seems to require.

    Some users were reporting file corruption with UFS. ZFS is quite resilient, and as used with pfSense it should not tax resources much.



  • A feature of ZFS that doesn't get mentioned often enough is boot environments (BEs; beadm is the tool used to administer them).  It's the best way to upgrade a system I've ever seen.  If an upgrade fails in some manner, it's easy to fall back to a known-good one.  I'd bet one could even automate a fallback if something in the new environment fails.  It's very easy to create mirrors with ZFS, so you can get a lot of redundancy easily.  Yes, you can create mirrors for UFS using geom, but not too many folks do that.  Of course, your system needs the ability to have multiple devices.

    Amount of RAM is very dependent on use case, as others point out.  If you look at what pfSense or any firewall is doing: it boots, reads configuration, and starts up processes.  After that, the bulk of what happens is writing to logs and "in-memory stuff" (traversing state tables and such).  There are standard FreeBSD packages that let you look at the ZFS usage stats, so you could keep an eye on things.

    ZFS by default will use system RAM, so there may be some competition for that resource, but ZFS lets you tune and set limits.  ZFS also groups transactions (mostly reads) that get batched out to the device, so you may see periodic spikes in CPU usage because of this.

    Just my opinions.


  • (BEs, beadm is the tool used to administer them).  Best way to upgrade a system I've ever seen

    This would be the main factor for sure… So you're saying I could take, say, a snapshot/mirror before I do an upgrade, and if something goes wrong, just boot the previous boot environment?

    So I have an SG-4860.. It's got a 32GB eMMC and 8GB of RAM.. why would it not have shipped with ZFS if that would add the ability to use boot environments?  This would be a fantastic option when upgrading, or when wanting to play with the dev snapshots.  Many devices do this; Cisco switches, for example, let you upgrade the alternate firmware image and pick which image you want to boot, etc.  Riverbeds do it the same way, etc.

    From this it doesn't seem like BEs are really a viable out of the box option yet?
    https://forum.pfsense.org/index.php?topic=130922.0



  • It may depend on what version of FreeBSD you're basing on.  The TrueOS project (the iXsystems guys, the PC-BSD follow-on) is based on 12-CURRENT with a ZFS install, and they have been doing binary upgrades into a new BE since day 1.  11-RELEASE should support them just fine.

    The process is basically: create a snapshot and clone of the currently running environment, and the upgrade is done into that clone.  Then you readjust flags and reboot into that upgraded BE.  If anything fails, the fallback is either:

    • reboot, stop in the bootloader, select the previous BE, and continue booting, or

    • if the system is up enough, use the beadm command to activate the previous BE and reboot.

    There are some rules you need to follow when creating any datasets (use the canmount property to make sure things wind up where they need to be), but it should be doable.  This is the part that gets a lot of people: things needed for booting have to fall under whatever root dataset you create.  Just remember to limit the number of BEs so you don't run out of disk space.
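    The snapshot/fallback flow above can be sketched with the beadm tool.  This is only an illustration: the BE name "pre-upgrade" is a placeholder, and the commands assume a ZFS-on-root FreeBSD system with beadm installed:

    ```shell
    # Sketch of the boot-environment fallback workflow described above.
    # "pre-upgrade" is an example BE name, not a convention.
    if command -v beadm >/dev/null 2>&1; then
        beadm list                  # show existing boot environments
        beadm create pre-upgrade    # snapshot/clone the running environment
        # ... perform the upgrade ...
        # If the upgraded system misbehaves but still boots:
        beadm activate pre-upgrade  # previous BE becomes active on next reboot
        BEADM=present
    else
        echo "beadm not available; this requires a ZFS-on-root FreeBSD system"
        BEADM=absent
    fi
    ```

    The other fallback path needs no commands at all: at the loader menu during boot, pick the previous boot environment and continue.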



  • @mer:

    ZFS by default will use system RAM so there may be some competition on that resource, but ZFS lets you tune and set limits.  ZFS also groups transactions (mostly reads) that get batched out to the device, so you may have periodic spikes in CPU usage because of this.

    This needs more qualification. ZFS isn't per se much different from UFS, for example, in that it does use some system memory for its bookkeeping data structures. What is different is that ZFS allocates the ARC cache from so-called wired memory, the non-swappable kernel memory.

    The grouping of transactions really applies only to writes, because if there are only reads there is no need for transactions to ensure data integrity. It is of course possible that in some cases a write transaction blocks a read of the same data. Write performance, especially in applications like NFS, is one of ZFS's Achilles' heels, because it goes to such lengths to guarantee that synchronous writes are truly atomic.



  • So, getting into the weeds: yes, ARC is nominally kernel memory, accounted for in the Wired statistic in top, but by design it will expand up to a threshold of system memory.  Wired and non-swappable does not mean unrecoverable.
    There are boot-time tunables to change this behavior (vfs.zfs.arc_max, vfs.zfs.arc_min).

    So if you have 32GB of RAM in a system and your use case warrants it, ARC will use up to, I think, all but 1 or 2 GB reserved for the kernel.  But if other processes demand RAM, ARC will flush and use less.  There's plenty of documentation on this, but the best I've seen is the Michael W. Lucas books on ZFS.
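    For reference, those tunables go in /boot/loader.conf and take effect at the next boot.  The sizes below are illustrative examples only, not recommendations; tune them to your hardware and workload:

    ```shell
    # /boot/loader.conf -- boot-time ARC limits (example values).
    vfs.zfs.arc_max="512M"   # cap ARC growth at 512 MiB
    vfs.zfs.arc_min="64M"    # don't let ARC shrink below 64 MiB
    ```

    After rebooting, you can confirm the limit with `sysctl vfs.zfs.arc_max` and watch the current ARC size via `sysctl kstat.zfs.misc.arcstats.size`, or just keep an eye on the Wired figure in top.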