Average writes to Disk
-
I hope I remember correctly, but zfs by default writes every 5 seconds if there is anything to write. Also it depends on type of write operation (synchronous or asynchronous). Because it is easy to revert copy of the pfsense configuration I changed a little zfs configuration on my device (but with it there is a little bit bigger chance of problems in case of not clean shutdown)
So for my device I changed the following:
- make all writes asynchronous (probably won't help much here, but software doesn't have to wait for OS to finish writing)
zfs set sync=disabled zroot
- System -> Advanced -> System Tunables - set vfs.zfs.txg.timeout to 180 - it will increase time between writes from 5s to 180s; during this time all operations will happen in RAM; but writes could happen earlier if there is too much to write (also OS is able to force writing earlier - I'm not expert so I don't know all conditions); for me write happens once or twice during said 3 minutes
- because there is bigger chance of failure with 180s I made zfs to write everything twice (I have 100GB SSD so it is not problem); this won't guaranty data safety, but still (works only for newly written data - old won't be touched)
zfs set copies=2 zroot
That works for me. Please read about it before changing anything. Or at least have configuration backup before you try.
-
I use ZFS also but with RAM disks. this will reduce the writes to the ssd massively, right?
KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s 0.00 1 0.00 0.00 1 0.00 13.97 1 0.01 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 12 0.00 7.86 23 0.18 0.00 0 0.00 0.00 2 0.00 0.00 2 0.00 0.00 0 0.00 0.00 0 0.00 0.00 3 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00
-
I followed your suggestions and it looks good:
iostat -d 5 6 md0 ada0 pass0 KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s 0.00 0 0.00 15.55 14 0.21 0.38 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00
This is the production machine where initially I found that heavy writing. I give it 24 hours and keep an eye also on the LBA_written.
Thanks anyway!
fireodo -
@tomashk said in Average writes to Disk:
zfs set sync=disabled zroot
Is this command permanent or has it to be made on each reboot? Thanks
-
This is the SMART Test output from one of my boxes, APU2 Board with 16GB SSD...
241 Lifetime_Writes_GiB shows GB count written.... -> 4176.278 TB???? no way....SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000a 100 100 000 Old_age Always - 0 9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 8694 12 Power_Cycle_Count 0x0012 100 100 000 Old_age Always - 42 168 SATA_Phy_Error_Count 0x0012 100 100 000 Old_age Always - 0 170 Bad_Blk_Ct_Erl/Lat 0x0013 100 100 010 Pre-fail Always - 0/20 173 MaxAvgErase_Ct 0x0000 100 100 000 Old_age Offline - 385 (Average 775) 192 Unsafe_Shutdown_Count 0x0012 100 100 000 Old_age Always - 4 194 Temperature_Celsius 0x0023 070 070 000 Pre-fail Always - 30 196 Not_In_Use 0x0000 100 100 000 Old_age Offline - 0 218 CRC_Error_Count 0x0000 100 100 000 Old_age Offline - 0 241 Lifetime_Writes_GiB 0x0012 100 100 000 Old_age Always - 4176278 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended captive Completed without error 00% 8694 - # 2 Short captive Completed without error 00% 8694 - # 3 Short offline Completed without error 00% 2557 - SMART Selective self-test log data structure revision number 0 Note: revision number not 1 implies that no selective self-test has ever been run SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay.
-
@tohil said in Average writes to Disk:
241 Lifetime_Writes_GiB shows GB count written.... -> 4176.278 TB???? no way....
Smart is always Device specific - you have to dig for the correct interpretation of those values ... 4 PB is heavy and I don't believe it even its theoretically possible ...
-
@fireodo said in Average writes to Disk:
@tomashk said in Average writes to Disk:
zfs set sync=disabled zroot
Is this command permanent or has it to be made on each reboot? Thanks
It is permanent. You can check if it is set with this command:
zfs get -r sync zroot
(it will show information for whole zfs file system)
and you can revert it with this command if I remember correctly:
zfs set sync=enabled zroot
-
-
Yeah I would hope that the default settings used for zfs install wouldn't be a determinant to life of SSD or eMMC would have issues with total writes as well..
I don't know of a way to check what eMMC reports for total life write, etc.
-
@johnpoz said in Average writes to Disk:
I don't know of a way to check what eMMC reports for total life write, etc.
As I don't have any device here with eMMC (OK, except my Phone but that's out of my research-reach :-D ) I cannot say anything about it. Is it smart capable? Documentation says its a kind of predecesor of SD-Card ...
-
If for whatever reason the onboard eMMC died, I know you can add a msata to it.. So that would always be an option.
But would be curious if there is a way to get info from it on total data written, etc.
-
There's no way to get write data from eMMC AFAIK.
-
@fireodo said in Average writes to Disk:
Thank you! Those settings do the trick! Maybe @jimp can take a look on this thread and maybe its something for the next version ...?
I changed it a few years ago because I was annoyed by constant blinking of the LED when SSD was working :). But I used those options having very basic knowledge of zfs. Now I know it a little bit better so I understand what are downsides of this configuration. And I guess it is far from being optimal. Also I'm not sure if setting it by default on zfs for next version would be good. There are too many variables for different devices to be sure if this is completely safe. But fortunately it easy to restore configuration if something is broken and I don't store important information on pfsense's drive.
-
@tomashk said in Average writes to Disk:
Also I'm not sure if setting it by default on zfs for next version would be good.
Thats why I thought it would be good if some of the developer with far more knowledge about that ZFS take a look at this situation and maybe find a more robust solution...
-
@stephenw10 if zfs does, at least in its default config cause more writes to the eMMC or SSD.. You mentioned something about
There was a bug at one time that was not mounting root as noatime
That increased drive writes in normal use, is there some change to the zfs config that might be prudent to change.. As being mentioned in this thread for example
System -> Advanced -> System Tunables - set vfs.zfs.txg.timeout to 180
I mean writing 20GB a day for not really doing anything does seem like a lot, its possible there is some error in the math, or interpretation of the info that leads to an exaggerated understanding off the amount written.. The amount of life time writes to eMMC is a concern if zfs is in fact drastically increasing the amount of data written..
-
@johnpoz said in Average writes to Disk:
I mean writing 20GB a day for not really doing anything does seem like a lot, its possible there is some error in the math
This cannot be excludet. I was searching and testing this matter aprox a week ago before I came to the forum with it.
-
I always disable DNS Reply Logging, check if that helps..
-
One thing I've noticed here is that having the dashboard open with the system information widget up will cause significantly more disk writes. Something that can easily happen when you are testing:
That's a 21.09 snapshot running ZFS.
Steve
-
@fireodo said in Average writes to Disk:
This cannot be exclude.
I didn't mean to seem like not trusting of your work, and how much data is being written. But I personally have a lack of understanding in enough depth to correlate what the output iostat shows and actual writes to the disk amounts too.
Even with smart data from SSD, these numbers can also be misinterpreted because the values shown in smart have to take into account the specific maker of the ssd, etc.. I have seen this myself on my nas, where is reports an insane amount of data written to the nvme cache. But if you do the math a bit different, then the numbers seem more sane..
Either way I feel your information does warrant further look into validation of how much actual data is being written.. Since if 20GB a day is being written, and you have small sized ssd or eMMC which do have limited amount of write for their life, the typical life could be shortened by a drastic amount..
edit: @stephenw10 that looks interesting.. Must be something new for 21.09, I do not see any sort of graph available in 21.05.1... Or is that something outside of pfsense, like vm host where you have pfsense running?
edit2: Would help if could pull such info from eMMC... From the limited amount looking into I have done so far, the mmc-util can be installed on pfsense.. but have not been able to glean any info from /dev/da0 using it.. Only thing I have managed to do is get this using smartctl
[21.05.1-RELEASE][admin@sg4860.local.lan]/: smartctl -d scsi -x /dev/da0 smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-STABLE amd64] (local build) Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Vendor: Generic Product: Ultra HS-COMBO Revision: 1.98 User Capacity: 30,601,641,984 bytes [30.6 GB] Logical block size: 512 bytes >> Terminate command early due to bad response to IEC mode page
I would of hoped this would of provided some info like the smart info you can get from a ssd
[21.05.1-RELEASE][admin@sg4860.local.lan]/: mmc --help Usage: mmc extcsd read <device> Print extcsd data from <device>.
But just asking for status returns
[21.05.1-RELEASE][admin@sg4860.local.lan]/: mmc status get /dev/da0 open: Operation not permitted
-
IMO, disabling sync means you put yourself in danger of not having a consistent filesystem on disk which is critical for a firewall. If sync is disabled it means ZFS lies to the OS about what has been committed to disk.
Maybe it could be disabled for just the log dataset, or
/tmp
for example, or something along those lines, but disabling it globally is a bad idea. Even log data could be critical in a situation that needs debugging.If it bugs you enough to disable it on a disk, I'd say that's a personal choice, but I'm not sure I could get behind recommending anyone do that by default.
I've been running ZFS on SSDs for years and so far haven't killed a disk with it.