RAM Disk enabled, but still constant writes to disk…
-
Pardon the interruption, but is this a Proxmox, VM, SSD or swap specific issue?
-
@provels That's what we are trying to figure out. I am running VMware and it killed an SSD in no time. Disabling swap fixed it for me but not for others, so we are trying to figure out exactly what it is.
-
It's not VM specific, it's just far easier to see the disk IO in a VM. The actual cause of the OP's issue, where he had to remove swap, is still a mystery. I could not replicate it.
-
@stephenw10 I personally didn't have the problem outside a VM. I was running pfSense for years and years on a small 32GB SSD which would have failed 10 times over at the rate it killed my larger drive. The little drive passed SMART tests, etc., and is still going.
Same config file (although I did test a fresh install).
-
Without noatime set I have seen some high drive write numbers, much higher than I expected. I've yet to see anything kill a drive though. At least not with drive writes alone.
With RAM drives enabled I'm seeing effectively 0 drive writes until I save a change etc. I think that's pretty much the same as what you are seeing.
Steve
-
@PeterBrockie Well, FWIW, with noatime, ramdisks, and swap enabled I see no disk activity at all on my pfSense VM VHDX in Hyper-V (2012R2).
Without noatime, but otherwise the same, as below.
-
It would seem very strange if this was caused by the choice of hypervisor. I'm less familiar with the other hypervisors - does anyone know if Proxmox is the only one that uses the virtio block device for the hard disk? If the problem is VM host related, perhaps that could be the cause?
@stephenw10 - in the case of the writes I'm curious about - I believe those are not initiated by me directly - I am purposely trying to avoid touching the pfsense system while those writes are occurring. I don't log in or make any changes to the environment - it should be just routing (and logging). I can accept that there will be a few jobs on timers which occur (i.e. the ramdisk is dumped to disk periodically - but I have that set to 24 hours) but I am surprised it would be anything so frequent.
I do find this truly intriguing. To me, this is less about killing SSDs than it is about not really having a good handle on what the system is doing. These are security focused platforms so it would be ideal if an administrator can make sense of what's happening.
I wonder if an experiment like this would work - on a test pfsense install - can we remount the / partition as ro and see what gets upset? It might be time to start breaking out more VMs...
-
Well what do you have configured on that VM? Any packages?
I see basically zero writes unless I'm saving a change or as you say it is writing out the RRD data or updating bogons etc.
-
@stephenw10
Thanks - yeah I don't have any packages except the openvpn client export.
Configuration wise - a few custom firewall rules
- iot vlan
- openvpn client interface
- openvpn server
Service wise it's the usual suspects (dhcpd, dpinger, ntpd, openvpn x2, sshd, syslogd and unbound).
Given what you and others have provided above - it must be something in my configuration or traffic.
I've got a Proxmox 6.0 server on which I've reinstalled pfsense 2.4.4 with no swap, configured the SSD to have noatime, and enabled ramdisk. As soon as I enabled ramdisk it went super quiet (just around 9:30).
Now this setup is not really representative of my live setup (no real wan traffic and no clients) but (unless there is something different in Proxmox 6.0 vs 6.1) it's a good indication that it's something due to the configuration and loading on the live pfsense setup.
Thanks for your help - I'll keep playing around and keep this post updated if I find anything else.
Incidentally, on my fresh install, I did an iostat comparison between a reboot last night and this morning and it shows it wrote about 20 MB to the disk. (Without RAM disks.)
-
Ok so just an update on this investigation. I've been away for a bit so not doing much with the setup.
During that time my live pfsense machine continued to write frequently to the disk. Over the last 10 days of uptime, iostat shows it has written about 2500MB - which seems like a lot for something that shouldn't be writing anything to the disk.
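To put that iostat total in perspective, here is a quick back-of-the-envelope calculation of the average write rate it implies. This is only a sketch: the function name is made up for illustration, and it assumes the "MB" figure can be treated as MiB.

```python
# Average write rate implied by a cumulative iostat total.
# Assumption: the "MB" reported is treated as MiB (1024 * 1024 bytes).
BYTES_PER_MB = 1024 * 1024
SECONDS_PER_DAY = 24 * 60 * 60


def average_write_rate(total_mb, days):
    """Return the average write rate as MB per day and KB per second."""
    total_bytes = total_mb * BYTES_PER_MB
    seconds = days * SECONDS_PER_DAY
    return {
        "mb_per_day": total_mb / days,
        "kb_per_second": total_bytes / seconds / 1024,
    }


# Figures from the post: about 2500 MB written over 10 days of uptime.
live = average_write_rate(2500, 10)
print(f"{live['mb_per_day']:.0f} MB/day, {live['kb_per_second']:.2f} KB/s")
```

That works out to roughly 250 MB per day, or a little under 3 KB/s on average, which is small in absolute terms but clearly not "nothing".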
Meanwhile, my test pfsense setup with no routing traffic has been very silent on disk.
I updated it to Proxmox 6.1-5 (from 6.0) and it was still fine.
I took my live pfsense XML and restored it on the test configuration (I had to shift around some IP addresses and interfaces to keep things legit) and it continued to be disk silent. So it seems like the disk activity requires some network activity. Unfortunately my test VM machine doesn't have multiple NICs, so I may have to configure some bridges to mimic more network traffic.
-
Adding a client machine to my test network generates some writes on my test installation, which confirms it is related to the existence of client machines. Since it's unlikely to be related to traffic (as most of that is logged in RAM), I guessed it may be something related to DHCP leases.
I used a modified version of the find command listed by BlueScreenOfTOM above to identify some files being written to, and it seems like /etc/hosts is being written to quite regularly.
I looked at the contents and it seems to be related to the DHCP leases getting written to the /etc/hosts file.
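The exact find command is not reproduced here, but the idea of listing files modified within a recent time window can be sketched in Python. This is a rough equivalent, not the command actually used: the function name `recently_modified` is hypothetical, and it assumes a Python 3 interpreter is available wherever it is run.

```python
import os
import stat
import time


def recently_modified(root, minutes):
    """Return sorted (mtime, path) pairs for regular files under `root`
    whose modification time falls within the last `minutes` minutes."""
    cutoff = time.time() - minutes * 60
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or cannot be inspected; skip it
            if stat.S_ISREG(info.st_mode) and info.st_mtime >= cutoff:
                hits.append((info.st_mtime, path))
    return sorted(hits)
```

Calling something like `recently_modified("/etc", 10)` a few minutes apart and comparing the results should show which files keep getting rewritten.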
I believe this is caused by "Register DHCP leases in the DNS Resolver" being selected in the DHCP server settings, so I have removed that for now. Given my hostname is not really legit, these are pretty much pointless anyway.
So far, disabling that has reduced the writes to zero.
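One way to check a result like this is to count how often a single file's modification time changes over a fixed period, before and after changing the setting. The sketch below does that by polling. The function names are hypothetical, it assumes Python 3 is available, and it can undercount if several writes land within one polling interval.

```python
import os
import time


def _mtime_ns(path):
    """Return the file's mtime in nanoseconds, or None if it is briefly missing."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


def count_changes(path, duration_s, interval_s=1.0):
    """Poll `path` for `duration_s` seconds and count observed mtime changes."""
    last = _mtime_ns(path)
    changes = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        time.sleep(interval_s)
        current = _mtime_ns(path)
        # Ignore moments when the file is missing (e.g. mid-replacement).
        if current is not None and current != last:
            changes += 1
            last = current
    return changes
```

For example, `count_changes("/etc/hosts", 600)` would report how many times the file was seen to change in ten minutes.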
So perhaps the mystery is solved? :)