Netgate Discussion Forum

    RAM Disk enabled, but still constant writes to disk…

    Scheduled Pinned Locked Moved General pfSense Questions
    34 Posts 9 Posters 6.2k Views
    • P
      PeterBrockie @stephenw10
      last edited by

      @stephenw10 The issue wasn't that it was writing a lot of data; it was the number of small writes per minute. I'm 99% sure the zero-swap install fixed the problem and that swap was the root cause. The whole point of enabling the RAM disk was to spare flash-based storage, yet for some reason it seemed to write all the RAM disk data to swap as it came in (I suspect; I didn't look closely at the data being written, only that syncer was doing the writing and that it was constant). It could be some interaction between ESXi and pfSense, since it doesn't seem to be a widely reported issue.

      While testing I tried reinstalls (default, with swap), killing all the logging I could find, removing all packages (even VM tools, just in case), changing RAM disk settings, etc. It just kept writing away, 24/7 until I did the no swap install. :D The no swap install used a restore of the exact same config I was using.

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Hmm, interesting. I usually install without swap anyway on devices running from flash. However I would not on an SSD and I've never noticed drive writes anywhere near that. Did it actually show swap being used?

        Steve

        • P
          PeterBrockie @stephenw10
          last edited by

          @stephenw10 I'm not actually sure, I was really in the "just try things to get this damn thing to stop" mode so I didn't look too much into what exactly it was doing other than calling syncer constantly. I have no idea why it would care about swap, it had 8 gigs of memory (later 32 GB just to test it), and it was still writing. Note I use ZFS, I believe I tried a UFS install a while back to test and it still did it.

          I made a new VM, installed it. Disk writes. Restored my config and it still was writing to the disk. When I made a VM and installed without swap it went away both before and after I restored my config. Like I said in another reply, I don't think this is very widespread so I would like to hear from the original poster since they seemed to have the exact same problem.

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Hmm, interesting catch. Yeah 61TB in 2 months is waaaaay outside the range of anything I've seen. Even those systems that were not mounted noatime for a while.

            Steve
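For a sense of scale, the "61TB in 2 months" figure can be turned into an average write rate. This is only a back-of-the-envelope sketch: it assumes decimal terabytes and treats "2 months" as 61 days, neither of which is stated in the thread.

```python
# Rough average write rate implied by "61TB in 2 months".
# Assumptions (not from the thread): decimal terabytes, 2 months ~ 61 days.
TOTAL_BYTES = 61 * 10**12
DAYS = 61


def average_write_rate(total_bytes: int, days: float) -> float:
    """Return the mean write rate in bytes per second over the period."""
    return total_bytes / (days * 24 * 60 * 60)


rate = average_write_rate(TOTAL_BYTES, DAYS)
print(f"{rate / 10**6:.1f} MB/s sustained")  # about 1 TB per day
```

Under those assumptions the drive would have been absorbing on the order of 10 MB/s around the clock, which is far above the kB/s rates reported elsewhere in this thread.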

            • E
              emobo
              last edited by

              Just providing some input to this thread because it very nicely captures a problem which I am (or was?) experiencing.

              I have a 2.4.4 pfSense system running on a Proxmox Virtual Environment (6.1-3) and I was surprised to see that my SSD's (128GB LiteOn m.2 SATA SSD - consumer grade stuff) Wear Leveling SMART number dropped 4% in about 3 weeks. So I started to investigate optimizing proxmox and pfsense to reduce writes to the drive.
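The wear figure above (4% in about 3 weeks) can be extrapolated to a rough lifetime estimate. This is a minimal sketch that assumes wear is linear over time and that the SMART value counts down from 100; both are assumptions, and the function name is hypothetical.

```python
# Linear extrapolation of SSD wear from a SMART wear-leveling value.
# Assumption: wear accumulates at a constant rate, which real workloads
# rarely guarantee.
def weeks_until_worn(percent_lost: float, weeks_observed: float,
                     percent_remaining: float = 100.0) -> float:
    """Return how many weeks percent_remaining lasts at the observed rate."""
    rate_per_week = percent_lost / weeks_observed
    return percent_remaining / rate_per_week


weeks = weeks_until_worn(percent_lost=4.0, weeks_observed=3.0)
print(f"{weeks:.0f} weeks (~{weeks / 52:.1f} years) at the observed rate")
```

The estimate only describes the rate observed before any of the changes described in the rest of this post.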

              On the pfSense side, I have been observing regular writes on the hypervisor (via iotop -a --only) which show the kvm of the pfSense system is writing to the disk rather constantly. Proxmox history shows it's around 10k on average:

              0290b7f6-8b83-4b77-a60a-2d89087df645-image.png

              On my 2.4.4 setup I have RAM disk enabled. I have a constant connection to an OpenVPN server and not much else.

              Based on the observations in this thread, I reinstalled pfSense with a manually partitioned drive where I deleted the swap file (and enabled trim on the virtio based disk).

              After reinstalling I still observed writes on the VM, so I enabled noatime as well on the root mount.

              In this plot you can sort of see the effects of that (it shows up around 22:50-22:55 on the plot).

              17803b98-8f14-42b3-9957-e49d2906812c-image.png

              Now the writes are not so constant, but there are still a few periodically.
              Based on the change observed with the noatime setting, vs reinstalling without swap, I'm not sure if the swap was the problem. It would be interesting to know if PeterBrockie's successful setup did include the noatime mount option.

              I am definitely not in the range where the drive use is going to kill my SSD, but I think it's worth noting that such writes do exist when in theory we might expect none. I'm still trying to understand what the other writes are. I'll try not to log into the pfSense system for the next little while and see what the hypervisor detects.

              Thanks to everyone who contributed to this post.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by stephenw10

                Mmm, interesting. I too have a Proxmox system here with at least two pfSense VMs running in it continually. I also see 20-30k drive writes in each. I've set noatime manually now, I'll let you know.
                The smartctl status from it is interesting:

                === START OF SMART DATA SECTION ===
                SMART overall-health self-assessment test result: PASSED
                
                SMART/Health Information (NVMe Log 0x02)
                Critical Warning:                   0x00
                Temperature:                        32 Celsius
                Available Spare:                    100%
                Available Spare Threshold:          10%
                Percentage Used:                    0%
                Data Units Read:                    696,237 [356 GB]
                Data Units Written:                 1,119,658 [573 GB]
                Host Read Commands:                 3,490,850
                Host Write Commands:                11,895,271
                Controller Busy Time:               66
                Power Cycles:                       10
                Power On Hours:                     341
                Unsafe Shutdowns:                   2
                Media and Data Integrity Errors:    0
                Error Information Log Entries:      0
                Warning  Comp. Temperature Time:    0
                Critical Comp. Temperature Time:    0
                Temperature Sensor 1:               32 Celsius
                Temperature Sensor 2:               32 Celsius
                
                Error Information (NVMe Log 0x01, max 64 entries)
                No Errors Logged
                

                I'm not totally sure about that since the power on hours seem low, I've had that running for significantly longer than 2 weeks.

                Steve
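The bracketed totals in the smartctl output above follow from the NVMe convention that one "data unit" is 1000 blocks of 512 bytes. The sketch below reproduces those totals from the raw counters and converts the power-on hours to days; the variable names are illustrative, and the average write size is only a derived estimate.

```python
# One NVMe data unit is 1000 x 512-byte blocks, i.e. 512,000 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512

# Raw counters copied from the smartctl output above.
data_units_read = 696_237
data_units_written = 1_119_658
host_write_commands = 11_895_271
power_on_hours = 341

read_gb = data_units_read * BYTES_PER_DATA_UNIT / 10**9
written_gb = data_units_written * BYTES_PER_DATA_UNIT / 10**9
# Mean payload per host write command, in kB (an estimate only).
avg_write_kb = data_units_written * BYTES_PER_DATA_UNIT / host_write_commands / 10**3
power_on_days = power_on_hours / 24

print(f"read: {read_gb:.0f} GB, written: {written_gb:.0f} GB")
print(f"average write command: {avg_write_kb:.1f} kB")
print(f"power on: {power_on_days:.1f} days")
```

341 hours is a little over 14 days, which is why the counter looks low for a drive that has been running for significantly longer than 2 weeks.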

                • P
                  PeterBrockie @emobo
                  last edited by

                  @emobo I have a ZFS pool and haven't touched atime. As far as I can tell my specific disk write problem was solely an interaction between swap and the VM. Despite having plenty of free RAM and having all logs go to RAMdisks, it still wrote something to the swap constantly.

                  After removing swap:
                  Screenshot (44).png

                  • E
                    emobo @stephenw10
                    last edited by

                    @stephenw10
                    Ah, sounds familiar. Is that smartctl status output from your Proxmox Debian host or from the pfSense FreeBSD guest? I don't know the right command line arguments to get it from the pfSense VM. I'm not sure if it can be accessed there.

                    @PeterBrockie
                    Thanks, that is impressive and it's in the last hour so the resolution is high. That is exactly what I would hope to achieve as well. I have no swap now, so I'm unsure what else could be causing it.

                    I have more history now on my host log and it still shows activity:
                    87402877-484d-40da-871b-a67ab6bd47a8-image.png

                    Focusing on the last hour it's still showing periodic writes.
                    7d3c7f80-2b02-491a-bfce-18628ca63764-image.png

                    There is an old thread here from back in 2012 https://forum.netgate.com/topic/130424/ram-disk-enabled-but-still-constant-writes-to-disk

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Anything from 2012 is largely irrelevant at this point.

                      I only have 2.4.5 and 2.5 snapshots running but the results were broadly similar. Very basic installs.
                      Mounting root noatime produces ~50% decrease in drive writes. ~30kBps to ~18kBps.
                      Enabling RAM drives reduces it to 0 most of the time. There are obviously still some writes when the config updates etc.

                      Selection_752.png

                      Steve
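Taking the two approximate rates quoted above at face value (~30kBps before noatime, ~18kBps after), the relative decrease and the yearly write volume can be worked out as below. This is a sketch under the assumption that the rates are sustained averages in decimal kilobytes per second; the helper names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def percent_decrease(before: float, after: float) -> float:
    """Relative drop from before to after, as a percentage."""
    return (before - after) / before * 100


def yearly_gb(rate_kb_per_s: float) -> float:
    """Total written per year, in GB, at a constant rate in kB/s."""
    return rate_kb_per_s * 1000 * SECONDS_PER_YEAR / 10**9


before_kbps, after_kbps = 30.0, 18.0
print(f"decrease: {percent_decrease(before_kbps, after_kbps):.0f}%")
print(f"before: {yearly_gb(before_kbps):.0f} GB/year")
print(f"after:  {yearly_gb(after_kbps):.0f} GB/year")
```

With these rounded inputs the drop comes out closer to 40% than 50%, and either rate stays under 1 TB per year, so the measurements are best read as rough figures.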

                      • E
                        emobo
                        last edited by emobo

                        Thanks - yes the noatime has a noticeable effect.
                        I'm puzzled how PeterBrockie's configuration could be so quiet while the other setups still have regular activity.

                        As a test, I tried disabling local logging but it seems to have little to no effect. This makes sense if the logs were being written to the ramdisk anyway.

                        • provelsP
                          provels
                          last edited by provels

                          Pardon the interruption, but is this a Proxmox, VM, SSD or swap specific issue?

                          Peder

                          MAIN - pfSense+ 24.11-RELEASE - Adlink MXE-5401, i7, 16 GB RAM, 64 GB SSD. 500 GB HDD for SyslogNG
                          BACKUP - pfSense+ 23.01-RELEASE - Hyper-V Virtual Machine, Gen 1, 2 v-CPUs, 3 GB RAM, 8GB VHDX (Dynamic)

                          • P
                            PeterBrockie @provels
                            last edited by

                            @provels That's what we are trying to figure out. I am running VMware and it killed an SSD in no time. Disabling swap fixed it for me and not for others, so we are trying to figure out exactly what it is.

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              It's not VM specific, it's just far easier to see the disk IO in a VM. What actually caused the OP's issue, where he had to remove swap, is a mystery. I could not replicate it.

                              • P
                                PeterBrockie @stephenw10
                                last edited by

                                @stephenw10 I personally didn't have the problem outside a VM. I was running pfSense for years and years on a small 32GB SSD which would have failed 10 times over at the rate it killed my larger drive. The little drive passed SMART tests, etc. and is still going.

                                Same config file (although I did test a fresh install).

                                • stephenw10S
                                  stephenw10 Netgate Administrator
                                  last edited by

                                  Without noatime set I have seen some high drive write numbers, much higher than I expected. I've yet to see anything kill a drive though. At least not with drive writes alone.

                                  With RAM drives enabled I'm seeing effectively 0 drive writes until I save a change etc. I think that's the same as you are pretty much.

                                  Steve

                                  • provelsP
                                    provels @PeterBrockie
                                    last edited by provels

                                    @PeterBrockie Well, FWIW, with noatime, ramdisks, and swap enabled I see no disk activity at all on my pfSense VM VHDX in Hyper-V (2012R2).
                                    Without noatime, but otherwise the same setup, it looks like the below.
                                    f2a5a30c-02cf-4926-8feb-a7a6f4b7c7f6-image.png

                                    Peder


                                    • E
                                      emobo
                                      last edited by emobo

                                      It would seem very strange if this was caused by the choice of hypervisor. I'm less familiar with the other hypervisors; does anyone know if Proxmox is the only one that uses the virtio block device for the hard disk? If the cause is on the VM host side, perhaps that could be a factor?

                                      @stephenw10 - in the case of the writes I'm curious about - I believe those are not initiated by me directly - I am purposely trying to avoid touching the pfsense system while those writes are occurring. I don't login, or make any changes to the environment - it should be just routing (and logging). I can accept that there will be a few jobs on timers which occur (i.e. the ramdisk is dumped to disk periodically - but I have that set to 24hours) but I am surprised it would be anything so frequent.

                                      I do find this truly intriguing. To me, this is less about killing SSDs than it is about not really having a good handle on what the system is doing. These are security-focused platforms, so it would be ideal if an administrator can make sense of what's happening.

                                      I wonder if an experiment like this would work - on a test pfsense install - can we remount the / partition as ro and see what gets upset? It might be time to start breaking out more VM's...

                                      • stephenw10S
                                        stephenw10 Netgate Administrator
                                        last edited by

                                        Well what do you have configured on that VM? Any packages?

                                        I see basically zero writes unless I'm saving a change or as you say it is writing out the RRD data or updating bogons etc.

                                        Selection_754.png

                                        • E
                                          emobo @stephenw10
                                          last edited by emobo

                                          @stephenw10
                                          Thanks - yeah I don't have any packages except the openvpn client export.
                                          Configuration wise

                                          • a few custom firewall rules
                                          • iot vlan
                                          • openvpn client interface
                                          • openvpn server

                                          Service wise it's the usual suspects (dhcpd, dpinger, ntpd, openvpn x2, sshd, syslogd and unbound).

                                          Given what you and others have provided above - it must be something in my configuration or traffic.

                                          I've got a Proxmox 6.0 server on which I've reinstalled pfSense 2.4.4 with no swap, configured the SSD to have noatime, and enabled ramdisk. As soon as I enabled ramdisk it went super quiet (just around 9:30)
                                          ed06ab54-169c-4060-9e00-689e170a5156-image.png

                                          Now this setup is not really representative of my live setup (no real WAN traffic and no clients) but (unless there is something different in Proxmox 6.0 vs 6.1) it's a good indication that it's something due to the configuration and loading on the live pfSense setup.

                                          Thanks for your help - I'll keep playing around and keep this post updated if I find anything else.

                                          Incidentally on my fresh install, I did an iostat comparison between a reboot last night and this morning and it shows it wrote about 20megs to the disk. (Without RAMdisks).

                                          • E
                                            emobo
                                            last edited by

                                            Ok so just an update on this investigation. I've been away for a bit so not doing much with the setup.

                                            During that time my live pfsense machine continues to write frequently to the disk. Over the last 10 days up, iostat shows it has written about 2500MB - which seems like a lot for something that shouldn't be writing anything to the disk.
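To put the 2500MB over 10 days in context, the sketch below converts it to a daily volume and compares it with the 128GB drive size mentioned earlier in the thread. It assumes decimal units and deliberately ignores write amplification and any writes made by the host itself, so it is an illustration rather than an endurance estimate.

```python
# Figures from the thread: iostat total and uptime on the live VM,
# and the capacity of the SSD it runs on.
DRIVE_CAPACITY_GB = 128
written_mb = 2500
uptime_days = 10

mb_per_day = written_mb / uptime_days
# Days of guest writes needed to add up to one full drive capacity.
# Assumption: ignores write amplification and host-side writes.
days_per_full_drive_write = DRIVE_CAPACITY_GB * 1000 / mb_per_day

print(f"{mb_per_day:.0f} MB/day")
print(f"{days_per_full_drive_write:.0f} days per drive capacity written")
```

At this rate the guest alone takes well over a year to write the equivalent of the drive's capacity once, which fits the view that the concern here is unexplained activity rather than imminent wear.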

                                            Meanwhile, my test pfsense setup with no routing traffic has been very silent on disk.
                                            I updated it to Proxmox 6.1-5 (from 6.0) and it was still fine.
                                            I took my live pfSense XML and restored it on the test configuration (I had to shift around some IP addresses and interfaces to keep things legit) and it continued to be disk silent.

                                            So it seems like the disk activity requires some network activity. Unfortunately my test VM machine doesn't have multiple NICs, so I may have to configure some bridges to mimic more network traffic.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.