Hard drive activity light on solid after upgrade to 2.4.0
-
I have been running a custom-built PC as a pfSense firewall for several years. Yesterday I did a full reinstall of 2.4.0 using ZFS and restored my config once everything was back up. Since then, my hard drive activity light has been on solid. It sometimes changes in intensity, almost a flicker, but it never goes off. Before the upgrade everything worked normally and the light blinked only on activity. How can I find out whether it's on solid because of constant disk activity, or whether it's only on because something works differently with ZFS and can safely be ignored?
-
You can look at the output of these commands to see disk activity:
top -m io
iostat
vmstat
systat -iostat
systat -vmstat
-
When I run top -m io, I get this:
last pid: 82822;  load averages:  0.67,  0.58,  0.37  up 0+00:19:04  15:22:02
48 processes:  1 running, 47 sleeping
CPU:  2.7% user,  0.6% nice,  9.8% system,  0.2% interrupt, 86.7% idle
Mem: 113M Active, 8604K Inact, 370M Wired, 152K Buf, 7314M Free
ARC: 122M Total, 119K MFU, 117M MRU, 2868K Anon, 341K Header, 1445K Other
     33M Compressed, 89M Uncompressed, 2.72:1 Ratio
Swap: 2048M Total, 2048M Free

  PID USERNAME   VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
33974 root       2603     36      0      0      0      0   0.00% bsnmpd
  340 root        371      0      0      0      0      0   0.00% devd
37009 root         94      0      0      0      0      0   0.00% sh
52341 root         14      1      0    240      0    240 100.00% syslogd
10195 root         62      3      0      0      0      0   0.00% openvpn
38590 root          2      0      0      0      0      0   0.00% top
13657 root          4      1      0      0      0      0   0.00% filterlog
12214 root          6      2      0      0      0      0   0.00% openvpn
22510 root         18      0      0      0      0      0   0.00% dpinger
24144 root          2      0      0      0      0      0   0.00% sshd
25013 root          2      0      0      0      0      0   0.00% ntpd
  307 root          2      0      0      0      0      0   0.00% php-fpm
32738 root          0      0      0      0      0      0   0.00% php-fpm
30380 root          0      0      0      0      0      0   0.00% radvd
18729 nobody        0      0      0      0      0      0   0.00% dnsmasq
This is a snapshot of what I'm seeing most of the time. syslogd is always there, usually with somewhere between 100 and 250 writes, though sometimes more or less. Is this normal? iostat says my disk is always at around 0.17 MB/s of activity. And here's a snapshot of systat -vmstat:
2 users Load 0.46 0.51 0.42 Oct 13 15:29
Mem usage: 6%Phy 2%Kmem
Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 129064 8076 1131596 11048 7487448 count
All 134080 13088 1152916 32356 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt ioflt 5368 total
51 9076 789 83k 2163 27 cow 1 ehci0 16
zfod 2 ehci1 23
9.0%Sys 0.3%Intr 2.6%User 0.0%Nice 88.1%Idle ozfod 828 cpu0:timer
| | | | | | | | | | %ozfod 1056 cpu1:timer
=====> daefr 453 cpu2:timer
1 dtbuf prcfr 868 cpu3:timer
Namei Name-cache Dir-cache 212027 desvn totfr 684 em0 264
Calls hits % hits % 881 numvn react 724 em1 265
1883 1883 100 268 frevn pdwak hdac0 266
48 pdpgs xhci0 267
Disks md0 ada0 cd0 pass0 pass1 pass2 intrn hdac1 269
KB/t 0.00 15.15 0.00 0.00 0.00 0.00 379596 wire 752 ahci0 272
tps 0 7 559 0 0 0 117092 act
MB/s 0.00 0.10 0.00 0.00 0.00 0.00 8672 inact
%busy 0 3 55 0 0 0 laund
7487448 free
I assume ada0 is my disk. Any advice would be appreciated.
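-
One way to read the Disks block of that systat -vmstat snapshot is to line the numbers up per device and compare them. The sketch below does that in Python. The values are copied from the snapshot; it assumes the flattened columns match the device names in order, and the helper name throughput_mb_s is made up for illustration.

```python
# Values copied from the "Disks" block of the systat -vmstat snapshot.
# Assumption: each row's numbers line up with the device names in order.
devices = ["md0", "ada0", "cd0", "pass0", "pass1", "pass2"]
kb_per_transfer = [0.00, 15.15, 0.00, 0.00, 0.00, 0.00]
transfers_per_sec = [0, 7, 559, 0, 0, 0]
percent_busy = [0, 3, 55, 0, 0, 0]


def throughput_mb_s(kb_t, tps):
    """Approximate throughput: KB per transfer times transfers per second, in MB/s."""
    return kb_t * tps / 1024


rows = [
    {"device": d, "kb_t": k, "tps": t, "busy": b, "mb_s": throughput_mb_s(k, t)}
    for d, k, t, b in zip(devices, kb_per_transfer, transfers_per_sec, percent_busy)
]

for row in rows:
    print(f"{row['device']:6} tps={row['tps']:4} busy={row['busy']:3}% "
          f"approx {row['mb_s']:.2f} MB/s")

# Device with the most transfers per second in this snapshot.
busiest = max(rows, key=lambda r: r["tps"])["device"]
ada0_mb_s = next(r["mb_s"] for r in rows if r["device"] == "ada0")
print("most transfers per second:", busiest)
```

Under that column assumption, ada0 works out to roughly the 0.10 MB/s that systat itself reports, and the device with the most transfers per second is not ada0, which may be worth a second look.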
-
The 100% in top is misleading. It doesn't mean syslogd is using 100% of the disk's write capacity, only that, of everything doing I/O at that moment, syslogd accounts for 100% of it.
Something is logging all of that, though. You'll need to check your log files and see which one is receiving all of the messages.
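-
To make the PERCENT column concrete, here is a small Python sketch of the "share of current I/O" reading described above. The TOTAL values come from the top -m io snapshot earlier in the thread; the function io_share and the second scenario are hypothetical, and this only illustrates the arithmetic, not how top is implemented internally.

```python
# TOTAL column from the top -m io snapshot (only a few processes shown).
io_totals = {"bsnmpd": 0, "devd": 0, "sh": 0, "syslogd": 240, "filterlog": 0}


def io_share(totals):
    """Return each process's share of all I/O operations, in percent."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Nothing is doing I/O, so every share is zero.
        return {name: 0.0 for name in totals}
    return {name: 100.0 * count / grand_total for name, count in totals.items()}


shares = io_share(io_totals)
print(f"syslogd: {shares['syslogd']:.2f}%")

# Hypothetical second scenario: another process doing the same amount of I/O.
# syslogd's share halves even though its own write count is unchanged.
busier = dict(io_totals, filterlog=240)
busier_shares = io_share(busier)
print(f"syslogd with a second writer: {busier_shares['syslogd']:.2f}%")
```

The percentage says who is doing the I/O, not how hard the disk is working; iostat or systat give the absolute numbers.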
-
Yeah, I sort of figured that. I was assuming my write speed would be more than 0.17 MB/s :)
After I posted all that, I started looking into the logs. I see a lot of entries in the firewall log that look like this:
Oct 13 15:54:21 LAN [fe80::9af1:70ff:fe46:d201]:5353 [ff02::fb]:5353 UDP
All of them are using IPv6, which I thought I had disabled, though I haven't yet found where I configured that years ago. All of them are UDP, to either port 5353 or 5355. They show up on both the LAN interface and bridge0, which is a bridge I created for an OpenVPN TAP connection, though nobody is currently connected to the VPN. I have never seen this before and I'm not sure what is going on.
At this point I don't seem to have a hardware issue, so maybe this post should be moved?
-
5353 is mDNS.. Most likely just utter noise if you're not running IPv6.. So either disable IPv6 at the clients sending it.. Or set it not to log that nonsense.
5355 would be LLMNR.. Yet again a noisy nonsense multicast name resolution protocol that could just be turned off at the client if you're not actively using it.
You have Windows machines I take it - they are some noisy POS if you ask me ;)
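-
If you want to see how much of the firewall log is this kind of multicast chatter, a quick tally by destination port helps. The Python sketch below assumes log lines shaped like the one posted above; only the first sample line is from this thread, the other two are made up for illustration, and the function name tally_by_destination_port is hypothetical.

```python
import re
from collections import Counter

# Well-known UDP ports mentioned in this thread.
PORT_NAMES = {5353: "mDNS", 5355: "LLMNR"}

LOG_LINES = [
    # From the thread:
    "Oct 13 15:54:21 LAN [fe80::9af1:70ff:fe46:d201]:5353 [ff02::fb]:5353 UDP",
    # Made-up lines in the same shape, for illustration only:
    "Oct 13 15:54:22 bridge0 [fe80::9af1:70ff:fe46:d201]:5353 [ff02::fb]:5353 UDP",
    "Oct 13 15:54:23 LAN [fe80::9af1:70ff:fe46:d201]:61234 [ff02::1:3]:5355 UDP",
]

# Matches "[src]:sport [dst]:dport PROTO" at the end of a line.
ENTRY = re.compile(
    r"\[[0-9a-fA-F:]+\]:(\d+)\s+\[[0-9a-fA-F:]+\]:(\d+)\s+(\w+)\s*$"
)


def tally_by_destination_port(lines):
    """Count log entries per destination port, naming the ports we recognise."""
    counts = Counter()
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            counts["unparsed"] += 1
            continue
        dst_port = int(match.group(2))
        counts[PORT_NAMES.get(dst_port, f"port {dst_port}")] += 1
    return counts


counts = tally_by_destination_port(LOG_LINES)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

If mDNS and LLMNR dominate the tally, turning them off at the clients or not logging them should cut most of the log volume.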
-
Yeah pretty much exclusively Windows. Nothing else has changed in my network in the past day, the new pfSense version is the only thing. I wonder why the sudden increase in logging, and with it disk activity? My desk is right next to the box so I know the old version didn't light the disk like this. I'll look into ignoring the logging for those items on Monday, thanks.
-
I think I need a little more help in figuring this all out. When I click on the X next to the log entry to find out why it is being blocked, I see this message:
The rule that triggered this action is: @7(1000105585) block drop in log inet6 all label "Default deny rule IPv6"
But when I go to my firewall rules, under LAN rules I see:
0 /0 B IPv6 * LAN net * * * * none Default allow LAN IPv6 to any rule
That was the only related rule I see and to me it appears to be allowing IPv6 traffic on the LAN interface, not blocking. Am I understanding how this works wrong, or is there an issue?
-
Check the main IPv6 switch at System > Advanced, Networking tab.
If "Allow IPv6" is unchecked it doesn't matter what IPv6 rules you add, it will be blocked by default.
-
Hmm. Allow IPv6 is checked.
-
So it seems I stumped everyone! I didn't have any more time to look at this today but need to see if I can figure this out again tomorrow. Any other ideas where I should look for issues?
-
I found the issue; it seems I was on the wrong track. Unplugging my CD-ROM drive made the light go off, and now I see normal activity. Apparently the system was constantly checking the status of the drive, and that kept the hard drive activity light lit. I am open to suggestions on how to stop this, but I obviously don't have a major issue, and I can either ignore it or unplug the seldom-used optical drive.
-
Probably SNMP. See https://redmine.pfsense.org/issues/6882
-
Yep, that looks like it. I plugged the optical drive back in and bsnmpd started using a lot of CPU. I hadn't checked that before; guess I should have. Reading the bug report, it seems I can just leave a disc in the drive to work around this for now, and I can confirm that works. For an actual fix, it seems my best bet may be to wait for 2.4.1, correct?
-
Unplug the drive, put media in the drive, or disable the hostres module in SNMP. Any of those will mitigate the issue. It's solved in 2.4.1, so it should be better after the upgrade.
-
Good deal. I appreciate all the help with the issue, jimp!