Pfsense system crash
-
@stephenw10 said in Pfsense system crash:
That's a good test. You should be able to see a memory leak in ntop-ng in the top output though.
It could also be ntop struggling due to traffic from some other issue.
Steve
With ntop-ng disabled I have not had any issues so far. A YouTube video I watched recommended not running ntop-ng all the time anyway, since it is somewhat of a resource hog.
Thanks.
-
@stephenw10 I have been running pfSense for 22 days now, but it has just stopped handing out IP addresses and I cannot log in. My cable modem says everything is fine, so it's pfSense that is not running. Ctrl+T does respond with
"load: 0.00 cmd: login 53946 [tx->tx_sync_done_cv] 1881545.20r 0.00u 0.00s 0% 2708k"
Anything else I should do? I know a reboot will fix it for a while.
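For anyone reading that status line later, here is a minimal Python sketch that splits it into fields. The field layout (load average, command, PID, wait channel in brackets, real/user/system time, CPU percentage, resident memory) is my assumption based on how the FreeBSD Ctrl+T (SIGINFO) line usually looks, and `parse_status` is a hypothetical helper name, not part of pfSense.

```python
import re

# Assumed layout of the FreeBSD Ctrl+T (SIGINFO) status line:
# load average, command, PID, [wait channel], real/user/system time,
# CPU percentage, resident memory in kilobytes.
STATUS_RE = re.compile(
    r"load: (?P<load>[\d.]+) cmd: (?P<cmd>\S+) (?P<pid>\d+) "
    r"\[(?P<wchan>[^\]]+)\] (?P<real>[\d.]+)r (?P<user>[\d.]+)u "
    r"(?P<sys>[\d.]+)s (?P<cpu>\d+)% (?P<rss>\d+)k"
)


def parse_status(line):
    """Hypothetical helper: turn one status line into a small dict."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError("not a recognised status line")
    fields = match.groupdict()
    return {
        "cmd": fields["cmd"],
        "pid": int(fields["pid"]),
        "wait_channel": fields["wchan"],
        # Real (wall-clock) time is reported in seconds; convert to days.
        "real_days": float(fields["real"]) / 86400,
        "cpu_percent": int(fields["cpu"]),
    }


line = ('load: 0.00 cmd: login 53946 [tx->tx_sync_done_cv] '
        '1881545.20r 0.00u 0.00s 0% 2708k')
info = parse_status(line)
print(info["wait_channel"], round(info["real_days"], 1))
# prints: tx->tx_sync_done_cv 21.8
```

The 21.8 days of real time lines up with the 22 days of uptime, and the wait channel appears to be a ZFS transaction-group sync wait, which would suggest the login process is stuck waiting on disk writes rather than on the CPU.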
-
Here is a pic of the monitor connected and the output before the crash.
-
@vcr58 said in Pfsense system crash:
Here is a pic of the monitor connected and the output before the crash.
Your issue is related to write I/O to the system disk. It seems your disk goes missing or stops responding. This also explains why you get better stability without ntopng, as that package in particular does A LOT of write I/O.
Strange that it only happens with ZFS and not UFS. But ZFS uses a very different write strategy and is quite write-intensive in bursts compared to UFS. So it would seem your SSD/eMMC/HDD is the culprit. Please remember that eMMC in particular and ntopng are not a good match, as the write endurance could be worn out in a matter of a year or two.
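To put the "year or two" figure in perspective, here is a rough back-of-the-envelope sketch in Python. The endurance rating and daily write volume below are made-up illustration values, not measurements from this system or from any datasheet, and `years_until_worn_out` is a hypothetical helper. It also ignores write amplification, which would shorten the result further.

```python
def years_until_worn_out(rated_tbw, gb_written_per_day):
    """Rough lifetime estimate for a flash device.

    rated_tbw: endurance rating in terabytes written (assumed value).
    gb_written_per_day: average host writes per day in gigabytes (assumed value).
    """
    if gb_written_per_day <= 0:
        raise ValueError("daily writes must be positive")
    days = rated_tbw * 1000 / gb_written_per_day  # 1 TB taken as 1000 GB
    return days / 365


# Hypothetical example: a small device rated for 20 TBW,
# with a write-heavy package producing 40 GB of writes per day.
print(round(years_until_worn_out(20, 40), 2))
# prints: 1.37
```

Plugging in the real rating from the drive's datasheet and the write volume observed on the firewall would give a more meaningful number.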
-
Mmm, that looks like a bad/failing disk. It should never stop responding like that.
I would replace it and retest when you can.
Steve
-
@stephenw10 - I suppose it could be the SSD going bad although I never get any errors when running a scan on it. What @keyser said does make sense to me as well.
I do have an older WD Green SSD, so I will try that one and see what happens.
Thanks
-
I have an SSD here that continually throws CAM errors like that when pfSense is running from it. It has never actually failed during use but there's no way I would use it in any sort of critical role. It has failed to install before and I consider it dead.
Steve
-
@stephenw10 I have the same pfSense setup on a different SSD now. It is probably a better drive even though it's older (I think).
When the first drive stopped working, the only place that showed CAM errors was the monitor connected to the pfSense PC. After a reboot I could not find the same info anywhere in the pfSense logs. Would CAM errors show up in "Status/System Logs/System/General" in the web GUI if the system was still running?
-
It's common to see drive errors like that on the console only because often logs cannot be written by that point.
-
@stephenw10 After replacing the SSD I have not seen any errors after 4 days of uptime, even with ntopng running, so the problem was indeed the bad SSD.
Thank you so much for your help in troubleshooting my issue!
-