Having to restart pfsense every few hours - drops all connections
-
Set up a pfsense on an Optiplex 3070 SFF about a week ago.
Ran fine the first few days and I didn't really mess with it much, but 2-3 days ago it started randomly losing all connectivity (can't ping pfsense or any other IP because the pfsense box is running DHCP/DNS). I checked the system logs but nothing is obviously failing. I'm also not sure exactly where to look, so it's really hard to pinpoint any relevant logs.
I'd just be streaming YouTube or watching a movie, and the internet suddenly drops.
I'd check the local ping and sure enough, can't get a hit on anything.
This sometimes happens every few hours, sometimes within an hour, sometimes after a day or so.
Restarting the box resolves the issue for a few hours or so.
Details on the setup:
- The pfsense box is an Optiplex 3070 SFF running an i5-9500, 8GB of RAM, 128GB SSD, authentic Intel I350-T4V2.
- Currently running 2 eero 6 Pros in bridge mode for wireless.
- Only packages I currently have downloaded are iperf and Tailscale. Don't want to add more at the risk of making things worse...
- Single WAN, single LAN. Nothing complicated, no VLANs setup yet.
- Only hitting around 2% CPU, 8% RAM usage at idle, spikes to MAYBE 15-20% if I run something like a full speed test.
- Firmware update was one of the first things I did when I booted up for the first time.
Has anyone run into this problem with a new setup?
I've tried so many different random things I've found online like checking/unchecking hardware offloading, disabling hyperthreading, disabling gateway monitoring, changing probe intervals, etc.
-
@pnadd Can you plug in a monitor and see if an error is output to the console at that time?
-
@SteveITS Heh that was actually my last resort because my monitors are all being used and the cables are all tied with velcro every few feet...
It's the step I KNOW I have to do and just kept pushing off because it's going to be a huge PITA to take apart and put back together
I'll do it next time it cuts out. Just hoped this was a somewhat common issue at first with a relatively simple fix.
-
@SteveITS this is pretty much all there is when it crashes. I’m not sure how much it actually has to do with the crash because it comes up shortly after doing a reboot and the network runs fine for a few hours before disconnecting again
Also, I’m using this SSD if there is some sort of compatibility issue I’m unaware of: Patriot 128GB
-
Yeah that's a drive or drive controller error.
Make sure you don't have any power saving options enabled for the drive in the BIOS.
Steve
-
@stephenw10 Thanks! I shut off every power saving function I could find in the BIOS.
ASPM looks to be the biggest offender as that reduces PCIe device power, but we’ll have to see.
But would that type of power saving to the drive cause the symptoms I was seeing with the network just basically going inactive and having to restart?
Maybe it was lowering the power to the NIC?
-
Yes, eventually. If pfSense loses disk access it won't stop routing immediately. However, over time services start to fail as they require cache access etc. Usually DNS, then DHCP, then the webgui etc.
The drive could be overheating perhaps? ASPM does seem a likely suspect though.
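For anyone who can't easily attach a monitor, a rough way to look for the same kind of drive or controller errors is to save the `dmesg` output or system log to a text file and scan it. The sketch below is only an illustration: the patterns are assumptions based on typical FreeBSD disk messages (the console output from this thread isn't reproduced here), and `find_disk_errors` is a made-up helper name.

```python
import re
from typing import Iterable, List

# Assumed patterns for FreeBSD-style disk/controller errors.
# They are illustrative guesses, not an exhaustive or official list.
DISK_ERROR_PATTERNS = [
    r"CAM status",                       # CAM layer reporting a failed command
    r"ahcich\d+: Timeout",               # AHCI channel timeout
    r"g_vfs_done\(\).*error",            # filesystem I/O that came back with an error
    r"\b(ada|da|nvd|nvme)\d+.*(timeout|lost device|removed)",
]

_DISK_ERROR_RE = re.compile(
    "|".join(f"(?:{p})" for p in DISK_ERROR_PATTERNS),
    re.IGNORECASE,
)


def find_disk_errors(lines: Iterable[str]) -> List[str]:
    """Return the lines that match any of the assumed disk-error patterns."""
    return [line.rstrip("\n") for line in lines if _DISK_ERROR_RE.search(line)]
```

Something like `find_disk_errors(open("dmesg.txt"))` would then list the suspicious lines. No hits doesn't prove the drive is healthy, since once the disk is gone the errors may never make it into a log file.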
-
@pnadd said in Having to restart pfsense every few hours - drops all connections:
I’m not sure how much it actually has to do with the crash
On an OS like FreeBSD, Linux, or any recent (if not every) Microsoft OS, if the (boot) drive goes away, it's panic all over the place.
"A reboot and all is well" is the good side of things. Sooner or later it will take the entire file system with it, and then there's no booting anymore.
Better to investigate this drive, test with another one, things like that.
-
@stephenw10 @Gertjan It's been around 24 hours since I switched off all of the power saving modes, and everything is chugging along perfectly with zero errors or logs on the console.
I thought I had configured something wrong and would have to do a fresh reinstall and reconfig. Thank you so much!
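Since the drops used to come anywhere from an hour to a day apart, it can help to leave a small reachability logger running on a LAN client to confirm the fix over a longer stretch. This is just a sketch under assumptions: it probes the webgui over TCP, and the address `192.168.1.1` and port `443` are placeholders for whatever the box actually uses. `tcp_reachable`, `outage_windows` and `monitor` are hypothetical helper names.

```python
import socket
import time
from typing import Iterable, List, Optional, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, ok) samples into (start, end) outage intervals.

    An outage starts at the first failed sample and ends at the next
    successful one; an outage still open at the end is closed at the
    last timestamp seen.
    """
    windows: List[Tuple[float, float]] = []
    start: Optional[float] = None
    last_t: Optional[float] = None
    for t, ok in samples:
        if not ok and start is None:
            start = t
        elif ok and start is not None:
            windows.append((start, t))
            start = None
        last_t = t
    if start is not None and last_t is not None:
        windows.append((start, last_t))
    return windows


def monitor(host: str, port: int, interval: float = 30.0, count: int = 10) -> List[Tuple[float, float]]:
    """Probe host:port `count` times, `interval` seconds apart, and return outages."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), tcp_reachable(host, port)))
        time.sleep(interval)
    return outage_windows(samples)
```

For example, `monitor("192.168.1.1", 443, interval=30, count=2880)` would cover roughly 24 hours. The outage timestamps can then be lined up against the console or system log.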