SG-1000 Intermittently Losing Internet Connection, Human-Intervention Required
-
Hi guys
I bought an SG-1000 a couple of months ago, and it's my first experience with pfSense. It's been great, and I love it, apart from one issue I've been having:
Every so often (typically two to three times a day, at seemingly random times), I lose internet connectivity on all devices. The web interface of the gateway also becomes inaccessible.
So far, the only fix has been to re-plug WAN, re-plug LAN, or power cycle the unit, after which the internet connection is restored and the GUI is accessible again. The connection never comes back on its own if I don't do one of those.
The web GUI doesn't always show errors, but the ones that have come up recently after regaining access state "(pf_busy) PF was wedged/busy and has been reset" and "(Filter Reload) There were error(s) loading the rules: pfctl: DIOCXCOMMIT: Device busy - The line in question reads 0." (I took out the square brackets around the zero so they didn't get mangled by the forum HTML.)
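For anyone wanting to check how often those two notices turn up, here is a minimal Python sketch. It assumes the log or notice text has already been exported as plain-text lines; the function name `count_notices` and the sample lines are hypothetical, and only the two message fragments come from the notices quoted above.

```python
# Message fragments copied from the two notices quoted above.
PATTERNS = {
    "pf_busy": "PF was wedged/busy and has been reset",
    "filter_reload": "pfctl: DIOCXCOMMIT: Device busy",
}


def count_notices(lines):
    """Count how many lines contain each known message fragment."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, fragment in PATTERNS.items():
            if fragment in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input; real use would read lines from an exported log file.
    sample = [
        "(pf_busy) PF was wedged/busy and has been reset",
        "(Filter Reload) There were error(s) loading the rules: "
        "pfctl: DIOCXCOMMIT: Device busy - The line in question reads 0.",
        "some unrelated line",
    ]
    print(count_notices(sample))
```

Comparing the counts against the times the connection dropped might show whether the notices line up with the outages, though that is only a rough check.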
I've seen other forum and reddit threads with similar issues suggesting that disabling gateway monitoring does the trick, or that a switch in between the modem and gateway can sort it. I've tried both, but I'm still experiencing the problem. I'm trying to keep power usage down where possible, so I'd like to avoid needing a switch in between the two anyway.
I've also temporarily switched from using my ISP's modem to a Draytek Vigor 130, but no difference.
I've been making sure to install updates very often (typically at least once per week), as 2.4 was in beta. I'm now on 2.4.0-RC since yesterday.
I'm running some very specific firewall rules to whitelist certain external IP alias groups on certain ports. I've also recently installed pfBlockerNG, though I haven't edited anything in it since the install.
My connection speed is about 75Mbps, and at times I can have multiple devices maxing out the bandwidth with different protocols. I've noticed that the CPU usage can sometimes remain between 90% and 100%, so I'm wondering if the device just isn't able to cope with the throughput?
I'm afraid because of my noobness with pfSense, I'm not sure where best to look in logs to check what's been happening. If anyone is able to suggest what next steps to investigate, that would be much appreciated, thanks!
-
Please submit a ticket to https://customercare.netgate.com/ and reference this thread.
Thank you.
-
Thanks ivor - ticket submitted.
-
Just wondering if you are using the DNS resolver.
If you are, has this been fixed? (I've been seeing the same issues for a long time, and it recently got much worse.)
Be good to know!
-
Hey deadmalc.
I'm not using pfSense's inbuilt DNS resolver, no. Instead, I have the DNS for the network set to a local IP (which is a Pi-hole device).
The issue isn't resolved yet, but I need to get back to the support chaps with more info - they're waiting on me :)
I updated to the latest release last night, but the issue is still present.
I have wondered if the issue has something to do with clients coming and going on and off the network as the issue seems to occur roughly when someone arrives in the building, a device comes online or a device goes offline. But that could just be me imagining things.
I'm using a Netgear router in WAP mode as my access point, and am replacing that next week. Several reasons for doing so, but will be another thing ruled out if the issue still persists afterwards.
I'll post back here if there are any developments!
-
For deadmalc and anyone that has a similar issue, I can gladly report that removing the Netgear (and a Linksys AP) from the network immediately resolved the issue!
I actually removed both APs at the same time and haven't yet reinstated either one to fully test which was causing the issue, as I'm still getting used to having 24/7 internet access :)
The Netgear was an R7500 running in AP mode. I have replaced it with a UniFi AP, and the issue with the SG-1000 has not happened since - well over a week now.
I assume the SG-1000 was being flooded with some kind of packets or traffic from the Netgear, causing it to fail. Hope this info helps.
-
Thanks for that. I have a similar setup: an R8000 in AP mode.
Since I stopped using the resolver and switched to the forwarder, I've not had any issues. However, I think your diagnosis makes sense.
Once I get the funds, a proper NAS and Wi-Fi AP will be on the cards.
Seems really weird tbh. But I'm glad pfSense isn't at fault. Thanks for the update.
-
Aha, that's interesting that you also have a Netgear in AP mode.
When you mentioned it before, I didn't fully understand the resolver/forwarder setup.
I've checked mine, and the resolver is turned on though I don't actually use it, since all the devices on my network are using my Pi-hole for DNS, set either manually on each device or via DHCP.
I guess I could probably turn both the resolver and forwarder off completely. Glad you've found a way around it.