Network suddenly slow
-
I know this could be a lot of things, but I suspect my SG-1100.
I had a pfSense firewall on my LAN for many years and just upgraded, in February, to an SG-1100 so I was on more recent software. At the time, I posted a number of times for help in fixing an issue that turned out to be a DNS issue that we could never fully track down and label.
I have internet through a cell data reseller. The ISP router has a LAN connection with a DHCP server on it. It connects via ethernet directly to the WAN interface on my SG-1100. Nothing else is on that no man's land area. The SG-1100 is the DHCP server for my LAN (using the DNS Resolver). Everything except mobile devices on the LAN is hardwired through ethernet cables in the walls (new construction) to a TP-Link switch.
I had problems with some YouTube videos this evening and things got slower and slower. I tried logging into my SG-1100, but it was glacial. After entering info and clicking on "Log In," it took over 30 seconds for the next page to load. Navigating on the SG-1100 pages is slow. I rebooted and could log in faster, but response was still slow.
Ping times to the firewall are 1/3 of a second or more. That's true of pinging anything on the network.
The only other servers I have on my network (other than the DHCP server on my SG-1100) are two file servers, one on Linux, one on a Mac mini. Those provide NO network services other than file serving.
Things were fine, for years and years, on the old pfSense firewall. I upgraded and it took time to work out the DNS issues, but then things worked normally. But lately, I'm getting more and more delays connecting to anything and I have that long delay with any pings.
Could this be the SG-1100 creating a problem that is slowing down the LAN? (I haven't changed a thing on the LAN in weeks.)
What else do I need to check to see what could be suddenly causing such long delays?
-
You need to log into the firewall and grab some logs and post results here. There could literally be hundreds of things that could lead to the problems you describe.
When troubleshooting with computers (and a firewall is a computer), you always start with getting logs unless the hardware is fried and the computer won't even start. So if you want help, the first thing needed is information from the patient (the firewall in this case). Look in the system log to see what is showing.
Another thing to help posters with troubleshooting ideas is to give a detailed synopsis of your setup. You say SG-1100. What version of pfSense is on it -- 2.4.4_p3 or 2.4.5? Did you install any of the extra packages like pfBlockerNG, maybe you enabled the DNSBL option, what about an IDS/IPS or Squid? Any of those installed?
Are you 100% sure your Cellular ISP is not having issues? Do you have another method to test that like maybe a direct laptop connection to whatever modem device they provide? If that connection has issues, then it will be on your ISP to solve. If that connection is fine, then the problem is on your side.
-
Version info:
2.4.4-RELEASE-p3 (arm64)
built on Thu May 16 06:01:19 EDT 2019
FreeBSD 11.2-RELEASE-p10
No extra packages on it at all.
I'm about 99% sure it's not the ISP:
- I get good latency when I ping external sites from the SG-1100
- When I ping internally, I get high latency, and the latency for external sites is only slightly more than what I get internally
- The results from the monitor IP consistently show low latency (17.16 ms average), compared to never getting less than 1/3 of a second when I ping internal addresses.
When I ping internal addresses, sometimes I ping by hostname, sometimes by IP address. It doesn't seem to make a difference in latency. Again, I know this could be other issues, but I never had this kind of problem until I hooked up the SG-1100.
I'm wondering if the best thing would be to load in the config from the original, old firewall, and use all of it. (I've generally just used the DHCP config from it.) I know the old version was from before the DNS Resolver was added and it only had the DNS Forwarder, using dnsmasq. I know dnsmasq is still available, but from what I understand, using it (the Forwarder) instead of the Resolver is discouraged.
Which logs should I post? And I'm not clear how to download or save them as a file. Just in case it helps, I've included a screenshot of the system log. I'm not using IPSEC or any VPNs, so I'm not clear why they are restarted, unless it's part of a normal process.
-
So is your TP-Link switch a dumb switch or a managed one? What happens with ping times if you connect a machine (say a laptop or PC) directly to the LAN port of the SG-1100? In other words, try taking the TP-Link out of the picture for a test.
You say ping times are good when pinging external hosts from the firewall. So I take it that for those tests you are using the tools under the DIAGNOSTICS menu in pfSense. I do see some gateway alarms on your WAN side. Those could be actual physical packet loss, or it could just be the Google DNS farm (the 8.8.8.8 address you are monitoring) that gets busy and thus slows the response to pings as ICMP replies will be lower priority for them.
The SG-1100 has an internal switch. The out-of-the-box defaults should be okay, but you say you imported a partial config. Could be that as a result something the SG-1100 needs for its internal Marvell switch configuration is missing. If it's not a big deal, I would try resetting the SG-1100 to factory defaults and then configuring for your network from scratch WITHOUT importing your old configuration. That's one test you could try.
Other than the gateway alarms, which I don't think are causing slow pings to your internal firewall interface, your log looks fairly clean.
My first guesses at this point would be a misconfiguration of the internal Marvell switch in the SG-1100, possibly a bad network cable connecting the SG-1100 LAN port to your TP-Link, or maybe an issue with the TP-Link switch itself. I guess it could potentially be a problem with the physical LAN port on the SG-1100. On the STATUS > INTERFACES menu, are any physical layer errors showing for your LAN port? Do the speed and duplex settings look correct for your TP-Link switch connection (typically 1 Gig and full-duplex these days)?
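To compare the test paths discussed here (laptop straight into the SG-1100 LAN port, through the TP-Link, and out to an external host), it can help to collect the average round-trip time for each in the same way. The following is only a minimal sketch: it assumes a Unix-style `ping` that accepts `-c` and prints a `min/avg/max/...` summary line, and the function names and example addresses are hypothetical.

```python
import re
import subprocess

# Matches the summary line of Unix-style ping, for example
#   Linux:         rtt min/avg/max/mdev = 0.412/0.530/0.701/0.095 ms
#   FreeBSD/macOS: round-trip min/avg/max/stddev = 0.4/0.5/0.7/0.1 ms
SUMMARY_RE = re.compile(
    r"min/avg/max/[\w-]+ = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_avg_rtt_ms(ping_output):
    """Return the average RTT in ms from ping's summary line, or None."""
    match = SUMMARY_RE.search(ping_output)
    return float(match.group(2)) if match else None


def measure(host, count=10):
    """Run the system ping against host and return its average RTT in ms."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means some or all probes were lost
    )
    return parse_avg_rtt_ms(result.stdout)


def compare(targets):
    """Print the average RTT for each labelled target (label -> address)."""
    for label, host in targets.items():
        avg = measure(host)
        shown = "no reply" if avg is None else f"{avg:.1f} ms"
        print(f"{label:<20} {host:<16} {shown}")
```

Calling something like `compare({"firewall LAN": "192.168.1.1", "file server": "192.168.1.10", "external": "8.8.8.8"})` (placeholder addresses) once with the switch in the path and once without gives two sets of numbers to set side by side.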
-
Sorry I didn't respond quickly. With the lockdown in our state, I'm finally getting time to get landscaping done because I have daylight all day, and I'm not stuck to weekends. I'm spending all day on that, when I can, and it leaves me wiped out! (This is stuff that has to be done for erosion control, so it's good I have time to do it now.)
I've never tried to manage my TP-Link switch. If it's managed, I'd have to look up the IP address in the DHCP lease list to find out where to log in. (Side point to that: That means it's still got all the settings it had under my old firewall - if it can be managed at all.)
I don't have a laptop, but my wife has one that needs TLC to keep it working. I'll be trying that to ping directly to the firewall as soon as I can get it set up and behaving.
I do know that sometimes 8.8.8.8 does not respond to pings. Previously, I assumed it always responded and that led to troubleshooting issues.
A couple thoughts on the configuration, since that could be an issue. The first is that I would think I could go through the XML file and remove any config info that I don't want to import. But, also, after the last reset of the SG-1100, I told pfSense to import only the DHCP configuration settings. I would assume that's "clean," in other words, if I pick only that one thing to import, it won't get other things at all. I hope that's right.
I'll read up on the Marvell switch to see what I can find about that. I'll swap out the cable between the SG-1100 and the TP-Link switch and give it a day, then plug it into another TP-Link port - that lag between the two could show me it's one or the other.
It's possible it could be the link itself. I've had it for a good while and it generally sits in one place, but I have had to move it a few times as I change things around. I'll check all the info you gave me about the LAN port this evening to see what I can find.
I did pull the SG-1100 and put the old Soekris net5501 with the old pfSense back on and I've left it for a day. Things are working better with that, so that does convince me it's something to do with the SG-1100.
That makes me wonder. Since the net5501 is working well now, maybe I should:
- Make a backup of the net5501 now, since it's working well
- Edit the XML file to remove any settings pertaining to any switch
- Perhaps edit out any settings I don't recognize (and did not intentionally set on the old firewall)?
- Do a factory reset on the SG-1100
- Import the settings from the net5501
Along with that, as I mentioned, on the SG-1100, I'm using the DNS Resolver and not the Forwarder, but on the old one I'm using the Forwarder. Maybe, when I import from the old config after a factory reset, I should use the DNS Forwarder instead and just use the old config for that?
I'm beginning to wonder if my issues could be linked to some setting in the DNS Resolver that I didn't handle right and, if my old DNS Forwarder worked, then I should just use that. ("If it ain't broke, don't fix it.") The reason I haven't done this is I remember reading, somewhere, that dnsmasq is only there for backwards compatibility, which makes me worry that if I use it on the new system, at some point, it could be removed.
-
By default, pfSense-2.4.4 and higher will use the DNS Resolver. And using the Resolver is the better choice. But you have to then be sure you disable the Forwarder.
Unless you have some really elaborate DHCP configuration, I suggest doing the "factory defaults" reset on your SG-1100 and then manually configuring DHCP on the SG-1100 instead of trying to import an old setup from your previous firewall. Importing portions of a config file can be tricky and potentially lead to issues unless you are quite knowledgeable about the various pieces of the config file. For instance, some things like interface names are going to need changing as the SG-1100 has different physical NICs. There may be other changes as well that can trip up an import.
Pinging 8.8.8.8 used to be a good way of testing if your Internet was up and I did it, too, by having my gateway monitor ping Google DNS. However, I started to notice that frequently the 8.8.8.8 address will not respond to a ping or will be very slow to respond. This would cause gateway alarms. And from time to time, it would even cause dpinger to think my gateway was down when it actually was not. After chasing that rabbit for a while, I gave up and just reset my gateway monitor to ping my ISP-provided default gateway. While that won't tell me that I necessarily have a route out to the Internet, it does tell me if my side of the connection all the way to my ISP gateway is good. So I settled for that. You may have to do the same.
-
I wouldn't say my DHCP configuration is complex, but it's lengthy. I have a number of devices that need a specific IP address, either so I can use ssh to reach them or because another device needs access and uses the IP address rather than a host name. (For instance, my DVR software uses two HDHomeRun tuners for over the air TV and it uses the IP address of each device, not the hostname.)
I've mentioned editing the XML file. Maybe I could delete all but the list of DHCP leases, so when I load the config file, it loads nothing but the leases. The rest is easy to set by hand.
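One way to get at just the static leases without importing anything is to read them out of the backup file and then re-enter (or review) them by hand. The sketch below assumes the backup keeps static mappings under `dhcpd/<interface>/staticmap` with `mac`, `ipaddr`, and `hostname` children; that layout is an assumption worth checking against your own config.xml, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_static_mappings(config_xml, interface="lan"):
    """Return a list of static DHCP mappings found in a config backup.

    Assumes the layout <pfsense><dhcpd><lan><staticmap>...</staticmap>;
    verify this against the actual backup file before relying on it.
    """
    root = ET.fromstring(config_xml)
    mappings = []
    for entry in root.findall(f"./dhcpd/{interface}/staticmap"):
        mappings.append(
            {
                "mac": entry.findtext("mac", default=""),
                "ipaddr": entry.findtext("ipaddr", default=""),
                "hostname": entry.findtext("hostname", default=""),
            }
        )
    return mappings


def format_mappings(mappings):
    """Format the mappings as aligned text lines for manual re-entry."""
    return [
        f"{m['ipaddr']:<16} {m['mac']:<18} {m['hostname']}" for m in mappings
    ]
```

Reading the backup in binary mode (`open(path, "rb").read()`) and passing the bytes to `extract_static_mappings` avoids any trouble with the encoding declaration at the top of the file. Nothing here writes to the firewall; it only lists what the old config contains.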
-
You could set up a "ping charter" that pings from pfSense, on the LAN interface, to one of your devices, to measure LAN reachability.
You'll be seeing :
and Status > Monitoring, select Quality and NAS_Ping_test to see the graph.
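If you also want numbers from a workstation to compare against that graph, a series of ping results can be boiled down to loss and latency figures. This is just a rough sketch of the idea, not how pfSense computes its Quality graph; the function name and the convention of using `None` for a lost probe are assumptions made for the example.

```python
from statistics import mean


def summarize(samples_ms):
    """Summarize ping samples; each item is an RTT in ms or None if lost."""
    replies = [s for s in samples_ms if s is not None]
    sent = len(samples_ms)
    lost = sent - len(replies)
    return {
        "sent": sent,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "avg_ms": mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }
```

For example, `summarize([10.0, 20.0, None, 30.0])` reports four probes sent, 25% loss, a 20 ms average, and a 30 ms maximum.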
-
@Gertjan Thank you! I'll use that for testing the LAN issues. Much appreciated!