SG-2440 Upload Speed Limited After a Few Minutes
-
@steve1515 said in SG-2440 Upload Speed Limited After a Few Minutes:
I've also been doing some iperf tests from the command line of my pfSense box. I'm noticing that when I'm getting full bandwidth, the 'Retr' (retransmitted TCP packets) column indicates zero 99% of the time. When I'm being limited to 9Mbps, I get many more occasional re-transmits in the iperf test. These are usually like 1-5, so it's not much, but it seems to come up in almost all tests when my upload is limited. When the upload is back up at 20Mbps, I'm consistently able to get all zeros except for the occasional test. I'm not sure if that's an indicator of anything though??? ... I've put a switch in between the modem and the pfSense WAN. This seemed ok for about 1 hr, but then it went back to 9Mbps upload.
This really sounds like physical Ethernet errors are occurring. Has the 2440 been used in conditions where it might have been exposed to high humidity, noxious gases, excess vibration, overheating, overvoltage, ESD, etc.? Have you tried swapping out the power supply? Have you tried testing speed between two ports on the 2440 configured as LANs?
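In case it helps with tracking the 'Retr' counts from the quoted post across many runs, here is a minimal Python sketch. It assumes the tests are run with iperf3 using its JSON output option (--json), where a TCP report carries the sender totals under end.sum_sent. The function name and the sample numbers are made up for illustration, not taken from any real run.

```python
import json

def summarize_iperf3(report):
    """Return (upload Mbit/s, retransmits) from a parsed iperf3 TCP JSON report."""
    sent = report["end"]["sum_sent"]
    mbps = sent["bits_per_second"] / 1e6
    # "retransmits" is reported for TCP senders; default to 0 if it is absent.
    retransmits = sent.get("retransmits", 0)
    return round(mbps, 1), retransmits

# Hand-written sample with the same shape as the JSON report (numbers are invented).
sample = json.loads('{"end": {"sum_sent": {"bits_per_second": 9.1e6, "retransmits": 4}}}')
print(summarize_iperf3(sample))
```

Logging this pair with a timestamp for every run makes it easier to see whether retransmits really only show up while the upload is limited.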
-
@mer and @gabacho4 Checking out the config is a good point. I'll look through it and see if I notice anything related to speed limiting. I haven't yet tried reloading the config. I was actually thinking of re-building from scratch once the next release comes out in January.
@bPsdTZpW The box is about 4 years old and connected to a UPS with line conditioning. I do have temp/humidity monitoring and in the last year, the max temp was 86 degF and max humidity was 51%.
I think it's a good thought about the power supply, but why would it never have an issue with 200Mbps downloading? The download speed is never limited.
I haven't swapped the LAN port because I've also been doing iperf tests from the firewall itself and that traffic doesn't touch the LAN port. The iperf tests have followed the issue as seen from the LAN devices though, so I feel it's something else.
Note: I probably won't have much time the next couple of days to do any more testing, but will try as I can. I'm interested to know what the source of this issue is... that is if we can figure it out.
-
I think it's a good thought about the power supply, but why would it never have an issue with 200Mbps downloading? The download speed is never limited.
Yeah, it doesn't make much sense, but sometimes hardware problems don't. I can point to several instances over my time as a device-driver developer for new hardware....
I haven't swapped the LAN port because I've also been doing iperf tests from the firewall itself and that traffic doesn't touch the LAN port. The iperf tests have followed the issue as seen from the LAN devices though, so I feel it's something else.
Yeah, I missed that part, so I agree that the LAN ports don't seem to be involved.
-
Mmm, I really can't think of anything you could configure that behaves like this. But after exhausting everything else, testing with a clean config is a good idea.
It 'feels' like a link negotiation issue except that it persists between ports and the intermediate switch made no difference.
-
It's unlikely -- but not unknown -- for devices to misbehave when used with certain other devices, intentionally or not. So on a lark, try changing your pfSense's hostname (system/general setup/hostname) to something quite different from its current value. Then reboot everything.
-
I've now done even more testing. Things are getting interesting and strange...
-
While doing testing, I began to realize that the issue occurs at a specific time which seems to be 58 minutes past each hour! I was able to confirm this a few times. 58 minutes past the hour seems to be when the upload speed gets limited.
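A pattern like this can be confirmed from a log of timestamped speed tests. The sketch below is only an illustration: it assumes a list of (ISO timestamp, upload Mbit/s) pairs, and both the helper name and the sample values are invented. It reports the minute of the hour at which each drop from full speed to limited speed is first seen.

```python
from datetime import datetime

def drop_minutes(samples, threshold_mbps):
    """Minute-of-hour of each sample where throughput first falls below the threshold."""
    minutes = []
    previous_ok = True
    for stamp, mbps in samples:
        ok = mbps >= threshold_mbps
        if previous_ok and not ok:
            minutes.append(datetime.fromisoformat(stamp).minute)
        previous_ok = ok
    return minutes

# Invented log: full speed, a drop, recovery after re-plugging the WAN, another drop.
samples = [
    ("2021-01-05T10:40:00", 20.1),
    ("2021-01-05T10:58:10", 9.0),
    ("2021-01-05T11:05:00", 9.1),
    ("2021-01-05T11:20:00", 20.0),
    ("2021-01-05T11:58:05", 9.2),
]
print(drop_minutes(samples, threshold_mbps=15))
```

If every reported minute is the same, the drop is tied to the clock rather than to how long the link has been up.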
-
With the above in mind, I figured I'd try swapping out the pfSense for a laptop again and seeing what happens at 58 minutes past the hour. Tried with two different laptops... one with an Intel card and one with a Realtek and the laptops both work fine... they don't get their upload limited.
-
I looked through my pfSense config for any clues. I didn't see anything related to speed limiting.
-
I tried to change the pfSense hostname to something random like pc398757 and rebooted both the modem and pfSense. At 58 past the hour, same thing happened... upload limited.
-
I was able to get an SG-1100 to play with. I manually re-built my config (well 90% of it... I left out the OpenVPN server and pfBlockerNG). I didn't want to import the config in case something was bad in it, so I spent the time manually creating it again on the SG-1100. I swapped out the SG-2440 for the SG-1100. Guess what... at 58 past the hour... same thing... upload limited to 9Mbps!
-
I didn't see this before, because on Windows apparently flow control info is impossible to find, but I did a live boot into Ubuntu and was able to see via dmesg every time I plugged in the Ethernet cable that it told me if flow control was enabled or not. The interesting results here are...
- Realtek NIC laptop connected to modem = flow control enabled
- Realtek NIC laptop connected to pfSense port = flow control enabled
- Intel NIC laptop connected to modem = no flow control
- Intel NIC laptop connected to pfSense port = no flow control
(Both laptops never showed the upload limiting issue.)
dmesg on pfSense doesn't list flow control information, so I'm not sure if there is any good way to see more info on the pfSense boxes or not.
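On the Linux side, the pause-frame (flow control) settings can also be read directly with `ethtool -a <interface>` instead of waiting for a dmesg line. Below is a rough Python sketch that parses that output. The sample text is typed by hand in the usual layout and the interface name eth0 is an assumption. It does not cover pfSense/FreeBSD, where ethtool isn't available.

```python
def parse_pause_params(text):
    """Parse `ethtool -a` output into a dict such as {"rx": True, "tx": False}."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "Pause parameters for ..." header and anything that is not on/off.
        if sep and value in ("on", "off"):
            params[key.strip().lower()] = (value == "on")
    return params

# Hand-typed sample in the usual `ethtool -a` layout (not captured from real hardware).
sample = """Pause parameters for eth0:
Autonegotiate:  on
RX:             on
TX:             off
"""
print(parse_pause_params(sample))
```

To use it on a live system, the text could come from `subprocess.run(["ethtool", "-a", "eth0"], capture_output=True, text=True).stdout`.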
So, my thoughts so far are conflicting...
Part of me feels like this is a link issue like we all initially thought, but the other part of me feels like it isn't because bad cables/links don't tell time and decide to fail at 58 minutes past the hour. The NICs are also Intel vs Marvell in my two pfSense boxes. Part of me also feels like this may be a software issue in pfSense due to it only occurring in my 2 pfSense boxes. But I'm thoroughly confused now and would like to hear what everyone else thinks. I could be wrong in my assumptions and dmesg may be lying to me.
-
@steve1515 said in SG-2440 Upload Speed Limited After a Few Minutes:
58 minutes past the hour seems to be when the upload speed gets limited.
Something in a cron job? Reloading something?
Something pushing stats/backup data to somewhere?
-
I don't see any cron jobs in pfSense that would cause this. This is also on a new SG-1100 system as well, so I'm not sure what it could be.
Note: I'm not sure if this was clear before, but once the upload is limited it stays limited forever until I either unplug the WAN cable and re-plug it in or I reboot the pfSense box. This happened on both my SG-2440 and the SG-1100 I used for testing.
-
Different DHCP lease times? Like something upstream is expiring the lease and using some default shaping?
Does it matter if 58 mins past the hour is 10 mins after connecting or nearly 1 hour? Anything logged at all at 58 mins past?
-
It's a static IP on the WAN side, so I don't think it would be a DHCP issue. It also doesn't seem to matter if I reboot the modem or not or when I connect the WAN cable.
I think the reason I initially thought it was happening within 15 minutes was that I happened to be testing near the top of the hour and hadn't noticed that it was time related. It doesn't seem to matter when I plug in though... once it's xx:58 on the clock it seems to happen.
I don't see anything logged at that time either.
-
Hmm, if you disable ntp and set the clock in pfSense differently does it still happen at the same time?
-
Modifying the pfSense clock was a great idea @stephenw10 !
When I disabled NTP and changed the clock on the pfSense box, the pfSense xx:58 time came and went without a loss in upload speed. When the actual time (from another NTP synced clock) hit xx:58 though, I lost my upload speed again.
This seems to go back to it being a modem/Comcast issue. The crazy thing is: why does it affect only my 2 pfSense boxes, even when tried with different MACs/hostnames, and never the 2 laptops?
-
Mmm, indeed, hard to see what it could possibly be. Static WAN IP eliminates most things. Different NIC types.
I guess maybe run a pcap on the WAN at exactly the time it starts clamping and see if it's sending something that pfSense doesn't respond to.
Hard to imagine what that could be that a laptop does respond to though.
-
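To have a capture running just before the limit kicks in, something like the following Python sketch could work out how long to wait. It is only an illustration: the helper name and the 60-second lead time are assumptions, starting the capture itself is left out, and it only makes sense on a machine whose clock is NTP-synced.

```python
from datetime import datetime, timedelta

def seconds_until_minute(now, minute=58, lead_seconds=60):
    """Seconds from `now` until `lead_seconds` before the next xx:<minute>:00."""
    target = now.replace(minute=minute, second=0, microsecond=0)
    target -= timedelta(seconds=lead_seconds)
    # If that moment has already passed in this hour, aim for the next hour.
    if target <= now:
        target += timedelta(hours=1)
    return (target - now).total_seconds()

print(seconds_until_minute(datetime.now()))
```

Sleeping for the returned number of seconds and then starting the capture gives about a minute of traffic before xx:58.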
I did a packet capture on the WAN link on the pfSense box and saw nothing unusual. I did the same thing on a laptop just to see if there was anything noticeable... there was not.
Some background on the static IP setup that Comcast has... The modem has a routed /30 subnet where one IP (x.x.x.185) is my static IP and the other is the modem's IP (x.x.x.186). This modem also has the usual 10.x.x.x NAT stuff, which I don't use; I've disabled all of its features, including the firewall, etc. But if you were to connect more PCs to the modem and use DHCP, you'd get a 10.x.x.x IP and have NAT'd Internet access going out of the x.x.x.186 IP in my static range. It's really "special" how Comcast does this, and I'd love to be able to just bridge completely, but that's not how it works with Comcast statics. To clarify some, the pfSense doesn't use DHCP or anything, it's a standard static config. The modem just also still works with NAT if you were to use DHCP. (Note: When testing with the laptops in the posts above, I've tried with the NAT IP and also at different times with my static (x.x.x.185) just like the pfSense. The upload never got cut in half, no matter what I did with the laptops.)
Now that the background info is laid out, one thing that I did notice was that when I was connected to the modem via the laptop with it configured to my static IP of x.x.x.185, I could connect to the modem's IP of x.x.x.186 in a web browser and get to its config page. This for some reason does not work when I'm behind pfSense on the LAN. From the LAN, connecting to x.x.x.186 always times out in the browser, and a Wireshark capture shows that no packets are ever returned from the modem. The weird thing is that it used to work from the LAN a few years ago. One day I noticed it no longer worked from the LAN. When exactly, I don't know, as I don't access the modem web page often. I just figured it was Comcast updating their security or something and that was that. I figured I would always just need to plug in a laptop and connect to its 10.x.x.x gateway address to get to the web page.
Does this possibly indicate something weird happening with my pfSense? Thinking about it... I really don't see why connecting to the x.x.x.186 network from my LAN should fail. The packets should go out on the /30 subnet and the modem should just think it's talking to a local host on a directly attached interface.... just like the laptop could.
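The on-link reasoning above can be checked with Python's ipaddress module. This is just a sketch: 203.0.113.0/24 is a documentation range used here as a stand-in for the redacted x.x.x prefix, so the addresses are assumptions, not the real ones.

```python
import ipaddress

# 203.0.113.x is a documentation range standing in for the redacted x.x.x prefix.
wan = ipaddress.ip_interface("203.0.113.185/30")
modem = ipaddress.ip_address("203.0.113.186")

print(wan.network)                 # the routed /30 that contains both addresses
print(list(wan.network.hosts()))   # the two usable host addresses in it
print(modem in wan.network)        # True means the modem is on-link, no gateway hop
```

A /30 has exactly two usable hosts, so .185 and .186 are direct neighbours and traffic between them should not need to be routed anywhere else.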
I'm kind of shooting in the dark here, but maybe if I toggle the "switch" that makes the web page work something else will start working... Thoughts? Anyone know how to do this?
I guess it's possibly a side question, but any thoughts on why I can't access the web page from the LAN but could from the laptop?
-
A weird addition to the above...
I just did a pcap on the WAN interface while trying to access the modem IP x.x.x.186 from the LAN. I can see that the packets go out from x.x.x.185 (pfSense WAN) to x.x.x.186 (modem IP), but no packets ever come back!
It's almost like the modem is somehow ignoring the pfSense but talks when I use the laptop.
-
@steve1515 : I just did a pcap on the WAN interface while trying to access the modem IP x.x.x.186 from the LAN. I can see that the packets go out from x.x.x.185 (pfSense WAN) to x.x.x.186 (modem IP), but no packets ever come back!
It's almost like the modem is somehow ignoring the pfSense but talks when I use the laptop.
I am surprised that the modem responds to packets containing its WAN address (x.x.x.186) that it receives on its LAN port (when sent via the laptop). Most modems have a LAN-side (private) address for administration. BTW, having a WAN-side address might open the modem to hacking via the WAN.
I wonder whether the modem is routing the packets to x.x.x.186 from pfSense out via its default gateway (that is, onto the WAN), which, if so, is why nothing ever comes back to pfSense.
Does traceroute from pfSense to the speed-test site return different values when your connection is fast and when it's slow?
Also, do you happen to have RIP (https://docs.netgate.com/pfsense/en/latest/packages/routed.html) enabled on pfSense? Or on the modem?
-
Try running a port test to it from pfSense directly (Diag > Test Port). That should duplicate exactly what the laptop does.
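For comparison from a laptop, the same kind of check (a plain TCP connect with a timeout) can be sketched in Python. This is not what Diag > Test Port runs internally, just an equivalent test under that assumption. The function name is made up.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

# Example call (address is a placeholder for the modem's x.x.x.186):
#   tcp_port_open("203.0.113.186", 80)
```

A False result after the full timeout, rather than an immediate refusal, matches the "no packets ever come back" behaviour described earlier in the thread.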
About the only difference here would be the TTL value of incoming traffic. Packets coming through pfSense have already been routed so would have a lower value. Usually that makes no difference because the TTL is high enough it never gets close to 0.
It's been a while since I looked at it, but cell phone providers used to use that as a way of enforcing only a single client when tethering; you could not connect a router to it.
Steve
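The TTL point can be put in numbers with a tiny Python sketch. It is simplified: the helper name is invented, and the initial values are common OS defaults rather than anything measured in this thread.

```python
def ttl_at_destination(initial_ttl, router_hops):
    """TTL left when a packet arrives after crossing `router_hops` routers (0 = dropped)."""
    return max(initial_ttl - router_hops, 0)

# 64 is a common default initial TTL (Linux, FreeBSD); 128 is the usual Windows default.
print(ttl_at_destination(64, 0))   # host plugged straight into the modem
print(ttl_at_destination(64, 1))   # LAN host whose packet was routed by pfSense
```

A device that only accepts packets with an unmodified default TTL would treat those two cases differently, which is how the tethering check mentioned above worked.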
-
@bPsdTZpW I think the modem implementation is a little more like this diagram I drew.
You can see that the modem has both my routed x.x.x.184/30 network, with its interface assigned the x.x.x.186 IP, and also the 10.1.10.1 IP for its NATing. In this diagram the laptops on the 10.1.10.1 network can get to the modem's web page by going to 10.1.10.1, and if I were to configure a laptop with the x.x.x.185/30 IP and plug it in place of the pfSense, then it could get to the modem's web page via the x.x.x.186 IP. The weird thing is that when I'm using either the pfSense Port Test or a PC on the pfSense LAN, the modem never returns any packets in response to a connection attempt to x.x.x.186 port 80.
Are you saying there's a way to setup static routes to make this work?
I also checked the traceroutes to the test server and there appears to be no change between fast and slow upload times. I also do not have RIP installed on the pfSense. The modem doesn't give me any RIP options in its config. I also didn't see any RIP related packets when doing pcaps.
@stephenw10 I tried the Port Test and did a pcap while it was happening. I get the same thing... zero packets back from the modem. I took a look at the TTL of the traffic coming out of the pfSense WAN in the captures and it's well above zero. This seems pretty strange that the Port Test would not get any packets back, doesn't it?
This is really starting to look like the modem knows it's a pfSense box and is doing something strange.
-
It's still surprising (to me at least!) that it can do both those things at once.
Did you test it without anything connected locally to the 10.1.10.X subnet?
Or conversely did you test a laptop at the x.x.x.185 IP with another client on the 10.1.10.X subnet?
Steve
-
Are the router and 4port switch all part of the Comcast modem?
Kind of going off of @stephenw10: does the IP follow the port on the switch?