4100 ix Flow Control Help
-
@stephenw10 said in 4100 ix Flow Control Help:
Is that throughput difference a step change?
Yes, as far as I know. The throughput change happens either when I am not home or during the middle of the night. I first see the change in my Home Assistant's hourly speed test. I then confirm that speed test with my laptop and from the command prompt on the 4100.
Does the link show differently in each case? Are there a lot of errors/collisions on ix3 after it slows?
No, the ix3 interface shows no errors or collisions.
20Mbps is so low it has to be something pretty low level, like flow control as you state.
I think so because the previous igb interfaces on the SG-4680 with the same pfsense configuration did not exhibit this. The 4100's igc (Intel i225) interface so far is holding strong since 10pm Monday night (about 32 hours ago). I'll report back if the igc interface holds the WAN throughput longer than the ix3 interface (probably an update tomorrow).
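For anyone wanting to confirm a step change from logged hourly results rather than by eye, here is a minimal Python sketch. It assumes the hourly download results are already available as a list of Mbps values. The function name, window size, drop ratio, and sample numbers are all hypothetical and are not taken from Home Assistant or pfSense.

```python
from statistics import median

def find_step_drop(samples_mbps, window=6, drop_ratio=0.5):
    """Return the index of the first sample that falls below drop_ratio
    times the median of the preceding `window` samples, or None."""
    for i in range(window, len(samples_mbps)):
        baseline = median(samples_mbps[i - window:i])
        if samples_mbps[i] < drop_ratio * baseline:
            return i
    return None

# Made-up hourly download results in Mbps, for illustration only.
hourly = [910, 905, 914, 898, 912, 909, 907, 21, 20, 19]
print(find_step_drop(hourly))
```

A sudden drop shows up as a single index where the value falls off a cliff; a gradual decline would usually not trip the ratio test at all.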
-
@selfjc
Welp, a few moments after I made that post the download and upload speeds slowed down. I checked the WAN interface (it was on the igc3 interface) and there were no errors or collisions. I wrote down my public IP address.
I swapped the WAN cable from the cable modem back to the ix3 interface from the igc3 interface. I changed the pfsense WAN interface back to ix3 to follow the cable. My ISP handed me a new IP address. I reran speed tests with the same low speed results of ~20 Mbps download. The Traffic Shaper Limiter for CODEL and FQ_CODEL are on.
Then (WAN still in ix3 interface) I disabled my Traffic Shaper's Limiter floating firewall rule for CODEL and FQ_CODEL. Now I get full ISP link speeds again ~914 Mbps download and ~50 Mbps upload.
I admit I am not being super scientific, since I just changed two variables at once: the public IP from my ISP and disabling the Traffic Shaper again. But this is starting to point to the Traffic Shaper's Limiter and floating firewall rule "filling up" or something.
I just now re-enabled my Floating Rule to implement the Traffic Shaper Limiter using the CODEL and FQ_CODEL, and the download speed is ~483 Mbps (the WANdown limit is set to 700 Mbps) and the upload speed is ~20 Mbps (the WANup limit is set to 22 Mbps).
Conclusion
The ix3 interface looks to be okay.
My Traffic Shaper Limiter rules seem to be "clogging up" after about 36 hours.
Thank you for the help @stephenw10!
-
Ah, nice find. That's weird though!
You see any errors logged from the shaper? The queues somehow completely full?
-
@stephenw10
I will try to look at that if/when the download speeds slow down. Does Diagnostics -> Limiter Info contain that information?
-
Ah, sorry I was thinking they were AltQ shapers.
You might see something there though if the Limiters are misbehaving.
-
Welp, this seems to be tied to total connections or total throughput into/through the Traffic Shaper Limiter.
The ix3 interface serving as WAN just suffered the same slowdown after only 12 hours of uptime and ~50GiB. I disabled the floating Firewall rule that forces in the WANdownQ and WANupQ. The old states and connections still suffer slow bandwidth, but speed tests to new servers come right back up towards the ISP link speeds.
After a reboot, with the same ix3 interface and the same public IP, the download speeds return to full speed.
This is pointing more and more to the Traffic Shaper Limiter. I will now try just leaving that off to monitor if the bandwidth in and out through the WAN interface slows down again.
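As a rough check on whether the slowdown tracks data volume rather than uptime, the average rate implied by ~50GiB can be worked out for the 12 hour and ~36 hour cases mentioned in this thread. This is a minimal sketch, assuming the ~50GiB figure is the total bytes through the WAN and that GiB means 2^30 bytes; the function name is made up for illustration.

```python
GIB = 2**30  # bytes per GiB

def average_mbps(total_bytes, hours):
    """Average rate in Mbps (10^6 bits/s) for total_bytes moved over `hours`."""
    return total_bytes * 8 / (hours * 3600) / 1e6

for hours in (12, 36):
    print(f"{hours} h: {average_mbps(50 * GIB, hours):.2f} Mbps average")
```

That works out to roughly 9.9 Mbps averaged over 12 hours, or about 3.3 Mbps averaged over 36 hours, so a volume-based trigger would be reached sooner on heavy-use days and later on quiet ones.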
-
Hmm, like it hits this after ~50GB every time? That's...odd.
-
if this is a thing - then i wouldn't be surprised if it's actually something like 42.949GB, or 53.687GB if there's a bit/byte conversion along the way
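The two figures above line up with round multiples of a 32-bit counter limit. Reading them that way is an assumption about the poster's intent, but the arithmetic itself can be checked with a short Python sketch (the helper name is hypothetical):

```python
GIB = 2**30       # bytes per GiB (binary)
GB = 10**9        # bytes per GB (decimal)
U32 = 2**32       # number of values a 32-bit counter can hold

def to_gb(n_bytes):
    """Convert a byte count to decimal gigabytes."""
    return n_bytes / GB

# 10 * 2^32 bytes is exactly 40 GiB, about 42.95 decimal GB.
print(10 * U32 == 40 * GIB, to_gb(10 * U32))

# 100 * 2^32 bits, divided by 8, is exactly 50 GiB, about 53.69 decimal GB.
print(100 * U32 // 8 == 50 * GIB, to_gb(100 * U32 // 8))
```

So the ~50GiB reported earlier in the thread and the 53.687GB figure are the same quantity expressed in binary and decimal units.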
-
The issue cropped back up today while I was at work, when my Home Assistant notified me that the speed test had slowed. I confirmed by IPSec VPNing to the 4100 and running speedtest-cli from the Command Prompt.
As an attempt, I have now set the following in /boot/loader.conf.local, per the Hardware Tuning Guide:
kern.ipc.nmbclusters="1000000"
kern.ipc.nmbjumbop="524288"
and the following system tunable:
hw.intr_storm_threshold="10000"
I don't expect this to fix the issue, because it also happened on the igc interfaces when set to WAN, not just the ix interfaces. But it's worth exhausting all avenues.
I'll post an update back again if the bandwidth dropout happens again.
At that point I only have the following options left:
- Factory reset and forego the configuration restore
- RMA the 4100 Max
Any suggestions?
-
So that was with the Limiters disabled?
-
No surprise, look at your limiter parameters:
This time is so low that your CPU clock is not high enough to work through the queue.
I use this on the 2100 and 6100, with ECN active:
AQM CoDel target 11ms interval 25ms ECN
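For context on what the interval parameter controls: in the published CoDel description (RFC 8289), once queue delay has stayed above target for a full interval, successive drops are spaced at interval / sqrt(count). The sketch below only evaluates that formula for the 25ms interval quoted above. It is not how the pfSense limiter code is written, and the function name is hypothetical.

```python
from math import sqrt

def codel_drop_gaps_ms(interval_ms, drops):
    """Spacing in ms scheduled after the k-th consecutive drop (k = 1..drops)
    under the CoDel control law: interval / sqrt(k)."""
    return [interval_ms / sqrt(k) for k in range(1, drops + 1)]

# Spacing for the first four consecutive drops with interval = 25 ms.
print([round(g, 2) for g in codel_drop_gaps_ms(25, 4)])
```

The spacing shrinks as the drop count grows, so a shorter interval makes the algorithm react, and keep reacting, more quickly while the queue delay stays above target.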
-
@nocling
I'll make sure to keep that in mind when I add Traffic Shaping back in.
Right now I have flashed the 4100 back to bare pfSense 23.01 because I was having the bandwidth dropout without Traffic Shaping.
The plan is to set up the interfaces with the segregated network IP ranges, with only a basic firewall from WAN to LANs. Hopefully the 4100 doesn't suffer dropouts with this arrangement. Then I'll add back in the features I had before.
-
@stephenw10 said in 4100 ix Flow Control Help:
So that was with the Limiters disabled?
Yessir. I've exhausted my capabilities of trying to find what feature caused the 4100 to drop bandwidth and opted to "start over from scratch."
-
@selfjc
24 hour check in:
The 4100 hasn't dropped the bandwidth to ~20 Mbps. The speed tests maintain the full cable ISP link speeds ~500 Mbps and up to ~800 Mbps.
So far, so good.
I hope I didn't just jinx this. I'll check in again if the bandwidth drops, or otherwise later this week to report the status.
-
@selfjc
I spoke too soon. The internet speeds just dropped out again, down under ~20 Mbps in each direction. The full link speeds started at 3pm yesterday and crashed out today around 8pm (29ish hours). After a simple reboot, with an identical public IP from my ISP, I get back to full link speeds.
I am now running Traffic Status Totals to try and catch how much throughput makes the internet speeds crash.
-
Have you tried just disconnecting and reconnecting the WAN cable when it's in that situation?
Or rebooting the modem?
It's hard to think of anything that would affect the throughput of the 4100 like that. About the only thing I could imagine might be overheating causing the CPU to go into thermal throttling. That would usually be fairly obvious from the temperature readings though. And it would affect all traffic to/from the box including LAN side. Also even at its minimum speed you would see more than 20Mbps!
Steve
-
@stephenw10 said in 4100 ix Flow Control Help:
Have you tried just disconnecting and reconnecting the WAN cable when it's in that situation?
I'll try that again. As I remember, when the ix3 interface throttled and I changed to the igc3 interface for the WAN connection, the igc3 interface was also sluggish to start off, until I did a Diagnostics -> Reboot -> Normal Reboot.
Or rebooting the modem?
I'll try that also when I get the 4100 to throttle the bandwidth.
It's hard to think of anything that would affect the throughput of the 4100 like that. About the only thing I could imagine might be overheating causing the CPU to go into thermal throttling. That would usually be fairly obvious from the temperature readings though. And it would affect all traffic to/from the box including LAN side. Also even at its minimum speed you would see more than 20Mbps!
Steve
The CPU temperature hovers around 48ºC to 50ºC when I do three concurrent speed tests through three different clients (two WiFi, one Ethernet). But if I get the bandwidth to throttle, I will check the CPU temperature first before the LAN cable and modem tries.
-
Mmm, that's nowhere near hot enough to start throttling.
Does it get a new public IP address when you reboot it?
One thing I might imagine is that the ISP is throttling the connection in reaction to something. And what that could be is the gateway monitoring pings over time. You might try disabling gateway monitoring as a test. Also consider your repeated testing itself may be seen as a problem.
Steve
-
@stephenw10 said in 4100 ix Flow Control Help:
Mmm, that's nowhere near hot enough to start throttling.
Does it get a new public IP address when you reboot it?
No sir, the router/cable modem maintains the same Public IP address.
One thing I might imagine is that the ISP is throttling the connection in reaction to something. And what that could be is the gateway monitoring pings over time. You might try disabling gateway monitoring as a test.
I disabled the Gateway Monitor Action. I have left the Gateway Monitor itself running. If the bandwidth drops out, I will stop the monitor as well.
Also consider your repeated testing itself may be seen as a problem.
Steve
That's what I am worried about too. I originally only noticed when I was checking my Home Assistant history for the download speed "stuck" at 20Mbps for nearly a full week. I did a normal reboot and got full link speeds back. At that first time (~April 20) I didn't think much of the bandwidth dropout. Then I noticed the bandwidth dropouts started happening nearly on a schedule around 24 to 36 hours after a Normal Reboot. I suspect this time variance happens because there are days where I am not home using streaming services and not consuming as much bandwidth and over time not as much throughput.
But none of this happened before, when the SG-4680 was still working (prior to March 24th, 2023 - mainboard failure - no console - no booting). I had Home Assistant running speedtest once an hour, I watched normal streaming services, etc. without the bandwidth dropping out with the SG-4680.
I also can't blame the ISP (yet), because the full link speeds come back with the same Public IP within the 2 minutes it takes for the Normal Reboot to complete. The ISP isn't ruled out yet either. If the bandwidth drops out and the LAN cable unplug-replug, rebooting the modem, etc. make no difference, then I will run my network through a consumer Netgear router after trying the disabled Gateway Monitoring.
Thanks for the ideas!
To do:
- Cable unplug-replug
- Reboot modem
- Disable Gateway Monitoring
- Test through a Netgear router
-
@selfjc said in 4100 ix Flow Control Help:
To do:
- Cable unplug-replug
Unplugging the WAN cable, counting to 30 seconds, and plugging it back in netted the same Public IP from my ISP. But the speedtest-cli on the Home Assistant (on igc4 interface) and on pfSense (ix3 WAN) both came back up to full link speeds.
Interestingly, the speedtest.net through the browser still has throttling while speed.measurementlab.net and waveform's bufferbloat come back up to link speeds (through igc0 LAN1 to ix3 WAN).
Does this mean the dreaded ISP throttling is to blame?
- Reboot modem
- Disable Gateway Monitoring
- Test through a Netgear router
I didn't do these yet as the cable unplug and replug seemed to "fix" the bandwidth throttling.