Mixing different NIC Speeds (1Gb & 10Gb) Performance Problem Question
-
I'll give it a try in the am.
But one question: the modem's port supports 2.5Gb and I pay for 2Gb service, but when you plug a laptop with a 1Gb port into that modem it will simply negotiate at 1Gb and give you a live external IP. I guess you are thinking that if FC is disabled in Windows and the speed test is poor, then the ISP supports and needs FC. But what if the speed test is fine? What do the results mean in either case? I guess I'm hung up on the fact that there is no NAT on my end and we are tapped in live.
-
@ngr2001 If 802.3x FC is disabled and you still get 940/940, that means TCP Flow Control worked and it was able to tell the server to slow down so that the client doesn't end up with buffer overflow but still achieve its maximum potential. Which then points us back in the direction of pfSense. If it doesn't achieve 940/940, then we know TCP Flow Control doesn't work over DOCSIS and must rely on Layer 2 802.3x which is blah.....
-
TCP flow/congestion control always 'works', the fact that it just slows down rather than fails is evidence that it is. The issue here is that other flow control methods at lower layers adversely affect the TCP CC to drop the speed more than is required as I understand it.
In this situation though it appears that can only be the switch to pfSense link when it's overloaded by burst traffic. I can't see anything else that would explain it. It does feel like you might be able to do some shaping in pfSense to mitigate it though.
-
Just ran a new test with Flow Control Hard Disabled in Windows.
The result was no difference in speed, but again I expected this. The modem's port, although 2.5Gb ready, can auto-negotiate either 2.5Gb or 1Gb. In my test it's negotiating 1Gb to the laptop, so I would expect this to be a very clean 1:1 connection with no speed mismatches to cause any issues.
Flow Control Off:
-
@ngr2001 Thanks. This was very helpful to me in eliminating DOCSIS as the cause of any TCP Flow Control failure. This points us back to pfSense and how it's interacting with switches at 1GbE and 10GbE. I have been analyzing packet captures at the client. Next I will be analyzing a packet capture of the pfSense LAN interface.
-
I'd love to test the problematic setup with a different firewall in place, like a Ubiquiti Dream Machine Pro Max. Have you or anyone else tried that? It would at least help pinpoint whether or not there is a potential problem on the pfSense side. Worth a shot?
-
@ngr2001 It's worth a try.
-
Mmm, that would be a good test. As long as it uses the same modules.
Another good test would be swapping out the modules/link for something that correctly supports flow control. But that would require a different NIC.
-
So I got the 3850 installed with just a basic config.
1st test:
qos queue-softmax-multiplier 1200 -- NOT SET YET
PF WAN = FC Enabled (by default) and Showing active
PF LAN = No FC (I assume a clean 3850 doesn't have RX and TX FC by default)
PF WAN @2.5Gb, PF LAN @2.5Gb, Client @1Gb = Poor Speed test (~500Mbps) - Output drops detected on switchport
PF WAN @2.5Gb, PF LAN @2.5Gb, Client @2.5Gb = Normal speed test (~900Mbps)
PF WAN @2.5Gb, PF LAN @2.5Gb, Client @10Gb = Normal speed test (~900Mbps)
I would like to enable FC on the 3850 and test that next if anyone knows the command. after that I will add the QOS fix and test.
FYI 900Mbps is normal for me right now being I have a Codel Limiter capping the speed at 900.
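For what it's worth, the pattern above (drops only when the client port is slower than the uplink) is what a very simple rate-mismatch model predicts. The sketch below is only that, a sketch: the 1 MB burst size and the 500 kB port buffer are made-up illustration values, not Cisco 3850 specifications, and the function names are hypothetical. It assumes the egress queue starts empty and that the burst arrives back-to-back at the ingress line rate.

```python
def burst_backlog_bytes(burst_bytes: float, ingress_bps: float, egress_bps: float) -> float:
    """Bytes still queued at the moment a back-to-back burst finishes arriving.

    Assumes the egress queue starts empty and drains at full line rate.
    """
    if ingress_bps <= egress_bps:
        return 0.0  # egress keeps up, nothing accumulates
    # While the burst arrives, the egress port drains a fraction egress/ingress of it.
    return burst_bytes * (1.0 - egress_bps / ingress_bps)


def burst_overflows(burst_bytes: float, ingress_bps: float,
                    egress_bps: float, buffer_bytes: float) -> bool:
    """True if the backlog exceeds the (assumed) per-port buffer, i.e. output drops."""
    return burst_backlog_bytes(burst_bytes, ingress_bps, egress_bps) > buffer_bytes


if __name__ == "__main__":
    BURST = 1_000_000   # hypothetical 1 MB burst
    BUFFER = 500_000    # hypothetical per-port buffer, NOT a 3850 spec
    UPLINK = 2.5e9      # PF LAN @2.5Gb, as in the test above
    for client_gbps in (1, 2.5, 10):
        egress = client_gbps * 1e9
        backlog = burst_backlog_bytes(BURST, UPLINK, egress)
        print(client_gbps, round(backlog), burst_overflows(BURST, UPLINK, egress, BUFFER))
```

With these assumed numbers only the 1Gb client ends up with a backlog larger than the buffer; the 2.5Gb and 10Gb clients drain as fast as the burst arrives. A larger buffer share per port would raise the threshold but not remove the mismatch.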
-
@ngr2001 said in Mixing different NIC Speeds (1Gb & 10Gb) Performance Problem Question:
I would like to enable FC on the 3850 and test that next if anyone knows the command. after that I will add the QOS fix and test.
I mentioned this before in the threads, Cisco supports RX but not TX Pause. This is why I have been focusing on finding the root cause of why TCP Flow Control isn't able to do its job properly. I can't and don't like to rely on the last resort method of using L2 802.3x FC.
You need to use the global QOS setting I provided:
qos queue-softmax-multiplier 1200
show interfaces tengigabitEthernet 1/0/48 capabilities | include Flowcontrol
  Flowcontrol:           rx-(off,on,desired),tx-(none)
show interface tenGigabitEthernet 1/0/48 flowcontrol
Port       Send FlowControl  Receive FlowControl  RxPause TxPause
           admin    oper     admin    oper
---------  -------- -------- -------- --------    ------- -------
Te1/0/48   Unsupp.  Unsupp.  on       on          0       0
FYI 900Mbps is normal for me right now being I have a Codel Limiter capping the speed at 900.
I assume this is just for testing?
-
Mmm, I would definitely try disabling Codel shaping at least as a test. That's just another factor that will be affecting TCP.
-
Codel is a hard requirement for me; it's the only way I can get an A+ bufferbloat score. This setup is for competitive gaming.
That said, it seems like as soon as I allow the download speed to surpass 900Mbps, even my client on 2.5Gb all the way around struggles to get an A+ score. It feels like everything I do to surpass 1Gb takes me 3 steps back. If I were to allow 1750Mbps in Codel, should I expect clients at 1Gb to achieve a clean A+ bufferbloat score?
-
I would have expected it to. CoDel generally 'just works'. However it still relies on being the slowest part of the route. If you set the Limiter higher than some other link then all the buffering happens there and not locally where you can control it. So it implies there is some other 1G link in the route you're testing over.
But just as a test I would still try disabling it. I wouldn't expect it to make any difference to the test results other than for buffer-bloat etc. But if you suddenly stop seeing the packet loss / 500M limit for 1G clients that would be a pretty big clue!
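To make the "slowest part of the route" point concrete, here is a minimal sketch of the check. The function name and the link labels are hypothetical; the speeds are the ones mentioned in this thread. It deliberately ignores protocol overhead and treats each hop as a single fixed rate.

```python
def queue_builds_at(limiter_mbps: float, path_links_mbps: dict) -> str:
    """Return where the standing queue forms for traffic along a path.

    'limiter' means the limiter is the slowest element, so CoDel manages the queue.
    Otherwise the name of the slowest link is returned: buffering happens there,
    outside the limiter's control.
    """
    slowest = min(path_links_mbps, key=lambda name: path_links_mbps[name])
    if limiter_mbps < path_links_mbps[slowest]:
        return "limiter"
    return slowest


if __name__ == "__main__":
    path = {"pf_wan": 2500, "pf_lan": 2500, "client_port": 1000}
    print(queue_builds_at(900, path))    # limiter below every link
    print(queue_builds_at(1750, path))   # limiter above the 1G client port
```

With the limiter at 900 the queue sits in the limiter; at 1750 a 1G client port becomes the slowest hop and the queue moves to the switch.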
-
I'll Give it a shot.
So, in the use case of having a 2Gb service from the ISP with the Codel Limiter set to 1750Mbps, is there any way to prevent bufferbloat for the 1Gb clients, or are they just SOL?
For example, is there any way to make a second Codel limiter set to 875Mbps that only applies to certain clients or clients that are at 1Gb ?
-
Yes, you can create a lower bandwidth limiter and use different firewall rules to put specific clients into it.
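As a rough way to pick the value for that second limiter: the numbers floated above (1750 on a 2Gb service, 875 for 1Gb clients) both work out to 87.5% of the link in question. The sketch below just applies that ratio per client. The function name and the 0.875 headroom are illustrative assumptions taken from those two figures, not a pfSense setting or a recommendation.

```python
def limiter_for_client(client_link_mbps: float,
                       wan_limiter_mbps: float = 1750.0,
                       headroom: float = 0.875) -> float:
    """Suggested limiter bandwidth (Mbps) for a client with the given link speed.

    The cap is the smaller of the WAN limiter and a fraction of the client's
    own link, so the limiter stays the slowest element on that client's path.
    """
    return min(wan_limiter_mbps, client_link_mbps * headroom)


if __name__ == "__main__":
    for link in (1000, 2500, 10000):
        print(link, limiter_for_client(link))
```

Clients whose result is lower than the WAN limiter are the ones that would go into the separate, lower-bandwidth limiter via their own firewall rules.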
Just to be clear when you first saw this issue you tested with the pfSense WAN at 1G right?
If it was 2.5G and the link to the switch at 10G then the last link at 1G to the client would be much easier to imagine causing buffering issues.
-
I am pretty sure that is correct, but it's also easy enough to test again.
-
Here are some interesting continued results:
- After enabling "qos queue-softmax-multiplier 1200" on the 3850 I had what seemed like immediate alleviation of the performance issues. However, I noticed that over time the performance started to degrade: audio clicks and minor drops on Zoom calls, etc. I checked the switch logs and sure enough there was a significant number of drops on my client's switch port. Oddly, running a speedtest did not instantly create drops; they were just accumulating at what seemed like random.
PF WAN @2.5Gb, PF LAN @2.5Gb, Client @1Gb = Good Speed Test (~800-900Mbps) + Output Drops on Switchport over time.
This has me completely discouraged. I am at the point where it just does not seem to make sense to keep messing with this; trying to achieve a WAN speed of 2Gb creates so many issues that it's simply not worth it. So I started reverting everything back to 1Gb. I removed the "qos queue-softmax-multiplier 1200" since it really should not be needed and I wanted a clean baseline. This is what I encountered.
Cisco QOS Fix Removed:
FC disabled on WAN, LAN, and Switch.
PF WAN @1Gb, PF LAN @2.5Gb, Client @1Gb = Bad Speed Test (~500-600Mbps) + Output Drops on Switchport
PF WAN @1Gb, PF LAN @1Gb, Client @2.5Gb = Output Drops on Switchport
There would be no point running the WAN at 2.5Gb and everything else at 1Gb so I simply did not test.
So in the end the Cisco QOS fix seems to be only a Band-Aid at best; the drops come back and they appear to be random. VOIP traffic seems to take the biggest hit. I don't know if PF is the issue or if it's these midgrade switches, but this has now been reproduced exactly across a Cisco 3650, Cisco 3850, and a Brocade ICX-7250.
My best solution seems like I should leave everything at 1Gb and call it a day which just kills me inside knowing I'm leaving 50% of my bandwidth behind.
-
@ngr2001 After your test of a computer directly off the modem, I performed a speedtest with the Comcast XB8 in router mode and the Cisco 3850
Comcast Node <--DOCSIS 2.35Gbps--> Comcast XB8 <--2.5GbE--> Cisco 3850
- Comcast LAN Port 1 GbE: 940/360Mbps
- Comcast LAN Port 2 GbE: 940/360Mbps
- Comcast LAN Port 3 GbE: 940/360Mbps
- Comcast LAN Port 4 2.5GbE <--> Cisco 5-Speed mGig Port 1/0/48 negotiated uplink @ 2.5GbE link
- Cisco Port 1-36 GbE: 940/360Mbps
- Cisco Port 37-47 10GbE: 2350/360Mbps
QOS setting removed. This tells me pfSense is the issue.
-
I feel like the issue is likely on the PF side too, but which of your results is the bad one? I'm not 100% sure how to interpret them.
-
@ngr2001 said in Mixing different NIC Speeds (1Gb & 10Gb) Performance Problem Question:
Audio clicks and minor drops on Zoom calls, etc. I checked the switch logs and sure enough there was a significant number of drops on my client's switch port. Oddly, running a speedtest did not instantly create drops; they were just accumulating at what seemed like random.
Odd. I am always using VoIP and never experience this. Maybe it is the pfSense limiter?