XG-1537 SFP+ 10GBASE-T module to 1Gbps client has effect on ingress/egress depending
-
Hello everyone,
I just purchased an XG-1537 for use as a firewall and VPN server for my home and I quite love it!
I'd like to use one SFP+ port to connect to WAN, so that I can bond the 2 igb ports to my LAN. Neither the WAN nor the LAN devices have SFP ports. So I purchased a 10gtek SFP+ 10GBASE-T module (which gets super spicy during use, might I add) to slide into the XG-1537 and connect to the modem using a 0.5m Cat8 cable; the modem autonegotiates to 1Gbps. pfSense says the SFP+ module is running at 10Gbase-SR <full-duplex,rxpause,txpause>.
This ... is expected, I think?
Perhaps this is my issue?
Anyway, I have some strange TCP failures and slow speeds when a Speedtest is run. The upload or download speeds are significantly affected depending on how the SFP+ module is used in the network.
Can anyone supply any more troubleshooting tips?
I'd be eternally grateful,
Sean
========================================
THE ISSUE
When I'm using igb ports for LAN and WAN, the whole setup works fantastically: ~940Mbps down / ~40Mbps up (standard Comcast stuff).
However, if I use an SFP+ 10GBASE-T module to connect either to LAN or WAN, weird stuff happens.
For WAN (modem negotiates to 1Gbps), I get very poor upload speeds (egress), yet fully expected download (ingress) speeds: ~940Mbps down / ~4-7Mbps up.
If I flip the setup around and use the SFP+ 10GBASE-T module for LAN (switch negotiates to 1Gbps), I get expected upload speeds of ~40Mbps and slow download speeds of about 60Mbps.
The setup, however, currently uses the SFP+ 10GBASE-T module to connect to WAN / the modem.
PFSense ANALYSIS
CPU usage is about 3% as I'm passing this traffic through.
If I use ps -CHIPS I see 0.0% interrupt on all cores during download, and I get ~0.4% interrupt during the upload tests.
I tried turning off flow control on the adapters using sysctl.
I've also fudged with some buffer sizes, but everything else remains untouched.
PACKET ANALYSIS
Wireshark helped analyze a packet dump taken on the WAN interface during an Ookla Speedtest.
There is this strange periodicity of TCP failures, almost as if there's a buffer of some sort being flushed every so often when the SFP+ module needs to jam packets through a 1Gbps connection. The failures were ~30% retransmissions.
MORE INFO
The packet capture showed 0 TCP errors during the download portion of the Ookla Speedtest.
-
If I use both the LAN and the WAN on the SFP+ ports:
WAN 1Gbps modem → (ethernet cable) → 10GBASE-T SFP+ Adaptor
10GBASE-T SFP+ Adaptor → (ethernet cable) → LAN 1Gbps switch
... initiating a UDP speed test seems to be able to deliver a file from the firewall to a client on the LAN at ~12Mbps. Otherwise there are tons of packet loss.
# iperf3 -c 10.0.1.13 -u -b 1000M -i 1 -l 16000
warning: UDP block size 16000 exceeds TCP MSS 1460, may result in fragmentation / drops
Connecting to host 10.0.1.13, port 5201
[ 5] local 10.0.1.1 port 32349 connected to 10.0.1.13 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-1.00 sec 119 MBytes 999 Mbits/sec 7807
[ 5] 1.00-2.00 sec 119 MBytes 1.00 Gbits/sec 7814
[ 5] 2.00-3.00 sec 119 MBytes 1000 Mbits/sec 7811
[ 5] 3.00-4.00 sec 119 MBytes 1.00 Gbits/sec 7813
[ 5] 4.00-5.00 sec 119 MBytes 1000 Mbits/sec 7811
[ 5] 5.00-6.00 sec 119 MBytes 1.00 Gbits/sec 7813
[ 5] 6.00-7.00 sec 119 MBytes 1.00 Gbits/sec 7815
[ 5] 7.00-8.00 sec 119 MBytes 1000 Mbits/sec 7811
[ 5] 8.00-9.00 sec 119 MBytes 1.00 Gbits/sec 7814
[ 5] 9.00-10.00 sec 119 MBytes 1000 Mbits/sec 7810
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.00 sec 1.16 GBytes 1000 Mbits/sec 0.000 ms 0/78119 (0%) sender
[ 5] 0.00-10.03 sec 16.5 MBytes 13.8 Mbits/sec 0.055 ms 72141/73222 (99%) receiver
iperf Done.
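For anyone who wants to sanity-check the receiver line in that run, here is a minimal Python sketch that recomputes the loss and goodput from it. The helper name summarize and the regular expression are my own illustration, not part of iperf3, and the sketch assumes iperf3 reports MBytes in units of 1024*1024 bytes. Since the 16.5 MBytes figure is itself rounded, the result is only approximate.

```python
import re

# Receiver summary line copied from the iperf3 run above.
RECEIVER_LINE = (
    "[ 5] 0.00-10.03 sec 16.5 MBytes 13.8 Mbits/sec "
    "0.055 ms 72141/73222 (99%) receiver"
)

# Hypothetical pattern for one iperf3 UDP summary line: interval, transfer
# size with unit, and the lost/total datagram counters.
SUMMARY_RE = re.compile(
    r"(?P<start>[\d.]+)-(?P<end>[\d.]+)\s+sec\s+"
    r"(?P<size>[\d.]+)\s+(?P<unit>[KMG])Bytes\s+.*?"
    r"(?P<lost>\d+)/(?P<total>\d+)\s+\("
)

# Assumption: byte units are binary (1 MByte = 1024 * 1024 bytes).
UNIT_BYTES = {"K": 1024, "M": 1024**2, "G": 1024**3}


def summarize(line):
    """Return loss percentage, delivered datagrams and goodput for one line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not an iperf3 UDP summary line")
    seconds = float(m["end"]) - float(m["start"])
    payload_bits = float(m["size"]) * UNIT_BYTES[m["unit"]] * 8
    lost, total = int(m["lost"]), int(m["total"])
    return {
        "loss_pct": 100.0 * lost / total,
        "delivered": total - lost,
        "goodput_mbps": payload_bits / seconds / 1e6,
    }


stats = summarize(RECEIVER_LINE)
print(f"lost {stats['loss_pct']:.1f}% of datagrams, "
      f"{stats['delivered']} delivered, "
      f"about {stats['goodput_mbps']:.1f} Mbit/s goodput")
```

Only about a thousand of the datagrams the receiver accounted for actually arrived, which matches the ~12Mbps figure mentioned above.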
Going in the other direction, from the 1Gbps LAN client to the firewall, gets ~10% packet loss, which I believe is expected.
-
That's not good if it thinks there's a 10G connection. Is the modem actually negotiating?
I would try setting it to 1G fixed in pfSense.
You might also try disabling flow-control. At 1G that's not normally an issue though.
Do you see errors in Status > Interfaces?
Steve
-
That's not good if it thinks there's a 10G connection. Is the modem actually negotiating?
The modem (MB8600) thinks it's got a 1GbE connection: it has a green LED for the downlink status indicator.
If I LAGG two ports, the indicator turns blue, indicating 2x 1GbE connections, bonded.
The Unifi switch reads 1,000 FDX UPLINK on the uplink port.
I would try setting it to 1G fixed in pfSense.
The only supported media for ix0 is:
supported media:
  media autoselect
  media 10Gbase-SR
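A quick way to check a listing like this for a gigabit option is sketched below in Python. It assumes the text came from something like ifconfig -m ix0 and that each entry is a line starting with the word media. The function name supported_media is hypothetical, and the sample string is just the two entries quoted above.

```python
# The two entries quoted in the post; a real listing would likely be longer.
MEDIA_LISTING = """\
supported media:
    media autoselect
    media 10Gbase-SR
"""


def supported_media(text):
    """Return the media names that follow a 'supported media:' header."""
    names = []
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("supported media:"):
            in_block = True
            continue
        if in_block:
            if line.startswith("media "):
                # The second word is the media name; options may follow it.
                names.append(line.split()[1])
            else:
                break  # the block ends at the first non-media line
    return names


media = supported_media(MEDIA_LISTING)
has_gigabit = any(name.lower().startswith("1000base") for name in media)
print(media, "gigabit offered:", has_gigabit)
```

With the listing above there is no 1000base entry at all, so there is nothing to pin the interface to other than autoselect or 10Gbase-SR.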
Do you think this 10gtek module is for some reason not backwards compatible?
Is there another way I can advertise a different transfer rate?
You might also try disabling flow-control. At 1G that's not normally an issue though.
Flow control was disabled both at the interface level and even system-wide during boot-up.
<rxpause,txpause> both disappear when disabled.
Sigh, the issue still happens, but yeah, that was a clever idea.
I even disabled it for whatever reason on both igb interfaces too.
Do you see errors in Status > Interfaces?
Hm, I get a few around the same count for each of the 10GBASE-T interfaces (wan and lan).
There are a good deal of gateway errors (hundreds) from dpinger; no routing errors.
Feb 7 18:18:19 dpinger 87643 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr [redacted] bind_addr [redacted] identifier "WAN_DHCP "
Feb 7 18:18:18 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:17 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:17 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:16 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:15 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:15 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:14 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:14 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:13 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:13 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:12 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:12 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
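To tally errors like these, a small Python sketch along the following lines could be used. The regular expression, the count_errors helper, and the hand-written errno table are illustrative assumptions. The table is deliberately partial and uses FreeBSD's numbering, where 65 is EHOSTUNREACH ("No route to host"). Python's own errno module is avoided because it reports the numbering of whatever OS the script runs on.

```python
import re
from collections import Counter

# Four lines in the shape of the excerpt above (addresses already redacted).
LOG = """\
Feb 7 18:18:18 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:17 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:17 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
Feb 7 18:18:16 dpinger 56375 WAN_DHCP [redacted]: sendto error: 65
"""

# Partial, hand-written FreeBSD errno table (not read from the system).
FREEBSD_ERRNO = {
    50: "ENETDOWN",
    51: "ENETUNREACH",
    55: "ENOBUFS",
    64: "EHOSTDOWN",
    65: "EHOSTUNREACH",
}

# Hypothetical pattern: timestamp, "dpinger", pid, gateway name, error code.
LINE_RE = re.compile(
    r"^(?P<stamp>\w+\s+\d+\s+[\d:]+)\s+dpinger\s+\d+\s+"
    r"(?P<gateway>\S+)\s+.*sendto error: (?P<code>\d+)"
)


def count_errors(log):
    """Count sendto errors per errno value and per one-second timestamp."""
    by_code = Counter()
    by_second = Counter()
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            by_code[int(m["code"])] += 1
            by_second[m["stamp"]] += 1
    return by_code, by_second


by_code, by_second = count_errors(LOG)
for code, count in by_code.items():
    print(code, FREEBSD_ERRNO.get(code, "unknown"), count)
print("busiest second:", by_second.most_common(1)[0])
```

With send_interval at 500ms dpinger sends at most two probes per second, so two errors in the same second would mean every probe in that second failed to send.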
There doesn't seem to be anything scary / horrific being reported from the kernel.
$ dmesg | grep ^ix
ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver> mem 0xf9a00000-0xf9bfffff,0xf9c04000-0xf9c07fff irq 11 at device 0.0 on pci4
ix0: Using 2048 TX descriptors and 2048 RX descriptors
ix0: Using 8 RX queues 8 TX queues
ix0: Using MSI-X interrupts with 9 vectors
ix0: allocated for 8 queues
ix0: allocated for 8 rx queues
ix0: Ethernet address: [redacted]
ix0: netmap queues/slots: TX 8/2048, RX 8/2048
Perhaps I should return the 10gtek module and try a different brand to further control variables?
Honestly, I'm fully out of ideas :(
-
The fact the interface does not reflect the link type is not good. Hard to know what's happening there really. I would want something that shows 1000base-T if that's how it's connected.
It's showing 10Gbase-SR which means it's not really communicating with the module as expected. I would try a different module if you can.
I'm assuming the switch is not in between the modem and XG-1537.
Steve
-
Thank you @stephenw10 for the guidance.
I appreciate the extra set of eyes and insight.
I would want something that shows 1000base-T if that's how it's connected. It's showing 10Gbase-SR which means it's not really communicating with the module as expected. I would try a different module if you can.
I've ordered a different brand (FS.com) 10GBASE-T SFP+ module, whose datasheet specifically calls out 1000BASE-T backwards compatibility. Users there report this functionality appears to work for them, which may be promising.
I'm assuming the switch is not in between the modem and XG-1537.
Correct. One ix port leads to the modem; the other port leads to the LAN switch.
I've swapped things back to using the igb ports; once the new modules appear, I'll report my findings here.
Thanks!
-
Well, no luck, but I did figure out what's going on.
TLDR: The Intel FBSD driver simply won't support my use case.
I fixed the problem by throwing more money at it: upgraded my Unifi switch to one that has SFP+ ports and I just use a DAC now and life is fine.
Two New Modules for Testing
Two more SFP+ 10GBase-T modules from different trusted vendors were tested.
Their chipsets both support multi-mode (they support SFP and SFP+).
They also advertise support for 100M/1G/2.5G/5G/10G over-copper data rates.
No luck in the Netgate appliance: same behaviour on uplink speeds.
I only see autonegotiate and 10Gbase-SR in the rate dropdown.
Unifi Software Works
Strangely, the new Unifi switch with SFP+ cages I bought to test things out was able to force-negotiate any of the power-of-ten rates above (1G, 10G) using the same module, linked to my older Unifi switch over an ethernet cable.
Engineering Assessment
I reached out to a networking driver engineer at the "fruit based" company I work for.
Because the Unifi appliance works with my use case, this engineer suggests the challenge here is with the Intel FreeBSD ix driver.
The FreeBSD driver appears to only support 10G data rates on SFP+ ports, plain and simple.
The driver appears to identify all SFP+ modules as fiber ones, regardless of whether they're 10GBase-T or 10GBase-SR.
The main point is that SFP+ ports really do tend to operate solely at 10Gbps data rates.
That's why they're there ...
It's not really Intel / Netgate's fault for not supporting my weird use case.
Though it's great (and weird) Unifi has support for what I want.
Conclusion and Recommendation
This person recommended that I "stop trying to stuff a square peg into a round hole" and next time buy a real 10GBase-T appliance (like the XG-1541) if I need 10G all the way down to 10M support on the same onboard ports.
-
Hmm, well glad you were able to get up and running. Disappointing you were unable to make a module work directly. I have certainly seen 1G modules that worked fine for base-SR.
Steve
-
Yeah same. It's okay though.
The Unifi 24-port switch can force 1Gbps on the same SFP+ module that the Netgate appliance seems to have trouble with.
There's some incompatibility somewhere in FBSD, the Intel driver, or the Supermicro SOC.
-
@seanstewart Have you considered using a 1Gbps copper SFP? I can try to test one on my XG-1537. I sent you a private chat.
-
I actually have two here from different vendors I tried.
One's coded for Cisco and the other's coded for Intel.
This directive was added to the boot loader config file:
hw.ix.unsupported_sfp="1"
After rebooting, when I slide either SFP module into the SFP+ cage, I get an unsupported module error from the kernel.
ix0: Unsupported SFP+ module type was detected.
I remember seeing in the Supermicro documentation that these SFP+ ports are not multimode and only support SFP+ modules.