Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?
-
Hmm, do those VLANs function? I would have expected having their parent interfaces as bridge members to break the replies there. Though I don't think I've tested that in 23.09.1.
Historically bridges and VLANs have not combined well. I could absolutely imagine that causing those errors.
-
@stephenw10 Yup, all of the VLANs (100, 200 and 1003) are working just fine. For the one with jumbo frames (200), connections show the expected (large) MSS and throughput is consistent with the increased frame size.
Apart from the weird receive length errors on the SFP+ uplink I have only noticed one other anomaly, which I have detailed here:
Again, though annoying this is not a dealbreaker.
Seems like FreeBSD and/or pfSense have a few potential bugettes to squash (or at least areas to enhance)...
-
Yeah when you combine VLANs and bridges you are in a grey area! I would want to confirm that mtu issue without the bridge first.
-
@stephenw10 So, I have now arrived at a point where I no longer have a bridge configured. The SFP+ interface in question is the primary (only) LAN interface. It does have one VLAN defined on it. Interface and VLAN MTU is 1500. I am still seeing a very low, but gradually increasing receive error count (tending towards ~0.005%) and all of the errors are 'rec_len' errors. This is very bemusing...
-
What is that NIC connected to? Any errors logged at the other end?
-
@stephenw10 It's connected, via an active optical cable, to a 10 Gb SFP+ port on a TP-Link TL-SG3452X switch. That isn't reporting any errors at all for this link (or any others).
-
If it's actual fiber it can be worth cleaning it. Though I'd expect far more errors if it really was a dirt issue.
Other than that I'm not sure what else can be done.
You could try switching it to ix0.
-
@stephenw10 Active optical cables have integrated transceivers so there isn't really anything to clean. The error rate is very low so I will live with it for now. In a few weeks I will be rearranging some things so I will then try a DAC cable and/or ix0 instead of ix1 to see if that changes anything.
-
@ChrisJenk I just wanted to come back to this as it is still troubling me.
I have rearranged things and now the Netgate 6100 is connected to my primary switch via a 1m SFP+ DAC cable rather than a 15m SFP+ AOC cable. I'm also using different ports on the Netgate (ix0 now instead of ix1) and the switch. However, I am still seeing moderate, sporadic bursts of receive errors reported on the Netgate (netstat -i). Checking with sysctl again, most of them seem to be rec_len errors. Again, the switch end does not report any errors at all.
This is very perplexing... I am wondering if there may be a bug in the SFP+ driver, or some such?
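One way to tell sporadic bursts apart from a steady trickle is to take the cumulative counters at intervals and look at the change between readings. Below is a minimal sketch of that idea; the function name `interval_error_rates` and the sample readings are hypothetical and are not taken from this system.

```python
def interval_error_rates(samples):
    """Turn cumulative (ipkts, ierrs) readings into per-interval figures.

    samples: list of (ipkts, ierrs) tuples in time order, e.g. copied by hand
    from successive `netstat -i` runs.
    Returns a list of (delta_pkts, delta_errs, percent) tuples, one per interval.
    """
    results = []
    for (p0, e0), (p1, e1) in zip(samples, samples[1:]):
        delta_pkts = p1 - p0
        delta_errs = e1 - e0
        # Guard against an idle interval (no packets received).
        percent = 100.0 * delta_errs / delta_pkts if delta_pkts > 0 else 0.0
        results.append((delta_pkts, delta_errs, percent))
    return results


# Hypothetical readings, for illustration only.
samples = [(1_000_000, 10), (2_000_000, 10), (3_000_000, 60), (4_000_000, 60)]
rates = interval_error_rates(samples)
for delta_pkts, delta_errs, percent in rates:
    print(f"{delta_pkts} packets, {delta_errs} errors, {percent:.4f}%")
```

If the errors only show up in some intervals, that points towards bursts tied to particular traffic rather than a constant background rate.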
-
@ChrisJenk said in Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?:
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
ix1 9000 <Link#6> 90:ec:77:7f:c9:d5 31488071 1452 0 74590982 0 0
ix1 - fe80::%ix1/64 fe80::92ec:77ff:fe7f:c9d5%ix1 1060 - - 98 - -
Is that your WAN or LAN? While there's no problem with jumbo frames on the LAN, assuming other devices can handle them, you shouldn't be sending them to the WAN. PfSense should be sending ICMP too big messages when a jumbo frame tries to leave your LAN. You shouldn't be using jumbo frames on the WAN side, with the possible exception of if you're on Internet2.
-
@ChrisJenk said in Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?:
I am wondering if there may be a bug in the SFP+ driver, or some such?
That's always possible but it seems more likely it's actually packets that cannot be received correctly since the vast majority of ix installs do not see that.
@JKnott said in Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?:
You shouldn't be using jumbo frames on the WAN side
Yup that's true. Though I wouldn't expect to see receive errors generated by that.
-
@JKnott said in Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?:
@ChrisJenk said in Netgate 6100 SFP+ connection error rate of 0.0055%. Should I be worried?:
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
ix1 9000 <Link#6> 90:ec:77:7f:c9:d5 31488071 1452 0 74590982 0 0
ix1 - fe80::%ix1/64 fe80::92ec:77ff:fe7f:c9d5%ix1 1060 - - 98 - -
Is that your WAN or LAN? While there's no problem with jumbo frames on the LAN, assuming other devices can handle them, you shouldn't be sending them to the WAN. PfSense should be sending ICMP too big messages when a jumbo frame tries to leave your LAN. You shouldn't be using jumbo frames on the WAN side, with the possible exception of if you're on Internet2.
It's my LAN and I do use jumbo frames on that (carefully). However, the jumbo frames are restricted to two VLANs, neither of which is configured on the Netgate, so those frames should never actually reach the unit. I think that MTU is a hangover from an older config; I will set it back to the default.
However, in my current setup the LAN is now ix0 and it has an MTU of 1500.
Name Mtu Network Address Ipkts Ierrs Idrop Opkts **Oerrs** Coll
ix0 1500 <Link#5> 90:ec:77:7f:c9:d4 17034814 **387** 0 20616901 0 0
ix0 - fe80::%ix0/64 fe80::92ec:77ff:fe7f:c9d4%ix0 4078 - - 18875 - -
ix0 - 10.0.200.0/24 router 8993 - - 22104 - -
ix0 - fd00::/64 router 8936 - - 9272 - -
ix0 - xxxxxxxxx::/64 router.xxxxxxxxxxxx 0 - - 25628 - -
ix0 - yyyyyyyyy::/64 yyyyyyyyyyyyyy::1 0 - - 42 - -
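
For anyone wanting to put a number on it, here is a rough sketch of how the receive error rate can be worked out from the `<Link#N>` row of `netstat -i`. The helper names `parse_link_row` and `error_rate_percent` are made up for illustration, and the parser assumes the ten-column layout shown above with no formatting markers in the row.

```python
def parse_link_row(line):
    """Parse the <Link#N> row of `netstat -i` output into a dict.

    Assumes the column order:
    Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
    """
    fields = line.split()
    if len(fields) < 10:
        raise ValueError("expected 10 whitespace-separated columns")
    ipkts, ierrs, idrop, opkts, oerrs, coll = (int(f) for f in fields[4:10])
    return {
        "name": fields[0],
        "mtu": int(fields[1]),
        "ipkts": ipkts,
        "ierrs": ierrs,
        "idrop": idrop,
        "opkts": opkts,
        "oerrs": oerrs,
        "coll": coll,
    }


def error_rate_percent(ierrs, ipkts):
    """Receive errors as a percentage of received packets."""
    return 100.0 * ierrs / ipkts if ipkts else 0.0


# Link-level row for ix0, copied from the output above.
row = parse_link_row("ix0 1500 <Link#5> 90:ec:77:7f:c9:d4 17034814 387 0 20616901 0 0")
rate = error_rate_percent(row["ierrs"], row["ipkts"])
print(f"{row['name']}: {rate:.4f}% receive errors")
```

For the ix0 counters above this comes out at roughly 0.0023% of received packets.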