Weird Behavior with x710-da2 in 2.5.x
-
Hmm, doesn't sound fixed! Check Status > Interfaces for errors.
You might try just ifconfig down, ifconfig up on the interface instead of making a change. See if that brings the latency back to normal too. Or try just re-saving it in pfSense without making any change.
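Something like this from a shell (Diagnostics > Command Prompt works too), assuming the interface is ixl0 - substitute whatever yours is:
ifconfig ixl0 down
ifconfig ixl0 up
That just bounces the link without touching the config.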
Steve
-
Re-saving in pfSense doesn't change anything. down/up does fix the latency issue, but bandwidth is still low. No errors in Status > Interfaces. https://imgur.com/a/dYCnYJ3
-
Ah, try assigning and enabling ixl0 directly, even if you set it as type 'none'. It may not be applying the settings when only VLANs are assigned, and you may not be seeing errors on the parent interface.
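You can also compare what's actually applied on the parent vs. the VLANs from a shell, e.g. (assuming ixl0 is the parent and a VLAN tag of 100 - use your own tag):
ifconfig ixl0
ifconfig ixl0.100
netstat -i
Compare the options=<...> flags on the parent and the VLAN, and check the Ierrs/Oerrs columns from netstat -i.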
-
I added ixl0 as OPT1 with the type set to 'none' - no change. Did I understand that correctly?
-
Also, after unchecking the disable hardware checksum/TCP offloading options without rebooting, I'm getting 600/400Mbps - not quite 2.4.5 speeds, but it's encouraging.
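(For reference, I believe toggling those per interface from a shell is roughly the following, assuming ixl0:
ifconfig ixl0 rxcsum txcsum    # enable hardware checksum offload
ifconfig ixl0 -tso -lro        # disable TSO/LRO; drop the '-' to enable them
I think the GUI checkboxes apply the equivalent to all interfaces at once.)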
-
Hmm, and no errors shown on ixl0 in either state?
-
No, no errors on any of the 3 interfaces. Really confusing. There is a note in the boot log about the NVM version not being the version the driver expects and that the driver needs to be updated, but the same note is present in 2.4.5. I'm assuming that's just the ixl driver being a bit behind.
There is another user on Reddit who reported the same issue with ESXi 6.7. The card works with other VMs but not with 2.5.2.
There is also a note about the PCIe speed, but the card is in a PCIe 3.0 slot.
ixl0: <Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.3.0-k> mem 0xe6000000-0xe6ffffff,0xe7af8000-0xe7afffff irq 19 at device 0.0 on pci4
ixl0: fw 8.84.66032 api 1.14 nvm 8.40 etid 8000af82 oem 20.5120.13
ixl0: The driver for the device detected a newer version of the NVM image than expected.
ixl0: Please install the most recent version of the network driver.
ixl0: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 6 RX queues 6 TX queues
ixl0: failed to allocate 7 MSI-X vectors, err: 6
ixl0: Using an MSI interrupt
ixl0: Ethernet address: f8:f2:1e:87:a6:81
ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
ixl0: PCI Express Bus: Speed 5.0GT/s Unknown
ixl0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
ixl0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
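For completeness, the negotiated PCIe link and the MSI/MSI-X capabilities the card advertises should also show up in pciconf (look for the ixl0 entry):
pciconf -lvc
The PCI-Express capability line lists the link width/speed and the MSI-X line shows how many message vectors the card supports.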
-
The bus speed note is just informational; I wouldn't expect any issues because of that.
The firmware version could be a problem. I imagine the mismatch just isn't detected/shown in 2.4.5.
Can you test a 2.6 snapshot?
Steve
-
Another good suggestion, thanks. I ran 2.6.0.b.20220111.0600 and 2.7.0.a.20220115.0600 - same issue, I have to up/down the interface, and still no errors. I was able to get 800/450. No boot complaints about the network driver being out of sync.
-
Hmm, something has to be changing but it's hard to see what that could be just by down/up-ing the interface.
You could start checking the sysctl stats for ixl0, but there's a lot to wade through: sysctl dev.ixl.0
You might be able to spot some key difference between the two states.
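For example, grab a snapshot in each state and diff them:
sysctl dev.ixl.0 > /tmp/ixl-before
# down/up the interface and re-run the speed test, then:
sysctl dev.ixl.0 > /tmp/ixl-after
diff /tmp/ixl-before /tmp/ixl-after
That assumes the card is ixl0 (hence dev.ixl.0); adjust the unit number if not.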
Steve
-
Hi @ashtonianagain - looking at the logs above:
- If the card is PCI Express 3.0, I'd expect the bus speed to be higher (e.g. 8.0GT/s), but that will depend on which slot the card is sitting in on the motherboard - it could be that the bandwidth is shared or that the slot is only 2.0/2.1 capable.
- I also saw that your system defaulted to using MSI instead of MSI-X. Are you passing the card through to pfSense or going fully virtual? A couple of links to check out that may help:
https://forum.netgate.com/topic/158860/pfsense-latency-spikes-in-esxi
https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/
In particular, this setting might help with some of the performance issues you are seeing:
hw.pci.honor_msi_blacklist=0
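If it helps, I believe that one is a loader tunable (read-only at runtime), so to make it persistent it would go in /boot/loader.conf.local, something like:
echo 'hw.pci.honor_msi_blacklist="0"' >> /boot/loader.conf.local
then reboot and confirm with sysctl hw.pci.honor_msi_blacklist.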
Hope this helps.
-
That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm in setting it.
-
@stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:
That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm in setting it.
That's a good point - I was just going by the last post in this thread, thinking it might be worth a shot:
https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/5
-
@tman222 said in Weird Behavior with x710-da2 in 2.5.x:
hw.pci.honor_msi_blacklist=0
THIS WORKED! Thank you guys for your help. After setting this I've been able to get normal pings and the expected bandwidth performance.
Now I'm not sure if I should enable or disable all of the offloading for performance - what do you guys think?
-
Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.
Good catch @tman222
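You can see what the NIC actually ended up using with something like:
vmstat -i | grep ixl
With MSI-X you should see several per-queue vectors for ixl0; with plain MSI just a single interrupt.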
The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that means only checksum off-load which is still enabled by default.
Steve
-
@stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:
Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.
Good catch @tman222
The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that means only checksum off-load which is still enabled by default.
Steve
Hi @stephenw10 - looking at Jim's comment here and the bug report (towards the end):
https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/2
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203874
If I understand right, this fix never made it into FreeBSD 12. Would it be worth raising an issue on Redmine? At the very least, if the fix is missing, maybe add the sysctl tunable by default?
Thanks in advance!
-
I had thought this only applied to the vmxnet NICs, but if you look at the diff in the patch it actually applies to the VMware PCI bridge, so I guess this could still be in play.
It is odd though: it still comes up with 6 queues in the above example, just using MSI rather than MSI-X.
-
I am the other individual mentioned. I have a quad SFP+ PCI-e 3.x card on an ESXi 6.7U3 server. It's an OEM(?) card (branded Silicom PE310G4I71LB-XR) which runs older firmware. I have passed 2 ports through to pfSense and it had the same latency issues mentioned, but the card works perfectly fine in Windows 10, Server 2022 and Sophos UTM (bleh). I will attempt the above fix this weekend when I perform a switchover and report back.
-
Issue still present on the build below; resolved with the same change mentioned above.
2.7.0-DEVELOPMENT (amd64)
built on Fri Feb 04 19:41:27 UTC 2022
FreeBSD 12.3-STABLE
-
@deridiot said in Weird Behavior with x710-da2 in 2.5.x:
Issue still present on the build below; resolved with the same change mentioned above.
2.7.0-DEVELOPMENT (amd64)
built on Fri Feb 04 19:41:27 UTC 2022
FreeBSD 12.3-STABLE
Installing 2.6 and updating to 21.05.2 solved that problem for an X710-T2 adapter, so perhaps it could work for you too.