Netgate Discussion Forum

    Weird Behavior with x710-da2 in 2.5.x

ashtonianagain @stephenw10

Oops, forgot the reboot. Uhh, yeah, this is super weird.

So after the reboot, while the latency is still bad, the output is the same as before the reboot:

      ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0 prefixlen 64 scopeid 0x1
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      	plugged: SFP/SFP+/SFP28 10G Base-SR (LC)
      	vendor: FS PN: SFP-10G-T SN: F2030602793 DATE: 2021-06-29
      	module temperature: 44.77 C Voltage: 3.30 Volts
      	RX: 0.00 mW (-inf dBm) TX: 0.00 mW (-inf dBm)
      
      	SFF8472 DUMP (0xA0 0..127 range):
      	03 04 07 10 00 00 00 40 00 0C 00 06 67 00 00 00
      	03 01 00 00 46 53 20 20 20 20 20 20 20 20 20 20
      	20 20 20 20 00 00 1B 21 53 46 50 2D 31 30 47 2D
      	54 20 20 20 20 20 20 20 41 20 20 20 03 52 00 85
      	00 1A 00 00 46 32 30 33 30 36 30 32 37 39 33 20
      	20 20 20 20 32 31 30 36 32 39 20 20 68 90 01 6D
      	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      enc0: flags=0<> metric 0 mtu 1536
      	groups: enc
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
      	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
      	inet6 ::1 prefixlen 128
      	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
      	inet 127.0.0.1 netmask 0xff000000
      	groups: lo
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      pflog0: flags=100<PROMISC> metric 0 mtu 33160
      	groups: pflog
      pfsync0: flags=0<> metric 0 mtu 1500
      	groups: pfsync
      ixl0.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0.10 prefixlen 64 scopeid 0x6
      	inet 10.42.0.1 netmask 0xffffff00 broadcast 10.42.0.255
      	groups: vlan
      	vlan: 10 vlanpcp: 0 parent interface: ixl0
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      ixl0.999: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0.999 prefixlen 64 scopeid 0x7
      	inet 192.168.1.47 netmask 0xffffff00 broadcast 192.168.1.255
      	groups: vlan
      	vlan: 999 vlanpcp: 0 parent interface: ixl0
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
      

And then I check the box again, and it reverts to the original state but fixes the latency issues...

      ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	options=8100b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER>
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0 prefixlen 64 scopeid 0x1
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      	plugged: SFP/SFP+/SFP28 10G Base-SR (LC)
      	vendor: FS PN: SFP-10G-T SN: F2030602793 DATE: 2021-06-29
      	module temperature: 44.77 C Voltage: 3.30 Volts
      	RX: 0.00 mW (-inf dBm) TX: 0.00 mW (-inf dBm)
      
      	SFF8472 DUMP (0xA0 0..127 range):
      	03 04 07 10 00 00 00 40 00 0C 00 06 67 00 00 00
      	03 01 00 00 46 53 20 20 20 20 20 20 20 20 20 20
      	20 20 20 20 00 00 1B 21 53 46 50 2D 31 30 47 2D
      	54 20 20 20 20 20 20 20 41 20 20 20 03 52 00 85
      	00 1A 00 00 46 32 30 33 30 36 30 32 37 39 33 20
      	20 20 20 20 32 31 30 36 32 39 20 20 68 90 01 6D
      	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      enc0: flags=0<> metric 0 mtu 1536
      	groups: enc
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
      	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
      	inet6 ::1 prefixlen 128
      	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
      	inet 127.0.0.1 netmask 0xff000000
      	groups: lo
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      pflog0: flags=100<PROMISC> metric 0 mtu 33160
      	groups: pflog
      pfsync0: flags=0<> metric 0 mtu 1500
      	groups: pfsync
      ixl0.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0.10 prefixlen 64 scopeid 0x6
      	inet 10.42.0.1 netmask 0xffffff00 broadcast 10.42.0.255
      	groups: vlan
      	vlan: 10 vlanpcp: 0 parent interface: ixl0
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
      ixl0.999: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      	ether f8:f2:1e:87:a6:81
      	inet6 fe80::faf2:1eff:fe87:a681%ixl0.999 prefixlen 64 scopeid 0x7
      	inet 192.168.1.47 netmask 0xffffff00 broadcast 192.168.1.255
      	groups: vlan
      	vlan: 999 vlanpcp: 0 parent interface: ixl0
      	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
      	status: active
      	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
      
ashtonianagain @ashtonianagain

I'd also like to note that after further testing, everything works perfectly in pfSense 2.4.5 and I get 800/800 Mbps+. In 2.5.2 I get these latency issues out of the box, and even when I'm able to temporarily 'fix' them by modifying those settings, I'm only able to get ~100/100 Mbps. The VMs are the same, with the same passthrough settings.

stephenw10 Netgate Administrator

Hmm, doesn't sound fixed! Check Status > Interfaces for errors.

You might try just ifconfig down, ifconfig up on the interface instead of making a change. See if that brings the latency back to normal too. Or try just resaving it in pfSense without making a change.
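
For example, from a shell (a minimal sketch; ixl0 is the interface in question here):

	ifconfig ixl0 down
	ifconfig ixl0 up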

          Steve

ashtonianagain @stephenw10

Resaving in pfSense doesn't change anything. down/up does fix the latency issue. Still low bandwidth, and no errors in Status > Interfaces. https://imgur.com/a/dYCnYJ3

stephenw10 Netgate Administrator @ashtonianagain

Ah, try assigning and enabling ixl0 directly, even if you set it as type 'none'. It may not be applying the settings to VLANs only. And you may not be seeing the errors.

ashtonianagain

I added ixl0 as OPT1 with type set to 'none' - no change. Did I understand that correctly?

ashtonianagain @ashtonianagain

Also, unchecking the disable hardware checksum/TCP offloading options without rebooting, I'm getting 600/400 Mbps. Not quite 2.4.5 speeds, but it's encouraging.

stephenw10 Netgate Administrator

Hmm, and no errors shown on ixl0 in either state?

ashtonianagain @stephenw10

No, no errors on any of the 3 interfaces. Really confusing. There is a note in the boot log about the NVM version not being the version the driver expects and that the driver needs to be updated, but the same note is present in 2.4.5. I'm assuming that's just the ixl driver being a bit behind.

There is another user on Reddit who reported the same issue with ESXi 6.7. It works with other VMs but not with 2.5.2.

There's also a note about PCIe speed, but the card is in a PCIe 3.0 slot.

                      ixl0: <Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.3.0-k> mem 0xe6000000-0xe6ffffff,0xe7af8000-0xe7afffff irq 19 at device 0.0 on pci4
                      ixl0: fw 8.84.66032 api 1.14 nvm 8.40 etid 8000af82 oem 20.5120.13
                      ixl0: The driver for the device detected a newer version of the NVM image than expected.
                      ixl0: Please install the most recent version of the network driver.
                      ixl0: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
                      ixl0: Using 1024 TX descriptors and 1024 RX descriptors
                      ixl0: Using 6 RX queues 6 TX queues
                      ixl0: failed to allocate 7 MSI-X vectors, err: 6
                      ixl0: Using an MSI interrupt
                      ixl0: Ethernet address: f8:f2:1e:87:a6:81
                      ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
                      ixl0: PCI Express Bus: Speed 5.0GT/s Unknown
                      ixl0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
                      ixl0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
                      ixl0: SR-IOV ready
                      ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
                      ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
                      ixl0: link state changed to UP
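
For reference, the negotiated PCIe link speed and width can also be checked at runtime with pciconf (a sketch; the pci0:4:0:0 selector comes from the boot log above and may differ on other systems):

	# look for the "link xN ... speed" values on the PCI-Express capability line
	pciconf -lc ixl0@pci0:4:0:0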
                      
stephenw10 Netgate Administrator

The bus speed note is just informational; I wouldn't expect any issues because of that.

The firmware version could be a problem. I imagine the mismatch just isn't detected/shown in 2.4.5.

                        Can you test a 2.6 snapshot?

                        Steve

ashtonianagain

Another good suggestion, thanks. I ran 2.6.0.b.20220111.0600 and 2.7.0.a.20220115.0600 - same issue, I have to up/down the interface, no errors. I was able to get 800/450. No boot complaints about the network driver being out of sync.

stephenw10 Netgate Administrator

                            Hmm, something has to be changing but it's hard to see what that could be just by down/up-ing the interface.
                            You could start checking the sysctl stats for ixl0 but there's a lot to wade through:

                            sysctl dev.ixl.0
                            

                            You might be able to spot some key difference between the two states.
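
For example, one way to capture and compare the two states (a sketch; the file names are just illustrative):

	sysctl dev.ixl.0 > /tmp/ixl0-bad.txt
	# down/up the interface, then:
	sysctl dev.ixl.0 > /tmp/ixl0-good.txt
	diff /tmp/ixl0-bad.txt /tmp/ixl0-good.txt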

                            Steve

tman222

                              Hi @ashtonianagain - looking at the logs above:

1. If the card is PCI Express 3.0, I'd expect the bus speed to be higher (e.g. 8.0GT/s), but that will depend on which slot the card is sitting in on the motherboard - it could be that the bandwidth is shared or the slot is only 2.0/2.1 capable.
2. I also saw that your system defaulted to using MSI vs. MSI-X. Are you passing the card through to pfSense or going fully virtual? A couple of links to check out that may help:

                              https://forum.netgate.com/topic/158860/pfsense-latency-spikes-in-esxi
                              https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/

                              In particular, this setting might help with some of the performance issues you are seeing:

                              hw.pci.honor_msi_blacklist=0
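
Since this is a boot-time loader tunable, one way to apply it on pfSense (a sketch, assuming shell access) is to add it to /boot/loader.conf.local and reboot:

	echo 'hw.pci.honor_msi_blacklist=0' >> /boot/loader.conf.local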
                              

                              Hope this helps.

stephenw10 Netgate Administrator

That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm setting it.

tman222 @stephenw10

                                  @stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:

That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm setting it.

                                  That's a good point - I was just going by the last post in this thread, thinking it might be worth a shot:

                                  https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/5

ashtonianagain

                                    @tman222 said in Weird Behavior with x710-da2 in 2.5.x:

                                    hw.pci.honor_msi_blacklist=0

THIS WORKED! Thank you guys for your help. After setting this I've been able to get normal pings and the expected bandwidth performance.

Now I'm not sure if I should enable or disable all of the offloading for performance - what do you guys think?

stephenw10 Netgate Administrator

                                      Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.

                                      Good catch @tman222 👍

The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that leaves only checksum off-load, which is still enabled by default.
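
For testing, checksum off-load can also be toggled per interface from the shell (a sketch; changes made this way don't persist across reboots):

	ifconfig ixl0 -txcsum -rxcsum   # disable TX/RX checksum off-load
	ifconfig ixl0 txcsum rxcsum     # re-enable it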

                                      Steve

tman222 @stephenw10

                                        @stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:

                                        Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.

                                        Good catch @tman222 👍

The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that leaves only checksum off-load, which is still enabled by default.

                                        Steve

Hi @stephenw10 - looking at Jim's comment here and the bug report (towards the end):

                                        https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/2
                                        https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203874

If I understand right, this fix never made it into FreeBSD 12. Would it be worth raising an issue on Redmine? At the very least, if the fix is missing, maybe add the sysctl tunable?

                                        Thanks in advance!

stephenw10 Netgate Administrator

I had thought this only applied to the vmxnet NICs, but if you look at the diff in the patch it actually applied to the VMware PCI bridge, so I guess this could still be in play.
It is odd though; it still comes up with 6 queues in the above example, just using MSI, not MSI-X.

deridiot @stephenw10

                                            @stephenw10

I am the other individual mentioned. I have a quad SFP+ PCIe 3.x card in an ESXi 6.7U3 server. It's an OEM(?) card (branded Silicom PE310G4I71LB-XR) which runs older firmware. I have passed through 2 ports to pfSense and it had the same latency issues mentioned, but it works perfectly fine in Windows 10, Server 2022, and Sophos UTM (bleh). I will attempt the above fix this weekend when I perform a switchover and report back.
