Netgate Discussion Forum

    Weird Behavior with x710-da2 in 2.5.x

    Hardware · 26 Posts · 5 Posters · 4.7k Views
    • stephenw10 Netgate Administrator

      Hmm, but rebooting after having enabled it comes up the same?

      And then disabling it again corrects the ping latency?

      Steve

      • ashtonianagain @stephenw10

        Oops, forgot the reboot. Uh, yeah, this is super weird.

        So after the reboot, while the latency is bad, the output is the same as before the reboot:

        ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0 prefixlen 64 scopeid 0x1
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        	plugged: SFP/SFP+/SFP28 10G Base-SR (LC)
        	vendor: FS PN: SFP-10G-T SN: F2030602793 DATE: 2021-06-29
        	module temperature: 44.77 C Voltage: 3.30 Volts
        	RX: 0.00 mW (-inf dBm) TX: 0.00 mW (-inf dBm)
        
        	SFF8472 DUMP (0xA0 0..127 range):
        	03 04 07 10 00 00 00 40 00 0C 00 06 67 00 00 00
        	03 01 00 00 46 53 20 20 20 20 20 20 20 20 20 20
        	20 20 20 20 00 00 1B 21 53 46 50 2D 31 30 47 2D
        	54 20 20 20 20 20 20 20 41 20 20 20 03 52 00 85
        	00 1A 00 00 46 32 30 33 30 36 30 32 37 39 33 20
        	20 20 20 20 32 31 30 36 32 39 20 20 68 90 01 6D
        	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        enc0: flags=0<> metric 0 mtu 1536
        	groups: enc
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        	inet6 ::1 prefixlen 128
        	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        	inet 127.0.0.1 netmask 0xff000000
        	groups: lo
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        pflog0: flags=100<PROMISC> metric 0 mtu 33160
        	groups: pflog
        pfsync0: flags=0<> metric 0 mtu 1500
        	groups: pfsync
        ixl0.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0.10 prefixlen 64 scopeid 0x6
        	inet 10.42.0.1 netmask 0xffffff00 broadcast 10.42.0.255
        	groups: vlan
        	vlan: 10 vlanpcp: 0 parent interface: ixl0
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        ixl0.999: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0.999 prefixlen 64 scopeid 0x7
        	inet 192.168.1.47 netmask 0xffffff00 broadcast 192.168.1.255
        	groups: vlan
        	vlan: 999 vlanpcp: 0 parent interface: ixl0
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
        

        And then I check the box again, and it reverts to the original state but fixes the latency issues...

        ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	options=8100b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER>
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0 prefixlen 64 scopeid 0x1
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        	plugged: SFP/SFP+/SFP28 10G Base-SR (LC)
        	vendor: FS PN: SFP-10G-T SN: F2030602793 DATE: 2021-06-29
        	module temperature: 44.77 C Voltage: 3.30 Volts
        	RX: 0.00 mW (-inf dBm) TX: 0.00 mW (-inf dBm)
        
        	SFF8472 DUMP (0xA0 0..127 range):
        	03 04 07 10 00 00 00 40 00 0C 00 06 67 00 00 00
        	03 01 00 00 46 53 20 20 20 20 20 20 20 20 20 20
        	20 20 20 20 00 00 1B 21 53 46 50 2D 31 30 47 2D
        	54 20 20 20 20 20 20 20 41 20 20 20 03 52 00 85
        	00 1A 00 00 46 32 30 33 30 36 30 32 37 39 33 20
        	20 20 20 20 32 31 30 36 32 39 20 20 68 90 01 6D
        	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        enc0: flags=0<> metric 0 mtu 1536
        	groups: enc
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        	inet6 ::1 prefixlen 128
        	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        	inet 127.0.0.1 netmask 0xff000000
        	groups: lo
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        pflog0: flags=100<PROMISC> metric 0 mtu 33160
        	groups: pflog
        pfsync0: flags=0<> metric 0 mtu 1500
        	groups: pfsync
        ixl0.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0.10 prefixlen 64 scopeid 0x6
        	inet 10.42.0.1 netmask 0xffffff00 broadcast 10.42.0.255
        	groups: vlan
        	vlan: 10 vlanpcp: 0 parent interface: ixl0
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        ixl0.999: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        	ether f8:f2:1e:87:a6:81
        	inet6 fe80::faf2:1eff:fe87:a681%ixl0.999 prefixlen 64 scopeid 0x7
        	inet 192.168.1.47 netmask 0xffffff00 broadcast 192.168.1.255
        	groups: vlan
        	vlan: 999 vlanpcp: 0 parent interface: ixl0
        	media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        	status: active
        	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
        
        • ashtonianagain @ashtonianagain

          I'd also like to note that after further testing, everything works perfectly in pfSense 2.4.5 and I get 800/800 Mbps+. In 2.5.2 I get these latency issues out of the box, and even when I am able to temporarily 'fix' them by modifying those settings, I'm only able to get ~100/100 Mbps. The VMs are the same, with the same passthrough settings.

          • stephenw10 Netgate Administrator

            Hmm, doesn't sound fixed! Check Status > Interfaces for errors.

            You might try just ifconfig down, ifconfig up on the interface instead of making a change. See if that brings the latency back to normal too. Or try just resaving it in pfSense without making a change.
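
            For reference, a minimal sketch of that test from the shell (assuming ixl0 is the parent interface):

            # bounce the physical interface, then re-test the ping latency
            ifconfig ixl0 down
            ifconfig ixl0 up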

            Steve

            • ashtonianagain @stephenw10

              Resaving in pfSense doesn't change anything. down/up does fix the latency issue. Still low bandwidth, and no errors in Status > Interfaces. https://imgur.com/a/dYCnYJ3

              • stephenw10 Netgate Administrator @ashtonianagain

                Ah, try assigning and enabling ixl0 directly, even if you set it as type 'none'. Otherwise it may be applying the settings to the VLANs only, and you may not be seeing the errors.

                • ashtonianagain

                  I added ixl0 as OPT1 with type set to 'none' - no change. Did I understand that correctly?

                  • ashtonianagain @ashtonianagain

                    Also, unchecking the disable hardware checksum/TCP offloading box without rebooting, I'm getting 600/400 Mbps - not quite 2.4.5 speeds, but it's encouraging.
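
                    For reference, a sketch of what that checkbox maps to at the shell level (assuming ixl0; the GUI setting should have the equivalent effect on save):

                    # disable hardware checksum offload for IPv4 and IPv6
                    ifconfig ixl0 -rxcsum -txcsum -rxcsum6 -txcsum6
                    # re-enable it
                    ifconfig ixl0 rxcsum txcsum rxcsum6 txcsum6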

                    • stephenw10 Netgate Administrator

                      Hmm, and no errors shown on ixl0 in either state?

                      • ashtonianagain @stephenw10

                        No, no errors on any of the 3 interfaces. Really confusing. There is a note in the boot log about the NVM version not being the version the driver expects and that the driver needs to be updated, but the same note is present in 2.4.5. I'm assuming that's just the ixl driver being a bit behind.

                        There is another user on Reddit who reported the same issue with ESXi 6.7. Works with other VMs but not 2.5.2.

                        There's also a note about PCIe speed, but the card is in a PCIe 3.0 slot.
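
                        For reference, those lines can be pulled from the boot log with something like:

                        dmesg | grep ^ixl0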

                        ixl0: <Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.3.0-k> mem 0xe6000000-0xe6ffffff,0xe7af8000-0xe7afffff irq 19 at device 0.0 on pci4
                        ixl0: fw 8.84.66032 api 1.14 nvm 8.40 etid 8000af82 oem 20.5120.13
                        ixl0: The driver for the device detected a newer version of the NVM image than expected.
                        ixl0: Please install the most recent version of the network driver.
                        ixl0: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
                        ixl0: Using 1024 TX descriptors and 1024 RX descriptors
                        ixl0: Using 6 RX queues 6 TX queues
                        ixl0: failed to allocate 7 MSI-X vectors, err: 6
                        ixl0: Using an MSI interrupt
                        ixl0: Ethernet address: f8:f2:1e:87:a6:81
                        ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
                        ixl0: PCI Express Bus: Speed 5.0GT/s Unknown
                        ixl0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
                        ixl0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
                        ixl0: SR-IOV ready
                        ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
                        ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
                        ixl0: link state changed to UP
                        
                        • stephenw10 Netgate Administrator

                          The bus speed note is just informational; I wouldn't expect any issues because of that.

                          The firmware version could be a problem. I imagine the mismatch just isn't detected/shown in 2.4.5.

                          Can you test a 2.6 snapshot?

                          Steve

                          • ashtonianagain

                            Another good suggestion, thanks. Ran 2.6.0.b.20220111.0600 and 2.7.0.a.20220115.0600 - same issue; I have to up/down the interface, no errors. I was able to get 800/450. No boot complaints about the network driver being out of sync.

                            • stephenw10 Netgate Administrator

                              Hmm, something has to be changing, but it's hard to see what that could be just by down/up-ing the interface.
                              You could start checking the sysctl stats for ixl0, but there's a lot to wade through:

                              sysctl dev.ixl.0
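
                              For example, a sketch of capturing and diffing the two states (file names are illustrative):

                              # capture stats in the bad state, bounce the interface, capture again, then diff
                              sysctl dev.ixl.0 > /tmp/ixl0-bad.txt
                              ifconfig ixl0 down && ifconfig ixl0 up
                              sysctl dev.ixl.0 > /tmp/ixl0-good.txt
                              diff /tmp/ixl0-bad.txt /tmp/ixl0-good.txt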
                              

                              You might be able to spot some key difference between the two states.

                              Steve

                              • tman222

                                Hi @ashtonianagain - looking at the logs above:

                                1. If the card is PCI Express 3.0, I'd expect the bus speed to be higher (e.g. 8.0GT/s), but that will depend on which slot the card is sitting in on the motherboard - it could be that the bandwidth is shared or the slot is only 2.0/2.1 capable (see the pciconf sketch after the links below).
                                2. I also saw that your system defaulted to using MSI vs. MSI-X. Are you passing the card through to pfSense or going fully virtual? A couple of links to check out that may help:

                                https://forum.netgate.com/topic/158860/pfsense-latency-spikes-in-esxi
                                https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/
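
                                As a quick check on point 1, FreeBSD's pciconf can report the negotiated PCIe link speed and width (assuming the device shows up as ixl0):

                                # list the device's PCI capabilities, including the PCI-Express link status
                                pciconf -lc ixl0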

                                In particular, this setting from those threads might help with some of the performance issues you are seeing:

                                hw.pci.honor_msi_blacklist=0
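
                                On pfSense this is a loader tunable, so as far as I know it belongs in /boot/loader.conf.local and takes effect after a reboot:

                                # /boot/loader.conf.local - read at boot; reboot after adding
                                hw.pci.honor_msi_blacklist="0"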
                                

                                Hope this helps.

                                • stephenw10 Netgate Administrator

                                  That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm in setting it.

                                  • tman222 @stephenw10

                                    @stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:

                                    That sysctl usually only affects vmxnet NICs, which are on the MSI blacklist by default. But there's no harm in setting it.

                                    That's a good point - I was just going by the last post in this thread, thinking it might be worth a shot:

                                    https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/5

                                    • ashtonianagain

                                      @tman222 said in Weird Behavior with x710-da2 in 2.5.x:

                                      hw.pci.honor_msi_blacklist=0

                                      THIS WORKED! Thank you guys for your help. After setting this, I've been able to get normal pings and the expected bandwidth performance.

                                      Now I'm not sure if I should enable or disable all of the offloading for performance - what do you guys think?

                                      • stephenw10 Netgate Administrator

                                        Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.

                                        Good catch @tman222 👍

                                        The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that means only checksum off-load which is still enabled by default.

                                        Steve

                                        • tman222 @stephenw10

                                          @stephenw10 said in Weird Behavior with x710-da2 in 2.5.x:

                                          Nice. That's interesting. I wonder what is on that list that affects pass-through NICs.

                                          Good catch @tman222 👍

                                          The hardware off-load settings rarely make much difference. I would certainly not enable anything that's disabled by default. So that means only checksum off-load which is still enabled by default.

                                          Steve

                                          Hi @stephenw10 - looking at Jim's comment here and the bug report (towards the end):

                                          https://forum.netgate.com/topic/157688/remove-vmware-msi-x-from-the-pci-blacklist/2
                                          https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203874

                                          If I understand right, this fix never made it into FreeBSD 12. Would it be worth raising an issue on Redmine? At the very least, if the fix is missing, maybe add the sysctl tunable?

                                          Thanks in advance!

                                          • stephenw10 Netgate Administrator

                                            I had thought this only applied to the vmxnet NICs, but if you look at the diff in the patch it actually applies to the VMware PCI bridge, so I guess this could still be in play.
                                            It is odd, though: it still comes up with 6 queues in the above example, just using MSI rather than MSI-X.
