Intel XL710-BM1 based card issues
-
Hi everyone, I recently got an fs.com Intel XL710-BM1 based NIC (4x SFP+) to build a new router. The box is an Intel Core i3 10th gen with 16 GB of RAM and an SSD as the boot drive.
My plan is for it to be a future-proof router/firewall for when my ISP starts providing >1 Gbps speeds. I'm using the card's ports as follows:
ixl0 - main WAN (10GbE Intel SFP+ adapter)
ixl1 - secondary WAN, load balanced with the main WAN (10GbE Intel SFP+ adapter)
ixl2 - LAN LAGG with ixl3 (DAC)
ixl3 - LAN LAGG with ixl2 (DAC)
This gives me a 20Gbps link to an aggregation switch that connects to everything else around the house, plus two possible 10Gbps WAN links. I'm currently using two 1Gbps links via my ISP's bridged ONT/modem combo.
I have:
Hardware Checksum Offloading ON
Hardware TCP Segmentation Offloading ON
Hardware Large Receive Offloading OFF (as recommended in the ixl driver information)
(I've also played around with these and they didn't change anything related to my problems.)
My card firmware is the following:
dev.ixl.0.fw_version: fw 8.5.67516 api 1.15 nvm 8.50 etid 8000b6de oem 1.263.0
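For reference, these offloads can also be toggled per interface from the shell (a sketch; on pfSense the checksum/TSO/LRO checkboxes in the GUI normally control them, so manual changes may be overwritten on the next config sync):

```sh
# Disable checksum offload, TSO and LRO on the first port
ifconfig ixl0 -rxcsum -rxcsum6 -txcsum -txcsum6 -tso -lro
# Re-enable checksum offload and TSO
ifconfig ixl0 rxcsum rxcsum6 txcsum txcsum6 tso
# Verify the resulting options mask
ifconfig ixl0 | grep options
```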
The issues I'm having
- My main WAN connection seems to be resetting its offloading configuration after a while. In ifconfig it should report
options=e503bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
but after every reboot, and again after some time, the options change to
options=8500b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO>
This only happens on this interface.
- When a lot of UDP traffic is coming in on any of the WAN interfaces, they start dropping packets and the ping increases by a lot (~50ms to 2s increase) according to the gateway monitoring service, though the internet connection seems fine. I've tried tuning this with information from the pfSense tuning guide and some forum posts but couldn't get it fixed.
I've tried compiling the latest 1.12.29 ixl driver and installing it on pfSense, but the packet loss problem got even worse.
Has anyone got any experience with something like this? What else could I possibly try to fix the packet loss problem?
Thanks!
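The kind of tuning I've been experimenting with from the guide looks roughly like this (illustrative values only, not a recommendation; these are loader tunables, set from /boot/loader.conf.local on pfSense and needing a reboot, and the override_* names are the iflib knobs the driver exposes):

```sh
# /boot/loader.conf.local - example ring/interrupt tunables (values are guesses to experiment with)
dev.ixl.0.iflib.override_nrxds="2048"   # larger RX descriptor ring, driver limits apply
dev.ixl.0.iflib.override_ntxds="2048"   # larger TX descriptor ring
dev.ixl.0.rx_itr="62"                   # RX interrupt throttling rate
```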
-
Hmm, how are you measuring the loss? What is your WAN monitoring set to?
Normally I would recommend disabling all the hardware offloading if you have any issues anyway. Many of those can be set per interface via sysctls though. So if that's the first NIC, check:
sysctl dev.ixl.0
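Or, to focus on the counters that usually matter for drops, something like:

```sh
# Filter the dump down to error/drop counters (names assumed from the ixl driver's sysctl tree)
sysctl dev.ixl.0 | egrep 'rx_discards|desc_err|no_desc_avail|crc_errors|local_faults|remote_faults'
```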
Steve
-
@stephenw10
I'm measuring the loss with the gateway monitoring tool built into the dashboard. I've tried monitoring the default gateway IP my DHCP WAN connection gets, and also some public DNS servers like 8.8.8.8 and 1.1.1.1. I've also measured some loss while pinging a server on the internet from my LAN.
I've played around with all possible offloading settings and the results were the same.
Here's the output from sysctl dev.ixl.0 (main WAN connection):
dev.ixl.0.wake: 0
dev.ixl.0.mac.xoff_recvd: 0
dev.ixl.0.mac.xoff_txd: 0
dev.ixl.0.mac.xon_recvd: 0
dev.ixl.0.mac.xon_txd: 0
dev.ixl.0.mac.tx_frames_big: 0
dev.ixl.0.mac.tx_frames_1024_1522: 18778282
dev.ixl.0.mac.tx_frames_512_1023: 49600
dev.ixl.0.mac.tx_frames_256_511: 298983
dev.ixl.0.mac.tx_frames_128_255: 1117686
dev.ixl.0.mac.tx_frames_65_127: 13985802
dev.ixl.0.mac.tx_frames_64: 1798195
dev.ixl.0.mac.checksum_errors: 0
dev.ixl.0.mac.rx_jabber: 0
dev.ixl.0.mac.rx_oversized: 0
dev.ixl.0.mac.rx_fragmented: 0
dev.ixl.0.mac.rx_undersize: 0
dev.ixl.0.mac.rx_frames_big: 0
dev.ixl.0.mac.rx_frames_1024_1522: 90114760
dev.ixl.0.mac.rx_frames_512_1023: 113876
dev.ixl.0.mac.rx_frames_256_511: 920153
dev.ixl.0.mac.rx_frames_128_255: 1721526
dev.ixl.0.mac.rx_frames_65_127: 4461360
dev.ixl.0.mac.rx_frames_64: 3618517
dev.ixl.0.mac.rx_length_errors: 0
dev.ixl.0.mac.remote_faults: 1
dev.ixl.0.mac.local_faults: 1
dev.ixl.0.mac.illegal_bytes: 0
dev.ixl.0.mac.crc_errors: 0
dev.ixl.0.mac.bcast_pkts_txd: 37
dev.ixl.0.mac.mcast_pkts_txd: 1865
dev.ixl.0.mac.ucast_pkts_txd: 36026646
dev.ixl.0.mac.good_octets_txd: 30138288202
dev.ixl.0.mac.rx_discards: 0
dev.ixl.0.mac.bcast_pkts_rcvd: 0
dev.ixl.0.mac.mcast_pkts_rcvd: 888
dev.ixl.0.mac.ucast_pkts_rcvd: 100949304
dev.ixl.0.mac.good_octets_rcvd: 138060991717
dev.ixl.0.pf.txq03.itr: 122
dev.ixl.0.pf.txq03.bytes: 6777214411
dev.ixl.0.pf.txq03.packets: 8006676
dev.ixl.0.pf.txq03.mss_too_small: 0
dev.ixl.0.pf.txq03.tso: 0
dev.ixl.0.pf.txq02.itr: 122
dev.ixl.0.pf.txq02.bytes: 8488906955
dev.ixl.0.pf.txq02.packets: 10423485
dev.ixl.0.pf.txq02.mss_too_small: 0
dev.ixl.0.pf.txq02.tso: 0
dev.ixl.0.pf.txq01.itr: 122
dev.ixl.0.pf.txq01.bytes: 7054333499
dev.ixl.0.pf.txq01.packets: 8560584
dev.ixl.0.pf.txq01.mss_too_small: 0
dev.ixl.0.pf.txq01.tso: 1
dev.ixl.0.pf.txq00.itr: 122
dev.ixl.0.pf.txq00.bytes: 7662717174
dev.ixl.0.pf.txq00.packets: 9036589
dev.ixl.0.pf.txq00.mss_too_small: 0
dev.ixl.0.pf.txq00.tso: 0
dev.ixl.0.pf.rxq03.itr: 62
dev.ixl.0.pf.rxq03.desc_err: 0
dev.ixl.0.pf.rxq03.bytes: 43745309924
dev.ixl.0.pf.rxq03.packets: 31377789
dev.ixl.0.pf.rxq03.irqs: 9385422
dev.ixl.0.pf.rxq02.itr: 62
dev.ixl.0.pf.rxq02.desc_err: 0
dev.ixl.0.pf.rxq02.bytes: 29771692282
dev.ixl.0.pf.rxq02.packets: 22140631
dev.ixl.0.pf.rxq02.irqs: 9035943
dev.ixl.0.pf.rxq01.itr: 62
dev.ixl.0.pf.rxq01.desc_err: 0
dev.ixl.0.pf.rxq01.bytes: 33051286308
dev.ixl.0.pf.rxq01.packets: 24568731
dev.ixl.0.pf.rxq01.irqs: 8834556
dev.ixl.0.pf.rxq00.itr: 62
dev.ixl.0.pf.rxq00.desc_err: 0
dev.ixl.0.pf.rxq00.bytes: 31088674507
dev.ixl.0.pf.rxq00.packets: 22861444
dev.ixl.0.pf.rxq00.irqs: 8369457
dev.ixl.0.pf.bcast_pkts_txd: 37
dev.ixl.0.pf.mcast_pkts_txd: 624
dev.ixl.0.pf.ucast_pkts_txd: 36026646
dev.ixl.0.pf.good_octets_txd: 29983164822
dev.ixl.0.pf.rx_discards: 2288
dev.ixl.0.pf.bcast_pkts_rcvd: 0
dev.ixl.0.pf.mcast_pkts_rcvd: 884
dev.ixl.0.pf.ucast_pkts_rcvd: 100953626
dev.ixl.0.pf.good_octets_rcvd: 138061874353
dev.ixl.0.admin_irq: 2
dev.ixl.0.eee.rx_lpi_count: 0
dev.ixl.0.eee.tx_lpi_count: 0
dev.ixl.0.eee.rx_lpi_status: 0
dev.ixl.0.eee.tx_lpi_status: 0
dev.ixl.0.eee.enable: 0
dev.ixl.0.fw_lldp: 1
dev.ixl.0.dynamic_tx_itr: 0
dev.ixl.0.dynamic_rx_itr: 0
dev.ixl.0.rx_itr: 62
dev.ixl.0.tx_itr: 122
dev.ixl.0.unallocated_queues: 380
dev.ixl.0.fw_version: fw 8.5.67516 api 1.15 nvm 8.50 etid 8000b6de oem 1.263.0
dev.ixl.0.current_speed: 10 Gbps
dev.ixl.0.supported_speeds: 6
dev.ixl.0.advertise_speed: 6
dev.ixl.0.fc: 0
dev.ixl.0.iflib.rxq3.rxq_fl0.buf_size: 2048
dev.ixl.0.iflib.rxq3.rxq_fl0.credits: 1023
dev.ixl.0.iflib.rxq3.rxq_fl0.cidx: 135
dev.ixl.0.iflib.rxq3.rxq_fl0.pidx: 134
dev.ixl.0.iflib.rxq2.rxq_fl0.buf_size: 2048
dev.ixl.0.iflib.rxq2.rxq_fl0.credits: 1023
dev.ixl.0.iflib.rxq2.rxq_fl0.cidx: 340
dev.ixl.0.iflib.rxq2.rxq_fl0.pidx: 339
dev.ixl.0.iflib.rxq1.rxq_fl0.buf_size: 2048
dev.ixl.0.iflib.rxq1.rxq_fl0.credits: 1023
dev.ixl.0.iflib.rxq1.rxq_fl0.cidx: 184
dev.ixl.0.iflib.rxq1.rxq_fl0.pidx: 183
dev.ixl.0.iflib.rxq0.rxq_fl0.buf_size: 2048
dev.ixl.0.iflib.rxq0.rxq_fl0.credits: 1023
dev.ixl.0.iflib.rxq0.rxq_fl0.cidx: 460
dev.ixl.0.iflib.rxq0.rxq_fl0.pidx: 459
dev.ixl.0.iflib.txq3.r_abdications: 0
dev.ixl.0.iflib.txq3.r_restarts: 0
dev.ixl.0.iflib.txq3.r_stalls: 0
dev.ixl.0.iflib.txq3.r_starts: 7997959
dev.ixl.0.iflib.txq3.r_drops: 0
dev.ixl.0.iflib.txq3.r_enqueues: 7997959
dev.ixl.0.iflib.txq3.ring_state: pidx_head: 1044 pidx_tail: 1044 cidx: 1044 state: IDLE
dev.ixl.0.iflib.txq3.txq_cleaned: 12138890
dev.ixl.0.iflib.txq3.txq_processed: 12138898
dev.ixl.0.iflib.txq3.txq_in_use: 8
dev.ixl.0.iflib.txq3.txq_cidx_processed: 402
dev.ixl.0.iflib.txq3.txq_cidx: 394
dev.ixl.0.iflib.txq3.txq_pidx: 402
dev.ixl.0.iflib.txq3.no_tx_dma_setup: 0
dev.ixl.0.iflib.txq3.txd_encap_efbig: 0
dev.ixl.0.iflib.txq3.tx_map_failed: 0
dev.ixl.0.iflib.txq3.no_desc_avail: 0
dev.ixl.0.iflib.txq3.mbuf_defrag_failed: 0
dev.ixl.0.iflib.txq3.m_pullups: 0
dev.ixl.0.iflib.txq3.mbuf_defrag: 0
dev.ixl.0.iflib.txq2.r_abdications: 0
dev.ixl.0.iflib.txq2.r_restarts: 0
dev.ixl.0.iflib.txq2.r_stalls: 0
dev.ixl.0.iflib.txq2.r_starts: 10226616
dev.ixl.0.iflib.txq2.r_drops: 0
dev.ixl.0.iflib.txq2.r_enqueues: 10226618
dev.ixl.0.iflib.txq2.ring_state: pidx_head: 1213 pidx_tail: 1213 cidx: 1213 state: IDLE
dev.ixl.0.iflib.txq2.txq_cleaned: 15324161
dev.ixl.0.iflib.txq2.txq_processed: 15324169
dev.ixl.0.iflib.txq2.txq_in_use: 8
dev.ixl.0.iflib.txq2.txq_cidx_processed: 9
dev.ixl.0.iflib.txq2.txq_cidx: 1
dev.ixl.0.iflib.txq2.txq_pidx: 9
dev.ixl.0.iflib.txq2.no_tx_dma_setup: 0
dev.ixl.0.iflib.txq2.txd_encap_efbig: 0
dev.ixl.0.iflib.txq2.tx_map_failed: 0
dev.ixl.0.iflib.txq2.no_desc_avail: 0
dev.ixl.0.iflib.txq2.mbuf_defrag_failed: 0
dev.ixl.0.iflib.txq2.m_pullups: 0
dev.ixl.0.iflib.txq2.mbuf_defrag: 0
dev.ixl.0.iflib.txq1.r_abdications: 0
dev.ixl.0.iflib.txq1.r_restarts: 0
dev.ixl.0.iflib.txq1.r_stalls: 0
dev.ixl.0.iflib.txq1.r_starts: 8549118
dev.ixl.0.iflib.txq1.r_drops: 0
dev.ixl.0.iflib.txq1.r_enqueues: 8549173
dev.ixl.0.iflib.txq1.ring_state: pidx_head: 1992 pidx_tail: 1992 cidx: 1992 state: IDLE
dev.ixl.0.iflib.txq1.txq_cleaned: 12612789
dev.ixl.0.iflib.txq1.txq_processed: 12612797
dev.ixl.0.iflib.txq1.txq_in_use: 8
dev.ixl.0.iflib.txq1.txq_cidx_processed: 189
dev.ixl.0.iflib.txq1.txq_cidx: 181
dev.ixl.0.iflib.txq1.txq_pidx: 189
dev.ixl.0.iflib.txq1.no_tx_dma_setup: 0
dev.ixl.0.iflib.txq1.txd_encap_efbig: 0
dev.ixl.0.iflib.txq1.tx_map_failed: 0
dev.ixl.0.iflib.txq1.no_desc_avail: 0
dev.ixl.0.iflib.txq1.mbuf_defrag_failed: 0
dev.ixl.0.iflib.txq1.m_pullups: 0
dev.ixl.0.iflib.txq1.mbuf_defrag: 0
dev.ixl.0.iflib.txq0.r_abdications: 0
dev.ixl.0.iflib.txq0.r_restarts: 0
dev.ixl.0.iflib.txq0.r_stalls: 0
dev.ixl.0.iflib.txq0.r_starts: 8985020
dev.ixl.0.iflib.txq0.r_drops: 0
dev.ixl.0.iflib.txq0.r_enqueues: 8985180
dev.ixl.0.iflib.txq0.ring_state: pidx_head: 0813 pidx_tail: 0813 cidx: 0813 state: IDLE
dev.ixl.0.iflib.txq0.txq_cleaned: 13362500
dev.ixl.0.iflib.txq0.txq_processed: 13362508
dev.ixl.0.iflib.txq0.txq_in_use: 8
dev.ixl.0.iflib.txq0.txq_cidx_processed: 332
dev.ixl.0.iflib.txq0.txq_cidx: 324
dev.ixl.0.iflib.txq0.txq_pidx: 332
dev.ixl.0.iflib.txq0.no_tx_dma_setup: 0
dev.ixl.0.iflib.txq0.txd_encap_efbig: 0
dev.ixl.0.iflib.txq0.tx_map_failed: 0
dev.ixl.0.iflib.txq0.no_desc_avail: 0
dev.ixl.0.iflib.txq0.mbuf_defrag_failed: 0
dev.ixl.0.iflib.txq0.m_pullups: 0
dev.ixl.0.iflib.txq0.mbuf_defrag: 0
dev.ixl.0.iflib.override_nrxds: 0
dev.ixl.0.iflib.override_ntxds: 0
dev.ixl.0.iflib.separate_txrx: 0
dev.ixl.0.iflib.core_offset: 0
dev.ixl.0.iflib.tx_abdicate: 0
dev.ixl.0.iflib.rx_budget: 0
dev.ixl.0.iflib.disable_msix: 0
dev.ixl.0.iflib.override_qs_enable: 0
dev.ixl.0.iflib.override_nrxqs: 0
dev.ixl.0.iflib.override_ntxqs: 0
dev.ixl.0.iflib.driver_version: 2.3.0-k
dev.ixl.0.%parent: pci1
dev.ixl.0.%pnpinfo: vendor=0x8086 device=0x1572 subvendor=0x8086 subdevice=0x0000 class=0x020000
dev.ixl.0.%location: slot=0 function=0 dbsf=pci0:1:0:0 handle=\_SB_.PCI0.PEG0.PEGP
dev.ixl.0.%driver: ixl
dev.ixl.0.%desc: Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.3.0-k
I have been trying to diagnose this for a few days; if there's any more information you need, let me know.
-
You might try disabling fw_lldp there, that has been known to cause issues.
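E.g., assuming the first port (the sysctl should take effect at runtime, and a line in /boot/loader.conf.local makes it stick across reboots):

```sh
# Disable the firmware LLDP agent on ixl0
sysctl dev.ixl.0.fw_lldp=0
# Persist it across reboots (pfSense keeps custom tunables in loader.conf.local)
echo 'dev.ixl.0.fw_lldp="0"' >> /boot/loader.conf.local
```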
Are you sure this is an issue with the NIC and not the modem or just the WAN connection?
Steve
-
@stephenw10
Thanks for the recommendation, I'll try it and see how it handles over the next few hours/day. I believe it's NIC related, as I was previously using a Ubiquiti EdgeRouter and never had a problem with dropped packets on it.
-
Hi @stephenw10,
After a day I'm still getting some packet loss on both WAN interfaces, but it does seem better.
Packet loss is down to around ~2% at heavy load times; it used to be ~4-6%. Any other suggestions on ways to debug this?
Thanks!
-
Some more information that might be relevant.
After a reboot this is what I'm getting:
$ ifconfig ixl0
ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: WAN
	options=e503bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
	ether 64:9d:99:b1:8c:18
	inet6 fe80::669d:99ff:feb1:8c18%ixl0 prefixlen 64 scopeid 0x1
	inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx
	media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
	status: active
	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
$ netstat -I ixl0
Name    Mtu Network       Address              Ipkts Ierrs Idrop  Opkts Oerrs  Coll
ixl0   1500 <Link#1>      64:9d:99:b1:8c:18   855346     0  3125 385867     0     0
ixl0      - fe80::%ixl0/6 fe80::669d:99ff:f       18     -     -    166     -     -
ixl0      - xx.xx.xx.xxx/ blxx-xxx-xxx.dsl.     6235     -     -      5     -     -
$ vmstat -z
ITEM          SIZE  LIMIT    USED   FREE     REQ  FAIL SLEEP
UMA Kegs:      224,     0,    150,     3,    152,    0,    0
UMA Zones:    1448,     0,    167,     1,    169,    0,    0
UMA Slabs:      80,     0,  16522,    28,  16927,    0,    0
UMA Hash:      256,     0,      6,     9,     15,    0,    0
4 Bucket:       32,     0,    204,  2046,   5110,    0,    0
6 Bucket:       48,     0,     47,  2028,    909,    0,    0
8 Bucket:       64,     0,    343,  2199,   4279,   19,    0
12 Bucket:      96,     0,     38,   987,    306,    0,    0
16 Bucket:     128,     0,     98,  1669,   4678,    1,    0
32 Bucket:     256,     0,    256,  1064,   4697,    8,    0
64 Bucket:     512,     0,    237,   227,   2380, 2292,    0
128 Bucket:   1024,     0,    210,   254,   3041,    1,    0
256 Bucket:   2048,     0,    281,    51,   1118,   19,    0
vmem:         1856,     0,      3,     1,      3,    0,    0
vmem btag:      56,     0,  10042,  1389,  10042,   81,    0
VM OBJECT:     256,     0,   5357,   868, 242809,    0,    0
(These were the only ones with FAIL)
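For watching whether those Idrop counters grow in real time, something like this should work (one sample line per second):

```sh
# Print input/output packets, errors and drops for ixl0 once per second
netstat -I ixl0 -d -w 1
```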
edit:
$ ping 8.8.8.8
--- 8.8.8.8 ping statistics ---
501 packets transmitted, 500 received, 0.199601% packet loss, time 500910ms
rtt min/avg/max/mdev = 14.737/144.937/1864.901/327.166 ms, pipe 2
Getting a lot of latency spikes.
-
Hmm, quite a few input drops on ixl0. Do you see dropped traffic on other interfaces too?
Usually that's because the NIC is not being serviced fast enough but that seems unlikely with that CPU.
Steve
-
@stephenw10
I've checked again just now and both ixl0 and ixl1 (both my WAN ports) have some Idrop packets, but they haven't increased since I restarted the system about 1h30m ago. It also seems like they start with these values right after boot, so I would guess it's some delay between the interfaces receiving packets and the packets starting to get processed. The CPU sits at ~2% load; I don't have any CPU-intensive packages running.
-
Hmm, OK I'd be running a pcap to confirm packets are actually being lost and where if you can.
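Something like this, assuming the WAN is ixl0 and your gateway monitor is pinging 8.8.8.8:

```sh
# Capture only the monitor's ICMP on the WAN so the pcap stays small
tcpdump -ni ixl0 -w /tmp/wan-icmp.pcap 'icmp and host 8.8.8.8'
# Then open /tmp/wan-icmp.pcap in Wireshark and match echo requests to replies
```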
-
Hi @stephenw10
So today I've been capturing packets during the heavier load times.
I've analysed the captures to the best of my knowledge with Wireshark and I can see that some UDP packets are definitely being dropped. All the ones I found were exiting from the NIC to the internet. I'm not sure what else to do with this information. Today the gateway monitoring service was showing around 3-10% packet loss.
Not sure what else I can do about this. Thank you for your help so far!
-
So you can see ICMP packets leaving the NIC and no replies?
Where/how are you seeing UDP traffic dropped?
Hard to say what you can do here. I would normally be trying a different NIC at this point but that might not be an option for you.
Steve
-
Yeah, I can see multiple ICMP requests without a response. I'll try the motherboard's integrated Realtek NIC this weekend and see how that goes.
-
@stephenw10
After some more testing during the weekend, it seems the reason I was having packet loss was WAN load balancing.
After trying it with the motherboard's integrated Realtek NIC and also getting packet loss, I tried disabling WAN load balancing and the packet loss went away. Now I don't know if this was a pfSense issue or just my ISP's modem being shitty, since both connections come from the same modem to get greater than 1Gbps speed (each WAN has a different public IP). Anyway, it seems to be fixed for now as long as I don't use WAN load balancing. I'll check with my ISP to see if they have any information on their end.
Thanks for your help!
-
Hmm, two WANs on the same modem? How is that presented at the modem? Interesting.
Yeah, I would guess there has to be some conflict going on there... hard to say exactly what though.
-
It's GPON, and the modem has multiple ports; you can get up to 2 different public IPs. It's a way to get over 1Gbps, since the modem's ports are only GbE but GPON allows for more than that, and my ISP doesn't seem to care.
-
Interesting. You get different public IPs in different subnets?
You can't load-balance like that between WANs that use the same gateway for example.
Steve