Intel XL710-BM1 based card issues
-
Hi @stephenw10,
After a day I'm still getting some packet loss on both WAN interfaces, but it does seem better.
Packet loss is down to around 2% at heavy load times; it used to be 4-6%.
Any other suggestions on ways to debug this?
Thanks!
-
Some more information that might be relevant.
After a reboot this is what I'm getting:
$ ifconfig ixl0
ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN
    options=e503bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 64:9d:99:b1:8c:18
    inet6 fe80::669d:99ff:feb1:8c18%ixl0 prefixlen 64 scopeid 0x1
    inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx
    media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
$ netstat -I ixl0
Name    Mtu Network       Address              Ipkts Ierrs Idrop  Opkts Oerrs  Coll
ixl0   1500 <Link#1>      64:9d:99:b1:8c:18   855346     0  3125 385867     0     0
ixl0      - fe80::%ixl0/6 fe80::669d:99ff:f       18     -     -    166     -     -
ixl0      - xx.xx.xx.xxx/ blxx-xxx-xxx.dsl.     6235     -     -      5     -     -
$ vmstat -z
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
UMA Kegs:               224,      0,     150,       3,     152,   0,   0
UMA Zones:             1448,      0,     167,       1,     169,   0,   0
UMA Slabs:               80,      0,   16522,      28,   16927,   0,   0
UMA Hash:               256,      0,       6,       9,      15,   0,   0
4 Bucket:                32,      0,     204,    2046,    5110,   0,   0
6 Bucket:                48,      0,      47,    2028,     909,   0,   0
8 Bucket:                64,      0,     343,    2199,    4279,  19,   0
12 Bucket:               96,      0,      38,     987,     306,   0,   0
16 Bucket:              128,      0,      98,    1669,    4678,   1,   0
32 Bucket:              256,      0,     256,    1064,    4697,   8,   0
64 Bucket:              512,      0,     237,     227,    2380,2292,   0
128 Bucket:            1024,      0,     210,     254,    3041,   1,   0
256 Bucket:            2048,      0,     281,      51,    1118,  19,   0
vmem:                  1856,      0,       3,       1,       3,   0,   0
vmem btag:               56,      0,   10042,    1389,   10042,  81,   0
VM OBJECT:              256,      0,    5357,     868,  242809,   0,   0
(These were the only ones with FAIL)
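Picking out the zones with a non-zero FAIL count by eye is error-prone, so here is a minimal Python sketch that does the filtering. It assumes the seven-column layout shown in the vmstat -z output above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP); other FreeBSD versions may print extra columns, in which case the field count and index would need adjusting. The function name is made up for illustration.

```python
def zones_with_failures(vmstat_text):
    """Return (zone name, FAIL count) for every zone whose FAIL column is non-zero."""
    failed = []
    for line in vmstat_text.splitlines():
        # Data rows look like "NAME: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP".
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line, no zone name
        fields = [f.strip() for f in rest.split(",")]
        # Assumption: exactly seven numeric columns, FAIL being the sixth.
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue
        fail = int(fields[5])
        if fail > 0:
            failed.append((name.strip(), fail))
    return failed


# Three rows copied from the output above, plus the header.
sample = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
UMA Kegs:               224,      0,     150,       3,     152,   0,   0
8 Bucket:                64,      0,     343,    2199,    4279,  19,   0
64 Bucket:              512,      0,     237,     227,    2380,2292,   0
"""
print(zones_with_failures(sample))
```

Feeding it the full output of vmstat -z (for example via subprocess or a saved text file) should list only the zones worth looking at.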
edit:
$ ping 8.8.8.8
--- 8.8.8.8 ping statistics ---
501 packets transmitted, 500 received, 0.199601% packet loss, time 500910ms
rtt min/avg/max/mdev = 14.737/144.937/1864.901/327.166 ms, pipe 2
Getting a lot of latency spikes.
-
Hmm, quite a few input drops on ixl0. So you see dropped traffic on other interfaces?
Usually that's because the NIC is not being serviced fast enough but that seems unlikely with that CPU.
Steve
-
@stephenw10
I've checked again right now and both ixl0 and ixl1 (both my WAN ports) show some Idrop packets, but the counters haven't increased since I restarted the system about 1h30m ago. They also seem to start at these values right after boot, so I would guess the drops come from a delay between the NICs receiving packets and the system starting to process them.
The CPU sits at ~2% load, and I don't have any CPU-intensive packages running.
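To check whether Idrop really stays flat, two netstat -I snapshots can be compared. The following is a rough Python sketch, assuming the column layout from the netstat -I ixl0 output earlier in the thread; the function names are hypothetical, and the "after" snapshot below uses made-up packet counts purely to show the comparison.

```python
def link_idrop(netstat_text):
    """Return the Idrop counter from the <Link#N> row of `netstat -I <if>` output."""
    lines = netstat_text.splitlines()
    header = lines[0].split()
    idrop_col = header.index("Idrop")
    for line in lines[1:]:
        fields = line.split()
        # Only the link-layer row carries the interface-wide drop counter.
        if len(fields) == len(header) and fields[2].startswith("<Link#"):
            return int(fields[idrop_col])
    raise ValueError("no <Link#N> row found")


def idrop_delta(before_text, after_text):
    """Number of input drops that happened between the two snapshots."""
    return link_idrop(after_text) - link_idrop(before_text)


# First snapshot: the link row posted earlier in the thread.
before = """\
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
ixl0 1500 <Link#1> 64:9d:99:b1:8c:18 855346 0 3125 385867 0 0
"""
# Second snapshot: hypothetical values, not real measurements.
after = """\
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
ixl0 1500 <Link#1> 64:9d:99:b1:8c:18 1900000 0 3125 820000 0 0
"""
print(idrop_delta(before, after))
```

A delta of zero over a busy period would support the idea that the drops only happen around boot.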
-
Hmm, OK. If you can, I'd run a pcap to confirm packets are actually being lost, and where.
-
Hi @stephenw10
So today I've been capturing packets during the heavier load times.
I've analysed the captures to the best of my knowledge with Wireshark and I can see that some UDP packets are definitely being dropped. All the ones I found were leaving the NIC towards the internet.
Today the gateway monitoring service showed around 3-10% packet loss.
I'm not sure what else I can do with this information.
Thank you for your help so far!
-
So you can see ICMP packets leaving the NIC and no replies?
Where/how are you seeing UDP traffic dropped?
Hard to say what you can do here. I would normally be trying a different NIC at this point but that might not be an option for you.
Steve
-
Yeah, I can see multiple ICMP requests without a response. I will try the motherboard's integrated Realtek NIC this weekend and see how that goes.
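For reference, the way I'm matching requests to replies boils down to pairing them on the ICMP identifier and sequence number. Here is a minimal Python sketch of that idea. It assumes the echo packets have already been exported from the capture as (kind, identifier, sequence) tuples, which is a made-up format for illustration and not something Wireshark produces directly.

```python
def unanswered_echo_requests(packets):
    """Return (ident, seq) of echo requests that never got a matching reply.

    packets: iterable of (kind, ident, seq) in capture order,
    where kind is "request" or "reply" (hypothetical export format).
    """
    pending = {}
    for index, (kind, ident, seq) in enumerate(packets):
        key = (ident, seq)
        if kind == "request":
            pending[key] = index      # remember where the request was seen
        elif kind == "reply":
            pending.pop(key, None)    # answered, so no longer pending
    # Report the leftovers in the order the requests were sent.
    return sorted(pending, key=pending.get)


# Illustrative data only: request 2 never gets a reply.
capture = [
    ("request", 1, 1),
    ("reply", 1, 1),
    ("request", 1, 2),
    ("request", 1, 3),
    ("reply", 1, 3),
]
print(unanswered_echo_requests(capture))
```

Running the same pairing on captures from both sides of the modem, if that were possible, would show where the packets actually disappear.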
-
@stephenw10
After some more testing during the weekend, it seems the packet loss was caused by WAN load balancing.
After trying the motherboard's integrated Realtek NIC and still getting packet loss, I tried disabling WAN load balancing and the packet loss went away. I don't know if this was a pfSense issue or just my ISP's modem being shitty, since both connections come from the same modem to get more than 1 Gbps (each WAN has a different public IP).
Anyway, it seems to be fixed for now as long as I don't use WAN load balancing. I'll check with my ISP to see if they have any information on their end.
Thanks for your help!
-
Hmm, two WANs on the same modem? How is that presented at the modem? Interesting.
Yeah I would guess that has to be some conflict going on there... hard to say exactly what though.
-
It's GPON and the modem has multiple ports, and you get up to 2 different public IPs. It's a way to get over 1 Gbps, since the modem's ports are only GbE but GPON allows for more than that, and my ISP doesn't seem to care about it.
-
Interesting. You get different public IPs in different subnets?
You can't load-balance like that between WANs that use the same gateway for example.
Steve
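A quick way to check the subnet question is to compare the network each WAN address falls into. This is only a rough Python sketch using the standard ipaddress module: the function name is made up, the addresses are documentation placeholders rather than the real WAN IPs, and the netmask is the 0xffffff00 (255.255.255.0) shown in the ifconfig output earlier. Two WANs landing in the same subnet is a strong hint they share a gateway, though it does not prove it.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """True if both addresses fall into the same network for the given netmask."""
    # strict=False lets us pass a host address and have it masked down to its network.
    net_a = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{netmask}", strict=False)
    return net_a == net_b


# Placeholder addresses from the documentation ranges, not real WAN IPs.
print(same_subnet("192.0.2.10", "192.0.2.77", "255.255.255.0"))
print(same_subnet("192.0.2.10", "198.51.100.5", "255.255.255.0"))
```

If the first case matches your setup, both WANs would most likely be pointing at the same gateway, which is the situation load balancing can't handle.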