High interrupts on WAN/LAN interfaces?
-
@viragomann Yeah, it's a good idea, but I've also set Disable Gateway Monitoring Action on the main WAN gateway, so I wouldn't expect it to have any effect...
-
Interestingly I added a USB ethernet adapter and I'm not seeing any loss...
So yeah, I'm almost positive it's hardware related, probably with the i226-v NICs...
-
dpinger restarting like that is almost certainly a symptom of the packet loss alarms not the cause.
@rmeskill said in High interrupts on WAN/LAN interfaces?:
I've got a few system tunables set (dev.cpu.0.cx_lowest = C3, dev.hwpstate_intel.0.epp = 90
I'd be suspicious of those. Try running the CPU at a higher target speed. Do you have those settings for all CPU cores?
Some similar boxes like that have some terrible default BIOS values to reduce CPU heat that cause them to run at reduced speeds.
Do you have eee enabled for the i226 NICs?
When you are seeing those interrupt rates what traffic is passing?
-
I went to try and turn eee on for the NICs and it completely hosed my system; I had to go back and turn it off (which appears to be the default) via a direct console connection to get things working again. I think the interrupt rates themselves are actually irrelevant; rather, it's the packet loss that's the symptom and the issue. I have absolutely no idea why or what is correlated to the loss. I'm currently trying an external USB NIC to see if that has the same issues or not, set up in a gateway group configuration so it should automatically fail over if the main i226-v NIC starts seeing loss...
-
@stephenw10 I did also change the tunables (dev.cpu.0.cx_lowest = C3, dev.hwpstate_intel.0.epp = 90), moving epp to 100. But in the 5 minutes since I added the USB NIC, it seems that's also seeing loss. I can confirm I have another (GLiNet) router plugged into this modem, running a similar setup, and it works fine with no loss, so the only constant here is the Topton pfSense box...
-
Actually, at this point it just looks like the pfSense system itself is completely hosed. I've no idea why, but now even the GUI loads awfully slowly from the local network and sometimes gives me a 503 error on reboot. I'm probably just going to blow it away and start fresh.
-
Yup eee should be disabled.
The values for hwpstate_intel should be set lower for higher performance. Setting it to 100 means the most power saving / lowest performance. Try setting it to 50 or just disable that. The default value depends on what the BIOS passes through.
-
@stephenw10 As it is, I just deleted it entirely, so it shouldn't be doing any limiting at all. I couldn't find any specific features in the BIOS pointing at this, though, so I'm just running on defaults now. I fully rebuilt pfSense and am still having the same loss issues but, as I've confirmed loss on a second router as well, I'm leaning back towards an issue with the provider instead...
-
The hwpstate_intel driver is enabled by default and should select 50 by default. That should be more than sufficient but as I said some of those devices have some very odd choices set in the BIOS by default. You should try setting a high performance value there like 30 and see if it makes any difference.
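
For anyone wanting to try a value like 30 on every core rather than only core 0, a small sh sketch along these lines prints the commands to run. The function name `epp_commands` is made up for illustration; `dev.hwpstate_intel.N.epp` is the per-core sysctl discussed above, where lower values lean towards performance and 100 towards power saving. It only prints the commands so they can be checked before anything is changed.

```sh
# Print (do not run) the sysctl commands that set the EPP hint on each core.
# Usage: epp_commands <core_count> <epp>
#   epp: 0 = prefer performance, 100 = prefer power saving
epp_commands() {
    cores=$1
    epp=$2
    # Reject values outside the 0-100 range the driver accepts.
    if [ "$epp" -lt 0 ] || [ "$epp" -gt 100 ]; then
        echo "epp must be between 0 and 100" >&2
        return 1
    fi
    i=0
    while [ "$i" -lt "$cores" ]; do
        echo "sysctl dev.hwpstate_intel.$i.epp=$epp"
        i=$((i + 1))
    done
}

# Example on the firewall itself (core count taken from hw.ncpu):
#   epp_commands "$(sysctl -n hw.ncpu)" 30
```

Run it under sh rather than the default tcsh root shell. Values set at runtime like this do not survive a reboot unless they are also added as system tunables.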
-
@stephenw10 I'll give 30 a go. The irony of my situation is that my RTTs to my gateway monitor IP aren't even bad; it's just that I'm seeing enormous loss (on ICMP):
If I run a throughput test I sometimes see 200-400Mbps down, which is in line with what I'd expect, but the issue is that tunnels are dropping and apps are disconnecting.
-
Check the mac stats shown in:
sysctl dev.igc.0
or whichever NICs you are using.
-
@stephenw10 What should I be looking for?
dev.igc.0.mac_stats.tso_txd: 0
dev.igc.0.mac_stats.tx_frames_1024_1522: 1344934
dev.igc.0.mac_stats.tx_frames_512_1023: 15901
dev.igc.0.mac_stats.tx_frames_256_511: 13890
dev.igc.0.mac_stats.tx_frames_128_255: 21666
dev.igc.0.mac_stats.tx_frames_65_127: 1302018
dev.igc.0.mac_stats.tx_frames_64: 680328
dev.igc.0.mac_stats.mcast_pkts_txd: 0
dev.igc.0.mac_stats.bcast_pkts_txd: 3
dev.igc.0.mac_stats.good_pkts_txd: 3378737
dev.igc.0.mac_stats.total_pkts_txd: 3378737
dev.igc.0.mac_stats.good_octets_txd: 2039654276
dev.igc.0.mac_stats.good_octets_recvd: 4277785341
dev.igc.0.mac_stats.rx_frames_1024_1522: 2895839
dev.igc.0.mac_stats.rx_frames_512_1023: 83944
dev.igc.0.mac_stats.rx_frames_256_511: 18340
dev.igc.0.mac_stats.rx_frames_128_255: 38581
dev.igc.0.mac_stats.rx_frames_65_127: 613592
dev.igc.0.mac_stats.rx_frames_64: 1816390
dev.igc.0.mac_stats.mcast_pkts_recvd: 0
dev.igc.0.mac_stats.bcast_pkts_recvd: 1759297
dev.igc.0.mac_stats.good_pkts_recvd: 5466687
dev.igc.0.mac_stats.total_pkts_recvd: 5467091
dev.igc.0.mac_stats.xoff_txd: 0
dev.igc.0.mac_stats.xoff_recvd: 0
dev.igc.0.mac_stats.xon_txd: 0
dev.igc.0.mac_stats.xon_recvd: 0
dev.igc.0.mac_stats.alignment_errs: 0
dev.igc.0.mac_stats.crc_errs: 0
dev.igc.0.mac_stats.recv_errs: 0
dev.igc.0.mac_stats.recv_jabber: 0
dev.igc.0.mac_stats.recv_oversize: 0
dev.igc.0.mac_stats.recv_fragmented: 0
dev.igc.0.mac_stats.recv_undersize: 0
dev.igc.0.mac_stats.recv_no_buff: 0
dev.igc.0.mac_stats.missed_packets: 0
dev.igc.0.mac_stats.defer_count: 0
dev.igc.0.mac_stats.sequence_errors: 0
dev.igc.0.mac_stats.symbol_errors: 0
dev.igc.0.mac_stats.collision_count: 0
dev.igc.0.mac_stats.late_coll: 0
dev.igc.0.mac_stats.multiple_coll: 0
dev.igc.0.mac_stats.single_coll: 0
dev.igc.0.mac_stats.excess_coll: 0
igc.0 is my WAN, fwiw
-
Missed packets or errors. Check the other NICs in use. Also check the other sysctl values for errors.
Also check dev.igc.X.iflib.override_nrxqs. We had to set that to 1 on the 4200 to prevent context switching issues.
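
To make "missed packets or errors" easier to spot in a long mac_stats dump, a filter like the one below can hide every counter that is still zero. This is only a sketch: the function name `nonzero_errors` and the list of counter-name fragments are my own guesses based on the names in the output above, not an official list.

```sh
# Keep only error-type counters with a non-zero value.
# Reads "name: value" lines (sysctl output) on stdin and prints
# the matching lines unchanged.
# The name pattern below is an illustrative assumption.
nonzero_errors() {
    awk -F': ' '
        $1 ~ /(errs|errors|missed|no_buff|jabber|oversize|undersize|fragmented|coll|drop)/ &&
        $2 + 0 != 0 { print }
    '
}

# Example on the firewall (igc0 is the WAN in this thread):
#   sysctl dev.igc.0 | nonzero_errors
```

No output means none of the matched counters has incremented. Repeat it for each NIC in use, since the pattern only makes sense on per-device output.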
-
[24.03-RELEASE][admin@pfSense.home.arpa]/root: sysctl -a | grep iflib.override_nrxqs
dev.igc.3.iflib.override_nrxqs: 0
dev.igc.2.iflib.override_nrxqs: 0
dev.igc.1.iflib.override_nrxqs: 0
dev.igc.0.iflib.override_nrxqs: 0
-
Mmm, try setting those to 1. You may need to add them as loader values to /boot/loader.conf.local.
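
As a rough sketch of what those loader values could look like, the helper below prints one line per NIC in loader.conf syntax. The function name `nrxqs_tunables` is hypothetical, and the count of four igc NICs is taken from the sysctl output earlier in the thread; adjust both arguments for other hardware.

```sh
# Print loader tunable lines that set the iflib RX queue override per NIC.
# Usage: nrxqs_tunables <driver> <nic_count> <queues>
nrxqs_tunables() {
    driver=$1
    count=$2
    queues=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'dev.%s.%d.iflib.override_nrxqs="%s"\n' "$driver" "$i" "$queues"
        i=$((i + 1))
    done
}

# Review the output first, then append it and reboot, for example:
#   nrxqs_tunables igc 4 1 >> /boot/loader.conf.local
```

After the reboot, the same `sysctl -a | grep iflib.override_nrxqs` check should show 1 instead of 0 if the values were picked up.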
-
@stephenw10 yeah, no dice on that.
The only thing I will say is I tried a USB NIC and that was also seeing packet loss, so at this point I can't fathom, besides a CPU issue as a whole with the Topton box, how this could be anything but the provider's problem. Unfortunately they insist there are no connectivity issues to their modem, and the gateway I'm connecting to is upstream of that, so theoretically they should see an issue if there was one...
-
No difference at all with a much lower hwpstate_intel setting?
-
@stephenw10 I blew away my router entirely and rebuilt it from scratch, restoring some of the FW side of things but not the system tunables and other settings just so I could confirm I was starting with a clean config. Here's what I have:
And here's what we look like:
And here's what the system log looks like for gateways:
-
Hmm, and no dropped packets or errors in the sysctl output?
-
[24.03-RELEASE][admin@pfSense.home.arpa]/root: sysctl -a | grep drop
kern.ipc.tls.ifnet.reset_dropped: 0
vfs.cache.stats.drops: 0
net.inet.ip.intr_queue_drops: 0
net.inet.ip.intr_direct_queue_drops: 0
net.inet.icmp.drop_redirect: 0
net.inet.tcp.rexmit_drop_options: 0
net.inet.tcp.drop_synfin: 1
net.pf.default_to_drop: 0
hw.cxgbe.ofld_cong_drop: 0
hw.cxgbe.cong_drop: 0
hw.cxgbe.nm_cong_drop: 1
hw.cxgbe.drop_pkts_with_l4_errors: 0
hw.cxgbe.drop_pkts_with_l3_errors: 0
hw.cxgbe.drop_pkts_with_l2_errors: 1
hw.cxgbe.drop_ip_fragments: 0
kstat.zfs.misc.fm.erpt-dropped: 0
dev.igc.3.dropped: 0
dev.igc.3.iflib.txq3.r_drops: 0
dev.igc.3.iflib.txq2.r_drops: 0
dev.igc.3.iflib.txq1.r_drops: 0
dev.igc.3.iflib.txq0.r_drops: 0
dev.igc.2.dropped: 0
dev.igc.2.iflib.txq3.r_drops: 0
dev.igc.2.iflib.txq2.r_drops: 0
dev.igc.2.iflib.txq1.r_drops: 0
dev.igc.2.iflib.txq0.r_drops: 0
dev.igc.1.dropped: 0
dev.igc.1.iflib.txq3.r_drops: 0
dev.igc.1.iflib.txq2.r_drops: 0
dev.igc.1.iflib.txq1.r_drops: 0
dev.igc.1.iflib.txq0.r_drops: 0
dev.igc.0.dropped: 0
dev.igc.0.iflib.txq3.r_drops: 0
dev.igc.0.iflib.txq2.r_drops: 0
dev.igc.0.iflib.txq1.r_drops: 0
dev.igc.0.iflib.txq0.r_drops: 0
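
A plain `grep drop` also matches settings such as net.inet.tcp.drop_synfin and the hw.cxgbe.* knobs, which are configuration values rather than counters. One way to narrow it down, sketched here with a made-up function name `nic_drop_totals`, is to keep only the per-device `dev.*` lines and add them up per NIC:

```sh
# Sum every "drop" counter per network device.
# Reads sysctl output on stdin; prints "<nic> <total>" per device, sorted.
nic_drop_totals() {
    awk -F': ' '
        /^dev\.[a-z]+\.[0-9]+\..*drop/ {
            split($1, p, ".")            # p[2] = driver, p[3] = unit
            total[p[2] p[3]] += $2
        }
        END { for (n in total) print n, total[n] }
    ' | sort
}

# Example on the firewall:
#   sysctl -a | nic_drop_totals
```

On the output above this would report 0 for each of igc0 through igc3, which matches the reading that the NICs themselves are not counting any drops.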