Netgate Discussion Forum

High interrupts on WAN/LAN interfaces?

Hardware
  • R
    rmeskill
    last edited by Sep 16, 2024, 4:46 PM

    Hey all, I've been trying to chase down a packet loss issue with pfSense on my Topton (AliExpress) N5105 router (i226 interfaces). I get random bouts of enormous packet loss that don't affect other devices plugged into my provider's modem, so I'm thinking it has to be my pfSense/Topton router. I've disabled all hardware offloading and made sure I'm on the latest software version, but no matter what I do I still see random, sporadic runs of 60-80% packet loss. The other 20-40% of packets get through mostly fine, with no significant delay; it's just a ton of dropped packets.

    I've got a few system tunables set (dev.cpu.0.cx_lowest = C3, dev.hwpstate_intel.0.epp = 90) and had flow control turned off on the Intel cards, but have since deleted those. Here are my interrupt numbers:

    WAN:
    In/out packets: 3736652/3186398 (3.89 GiB/1.99 GiB)
    In/out packets (pass): 3736652/3186398 (3.89 GiB/1.99 GiB)
    In/out packets (block): 15919/235 (1.88 MiB/57 KiB)
    In/out errors: 0/0
    Collisions: 0
    Interrupts: 4583207 (1126/s)

    LAN:
    In/out packets: 3216598/3722933 (1.98 GiB/3.86 GiB)
    In/out packets (pass): 3216598/3722933 (1.98 GiB/3.86 GiB)
    In/out packets (block): 76029/598 (4.45 MiB/255 KiB)
    In/out errors: 0/0
    Collisions: 0
    Interrupts: 4441446 (1090/s)

    Something feels hokey here, right? What would be causing interrupt numbers like this and might it be the source of my sporadic high packet loss?
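
One rough way to sanity-check the counters above is to divide the interrupt total by the packet total for the same interface. The sketch below does that with the figures from this post. It assumes the interrupt and packet counters cover the same period, and `interrupts_per_packet` is just a helper name made up for illustration, not a pfSense tool.

```shell
# interrupts_per_packet INTERRUPTS PACKETS_IN PACKETS_OUT
# Hypothetical helper: prints interrupts divided by total packets,
# rounded to two decimals, or "n/a" if no packets were counted.
interrupts_per_packet() {
    awk -v irq="$1" -v pin="$2" -v pout="$3" 'BEGIN {
        total = pin + pout
        if (total == 0) { print "n/a"; exit }
        printf "%.2f\n", irq / total
    }'
}

interrupts_per_packet 4583207 3736652 3186398   # WAN figures from the post
interrupts_per_packet 4441446 3216598 3722933   # LAN figures from the post
```

With these numbers the result is roughly 0.66 for WAN and 0.64 for LAN, i.e. fewer interrupts than packets, so on their own the rates may not be as alarming as they look.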

    • R
      rmeskill @rmeskill
      last edited by Sep 16, 2024, 4:59 PM

      The loss I'm seeing is to my (ISP) gateway router, adjacent to me. If I plug another router in there it works fine with seemingly no loss. And this isn't even entirely consistent: it might happen for 10m or 10h but then stop for a day or a week or even multiple months. The packet loss kicked up again for seemingly no reason.

      • R
        rmeskill @rmeskill
        last edited by Sep 16, 2024, 5:33 PM

        The only thing I see in the logs that might be a bit suspicious is an awful lot of messages like these:

        Sep 16 16:57:35 dpinger 35604 exiting on signal 15
        Sep 16 16:57:35 dpinger 34661 exiting on signal 15
        Sep 16 16:57:35 dpinger 36758 exiting on signal 15

        I have 3 WireGuard tunnels running that are bound to the public interface. I wonder if these, in some way, are interfering with the WAN connection...

        • R
          rmeskill @rmeskill
          last edited by Sep 16, 2024, 5:47 PM

          One last insight here... When I make any change to the WAN interface (change speed/duplex, disable bogon networks/etc) I see 15-30s of loss-free traffic. Every time. But then it comes and goes after that. I have absolutely no idea what that means, but it's at least another data point...

          • V
            viragomann @rmeskill
            last edited by Sep 16, 2024, 6:11 PM

            @rmeskill
            Go to the gateway settings and set an alternative monitoring IP to make sure that it's not just the gateway dropping ICMP packets.

            • R
              rmeskill @viragomann
              last edited by Sep 16, 2024, 6:37 PM

              @viragomann Yeah, it's a good idea, but I've also set Disable Gateway Monitoring Action on the main WAN gateway, so I wouldn't expect it to have any effect...

              • R
                rmeskill @rmeskill
                last edited by Sep 16, 2024, 7:14 PM

                Interestingly I added a USB ethernet adapter and I'm not seeing any loss...
                So, yeah, I'm almost positive it's hardware related, probably with the i226-v NICs...

                • S
                  stephenw10 Netgate Administrator
                  last edited by Sep 16, 2024, 11:52 PM

                  dpinger restarting like that is almost certainly a symptom of the packet loss alarms, not the cause.

                  @rmeskill said in High interrupts on WAN/LAN interfaces?:

                  I've got a few system tunables set (dev.cpu.0.cx_lowest = C3, dev.hwpstate_intel.0.epp = 90

                  I'd be suspicious of those. Try running the CPU at a higher target speed. Do you have those settings for all CPU cores?

                  Some similar boxes have terrible default BIOS values, meant to reduce CPU heat, that cause them to run at reduced speeds.

                  Do you have EEE (Energy Efficient Ethernet) enabled for the i226 NICs?

                  When you are seeing those interrupt rates, what traffic is passing?

                  • R
                    rmeskill @stephenw10
                    last edited by Sep 17, 2024, 12:26 PM

                    I went to try and turn EEE on for the NICs and it completely hosed my system. I had to go back and turn it off (which appears to be the default) via a direct console connection to get things working again. I think the interrupt rates themselves are actually irrelevant; it's the packet loss that's the symptom and the issue. I have absolutely no idea why it happens or what is correlated with the loss. I'm currently trying an external USB NIC to see if that has the same issues or not, set up in a gateway group configuration so it should automatically fail over if the main i226-v NIC starts seeing loss...

                    • R
                      rmeskill @stephenw10
                      last edited by Sep 17, 2024, 12:29 PM

                      @stephenw10 I did also change the tunables I mentioned (dev.cpu.0.cx_lowest = C3, dev.hwpstate_intel.0.epp = 90), moving epp to 100. But in the 5m since I added the USB NIC it seems that's also seeing loss. I can confirm I have another (GLiNet) router plugged into this modem, running a similar setup, and it works fine with no loss, so the only constant here is the Topton pfSense box...

                      • R
                        rmeskill @rmeskill
                        last edited by Sep 17, 2024, 12:55 PM

                        Actually at this point it just looks like the pfSense system itself is completely hosed. I've no idea why, but now even the GUI loads awfully slowly from the local network and sometimes gives me a 503 error on reboot. I'm probably just going to blow it away and start fresh.

                        • S
                          stephenw10 Netgate Administrator
                          last edited by Sep 17, 2024, 3:01 PM

                          Yup, EEE should be disabled.

                          The values for hwpstate_intel should be set lower for higher performance. Setting it to 100 means the most power saving / lowest performance. Try setting it to 50 or just disable that. The default value depends on what the BIOS passes through.
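
Since the tunable named in this thread (dev.hwpstate_intel.0.epp) only refers to CPU 0, a small loop can generate the same setting for every core. A minimal sketch, assuming each core exposes its own dev.hwpstate_intel.N.epp node; `epp_commands` is a hypothetical helper that only prints the commands so they can be reviewed before anything is changed.

```shell
# epp_commands NCPU VALUE
# Hypothetical helper: prints (does not run) one sysctl command per core.
# VALUE is the EPP hint: lower favours performance, higher favours power saving.
epp_commands() {
    ncpu=$1
    value=$2
    i=0
    while [ "$i" -lt "$ncpu" ]; do
        echo "sysctl dev.hwpstate_intel.$i.epp=$value"
        i=$((i + 1))
    done
}

epp_commands 4 50   # 4 cores is an assumption; use the real core count
```

On the box itself, `epp_commands "$(sysctl -n hw.ncpu)" 50` would list the commands for every core, and piping that output to `sh` would apply them until the next reboot. For a persistent setting the values would still need to be added as system tunables.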

                          • R
                            rmeskill @stephenw10
                            last edited by Sep 17, 2024, 3:24 PM

                            @stephenw10 as it is I just deleted it entirely, so it shouldn't be doing any limiting at all. I couldn't find any specific features in the BIOS pointing at this, though, so I'm just running on defaults now. I fully rebuilt pfSense and I'm still having the same loss issues. But as I've now confirmed loss on a second router as well, I'm leaning back towards an issue with the provider instead...

                            • S
                              stephenw10 Netgate Administrator
                              last edited by Sep 17, 2024, 3:34 PM

                              The hwpstate_intel driver is enabled by default and should select 50 by default. That should be more than sufficient but as I said some of those devices have some very odd choices set in the BIOS by default. You should try setting a high performance value there like 30 and see if it makes any difference.

                              • R
                                rmeskill @stephenw10
                                last edited by Sep 17, 2024, 3:44 PM

                                @stephenw10 I'll give 30 a go. The seeming irony of my situation is that my RTTs to my gateway monitor IP aren't even bad; I'm just seeing enormous loss (on ICMP).
                                If I run a throughput test I sometimes see 200-400Mbps down, which is in line with what I'd expect, but the issue is that tunnels are dropping and apps are disconnecting.

                                • S
                                  stephenw10 Netgate Administrator
                                  last edited by Sep 17, 2024, 4:09 PM

                                  Check the MAC stats shown in: sysctl dev.igc.0 (or the equivalent for whichever NICs you are using).

                                  • R
                                    rmeskill @stephenw10
                                    last edited by rmeskill Sep 17, 2024, 4:19 PM (posted Sep 17, 2024, 4:18 PM)

                                    @stephenw10 What should I be looking for?

                                    dev.igc.0.mac_stats.tso_txd: 0
                                    dev.igc.0.mac_stats.tx_frames_1024_1522: 1344934
                                    dev.igc.0.mac_stats.tx_frames_512_1023: 15901
                                    dev.igc.0.mac_stats.tx_frames_256_511: 13890
                                    dev.igc.0.mac_stats.tx_frames_128_255: 21666
                                    dev.igc.0.mac_stats.tx_frames_65_127: 1302018
                                    dev.igc.0.mac_stats.tx_frames_64: 680328
                                    dev.igc.0.mac_stats.mcast_pkts_txd: 0
                                    dev.igc.0.mac_stats.bcast_pkts_txd: 3
                                    dev.igc.0.mac_stats.good_pkts_txd: 3378737
                                    dev.igc.0.mac_stats.total_pkts_txd: 3378737
                                    dev.igc.0.mac_stats.good_octets_txd: 2039654276
                                    dev.igc.0.mac_stats.good_octets_recvd: 4277785341
                                    dev.igc.0.mac_stats.rx_frames_1024_1522: 2895839
                                    dev.igc.0.mac_stats.rx_frames_512_1023: 83944
                                    dev.igc.0.mac_stats.rx_frames_256_511: 18340
                                    dev.igc.0.mac_stats.rx_frames_128_255: 38581
                                    dev.igc.0.mac_stats.rx_frames_65_127: 613592
                                    dev.igc.0.mac_stats.rx_frames_64: 1816390
                                    dev.igc.0.mac_stats.mcast_pkts_recvd: 0
                                    dev.igc.0.mac_stats.bcast_pkts_recvd: 1759297
                                    dev.igc.0.mac_stats.good_pkts_recvd: 5466687
                                    dev.igc.0.mac_stats.total_pkts_recvd: 5467091
                                    dev.igc.0.mac_stats.xoff_txd: 0
                                    dev.igc.0.mac_stats.xoff_recvd: 0
                                    dev.igc.0.mac_stats.xon_txd: 0
                                    dev.igc.0.mac_stats.xon_recvd: 0
                                    dev.igc.0.mac_stats.alignment_errs: 0
                                    dev.igc.0.mac_stats.crc_errs: 0
                                    dev.igc.0.mac_stats.recv_errs: 0
                                    dev.igc.0.mac_stats.recv_jabber: 0
                                    dev.igc.0.mac_stats.recv_oversize: 0
                                    dev.igc.0.mac_stats.recv_fragmented: 0
                                    dev.igc.0.mac_stats.recv_undersize: 0
                                    dev.igc.0.mac_stats.recv_no_buff: 0
                                    dev.igc.0.mac_stats.missed_packets: 0
                                    dev.igc.0.mac_stats.defer_count: 0
                                    dev.igc.0.mac_stats.sequence_errors: 0
                                    dev.igc.0.mac_stats.symbol_errors: 0
                                    dev.igc.0.mac_stats.collision_count: 0
                                    dev.igc.0.mac_stats.late_coll: 0
                                    dev.igc.0.mac_stats.multiple_coll: 0
                                    dev.igc.0.mac_stats.single_coll: 0
                                    dev.igc.0.mac_stats.excess_coll: 0

                                    igc.0 is my WAN, fwiw

                                    • S
                                      stephenw10 Netgate Administrator
                                      last edited by Sep 17, 2024, 4:24 PM

                                      Missed packets or errors. Check the other NICs in use. Also check the other sysctl values for errors.
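
A quick way to do that check without reading every line is to filter the sysctl output for error-type counters that are not zero. This is a rough sketch: `nonzero_error_counters` is a made-up helper, and its name pattern is based only on the counter names that appear in the mac_stats output in this thread, so it may miss counters that other NICs or driver versions expose.

```shell
# nonzero_error_counters
# Hypothetical helper: reads "name: value" lines (as printed by sysctl)
# on stdin and prints only error-type counters whose value is not zero.
nonzero_error_counters() {
    awk -F': ' '
        $1 ~ /(errs|errors|missed|no_buff|coll|jabber|oversize|undersize|fragmented|defer)/ && $2 + 0 != 0 { print }
    '
}

# Example use on the firewall (not run here):
#   sysctl dev.igc.0 | nonzero_error_counters
```

No output means every matching counter is zero; the same pipe can be repeated with dev.igc.1 and so on for the other NICs.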

                                      Also check dev.igc.X.iflib.override_nrxqs. We had to set that to 1 on the 4200 to prevent context switching issues.

                                      • R
                                        rmeskill @stephenw10
                                        last edited by Sep 17, 2024, 4:42 PM

                                        @stephenw10

                                        [24.03-RELEASE][admin@pfSense.home.arpa]/root: sysctl -a | grep iflib.override_nrxqs
                                        dev.igc.3.iflib.override_nrxqs: 0
                                        dev.igc.2.iflib.override_nrxqs: 0
                                        dev.igc.1.iflib.override_nrxqs: 0
                                        dev.igc.0.iflib.override_nrxqs: 0

                                        • S
                                          stephenw10 Netgate Administrator
                                          last edited by Sep 17, 2024, 5:21 PM

                                          Mmm, try setting those to 1. You may need to add them as loader values to /boot/loader.conf.local.
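
As an illustration of what those loader values could look like, the sketch below prints one line per igc NIC in loader.conf syntax. It assumes the four NICs (igc0 to igc3) seen in the sysctl output earlier in the thread; `nrxqs_loader_lines` is a hypothetical helper, not something shipped with pfSense.

```shell
# nrxqs_loader_lines COUNT
# Hypothetical helper: prints one loader tunable line per igc NIC,
# numbered 0..COUNT-1, each setting override_nrxqs to 1.
nrxqs_loader_lines() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'dev.igc.%d.iflib.override_nrxqs="1"\n' "$i"
        i=$((i + 1))
    done
}

nrxqs_loader_lines 4   # 4 NICs assumed, matching igc0 to igc3 above
```

It is worth reviewing the output first. Appending it with `nrxqs_loader_lines 4 >> /boot/loader.conf.local` and rebooting would be one way to apply it, since loader values are only read at boot.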

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.