Netgate Discussion Forum

Interface errors, missed packets and rec overruns

Hardware
    stephenw10 Netgate Administrator
    last edited by Jul 29, 2024, 2:53 PM

    Looks like slightly different cause though. recv_no_buff vs defer_count

    The first thing I'd try is reassigning the NICs to use a different one as WAN and see if the issue follows it.

    Next I'd try a different flow-control setting and check the current negotiated value. That should prevent the other side overloading the receive buffers if both ends support it.

    Steve

      GeorgePatches @stephenw10
      last edited by Jul 29, 2024, 4:19 PM

      @stephenw10 said in Interface errors, missed packets and rec overruns:

      Looks like slightly different cause though. recv_no_buff vs defer_count

      I noticed that too, but I'm not sure what "defer" means here. I'm also not sure what "recv_no_buff" means, but my educated guess is that the NIC received a packet and had no buffer space available to put it in.

      @stephenw10 said in Interface errors, missed packets and rec overruns:

      The first thing I'd try is reassigning the NICs to use a different one as WAN and see if the issue follows it.

      I tried that already; the 82583V NICs all do the same thing.

      @stephenw10 said in Interface errors, missed packets and rec overruns:

      Next I'd try a different flow-control setting and check the current negotiated value.

      So it's currently set to 3 for flow control. How do I check the negotiated value?

        stephenw10 Netgate Administrator
        last edited by Jul 29, 2024, 6:03 PM

        AFAIK recv_no_buff implies there are no available receive buffers, so potentially you could increase the number of buffers.

        I will say that I see some no_buff failures on a box here and don't see any connection issues:

        [2.7.2-RELEASE][admin@xtm5.stevew.lan]/root: sysctl dev.em.0 | grep buf
        dev.em.0.mac_stats.recv_no_buff: 36
        dev.em.0.iflib.rxq1.rxq_fl0.buf_size: 2048
        dev.em.0.iflib.rxq0.rxq_fl0.buf_size: 2048
        dev.em.0.iflib.txq1.mbuf_defrag_failed: 0
        dev.em.0.iflib.txq1.mbuf_defrag: 0
        dev.em.0.iflib.txq0.mbuf_defrag_failed: 0
        dev.em.0.iflib.txq0.mbuf_defrag: 0
        

        Good question about seeing how it's linked though! em doesn't appear to report that. I'd try setting fc to 0 and see if that changes anything.
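Both suggestions above can be sketched roughly as follows. This assumes the iflib-based em(4) driver shipped with pfSense 2.7.x, where `dev.em.N.fc` is the per-interface flow-control sysctl and `dev.em.N.iflib.override_nrxds` is the loader tunable for the receive descriptor count. The `fc_label` helper, the interface em0 and the value 2048 are illustrative assumptions, and the sysctl shows the configured mode, not the negotiated one.

```shell
# Translate the numeric em(4) flow-control mode into a readable label.
# fc_label is a hypothetical helper name, not part of pfSense.
fc_label() {
    case "$1" in
        0) echo "none" ;;
        1) echo "rx-pause" ;;
        2) echo "tx-pause" ;;
        3) echo "full" ;;
        *) echo "unknown" ;;
    esac
}

# On the firewall itself (FreeBSD only):
#   fc_label "$(sysctl -n dev.em.0.fc)"   # show the configured mode
#   sysctl dev.em.0.fc=0                  # disable flow control at runtime
#
# To try a larger receive ring, a line like this could go in
# /boot/loader.conf.local, followed by a reboot (2048 is an assumed value):
#   dev.em.0.iflib.override_nrxds="2048"
```

Setting the value at runtime makes it easy to revert; the loader tunable only takes effect at boot.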

          GeorgePatches
          posted Jul 29, 2024, 8:03 PM, last edited by GeorgePatches Jul 29, 2024, 8:05 PM

          OK, I found some interesting things. Turns out old Intel datasheets have really thorough descriptions of what all these counters mean. From the Intel 82583V datasheet:

          Defer Count: This register counts defer events. A defer event occurs when the transmitter cannot immediately send a packet due to the medium being busy either because:
          • Another device is transmitting
          • The IPG timer has not expired
          • Half-duplex deferral events
          • Reception of XOFF frames
          • The link is not up
          This register only increments if transmits are enabled. The behavior of this counter is slightly different in the 82583V relative to previous devices. For the 82583V, this counter does not increment for streaming transmits that are deferred due to TX IPG.

          Receive No Buffers Count: This register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). The packet is still received if there is space in the FIFO. This register only increments if receives are enabled. This register does not increment when flow control packets are received.

          Missed Packets Count: Counts the number of missed packets. Packets are missed when the receive FIFO has insufficient space to store the incoming packet. This could be caused because of too few buffers allocated, or because there is insufficient bandwidth on the IO bus. Events setting this counter cause RXO, the receiver overrun interrupt, to be set. This register does not increment if receives are not enabled. Note that these packets are also counted in the Total Packets Received register as well as in the Total Octets Received register.

            GeorgePatches
            last edited by Jul 29, 2024, 8:39 PM

            So defers are specifically a transmit-side event and will never show up as missed packets. recv_no_buff means the NIC has a frame but the host has no free receive buffer for it; the frame can still sit in the NIC's FIFO. If the situation persists long enough for the FIFO to fill, recv_no_buff turns into missed_packets.

            I reconfigured things. Instead of shoving everything into em.0 in a router-on-a-stick config, I changed em.0 to carry only the WAN and put all the LAN VLANs on em.1. This was weird, as I could literally just watch recv_no_buff incrementing on em.0, while em.1, which was seeing the same number of packets, was not having any trouble.
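One way to watch which counters are climbing is to diff two saved snapshots of the MAC statistics. This is a minimal sketch, assuming the counters live under `dev.em.N.mac_stats` as in the output earlier in the thread; `counter_deltas` and the `/tmp` file names are hypothetical.

```shell
# Print "name delta" for every counter that grew between two saved
# `sysctl dev.em.N.mac_stats` snapshots (lines look like "name: value").
counter_deltas() {
    awk -F': ' '
        NR == FNR { before[$1] = $2; next }          # first file: remember values
        ($1 in before) && ($2 + 0 > before[$1] + 0) { # second file: report growth
            print $1, $2 - before[$1]
        }
    ' "$1" "$2"
}

# On the firewall (FreeBSD only), for example:
#   sysctl dev.em.0.mac_stats > /tmp/em0.before
#   sleep 60
#   sysctl dev.em.0.mac_stats > /tmp/em0.after
#   counter_deltas /tmp/em0.before /tmp/em0.after
```

Running the same comparison against em.1 over the same interval makes it easier to see whether only one port accumulates recv_no_buff or missed_packets.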

            Then I reconfigured things again, this time putting the WAN on em.5 and the LAN VLANs on em.4. This setup has no immediate issues, but in the past it slowly accumulated errors just the same.

            My brother's stats confuse me. Defers seem like they should mostly not happen on a full-duplex link, but maybe this is older hardware that increments this when XOFF is received. His defers count is very similar to the XOFF received count. More confusing is that he's getting missed packets, but without recv_no_buff. That doesn't seem like it should be possible.

              stephenw10 Netgate Administrator
              last edited by Jul 29, 2024, 9:55 PM

              Hmm, interesting. Are those all the exact same NIC chip? Different PCIe bus maybe?

                GeorgePatches @stephenw10
                last edited by Jul 30, 2024, 12:40 PM

                @stephenw10 em.0 is an 82574L, but em.1-5 are 82583Vs. According to pciconf, I think they all have a direct x1 lane to the CPU.
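The link width can be pulled out of the pciconf capability listing with a small filter. This is a sketch only: it assumes the PCI-Express capability output of `pciconf -lc` contains a token of the form `link x1(x1)` (negotiated width, then maximum width); the exact line layout differs between FreeBSD versions, and `pcie_link_width` is a hypothetical helper name.

```shell
# Read `pciconf -lc` output on stdin and print the PCIe link width token,
# e.g. "x1(x1)" = negotiated width, with the maximum in parentheses.
pcie_link_width() {
    sed -n 's/.*link \(x[0-9]*(x[0-9]*)\).*/\1/p'
}

# On the firewall (FreeBSD only), once per port:
#   pciconf -lc em0 | pcie_link_width
#   pciconf -lc em1 | pcie_link_width
```

If every port reports the same width, a narrower PCIe link on one NIC is unlikely to be what sets em.0 apart.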

                My brother is using an old HP 4-port NIC with 82571 chips. I'm starting to think the older 82571 just doesn't have the recv_no_buff register.

                  GeorgePatches @GeorgePatches
                  last edited by Jul 30, 2024, 12:51 PM

                  Actually I just found a bit in the 2014 errata update for the 82571EB that might explain my brother's missed packets. It might just be old crap.

                  1. Missed RX Packets

                  Problem:
                  When the device operates with multiple-requests or Large Send enabled, there could be receive packet loss. When the Tx FIFO is full, the Tx flow may block the host DMA interface of the device. When the transmission of packets is prevented for a long time, due to capture effect or very long backoff in half-duplex, the transmit FIFO is filled and the fetch of Rx descriptors is prevented also. This will prevent the release of the packets from the Rx FIFO to the host, causing the Rx buffer to overflow and the loss of incoming packets. This is a temporary state that will be released once the transmit side is able to empty the Tx packet buffer.

                  Implication:
                  There could be some packet loss in the Rx path if the transmission of packets is prevented for a long time. Normally, if this occurs, these packets will be re-transmitted by upper-layer protocols.

                  Workaround:
                  None

                    stephenw10 Netgate Administrator
                    last edited by Jul 30, 2024, 1:18 PM

                    Hmm, the 82574 is extremely common; if anything, I would have expected it to work more reliably. But there is a difference between the chips, so whatever is causing the problem on em0, the 82583V apparently doesn't suffer from it.

                      GeorgePatches
                      last edited by Jul 30, 2024, 1:58 PM

                      Of the 2 options I have on this box, the 82574L is supposed to be the "better" one. It has MSI-X and dual tx/rx queues, versus the 82583V's MSI and single tx/rx queue. 🤷

                      Also, I definitely had the em.5/WAN em.4/LAN setup in the past and it would miss packets over time, but this time it's all good. 🤷

                      Only I've reconfigured this so many times and it's never worked as well as it finally is. Computers man, what the hell.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.