Netgate Discussion Forum

    Interface errors, missed packets and rec overruns

    Scheduled Pinned Locked Moved Hardware
    12 Posts 2 Posters 665 Views
    • G
      GeorgePatches
      last edited by GeorgePatches

      Oh, I forgot possibly the most interesting tidbit: my brother is seeing the same behavior, and his system has a Ryzen 3 2200G, so there's CPU to spare on that machine, with an old quad-port Intel 82571 NIC. It only seems to happen on the WAN side. I had the same experience when my WAN was on a separate NIC. (I currently have all my traffic VLANed onto a single physical NIC.)

      Brother's sysctl dev.em.0 output

      dev.em.0.interrupts.rx_overrun: 0
      dev.em.0.interrupts.rx_desc_min_thresh: 0
      dev.em.0.interrupts.tx_queue_min_thresh: 24
      dev.em.0.interrupts.tx_queue_empty: 0
      dev.em.0.interrupts.tx_abs_timer: 9824
      dev.em.0.interrupts.tx_pkt_timer: 7967
      dev.em.0.interrupts.rx_abs_timer: 0
      dev.em.0.interrupts.rx_pkt_timer: 80527
      dev.em.0.interrupts.asserts: 484449432
      dev.em.0.mac_stats.tso_ctx_fail: 0
      dev.em.0.mac_stats.tso_txd: 0
      dev.em.0.mac_stats.tx_frames_1024_1522: 83478008
      dev.em.0.mac_stats.tx_frames_512_1023: 619494980
      dev.em.0.mac_stats.tx_frames_256_511: 4447683
      dev.em.0.mac_stats.tx_frames_128_255: 26350772
      dev.em.0.mac_stats.tx_frames_65_127: 90891554
      dev.em.0.mac_stats.tx_frames_64: 42590904
      dev.em.0.mac_stats.mcast_pkts_txd: 1192
      dev.em.0.mac_stats.bcast_pkts_txd: 3346
      dev.em.0.mac_stats.good_pkts_txd: 867253901
      dev.em.0.mac_stats.total_pkts_txd: 867253901
      dev.em.0.mac_stats.good_octets_txd: 564113139828
      dev.em.0.mac_stats.good_octets_recvd: 600386896379
      dev.em.0.mac_stats.rx_frames_1024_1522: 395679755
      dev.em.0.mac_stats.rx_frames_512_1023: 17668892
      dev.em.0.mac_stats.rx_frames_256_511: 7229265
      dev.em.0.mac_stats.rx_frames_128_255: 158134130
      dev.em.0.mac_stats.rx_frames_65_127: 48923871
      dev.em.0.mac_stats.rx_frames_64: 0
      dev.em.0.mac_stats.mcast_pkts_recvd: 6626
      dev.em.0.mac_stats.bcast_pkts_recvd: 0
      dev.em.0.mac_stats.good_pkts_recvd: 627635913
      dev.em.0.mac_stats.total_pkts_recvd: 629974672
      dev.em.0.mac_stats.xoff_txd: 0
      dev.em.0.mac_stats.xoff_recvd: 1165746
      dev.em.0.mac_stats.xon_txd: 0
      dev.em.0.mac_stats.xon_recvd: 1165746
      dev.em.0.mac_stats.coll_ext_errs: 0
      dev.em.0.mac_stats.alignment_errs: 0
      dev.em.0.mac_stats.crc_errs: 0
      dev.em.0.mac_stats.recv_errs: 0
      dev.em.0.mac_stats.recv_jabber: 0
      dev.em.0.mac_stats.recv_oversize: 0
      dev.em.0.mac_stats.recv_fragmented: 0
      dev.em.0.mac_stats.recv_undersize: 0
      dev.em.0.mac_stats.recv_no_buff: 0
      dev.em.0.mac_stats.missed_packets: 7310
      dev.em.0.mac_stats.defer_count: 1161488
      dev.em.0.mac_stats.sequence_errors: 0
      dev.em.0.mac_stats.symbol_errors: 0
      dev.em.0.mac_stats.collision_count: 0
      dev.em.0.mac_stats.late_coll: 0
      dev.em.0.mac_stats.multiple_coll: 0
      dev.em.0.mac_stats.single_coll: 0
      dev.em.0.mac_stats.excess_coll: 0
      dev.em.0.queue_rx_0.rx_irq: 0
      dev.em.0.queue_rx_0.rxd_tail: 518
      dev.em.0.queue_rx_0.rxd_head: 519
      dev.em.0.queue_tx_0.tx_irq: 0
      dev.em.0.queue_tx_0.txd_tail: 164
      dev.em.0.queue_tx_0.txd_head: 164
      dev.em.0.fc_low_water: 29220
      dev.em.0.fc_high_water: 30720
      dev.em.0.rx_control: 67403778
      dev.em.0.device_control: 1209795137
      dev.em.0.watchdog_timeouts: 0
      dev.em.0.rx_overruns: 111
      dev.em.0.link_irq: 0
      dev.em.0.dropped: 0
      dev.em.0.eee_control: 1
      dev.em.0.itr: 488
      dev.em.0.tx_abs_int_delay: 66
      dev.em.0.rx_abs_int_delay: 66
      dev.em.0.tx_int_delay: 66
      dev.em.0.rx_int_delay: 0
      dev.em.0.rs_dump: 0
      dev.em.0.reg_dump: General Registers
      	CTRL	 481c0241
      	STATUS	 00080387
      	CTRL_EXT	 101400c0
      
      Interrupt Registers
      	ICR	 00000000
      
      RX Registers
      	RCTL	 04048002
      	RDLEN	 00004000
      	RDH	 00000207
      	RDT	 00000206
      	RXDCTL	 00010000
      	RDBAL	 b0034000
      	RDBAH	 00000000
      
      TX Registers
      	TCTL	 3103f0fa
      	TDBAL	 b002c000
      	TDBAH	 00000000
      	TDLEN	 00004000
      	TDH	 000000aa
      	TDT	 000000aa
      	TXDCTL	 0341011f
      	TDFH	 00001296
      	TDFT	 00001298
      	TDFHS	 00001296
      	TDFPC	 00000000
      
      
      dev.em.0.fc: 3
      dev.em.0.debug: -1
      dev.em.0.fw_version: EEPROM V5.12-2
      dev.em.0.nvm: -1
      dev.em.0.iflib.rxq0.rxq_fl0.buf_size: 2048
      dev.em.0.iflib.rxq0.rxq_fl0.credits: 1023
      dev.em.0.iflib.rxq0.rxq_fl0.cidx: 519
      dev.em.0.iflib.rxq0.rxq_fl0.pidx: 518
      dev.em.0.iflib.rxq0.cpu: 3
      dev.em.0.iflib.txq0.r_abdications: 324
      dev.em.0.iflib.txq0.r_restarts: 56590
      dev.em.0.iflib.txq0.r_stalls: 56590
      dev.em.0.iflib.txq0.r_starts: 858032115
      dev.em.0.iflib.txq0.r_drops: 1072
      dev.em.0.iflib.txq0.r_enqueues: 868117756
      dev.em.0.iflib.txq0.ring_state: pidx_head: 1279 pidx_tail: 1279 cidx: 1279 state: IDLE
      dev.em.0.iflib.txq0.txq_cleaned: 1474843782
      dev.em.0.iflib.txq0.txq_processed: 1474843822
      dev.em.0.iflib.txq0.txq_in_use: 44
      dev.em.0.iflib.txq0.txq_cidx_processed: 174
      dev.em.0.iflib.txq0.txq_cidx: 134
      dev.em.0.iflib.txq0.txq_pidx: 178
      dev.em.0.iflib.txq0.no_tx_dma_setup: 0
      dev.em.0.iflib.txq0.txd_encap_efbig: 0
      dev.em.0.iflib.txq0.tx_map_failed: 0
      dev.em.0.iflib.txq0.no_desc_avail: 0
      dev.em.0.iflib.txq0.mbuf_defrag_failed: 0
      dev.em.0.iflib.txq0.m_pullups: 602346602
      dev.em.0.iflib.txq0.mbuf_defrag: 0
      dev.em.0.iflib.txq0.cpu: 2
      dev.em.0.iflib.override_nrxds: 0
      dev.em.0.iflib.override_ntxds: 0
      dev.em.0.iflib.use_logical_cores: 0
      dev.em.0.iflib.separate_txrx: 0
      dev.em.0.iflib.core_offset: 0
      dev.em.0.iflib.tx_abdicate: 0
      dev.em.0.iflib.rx_budget: 0
      dev.em.0.iflib.disable_msix: 1
      dev.em.0.iflib.override_qs_enable: 0
      dev.em.0.iflib.override_nrxqs: 0
      dev.em.0.iflib.override_ntxqs: 0
      dev.em.0.iflib.driver_version: 7.7.8-fbsd
      dev.em.0.%parent: pci3
      dev.em.0.%pnpinfo: vendor=0x8086 device=0x10bc subvendor=0x103c subdevice=0x704b class=0x020000
      dev.em.0.%location: slot=0 function=0 dbsf=pci0:18:0:0
      dev.em.0.%driver: em
      dev.em.0.%desc: Intel(R) PRO/1000 PT 82571EB/82571GB (Quad Copper)
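
To put the error counters in the dump above in proportion, here is a minimal Python sketch (not from the thread) that parses `sysctl`-style `name: value` lines and computes the share of received packets that were missed. The helper name `parse_sysctl` and the `SAMPLE` string are illustrative; the values are copied from the dump, except `recv_no_buff`, which matches the zero shown there.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # not a 'name: value' line (e.g. register dump rows)
        try:
            stats[name] = int(value)
        except ValueError:
            pass  # skip non-numeric entries such as %desc or fw_version
    return stats


# A few lines copied from the dump above.
SAMPLE = """\
dev.em.0.mac_stats.total_pkts_recvd: 629974672
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.missed_packets: 7310
dev.em.0.rx_overruns: 111
"""

stats = parse_sysctl(SAMPLE)
missed = stats["dev.em.0.mac_stats.missed_packets"]
total = stats["dev.em.0.mac_stats.total_pkts_recvd"]
miss_ratio = missed / total
print(f"missed {missed} of {total} received packets ({miss_ratio:.6%})")
```

On these numbers the miss rate works out to roughly one packet in 86,000, so the counters are worth watching as a trend rather than as an absolute value.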
      
      
      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Looks like slightly different cause though. recv_no_buff vs defer_count

        The first thing I'd try is reassigning the NICs to use a different one as WAN and see if the issue follows it.

        Next I'd try a different flow-control setting and check the current negotiated value. That should prevent the other side overloading the receive buffers if both ends support it.

        Steve

        • G
          GeorgePatches @stephenw10
          last edited by

          @stephenw10 said in Interface errors, missed packets and rec overruns:

          Looks like slightly different cause though. recv_no_buff vs defer_count

          I noticed that too, but I'm not sure what "defer" means here. I mean I'm also not sure what "recv_no_buff" means, but my educated guess is it received a packet but had no buffer space available to place it in.

          @stephenw10 said in Interface errors, missed packets and rec overruns:

          The first thing I'd try is reassigning the NICs to use a different one as WAN and see if the issue follows it.

          I tried that already; the 82583V NICs all do the same thing.

          @stephenw10 said in Interface errors, missed packets and rec overruns:

          Next I'd try a different flow-control setting and check the current negotiated value.

          So it's currently set to 3 for flow control. How do I check the negotiated value?

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            AFAIK recv_no_buff implies there are no available receive buffers. So potentially you could increase the buffers.

            I will say that I see some no_buff failures on a box here and don't see any connection issues:

            [2.7.2-RELEASE][admin@xtm5.stevew.lan]/root: sysctl dev.em.0 | grep buf
            dev.em.0.mac_stats.recv_no_buff: 36
            dev.em.0.iflib.rxq1.rxq_fl0.buf_size: 2048
            dev.em.0.iflib.rxq0.rxq_fl0.buf_size: 2048
            dev.em.0.iflib.txq1.mbuf_defrag_failed: 0
            dev.em.0.iflib.txq1.mbuf_defrag: 0
            dev.em.0.iflib.txq0.mbuf_defrag_failed: 0
            dev.em.0.iflib.txq0.mbuf_defrag: 0
            

            Good question about seeing how it's linked though! em doesn't appear to report that. I'd try setting fc to 0 and see if that changes anything.
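
One possible way to see what the MAC is actually doing, sketched in Python: the `dev.em.0.device_control` value in the earlier dump is the CTRL register, and in the e1000-family register layout bits 27 (RFCE) and 28 (TFCE) enable receive and transmit flow control. The bit positions and the meaning of the `fc` values (0 = none, 1 = rx pause, 2 = tx pause, 3 = full) are assumptions taken from the common e1000 definitions, not something confirmed in this thread, and the function name is illustrative.

```python
# Bit positions assumed from the common e1000-family CTRL register layout.
CTRL_RFCE = 1 << 27  # receive flow control enable (honor incoming PAUSE frames)
CTRL_TFCE = 1 << 28  # transmit flow control enable (send PAUSE frames)

# Assumed meaning of the dev.em.N.fc sysctl values.
FC_MODES = {0: "none", 1: "rx pause", 2: "tx pause", 3: "full"}


def active_flow_control(device_control):
    """Return the flow-control mode implied by the RFCE/TFCE bits of CTRL."""
    rx = bool(device_control & CTRL_RFCE)
    tx = bool(device_control & CTRL_TFCE)
    return FC_MODES[(1 if rx else 0) | (2 if tx else 0)]


requested = FC_MODES[3]                      # dev.em.0.fc: 3
in_effect = active_flow_control(1209795137)  # dev.em.0.device_control (0x481c0241)
print(f"requested: {requested}, in effect: {in_effect}")
```

For the brother's dump this suggests the NIC honors incoming PAUSE frames but does not send its own, which would be consistent with `xoff_recvd` being non-zero while `xoff_txd` is 0. The same decode could be applied to the `device_control` value from any other `em` interface.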

            • G
              GeorgePatches
              last edited by GeorgePatches

              OK, I found some interesting things. It turns out old Intel datasheets have really thorough descriptions of what all these counters mean. These are from the Intel 82583V datasheet:

              Defer Count: This register counts defer events. A defer event occurs when the transmitter cannot immediately send a packet due to the medium being busy either because:
              • Another device is transmitting
              • The IPG timer has not expired
              • Half-duplex deferral events
              • Reception of XOFF frames
              • The link is not up
              This register only increments if transmits are enabled. The behavior of this counter is slightly different in the 82583V relative to previous devices. For the 82583V, this counter does not increment for streaming transmits that are deferred due to TX IPG.

              Receive No Buffers Count: This register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). The packet is still received if there is space in the FIFO. This register only increments if receives are enabled. This register does not increment when flow control packets are received.

              Missed Packets Count: Counts the number of missed packets. Packets are missed when the receive FIFO has insufficient space to store the incoming packet. This could be caused because of too few buffers allocated, or because there is insufficient bandwidth on the IO bus. Events setting this counter cause RXO, the receiver overrun interrupt, to be set. This register does not increment if receives are not enabled. Note that these packets are also counted in the Total Packets Received register as well as in the Total Octets Received register.
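
The "head and tail pointers were equal" condition can be checked against the register dump earlier in the thread. A minimal Python sketch, assuming 16-byte receive descriptors (the usual e1000 layout, not stated in the thread) and a hypothetical helper name:

```python
DESC_SIZE = 16  # bytes per receive descriptor (assumed e1000 layout)


def rx_ring_state(rdlen, rdh, rdt):
    """Return (ring size, descriptors currently available to the NIC)."""
    size = rdlen // DESC_SIZE
    # The NIC may fill descriptors from head up to, but not including, tail.
    available = (rdt - rdh) % size
    return size, available


# RDLEN, RDH and RDT from the reg_dump above.
size, available = rx_ring_state(rdlen=0x4000, rdh=0x207, rdt=0x206)
print(f"{available} of {size} descriptors available to the NIC")
```

With those values the ring has 1024 descriptors and 1023 are available, which matches the `rxq_fl0.credits: 1023` line in the same dump. When head equals tail the result is 0, which is the situation the Receive No Buffers counter describes.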

              • G
                GeorgePatches
                last edited by

                So defers are specifically a transmit-side counter and will never show up as missed packets. recv_no_buff means the NIC received a frame but the host had no free receive buffer for it; the frame can still sit in the NIC's FIFO. If the situation persists long enough for the FIFO to fill, recv_no_buff events turn into missed_packets.

                I reconfigured things. Instead of shoving everything onto em.0 in a router-on-a-stick config, I made em.0 the WAN only and put all the LAN VLANs on em.1. The result was weird: I can literally watch recv_no_buff incrementing on em.0, while em.1, which is seeing the same number of packets, is not having any trouble.

                Then I reconfigured things again, this time putting the WAN on em.5 and the LAN VLANs on em.4. This setup has no immediate issues, but in the past it slowly accumulated errors just the same.

                My brother's stats confuse me. Defers seem like they should mostly not happen on a full-duplex link, but maybe this is older hardware that increments this when XOFF is received. His defers count is very similar to the XOFF received count. More confusing is that he's getting missed packets, but without recv_no_buff. That doesn't seem like it should be possible.
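
The counters in the brother's dump can be cross-checked with a little arithmetic. The Python sketch below assumes, based on the datasheet text quoted above plus an unconfirmed guess, that the total-packets counter includes both flow-control frames and missed packets while the good-packets counter includes neither; the variable names are illustrative.

```python
# Values copied from the brother's dev.em.0 dump.
xoff_recvd = 1165746
xon_recvd = 1165746
defer_count = 1161488
total_pkts_recvd = 629974672
good_pkts_recvd = 627635913
missed_packets = 7310

# How closely do transmit defers track received XOFF frames?
defer_per_xoff = defer_count / xoff_recvd

# Packets counted as received but not as "good".
not_good = total_pkts_recvd - good_pkts_recvd

# Assumption: flow-control frames are in the total but not the good count.
flow_control_frames = xoff_recvd + xon_recvd
unexplained = not_good - flow_control_frames

print(f"defers per XOFF received: {defer_per_xoff:.4f}")
print(f"received but not good:    {not_good}")
print(f"flow-control frames:      {flow_control_frames}")
print(f"remainder:                {unexplained} (missed_packets: {missed_packets})")
```

Under those assumptions the defers are within about 0.4% of the XOFF count, and the remainder after subtracting flow-control frames (7267) lands close to missed_packets (7310). The small gap could simply be the counters being read at slightly different moments, so this is a plausibility check rather than proof.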

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Hmm, interesting. Are those all the exact same NIC chip? Different PCIe bus maybe?

                  • G
                    GeorgePatches @stephenw10
                    last edited by

                    @stephenw10 em.0 is an 82574L, but em.1-5 are 82583Vs. According to pciconf, I think they all have a direct x1 lane to the CPU.

                    My brother is using an old HP 4 port nic with 82571 chips. I'm starting to think the older 82571 just doesn't have the recv_no_buff register.

                    • G
                      GeorgePatches @GeorgePatches
                      last edited by

                      Actually I just found a bit in the 2014 errata update for the 82571EB that might explain my brother's missed packets. It might just be old crap.

                      1. Missed RX Packets

                      Problem:
                      When the device operates with multiple-requests or Large Send enabled, there could be receive packet loss. When the Tx FIFO is full, the Tx flow may block the host DMA interface of the device. When the transmission of packets is prevented for a long time, due to capture effect or very long backoff in half-duplex, the transmit FIFO is filled and the fetch of Rx descriptors is prevented also. This will prevent the release of the packets from the Rx FIFO to the host, causing the Rx buffer to overflow and the loss of incoming packets. This is a temporary state that will be released once the transmit side is able to empty the Tx packet buffer.

                      Implication:
                      There could be some packet loss in the Rx path if the transmission of packets is prevented for a long time. Normally, if this occurs, these packets will be re-transmitted by upper-layer protocols.

                      Workaround:
                      None

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        Hmm the 82574 is extremely common. I would have expected that to work more reliably if anything. But there is a difference there so whatever is causing a problem on em0 the 82583V apparently doesn't suffer from it.

                        • G
                          GeorgePatches
                          last edited by

                          Of the 2 options I have on this box, the 82574L is supposed to be the "better" one: it has MSI-X and dual tx/rx queues, versus the 82583V's MSI and single tx/rx queue. 🤷

                          Also, I definitely had the em.5/WAN em.4/LAN setup in the past and it would miss packets over time, but this time it's all good. 🤷

                          Only I've reconfigured this so many times and it's never worked as well as it finally is. Computers man, what the hell.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.