LAN Interface "In" errors



  • Hello,

    I have spent the past week trying to resolve an errors issue that I am seeing only on the LAN interface. Before any modifications, the error count on the LAN interface would increment by thousands. By default, "Hardware TCP Segmentation Offloading" and "Hardware Large Receive Offloading" are checked under "System > Advanced > Networking", which means they are disabled. I rebooted the system and the errors persisted. Then I found this doc https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Broadcom_bce.284.29_Cards and added the entries from it that relate to Broadcom cards. The errors now increment more slowly, but they still occur.

    As of right now I am stuck. What are these errors? What is considered an error?

    In tcpdump, I noticed many packets with a bad checksum, mostly from the WiFi AP.

    My network is set up like so:

    Cable Modem --> bge0 (WAN) pfSense running on a Dell PowerEdge R200 (LAN) bge1 --> Enterasys C2G124-48P switch (8 ports enabled, 40 ports disabled) --> WiFi AP Asus RT-AC66U

    
    LAN Interface:
    
    Status
        up
    MAC Address
        00:21:9b:fc:96:1b
    IPv4 Address
        192.168.50.1
    Subnet mask IPv4
        255.255.255.0
    IPv6 Link Local
        fe80::221:9bff:fefc:961b%bge1
    MTU
        1500
    Media
        1000baseT <full-duplex>
    In/out packets
        8233132/17058096 (1.75 GiB/19.30 GiB)
    In/out packets (pass)
        8233132/17058096 (1.75 GiB/19.30 GiB)
    In/out packets (block)
        7943/0 (565 KiB/0 B)
    In/out errors
        4594/0
    Collisions
        0
    
    
    System Information:
    
    System 	pfSense
    
    BIOS 	Vendor: Dell Inc.
    Version: 1.4.3
    Release Date: Fri May 15 2009
    Version 	2.4.3-RELEASE (amd64)
    built on Mon Mar 26 18:02:04 CDT 2018
    FreeBSD 11.1-RELEASE-p7
    
    The system is on the latest version.
    Version information updated at Sat Apr 14 12:11:02 CDT 2018  
    CPU Type 	Intel(R) Xeon(R) CPU X3360 @ 2.83GHz
    Current: 2000 MHz, Max: 2834 MHz
    4 CPUs: 1 package(s) x 4 core(s)
    AES-NI CPU Crypto: No
    Kernel PTI 	Enabled
    Uptime 	1 Day 13 Hours 40 Minutes 48 Seconds
    Current date/time 	
    Sat Apr 14 12:36:13 CDT 2018
    DNS server(s) 	
    
        127.0.0.1
        8.8.8.8
        8.8.4.4
    
    Last config change 	Sat Apr 14 11:49:26 CDT 2018
    State table size 	
    0% (193/814000) Show states
    MBUF Usage 	
    3% (3296/131072)
    Load average 	
    0.20, 0.23, 0.17
    CPU usage 	
    0%
    Memory usage 	
    3% of 8146 MiB
    SWAP usage 	
    0% of 3979 MiB
    Disk usage:
         / 	
    1% of 222GiB - ufs
         /var/run 	
    4% of 3.4MiB - ufs in RAM
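
To put the In/out errors figure from the LAN interface status into proportion, here is a minimal Python sketch that expresses it as a fraction of inbound packets. It assumes the "In" packet counter is the right denominator, which the status page does not confirm, and `error_rate` is a hypothetical helper name.

```python
# Counters copied from the "LAN Interface" status output in this post.
in_packets = 8_233_132
in_errors = 4_594


def error_rate(errors: int, packets: int) -> float:
    """Return errors as a fraction of packets (0.0 when no packets were seen)."""
    return errors / packets if packets else 0.0


rate = error_rate(in_errors, in_packets)
# Print the rate as a percentage and as a rough "1 in N" figure.
print(f"{rate:.4%} of inbound packets, about 1 in {round(1 / rate)}")
```

A rate this small may or may not matter in practice; the trend over time is probably more useful than the absolute count.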
    
    


  • Try changing the Ethernet cable between your firewall and the switch. If nothing changes, try a different port on the switch.



  • I think I may have figured something out here. Looking at my AP (Asus RT-AC66U), the transmit errors on its interfaces match the LAN "In" errors on pfSense. Keep in mind that the AP and pfSense are not directly connected. I now think that pfSense is not at fault and is simply subject to garbage in, garbage out. To test this theory further, I wired a computer directly to my Enterasys switch, and the error count on pfSense did not increment.

    Below is the output of ifconfig from my AP:

    
    br0       Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:68  
              inet addr:192.168.50.3  Bcast:192.168.50.255  Mask:255.255.255.0
              UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
              RX packets:1475234 errors:0 dropped:0 overruns:0 frame:0
              TX packets:786186 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0 
              RX bytes:270674350 (258.1 MiB)  TX bytes:143958571 (137.2 MiB)
    
    eth0      Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:68  
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:45066001 errors:0 dropped:0 overruns:0 frame:0
              TX packets:15976706 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000 
              RX bytes:937025919 (893.6 MiB)  TX bytes:4052578657 (3.7 GiB)
              Interrupt:179 Base address:0x4000 
    
    eth1      Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:68  
              UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
              RX packets:100915 errors:0 dropped:0 overruns:0 frame:891588
              TX packets:452609 errors:3871 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000 
              RX bytes:17667996 (16.8 MiB)  TX bytes:179308043 (171.0 MiB)
              Interrupt:163 
    
    eth2      Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:6C  
              UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
              RX packets:16604878 errors:0 dropped:0 overruns:0 frame:6313378
              TX packets:42886378 errors:33205 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000 
              RX bytes:3880400783 (3.6 GiB)  TX bytes:957843658 (913.4 MiB)
              Interrupt:169 
    
    lo        Link encap:Local Loopback  
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING MULTICAST  MTU:16436  Metric:1
              RX packets:266157 errors:0 dropped:0 overruns:0 frame:0
              TX packets:266157 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0 
              RX bytes:62906197 (59.9 MiB)  TX bytes:62906197 (59.9 MiB)
    
    vlan1     Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:68  
              UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
              RX packets:45065999 errors:0 dropped:0 overruns:0 frame:0
              TX packets:15976706 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0 
              RX bytes:56569758523 (52.6 GiB)  TX bytes:3813453606 (3.5 GiB)
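
For anyone wanting to repeat this comparison, the following is a rough Python sketch that pulls the TX error count per interface out of Linux `ifconfig` output in the format shown above. The function name `tx_errors` is hypothetical, and the sample text is trimmed to two stanzas from this post.

```python
import re


def tx_errors(ifconfig_text: str) -> dict[str, int]:
    """Map interface name -> TX error count from net-tools style ifconfig output."""
    counts = {}
    current = None
    for line in ifconfig_text.splitlines():
        # A new stanza starts with the interface name in column 0.
        header = re.match(r"^(\S+)\s+Link encap:", line)
        if header:
            current = header.group(1)
            continue
        tx = re.search(r"TX packets:\d+\s+errors:(\d+)", line)
        if tx and current is not None:
            counts[current] = int(tx.group(1))
    return counts


# Two stanzas copied (and shortened) from the ifconfig output in this post.
sample = """\
eth1      Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:68
          TX packets:452609 errors:3871 dropped:0 overruns:0 carrier:0
eth2      Link encap:Ethernet  HWaddr 60:45:CB:B0:2C:6C
          TX packets:42886378 errors:33205 dropped:0 overruns:0 carrier:0
"""

# Keep only the interfaces that report any TX errors.
noisy = {name: n for name, n in tx_errors(sample).items() if n > 0}
print(noisy)
```

Running it against the full output would show that only eth1 and eth2 report TX errors; every other interface listed reports 0.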
    
    

  • Netgate Administrator

    Hmm, that's fun.  ;)

    I would also disable 'Hardware Checksum Offloading' if you have not done already.

    Steve



  • I used to get these errors when I enabled traffic shaping.



  • I found something else now regarding my LAN "In" errors.

    Probing sysctl dev.bge.1 shows the following:

    
    dev.bge.1.stats.tx.BroadcastPkts: 512
    dev.bge.1.stats.tx.MulticastPkts: 3593
    dev.bge.1.stats.tx.UnicastPkts: 2039687
    dev.bge.1.stats.tx.LateCollisions: 0
    dev.bge.1.stats.tx.ExcessiveCollisions: 0
    dev.bge.1.stats.tx.DeferredTransmissions: 0
    dev.bge.1.stats.tx.MultipleCollisionFrames: 0
    dev.bge.1.stats.tx.SingleCollisionFrames: 0
    dev.bge.1.stats.tx.InternalMacTransmitErrors: 0
    dev.bge.1.stats.tx.XoffSent: 0
    dev.bge.1.stats.tx.XonSent: 0
    dev.bge.1.stats.tx.Collisions: 0
    dev.bge.1.stats.tx.ifHCOutOctets: 2812725981
    dev.bge.1.stats.rx.UndersizePkts: 0
    dev.bge.1.stats.rx.Jabbers: 0
    dev.bge.1.stats.rx.FramesTooLong: 0
    dev.bge.1.stats.rx.xoffStateEntered: 0
    dev.bge.1.stats.rx.ControlFramesReceived: 0
    dev.bge.1.stats.rx.xoffPauseFramesReceived: 0
    dev.bge.1.stats.rx.xonPauseFramesReceived: 0
    dev.bge.1.stats.rx.AlignmentErrors: 0
    dev.bge.1.stats.rx.FCSErrors: 0
    dev.bge.1.stats.rx.BroadcastPkts: 795
    dev.bge.1.stats.rx.MulticastPkts: 372
    dev.bge.1.stats.rx.UnicastPkts: 1255895
    dev.bge.1.stats.rx.Fragments: 0
    dev.bge.1.stats.rx.ifHCInOctets: 440643493
    dev.bge.1.stats.RecvThresholdHit: 0
    dev.bge.1.stats.InputErrors: 0
    dev.bge.1.stats.InputDiscards: 1924
    dev.bge.1.stats.NoMoreRxBDs: 0
    dev.bge.1.stats.DmaWriteHighPriQueueFull: 0
    dev.bge.1.stats.DmaWriteQueueFull: 0
    dev.bge.1.stats.FramesDroppedDueToFilters: 0
    dev.bge.1.forced_udpcsum: 0
    dev.bge.1.msi: 1
    dev.bge.1.forced_collapse: 0
    dev.bge.1.%parent: pci3
    dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1028 subdevice=0x023c class=0x020000
    dev.bge.1.%location: slot=0 function=0 dbsf=pci0:4:0:0
    dev.bge.1.%driver: bge
    dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004201
    
    

    What are InputDiscards?

    
    dev.bge.1.stats.InputDiscards: 1924
    
    

    If I connect my laptop or computer to the bge1 interface and run speed tests, the discards do not increment. However, if I connect my RT-AC66U router (in AP mode) directly to the interface and run the same speed test, InputDiscards increments like crazy: 300-500 discards per test.

    Is there anything I can do to reduce those discard errors, or to find out why a packet is getting discarded?
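
One way to narrow this down is to snapshot the driver counters before and after a speed test and see which ones moved. The Python sketch below is only an illustration: the helper names are hypothetical, the sysctl node `dev.bge.1.stats` is taken from the output above, and the "after" value in the demo is made up.

```python
import subprocess


def parse_sysctl(text: str) -> dict[str, int]:
    """Parse 'name: value' lines, keeping only integer-valued counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and value.strip().lstrip("-").isdigit():
            counters[name.strip()] = int(value)
    return counters


def changed(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the delta for every counter that differs between two snapshots."""
    return {k: after[k] - before[k]
            for k in after if k in before and after[k] != before[k]}


def snapshot(node: str = "dev.bge.1.stats") -> dict[str, int]:
    """Run sysctl on the firewall itself and parse the result (not called here)."""
    out = subprocess.run(["sysctl", node], capture_output=True, text=True, check=True)
    return parse_sysctl(out.stdout)


# Demo with canned text; 2301 is an invented "after" value for illustration.
before = parse_sysctl("dev.bge.1.stats.InputDiscards: 1924\ndev.bge.1.stats.rx.FCSErrors: 0\n")
after = parse_sysctl("dev.bge.1.stats.InputDiscards: 2301\ndev.bge.1.stats.rx.FCSErrors: 0\n")
print(changed(before, after))
```

On the firewall, calling `snapshot()` before and after a test and passing both results to `changed()` would list only the counters that incremented. In the output posted above, InputDiscards is the only non-zero rx error counter, so it is the one to watch.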



  • Try some of the tuning mentioned here: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
    There is a section about "Packet loss with many (small) UDP packets".

    You can also try making a VMware ESXi VM and virtualizing pfSense to see if the errors go away.



  • @strangegopher:

    Try some of the tuning mentioned here: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
    There is a section about "Packet loss with many (small) UDP packets".

    You can also try making a VMware ESXi VM and virtualizing pfSense to see if the errors go away.

    I went through this wiki previously. It dramatically decreased the errors but did not eliminate them. I re-read it and decided to increase kern.ipc.nmbclusters to 1 million in /boot/loader.conf.local, since I have 8GB of RAM on this machine.

    Here are my settings from /boot/loader.conf.local (the only item changed since the last update is kern.ipc.nmbclusters, from 131072 to 1000000):

    
    net.inet.tcp.tso=0
    kern.ipc.nmbclusters="1000000"
    hw.bge.tso_enable=0
    hw.pci.enable_msix=0
    net.isr.direct_force=1
    net.isr.direct=1
    
    

    I read it again and it said that the two ISR variables should go on the System Tunables page, but they should still work here, yes?



  • Not sure if it will work. I would add it to both places just to be sure.

    I would also try disabling all the hardware offloading at the bottom of the page under "System > Advanced" on the "Networking" tab.

    Edit: One thing you can try is getting a 2 or 4 port Intel gigabit NIC, if you have expansion slots.



  • @strangegopher:

    Not sure if it will work. I would add it to both places just to be sure.

    I would also try disabling all the hardware offloading at the bottom of the page under "System > Advanced" on the "Networking" tab.

    Edit: One thing you can try is getting a 2 or 4 port Intel gigabit NIC, if you have expansion slots.

    I was thinking about doing that. I despise Broadcom devices on *nix. It brings back memories of using fw-cutter and "reverse engineering" a driver suitable for my Linux kernel at the time under CentOS 5. Even then it was "flaky at best".

    I think I might locate an Intel NIC and disable the onboard Broadcom stuff.

    It's so strange, though. The only device that has these errors is bge1 (LAN); bge0 has no issues. I swapped the interfaces so that bge0 was LAN and bge1 was WAN, and then that interface had the issues. I know I'm running a modern pfSense on an ancient dinosaur of a machine, but it's strange. Why is my LAN so "noisy", with all of it coming from the Asus RT-AC66U? If I connect via Ethernet to anything (the switch, the Asus RT-AC66U, even the pfSense box directly), there are no errors.

    What's frustrating is that there's no ethtool or any way to diagnose what packets are being discarded and at what layer. Is it the CPU doing the discarding or what?

    I have disabled all of the hardware offloading and changed a few parameters here and there, but without any tools, so to speak, I'm flying blind.



  • So I have been reading about FreeBSD 11, and there are so many tunables for networking.

    Here are my current tunables in System > Advanced > System Tunables:

    
    Changed Tunables:
    net.inet.tcp.log_debug = 1
    net.inet.tcp.tso = 0
    net.isr.direct_force = 1
    net.isr.direct = 1
    hw.pci.enable_msix = 0
    hw.pci.enable_msi = 0
    hw.bge.tso_enable = 0
    net.inet.icmp.icmplim = 1000
    net.inet.tcp.delayed_ack = 1
    net.inet.tcp.drop_synfin = 0
    net.inet.tcp.syncookies = 0
    net.inet.ip.fastforwarding = 1
    
    Unchanged Tunables:
    net.inet.ip.portrange.first = 1024
    net.inet.tcp.blackhole = 2
    net.inet.udp.blackhole = 1
    net.inet.ip.random_id = 1
    net.inet.ip.redirect = 1
    net.inet6.ip6.redirect = 1
    net.inet6.ip6.use_tempaddr = 0
    net.inet6.ip6.prefer_tempaddr = 0
    net.inet.tcp.recvspace = 65228
    net.inet.tcp.sendspace = 65228
    net.inet.udp.maxdgram = 57344
    net.link.bridge.pfil_onlyip = 0
    net.link.bridge.pfil_member = 1
    net.link.bridge.pfil_bridge = 0
    net.link.tap.user_open = 1
    net.link.vlan.mtag_pcp = 1
    kern.randompid = 347
    net.inet.ip.intr_queue_maxlen = 1000
    hw.syscons.kbd_reboot = 0
    vfs.read_max = 32
    kern.ipc.maxsockbuf = 4262144
    net.inet.ip.process_options = 0
    kern.random.harvest.mask = 351
    net.route.netisr_maxqlen = 1024
    net.inet.udp.checksum = 1
    net.inet.icmp.reply_from_interface = 1
    net.inet6.ip6.rfc6204w3 = 1
    net.enc.out.ipsec_bpf_mask = 0x0001
    net.enc.out.ipsec_filter_mask = 0x0001
    net.enc.in.ipsec_bpf_mask = 0x0002
    net.enc.in.ipsec_filter_mask = 0x0002
    net.key.preferred_oldsa = 0
    net.inet.carp.senderr_demotion_factor = 0
    net.pfsync.carp_demotion_factor = 0
    net.raw.recvspace = 65536
    net.raw.sendspace = 65536
    net.inet.raw.recvspace = 131072
    net.inet.raw.maxdgram = 131072
    kern.corefile = /root/%N.core
    
    

    I have increased my "speed"; however, these settings have done nothing for the errors. I am beginning to think I'm worrying about the errors for nothing, as I'm getting 515 Mbps down over WiFi and close to 850 Mbps via Ethernet.

    I have gigabit Internet over cable.

    What do you all think? Am I overly concerned about the errors for nothing? I'm not seeing any indications of "serious errors" in dmesg, just these messages about missing timestamps, but that's the TCP debug logging. I was hoping to find out which packets are being dropped.

    
    TCP: [192.168.50.30]:62732 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62735 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62734 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62730 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62732 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62735 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [192.168.50.30]:62734 to [192.168.50.1]:80 tcpflags 0x10<ack>; tcp_do_segment: Timestamp missing, no action
    TCP: [208.123.73.93]:443 to [72.47.40.251]:44841 tcpflags 0x12<syn,ack>; tcp_do_segment: Timestamp not expected, no action
    TCP: [208.123.73.93]:443 to [72.47.40.251]:28208 tcpflags 0x12<syn,ack>; tcp_do_segment: Timestamp not expected, no action
    TCP: [10.0.0.1]:45674 to [192.168.50.1]:22 tcpflags 0x2<syn>; syncache_add: Received duplicate SYN, resetting timer and retransmitting SYN|ACK
    TCP: [10.0.0.1]:45674 to [192.168.50.1]:22 tcpflags 0x2<syn>; syncache_add: Received duplicate SYN, resetting timer and retransmitting SYN|ACK
    

    Are there any other buffer tunables I can use to increase the buffer size?

    Below are my state table sizes and MBUF:

    
    State table size  0% (317/814000)
    MBUF Usage 	0% (3046/1000000)
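
As a quick sanity check on whether the mbuf limit was ever the constraint, the two MBUF readings posted in this thread can be compared directly. This is a small Python sketch using only those two figures; `usage_pct` is a hypothetical helper name.

```python
def usage_pct(used: int, limit: int) -> float:
    """Return mbuf clusters in use as a percentage of the configured limit."""
    return 100.0 * used / limit


# (in use, limit) pairs copied from the two dashboard readings in this thread.
readings = {
    "before (kern.ipc.nmbclusters=131072)": (3296, 131072),
    "after (kern.ipc.nmbclusters=1000000)": (3046, 1000000),
}

for label, (used, limit) in readings.items():
    print(f"{label}: {used} in use, {usage_pct(used, limit):.1f}% of limit")
```

The number of clusters in use is roughly the same in both readings and far below either limit, which suggests (but does not prove) that raising the limit was not going to affect the discards.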
    
    

    In case I haven't stated it previously, here is the CPU info:

    
    Intel(R) Xeon(R) CPU X3360 @ 2.83GHz
    Current: 2000 MHz, Max: 2834 MHz
    4 CPUs: 1 package(s) x 4 core(s)
    AES-NI CPU Crypto: No 
    
    

  • Netgate Administrator

    Can you generate traffic via the Asus without using WiFi? Via its switch ports, or from the device itself?

    WiFi is inherently prone to errors, just due to random interference, reflections, etc. If you have a WiFi interface directly in pfSense, for example, you will always see some errors on it. However, I would expect to see most of those layer 1 type errors on the Asus and not passed on to pfSense. Anything at layer 2 may be, though, if you have the AP connected at L2.

    Broadcom Ethernet was never anywhere near as bad as wifi. In fact they were second to Intel for a while IMO. If you can use an Intel NIC though you should.

    Can you try putting a switch in between pfSense and the Asus device? That would rule out some obscure incompatibility.

    Steve