802.3x flow control goes berserk, can't disable it! Help!



  • Hello

    I'm currently running pfSense 2.1.1-PRERELEASE (amd64) on an Intel NUC DC3217IYE with a cheap Netgear GS108Ev2 VLAN switch, basically in a router-on-a-stick topology, with pfSense having 4 VLANs: em0_vlan1 (LAN), em0_vlan2 (WAN), em0_vlan3 (IPTV_In), em0_vlan4 (IPTV_Clients).

    Everything was working great, right until I started running IPTV through the box; that's when all hell broke loose :O

    Basically, whenever the IPTV box starts doing its thing, some device on the network keeps sending out 802.3x pause packets (see pic), which causes frozen pictures and pings to pfSense (on LAN) above 1500 ms, even though not much traffic is flowing through it. I'm quite convinced the device sending the packets must be pfSense on the Intel NUC, because I tried other hardware, copying my pfSense setup 1:1 with dd to the other HDD, so the only differences were the hardware and the NIC driver, which on the other machine was age(4). On that hardware everything worked perfectly: no problems running a couple of 1080i streams through it, and no pause packets showed up in the capture.

    You might say that maybe the NUC just can't handle the traffic, but it has no problem maxing out my 60/60 connection WHILE running 4x 1080i streams with udpxy (multicast to HTTP).

    The funny thing is that whenever I boot up the IPTV box, no real picture ever comes on and the pings to pfSense go through the roof, but the instant I power off the IPTV box, pings to pfSense are back to <1 ms. The IPTV box never got around to sending its IGMP leave message, so the stream is still going through pfSense with no change in bandwidth usage, yet pings are back to normal. Weird? Could it be that it's actually the IPTV box sending the pause messages? But why would it do that with em(4) and not age(4)? To me, nothing in the capture suggests that something is wrong with the stream the NUC sends out.

    I did try to disable flow control in pfSense, both with hw.em.fc_setting=0 in /boot/loader.conf.local and with the sysctl dev.em.0.fc set to 0, to no avail; the driver seems to ignore it.
    I also tried building pfSense with the newest e1000 driver from FreeBSD HEAD; no difference.

    Well, I think that about sums up my problem. If anyone could help me out here, I would be deeply appreciative :)

    sysctl dev.em:

    dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.8
    dev.em.0.%driver: em
    dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GLAN
    dev.em.0.%pnpinfo: vendor=0x8086 device=0x1503 subvendor=0x8086 subdevice=0x2044 class=0x020000
    dev.em.0.%parent: pci0
    dev.em.0.nvm: -1
    dev.em.0.debug: -1
    dev.em.0.fc: 0
    dev.em.0.rx_int_delay: 0
    dev.em.0.tx_int_delay: 66
    dev.em.0.rx_abs_int_delay: 66
    dev.em.0.tx_abs_int_delay: 66
    dev.em.0.itr: 488
    dev.em.0.rx_processing_limit: 100
    dev.em.0.eee_control: 1
    dev.em.0.link_irq: 0
    dev.em.0.mbuf_alloc_fail: 0
    dev.em.0.cluster_alloc_fail: 0
    dev.em.0.dropped: 0
    dev.em.0.tx_dma_fail: 0
    dev.em.0.rx_overruns: 0
    dev.em.0.watchdog_timeouts: 0
    dev.em.0.device_control: 1477444160
    dev.em.0.rx_control: 67141650
    dev.em.0.fc_high_water: 23584
    dev.em.0.fc_low_water: 20552
    dev.em.0.queue0.txd_head: 330
    dev.em.0.queue0.txd_tail: 331
    dev.em.0.queue0.tx_irq: 0
    dev.em.0.queue0.no_desc_avail: 0
    dev.em.0.queue0.rxd_head: 670
    dev.em.0.queue0.rxd_tail: 669
    dev.em.0.queue0.rx_irq: 0
    dev.em.0.mac_stats.excess_coll: 0
    dev.em.0.mac_stats.single_coll: 0
    dev.em.0.mac_stats.multiple_coll: 0
    dev.em.0.mac_stats.late_coll: 0
    dev.em.0.mac_stats.collision_count: 0
    dev.em.0.mac_stats.symbol_errors: 0
    dev.em.0.mac_stats.sequence_errors: 0
    dev.em.0.mac_stats.defer_count: 0
    dev.em.0.mac_stats.missed_packets: 0
    dev.em.0.mac_stats.recv_no_buff: 0
    dev.em.0.mac_stats.recv_undersize: 0
    dev.em.0.mac_stats.recv_fragmented: 0
    dev.em.0.mac_stats.recv_oversize: 0
    dev.em.0.mac_stats.recv_jabber: 0
    dev.em.0.mac_stats.recv_errs: 0
    dev.em.0.mac_stats.crc_errs: 0
    dev.em.0.mac_stats.alignment_errs: 0
    dev.em.0.mac_stats.coll_ext_errs: 0
    dev.em.0.mac_stats.xon_recvd: 6882
    dev.em.0.mac_stats.xon_txd: 0
    dev.em.0.mac_stats.xoff_recvd: 6882
    dev.em.0.mac_stats.xoff_txd: 0
    dev.em.0.mac_stats.total_pkts_recvd: 650339
    dev.em.0.mac_stats.good_pkts_recvd: 636575
    dev.em.0.mac_stats.bcast_pkts_recvd: 200
    dev.em.0.mac_stats.mcast_pkts_recvd: 565265
    dev.em.0.mac_stats.rx_frames_64: 0
    dev.em.0.mac_stats.rx_frames_65_127: 0
    dev.em.0.mac_stats.rx_frames_128_255: 0
    dev.em.0.mac_stats.rx_frames_256_511: 0
    dev.em.0.mac_stats.rx_frames_512_1023: 0
    dev.em.0.mac_stats.rx_frames_1024_1522: 0
    dev.em.0.mac_stats.good_octets_recvd: 787053898
    dev.em.0.mac_stats.good_octets_txd: 962332835
    dev.em.0.mac_stats.total_pkts_txd: 720778
    dev.em.0.mac_stats.good_pkts_txd: 720778
    dev.em.0.mac_stats.bcast_pkts_txd: 8
    dev.em.0.mac_stats.mcast_pkts_txd: 524627
    dev.em.0.mac_stats.tx_frames_64: 0
    dev.em.0.mac_stats.tx_frames_65_127: 0
    dev.em.0.mac_stats.tx_frames_128_255: 0
    dev.em.0.mac_stats.tx_frames_256_511: 0
    dev.em.0.mac_stats.tx_frames_512_1023: 0
    dev.em.0.mac_stats.tx_frames_1024_1522: 0
    dev.em.0.mac_stats.tso_txd: 0
    dev.em.0.mac_stats.tso_ctx_fail: 0
    dev.em.0.interrupts.asserts: 1020338
    dev.em.0.interrupts.rx_pkt_timer: 0
    dev.em.0.interrupts.rx_abs_timer: 0
    dev.em.0.interrupts.tx_pkt_timer: 0
    dev.em.0.interrupts.tx_abs_timer: 0
    dev.em.0.interrupts.tx_queue_empty: 0
    dev.em.0.interrupts.tx_queue_min_thresh: 0
    dev.em.0.interrupts.rx_desc_min_thresh: 0
    dev.em.0.interrupts.rx_overrun: 0
    dev.em.0.wake: 0
    
    

    dmesg:

    Copyright (c) 1992-2012 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    FreeBSD is a registered trademark of The FreeBSD Foundation.
    FreeBSD 8.3-RELEASE-p14 #0: Tue Feb 11 11:44:30 CET 2014
        root@.<removed>.com:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.8 amd64
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz (1797.69-MHz K8-class CPU)
      Origin = "GenuineIntel"  Id = 0x306a9  Family = 6  Model = 3a  Stepping = 9
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x35bae3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,AVX,F16C>
      AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
      AMD Features2=0x1<LAHF>
      TSC: P-state invariant
    real memory  = 4294967296 (4096 MB)
    avail memory = 4038668288 (3851 MB)
    ACPI APIC Table: <INTEL D33217GK>
    FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
    FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 SMT threads
     cpu0 (BSP): APIC ID:  0
     cpu1 (AP): APIC ID:  1
     cpu2 (AP): APIC ID:  2
     cpu3 (AP): APIC ID:  3
    ACPI Warning: FADT (revision 5) is longer than ACPI 2.0 version, truncating length 268 to 244 (20101013/tbfadt-392)
    ioapic0 <Version 2.0> irqs 0-23 on motherboard
    wlan: mac acl policy registered
    ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
    ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
    module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff804aa0b0, 0) error 1
    ipw_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
    ipw_ibss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
    module_register_init: MOD_LOAD (ipw_ibss_fw, 0xffffffff804aa150, 0) error 1
    ipw_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
    ipw_monitor: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
    module_register_init: MOD_LOAD (ipw_monitor_fw, 0xffffffff804aa1f0, 0) error 1
    kbd1 at kbdmux0
    cryptosoft0: <software crypto> on motherboard
    padlock0: No ACE support.
    acpi0: <INTEL> on motherboard
    acpi0: [ITHREAD]
    acpi0: Power Button (fixed)
    Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
    acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
    cpu0: <ACPI CPU> on acpi0
    cpu1: <ACPI CPU> on acpi0
    cpu2: <ACPI CPU> on acpi0
    cpu3: <ACPI CPU> on acpi0
    pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
    pci0: <ACPI PCI bus> on pcib0
    vgapci0: <VGA-compatible display> port 0xf000-0xf03f mem 0xf7800000-0xf7bfffff,0xe0000000-0xefffffff irq 16 at device 2.0 on pci0
    pci0: <simple comms> at device 22.0 (no driver attached)
    em0: <Intel(R) PRO/1000 Network Connection 7.3.8> port 0xf060-0xf07f mem 0xf7c00000-0xf7c1ffff,0xf7c28000-0xf7c28fff irq 20 at device 25.0 on pci0
    em0: Using an MSI interrupt
    em0: [FILTER]
    ehci0: <EHCI (generic) USB 2.0 controller> mem 0xf7c27000-0xf7c273ff irq 16 at device 26.0 on pci0
    ehci0: [ITHREAD]
    usbus0: EHCI version 1.0
    usbus0: <EHCI (generic) USB 2.0 controller> on ehci0
    pci0: <multimedia, HDA> at device 27.0 (no driver attached)
    ehci1: <EHCI (generic) USB 2.0 controller> mem 0xf7c26000-0xf7c263ff irq 23 at device 29.0 on pci0
    ehci1: [ITHREAD]
    usbus1: EHCI version 1.0
    usbus1: <EHCI (generic) USB 2.0 controller> on ehci1
    isab0: <PCI-ISA bridge> at device 31.0 on pci0
    isa0: <ISA bus> on isab0
    atapci0: <Intel Panther Point SATA300 controller> port 0xf130-0xf137,0xf120-0xf123,0xf110-0xf117,0xf100-0xf103,0xf0f0-0xf0ff,0xf0e0-0xf0ef irq 19 at device 31.2 on pci0
    atapci0: [ITHREAD]
    ata2: <ATA channel> at channel 0 on atapci0
    ata2: [ITHREAD]
    ata3: <ATA channel> at channel 1 on atapci0
    ata3: [ITHREAD]
    pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
    atapci1: <Intel Panther Point SATA300 controller> port 0xf0d0-0xf0d7,0xf0c0-0xf0c3,0xf0b0-0xf0b7,0xf0a0-0xf0a3,0xf090-0xf09f,0xf080-0xf08f irq 19 at device 31.5 on pci0
    atapci1: [ITHREAD]
    ata4: <ATA channel> at channel 0 on atapci1
    ata4: [ITHREAD]
    ata5: <ATA channel> at channel 1 on atapci1
    ata5: [ITHREAD]
    acpi_button0: <Power Button> on acpi0
    acpi_tz0: <Thermal Zone> on acpi0
    acpi_tz1: <Thermal Zone> on acpi0
    acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
    Timecounter "HPET" frequency 14318180 Hz quality 900
    atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
    atrtc0: Warning: Couldn't map I/O.
    orm0: <ISA Option ROMs> at iomem 0xc0000-0xcefff,0xcf000-0xcffff on isa0
    atkbd: unable to set the command byte.
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x300>
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
    atkbd0: <AT Keyboard> irq 1 on atkbdc0
    kbd0 at atkbd0
    atkbd0: [GIANT-LOCKED]
    atkbd0: [ITHREAD]
    psm0: unable to set the command byte.
    ppc0: cannot reserve I/O port range
    est0: <Enhanced SpeedStep Frequency Control> on cpu0
    p4tcc0: <CPU Frequency Thermal Control> on cpu0
    est1: <Enhanced SpeedStep Frequency Control> on cpu1
    p4tcc1: <CPU Frequency Thermal Control> on cpu1
    est2: <Enhanced SpeedStep Frequency Control> on cpu2
    p4tcc2: <CPU Frequency Thermal Control> on cpu2
    est3: <Enhanced SpeedStep Frequency Control> on cpu3
    p4tcc3: <CPU Frequency Thermal Control> on cpu3
    Timecounters tick every 1.000 msec
    IPsec: Initialized Security Association Processing.
    usbus0: 480Mbps High Speed USB v2.0
    usbus1: 480Mbps High Speed USB v2.0
    ad4: 122104MB <PLEXTOR PX-128M5M 1.02> at ata2-master UDMA100 SATA 6Gb/s
    ugen0.1: <Intel> at usbus0
    uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
    ugen1.1: <Intel> at usbus1
    uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
    uhub0: 2 ports with 2 removable, self powered
    uhub1: 2 ports with 2 removable, self powered
    ugen0.2: <vendor 0x8087> at usbus0
    uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
    ugen1.2: <vendor 0x8087> at usbus1
    uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
    uhub2: 6 ports with 6 removable, self powered
    uhub3: 8 ports with 8 removable, self powered
    SMP: AP CPU #1 Launched!
    SMP: AP CPU #3 Launched!
    SMP: AP CPU #2 Launched!
    Trying to mount root from ufs:/dev/ad4s1a
    WARNING: / was not properly dismounted
    ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
                to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
    ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior.
                 Consider tuning vm.kmem_size and vm.kmem_size_max
                 in /boot/loader.conf.
    ZFS filesystem version 5
    ZFS storage pool version 28
    padlock0: No ACE support.
    aesni0: No AESNI support.
    coretemp0: <CPU On-Die Thermal Sensors> on cpu0
    coretemp1: <CPU On-Die Thermal Sensors> on cpu1
    coretemp2: <CPU On-Die Thermal Sensors> on cpu2
    coretemp3: <CPU On-Die Thermal Sensors> on cpu3
    vlan0: changing name to 'em0_vlan1'
    vlan1: changing name to 'em0_vlan2'
    vlan2: changing name to 'em0_vlan3'
    vlan3: changing name to 'em0_vlan4'
    em0: link state changed to UP
    em0_vlan1: link state changed to UP
    em0_vlan2: link state changed to UP
    em0_vlan3: link state changed to UP
    em0_vlan4: link state changed to UP
    

    [Image: capture with pause packets] Notice the identical source and destination MACs. Weird?
    [Image: crazy pings]

    Maybe one workaround would be to buy a switch that has the option of disabling flow control. That would make the pause spam have no effect, right? (Since it's link-local.)



  • So I "fixed" it. Turns out it was most likely the Netgear switch sending out the pause packets, even though it was updated to the latest firmware version and all.
    I bought a Cisco SG300-10 switch, set it up with the same basic VLAN configuration, enabled flow control on every port, aaaand what do you know? Everything works without a hitch.

    Soo, lesson learned: not EVER buying a Netgear product again!

    The SG300-10 has the added bonus of being able to handle the IGMP proxying between VLANs, so pfSense doesn't have to!
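
The mac_stats counters in the first post's sysctl dump are consistent with that conclusion: em0 reports xoff_recvd: 6882 and xoff_txd: 0, meaning the NIC received pause frames but sent none. Since 802.3x pause frames are link-local and are not forwarded by switches, whatever sent them would have to sit at the other end of that cable, which here is the switch port. Below is a minimal sketch of that check in Python, assuming the `sysctl dev.em` output has been saved as text. The function names `parse_sysctl` and `pause_direction` are made up for illustration and are not part of pfSense or FreeBSD.

```python
import re

# A few lines copied from the sysctl dump in the first post.
SAMPLE = """\
dev.em.0.fc: 0
dev.em.0.mac_stats.xon_recvd: 6882
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 6882
dev.em.0.mac_stats.xoff_txd: 0
"""


def parse_sysctl(text):
    """Collect 'name: integer' lines into a dict; other lines are skipped."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\w.%]+):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def pause_direction(stats, dev="dev.em.0"):
    """Classify XOFF (pause) traffic as seen by the NIC itself."""
    received = stats.get(dev + ".mac_stats.xoff_recvd", 0)
    sent = stats.get(dev + ".mac_stats.xoff_txd", 0)
    if received and sent:
        return "both"
    if received:
        return "received-only"   # the link partner is pausing us
    if sent:
        return "sent-only"       # this NIC is pausing the link partner
    return "none"


if __name__ == "__main__":
    print(pause_direction(parse_sysctl(SAMPLE)))
```

With the counters from the dump this reports that pause frames were only received, which points away from the NUC as the sender. It says nothing about why the switch generated them.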