Igb driver yet again



  • Hi

    I have a fresh pfSense 2.0 RC1 install from the 20110421 amd64 snapshot. The server has:
    2x realtek 8139 (rl)
    1x quad port intel pro/1000 (igb)

    pciconf -l
    hostb0@pci0:0:0:0:      class=0x060000 card=0x50001458 chip=0x29c08086 rev=0x10 hdr=0x00
    pcib1@pci0:0:1:0:       class=0x060400 card=0x50001458 chip=0x29c18086 rev=0x10 hdr=0x01
    vgapci0@pci0:0:2:0:     class=0x030000 card=0xd0001458 chip=0x29c28086 rev=0x10 hdr=0x00
    pcib5@pci0:0:28:0:      class=0x060400 card=0x50011458 chip=0x27d08086 rev=0x01 hdr=0x01
    uhci0@pci0:0:29:0:      class=0x0c0300 card=0x50041458 chip=0x27c88086 rev=0x01 hdr=0x00
    uhci1@pci0:0:29:1:      class=0x0c0300 card=0x50041458 chip=0x27c98086 rev=0x01 hdr=0x00
    uhci2@pci0:0:29:2:      class=0x0c0300 card=0x50041458 chip=0x27ca8086 rev=0x01 hdr=0x00
    uhci3@pci0:0:29:3:      class=0x0c0300 card=0x50041458 chip=0x27cb8086 rev=0x01 hdr=0x00
    ehci0@pci0:0:29:7:      class=0x0c0320 card=0x50061458 chip=0x27cc8086 rev=0x01 hdr=0x00
    pcib6@pci0:0:30:0:      class=0x060401 card=0x50001458 chip=0x244e8086 rev=0xe1 hdr=0x01
    isab0@pci0:0:31:0:      class=0x060100 card=0x50011458 chip=0x27b88086 rev=0x01 hdr=0x00
    atapci0@pci0:0:31:1:    class=0x01018a card=0xb0011458 chip=0x27df8086 rev=0x01 hdr=0x00
    atapci1@pci0:0:31:2:    class=0x01018f card=0xb0021458 chip=0x27c08086 rev=0x01 hdr=0x00
    none0@pci0:0:31:3:      class=0x0c0500 card=0x50011458 chip=0x27da8086 rev=0x01 hdr=0x00
    pcib2@pci0:1:0:0:       class=0x060400 card=0x00000000 chip=0x8018111d rev=0x0e hdr=0x01
    pcib3@pci0:2:2:0:       class=0x060400 card=0x00000000 chip=0x8018111d rev=0x0e hdr=0x01
    pcib4@pci0:2:4:0:       class=0x060400 card=0x00000000 chip=0x8018111d rev=0x0e hdr=0x01
    igb0@pci0:3:0:0:        class=0x020000 card=0x145a8086 chip=0x10d68086 rev=0x02 hdr=0x00
    igb1@pci0:3:0:1:        class=0x020000 card=0x145a8086 chip=0x10d68086 rev=0x02 hdr=0x00
    igb2@pci0:4:0:0:        class=0x020000 card=0x145a8086 chip=0x10d68086 rev=0x02 hdr=0x00
    igb3@pci0:4:0:1:        class=0x020000 card=0x145a8086 chip=0x10d68086 rev=0x02 hdr=0x00
    rl0@pci0:6:0:0:         class=0x020000 card=0x032010bd chip=0x813910ec rev=0x10 hdr=0x00
    rl1@pci0:6:1:0:         class=0x020000 card=0x032010bd chip=0x813910ec rev=0x10 hdr=0x00
    

    If I configure anything (assign an interface) on igb0 - igb3, the machine hangs for about 15s and then reboots. If the network cable is unplugged I can assign the interface, but the machine hangs as soon as I connect the cable, literally. During those 15s no messages appear on the screen and the keyboard is unresponsive. I've seen there were issues with this driver, and the proposed solution was to set the following in /boot/loader.conf or /boot/loader.conf.local:

    hw.igb.num_queues="4"
    dev.igb.0.enable_lro="0"
    dev.igb.1.enable_lro="0"
    
    

    Problem is:

    [2.0-RC1][root@pfSense.localdomain]/root(16): sysctl -a | grep enable_lro
    [2.0-RC1][root@pfSense.localdomain]/root(17): sysctl -a | grep hw.igb
    [2.0-RC1][root@pfSense.localdomain]/root(18):
    

    dmesg:

    Copyright (c) 1992-2010 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    FreeBSD is a registered trademark of The FreeBSD Foundation.
    FreeBSD 8.1-RELEASE-p3 #0: Thu Apr 21 19:44:40 EDT 2011
        root@FreeBSD_8.0_pfSense_2.0-AMD64.snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.8 amd64
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: Intel(R) Celeron(R) CPU        E3300  @ 2.50GHz (2500.02-MHz K8-class CPU)
      Origin = "GenuineIntel"  Id = 0x1067a  Family = 6  Model = 17  Stepping = 10
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x400e3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,XSAVE>
      AMD Features=0x20100800<SYSCALL,NX,LM>
      AMD Features2=0x1<LAHF>
      TSC: P-state invariant
    real memory  = 2147483648 (2048 MB)
    avail memory = 2040205312 (1945 MB)
    ACPI APIC Table: <GBT    GBTUACPI>
    FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
    FreeBSD/SMP: 1 package(s) x 2 core(s)
     cpu0 (BSP): APIC ID:  0
     cpu1 (AP): APIC ID:  1
    ioapic0: Changing APIC ID to 2
    ioapic0 <Version 2.0> irqs 0-23 on motherboard
    wlan: mac acl policy registered
    kbd1 at kbdmux0
    cryptosoft0: <software crypto> on motherboard
    padlock0: No ACE support.
    acpi0: <GBT GBTUACPI> on motherboard
    acpi0: [ITHREAD]
    acpi0: Power Button (fixed)
    acpi0: reservation of 0, a0000 (3) failed
    acpi0: reservation of 100000, 7f4e0000 (3) failed
    Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
    acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
    cpu0: <ACPI CPU> on acpi0
    cpu1: <ACPI CPU> on acpi0
    acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
    Timecounter "HPET" frequency 14318180 Hz quality 900
    acpi_button0: <Power Button> on acpi0
    pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
    pci0: <ACPI PCI bus> on pcib0
    pcib1: <PCI-PCI bridge> irq 16 at device 1.0 on pci0
    pci1: <PCI bus> on pcib1
    pcib2: <PCI-PCI bridge> at device 0.0 on pci1
    pci2: <PCI bus> on pcib2
    pcib3: <PCI-PCI bridge> at device 2.0 on pci2
    pci3: <PCI bus> on pcib3
    igb0: <Intel(R) PRO/1000 Network Connection version - 2.1.7> port 0xcf00-0xcf1f mem 0xfd7a0000-0xfd7bffff,0xfd200000-0xfd3fffff,0xfd7fc000-0xfd7fffff irq 18 at device 0.0 on pci3
    igb0: Using MSIX interrupts with 5 vectors
    igb0: [ITHREAD]
    igb0: [ITHREAD]
    igb0: [ITHREAD]
    igb0: [ITHREAD]
    igb0: [ITHREAD]
    igb1: <Intel(R) PRO/1000 Network Connection version - 2.1.7> port 0xce00-0xce1f mem 0xfd7c0000-0xfd7dffff,0xfd400000-0xfd5fffff,0xfd7f8000-0xfd7fbfff irq 19 at device 0.1 on pci3
    igb1: Using MSIX interrupts with 5 vectors
    igb1: [ITHREAD]
    igb1: [ITHREAD]
    igb1: [ITHREAD]
    igb1: [ITHREAD]
    igb1: [ITHREAD]
    pcib4: <PCI-PCI bridge> at device 4.0 on pci2
    pci4: <PCI bus> on pcib4
    igb2: <Intel(R) PRO/1000 Network Connection version - 2.1.7> port 0xbf00-0xbf1f mem 0xfd1a0000-0xfd1bffff,0xfcc00000-0xfcdfffff,0xfd1fc000-0xfd1fffff irq 16 at device 0.0 on pci4
    igb2: Using MSIX interrupts with 5 vectors
    igb2: [ITHREAD]
    igb2: [ITHREAD]
    igb2: [ITHREAD]
    igb2: [ITHREAD]
    igb2: [ITHREAD]
    igb3: <Intel(R) PRO/1000 Network Connection version - 2.1.7> port 0xbe00-0xbe1f mem 0xfd1c0000-0xfd1dffff,0xfce00000-0xfcffffff,0xfd1f8000-0xfd1fbfff irq 17 at device 0.1 on pci4
    igb3: Using MSIX interrupts with 5 vectors
    igb3: [ITHREAD]
    igb3: [ITHREAD]
    igb3: [ITHREAD]
    igb3: [ITHREAD]
    igb3: [ITHREAD]
    vgapci0: <VGA-compatible display> port 0xff00-0xff07 mem 0xfdf00000-0xfdf7ffff,0xd0000000-0xdfffffff,0xfd900000-0xfd9fffff irq 16 at device 2.0 on pci0
    agp0: <Intel G33 SVGA controller> on vgapci0
    agp0: detected 7164k stolen memory
    agp0: aperture size is 256M
    pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
    pci5: <ACPI PCI bus> on pcib5
    uhci0: <Intel 82801G (ICH7) USB controller USB-A> port 0xfe00-0xfe1f irq 23 at device 29.0 on pci0
    uhci0: [ITHREAD]
    uhci0: LegSup = 0x2f00
    usbus0: <Intel 82801G (ICH7) USB controller USB-A> on uhci0
    uhci1: <Intel 82801G (ICH7) USB controller USB-B> port 0xfd00-0xfd1f irq 19 at device 29.1 on pci0
    uhci1: [ITHREAD]
    uhci1: LegSup = 0x2f00
    usbus1: <Intel 82801G (ICH7) USB controller USB-B> on uhci1
    uhci2: <Intel 82801G (ICH7) USB controller USB-C> port 0xfc00-0xfc1f irq 18 at device 29.2 on pci0
    uhci2: [ITHREAD]
    uhci2: LegSup = 0x2f00
    usbus2: <Intel 82801G (ICH7) USB controller USB-C> on uhci2
    uhci3: <Intel 82801G (ICH7) USB controller USB-D> port 0xfb00-0xfb1f irq 16 at device 29.3 on pci0
    uhci3: [ITHREAD]
    uhci3: LegSup = 0x2f00
    usbus3: <Intel 82801G (ICH7) USB controller USB-D> on uhci3
    ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xfdfff000-0xfdfff3ff irq 23 at device 29.7 on pci0
    ehci0: [ITHREAD]
    usbus4: EHCI version 1.0
    usbus4: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
    pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
    pci6: <ACPI PCI bus> on pcib6
    rl0: <RealTek 8139 10/100BaseTX> port 0xde00-0xdeff mem 0xfdbff000-0xfdbff0ff irq 20 at device 0.0 on pci6
    miibus0: <MII bus> on rl0
    rlphy0: <RealTek internal media interface> PHY 0 on miibus0
    rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    rl0: [ITHREAD]
    rl1: <RealTek 8139 10/100BaseTX> port 0xdc00-0xdcff mem 0xfdbfe000-0xfdbfe0ff irq 19 at device 1.0 on pci6
    miibus1: <MII bus> on rl1
    rlphy1: <RealTek internal media interface> PHY 0 on miibus1
    rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    rl1: [ITHREAD]
    isab0: <PCI-ISA bridge> at device 31.0 on pci0
    isa0: <ISA bus> on isab0
    atapci0: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf800-0xf80f at device 31.1 on pci0
    ata0: <ATA channel 0> on atapci0
    ata0: [ITHREAD]
    atapci1: <Intel ICH7 SATA300 controller> port 0xf700-0xf707,0xf600-0xf603,0xf500-0xf507,0xf400-0xf403,0xf300-0xf30f irq 19 at device 31.2 on pci0
    atapci1: [ITHREAD]
    ata2: <ATA channel 0> on atapci1
    ata2: [ITHREAD]
    ata3: <ATA channel 1> on atapci1
    ata3: [ITHREAD]
    pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
    atrtc0: <AT realtime clock> port 0x70-0x73 on acpi0
    uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
    uart0: [FILTER]
    ppc0: <Parallel port> port 0x378-0x37f irq 7 on acpi0
    ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
    ppc0: [ITHREAD]
    ppbus0: <Parallel port bus> on ppc0
    plip0: <PLIP network interface> on ppbus0
    plip0: [ITHREAD]
    lpt0: <Printer> on ppbus0
    lpt0: [ITHREAD]
    lpt0: Interrupt-driven port
    ppi0: <Parallel I/O> on ppbus0
    atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
    atkbd0: <AT Keyboard> irq 1 on atkbdc0
    kbd0 at atkbd0
    atkbd0: [GIANT-LOCKED]
    atkbd0: [ITHREAD]
    psm0: <PS/2 Mouse> irq 12 on atkbdc0
    psm0: [GIANT-LOCKED]
    psm0: [ITHREAD]
    psm0: model IntelliMouse Explorer, device ID 4
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x300>
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    est0: <Enhanced SpeedStep Frequency Control> on cpu0
    est: CPU supports Enhanced Speedstep, but is not recognized.
    est: cpu_vendor GenuineIntel, msr 6164c2506000925
    device_attach: est0 attach returned 6
    p4tcc0: <CPU Frequency Thermal Control> on cpu0
    est1: <Enhanced SpeedStep Frequency Control> on cpu1
    est: CPU supports Enhanced Speedstep, but is not recognized.
    est: cpu_vendor GenuineIntel, msr 6164c2506000925
    device_attach: est1 attach returned 6
    p4tcc1: <CPU Frequency Thermal Control> on cpu1
    Timecounters tick every 1.000 msec
    IPsec: Initialized Security Association Processing.
    usbus0: 12Mbps Full Speed USB v1.0
    usbus1: 12Mbps Full Speed USB v1.0
    usbus2: 12Mbps Full Speed USB v1.0
    usbus3: 12Mbps Full Speed USB v1.0
    usbus4: 480Mbps High Speed USB v2.0
    ad0: 152626MB <Seagate ST3160215A 3.AAD> at ata0-master UDMA100 
    ugen0.1: <Intel> at usbus0
    uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
    ugen1.1: <Intel> at usbus1
    uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
    ugen2.1: <Intel> at usbus2
    uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
    ugen3.1: <Intel> at usbus3
    uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
    ugen4.1: <Intel> at usbus4
    uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
    acd0: CDROM <GCR-8523B/1.03> at ata0-slave PIO4 
    SMP: AP CPU #1 Launched!
    Root mount waiting for: usbus4 usbus3 usbus2 usbus1 usbus0
    uhub0: 2 ports with 2 removable, self powered
    uhub1: 2 ports with 2 removable, self powered
    uhub2: 2 ports with 2 removable, self powered
    uhub3: 2 ports with 2 removable, self powered
    Root mount waiting for: usbus4
    Root mount waiting for: usbus4
    Root mount waiting for: usbus4
    uhub4: 8 ports with 8 removable, self powered
    Trying to mount root from ufs:/dev/ad0s1a
    pflog0: promiscuous mode enabled
    rl1: link state changed to UP
    

    I would appreciate any pointers…



  • @msk:

    I've seen there were issues with this driver and the proposed solution was to set the following in /boot/loader.conf or /boot/loader.conf.local:

    Those values must be set in /boot/loader.conf.local because /boot/loader.conf can be overwritten on firmware upgrade.

    The settings in /boot/loader.conf.local get turned into kernel environment variables by the boot loader, and the correct way to check them is with kenv, for example: # kenv -q | grep hw.igb
    Because kernel environment variables are set by the boot loader before the kernel starts, they can be read during the initialisation of device drivers that are built into the kernel. sysctl values are not set until later in system startup, after those built-in drivers have been initialised.
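
    The check described above can be sketched as a small shell helper. This is only an illustration, not part of pfSense: the function name check_tunables and the dump file path are made up here, and it assumes that kenv lists variables one per line as name="value". It compares each assignment in a loader.conf-style file against a saved dump of the kernel environment and reports whether the boot loader actually passed it on.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with pfSense or FreeBSD).
# Usage: check_tunables CONF_FILE KENV_DUMP
#   CONF_FILE  loader.conf-style file, e.g. /boot/loader.conf.local
#   KENV_DUMP  text file with one name="value" pair per line,
#              e.g. produced on the firewall with: kenv -q > /tmp/kenv.txt
check_tunables() {
    conf_file=$1
    kenv_dump=$2
    stripped="${TMPDIR:-/tmp}/kenv_stripped.$$"

    # Drop double quotes on both sides so that quoting differences
    # between the two files do not cause false mismatches.
    sed -e 's/"//g' "$kenv_dump" > "$stripped"

    # Keep only name=value lines; comments and blank lines are skipped.
    grep -E '^[A-Za-z0-9_.]+=' "$conf_file" | sed -e 's/"//g' |
    while IFS= read -r line; do
        # -x: whole-line match, -F: fixed string (dots are not wildcards).
        if grep -qxF -- "$line" "$stripped"; then
            echo "set: $line"
        else
            echo "missing: $line"
        fi
    done

    rm -f "$stripped"
}
```

    On the firewall itself this would be run as: kenv -q > /tmp/kenv.txt, then check_tunables /boot/loader.conf.local /tmp/kenv.txt. A "missing" line would suggest the file was not read at boot (or the machine has not been rebooted since the edit), which is a different problem from the driver ignoring a value it did receive.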

    Does your system output any kind of panic message before it reboots?



  • Hmm… I should probably have mentioned that I already set those variables in /boot/loader.conf.local. I just thought they should show up in the sysctl -a output for some reason. But OK, I stand corrected.

    No. No messages of any kind are printed. The screen just freezes and then after about 15 - 20s the machine reboots. I'm planning to do more testing, but for now I can more or less safely state:

    1. assigning an igb* interface from the pfSense menu always causes a reboot, either right away if the cable is attached or, if it isn't, as soon as it is plugged in
    2. even if the interface is not configured at all, plugging a network cable into the port sometimes (not always) causes a reboot
    3. if the cable is plugged in before boot, then unplugging it also sometimes (not always) causes a reboot
    4. setting the above-mentioned vars in /boot/loader.conf.local has no effect

    My testing was by no means extensive due to lack of time: in all maybe 10 to 15 reboots, with a complete reinstall from the same snapshot (2.0 RC1 20110421 amd64) somewhere in between. I have not yet tried the uniprocessor kernel, which I've also seen recommended.

    I'm willing to do any testing necessary, but I'll be away for the next week, so I'll have access to the machine no sooner than next Friday.



  • If anyone's interested, it turned out to be a pure hardware issue. pfSense has nothing to do with the problem. I tried Vyatta just out of curiosity and the symptoms were exactly the same. Probably the cheap Gigabyte mainboard is to blame, but who knows ;/



  • Have you tried removing the Realtek NICs (I believe they're cheap PCI 10/100 units) and booting with just the Intel quad-port card on the board?



  • Yes, I have tried booting without the Realtek NICs. I disabled the onboard gigabit NIC as there seems to be no stable driver for it under FreeBSD. I also tried upgrading the BIOS because there was something about "Improving lan compatibility" in the changelog (not sure what that means) and reset the settings to default afterwards. Nothing helped. The mainboard is a Gigabyte GA-G31M-ES2L (http://gigabyte.com/products/product-page.aspx?pid=3485#ov). Obviously I wouldn't recommend it for quad-port Intel cards ;)

    I thought I could get away with using cheap hardware I had lying around (as in, worst case, buy another one when it breaks). Reality is brutal, but I sometimes wonder how cool it would be if things just worked as advertised ;)

    I've practically given up on this one and am planning to wait until I can get some different hardware for testing, especially a new mainboard. Unless of course you guys have some more ideas…

    [edit]
    Forgot to add: for my last reinstall a few days ago I chose the uniprocessor kernel, so it should not be SMP related.



  • Have you tried using a PCI VGA card and disabling the onboard VGA? The chipset wasn't meant to operate the PCIe x16 slot in conjunction with the IGP.



  • I wouldn't have thought of that, thanks. Hope I can dig up an old PCI VGA card somewhere.

