2.0.2 version crashing



  • After installing version 2.0.2 on new Intel hardware (motherboard and 4 NICs), pfSense is crashing multiple times every day.  :-\

    I ran memtest before installing; it found no RAM problems.

    Here is the crash file.
    crash.txt



  • I suggest you try either a 2.0.3 build or a 2.1 build. 2.1 builds have a much more up to date kernel and device drivers than 2.0.x builds.



  • Thanks for the reply; I will consider updating to 2.0.3.
    The thing is that the exact same hardware configuration was running without any problems on version 2.0.1.



  • After upgrading to 2.0.3 prerelease, it is still crashing at least once a day. No change.

    I have TWO boxes with identical hardware (both crashing the same way):

    MB Intel DH77EB
    CPU Celeron G460
    4x Intel NICs EXPI9301CT PRO/1000 and 1x Onboard (All in use): WAN1-DHCP / WAN2-PPPoE / DMZ / LAN1 / LAN2

    Installed Packages: OpenVPN TAP Fix, OpenVPN Export Client, Squid+SquidGuard, Sarg, Bacula-Client, Zabbix-agent

    Tunables under "System Tunables":
    kern.ipc.nmbclusters - 131072
    kern.maxfiles - 131070
    kern.maxfilesperproc - 32768
    net.inet.ip.portrange.last - 65535
    hw.igb.num_queues - 1

    It can't be a hardware fault: I have another identical box running version 2.0.1 with no issues at all. Is it possible that this could be an em(4) driver bug? It could also be OS related, because version 2.0.1 is based on FreeBSD 8.1-RELEASE-p6 and the newer versions are based on 8.1-RELEASE-p13.

    Need some hint.



  • @jon_pow:

    Is it possible that this could be an em(4) driver bug?

    Possible, but not certain. The em rx handler is reported in most of the stack traces in the crash report you posted but that probably is where the bug is detected, not where it occurs.

    The crash report includes the following:
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x10
    fault code = supervisor read data, page not present
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80395e64b0
    frame pointer         = 0x28:0xffffff80395e6500
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq263: em3:rx 0)

    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 0; apic id = 00
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80396074e0
    frame pointer         = 0x28:0xffffff8039607530
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq266: em4:rx 0)

    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 0; apic id = 00
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80000bfe60
    frame pointer         = 0x28:0xffffff80000bfeb0
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq256: em0:rx 0)

    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x10
    fault code = supervisor read data, page not present
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80395e64b0
    frame pointer         = 0x28:0xffffff80395e6500
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq263: em3:rx 0)

    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 0; apic id = 00
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80000bfe60
    frame pointer         = 0x28:0xffffff80000bfeb0
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq256: em0:rx 0)

    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x10
    fault code = supervisor read data, page not present
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80395e64b0
    frame pointer         = 0x28:0xffffff80395e6500
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq263: em3:rx 0)

    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 0; apic id = 00
    instruction pointer = 0x20:0xffffffff807cad35
    stack pointer         = 0x28:0xffffff80396074e0
    frame pointer         = 0x28:0xffffff8039607530
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 12 (irq266: em4:rx 0)

    @jon_pow:

    Need some hint.

    1. Try a 2.1 snapshot build - as mentioned earlier the code is much more "current" than the code in 2.0.x builds
    2. Try an i386 build - I have seen a number of reports of this sort of crash from people running amd64 builds. I don't recall seeing reports of this in i386 builds
    3. The selection of crashes reported here references em0, em3 and em4. What is different about em1 and em2 that they don't get a mention? Different types of traffic? Swap em1 with em3 and em2 with em4 and see if the "problem" moves.
    4. Go back to 2.0.1
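
To make the question in point 3 concrete, here is a minimal Python sketch that tallies which interface's rx handler was the current process in each trap. The "current process" lines are copied from the crash report excerpt above; the function name, the regular expression, and the assumption that the box has exactly em0 through em4 are illustrative, not something the crash report itself states.

```python
import re
from collections import Counter

# Matches the "current process" line of a trap report, e.g.
#   current process = 12 (irq263: em3:rx 0)
# and captures the interface name (em3). Lines for non-interrupt
# processes such as "(openvpn)" do not match.
IRQ_RE = re.compile(r"current process\s*=\s*\d+\s*\(irq\d+:\s*([a-z]+\d+):")

def count_faulting_interfaces(report):
    """Count how often each NIC appears as the faulting rx thread."""
    return Counter(IRQ_RE.findall(report))

# "current process" lines taken from the excerpt quoted in this thread.
report = """\
current process = 12 (irq263: em3:rx 0)
current process = 12 (irq266: em4:rx 0)
current process = 12 (irq256: em0:rx 0)
current process = 12 (irq263: em3:rx 0)
current process = 12 (irq256: em0:rx 0)
current process = 12 (irq263: em3:rx 0)
current process = 12 (irq266: em4:rx 0)
"""

# Assumption: five em(4) interfaces, em0..em4 (4 add-in NICs + 1 onboard).
all_nics = ["em0", "em1", "em2", "em3", "em4"]

seen = count_faulting_interfaces(report)
missing = [nic for nic in all_nics if nic not in seen]

for nic in all_nics:
    print(nic, seen.get(nic, 0))
print("never seen:", missing)
```

Running the same tally over the full crash.txt, rather than this excerpt, would show whether em1 and em2 are really absent or just missing from the selection quoted here.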



  • I took the machine to my own network and it did not crash at all.

    I replaced the box with another one of the same hardware profile in the other environment, running version 2.0.3, and it is still crashing. I then went back to version 2.0.1, but it is still crashing. Very strange problem.

    I send crash reports every day, hoping someone can help.

    
    May  9 14:14:52 sec kernel: Fatal trap 12: page fault while in kernel mode
    May  9 14:14:52 sec kernel: cpuid = 1; apic id = 01
    May  9 14:14:52 sec kernel: fault virtual address       = 0x10
    May  9 14:14:52 sec kernel: fault code          = supervisor read data, page not present
    May  9 14:14:52 sec kernel: instruction pointer = 0x20:0xffffffff807cad25
    May  9 14:14:52 sec kernel: stack pointer               = 0x28:0xffffff803bca43a0
    May  9 14:14:52 sec kernel: frame pointer               = 0x28:0xffffff803bca43f0
    May  9 14:14:52 sec kernel: code segment                = base 0x0, limit 0xfffff, type 0x1b
    May  9 14:14:52 sec kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
    May  9 14:14:52 sec kernel: processor eflags    = interrupt enabled, resume, IOPL = 0
    May  9 14:14:52 sec kernel: current process             = 22984 (openvpn)
    
    May  9 18:03:32 sec kernel: Fatal trap 12: page fault while in kernel mode
    May  9 18:03:32 sec kernel: cpuid = 0; apic id = 00
    May  9 18:03:32 sec kernel: fault virtual address       = 0x21
    May  9 18:03:32 sec kernel: fault code          = supervisor read data, page not present
    May  9 18:03:32 sec kernel: instruction pointer = 0x20:0xffffffff807cad1b
    May  9 18:03:32 sec kernel: stack pointer               = 0x28:0xffffff80395ab4b0
    May  9 18:03:32 sec kernel: frame pointer               = 0x28:0xffffff80395ab500
    May  9 18:03:32 sec kernel: code segment                = base 0x0, limit 0xfffff, type 0x1b
    May  9 18:03:32 sec kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
    May  9 18:03:32 sec kernel: processor eflags    = interrupt enabled, resume, IOPL = 0
    May  9 18:03:32 sec kernel: current process             = 12 (irq260: em2:rx 0)
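
One thing the traps in this thread have in common is the instruction pointer, which differs only in its last byte. The following is a rough Python sketch that extracts the instruction pointers from trap lines and measures how far apart they are. The three sample lines are copied from the reports posted in this thread; the helper name and regular expression are illustrative assumptions, and a small spread only suggests, without proving, that the faults occur in the same kernel function.

```python
import re

# Captures the offset part of a trap's instruction pointer, e.g.
#   instruction pointer = 0x20:0xffffffff807cad35
# Works with or without a leading syslog timestamp.
IP_RE = re.compile(r"instruction pointer\s*=\s*0x[0-9a-f]+:(0x[0-9a-f]+)")

def instruction_pointers(log):
    """Return every faulting instruction pointer in the log as an int."""
    return [int(addr, 16) for addr in IP_RE.findall(log)]

# Lines copied from the trap reports quoted in this thread.
log = """\
instruction pointer = 0x20:0xffffffff807cad35
May  9 14:14:52 sec kernel: instruction pointer = 0x20:0xffffffff807cad25
May  9 18:03:32 sec kernel: instruction pointer = 0x20:0xffffffff807cad1b
"""

ips = instruction_pointers(log)
spread = max(ips) - min(ips)  # distance in bytes between lowest and highest

print("lowest :", hex(min(ips)))
print("highest:", hex(max(ips)))
print("spread :", spread, "bytes")
```

For these three addresses the spread is 26 bytes. Mapping an address to a function name would need the matching kernel symbols, which is beyond this sketch.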
    
    
