Netgate Discussion Forum

    System lock-up

    2.1 Snapshot Feedback and Problems - RETIRED
    5 Posts 3 Posters 2.3k Views
    • alanfs

      Hi, 
      I have a pair of Dell 2950s running pfSense 2.1 as a CARP pair. After several months of testing I rolled the setup into production, and within a few hours the primary and eventually the secondary machine locked up without warning or logs. I have repeated this several times and it doesn't appear to be related to load. The setup is fairly complex, as I have two WAN connections and a DMZ. I have been running a different machine with pfSense 1. in this role for three or four years without a hiccup -  ;D I have ruled out hardware, as I have tried this on two pairs of machines, all of which failed in (apparently) the same way. The current machines have 6 NICs (2 onboard plus 2 dual-interface NC7170s). I am a long-term user and fan of pfSense, but I am nowhere near an expert, so I want to know how I should go about hunting down the problem. I cannot reproduce the problem in testing, but it dies every time it goes into my production network. Unfortunately this makes me very unpopular with the customers, so I need to get insight rather than wade in and experiment! The average traffic is about 50 Mbps down/40 Mbps up but, as I say, I don't think it's load related.

      Thanks all!
      Alan

      • stephenw10 (Netgate Administrator)

        Those boxes appear to have two onboard Broadcom NICs plus whatever others you have (possibly more Broadcom). You should try the recommended NIC tweak for Broadcom on Dell hardware:
        http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Broadcom_bce.284.29_Cards
        I don't know how applicable that is to 2.1 but it's easy to try.

        Steve

        Edit: The NC7170s seem to be Intel, so possibly try the Intel tweaks as well. As it says, this is especially a problem if you are running 64-bit.

        • alanfs

          Hi, I have tried the changes you suggested and the firewall died after about 3 hours this morning with the same symptom, i.e. no trace or output of any kind. CARP kicks in correctly, but the first machine never recovers, so when the second eventually dies CARP runs out of options  :P

          One thing I did just notice was that the network cards all seem to be Intel. The built-in NICs are apparently Broadcom - I am not on site any more, so I can't swear to that!

          Thanks for your help  :)

          My /boot/loader.conf looks like :-
          autoboot_delay="3"
          vm.kmem_size="435544320"
          vm.kmem_size_max="535544320"
          kern.ipc.nmbclusters="131072"
          console="comconsole"
          hw.bce.tso_enable="0"
          hw.pci.enable_msix="0"
          hw.igb.num_queries="1"
          if_igb_load="YES"
          legal.intel_ipw.license_ack="1"
          legal.intel_wpi.license_ack="1"

          Dmesg:-

          Copyright (c) 1992-2010 The FreeBSD Project.
          Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
                  The Regents of the University of California. All rights reserved.
          FreeBSD is a registered trademark of The FreeBSD Foundation.
          FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 18:15:35 EST 2011
              root@FreeBSD_8.0_pfSense_2.0-AMD64.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8 amd64
          Timecounter "i8254" frequency 1193182 Hz quality 0
          CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3591.24-MHz K8-class CPU)
            Origin = "GenuineIntel"  Id = 0xf41  Family = f  Model = 4  Stepping = 1
            Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
            Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
            AMD Features=0x20100800<SYSCALL,NX,LM>
            TSC: P-state invariant
          real memory  = 6442450944 (6144 MB)
          avail memory = 6186790912 (5900 MB)
          ACPI APIC Table: <DELL PE BKC>
          FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
          FreeBSD/SMP: 2 package(s) x 1 core(s) x 2 HTT threads
          cpu0 (BSP): APIC ID:  0
          cpu1 (AP/HT): APIC ID:  1
          cpu2 (AP): APIC ID:  6
          cpu3 (AP/HT): APIC ID:  7
          ioapic0: Changing APIC ID to 8
          ioapic1: Changing APIC ID to 9
          ioapic2: Changing APIC ID to 10
          ioapic3: Changing APIC ID to 11
          ioapic0 <Version 2.0> irqs 0-23 on motherboard
          ioapic1 <Version 2.0> irqs 32-55 on motherboard
          ioapic2 <Version 2.0> irqs 64-87 on motherboard
          ioapic3 <Version 2.0> irqs 96-119 on motherboard
          netisr_init: forcing maxthreads to 1 and bindthreads to 0 for device polling
          wlan: mac acl policy registered
          kbd1 at kbdmux0
          cryptosoft0: <software crypto> on motherboard
          padlock0: No ACE support.
          acpi0: <DELL PE BKC> on motherboard
          acpi0: [ITHREAD]
          acpi0: Power Button (fixed)
          Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
          acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
          cpu0: <ACPI CPU> on acpi0
          cpu1: <ACPI CPU> on acpi0
          cpu2: <ACPI CPU> on acpi0
          cpu3: <ACPI CPU> on acpi0
          acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
          Timecounter "HPET" frequency 14318180 Hz quality 900
          pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
          pci0: <ACPI PCI bus> on pcib0
          pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
          pci1: <ACPI PCI bus> on pcib1
          pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
          pci2: <ACPI PCI bus> on pcib2
          amr0: <LSILogic MegaRAID 1.53> mem 0xf80f0000-0xf80fffff,0xfe9c0000-0xfe9fffff irq 46 at device 14.0 on pci2
          amr0: Using 64-bit DMA
          amr0: [ITHREAD]
          amr0: delete logical drives supported by controller
          amr0: <LSILogic PERC 4e/Di> Firmware 5B2D, BIOS H435, 256MB RAM
          pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
          pci3: <ACPI PCI bus> on pcib3
          pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
          pci4: <ACPI PCI bus> on pcib4
          pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
          pci5: <ACPI PCI bus> on pcib5
          pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
          pci6: <ACPI PCI bus> on pcib6
          em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xecc0-0xecff mem 0xfe6e0000-0xfe6fffff irq 64 at device 7.0 on pci6
          em0: [FILTER]
          pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
          pci7: <ACPI PCI bus> on pcib7
          em1: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xdcc0-0xdcff mem 0xfe4e0000-0xfe4fffff irq 65 at device 8.0 on pci7
          em1: [FILTER]
          pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
          pci8: <ACPI PCI bus> on pcib8
          pcib9: <ACPI PCI-PCI bridge> at device 0.0 on pci8
          pci9: <ACPI PCI bus> on pcib9
          pcib10: <ACPI PCI-PCI bridge> at device 0.2 on pci8
          pci10: <ACPI PCI bus> on pcib10
          em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xccc0-0xccff mem 0xfe1e0000-0xfe1fffff,0xfe180000-0xfe1bffff irq 96 at device 2.0 on pci10
          em2: [FILTER]
          em3: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xcc80-0xccbf mem 0xfe1c0000-0xfe1dffff irq 97 at device 2.1 on pci10
          em3: [FILTER]
          em4: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xcc40-0xcc7f mem 0xfe160000-0xfe17ffff,0xfe100000-0xfe13ffff irq 101 at device 3.0 on pci10
          em4: [FILTER]
          em5: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0xcc00-0xcc3f mem 0xfe140000-0xfe15ffff irq 102 at device 3.1 on pci10
          em5: [FILTER]
          uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xace0-0xacff irq 16 at device 29.0 on pci0
          uhci0: [ITHREAD]
          usbus0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
          uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xacc0-0xacdf irq 19 at device 29.1 on pci0
          uhci1: [ITHREAD]
          usbus1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
          uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xaca0-0xacbf irq 18 at device 29.2 on pci0
          uhci2: [ITHREAD]
          usbus2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
          ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xfeb00000-0xfeb003ff irq 23 at device 29.7 on pci0
          ehci0: [ITHREAD]
          usbus3: EHCI version 1.0
          usbus3: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
          pcib11: <ACPI PCI-PCI bridge> at device 30.0 on pci0
          pci11: <ACPI PCI bus> on pcib11
          vgapci0: <VGA-compatible display> port 0xbc00-0xbcff mem 0xf0000000-0xf7ffffff,0xfdef0000-0xfdefffff irq 18 at device 13.0 on pci11
          isab0: <PCI-ISA bridge> at device 31.0 on pci0
          isa0: <ISA bus> on isab0
          atapci0: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
          ata0: <ATA channel 0> on atapci0
          ata0: [ITHREAD]
          ata1: <ATA channel 1> on atapci0
          ata1: [ITHREAD]
          atrtc0: <AT realtime clock> port 0x70-0x7f irq 8 on acpi0
          fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
          fdc0: [FILTER]
          fd0: <1440-KB 3.5" drive> on fdc0 drive 0
          uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
          uart0: [FILTER]
          uart0: console (9600,n,8,1)
          orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcd800-0xcefff,0xcf000-0xd07ff,0xec000-0xeffff on isa0
          sc0: <System console> at flags 0x100 on isa0
          sc0: VGA <16 virtual consoles, flags=0x300>
          vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
          atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
          atkbd0: <AT Keyboard> irq 1 on atkbdc0
          kbd0 at atkbd0
          atkbd0: [GIANT-LOCKED]
          atkbd0: [ITHREAD]
          ppc0: cannot reserve I/O port range
          est0: <Enhanced SpeedStep Frequency Control> on cpu0
          est: CPU supports Enhanced Speedstep, but is not recognized.
          est: cpu_vendor GenuineIntel, msr 122d0000122d
          device_attach: est0 attach returned 6
          p4tcc0: <CPU Frequency Thermal Control> on cpu0
          est1: <Enhanced SpeedStep Frequency Control> on cpu1
          est: CPU supports Enhanced Speedstep, but is not recognized.
          est: cpu_vendor GenuineIntel, msr 122d0000122d
          device_attach: est1 attach returned 6
          p4tcc1: <CPU Frequency Thermal Control> on cpu1
          est2: <Enhanced SpeedStep Frequency Control> on cpu2
          est: CPU supports Enhanced Speedstep, but is not recognized.
          est: cpu_vendor GenuineIntel, msr 122d0000122d
          device_attach: est2 attach returned 6
          p4tcc2: <CPU Frequency Thermal Control> on cpu2
          est3: <Enhanced SpeedStep Frequency Control> on cpu3
          est: CPU supports Enhanced Speedstep, but is not recognized.
          est: cpu_vendor GenuineIntel, msr 122d0000122d
          device_attach: est3 attach returned 6
          p4tcc3: <CPU Frequency Thermal Control> on cpu3
          Timecounters tick every 1.000 msec
          IPsec: Initialized Security Association Processing.
          usbus0: 12Mbps Full Speed USB v1.0
          usbus1: 12Mbps Full Speed USB v1.0
          usbus2: 12Mbps Full Speed USB v1.0
          usbus3: 480Mbps High Speed USB v2.0
          acd0: CDROM <TEAC CD-ROM CD-224E/K.9A> at ata0-master UDMA33
          amr0: delete logical drives supported by controller
          amrd0: <LSILogic MegaRAID logical drive> on amr0
          amrd0: 69880MB (143114240 sectors) RAID 1 (optimal)
          SMP: AP CPU #1 Launched!
          SMP: AP CPU #3 Launched!
          SMP: AP CPU #2 Launched!
          ugen1.1: <Intel> at usbus1
          ugen0.1: <Intel> at usbus0
          ugen2.1: <Intel> at usbus2
          ugen3.1: <Intel> at usbus3
          uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
          uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
          uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
          uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
          uhub0: 2 ports with 2 removable, self powered
          uhub3: 2 ports with 2 removable, self powered
          uhub2: 2 ports with 2 removable, self powered
          Root mount waiting for: usbus3
          Root mount waiting for: usbus3
          uhub1: 6 ports with 6 removable, self powered
          Root mount waiting for: usbus3
          ugen3.2: <vendor 0x413c> at usbus3
          uhub4: <vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2> on usbus3
          uhub4: 2 ports with 2 removable, self powered
          ugen3.3: <Dell> at usbus3
          ukbd0: <Dell Dell USB Keyboard, class 0/0, rev 1.10/1.05, addr 3> on usbus3
          kbd2 at ukbd0
          Trying to mount root from ufs:/dev/amrd0s1a
          pflog0: promiscuous mode enabled
          vip253: link state changed to UP
          vip210: link state changed to UP
          vip247: link state changed to UP
          vip244: link state changed to UP
          vip243: link state changed to UP
          vip242: link state changed to UP
          vip240: link state changed to UP
          vip236: link state changed to UP
          vip235: link state changed to UP
          vip234: link state changed to UP
          vip233: link state changed to UP
          vip230: link state changed to UP
          vip227: link state changed to UP
          vip226: link state changed to UP
          vip223: link state changed to UP
          vip250: link state changed to UP
          vip254: link state changed to UP
          vip246: link state changed to UP
          vip215: link state changed to UP
          vip218: link state changed to UP
          vip211: link state changed to UP
          vip209: link state changed to UP
          vip167: link state changed to UP
          vip204: link state changed to UP
          vip159: link state changed to UP
          vip195: link state changed to UP
          vip17: link state changed to UP
          vip185: link state changed to UP
          vip184: link state changed to UP
          vip178: link state changed to UP
          vip175: link state changed to UP
          vip172: link state changed to UP
          vip170: link state changed to UP
          vip166: link state changed to UP
          vip21: link state changed to UP
          vip194: link state changed to UP
          vip165: link state changed to UP
          vip164: link state changed to UP
          vip252: link state changed to UP
          vip251: link state changed to UP
          vip40: link state changed to UP
          vip163: link state changed to UP
          vip162: link state changed to UP
          vip161: link state changed to UP
          vip30: link state changed to UP
          vip123: link state changed to UP
          vip124: link state changed to UP
          vip35: link state changed to UP
          vip125: link state changed to UP
          vip5: link state changed to UP
          vip27: link state changed to UP
          vip51: link state changed to UP
          vip54: link state changed to UP
          vip57: link state changed to UP
          vip129: link state changed to UP
          vip126: link state changed to UP
          vip149: link state changed to UP
          vip177: link state changed to UP
          vip24: link state changed to UP
          vip88: link state changed to UP
          vip158: link state changed to UP
          vip4: link state changed to UP
          vip188: link state changed to UP
          vip130: link state changed to UP
          vip1: link state changed to UP
          vip221: link state changed to UP
          vip220: link state changed to UP
          vip212: link state changed to UP
          vip213: link state changed to UP
          vip217: link state changed to UP
          vip219: link state changed to UP
          ugen3.3: <Dell> at usbus3 (disconnected)
          ukbd0: at uhub4, port 2, addr 3 (disconnected)

          • Roots0

            Have you tried connecting a monitor to one of the servers and checking for any messages after a lock-up? It might help with debugging.

            Mobile Computer & Network Support Stockport, UK
            www.timotten.co.uk

            • stephenw10 (Netgate Administrator)

              Ok so, as you say, it looks like you have 6 Intel NICs and they're all the legacy type, em(4) driver.

              Thus, apart from the nmbclusters tweak, the others are not doing anything.
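
The reasoning here (a `hw.<driver>.*` tunable does nothing if that driver never attaches a device) can be checked mechanically. What follows is only a minimal sketch: the function name `unused_driver_tunables` and its two file arguments are hypothetical, and it assumes you have a copy of loader.conf and a saved `dmesg` as plain, unindented text files.

```shell
#!/bin/sh
# unused_driver_tunables LOADER_CONF DMESG_FILE   (hypothetical helper)
# Prints the <driver> part of every hw.<driver>.<knob> line in LOADER_CONF
# for which DMESG_FILE shows no attached device such as "em0: ...".
unused_driver_tunables() {
    conf=$1
    boot=$2
    # Extract the driver name from each hw.<driver>.<knob>=... line.
    sed -n 's/^hw\.\([a-z0-9]*\)\..*/\1/p' "$conf" | sort -u |
    while read -r drv; do
        # Treat a driver as present if a line starts with "<drv><unit>: ".
        grep -q "^[[:space:]]*${drv}[0-9][0-9]*: " "$boot" || echo "$drv"
    done
}
```

Something like `dmesg > /tmp/dmesg.txt; unused_driver_tunables /boot/loader.conf /tmp/dmesg.txt` would be the intended use. With the loader.conf and dmesg posted in this thread it should list `bce` and `igb`, since only em and pci devices show up at boot.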

              You should use /boot/loader.conf.local for additional loader options, as /boot/loader.conf can be overwritten at a firmware upgrade.

              I don't see why you need to load the igb driver.
              Put in hw.em.num_queries="1" instead of the igb line.
              Remove the bce entries.

              Cross your fingers!  ;)

              Interestingly I don't have any OIDs at hw.em but most were introduced after FreeBSD 8.1.
              Edit: Doesn't appear to be a valid OID under 8.3 either.  :-\
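
Putting the suggestions in this post together, a /boot/loader.conf.local might look like the fragment below. This is only a sketch and untested: it carries over the one value from the posted loader.conf that is said to have any effect, and it leaves out hw.em.num_queries because, per the edit above, that does not appear to be a valid OID.

```
# /boot/loader.conf.local
# Kept in the .local file so a firmware upgrade that rewrites
# /boot/loader.conf does not drop it.
kern.ipc.nmbclusters="131072"

# Not carried over from the posted /boot/loader.conf:
#   hw.bce.tso_enable, hw.igb.num_queries, if_igb_load
#     (no bce or igb devices attach in the dmesg; all six NICs are em)
#   hw.pci.enable_msix
#     (omitted following the remark that only the nmbclusters tweak
#      is doing anything on this hardware)
```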

              Steve

              Edit: As suggested above, if it is running out of nmbclusters, that should show up after a crash.
              Also, putting a dmesg listing in a code box makes your post much easier to read.  ;)
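
On the nmbclusters point: `netstat -m` is the usual FreeBSD command for mbuf statistics, so cluster usage can also be watched before a lock-up rather than only afterwards. The helper below is a minimal sketch; the name `mbuf_cluster_pct` is hypothetical, and it assumes the `current/cache/total/max mbuf clusters in use` line format printed by FreeBSD 8.x.

```shell
#!/bin/sh
# mbuf_cluster_pct (hypothetical helper): reads `netstat -m` output on stdin
# and prints the percentage of the mbuf cluster limit currently allocated.
# Assumed input line, e.g.:
#   2310/1286/3596/131072 mbuf clusters in use (current/cache/total/max)
mbuf_cluster_pct() {
    awk '/mbuf clusters in use/ {
        split($1, f, "/")      # f[1]=current f[2]=cache f[3]=total f[4]=max
        if (f[4] > 0) printf "%d\n", (f[3] * 100) / f[4]
    }'
}
```

On the firewall this would be run as `netstat -m | mbuf_cluster_pct`, for example from cron with the result sent to a remote syslog host, so that the last reading survives a hang. A value climbing towards 100 would point at mbuf cluster exhaustion; a low, steady value would suggest looking elsewhere.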

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.