Core Dumped - less than 12h after upgrading to 2.4.5-RELEASE-p1
-
Upgraded my pfSense install yesterday; it was running 2.4.1 and hadn't been rebooted in a LONG time. Less than 12h after the upgrade, the box cratered in ROYAL style. Services stopped working one by one, until I couldn't even SSH into it any more. I could reboot it from the web GUI, after trying repeatedly.
Admin homedir is full of core files that I'm sure would be really useful; I need to figure out how to get them into the right hands.
-rw------- 1 root wheel  557056 Aug 17 18:45 bsnmpd.core
-rw------- 1 root wheel  512000 Aug 17 20:22 dc.core
-rw------- 1 root wheel  905216 Aug 17 20:22 dhcpd.core
-rw------- 1 root wheel  520192 Aug 17 20:06 gnid.core
-rw------- 1 root wheel  978944 Aug 17 18:45 lldpd.core
-rw------- 1 root wheel  643072 Aug 17 20:22 ntpq.core
-rw------- 1 root wheel  614400 Aug 17 18:45 openssl.core
-rw------- 1 root wheel  622592 Aug 17 18:45 openvpn.core
-rw------- 1 root wheel 8028160 Aug 17 20:22 php-cgi.core
-rw------- 1 root wheel 8044544 Aug 17 20:22 php.core
-rw------- 1 root wheel  790528 Aug 17 20:21 sshd.core
-rw------- 1 root wheel  819200 Aug 17 18:45 zabbix_agentd.core
Looking back in the logs, the oldest entries I find are:
Aug 17 17:18:04 kernel pid 90504 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:18:04 kernel pid 86262 (ntpq), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 12933 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 11592 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 8050 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 7236 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 3145 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 1704 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 98109 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 97146 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
I could probably attach the core files to the post, anyone able to take a look at them, figure out what crashed? Doesn't make me feel good about 2.4.5-p1 at the moment.
-
Yes, please attach textdump.tar and info.0 from the Dashboard page
What is your hardware?
Please show dmesg
-
[2.4.5-RELEASE][admin@pfsense001]/root: dmesg
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.3-STABLE #243 abf8cba50ce(RELENG_2_4_5): Tue Jun 2 17:53:37 EDT 2020 root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-245/obj/amd64/YNx4Qq3j/build/ce-crossbuild-245/sources/FreeBSD-src/sys/pfSense amd64
FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
VT(vga): resolution 640x480
CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (2400.02-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x6fb Family=0x6 Model=0xf Stepping=11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  VT-x: HLT,PAUSE
  TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 4034084864 (3847 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <GBT GBTUACPI>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
ioapic0: Changing APIC ID to 2
ioapic0 <Version 2.0> irqs 0-23 on motherboard
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Timecounter "TSC-low" frequency 1200009771 Hz quality 1000
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff806a2f20, 0) error 1
wlan: mac acl policy registered
kbd1 at kbdmux0
000.000022 [4213] netmap_init netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff812d9960, 0) error 19
mlx5en: Mellanox Ethernet driver 3.5.2 (September 2019)
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
padlock0: No ACE support.
acpi0: <GBT GBTUACPI> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
atrtc0: <AT realtime clock> port 0x70-0x73 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pcib2: <PCI-PCI bridge> at device 0.0 on pci1
pci2: <PCI bus> on pcib2
pcib3: <PCI-PCI bridge> at device 2.0 on pci2
pci3: <PCI bus> on pcib3
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xdf00-0xdf1f mem 0xfd9a0000-0xfd9bffff,0xfd400000-0xfd5fffff,0xfd9fc000-0xfd9fffff irq 18 at device 0.0 on pci3
igb0: Using MSIX interrupts with 5 vectors
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xde00-0xde1f mem 0xfd9c0000-0xfd9dffff,0xfd600000-0xfd7fffff,0xfd9f8000-0xfd9fbfff irq 19 at device 0.1 on pci3
igb1: Using MSIX interrupts with 5 vectors
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib4: <PCI-PCI bridge> at device 4.0 on pci2
pci4: <PCI bus> on pcib4
igb2: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xcf00-0xcf1f mem 0xfd3a0000-0xfd3bffff,0xfce00000-0xfcffffff,0xfd3fc000-0xfd3fffff irq 16 at device 0.0 on pci4
igb2: Using MSIX interrupts with 5 vectors
igb2: Bound queue 0 to cpu 0
igb2: Bound queue 1 to cpu 1
igb2: Bound queue 2 to cpu 2
igb2: Bound queue 3 to cpu 3
igb2: netmap queues/slots: TX 4/1024, RX 4/1024
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xce00-0xce1f mem 0xfd3c0000-0xfd3dffff,0xfd000000-0xfd1fffff,0xfd3f8000-0xfd3fbfff irq 17 at device 0.1 on pci4
igb3: Using MSIX interrupts with 5 vectors
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: netmap queues/slots: TX 4/1024, RX 4/1024
vgapci0: <VGA-compatible display> port 0xff00-0xff07 mem 0xfc800000-0xfcbfffff,0xd0000000-0xdfffffff irq 16 at device 2.0 on pci0
agp0: <Intel G41 SVGA controller> on vgapci0
agp0: aperture size is 256M, detected 32764k stolen memory
vgapci0: Boot video device
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pcib5: [GIANT-LOCKED]
pci5: <ACPI PCI bus> on pcib5
pcib6: <PCI-PCI bridge> irq 16 at device 0.0 on pci5
pci6: <PCI bus> on pcib6
pcib7: <PCI-PCI bridge> irq 19 at device 3.0 on pci6
pci7: <PCI bus> on pcib7
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xbe00-0xbeff mem 0xfdeff000-0xfdefffff,0xfdcfc000-0xfdcfffff irq 19 at device 0.0 on pci7
re0: Using 1 MSI-X message
re0: Chip rev. 0x2c800000
re0: MAC rev. 0x00100000
miibus0: <MII bus> on re0
rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: Using defaults for TSO: 65518/35/2048
re0: netmap queues/slots: TX 1/256, RX 1/256
pcib8: <PCI-PCI bridge> irq 19 at device 7.0 on pci6
pci8: <PCI bus> on pcib8
re1: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xae00-0xaeff mem 0xfddff000-0xfddfffff,0xfdbfc000-0xfdbfffff irq 19 at device 0.0 on pci8
re1: Using 1 MSI-X message
re1: Chip rev. 0x2c800000
re1: MAC rev. 0x00100000
miibus1: <MII bus> on re1
rgephy1: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus1
rgephy1: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re1: Using defaults for TSO: 65518/35/2048
re1: netmap queues/slots: TX 1/256, RX 1/256
pcib9: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
pcib9: [GIANT-LOCKED]
pci9: <ACPI PCI bus> on pcib9
alc0: <Atheros AR8151 v1.0 PCIe Gigabit Ethernet> port 0x9f00-0x9f7f mem 0xfdac0000-0xfdafffff irq 17 at device 0.0 on pci9
alc0: 11776 Tx FIFO, 12032 Rx FIFO
alc0: Using 1 MSI message(s).
miibus2: <MII bus> on alc0
atphy0: <Atheros F1 10/100/1000 PHY> PHY 0 on miibus2
atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
alc0: Using defaults for TSO: 65518/35/2048
uhci0: <Intel 82801G (ICH7) USB controller USB-A> port 0xfe00-0xfe1f irq 23 at device 29.0 on pci0
usbus0 on uhci0
usbus0: 12Mbps Full Speed USB v1.0
uhci1: <Intel 82801G (ICH7) USB controller USB-B> port 0xfd00-0xfd1f irq 19 at device 29.1 on pci0
usbus1 on uhci1
usbus1: 12Mbps Full Speed USB v1.0
uhci2: <Intel 82801G (ICH7) USB controller USB-C> port 0xfc00-0xfc1f irq 18 at device 29.2 on pci0
usbus2 on uhci2
usbus2: 12Mbps Full Speed USB v1.0
uhci3: <Intel 82801G (ICH7) USB controller USB-D> port 0xfb00-0xfb1f irq 16 at device 29.3 on pci0
usbus3 on uhci3
usbus3: 12Mbps Full Speed USB v1.0
ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xfdfff000-0xfdfff3ff irq 23 at device 29.7 on pci0
usbus4: EHCI version 1.0
usbus4 on ehci0
usbus4: 480Mbps High Speed USB v2.0
pcib10: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci10: <ACPI PCI bus> on pcib10
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH7 SATA300 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf800-0xf80f at device 31.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
orm0: <ISA Option ROM> at iomem 0xc0000-0xcc7ff on isa0
ppc0: cannot reserve I/O port range
acpi_perf0: <ACPI CPU Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est1 attach returned 6
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est2 attach returned 6
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est3 attach returned 6
Timecounters tick every 1.000 msec
ugen4.1: <Intel EHCI root HUB> at usbus4
ugen2.1: <Intel UHCI root HUB> at usbus2
ugen3.1: <Intel UHCI root HUB> at usbus3
ugen1.1: <Intel UHCI root HUB> at usbus1
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen0.1: <Intel UHCI root HUB> at usbus0
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub3: 2 ports with 2 removable, self powered
uhub0: 8 ports with 8 removable, self powered
ada0 at ata0 bus 0 scbus0 target 0 lun 0
ada0: <INTEL SSDSA2CW120G3 4PC10362> ATA8-ACS SATA 2.x device
ada0: Serial Number BTPR141500PH120LGN
ada0: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
ada0: 114473MB (234441648 512 byte sectors)
ada0: quirks=0x1<4K>
Trying to mount root from ufs:/dev/ufsid/5dfcea10cfecd0c1 [rw]...
random: unblocking device.
CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (2400.02-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x6fb Family=0x6 Model=0xf Stepping=11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  VT-x: HLT,PAUSE
  TSC: P-state invariant, performance statistics
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est1 attach returned 6
coretemp2: <CPU On-Die Thermal Sensors> on cpu2
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est2 attach returned 6
coretemp3: <CPU On-Die Thermal Sensors> on cpu3
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 921092106000921
device_attach: est3 attach returned 6
-
@viktor_g said in Core Dumped - less than 12h after upgrading to 2.4.5-RELEASE-p1:
Yes, please attach textdump.tar and info.0 from the Dashboard page
What is your hardware?
Please show dmesg
Sorry, I don't see where to find the textdump.tar or info.0?
cores.zip <= zip file of the *.core listed above.
-
Hi,
First of all, don't worry.
You'll need direct console access when these things happen, if even SSH goes down.
A process or program can contain a bug that pops up in situations that exist on your system. But a bug in a process that died a couple of minutes ago can't affect another process. They only share the processor, the kernel and the hardware (memory, that is).
Most pfSense users (95% or more?) run 2.4.5-p1 these days, as we do not want to deal with possible security bugs (I'd rather have my system down than hacked). Your 2.4.1 was dangerously old.
Close to no one on this forum is complaining about all (?) processes dying. Your dying processes have one thing in common: signal 11 => https://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html
Also: check memory usage. Btw: I've been using p1 since day 1 and have not seen any core dumps, nor did I with previous versions.
The fastest way to convince yourself pfSense is fine : swap hardware, or fire up a VM.
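For what it's worth, the "one thing in common" point can be checked quickly from the system log. Below is a minimal Python sketch, not a pfSense tool: the function name `tally_crashes` and the `SAMPLE` text are made up for illustration (the sample lines are copied from the log excerpt earlier in this thread), and it assumes the kernel message format shown there. It counts "exited on signal N" messages per process name. Many unrelated programs all dying on signal 11 would point away from a bug in any single one of them.

```python
import re
from collections import Counter

# Matches FreeBSD kernel messages of the form seen in this thread:
#   pid 90504 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
CRASH_RE = re.compile(
    r"pid (\d+) \(([^)]+)\), jid \d+, uid \d+: exited on signal (\d+)"
)


def tally_crashes(log_text):
    """Return a Counter keyed by (process name, signal number)."""
    counts = Counter()
    for match in CRASH_RE.finditer(log_text):
        _pid, name, signal = match.groups()
        counts[(name, int(signal))] += 1
    return counts


# Illustrative sample only: four lines taken from the log posted above.
SAMPLE = """\
Aug 17 17:18:04 kernel pid 90504 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:18:04 kernel pid 86262 (ntpq), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 12933 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
Aug 17 17:17:01 kernel pid 11592 (dc), jid 0, uid 0: exited on signal 11 (core dumped)
"""

if __name__ == "__main__":
    for (name, signal), count in tally_crashes(SAMPLE).most_common():
        print(f"{name}: {count} crash(es) on signal {signal}")
```

To try it on a real box, replace `SAMPLE` with the text of your own system log. The regex only relies on the part of each message after "kernel", so it should not matter whether the entries are one per line or run together.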
-
Yeah, it's pointing to an I/O error, a failing disk or memory, or hardware being run out of spec (XMP RAM, overclock, etc.).
-
@chrcoluk said in Core Dumped - less than 12h after upgrading to 2.4.5-RELEASE-p1:
Yeah, it's pointing to an I/O error, a failing disk or memory, or hardware being run out of spec (XMP RAM, overclock, etc.).
Thanks. Nothing has changed in the hardware profile in years, it's actually bone stock. Might be on to something with the disk however, I'll force a filesystem check tonight when I can take it down without impacting people.
https://docs.netgate.com/pfsense/en/latest/hardware/forcing-a-filesystem-check.html
-
Note : even excellent hardware can die on you.