pfsense on APU won't boot any more after upgrade
-
Hello,
after an upgrade, pfSense on an APU board won't boot anymore. The system had been running for about a year without a flaw.
Maybe the SSD (Kingston, 120GB) has gone bad? But wouldn't it be a big coincidence for it to go bad exactly on the reboot?
How would I track down the problem and/or retrieve the configuration from the non-booting system?
Thanks in advance!
Attached is the output on the serial console when booting.
PC Engines APU BIOS build date: Apr 5 2014
Reading data from file [bootorder]
SeaBIOS (version ?-20140405_120742-frink)
SeaBIOS (version ?-20140405_120742-frink)
Found coreboot cbmem console @ df150400
Found mainboard PC Engines APU
Relocating init from 0x000e8e71 to 0xdf1065e0 (size 39259)
Found CBFS header at 0xfffffb90
found file "bootorder" in cbmem
CPU Mhz=1000
Found 28 PCI devices (max PCI bus is 05)
Copying PIR from 0xdf160400 to 0x000f27a0
Copying MPTABLE from 0xdf161400/df161410 to 0x000f25b0 with length 1ec
Copying ACPI RSDP from 0xdf162400 to 0x000f2590
Copying SMBIOS entry point from 0xdf16d800 to 0x000f2570
Using pmtimer, ioport 0x808
Scan for VGA option rom
EHCI init on dev 00:12.2 (regs=0xf7f08420)
Found 1 lpt ports
Found 2 serial ports
AHCI controller at 11.0, iobase f7f08000, irq 11
EHCI init on dev 00:13.2 (regs=0xf7f08520)
EHCI init on dev 00:16.2 (regs=0xf7f08620)
Searching bootorder for: /rom@img/setup
Searching bootorder for: /rom@img/memtest
Searching bootorder for: /pci@i0cf8/*@11/drive@0/disk@0
AHCI/0: registering: "AHCI/0: KINGSTON SMS200S3120G ATA-8 Hard-Disk (111 GiByte"
OHCI init on dev 00:12.0 (regs=0xf7f04000)
OHCI init on dev 00:13.0 (regs=0xf7f05000)
OHCI init on dev 00:14.5 (regs=0xf7f06000)
OHCI init on dev 00:16.0 (regs=0xf7f07000)
Searching bootorder for: /pci@i0cf8/usb@16,2/storage@1/*@0/*@0,0
Searching bootorder for: /pci@i0cf8/usb@16,2/usb-*@1
USB MSC vendor='Multiple' product='Card Reader' rev='1.00' type=0 removable=1
Device reports MEDIUM NOT PRESENT
scsi_is_ready returned -1
Unable to configure USB MSC drive.
Unable to configure USB MSC device.
All threads complete.
Scan for option roms
Build date: Apr 5 2014
System memory size: 4592 MB
Press F12 for boot menu.
Searching bootorder for: HALT
drive 0x000f2500: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=234441648
Space available for UMB: c0000-ee800, f0000-f2500
Returned 253952 bytes of ZoneHigh
e820 map has 7 items:
  0: 0000000000000000 - 000000000009fc00 = 1 RAM
  1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
  2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
  3: 0000000000100000 - 00000000df14e000 = 1 RAM
  4: 00000000df14e000 - 00000000e0000000 = 2 RESERVED
  5: 00000000f8000000 - 00000000f9000000 = 2 RESERVED
  6: 0000000100000000 - 000000011f000000 = 1 RAM
enter handle_19:
  NULL
Booting from Hard Disk...
Booting from 0000:7c00
//bboooott//ccoonnffiigg:: --SS111155220000 --DD
IInnvvaalliidd ffoorrmmaatt --
-
Sorry, I forgot to add the output of the boot menu:
Press F12 for boot menu.
Select boot device:
1. AHCI/0: KINGSTON SMS200S3120G ATA-8 Hard-Disk (111 GiBytes)
2. Payload [setup]
3. Payload [memtest]
Searching bootorder for: HALT
drive 0x000f2500: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=234441648
Space available for UMB: c0000-ee800, f0000-f2500
Returned 253952 bytes of ZoneHigh
e820 map has 7 items:
  0: 0000000000000000 - 000000000009fc00 = 1 RAM
  1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
  2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
  3: 0000000000100000 - 00000000df14e000 = 1 RAM
  4: 00000000df14e000 - 00000000e0000000 = 2 RESERVED
  5: 00000000f8000000 - 00000000f9000000 = 2 RESERVED
  6: 0000000100000000 - 000000011f000000 = 1 RAM
enter handle_19:
  NULL
Booting from Hard Disk...
Booting from 0000:7c00
//bboooott//ccoonnffiigg:: --SS111155220000 --DD
IInnvvaalliidd ffoorrmmaatt
-
I had a similar issue when updating to 2.4.4_3 on an APU3, where the kernel was corrupted.
The mSATA drive seems to be OK; I recovered the config and did a fresh install. I took the opportunity to update the BIOS as well.
-
It won't reinstall.
The 120GB Kingston SSD shows 0 cylinders and a capacity of only 32 kB.
I tried to access it with dd: dd can also read and write ONLY 32 kB.
How come the upgrade process rendered the SSD unusable?
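A read test along these lines can be sketched as follows. The helper names expected_bytes and readable_bytes are made up for this post, the device name ada0 and the 512-byte sector size are assumptions, and the sector count is the s=234441648 value SeaBIOS prints in the boot log above. By default the test reads at most 2048 sectors (1 MiB), which is enough to show whether the drive stops at 32 kB.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of pfSense or FreeBSD.

# Convert a sector count (e.g. the "s=..." value from the SeaBIOS log)
# into bytes, assuming 512-byte logical sectors.
expected_bytes() {
    echo $(( $1 * 512 ))
}

# Print how many bytes can actually be read from a device or file.
# $1 = path, $2 = maximum number of 512-byte sectors to try (default 2048).
# dd stops at end of device or at the first read error; its status
# messages go to stderr and are discarded.
readable_bytes() {
    dd if="$1" bs=512 count="${2:-2048}" 2>/dev/null | wc -c | tr -d ' '
}

# Example, as root from the rescue shell (ada0 is an assumption):
#   echo "expected: $(expected_bytes 234441648) bytes"
#   echo "readable: $(readable_bytes /dev/ada0) bytes (of at most 1048576 tried)"
```

On a healthy drive the second number should equal the amount requested; anything smaller means reads stop early.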
-
Some investigation revealed that an erroneous "host protected area" (HPA) setting might cause such a problem.
So I checked on another (identical) unit, from the pfSense "Command Prompt" menu in the web interface:
running
camcontrol hpa ada0
gives:
pass0: <KINGSTON SMS200S3120G 600ABBF0> ATA8-ACS SATA 3.x device
pass0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
Feature                      Support  Enabled   Value
Host Protected Area (HPA)      yes      no     234441648/234441648
HPA - Security                 no
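To compare two units without eyeballing the table, the Value column can be parsed with a few lines of awk. This is only a sketch: hpa_hidden_sectors is a made-up helper name, and it assumes the Value column is printed as current/native maximum sector count, which is my reading of the output and not something confirmed here.

```sh
#!/bin/sh
# Sketch only: hpa_hidden_sectors is a hypothetical helper.
# Reads `camcontrol hpa <dev>` output on stdin and prints the number of
# sectors hidden by the HPA (native max minus current max), or "unknown"
# if no "current/native" value is found in the input.
hpa_hidden_sectors() {
    awk '
        /Host Protected Area/ {
            # Assumption: the last field looks like "234441648/234441648",
            # i.e. current size / native size.
            if (split($NF, v, "/") == 2) {
                printf "%.0f\n", v[2] - v[1]
                found = 1
            }
        }
        END { if (!found) print "unknown" }
    '
}

# Example (ada0 is an assumption):
#   camcontrol hpa ada0 | hpa_hidden_sectors
```

For the output above this prints 0, meaning nothing is hidden on the working unit.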
So I tried the same from the pfSense rescue shell on the installation media:
# camcontrol hpa ada0
pass0: <SandForce{200026BB} 306ABBR0> ATA8-ACS SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
camcontrol: ATA READ_NATIVE_MAX_ADDRESS48 failed: 0
# camcontrol hpa ada0 -U pfsense
pass0: <SandForce{200026BB} 306ABBR0> ATA8-ACS SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
camcontrol: HPA is not supported by this device
#
So I'm somewhat puzzled: why is there no HPA support here? On the running pfSense unit with identical hardware, HPA seems to work fine.
I also tried "secure erase":
# camcontrol security ada0 -s Erase -e Erase
pass0: <SandForce{200026BB} 306ABBR0> ATA8-ACS SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
camcontrol: Security not supported
#
Also not supported?
-
With HPA you can hide part of a drive's capacity from the operating system, for example to over-provision an SSD and increase its lifetime.
It looks to me like your SSD has indeed been bricked. Even if you can get it to work again (e.g. by reflashing its firmware, if that is even possible with that specific drive), I would not use it again for anything critical like a router.
I do not believe this happened due to a bug in pfSense. Updates are just very "hard" on disks compared to normal routing operation, so it could really just be a coincidence that it broke during the update.