No ZFS pool located error - intermittent
-
I've been playing with the ZFS install (pfSense 2.4.1) on a HP DL360 G7. It so happens that occasionally (I would say around 50% of the time) when I reboot I get the following error:
Attempting to boot from hard drive (C:)
gptzfsboot: error 1 lba 32
error 1
gptzfsboot: error 1 lba 4294967288
gptzfsboot: error 1 lba 1
gptzfsboot: error 1 lba 1
gptzfsboot: No ZFS pool located, can't boot
Simply rebooting (CTRL-ALT-DEL) eventually works (sometimes it takes multiple reboots), but this is less than ideal for an environment where this machine must be up at all times. I shudder to think about upgrading it remotely in the field.
The relevant hardware specs are these:
- 4 x 76GB SAS HD in a hardware RAID 10 config (RAID from the HP RAID card directly)
- 48GB of RAM
- I installed with ZFS and chose RAID 10 in the pfSense installer UI (not sure that was the right call)
Is there anything I must know about ZFS that I clearly do not understand? I understood this was a solid file system, more so than UFS at the cost of a small performance hit. This server can certainly handle it.
-
It sounds like maybe you have some kind of BIOS or controller issue where it can't actually read the drive properly. It could very well be a hardware problem, or the first signs of one.
This may be unrelated, but if you are doing RAID on the RAID card, then do not set up RAID in ZFS as well.
If you want to do RAID in ZFS, have the RAID controller expose the disks directly (e.g. JBOD mode), and then use ZFS for the RAID (RAID-Z, or mirrors).
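For reference, the ZFS equivalent of RAID 10 is a striped pool of mirrors. A minimal sketch, assuming the controller exposes the four disks directly and FreeBSD sees them as da0 through da3 (hypothetical device names; check what actually appears on your system):

```shell
# Hypothetical device names (da0-da3); substitute what the OS actually enumerates.
# Two mirror vdevs striped together = ZFS's equivalent of RAID 10.
zpool create tank mirror da0 da1 mirror da2 da3

# Verify the layout and health of the new pool.
zpool status tank
```

With this layout ZFS sees the raw disks, so it can detect and self-heal silent corruption from the mirror copies, which is lost when a hardware RAID volume sits in between.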
-
You answered my big unknown - since I have the choice, what would you recommend? That I use the hardware RAID, I imagine?
Is there any point to ZFS if I use the hardware RAID?
-
I don't know your controller. Unless it's a full and complete RAID controller (not a crappy half-software RAID deal), it's probably better to let ZFS handle the RAID.
Honestly, you're probably better off letting ZFS do it anyhow. Read up on RAID-Z for reasons why.
-
It's a real RAID card (HP P410i). I will read up on it, but for the moment it isn't clear what my best course of action is (emphasis on reliability).
-
I think it is a FreeBSD-related bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=144234
Here is another thread: https://groups.google.com/forum/m/#!msg/bsdmailinglist/MbV1CIBXG4g/L12CTKg6yFIJ
I am not sure, but you could try creating two virtual disks on your RAID 10 and installing on the second one.
It is also possible that some USB or other device has problems being detected at boot, so make sure nothing is plugged into USB, or disable unnecessary devices in the UEFI/BIOS.
Another option is to install VMware ESXi and then install pfSense on top of it; ESXi lists your RAID controller as supported.
-
My intuition tells me it has something to do with ZFS RAID 10 over hardware RAID 10 (with ZFS seeing a single logical disk on top, since the hardware RAID 10 used all 4 disks).
I've done plenty of experimental Linux and pfSense installs on this server (all UFS before, though) - things have been working ever since I removed the RAID 10 logical disk. I don't think it's the controller, but I am keeping an eye on it.
I believe my best bet is to go forward with 4 disks in JBOD mode and ZFS; I'm just waiting on a long-enough SAS cable to try that. (I did find an old Adaptec 2405 RAID card in my "misc. PC parts box" that supports JBOD, which the HP onboard RAID does not. We'll see if that works out.)
-
The general rule of thumb is to never put ZFS on top of a RAID unless that RAID is also ZFS.
-
If you have a real hardware RAID card and you must define some sort of array in order for the OS (FreeBSD) to see the disk(s) you can define the array(s) as single disk RAID0 array(s). The best option is of course a disk controller (be it RAID or not) that just does the basic I/O and offers the raw disks to the operating system as they are.
-
If I use the onboard RAID card on the HP server, 4 x RAID0 is indeed the best I can do. I am waiting on a SAS cable to test another RAID card I had lying around that just might expose the raw disks to the OS - fingers crossed.
-
UPDATE: So I did get my SAS cable and plugged it into the HP server's backplane, bypassing the HP onboard RAID (in fact I disabled it). I connected the backplane to an Adaptec 2405 RAID card, which is supported (according to the online documentation) by FreeBSD 11.1.
I set up my 4 SAS drives as JBOD on the Adaptec 2405 RAID card, and they seem to be configured correctly from what I can see in the Adaptec BIOS menu. When I boot into the pfSense installer, however, I cannot choose those drives - no drives (except for my USB key) appear as a target on which I can install pfSense.
Does anyone have any clue to why pfSense would not see those drives or what I need to do for the drives to be usable?
-
Boot from the install media and drop to a shell when offered the option, then do:
dmesg | grep aac
The controller should be supported by the aac(4) driver but if there are driver initialization problems you'll only see them in the dmesg output.
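If the driver does attach cleanly but no disks appear in the installer, it can also help to check what storage devices FreeBSD itself enumerates. A short sketch using standard FreeBSD commands from that same installer shell (device names shown in comments are illustrative):

```shell
# Full driver probe messages for the Adaptec controller.
dmesg | grep -i aac

# List every disk device GEOM knows about; pass-through disks would
# typically show up as da0..da3, aac(4) volumes as aacd0, aacd1, etc.
geom disk list

# List devices visible at the CAM layer (SCSI/SAS targets).
camcontrol devlist
```

If the disks appear here but not in the installer, the problem is in the installer's disk selection; if they are absent here too, the controller is not exposing them to the OS at all.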
-
Everything seems perfectly fine from dmesg - I do see the Adaptec 2405 specifically detected and working. I just don't see any disks when trying to install.
So the next step I took was to stop configuring the drives as JBOD - I configured them as a single RAID 10, just to test a theory. The logical volume showed up in the pfSense installer!
I then tried to find some other pattern, and I noticed that the following message appeared after the Adaptec BIOS POST only when the drives were in JBOD mode, but not when they were in an array:
No boot device available, no INT13 devices, BIOS not installed
When the drives are configured as an array, I instead see "BIOS installed successfully".
So it definitely has something to do with JBOD mode as opposed to an array. And this is the point where I don't know at all where to look next, as the whole point of this was to use JBOD mode and ZFS and forgo true hardware RAID.
-
UPDATE:
So I found a non-obvious way in the Adaptec config to turn JBOD drives into "simple volumes". I believe it is JBOD drives not being bootable (according to the Adaptec specs) that keeps them from being seen in pfSense.
Not sure what's the implication of this, but I went forward with it since this is just a test system for now.
pfSense saw all 4 disks, installed correctly, etc.
Now I've got a running pfSense, but I am not sure it is ZFS-friendly. One thing I noticed is the absence of SMART info on the pfSense SMART page - there just isn't any device listed.
As for the ZFS file system, this is the "zpool status" result. At least this shows my ZFS in RAID10 mode correctly:
  pool: zroot
 state: ONLINE
  scan: resilvered 2.36M in 0h0m with 0 errors on Fri Nov 10 14:49:57 2017
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            aacd0p3   ONLINE       0     0     0
            aacd1p3   ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            aacd2p3   ONLINE       0     0     0
            aacd3p3   ONLINE       0     0     0

errors: No known data errors
How do I access the drive's health info from pfSense at this point, if nothing shows up in the SMART screen?
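One partial substitute worth noting: even without SMART data, ZFS itself can exercise and verify the disks, since a scrub reads every allocated block and checks its checksum. A sketch of routine pool health checks from a shell, using the pool name zroot shown in the status output (this checks data integrity, not drive SMART attributes):

```shell
# Report only pools with problems (prints "all pools are healthy" otherwise).
zpool status -x

# Start a scrub: reads every allocated block, verifies checksums, and
# repairs from the mirror copy if corruption is found.
zpool scrub zroot

# Check scrub progress and results afterwards.
zpool status zroot
```

This won't surface pre-failure indicators the way SMART does, but a clean periodic scrub is strong evidence the disks are returning correct data.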
-
Replying to my own problem, for future readers who may have the same issue.
- This RAID card does not make JBOD drives bootable. It seems like a weird, arbitrary decision, but that's the way it is, and I haven't found a way around it.
- Morphing the JBOD drives into "simple volumes" seems to create the equivalent of individual RAID0 drives, which does not expose the hardware to FreeBSD/pfSense. So yes, I am seeing 4 logical drives, but to get the full ZFS experience and reliability, I understand the OS must see the real hardware, not just logical volumes.
There seems to be no solution to this besides getting a card with a proper pass-through mode, which is what I am doing now.