Where can I grab the config?
-
My pfSense system, for whatever reason, ended up non-bootable with what seems to be a corrupted file system. I'll try to boot from a LiveCD and see if I can fsck or otherwise access the disk.
Afterwards I'll likely have to reformat and reinstall for the sake of reliability.
Provided I can access the disk, is there some way I can pull the configuration off it in a form that's useful for configuring the system after it's been reinstalled from scratch?
What do I need to save, and how do I need to proceed to put it back into action on a new install?
Thanks!
-
When you boot from liveCD, there's a rescue option exactly for this.
-
I'll check it out. Haven't gotten to that point yet; last I had to install from CD was in the 1.x days, and as far as I can remember this option didn't exist back then, but this is great news we have this now :)
Hope it works out…
-
So you don't have a backup of your config? That never ceases to amaze me.. Why would you want/need to recover the config from a failed disk? Boot a clean install and just restore your config.
-
I do, but not the most recent config. Been traveling and using an iPad to do stuff like this.
So I could install an older config and try to recreate the changes, or just recover the config off the disk.
The real issue, however, seems to be that there's a bug with system shutdown which can leave the file system in a bad state; looks like 2.2.3 is trying to address this; unfortunately a bit too late in my case…
-
While it's already screwed - install the latest 2.2.3 snapshot from scratch and just restore the config to (hopefully) avoid that filesystem corruption in future.
-
really a bug that messes up the disk? On a normal shutdown? Dok is the king of bug reports, no link to this dok?
Is it on specific hardware only?
-
No, not HW specific at all. Just takes an unclean shutdown to screw things: https://redmine.pfsense.org/issues/4523
-
Thanks.. So phew that clearly states kernel panic or unclean shut down.. Sounded like the OP was saying it just happens if I shut my box down normally.
Nice thing about virtual is I always have a snapshot I can roll back to, and I do take a config backup after any change in the config.
-
So for whatever reason the "rescue" option dies when I boot from an install CD.
I did manage to fsck the disk, however, and have it mounted now at /tmp/hdrescue. I tarred up the /tmp/hdrescue/cf/conf folder, which contains the config.xml, the various .rrd files, etc., and copied that onto a thumb drive.
So I can extract all the files needed from that archive.
Is the config.xml file already in the proper format for restoring, or does it need to be altered in some way?
Is there a way to include the rrd data, just for the sake of it, or would that require some complex operations?
Just want to make sure I have everything before I go and nuke the disk.
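Before nuking the disk, it may be worth checking the archive on another machine. A minimal sketch, assuming Python is available there and that the main config file is stored under the name config.xml with a `<pfsense>` root element (the function name and the archive path are made up for illustration):

```python
import tarfile
import xml.etree.ElementTree as ET


def check_config_archive(archive_path):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    with tarfile.open(archive_path) as tar:
        # Look for regular files whose base name is exactly config.xml.
        members = [m for m in tar.getmembers()
                   if m.isfile() and m.name.split("/")[-1] == "config.xml"]
        if not members:
            return ["no config.xml found in archive"]
        for member in members:
            if member.size == 0:
                problems.append(f"{member.name} is empty")
                continue
            data = tar.extractfile(member).read()
            try:
                root = ET.fromstring(data)
            except ET.ParseError as exc:
                problems.append(f"{member.name} is not well-formed XML: {exc}")
                continue
            # Assumption: a pfSense config.xml has <pfsense> as its root element.
            if root.tag != "pfsense":
                problems.append(f"{member.name} has unexpected root <{root.tag}>")
    return problems
```

Calling `check_config_archive("conf.tar")` (hypothetical file name) and getting an empty list back only says the file is present and parses as XML; it does not prove every setting survived.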
Any ideas if it's worth going ZFS on an SSD, or is the file system fix included in 2.2.3 good, and UFS better for SSD?
Given that I have the box in my paws now, I may as well make sure it's optimized for SSD use and the latest features, before I send it away again…
-
The fix in 2.2.3 is definitely good. Start with a 2.2.3 install, or upgrade to 2.2.3, and you'll be fine. We have systems here that have been through thousands of power cycles (scripted snmpset in a loop to power cycle ports on an IP PDU) in a circumstance where the passwd and group files have just been written out when power is cycled (which is what triggers the problem) with no issue.
It's easy to re-use the existing drive by just copying /etc/master.passwd and /etc/group from some other system. The sync process will update their contents accordingly if there are differences. Otherwise you can just copy /cf/conf/config.xml and the /var/db/rrd/ contents to your new drive.
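The copy step for config.xml and the RRD files can be sketched roughly like this. It assumes both the saved files and the new drive are reachable as directory trees on whatever machine runs the script; `saved_root`, `target_root` and the function name are hypothetical, only the /cf/conf/config.xml and /var/db/rrd/ locations come from the post above:

```python
import shutil
from pathlib import Path


def restore_saved_files(saved_root, target_root):
    """Copy config.xml and any *.rrd files from saved_root into target_root.

    Both arguments are directory trees laid out like the pfSense root
    (cf/conf/config.xml and var/db/rrd/). Returns the list of files written.
    """
    saved_root, target_root = Path(saved_root), Path(target_root)
    copied = []

    # The configuration itself.
    src = saved_root / "cf" / "conf" / "config.xml"
    dst = target_root / "cf" / "conf" / "config.xml"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    copied.append(dst)

    # The RRD graph data is optional; skip it if it was not saved.
    rrd_src = saved_root / "var" / "db" / "rrd"
    rrd_dst = target_root / "var" / "db" / "rrd"
    if rrd_src.is_dir():
        rrd_dst.mkdir(parents=True, exist_ok=True)
        for rrd_file in sorted(rrd_src.glob("*.rrd")):
            shutil.copy2(rrd_file, rrd_dst / rrd_file.name)
            copied.append(rrd_dst / rrd_file.name)
    return copied
```

Plain `cp` from a shell does the same job; the script form just makes it harder to forget one of the two locations.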
-
Ah, way cool. That's good news!
If I only need to copy two files over, that's easy. Thankfully I do have the other unit of the pair within reach, so I just need to grab the files from there, toss them onto a USB stick, and move it over.
Will try that later on, but now I need to get some sleep ;)
-
Hm, seems like that didn't quite work as expected. I could grab /etc/master.passwd and /etc/group from the twin system (other side of the VPN, otherwise nearly identical config), and copied them over to the damaged system.
But there seems to be something else corrupt:
Boot hangs like before with a message like:
Trying to mount root from ufs:/dev/ufsid/<some hex string> [rw]...
and that's where it's stuck. So it seems that something else got damaged. Also, the /etc/master.passwd and /etc/group were still on the drive, whether corrupted or not I can't say, but they weren't zero length.
So either I was bitten by something other than the 4523 bug, or that bug affects also other parts of the system.
A few lines before the last message, there is also this line:
ada0: Previously was known as ad5
Could a new version of the OS have renumbered the devices, and that's why it can't find the root file system???
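For what it's worth, the "were they zero length" check on /etc/master.passwd and /etc/group can be scripted against the mounted disk. A minimal sketch, assuming the damaged file system is mounted somewhere readable (the mount point and the function name are hypothetical):

```python
from pathlib import Path


def find_empty_files(mount_root, relative_paths=("etc/master.passwd", "etc/group")):
    """Return the paths under mount_root that are missing or zero length."""
    suspicious = []
    for rel in relative_paths:
        path = Path(mount_root) / rel
        # A missing file is treated the same as an empty one.
        if not path.is_file() or path.stat().st_size == 0:
            suspicious.append(str(path))
    return suspicious
```

An empty result from something like `find_empty_files("/tmp/hdrescue")` only rules out the zero-length case; a non-empty file can still contain garbage.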
-
Boot hangs like before with a message like:
Trying to mount root from ufs:/dev/ufsid/<some hex string> [rw]...
and that's where it's stuck. So it seems that something else got damaged.
Don't think so. That's where it kicks over to serial console if you had serial console enabled in your restored config. It wouldn't also output to VGA in that case until boot was completely finished. It's probably sitting at an interface assignment prompt on the serial console.
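Whether the restored config asks for a serial console can be checked from a copy of config.xml. This is only a sketch: it assumes the option is stored as an `enableserial` element under `system`, which should be verified against your own file, and the function name is made up:

```python
import xml.etree.ElementTree as ET


def has_system_flag(config_path, flag="enableserial"):
    """Return True if <system> in the given config.xml contains the named element.

    Assumption: the serial console option shows up as <enableserial/> under
    <system>; check this against your own config before relying on it.
    """
    root = ET.parse(config_path).getroot()
    return root.find(f"system/{flag}") is not None
```

If this returns False for the restored config, the serial console was presumably switched on somewhere else, for example in the loader settings.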
A few lines before the last message, there is also this line:
ada0: Previously was known as ad5
Could a new version of the OS have renumbered the devices, and that's why it can't find the root file system???
It changed the device number, but that message means it set it accordingly so it knows it used to be ad5, and is now ada0. If it failed to mountroot, it would have kicked out to a mountroot failure prompt shortly after the above "trying to mount root" prompt.
-
I see. I don't think I ever knowingly enabled serial console, particularly since I have always set up the systems over the VGA console, and don't even have the proper serial cable I'd need to hook up the unit to some terminal (emulator). Probably decades since I last dealt with an actual terminal session, despite still having a vt100 sitting in my basement somewhere ;)
Is there a not too convoluted way that I can edit some file on the drive to switch things back to VGA and disable serial console? Because otherwise I have to either blindly trust that everything works and send the unit halfway across the country and just hope it works, or I'll have to install from scratch (likely the latter), but if I can change a few characters somewhere with a text editor that's quicker than going through this whole thing of reinstallation, and then restoring the config, etc.
-
hm, might be a consequence of the boot_serial being set inadvertently in 2.2.2 if you haven't upgraded it to 2.2.3 yet. You can set it to vidconsole at the loader prompt.
https://doc.pfsense.org/index.php/Boot_Troubleshooting#Booting_with_an_alternate_console
-
OK, this worked, sort of.
Not sure if the configuration is bonkers or if the system threw a hissy-fit because it's configured for a fixed IP that won't work here…
...need to create a pseudo Internet gateway for it first and upload a 2.2.3 system from there.
Anyway, one way or the other I'll get it done ;)
Thanks!