Cold standby for pfsense

  • Hi,
    I'm in the process of configuring a cold standby for our pfsense firewall (v. 2.3.2). I had expected that it would more or less be done by backing up the config and restoring the file to the secondary machine. But things are not that easy, as it turns out.

    First, I found that the interfaces are called igb0/em0/em1 on one machine, while on the other they are em0/em1/em2. Okay, I was able to sort that out in the XML file before restoring it.

    The next issue is that, after restoring, it gets stuck for half an hour or so at "synchronizing user settings". Could that be connected with our LDAP authentication for IPSec?

    When it's through with that, it tries for ages to check for updates, no matter whether it's connected to the Internet or not. (It seems I have to live with that for the time being…)

    And when it's finally through with that, I still can't log in. After I enter my credentials (it's a local account), there's no reaction for some minutes. Finally, it says "Successful login" at the console, but in the browser it says "no page assigned to this user".

    This is getting a bit frustrating. Are there any general hints on how this can be done? Would things be better if we had identical hardware? Does the package "Backup Files and Directories" promise more success? Or is there anything else we can do to make the exchange of hardware go more smoothly?

    Help greatly appreciated.
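
The interface renaming mentioned above (igb0/em0/em1 on one box, em0/em1/em2 on the other) could be scripted rather than done by hand. The following is only a rough sketch: it assumes the backup has the usual layout with an `<interfaces>` section whose children each carry an `<if>` element, and the function name `remap_interfaces` is made up for illustration. Other places that may reference physical NIC names (VLANs, for instance) are not touched.

```python
import xml.etree.ElementTree as ET


def remap_interfaces(config_xml: str, mapping: dict[str, str]) -> str:
    """Rename the physical NIC in every <interfaces>/*/<if> element.

    Each element is looked up in `mapping` exactly once, so overlapping
    renames (igb0 -> em0 and em0 -> em1) do not chain into each other the
    way a sequence of search-and-replace passes would.
    """
    root = ET.fromstring(config_xml)
    interfaces = root.find("interfaces")
    if interfaces is None:
        raise ValueError("no <interfaces> section found")
    for iface in interfaces:
        if_elem = iface.find("if")
        if if_elem is not None and if_elem.text in mapping:
            if_elem.text = mapping[if_elem.text]
    # Note: ElementTree drops the XML declaration and rewrites CDATA
    # sections as escaped text, so the result is not byte-identical.
    return ET.tostring(root, encoding="unicode")
```

Calling it with `{"igb0": "em0", "em0": "em1", "em1": "em2"}` would cover the two machines described here. Letting the restore wizard do the reassignment is probably still the safer route.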

  • One possible solution is to work with virtualization if your hardware supports it. This makes pfSense truly "portable" in terms of hardware compatibility, and even allows running it on a Windows host if you're in a pinch (I didn't test this yet in a real world scenario, but I'm pretty sure it would work).

    If you're interested: VMware ESXi is free (you just have to register for a key), and can be installed to and booted from a USB stick. Any x86 machine with a 64-bit compatible CPU (two cores minimum), 4 GB RAM (more is better), Intel VT-x or AMD-V support, and at least one NIC should work as a VMware "Whitebox". The VMs can be stored on local SATA drives or on an iSCSI datastore. Usage is pretty straightforward once you wrap your head around virtualization as a concept, and there's tons of literature and help on the internet to get you started.

  • Hi, thanks for your quick reply!
    I'm quite familiar with virtualization; however, I was hoping for a solution that would reduce complexity rather than add to it.
    Exchanging the box, restoring the config. Something that can be done by anyone in an emergency.

    Well, I'll probably buy access to the pfsense book and see if there are any further hints regarding recovery from a backup. Shouldn't be that difficult, after all. Thanks, though!

  • Most of the time there are no problems with restoring a config at all.
    1. Are you 100% sure you didn't mess something up while manually editing the config file? For example linefeeds or encoding?
    More to the point, why do this at all? When you restore a config and the interface names don't match, the restore wizard asks you for the correct interface mapping.
    2. Try restoring the config in parts to determine which part fails.
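
A quick sanity check on a hand-edited config.xml, along the lines of the linefeed and encoding question above, might look like the sketch below. The function name `check_config_bytes` is made up, and the list of checks is an assumption about what could go wrong; a reported CRLF or byte order mark is something to look at, not a confirmed cause of a failed restore.

```python
import xml.etree.ElementTree as ET


def check_config_bytes(data: bytes) -> list[str]:
    """Return a list of possible problems in a config.xml payload."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
    if b"\r\n" in data:
        problems.append("file contains CRLF line endings")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8: {exc}")
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        problems.append(f"not well-formed XML: {exc}")
    else:
        # A pfSense backup is expected to have <pfsense> as its root.
        if root.tag != "pfsense":
            problems.append(f"unexpected root element <{root.tag}>")
    return problems
```

Reading the file in binary mode (`open(path, "rb").read()`) and passing the bytes in keeps the line endings exactly as they are on disk.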

  • @pan_2: Okay - from your message, I understand that manually editing config.xml is not recommended.

    Instead I tried "Pre-Flight Installer Configuration Recovery" from the pfsense book.
    During boot, it says "Looking for config.xml on da0 [found msdos] done." Looks good up to here.
    But the installation doesn't start automatically, it is interrupted at the very first page (define keymaps etc.). When I choose "restore from config.xml" manually on the next page, I get an error saying "Can't stat /dev/da0sa1: No such file or directory".

    Well, okay. Next, I tried a clean install on a third machine and once again restored the full config via the Web GUI. As you said, the diverging interface names (here it's hn0 to hn4) are recognized and I'm asked to re-assign them. No problem here. After that, it's the same issue again: when trying to sign in, the web interface says "No page assigned to this user! Click here to logout.". As you asked: I can restore most areas without problems, for example firewall rules. Only restoring "System" (which contains the local users) leaves me unable to sign in.
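
To narrow down which area is at fault, it can help to look at the backup section by section outside the GUI. A small sketch, assuming the top-level children of `<pfsense>` correspond to the areas being restored; the helper names are hypothetical, and whether a section extracted this way is accepted as an area upload by the restore page is untested here.

```python
import xml.etree.ElementTree as ET


def list_sections(config_xml: str) -> list[str]:
    """Return the tag names of the top-level sections, in file order."""
    root = ET.fromstring(config_xml)
    return [child.tag for child in root]


def extract_section(config_xml: str, name: str) -> str:
    """Return one top-level section (e.g. "system") as standalone XML."""
    root = ET.fromstring(config_xml)
    section = root.find(name)
    if section is None:
        raise KeyError(name)
    return ET.tostring(section, encoding="unicode")
```

Pulling out just the `system` section makes it easier to read through the users, groups and authentication settings that arrive with it.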

  • Something special regarding your user settings?
    Maybe post it here?

    Also try enabling SSH before exporting the config (make sure it works and you can log in through it), then check whether you can log in by SSH after importing the config on the new instance.

    "Next, I tried a clean install on a third machine and once again restoring the full config via Web GUI"

    I always prefer to do it this way.

  • Thanks, Soyokaze.
    I found one thing:
    We use local users to authenticate at the Web GUI, while we authenticate via LDAP (Active Directory) to connect by IPSec.
    Both work on the current machine.

    While configuring authentication for IPSec, I had switched System>User Manager>Settings>Authentication Server from "Local Database" to our AD server. This seems to cause the problem. If I switch back to "Local Database", back up the config and restore it, I can successfully log in at the WebGUI. As far as I can see, LDAP authentication for VPN still works.

    Maybe someone can clarify what exactly this option is meant for and how it should be set in our setup?
    By the way, I also wouldn't mind authenticating at the WebGUI via LDAP. But I haven't managed to get this working yet.
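
The workaround described above (switching the authentication server back to "Local Database" before taking the backup) could in principle also be applied to an existing backup file. The sketch below assumes the setting is stored at `system/webgui/authmode` and that the local database is named "Local Database" there; both are assumptions to check against your own config.xml, and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed value for the built-in user database; verify in your own backup.
LOCAL_DB = "Local Database"


def get_webgui_authmode(config_xml: str) -> str:
    """Return the configured WebGUI auth server, or LOCAL_DB if unset."""
    root = ET.fromstring(config_xml)
    return root.findtext("system/webgui/authmode", default=LOCAL_DB)


def force_local_auth(config_xml: str) -> str:
    """Return a copy of the config with WebGUI auth set to LOCAL_DB."""
    root = ET.fromstring(config_xml)
    webgui = root.find("system/webgui")
    if webgui is None:
        raise ValueError("no <system>/<webgui> section found")
    authmode = webgui.find("authmode")
    if authmode is None:
        authmode = ET.SubElement(webgui, "authmode")
    authmode.text = LOCAL_DB
    return ET.tostring(root, encoding="unicode")
```

Checking `get_webgui_authmode` on the backup before restoring at least shows whether the standby box will try to reach an external server at login.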

  • Ah, as I thought.

    1. WebGUI auth through LDAP works; I have done it a couple of times.
    2. Something in your environment prevents the restored configuration from authenticating you through LDAP (judging from the symptoms, it cannot even connect to the LDAP server), which causes a timeout and eventually a failed login.
    3. "System>User Manager>Settings>Authentication Server" is needed for WebGUI auth, AFAIR. Consult this topic for configuring LDAP auth.

  • That pointed in the right direction.
    I had defined the LDAP server by FQDN. But during testing there was also a problem with DNS lookups, so the server was not found. Defining it by IP address instead solved the problem.
    There were some more challenges to overcome (WAN connections didn't come up right away…), but the process of exchanging the hardware finally worked. What a relief!
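
For anyone hitting the same thing: a check like the one below separates "the name does not resolve" from "the server does not answer". It is a sketch meant to be run from a machine that uses the same DNS resolver as the test firewall, which only approximates what the firewall itself sees. The function name is made up; port 389 is the standard LDAP port (636 for LDAPS).

```python
import socket


def check_ldap_endpoint(host, port=389, timeout=5.0,
                        resolve=socket.getaddrinfo,
                        connect=socket.create_connection):
    """Return (ok, message) after a DNS lookup and a TCP connect attempt.

    `resolve` and `connect` default to the real socket functions and can
    be swapped out, which makes the function testable without a network.
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"DNS lookup failed for {host}: {exc}"
    address = infos[0][4][0]  # first resolved IP address
    try:
        with connect((address, port), timeout=timeout):
            pass  # connection opened; nothing is sent
    except OSError as exc:
        return False, (f"{host} resolved to {address}, but TCP port "
                       f"{port} is unreachable: {exc}")
    return True, (f"{host} resolved to {address} and accepts TCP "
                  f"connections on port {port}")
```

A DNS failure here would match the symptom in this thread, where the long login delay went away once the server was defined by IP address.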
