Netgate Discussion Forum

    Failed master node

    HA/CARP/VIPs

    • nokia88

      Hi All,

      I'm running two Supermicro boxes in an HA/CARP setup. The boxes handle the DHCP servers on all the VLANs.
      Today a hardware failure occurred on the master. The failover to the slave went well. I replaced the master's hardware, installed pfSense, and restored a two-day-old config.xml.

      I was expecting the master to sync with the slave and become master again as before, but it did not. During the first boot of the new master both interfaces were down, so I brought them up via the shell, but then all kinds of errors started scrolling on the screen too fast to read. I was also unable to access the webConfigurator or ping the LAN interface.

      Looks like I did something wrong, but I don't know what.
      What is the exact recovery procedure in case of a master hardware failure?

      • Derelict LAYER 8 Netgate

        Sounds like you did OK.

        The best thing to do in that situation is a configuration freeze on the secondary, since the secondary will not sync back to the primary. Otherwise, a log will have to be kept of all changes so they can be made again on the primary node.
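
If no change log was kept, one way to reconstruct it is to diff the backup that was restored against the secondary's current configuration. The following is a minimal sketch using Python's standard `difflib`; the function name, the file labels, and the XML fragments are all made up for illustration and are not real pfSense config content. Expect some differences that are legitimately node-specific (addresses, sync settings) and should not be copied over.

```python
import difflib


def config_changes(backup_xml, current_xml,
                   backup_name="restored-backup.xml",
                   current_name="secondary-current.xml"):
    """Return unified-diff lines between two configuration texts.

    Both file labels are hypothetical and only appear in the diff header.
    """
    return list(difflib.unified_diff(
        backup_xml.splitlines(),
        current_xml.splitlines(),
        fromfile=backup_name,
        tofile=current_name,
        lineterm="",
        n=1,  # one line of context around each change
    ))


# Made-up fragments, only to show the shape of the output.
OLD = """<rule>
  <descr>Allow DNS</descr>
  <port>53</port>
</rule>"""

NEW = """<rule>
  <descr>Allow DNS</descr>
  <port>853</port>
</rule>"""

if __name__ == "__main__":
    for line in config_changes(OLD, NEW):
        print(line)
```

Lines starting with `-` exist only in the restored backup and lines starting with `+` exist only on the secondary, so the `+` lines are the candidates to re-apply on the primary by hand.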

        I would bring the replacement master back up disconnected from the network, restore the configuration, and set permanent CARP maintenance mode there.

        Then I would connect it to the network, start it, and be sure everything comes up in CARP BACKUP state. Make sure everything looks good, then disable CARP maintenance mode and fail back.
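
The "everything comes up in CARP BACKUP state" check can also be done from a shell by looking at the `carp:` lines that FreeBSD's `ifconfig` prints for each VIP. Below is a rough sketch that parses such text; it assumes the usual `carp: STATE vhid N ...` line shape, the helper names are hypothetical, and the sample text is illustrative rather than captured from a real node.

```python
import re

# Matches ifconfig CARP status lines such as:
#   carp: BACKUP vhid 1 advbase 1 advskew 100
CARP_LINE = re.compile(r"^[ \t]*carp:[ \t]+(\w+)[ \t]+vhid[ \t]+(\d+)", re.MULTILINE)


def carp_states(ifconfig_output):
    """Return a list of (vhid, state) tuples in the order they appear."""
    return [(int(vhid), state) for state, vhid in CARP_LINE.findall(ifconfig_output)]


def all_backup(ifconfig_output):
    """True only if at least one CARP VIP was found and every one is BACKUP."""
    states = carp_states(ifconfig_output)
    return bool(states) and all(state == "BACKUP" for _, state in states)


# Illustrative sample only; interface names and addresses are placeholders.
SAMPLE = (
    "em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>\n"
    "\tinet 192.0.2.2 netmask 0xffffff00 broadcast 192.0.2.255\n"
    "\tcarp: BACKUP vhid 1 advbase 1 advskew 100\n"
    "em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>\n"
    "\tinet 198.51.100.2 netmask 0xffffff00 broadcast 198.51.100.255\n"
    "\tcarp: BACKUP vhid 2 advbase 1 advskew 100\n"
)

if __name__ == "__main__":
    print(carp_states(SAMPLE))
    print(all_backup(SAMPLE))
```

An empty result is treated as a failure on purpose: if no `carp:` lines show up at all, the VIPs are probably not configured on the interfaces yet, which is not the same as being safely in BACKUP.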

        It sounds like you might have experienced a kernel panic. Was a crash dump present when you restarted it?

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

        • nokia88

          Thanks for your reply.

          So I started from scratch:

          • Installed the same pfSense version on node1 as on node2
          • Restored the backup XML
          • Set permanent CARP maintenance mode from the console using option 12 and the command "enablecarpmaint exec;"
          • Connected both network cables and ran ifconfig up for both interfaces

          Text started scrolling on the console screen:
          Filter synchronize: beginning XMLRPC sync data to https://XXX/xmlrpc.php
          A communication error occured while attempting to call XMLRPC method host_firmware_version. Unable to connect to tls://xxx:443 Error can't assign requested address.

          arpresolve: can't allocate llinfo for 10.0.0.2 on em1.

          • Derelict LAYER 8 Netgate

            Well it is going to need at least a sync cable to sync over.

            You might also want to disable XMLRPC sync on the restored primary until you are ready to do that too. Or ignore that error.

            If it is supposed to be syncing and cannot, you'll have to work out why there is no connectivity between the two.
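
A quick way to narrow down the connectivity question is to attempt a plain TCP connection from one node to the other's sync address on port 443, the port shown in the XMLRPC error. This is a minimal sketch, assuming Python is available on the machine you test from; the function name is hypothetical and the peer address in the usage comment is a placeholder. An error like "can't assign requested address" generally points at the local side (no usable source address or route on that interface) rather than at the remote host.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Try a TCP connection and report (success, detail).

    This only shows that a TCP handshake completes; it does not
    validate TLS or the XMLRPC credentials.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, unreachable networks,
        # and local addressing errors.
        return False, f"{type(exc).__name__}: {exc}"


# Example usage (the address is a placeholder for the peer's sync IP):
#   ok, detail = can_connect("192.0.2.10")
#   print(ok, detail)
```

If the connection fails, check the sync cable, the interface addresses on both nodes, and the firewall rules on the sync interface before retrying the sync.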

            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.