Netgate Discussion Forum

    pfSense failover drops/interrupts connections on restart of Primary

    General pfSense Questions
    13 Posts 2 Posters 1.4k Views
    • stephenw10 (Netgate Administrator)

      It should not behave like that. If state sync is configured correctly, it syncs states both ways, so states created on the secondary while it was master should be present on the primary when it fails back. It looks like that is working to some extent; otherwise no states would exist on the primary after it was rebooted.
      Check the interface order in the config files is identical.
      Check for any errors on the sync interface.

      Steve
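
For the interface-order check above, a minimal Python sketch that compares the order of interface entries in two exported config files. It assumes the exported XML has an `<interfaces>` section whose children (`wan`, `lan`, `opt1`, ...) each carry an `<if>` element naming the device; the sample XML and the device names (`igb0` etc.) are made up for illustration, not taken from this thread.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

# Hypothetical, trimmed-down config exports used only for illustration.
PRIMARY = """<pfsense><interfaces>
  <wan><if>igb0</if></wan>
  <lan><if>igb1</if></lan>
  <opt1><if>igb2</if></opt1>
</interfaces></pfsense>"""

SECONDARY = """<pfsense><interfaces>
  <wan><if>igb0</if></wan>
  <opt1><if>igb2</if></opt1>
  <lan><if>igb1</if></lan>
</interfaces></pfsense>"""


def interface_order(config_xml):
    """Return (logical name, device) pairs in the order they appear in the file."""
    root = ET.fromstring(config_xml)
    interfaces = root.find("interfaces")
    if interfaces is None:
        return []
    return [(child.tag, child.findtext("if", default="")) for child in interfaces]


def first_mismatch(primary_xml, secondary_xml):
    """Return (index, primary entry, secondary entry) for the first difference, or None."""
    pairs = zip_longest(interface_order(primary_xml), interface_order(secondary_xml))
    for index, (left, right) in enumerate(pairs):
        if left != right:
            return index, left, right
    return None
```

Calling `first_mismatch(PRIMARY, SECONDARY)` on the two sample strings reports the first position where the order differs; `None` means the interface lists match entry for entry.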

      • AcaaliK

        Hi @stephenw10, I have double-checked that state sync is enabled on both firewalls and that each unit has its opposite number's IP address on the sync interface.

        I have checked the interface order and it is exactly the same on both units; I also checked the XML config downloaded from each unit.

        The sync interfaces show no errors. I went to Status -> Interfaces, and a screenshot is below:

        87cfaffb-0228-4a01-900c-367d4d318e9b-image.png

        Please let me know if there are any other checks I can perform.

        Thank you

        • stephenw10 (Netgate Administrator)

          Does it show approximately the same number of states on both nodes?

          An interesting test, if you're able to do it, would be to set the primary in maintenance mode and then reboot it. It should still be in maintenance mode after that. Bring it back to normal mode after a few minutes and see if it does the same thing. I'm wondering if CARP is switching back to the Primary before the states have synced somehow.

          Steve
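
To make "approximately the same number of states" concrete, here is a small Python sketch. It only compares two counts that you have read off each node yourself; the 10% tolerance is an arbitrary assumption, since live state tables never match exactly.

```python
def relative_gap(primary_states, secondary_states):
    """Difference between the two state counts as a fraction of the larger one."""
    largest = max(primary_states, secondary_states)
    if largest == 0:
        return 0.0  # both tables empty: nothing to compare
    return abs(primary_states - secondary_states) / largest


def roughly_in_sync(primary_states, secondary_states, tolerance=0.10):
    """True when the counts differ by no more than `tolerance` (assumed threshold)."""
    return relative_gap(primary_states, secondary_states) <= tolerance
```

With made-up numbers, `roughly_in_sync(36000, 35000)` is true, while `roughly_in_sync(36000, 1200)` is false, which is the kind of gap you would expect if the primary took over before the states arrived.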

          • AcaaliK

            @stephenw10 Hi, sorry for the delayed response; I was waiting on a maintenance window to try your suggestion. I can confirm the state counts are the same on both main and standby, or at least within a certain range, as they keep changing.

            I put the main firewall in maintenance mode and rebooted it, and we didn't experience any interruptions. It definitely looks like the main unit becomes the master before all the states are synced from the backup after the main's restart.

            Please advise on the next steps.

            Thanks a lot for the help.

            Regards

            • stephenw10 (Netgate Administrator)

              So, to be clear, you were able to bring the Primary out of maintenance mode some time after booting and it failed back without losing connections?

              Steve

              • AcaaliK

                @stephenw10 Hi Stephen, I confirm that is correct.

                Thank you

                Regards

                • stephenw10 (Netgate Administrator)

                  Hmmm, I've never seen that. How many states would you have open typically when you do this?

                  • AcaaliK

                    @stephenw10 Hello Stephen, about 36000 states.

                    Regards

                    • stephenw10 (Netgate Administrator)

                      Hmm, that's not a huge number. I'll see if I can find anything about this.

                      • stephenw10 (Netgate Administrator)

                        Hmm, looks like it's this: https://redmine.pfsense.org/issues/2218

                        Clearly almost nobody hits that. I've never seen it, and that ticket is 7 years old!

                        Do you have packages installed that might be delaying the state sync as it mentions there?

                        Thanks,
                        Steve

                        • AcaaliK

                          @stephenw10 Wow! I've been scratching my head over this for a while. Good to know it's a known issue; I was contemplating rebuilding the units from scratch.

                          I have only Snort and HAProxy enabled. I saw a mention of HAProxy in the link you provided.

                          I do have plans to enable other services like OpenVPN in the near future.

                          I hope a solution can be found, especially for unplanned shutdowns (due to power loss, for example), as I have auto-restart on power restoration enabled for the units.

                          Thanks for all the help and support. I truly appreciate it.

                          • stephenw10 (Netgate Administrator)

                            Mmm, hard to see what we can do here without patching something quite low level.

                            Ideally we would want it to remain in CARP maintenance mode until the states have synced. That would probably need to be selectable though, as some people will not be syncing states.

                            We could probably force the Primary to boot into maintenance mode at every boot, requiring manual intervention to fail back. It would still fail back automatically if the secondary went offline entirely. Would that be in any way practical for you?

                            Steve
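
The "stay in maintenance until the states have synced" idea above can be sketched as a decision rule. This is only an illustration in Python of the logic being discussed, not anything pfSense implements; `NodeView`, `may_leave_maintenance`, and the 10% tolerance are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What the rebooted primary knows at decision time (hypothetical model)."""
    primary_states: int
    secondary_states: int
    secondary_online: bool


def may_leave_maintenance(view, sync_states=True, tolerance=0.10):
    """Decide whether the primary may leave CARP maintenance mode.

    - If the secondary is offline, fail back at once: there is nothing to wait for.
    - If state sync is not in use (the "selectable" case), do not wait either.
    - Otherwise wait until the primary is missing at most `tolerance` of the
      secondary's states (assumed threshold).
    """
    if not view.secondary_online or not sync_states:
        return True
    if view.secondary_states == 0:
        return True
    missing = max(view.secondary_states - view.primary_states, 0)
    return missing / view.secondary_states <= tolerance
```

For example, a freshly booted primary holding 500 states against 36000 on the secondary would be held in maintenance, while one holding 35500 would be released. A real fix would need to live much lower in the boot and CARP handling, as noted above.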

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.