Netgate Discussion Forum

    HA randomly BACKUP goes to MASTER state

    HA/CARP/VIPs
    • M
      m4rek11 @B_IT
      last edited by

      @Przemyslaw85, I'm using a monitoring system too, and when the problem occurs, many hosts (VLANs) are not visible in it for a while.

      @b_it, thank you for the link. Please let us know how your test goes.

      • B
        B_IT @m4rek11
        last edited by

        @m4rek11 @Przemyslaw85 Sure, I will. I hope to be back with good news. Stay tuned.

        • B
          B_IT @B_IT
          last edited by

          Last Saturday I added the two patches I mentioned in the previous comment, and so far it looks much better. I no longer see many unnecessary messages, and both firewalls have been stable for these few days. Here are all the patches I added directly to mitigate the CARP issue:

          Fix CARP event storm when leaving persistent CARP maintenance mode 1/2
          https://github.com/pfsense/pfsense/commit/8a906fba5e42d391227dfc39311d02b570576d50.patch

          Fix CARP event storm when leaving persistent CARP maintenance mode 2/2
          https://github.com/pfsense/pfsense/commit/3c15b353c6968801cfffb7d3b30a7069d2330a3e.patch

          While patching on Saturday I also manually added this one:
          Fix Clicking Save & Force Update on a Dynamic DNS entry results in a GUI timeout
          https://github.com/pfsense/pfsense/commit/bdffb77d1aa21770b23ef408ad9fba79d0825ec5.patch

          and I applied these three patches from the recommended section:
          Disable pf counter data preservation to temporarily work around latency when reloading large rulesets (Redmine #12827)

          Fix Captive Portal handling of non-TCP traffic after login (Redmine #12834)

          Fix OpenVPN dashboard widget client termination (Redmine #12817)

          To sum up: for now I will stay on version 2.6.0 with patches.

          • P
            Przemyslaw85 @B_IT
            last edited by

            @b_it As I understand it, I need to apply the changes from patches 1/2 and 2/2.
            For patch 1/2, do I have to perform the steps on server 1 only, or on both?

            My pfSense box with HA:
            Master: HP DL360G8, 1x E5-2670, 64GB ECC RAM, 8x NIC (17x VLAN)
            Slave: HP DL360G5, 2x E5410, 64GB ECC RAM, 6x NIC (17x VLAN)

            • B
              B_IT @Przemyslaw85
              last edited by

              @przemyslaw85 I think every node should have the same set of patches, so I patched the first node and then the second node.

              These names are just my own naming convention:

              Fix CARP event storm when leaving persistent CARP maintenance mode 1/2
              Fix CARP event storm when leaving persistent CARP maintenance mode 2/2

              For the CARP issue, the second patch will not apply without the first one. This is the view from one node (the second has the same set of patches):
              [screenshot: 4bc2fdee-0f66-45d5-a658-dfb4ca325c88-obraz.png]

              • P
                Przemyslaw85 @B_IT
                last edited by

                @b_it I confirm that the patches work.
                Yesterday I made a few changes to the original files using the file editor, because I didn't know there was a patches module. I had to restore the original files from a copy made before editing.
                After I added the 1/2 and 2/2 patches and the Dynamic DNS patch, I did not notice any improvement. Only after I added patches #12827, #12834, #12816 and #12817 can I say that the system now works as it should.


                • B
                  B_IT @Przemyslaw85
                  last edited by

                  @przemyslaw85 It seemed to me that once I started applying the CARP patches, the firewall was already more responsive while I made the later changes (further patching), but I didn't wait long; I just rebooted both nodes to be sure that all selected patches were fully applied.
                  I have to admit that I only started more thorough testing after rebooting the firewalls (with the patch set mentioned above), so I can't be sure what really helped or how much.
                  BTW: the patching mechanism was introduced around version 2.5, and I have learned from its beginning that I have to be careful when selecting patches.

                  • M
                    m4rek11 @B_IT
                    last edited by

                    @Przemyslaw85, @B_IT, after those changes, did you still see a CARP storm in the logs and the brief MASTER -> BACKUP, BACKUP -> MASTER state changes?

                    • B
                      B_IT @m4rek11
                      last edited by

                      @m4rek11 Looking at the logs, I see some entries from while the patches were being applied, but after patching I see only a few. They all look as they should (at least to me) and each has a reason (e.g. a rebooted node). I wouldn't call them a storm, and I definitely don't see flipping MASTER - BACKUP entries now.
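
To put a number on "storm" versus "a few entries with a reason", one rough approach is to count how many MASTER/BACKUP changes fall inside a short time window. This is only a sketch: it assumes the state changes have already been pulled out of the logs by hand into (timestamp, state) pairs, and the function name `count_flaps` and the 5-minute window are made up for illustration. It does not parse any real pfSense log format.

```python
from datetime import datetime, timedelta


def count_flaps(events, window=timedelta(minutes=5)):
    """Return the largest number of state changes inside any sliding window.

    `events` is a list of (datetime, state) pairs, e.g. states "MASTER"
    and "BACKUP". The input format is an assumption for this sketch.
    """
    # Collect the timestamps at which the state differs from the previous one.
    changes = []
    previous = None
    for ts, state in sorted(events):
        if previous is not None and state != previous:
            changes.append(ts)
        previous = state

    # Slide a window over the change timestamps and track the busiest one.
    worst = 0
    start = 0
    for end, ts in enumerate(changes):
        while ts - changes[start] > window:
            start += 1
        worst = max(worst, end - start + 1)
    return worst
```

A single change around a reboot gives 1, while several changes within the same few minutes would point to flapping. Where to draw the line is a judgment call.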

                      • P
                        Przemyslaw85 @m4rek11
                        last edited by Przemyslaw85

                        @m4rek11 After applying the patches, I have not seen the routers change roles (Master -> Backup, Backup -> Master).
                        Before that, the problems all appeared whenever I made any change to the rules, DNS or DHCP.

                        I had found a configuration error of my own earlier: for some unknown reason, I had set the same VHID on the Virtual IPs of 2 different networks. Fixing that did not make the problems go away, though. Only after applying the patches was the problem gone.
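
A reused VHID like the one described above is easy to miss with many VLANs. A minimal sketch of a check follows, assuming the Virtual IP list has been copied by hand into (interface, vhid) pairs; the function name `find_reused_vhids` and the interface names are hypothetical, and nothing here reads the actual pfSense configuration.

```python
from collections import defaultdict


def find_reused_vhids(vips):
    """Map every VHID used by more than one entry to the interfaces using it.

    `vips` is a list of (interface_name, vhid) pairs typed in by hand;
    this input format is an assumption for the sketch.
    """
    by_vhid = defaultdict(list)
    for interface, vhid in vips:
        by_vhid[vhid].append(interface)
    # Keep only the VHIDs that appear on more than one entry.
    return {vhid: ifaces for vhid, ifaces in by_vhid.items() if len(ifaces) > 1}


# Hypothetical example data, not taken from the thread.
vips = [("lan", 1), ("vlan10", 2), ("vlan20", 2)]
for vhid, ifaces in find_reused_vhids(vips).items():
    print(f"VHID {vhid} is used on: {', '.join(ifaces)}")
```

Any VHID it reports is just a candidate to review by hand.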


                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.