Netgate Discussion Forum

    Random multiple master

    HA/CARP/VIPs
    • mayk

      Hello Board,

      We have a problem: for some reason, the backup system in our CARP setup suddenly decides it has to become master. It happens at random, and we have not found a way to reproduce it.

      The system log on the master shows the following:
      Mar 13 17:36:56 pfSense2 kernel: opt3_vip7: link state changed to UP
      Mar 13 17:36:56 pfSense2 kernel: opt3_vip6: link state changed to UP
      Mar 13 17:36:56 pfSense2 kernel: lan_vip3: MASTER -> BACKUP (more frequent advertisement received)
      Mar 13 17:36:57 pfSense2 kernel: opt3_vip9: MASTER -> BACKUP (more frequent advertisement received)
      Mar 13 17:36:57 pfSense2 kernel: opt3_vip7: MASTER -> BACKUP (more frequent advertisement received)
      Mar 13 17:36:57 pfSense2 kernel: opt3_vip6: MASTER -> BACKUP (more frequent advertisement received)
      Mar 13 17:36:59 pfSense2 kernel: lan_vip3: link state changed to DOWN
      Mar 13 17:36:59 pfSense2 kernel: opt3_vip9: link state changed to DOWN
      Mar 13 17:36:59 pfSense2 kernel: opt3_vip7: link state changed to DOWN
      Mar 13 17:36:59 pfSense2 kernel: opt3_vip6: link state changed to DOWN
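
To see which VIPs flap and how often, the transition lines can be pulled out of the system log with a small script. The following is only a sketch: the regular expression is written against the log format shown above, and the function name `parse_carp_transitions` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... pfSense2 kernel: lan_vip3: MASTER -> BACKUP (more frequent advertisement received)"
# "link state changed" lines do not match because they contain no "STATE -> STATE" part.
CARP_TRANSITION = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<vip>\S+):\s+(?P<old>[A-Z]+)\s+->\s+(?P<new>[A-Z]+)"
    r"(?:\s+\((?P<reason>[^)]*)\))?"
)


def parse_carp_transitions(lines):
    """Return one dict per CARP state transition found in the given log lines."""
    events = []
    for line in lines:
        match = CARP_TRANSITION.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


# A few lines copied from the log excerpt in this post.
SAMPLE = [
    "Mar 13 17:36:56 pfSense2 kernel: opt3_vip7: link state changed to UP",
    "Mar 13 17:36:56 pfSense2 kernel: lan_vip3: MASTER -> BACKUP (more frequent advertisement received)",
    "Mar 13 17:36:57 pfSense2 kernel: opt3_vip9: MASTER -> BACKUP (more frequent advertisement received)",
    "Mar 13 17:36:59 pfSense2 kernel: lan_vip3: link state changed to DOWN",
]

events = parse_carp_transitions(SAMPLE)
per_vip = Counter(event["vip"] for event in events)

for event in events:
    print(event["ts"], event["vip"], event["old"], "->", event["new"], event["reason"])
```

Counting transitions per VIP over a longer log shows whether the flapping is limited to one interface or happens on all of them at once.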

      We also notice the console being spammed with "interrupt storm detected" messages.

      Searching the forum turns up multiple suggestions for resolving this, but so far nothing has helped.

      • We tried setting sysctl tunables to the following values:

      net.inet.carp.preempt=1
      net.inet.carp.allow=1
      net.inet.carp.log=1
      net.inet.carp.drop_echoed=1
      hw.intr_storm_threshold=10000

      • When checking which device is causing the storm (vmstat -i), we see it is one particular internet connection. We moved this connection from the onboard NIC to our more powerful quad-port Intel NIC. This did not help.

      Our setup is as follows:

      pf1
      lan 172.17.7.253/16
      carp 10.155.0.1/24
      wan1
      wan2
      wan3
      dmz

      pf2
      lan 172.17.7.252/16
      carp 10.155.0.2/24
      wan1
      wan2
      wan3
      dmz

      Both share 12 virtual IPs of type CARP. The CARP interface is a dedicated physical NIC port.

      When both systems are up and running, I can reboot the master and the backup takes over. When the master comes back, it takes its functions back and becomes master again, and the second system shows as backup again in the CARP status.

      The setup follows this guide: http://www.howtoforge.com/how-to-configure-a-pfsense-2.0-cluster-using-carp

      Does anyone have a pointer on where to start troubleshooting this?

      Thank you in advance. If more information is needed (more IP information), I'm glad to provide it.

      Regards,

      Mayk

      • viragomann

        I had a similar issue here. The WAN VIPs on the backup pfSense also became master while the primary was master.

        For me it helped to shut off flow control on the WAN interface by adding "hw.<if_name>.fc_setting=0".
        Maybe give it a try.

        Regards,

        Richard

        • jimp Rebel Alliance Developer Netgate

          The problem isn't with pfSense, but at layer 2.

          If anything interferes with the multicast heartbeats, you'll get multiple masters if the secondary can't see the heartbeats from the master.

          So either the two are not visible in the same subnet, something (not pfSense) is blocking their CARP heartbeats, or there could be a VHID conflict with some other CARP/VRRP/HSRP device on the same layer 2.
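
One way to rule out the VHID conflict is to list every CARP or VRRP group on each layer 2 segment and look for a VHID claimed by more than one cluster. CARP and IPv4 VRRP both use IP protocol 112 and derive the virtual MAC 00:00:5e:00:01:XX from the VHID, which is why a clash between them matters. The sketch below assumes a hand-maintained inventory; the `CLAIMS` entries and the name "isp-vrrp-router" are invented for illustration and are not taken from this thread.

```python
from collections import defaultdict


def carp_virtual_mac(vhid):
    """Return the virtual MAC address derived from a VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be in the range 1-255")
    return "00:00:5e:00:01:%02x" % vhid


def find_vhid_conflicts(claims):
    """Given (segment, vhid, owner) tuples, return the (segment, vhid)
    pairs that are claimed by more than one owner."""
    owners = defaultdict(set)
    for segment, vhid, owner in claims:
        owners[(segment, vhid)].add(owner)
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}


# Hypothetical inventory: which cluster uses which VHID on which segment.
CLAIMS = [
    ("lan", 3, "pfsense-cluster"),
    ("opt3", 6, "pfsense-cluster"),
    ("opt3", 7, "pfsense-cluster"),
    ("opt3", 7, "isp-vrrp-router"),  # assumed foreign device, for illustration
]

conflicts = find_vhid_conflicts(CLAIMS)
for (segment, vhid), names in conflicts.items():
    print(segment, vhid, carp_virtual_mac(vhid), names)
```

Both nodes of the same pfSense cluster are expected to share a VHID per VIP, so they count as one owner here; only a second, unrelated cluster or router on the same segment is a conflict.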

          Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

          Need help fast? Netgate Global Support!

          Do not Chat/PM for help!

          • mayk

            @jimp:

            The problem isn't with pfSense, but at layer 2.

            If anything interferes with the multicast heartbeats, you'll get multiple masters if the secondary can't see the heartbeats from the master.

            So either the two are not visible in the same subnet, something (not pfSense) is blocking their CARP heartbeats, or there could be a VHID conflict with some other CARP/VRRP/HSRP device on the same layer 2.

            Hello. Thank you for the answer. My guess has always been that something is interfering, but I can't pinpoint it. Regarding the CARP, the interfaces are directly attached with a crossover cable. That would rule out the theory of interference from a switch.
            The thing that bothers me is that it also logs "more frequent advertisement received" on the LAN and other interfaces. I will double-check the VHID settings.

            Thank you again for the reply. Are there other things to check, or data I can provide to analyse this further?

            @viragomann, thanks for the tip. I will check this out too.

            Regards,
            Mayk

            • jimp Rebel Alliance Developer Netgate

              @mayk:

              Hello. Thank you for the answer. My guess has always been that something is interfering, but I can't pinpoint it. Regarding the CARP, the interfaces are directly attached with a crossover cable. That would rule out the theory of interference from a switch.

              That's incorrect. CARP Heartbeats happen on every interface with a CARP VIP. They do not happen on the sync/crossover cable.


              • mayk

                Thank you for clearing that up. That is very useful information. All the connections are on separate VLANs. I think it is wise to check them too.

                • mayk

                  A quick question. When the heartbeat happens on all interfaces with a VIP, is there a way to monitor this? Logs? For example, am I correct to assume that if a heartbeat does not arrive on time on the slave, it will assume there is a failure and become master for that VIP? Is there an option to adjust the timeout settings for the heartbeat, and is it wise to do so?

                  • jimp Rebel Alliance Developer Netgate

                    The way to monitor it: If the heartbeats stop being seen by the slave, it takes over as master. It's logged in the system log.

                    If you want to decrease the sensitivity, increase the advbase on the VIPs. A higher base means that it will be less sensitive to a problem but it also takes longer to detect an outage.
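
To put rough numbers on that trade-off: as far as I understand the FreeBSD CARP implementation, a node advertises every advbase + advskew/256 seconds, and a backup declares the master down after about 3 * advbase + advskew/256 seconds without a heartbeat. Treat both formulas as assumptions to verify against the carp(4) documentation for your version; the function names below are illustrative only.

```python
def advertisement_interval(advbase, advskew):
    """Seconds between advertisements sent by a node (assumed formula)."""
    return advbase + advskew / 256.0


def master_down_timeout(advbase, advskew):
    """Seconds without heartbeats before a backup takes over (assumed formula)."""
    return 3 * advbase + advskew / 256.0


# Compare a few advbase values with the backup's advskew fixed at 100
# (an example value, not taken from this thread).
for advbase in (1, 2, 5):
    print(
        "advbase=%d interval=%.3fs takeover after ~%.3fs"
        % (advbase, advertisement_interval(advbase, 100), master_down_timeout(advbase, 100))
    )
```

Under these assumptions, raising advbase from 1 to 5 moves the takeover threshold from a little over 3 seconds to a little over 15 seconds, which tolerates more lost heartbeats but delays a genuine failover by the same amount.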


                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.