Netgate Discussion Forum

    HA failover issue: CARP switches all interfaces to backup when just one connection fails

    • jronald

      HA failover issue: when any one of our internet connections drops, all interfaces fail over to the backup and then switch back. Why switch at all when the internet connection has also failed on the backup? Any idea why this is happening? Was this a known issue with 22.01? All advanced gateway settings are still at their defaults for all gateways. Any thoughts would be appreciated.

      Thanks!
      Jim

      Running 22.01-RELEASE (amd64) on XG 1538

      Aug 22 11:19:00 sshguard 17300 Now monitoring attacks.
      Aug 22 11:19:00 sshguard 52663 Exiting on signal.
      Aug 22 11:14:22 kernel carp: demoted by -240 to 0 (pfsync bulk
      Aug 22 11:08:00 sshguard 52663 Now monitoring attacks.
      Aug 22 11:08:00 sshguard 14862 Exiting on signal.
      Aug 22 11:07:03 check_reload_status 20671 Starting packages
      Aug 22 11:07:03 check_reload_status 20671 Reloading filter
      Aug 22 11:07:02 check_reload_status 20671 rc.newwanip starting ovpns8
      Aug 22 11:07:02 kernel ovpns8: link state changed to UP
      Aug 22 11:07:02 check_reload_status 20671 Reloading filter
      Aug 22 11:07:01 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
      Aug 22 11:07:00 check_reload_status 20671 Starting packages
      Aug 22 11:07:01 kernel ovpns8: link state changed to DOWN
      Aug 22 11:06:59 check_reload_status 20671 Starting packages
      Aug 22 11:06:59 check_reload_status 20671 Starting packages
      Aug 22 11:06:59 check_reload_status 20671 Starting packages
      Aug 22 11:06:59 check_reload_status 20671 Starting packages
      Aug 22 11:06:59 check_reload_status 20671 rc.newwanip starting ovpns8
      Aug 22 11:06:59 kernel ovpns8: link state changed to UP
      Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpns7
      Aug 22 11:06:58 kernel ovpns7: link state changed to UP
      Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpnc5
      Aug 22 11:06:58 kernel ovpnc5: link state changed to UP
      Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpnc4
      Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpns6
      Aug 22 11:06:57 check_reload_status 20671 Reloading filter
      Aug 22 11:06:58 kernel ovpns8: link state changed to DOWN
      Aug 22 11:06:58 kernel ovpns6: link state changed to UP
      Aug 22 11:06:58 kernel ovpnc4: link state changed to UP
      Aug 22 11:06:55 check_reload_status 20671 Carp master event
      Aug 22 11:06:55 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
      Aug 22 11:06:55 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
      Aug 22 11:06:55 check_reload_status 20671 Carp master event
      Aug 22 11:06:54 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
      Aug 22 11:06:54 check_reload_status 20671 Carp master event
      Aug 22 11:06:54 kernel ovpns7: link state changed to DOWN
      Aug 22 11:06:54 kernel ovpnc5: link state changed to DOWN
      Aug 22 11:06:54 check_reload_status 20671 Reloading filter
      Aug 22 11:06:54 kernel ovpns6: link state changed to DOWN
      Aug 22 11:06:54 check_reload_status 20671 Reloading filter
      Aug 22 11:06:54 kernel ovpnc4: link state changed to DOWN
      Aug 22 11:06:52 check_reload_status 20671 Carp backup event
      Aug 22 11:06:52 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:06:52 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:06:52 check_reload_status 20671 Carp backup event
      Aug 22 11:06:51 check_reload_status 20671 Carp backup event
      Aug 22 11:06:51 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:06:51 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
      Aug 22 11:06:51 check_reload_status 20671 Carp master event
      Aug 22 11:06:47 check_reload_status 20671 Reloading filter
      Aug 22 11:06:47 check_reload_status 20671 Restarting OpenVPN tunnels/interfaces
      Aug 22 11:06:47 check_reload_status 20671 Restarting IPsec tunnels
      Aug 22 11:06:47 check_reload_status 20671 updating dyndns WAN_ATT2_GW
      Aug 22 11:06:47 rc.gateway_alarm 90277 >>> Gateway alarm: WAN_ATT2_GW (Addr:1.1.1.1 Alarm:0 RTT:13.720ms RTTsd:.347ms Loss:5%)
      Aug 22 11:05:51 php 43661 notify_monitor.php: Message sent to xxxx.com OK
      Aug 22 11:05:48 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:48 check_reload_status 20671 Carp backup event
      Aug 22 11:05:31 php 43661 notify_monitor.php: Message sent to xxx.com OK
      Aug 22 11:05:28 check_reload_status 20671 Carp master event
      Aug 22 11:05:28 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
      Aug 22 11:05:24 check_reload_status 20671 Carp backup event
      Aug 22 11:05:24 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:23 check_reload_status 20671 Carp master event
      Aug 22 11:05:24 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
      Aug 22 11:05:22 check_reload_status 20671 Starting packages
      Aug 22 11:05:22 check_reload_status 20671 Reloading filter
      Aug 22 11:05:21 check_reload_status 20671 rc.newwanip starting ovpns8
      Aug 22 11:05:21 check_reload_status 20671 Reloading filter
      Aug 22 11:05:21 kernel ovpns8: link state changed to UP
      Aug 22 11:05:21 kernel arp: x.x.x.x moved from 00:00:5e:00:01:f7 to 00:e0:ed:e3:5f:ac on ixl0
      Aug 22 11:05:21 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
      Aug 22 11:05:21 check_reload_status 20671 Carp master event
      Aug 22 11:05:21 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
      Aug 22 11:05:20 check_reload_status 20671 Reloading filter
      Aug 22 11:05:20 check_reload_status 20671 Carp backup event
      Aug 22 11:05:20 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:19 check_reload_status 20671 Starting packages
      Aug 22 11:05:20 kernel ovpns8: link state changed to DOWN
      Aug 22 11:05:19 check_reload_status 20671 Starting packages
      Aug 22 11:05:18 check_reload_status 20671 rc.newwanip starting ovpns8
      Aug 22 11:05:18 kernel ovpns8: link state changed to UP
      Aug 22 11:05:18 kernel ovpns8: link state changed to DOWN
      Aug 22 11:05:18 check_reload_status 20671 Starting packages
      Aug 22 11:05:18 check_reload_status 20671 rc.newwanip starting ovpns7
      Aug 22 11:05:18 check_reload_status 20671 Carp backup event
      Aug 22 11:05:18 check_reload_status 20671 Carp master event
      Aug 22 11:05:16 check_reload_status 20671 Starting packages
      Aug 22 11:05:16 check_reload_status 20671 Starting packages
      Aug 22 11:05:18 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:18 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
      Aug 22 11:05:18 kernel ovpns7: link state changed to UP
      Aug 22 11:05:16 check_reload_status 20671 rc.newwanip starting ovpnc5
      Aug 22 11:05:16 kernel ovpnc5: link state changed to UP
      Aug 22 11:05:15 kernel ovpns7: link state changed to DOWN
      Aug 22 11:05:15 check_reload_status 20671 Starting packages
      Aug 22 11:05:15 kernel ovpnc5: link state changed to DOWN
      Aug 22 11:05:15 check_reload_status 20671 rc.newwanip starting ovpns6
      Aug 22 11:05:15 kernel ovpns6: link state changed to UP
      Aug 22 11:05:15 check_reload_status 20671 Starting packages
      Aug 22 11:05:15 check_reload_status 20671 rc.newwanip starting ovpnc4
      Aug 22 11:05:15 kernel ovpnc4: link state changed to UP
      Aug 22 11:05:15 kernel ovpns6: link state changed to DOWN
      Aug 22 11:05:15 check_reload_status 20671 Starting packages
      Aug 22 11:05:15 kernel ovpnc4: link state changed to DOWN
      Aug 22 11:05:15 check_reload_status 20671 Starting packages
      Aug 22 11:05:15 check_reload_status 20671 Reloading filter
      Aug 22 11:05:15 kernel carp: demoted by 240 to 240 (pfsync bulk start)
      Aug 22 11:05:15 check_reload_status 20671 Carp backup event
      Aug 22 11:05:15 kernel carp: 247@ixl0: INIT -> BACKUP (initialization complete)
      Aug 22 11:05:15 kernel carp: 247@ixl0: BACKUP -> INIT (hardware interface up)
      Aug 22 11:05:15 check_reload_status 20671 Carp backup event
      Aug 22 11:05:15 php-fpm 70897 /rc.carpmaster: HA cluster member "(x.x.x.x@ix1): (LAN)" has resumed CARP state "MASTER" for vhid 1
      Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpns7
      Aug 22 11:05:14 kernel ovpns7: link state changed to UP
      Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpnc5
      Aug 22 11:05:14 kernel ovpnc5: link state changed to UP
      Aug 22 11:05:14 kernel ovpns7: link state changed to DOWN
      Aug 22 11:05:14 kernel ovpnc5: link state changed to DOWN
      Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpns6
      Aug 22 11:05:14 kernel ovpns6: link state changed to UP
      Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpnc4
      Aug 22 11:05:14 kernel ovpnc4: link state changed to UP
      Aug 22 11:05:14 check_reload_status 20671 Carp master event
      Aug 22 11:05:14 check_reload_status 20671 Carp master event
      Aug 22 11:05:14 kernel carp: 116@ixl3: BACKUP -> MASTER (preempting a slower master)
      Aug 22 11:05:14 kernel carp: 196@ix0: BACKUP -> MASTER (preempting a slower master)
      Aug 22 11:05:14 kernel carp: 1@ix1: BACKUP -> MASTER (preempting a slower master)
      Aug 22 11:05:14 check_reload_status 20671 Carp master event
      Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ixl0
      Aug 22 11:05:13 check_reload_status 20671 Reloading filter
      Aug 22 11:05:13 kernel ovpns6: link state changed to DOWN
      Aug 22 11:05:13 check_reload_status 20671 Reloading filter
      Aug 22 11:05:13 kernel ovpnc4: link state changed to DOWN
      Aug 22 11:05:13 php-fpm 70897 /rc.carpbackup: HA cluster member "(x.x.x.x@ix1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
      Aug 22 11:05:13 check_reload_status 20671 Linkup starting ixl0
      Aug 22 11:05:13 kernel ixl0: link state changed to UP
      Aug 22 11:05:13 kernel carp: demoted by -240 to 0 (interface up)
      Aug 22 11:05:13 kernel carp: 247@ixl0: INIT -> BACKUP (initialization complete)
      Aug 22 11:05:13 kernel ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
      Aug 22 11:05:13 check_reload_status 20671 Carp backup event
      Aug 22 11:05:12 check_reload_status 20671 Carp backup event
      Aug 22 11:05:12 check_reload_status 20671 Carp backup event
      Aug 22 11:05:12 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:12 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:12 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:05:12 check_reload_status 20671 Carp backup event
      Aug 22 11:04:29 check_reload_status 20671 Reloading filter
      Aug 22 11:04:29 check_reload_status 20671 Restarting OpenVPN tunnels/interfaces
      Aug 22 11:04:29 check_reload_status 20671 Restarting IPsec tunnels
      Aug 22 11:04:29 check_reload_status 20671 updating dyndns WAN_ATT2_GW
      Aug 22 11:04:29 rc.gateway_alarm 99002 >>> Gateway alarm: WAN_ATT2_GW (Addr:1.1.1.1 Alarm:1 RTT:13.698ms RTTsd:.808ms Loss:22%)
      Aug 22 11:04:19 php-fpm 70897 /rc.start_packages: Skipping STARTing packages process because previous/another instance is already running
      Aug 22 11:04:19 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
      Aug 22 11:04:19 check_reload_status 20671 Starting packages
      Aug 22 11:04:19 check_reload_status 20671 Starting packages
      Aug 22 11:04:18 check_reload_status 20671 Starting packages
      Aug 22 11:04:18 check_reload_status 20671 Starting packages
      Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - -> x.x.x.x - Restarting packages.
      Aug 22 11:04:18 check_reload_status 20671 Reloading filter
      Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip called with empty interface.
      Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip: on (IP address: x.x.x.x) (interface: []) (real interface: ovpns6).
      Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip: Info: starting on ovpns6.
      Aug 22 11:04:18 check_reload_status 20671 Starting packages
      Aug 22 11:04:18 check_reload_status 20671 rc.newwanip starting ovpns8
      Aug 22 11:04:18 kernel ovpns8: link state changed to UP
      Aug 22 11:04:18 check_reload_status 20671 rc.newwanip starting ovpns7
      Aug 22 11:04:18 kernel ovpns7: link state changed to UP
      Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpnc5
      Aug 22 11:04:17 kernel ovpnc5: link state changed to UP
      Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpns6
      Aug 22 11:04:17 check_reload_status 20671 Reloading filter
      Aug 22 11:04:17 kernel ovpns6: link state changed to UP
      Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpnc4
      Aug 22 11:04:17 check_reload_status 20671 Reloading filter
      Aug 22 11:04:17 kernel ovpnc4: link state changed to UP
      Aug 22 11:04:16 check_reload_status 20671 Carp master event
      Aug 22 11:04:16 check_reload_status 20671 Carp master event
      Aug 22 11:04:16 check_reload_status 20671 Carp master event
      Aug 22 11:04:16 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:16 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:16 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:13 check_reload_status 20671 Carp backup event
      Aug 22 11:04:13 check_reload_status 20671 Carp backup event
      Aug 22 11:04:13 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:13 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:13 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:13 check_reload_status 20671 Carp backup event
      Aug 22 11:04:13 check_reload_status 20671 Carp master event
      Aug 22 11:04:13 check_reload_status 20671 Carp master event
      Aug 22 11:04:13 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:13 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:13 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
      Aug 22 11:04:13 check_reload_status 20671 Carp master event
      Aug 22 11:04:13 kernel ovpns8: link state changed to DOWN
      Aug 22 11:04:09 kernel ovpns7: link state changed to DOWN
      Aug 22 11:04:09 kernel ovpnc5: link state changed to DOWN
      Aug 22 11:04:09 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
      Aug 22 11:04:09 kernel ovpns6: link state changed to DOWN
      Aug 22 11:04:09 check_reload_status 20671 Reloading filter
      Aug 22 11:04:08 kernel ovpnc4: link state changed to DOWN
      Aug 22 11:04:08 check_reload_status 20671 Reloading filter
      Aug 22 11:04:07 check_reload_status 20671 Carp backup event
      Aug 22 11:04:07 check_reload_status 20671 Carp backup event
      Aug 22 11:04:07 check_reload_status 20671 Carp backup event
      Aug 22 11:04:07 check_reload_status 20671 Linkup starting ixl0
      Aug 22 11:04:07 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:07 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:07 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
      Aug 22 11:04:07 kernel ixl0: link state changed to DOWN
      Aug 22 11:04:07 kernel carp: demoted by 240 to 240 (interface down)
      Aug 22 11:04:07 kernel carp: 247@ixl0: MASTER -> INIT (hardware interface down)
      Aug 22 11:04:07 check_reload_status 20671 Carp backup event
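
A log this long is easier to reason about once the CARP state changes are pulled out on their own. The following is a minimal sketch, not part of the original post, that extracts the kernel `carp:` transition lines and counts them per VIP. The function names are hypothetical and the regular expression is only assumed to fit the line format shown above. Events are kept in the order they appear, which in the pasted log is newest first.

```python
import re
from collections import Counter

# Matches kernel CARP state-change lines in the format seen in the log above, e.g.
#   Aug 22 11:06:55 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
# Lines such as "carp: demoted by 240 to 240 (...)" carry no vhid@iface and are skipped.
CARP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+carp:\s+"
    r"(?P<vhid>\d+)@(?P<iface>\S+):\s+"
    r"(?P<old>\w+)\s+->\s+(?P<new>\w+)\s+\((?P<reason>[^)]*)\)"
)


def parse_carp_transitions(log_text):
    """Return one dict per CARP state change, in the order the lines appear."""
    events = []
    for line in log_text.splitlines():
        match = CARP_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def count_by_vip(events):
    """Count state changes per 'vhid@interface' pair."""
    return Counter(f"{e['vhid']}@{e['iface']}" for e in events)


# A few lines copied from the log above, used only as sample input.
SAMPLE = """\
Aug 22 11:04:07 kernel carp: demoted by 240 to 240 (interface down)
Aug 22 11:04:07 kernel carp: 247@ixl0: MASTER -> INIT (hardware interface down)
Aug 22 11:04:07 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
Aug 22 11:04:16 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
"""

if __name__ == "__main__":
    for event in parse_carp_transitions(SAMPLE):
        print(event["ts"], f"{event['vhid']}@{event['iface']}",
              event["old"], "->", event["new"], f"({event['reason']})")
```

Running the same helper over the full log would show how often each VIP changed state during the outage window.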

      • planedrop

        This sounds like a configuration issue. CARP should not be activating just because the internet gateway itself went down.

        Can you give some more info about your setup?

        CARP has to communicate between the firewalls on both the SYNC interface AND the interfaces that have CARP VIPs on them. On the WAN side, this means each firewall needs its own actual IP on the WAN in addition to the WAN CARP VIP, and both firewalls need to be connected to a switch so that each one knows its connection is still up.

        What does your CARP and HA configuration look like?
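
The addressing requirement described above (one real WAN address per firewall plus a shared CARP VIP, all on the same segment) can be sanity-checked with a few lines of code. This is only an illustrative sketch: the function name is hypothetical, and the addresses come from the 198.51.100.0/24 documentation range, not from the poster's network.

```python
import ipaddress


def check_carp_wan_plan(subnet, primary_ip, secondary_ip, vip):
    """Return a list of problems with a WAN addressing plan (empty if none found).

    Checks only what the post describes: three distinct addresses
    (primary, secondary, CARP VIP) that all sit in the same subnet.
    """
    network = ipaddress.ip_network(subnet)
    labels = ("primary", "secondary", "vip")
    addrs = [ipaddress.ip_address(a) for a in (primary_ip, secondary_ip, vip)]

    problems = []
    if len(set(addrs)) != 3:
        problems.append("primary, secondary and VIP must be three distinct addresses")
    for label, addr in zip(labels, addrs):
        if addr not in network:
            problems.append(f"{label} {addr} is outside {network}")
    return problems


if __name__ == "__main__":
    # Hypothetical plan using documentation addresses.
    print(check_carp_wan_plan("198.51.100.0/29",
                              "198.51.100.2", "198.51.100.3", "198.51.100.1"))
```

A plan that reuses one firewall's own address as the VIP, or that places a node outside the WAN subnet, would be reported as a problem.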

        • jimp Rebel Alliance Developer Netgate

          By default, CARP will do preemption on link loss, meaning it cuts over all VIPs at once.

          For most people this is what they want, because a link loss usually means an interface failed, not just a loss of WAN connectivity.

          In the majority of cases, a WAN failure does not cause a loss of link. If you're testing that manually, a better test is to unplug one level higher (e.g. upstream fiber link, coax cable, uplink port, etc.) on the CPE, not the NIC itself.

          You could set net.inet.carp.preempt=0 as a tunable to stop it from behaving that way, but then if something like the primary LAN NIC failed, you'd have to cut it over manually. It wouldn't be seamless.
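
Before changing the tunable, it can help to see what the CARP sysctls are currently set to. The sketch below is an illustration rather than anything from the thread: it parses `sysctl net.inet.carp`-style output ("name: value" per line) into a dictionary and reports whether preemption is enabled. The helper names are hypothetical, and the sample text is made up for the example, not captured from the poster's firewall.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (the format sysctl prints on FreeBSD) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def preempt_enabled(values):
    """True if net.inet.carp.preempt is set to 1 in the parsed values."""
    return values.get("net.inet.carp.preempt") == "1"


# Illustrative sample only; real values must be read from the firewall itself,
# for example by running `sysctl net.inet.carp` in a shell.
SAMPLE_OUTPUT = """\
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.preempt: 1
"""

if __name__ == "__main__":
    parsed = parse_sysctl(SAMPLE_OUTPUT)
    print("preemption enabled:", preempt_enabled(parsed))
```

With `net.inet.carp.preempt` set to 0, the same check would report that preemption is disabled, which matches the trade-off described above.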

          • jronald @jimp

            @jimp Hey Jim, thank you for your response. Is there a tunable that would prevent preemptive failover if the interface on the secondary is also down?

            I watched the High Availability on pfSense 2.4 Hangout video and it looks like things are working as described. The preemptive failover makes sense, but not if the interface on the secondary has also failed. In our case, switching over and then switching back causes a 2 or 3 minute outage for our users. It would be better to just switch gateways. FYI, we have about 150 active users using Zoom phone, MS Teams, TeamViewer, VPN and general internet access.

            Thanks again.
            James Ronald

            • jimp Rebel Alliance Developer Netgate

              Not explicitly, no, but if it's just one interface it may still be OK.

              Preemption based on interface failure is primarily controlled by things like net.inet.carp.ifdown_demotion_factor. The value is the same on both nodes, so in theory both should end up demoted by the same amount for an interface failure, and the primary node may still win an election for master status naturally.

              I haven't tried that, though. If that doesn't work as it is, then it likely isn't feasible.
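
The reasoning above can be explored with a rough model. This sketch is not from the thread and makes several assumptions: that a node advertises at an interval of advbase + skew/256 seconds, that the skew it advertises is its configured advskew plus the current demotion, clamped to the range 0 to 240, and that the node with the shorter interval is preferred as master. The advskew values of 0 and 100 are hypothetical, since the thread does not state the poster's settings.

```python
CARP_MAXSKEW = 240  # assumption: upper bound applied to the advertised skew


def effective_skew(advskew, demotion):
    """Configured skew plus demotion, clamped to 0..CARP_MAXSKEW (assumed behavior)."""
    return max(0, min(CARP_MAXSKEW, advskew + demotion))


def adv_interval(advbase, advskew, demotion=0):
    """Advertisement interval in seconds under the assumed advbase + skew/256 rule."""
    return advbase + effective_skew(advskew, demotion) / 256


def compare(primary_demotion, secondary_demotion,
            primary_skew=0, secondary_skew=100, advbase=1):
    """Return which node advertises more frequently: 'primary', 'secondary' or 'tie'."""
    p = adv_interval(advbase, primary_skew, primary_demotion)
    s = adv_interval(advbase, secondary_skew, secondary_demotion)
    if p < s:
        return "primary"
    if s < p:
        return "secondary"
    return "tie"


if __name__ == "__main__":
    print("no failure:            ", compare(0, 0))
    print("primary demoted by 240:", compare(240, 0))
    print("both demoted by 240:   ", compare(240, 240))
```

Under these assumptions, equal demotion on both nodes does not necessarily leave the primary ahead, because a clamp at the maximum skew can erase the difference between the two nodes. That is a property of this simplified model only, so the behavior is worth confirming on real hardware, as the post suggests.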

              • jronald @jimp

                @jimp Thank you!

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.