Netgate Discussion Forum

HA failover Issue with CARP switching all interfaces to backup when just one connection fails.

HA/CARP/VIPs
  • J
    jronald
    last edited by Aug 23, 2023, 1:32 PM

    HA failover issue: when any one of our internet connections drops, all interfaces fail over to the backup and then switch back. Why switch at all when the internet connection has also failed on the backup? Any idea why this is happening? Was this a known issue with 22.01? All advanced gateway settings are still at the defaults for all gateways. Any thoughts would be appreciated.

    Thanks!
    Jim

    Running 22.01-RELEASE (amd64) on XG 1538

    Aug 22 11:19:00 sshguard 17300 Now monitoring attacks.
    Aug 22 11:19:00 sshguard 52663 Exiting on signal.
    Aug 22 11:14:22 kernel carp: demoted by -240 to 0 (pfsync bulk
    Aug 22 11:08:00 sshguard 52663 Now monitoring attacks.
    Aug 22 11:08:00 sshguard 14862 Exiting on signal.
    Aug 22 11:07:03 check_reload_status 20671 Starting packages
    Aug 22 11:07:03 check_reload_status 20671 Reloading filter
    Aug 22 11:07:02 check_reload_status 20671 rc.newwanip starting ovpns8
    Aug 22 11:07:02 kernel ovpns8: link state changed to UP
    Aug 22 11:07:02 check_reload_status 20671 Reloading filter
    Aug 22 11:07:01 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
    Aug 22 11:07:00 check_reload_status 20671 Starting packages
    Aug 22 11:07:01 kernel ovpns8: link state changed to DOWN
    Aug 22 11:06:59 check_reload_status 20671 Starting packages
    Aug 22 11:06:59 check_reload_status 20671 Starting packages
    Aug 22 11:06:59 check_reload_status 20671 Starting packages
    Aug 22 11:06:59 check_reload_status 20671 Starting packages
    Aug 22 11:06:59 check_reload_status 20671 rc.newwanip starting ovpns8
    Aug 22 11:06:59 kernel ovpns8: link state changed to UP
    Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpns7
    Aug 22 11:06:58 kernel ovpns7: link state changed to UP
    Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpnc5
    Aug 22 11:06:58 kernel ovpnc5: link state changed to UP
    Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpnc4
    Aug 22 11:06:58 check_reload_status 20671 rc.newwanip starting ovpns6
    Aug 22 11:06:57 check_reload_status 20671 Reloading filter
    Aug 22 11:06:58 kernel ovpns8: link state changed to DOWN
    Aug 22 11:06:58 kernel ovpns6: link state changed to UP
    Aug 22 11:06:58 kernel ovpnc4: link state changed to UP
    Aug 22 11:06:55 check_reload_status 20671 Carp master event
    Aug 22 11:06:55 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
    Aug 22 11:06:55 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
    Aug 22 11:06:55 check_reload_status 20671 Carp master event
    Aug 22 11:06:54 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
    Aug 22 11:06:54 check_reload_status 20671 Carp master event
    Aug 22 11:06:54 kernel ovpns7: link state changed to DOWN
    Aug 22 11:06:54 kernel ovpnc5: link state changed to DOWN
    Aug 22 11:06:54 check_reload_status 20671 Reloading filter
    Aug 22 11:06:54 kernel ovpns6: link state changed to DOWN
    Aug 22 11:06:54 check_reload_status 20671 Reloading filter
    Aug 22 11:06:54 kernel ovpnc4: link state changed to DOWN
    Aug 22 11:06:52 check_reload_status 20671 Carp backup event
    Aug 22 11:06:52 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:06:52 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:06:52 check_reload_status 20671 Carp backup event
    Aug 22 11:06:51 check_reload_status 20671 Carp backup event
    Aug 22 11:06:51 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:06:51 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
    Aug 22 11:06:51 check_reload_status 20671 Carp master event
    Aug 22 11:06:47 check_reload_status 20671 Reloading filter
    Aug 22 11:06:47 check_reload_status 20671 Restarting OpenVPN tunnels/interfaces
    Aug 22 11:06:47 check_reload_status 20671 Restarting IPsec tunnels
    Aug 22 11:06:47 check_reload_status 20671 updating dyndns WAN_ATT2_GW
    Aug 22 11:06:47 rc.gateway_alarm 90277 >>> Gateway alarm: WAN_ATT2_GW (Addr:1.1.1.1 Alarm:0 RTT:13.720ms RTTsd:.347ms Loss:5%)
    Aug 22 11:05:51 php 43661 notify_monitor.php: Message sent to xxxx.com OK
    Aug 22 11:05:48 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:48 check_reload_status 20671 Carp backup event
    Aug 22 11:05:31 php 43661 notify_monitor.php: Message sent to xxx.com OK
    Aug 22 11:05:28 check_reload_status 20671 Carp master event
    Aug 22 11:05:28 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
    Aug 22 11:05:24 check_reload_status 20671 Carp backup event
    Aug 22 11:05:24 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:23 check_reload_status 20671 Carp master event
    Aug 22 11:05:24 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
    Aug 22 11:05:22 check_reload_status 20671 Starting packages
    Aug 22 11:05:22 check_reload_status 20671 Reloading filter
    Aug 22 11:05:21 check_reload_status 20671 rc.newwanip starting ovpns8
    Aug 22 11:05:21 check_reload_status 20671 Reloading filter
    Aug 22 11:05:21 kernel ovpns8: link state changed to UP
    Aug 22 11:05:21 kernel arp: x.x.x.x moved from 00:00:5e:00:01:f7 to 00:e0:ed:e3:5f:ac on ixl0
    Aug 22 11:05:21 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
    Aug 22 11:05:21 check_reload_status 20671 Carp master event
    Aug 22 11:05:21 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
    Aug 22 11:05:20 check_reload_status 20671 Reloading filter
    Aug 22 11:05:20 check_reload_status 20671 Carp backup event
    Aug 22 11:05:20 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:19 check_reload_status 20671 Starting packages
    Aug 22 11:05:20 kernel ovpns8: link state changed to DOWN
    Aug 22 11:05:19 check_reload_status 20671 Starting packages
    Aug 22 11:05:18 check_reload_status 20671 rc.newwanip starting ovpns8
    Aug 22 11:05:18 kernel ovpns8: link state changed to UP
    Aug 22 11:05:18 kernel ovpns8: link state changed to DOWN
    Aug 22 11:05:18 check_reload_status 20671 Starting packages
    Aug 22 11:05:18 check_reload_status 20671 rc.newwanip starting ovpns7
    Aug 22 11:05:18 check_reload_status 20671 Carp backup event
    Aug 22 11:05:18 check_reload_status 20671 Carp master event
    Aug 22 11:05:16 check_reload_status 20671 Starting packages
    Aug 22 11:05:16 check_reload_status 20671 Starting packages
    Aug 22 11:05:18 kernel carp: 247@ixl0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:18 kernel carp: 247@ixl0: BACKUP -> MASTER (master timed out)
    Aug 22 11:05:18 kernel ovpns7: link state changed to UP
    Aug 22 11:05:16 check_reload_status 20671 rc.newwanip starting ovpnc5
    Aug 22 11:05:16 kernel ovpnc5: link state changed to UP
    Aug 22 11:05:15 kernel ovpns7: link state changed to DOWN
    Aug 22 11:05:15 check_reload_status 20671 Starting packages
    Aug 22 11:05:15 kernel ovpnc5: link state changed to DOWN
    Aug 22 11:05:15 check_reload_status 20671 rc.newwanip starting ovpns6
    Aug 22 11:05:15 kernel ovpns6: link state changed to UP
    Aug 22 11:05:15 check_reload_status 20671 Starting packages
    Aug 22 11:05:15 check_reload_status 20671 rc.newwanip starting ovpnc4
    Aug 22 11:05:15 kernel ovpnc4: link state changed to UP
    Aug 22 11:05:15 kernel ovpns6: link state changed to DOWN
    Aug 22 11:05:15 check_reload_status 20671 Starting packages
    Aug 22 11:05:15 kernel ovpnc4: link state changed to DOWN
    Aug 22 11:05:15 check_reload_status 20671 Starting packages
    Aug 22 11:05:15 check_reload_status 20671 Reloading filter
    Aug 22 11:05:15 kernel carp: demoted by 240 to 240 (pfsync bulk start)
    Aug 22 11:05:15 check_reload_status 20671 Carp backup event
    Aug 22 11:05:15 kernel carp: 247@ixl0: INIT -> BACKUP (initialization complete)
    Aug 22 11:05:15 kernel carp: 247@ixl0: BACKUP -> INIT (hardware interface up)
    Aug 22 11:05:15 check_reload_status 20671 Carp backup event
    Aug 22 11:05:15 php-fpm 70897 /rc.carpmaster: HA cluster member "(x.x.x.x@ix1): (LAN)" has resumed CARP state "MASTER" for vhid 1
    Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpns7
    Aug 22 11:05:14 kernel ovpns7: link state changed to UP
    Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpnc5
    Aug 22 11:05:14 kernel ovpnc5: link state changed to UP
    Aug 22 11:05:14 kernel ovpns7: link state changed to DOWN
    Aug 22 11:05:14 kernel ovpnc5: link state changed to DOWN
    Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpns6
    Aug 22 11:05:14 kernel ovpns6: link state changed to UP
    Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ovpnc4
    Aug 22 11:05:14 kernel ovpnc4: link state changed to UP
    Aug 22 11:05:14 check_reload_status 20671 Carp master event
    Aug 22 11:05:14 check_reload_status 20671 Carp master event
    Aug 22 11:05:14 kernel carp: 116@ixl3: BACKUP -> MASTER (preempting a slower master)
    Aug 22 11:05:14 kernel carp: 196@ix0: BACKUP -> MASTER (preempting a slower master)
    Aug 22 11:05:14 kernel carp: 1@ix1: BACKUP -> MASTER (preempting a slower master)
    Aug 22 11:05:14 check_reload_status 20671 Carp master event
    Aug 22 11:05:14 check_reload_status 20671 rc.newwanip starting ixl0
    Aug 22 11:05:13 check_reload_status 20671 Reloading filter
    Aug 22 11:05:13 kernel ovpns6: link state changed to DOWN
    Aug 22 11:05:13 check_reload_status 20671 Reloading filter
    Aug 22 11:05:13 kernel ovpnc4: link state changed to DOWN
    Aug 22 11:05:13 php-fpm 70897 /rc.carpbackup: HA cluster member "(x.x.x.x@ix1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Aug 22 11:05:13 check_reload_status 20671 Linkup starting ixl0
    Aug 22 11:05:13 kernel ixl0: link state changed to UP
    Aug 22 11:05:13 kernel carp: demoted by -240 to 0 (interface up)
    Aug 22 11:05:13 kernel carp: 247@ixl0: INIT -> BACKUP (initialization complete)
    Aug 22 11:05:13 kernel ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
    Aug 22 11:05:13 check_reload_status 20671 Carp backup event
    Aug 22 11:05:12 check_reload_status 20671 Carp backup event
    Aug 22 11:05:12 check_reload_status 20671 Carp backup event
    Aug 22 11:05:12 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:12 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:12 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:05:12 check_reload_status 20671 Carp backup event
    Aug 22 11:04:29 check_reload_status 20671 Reloading filter
    Aug 22 11:04:29 check_reload_status 20671 Restarting OpenVPN tunnels/interfaces
    Aug 22 11:04:29 check_reload_status 20671 Restarting IPsec tunnels
    Aug 22 11:04:29 check_reload_status 20671 updating dyndns WAN_ATT2_GW
    Aug 22 11:04:29 rc.gateway_alarm 99002 >>> Gateway alarm: WAN_ATT2_GW (Addr:1.1.1.1 Alarm:1 RTT:13.698ms RTTsd:.808ms Loss:22%)
    Aug 22 11:04:19 php-fpm 70897 /rc.start_packages: Skipping STARTing packages process because previous/another instance is already running
    Aug 22 11:04:19 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
    Aug 22 11:04:19 check_reload_status 20671 Starting packages
    Aug 22 11:04:19 check_reload_status 20671 Starting packages
    Aug 22 11:04:18 check_reload_status 20671 Starting packages
    Aug 22 11:04:18 check_reload_status 20671 Starting packages
    Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - -> x.x.x.x - Restarting packages.
    Aug 22 11:04:18 check_reload_status 20671 Reloading filter
    Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip called with empty interface.
    Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip: on (IP address: x.x.x.x) (interface: []) (real interface: ovpns6).
    Aug 22 11:04:18 php-fpm 70897 /rc.newwanip: rc.newwanip: Info: starting on ovpns6.
    Aug 22 11:04:18 check_reload_status 20671 Starting packages
    Aug 22 11:04:18 check_reload_status 20671 rc.newwanip starting ovpns8
    Aug 22 11:04:18 kernel ovpns8: link state changed to UP
    Aug 22 11:04:18 check_reload_status 20671 rc.newwanip starting ovpns7
    Aug 22 11:04:18 kernel ovpns7: link state changed to UP
    Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpnc5
    Aug 22 11:04:17 kernel ovpnc5: link state changed to UP
    Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpns6
    Aug 22 11:04:17 check_reload_status 20671 Reloading filter
    Aug 22 11:04:17 kernel ovpns6: link state changed to UP
    Aug 22 11:04:17 check_reload_status 20671 rc.newwanip starting ovpnc4
    Aug 22 11:04:17 check_reload_status 20671 Reloading filter
    Aug 22 11:04:17 kernel ovpnc4: link state changed to UP
    Aug 22 11:04:16 check_reload_status 20671 Carp master event
    Aug 22 11:04:16 check_reload_status 20671 Carp master event
    Aug 22 11:04:16 check_reload_status 20671 Carp master event
    Aug 22 11:04:16 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:16 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:16 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:13 check_reload_status 20671 Carp backup event
    Aug 22 11:04:13 check_reload_status 20671 Carp backup event
    Aug 22 11:04:13 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:13 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:13 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:13 check_reload_status 20671 Carp backup event
    Aug 22 11:04:13 check_reload_status 20671 Carp master event
    Aug 22 11:04:13 check_reload_status 20671 Carp master event
    Aug 22 11:04:13 kernel carp: 116@ixl3: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:13 kernel carp: 1@ix1: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:13 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
    Aug 22 11:04:13 check_reload_status 20671 Carp master event
    Aug 22 11:04:13 kernel ovpns8: link state changed to DOWN
    Aug 22 11:04:09 kernel ovpns7: link state changed to DOWN
    Aug 22 11:04:09 kernel ovpnc5: link state changed to DOWN
    Aug 22 11:04:09 php 43661 notify_monitor.php: Message sent to xxx@xxx.com OK
    Aug 22 11:04:09 kernel ovpns6: link state changed to DOWN
    Aug 22 11:04:09 check_reload_status 20671 Reloading filter
    Aug 22 11:04:08 kernel ovpnc4: link state changed to DOWN
    Aug 22 11:04:08 check_reload_status 20671 Reloading filter
    Aug 22 11:04:07 check_reload_status 20671 Carp backup event
    Aug 22 11:04:07 check_reload_status 20671 Carp backup event
    Aug 22 11:04:07 check_reload_status 20671 Carp backup event
    Aug 22 11:04:07 check_reload_status 20671 Linkup starting ixl0
    Aug 22 11:04:07 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:07 kernel carp: 1@ix1: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:07 kernel carp: 116@ixl3: MASTER -> BACKUP (more frequent advertisement received)
    Aug 22 11:04:07 kernel ixl0: link state changed to DOWN
    Aug 22 11:04:07 kernel carp: demoted by 240 to 240 (interface down)
    Aug 22 11:04:07 kernel carp: 247@ixl0: MASTER -> INIT (hardware interface down)
    Aug 22 11:04:07 check_reload_status 20671 Carp backup event
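
The log above is listed newest first and mixes CARP, OpenVPN, and reload messages, so it can help to pull out only the CARP state transitions and sort them oldest first. Below is a minimal Python sketch, assuming the syslog line format shown in this post. The names `carp_timeline` and `CARP_RE` and the `year` parameter are made up for illustration (syslog timestamps carry no year), and the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches kernel CARP transition lines such as:
#   Aug 22 11:04:07 kernel carp: 247@ixl0: MASTER -> INIT (hardware interface down)
CARP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+carp:\s+"
    r"(?P<vhid>\d+)@(?P<ifname>\w+):\s+"
    r"(?P<old>\w+)\s+->\s+(?P<new>\w+)\s+\((?P<reason>[^)]*)\)"
)


def carp_timeline(log_text, year=2023):
    """Return CARP transitions as tuples, sorted oldest first.

    The year is an assumption supplied by the caller, because the
    syslog timestamps do not include one.
    """
    events = []
    # The forum log is newest first, so walk it bottom-up. The sort below
    # is stable, which keeps same-second events in that bottom-up order.
    for line in reversed(log_text.splitlines()):
        m = CARP_RE.match(line.strip())
        if not m:
            continue  # not a CARP state transition line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append(
            (ts, int(m["vhid"]), m["ifname"], m["old"], m["new"], m["reason"])
        )
    events.sort(key=lambda e: e[0])
    return events


# Four lines copied from the log above, in the same newest-first order.
sample = """\
Aug 22 11:04:13 kernel carp: 196@ix0: BACKUP -> MASTER (master timed out)
Aug 22 11:04:07 kernel carp: 196@ix0: MASTER -> BACKUP (more frequent advertisement received)
Aug 22 11:04:07 kernel ixl0: link state changed to DOWN
Aug 22 11:04:07 kernel carp: 247@ixl0: MASTER -> INIT (hardware interface down)
"""

events = carp_timeline(sample)
for ts, vhid, ifname, old, new, reason in events:
    print(f"{ts:%H:%M:%S} vhid {vhid}@{ifname}: {old} -> {new} ({reason})")
```

Read this way, the sequence starts with vhid 247 on ixl0 going to INIT when the link drops, followed by the other VHIDs leaving MASTER in the same second.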

    • P
      planedrop
      last edited by Aug 23, 2023, 10:54 PM

      This sounds like a configuration issue; CARP should not be activating just because the internet gateway itself went down.

      Can you give some more info about your setup?

      CARP has to communicate between the firewalls on both the SYNC interface and the interfaces that have CARP VIPs on them. On the WAN side, this means each firewall needs its own actual IP on the WAN in addition to the CARP VIP, and both firewalls need to be connected to a switch so that each one knows its connection is still up.

      What does your CARP and HA configuration look like?

      • jimpJ
        jimp Rebel Alliance Developer Netgate
        last edited by Aug 25, 2023, 6:29 PM

        By default, CARP will do preemption on link loss, meaning it cuts over all VIPs at once.

        For most people this is what they want, because a link loss usually means an interface failed, not just a loss of WAN connectivity.

        In the majority of cases, a WAN failure does not involve a loss of link. If you're testing this manually, a better test is to unplug one level higher on the CPE (e.g. the upstream fiber link, coax cable, or uplink port), not the NIC itself.

        You could set net.inet.carp.preempt=0 as a tunable to stop it from behaving that way, but then if something like the primary LAN NIC failed, you would have to cut it over manually; it would not be seamless.
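
The "all VIPs at once" behavior is visible in the original log as `carp: demoted by ...` lines that name no VHID or interface, which suggests a single demotion counter for the whole node. Here is a small Python sketch that replays those adjustments and checks the running total against the value the kernel logged. The helper name `replay_demotion` is made up for illustration, and the four log lines are copied from the first post (the last one is truncated in the original and is left that way).

```python
import re

# Matches kernel lines such as:
#   Aug 22 11:04:07 kernel carp: demoted by 240 to 240 (interface down)
# The closing parenthesis is optional because one line in the thread is cut off.
DEMOTE_RE = re.compile(r"carp: demoted by (-?\d+) to (-?\d+) \(([^)]*)")


def replay_demotion(lines_oldest_first, start=0):
    """Replay demotion adjustments and compare them with the logged totals.

    Returns a list of (reason, delta, running_total, matches_log) tuples.
    """
    counter = start
    history = []
    for line in lines_oldest_first:
        m = DEMOTE_RE.search(line)
        if not m:
            continue  # not a demotion line
        delta, logged_total, reason = int(m[1]), int(m[2]), m[3]
        counter += delta
        history.append((reason, delta, counter, counter == logged_total))
    return history


# Demotion lines from the first post, reordered oldest first.
lines = [
    "Aug 22 11:04:07 kernel carp: demoted by 240 to 240 (interface down)",
    "Aug 22 11:05:13 kernel carp: demoted by -240 to 0 (interface up)",
    "Aug 22 11:05:15 kernel carp: demoted by 240 to 240 (pfsync bulk start)",
    "Aug 22 11:14:22 kernel carp: demoted by -240 to 0 (pfsync bulk",
]

history = replay_demotion(lines)
for reason, delta, total, ok in history:
    print(f"{reason:20} {delta:+5d} -> {total:4d} matches log: {ok}")
```

In this excerpt the counter goes up by 240 on the ixl0 link loss and again when the pfsync bulk update starts, and returns to 0 each time.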

        • J
          jronald @jimp
          last edited by jronald Aug 28, 2023, 3:06 PM

          @jimp Hey Jim, thank you for your response. Is there a tunable that would prevent preemptive fail-over if the interface on the secondary is also down?

          I watched the High Availability on pfSense 2.4 Hangout video, and it looks like things are working as described. The preemptive fail-over makes sense, but not if the interface on the secondary has also failed. In our case, switching and then switching back causes a 2 or 3 minute outage for our users. It would be better to just switch gateways. FYI, we have about 150 active users using Zoom Phone, MS Teams, TeamViewer, VPN, and general internet access.

          Thanks again.
          James Ronald

          • jimpJ
            jimp Rebel Alliance Developer Netgate
            last edited by Aug 28, 2023, 3:11 PM

            Not explicitly, no, but if it's just one interface it may still be OK.

            Preemption based on interface failure is primarily controlled by things like net.inet.carp.ifdown_demotion_factor, but the value is the same on both nodes. In theory, both nodes should end up demoted by the same amount for an interface failure, so the primary node may still win an election for master status naturally.
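
To reason about the "both demoted by the same amount" case, here is a rough Python model of the election, not a description of the actual kernel code. It rests on several assumptions that are not stated in this thread: the effective skew is the base skew plus the node's demotion counter, clamped to the range 0 to 240; the node with the lower effective skew wins; both nodes use the same advertisement base; and the base skews are 0 on the primary and 100 on the secondary. The names `effective_skew` and `likely_master` are made up for illustration.

```python
CARP_MAXSKEW = 240  # assumption: upper bound on the advertised skew


def effective_skew(base_skew, demotion):
    """Base skew plus the demotion counter, clamped to 0..CARP_MAXSKEW (assumed)."""
    return max(0, min(CARP_MAXSKEW, base_skew + demotion))


def likely_master(primary_demotion, secondary_demotion,
                  primary_skew=0, secondary_skew=100):
    """Return which node this simplified model expects to become master.

    The default base skews (0 and 100) are assumed values, not taken
    from the configuration discussed in the thread.
    """
    p = effective_skew(primary_skew, primary_demotion)
    s = effective_skew(secondary_skew, secondary_demotion)
    if p < s:
        return "primary"
    if s < p:
        return "secondary"
    return "tie"  # the model cannot say who wins


# 240 is the demotion amount seen in the log from the first post.
for p_dem, s_dem in [(0, 0), (240, 0), (240, 240)]:
    print(f"primary demoted {p_dem:3d}, secondary demoted {s_dem:3d}: "
          f"{likely_master(p_dem, s_dem)}")
```

Under these assumptions, a failure on the primary alone hands master status to the secondary, while the same failure on both nodes leaves them at the same effective skew. The outcome then depends on tie-breaking details this sketch does not cover, so it would need to be tested on real hardware.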

            I haven't tried that, though. If that doesn't work as it is, then it likely isn't feasible.

            • J
              jronald @jimp
              last edited by Aug 29, 2023, 2:58 PM

              @jimp Thank you!

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.