Netgate Discussion Forum

    HA randomly BACKUP goes to MASTER state

    HA/CARP/VIPs
    21 Posts 4 Posters 4.1k Views
    • M
      m4rek11

      Hello, I have a problem with two pfSense instances (UTM01 and UTM02) running in HA. Both UTMs run as VMs in a Proxmox VE cluster. Both have many VLANs, and each VLAN runs in HA with its own CARP address.

      I think the configuration is fine, because synchronization works properly: UTM01 has status MASTER and UTM02 has status BACKUP.

      My problem is that, at random times during the day, UTM02 switches to MASTER status for a very short time (~1-8 sec) and then returns to BACKUP.

      On the BACKUP node I get logs like:

      May 10 07:05:00 UTM02 sshguard[29971]: Exiting on signal.
      May 10 07:05:00 UTM02 sshguard[85340]: Now monitoring attacks.
      May 10 07:11:00 UTM02 sshguard[85340]: Exiting on signal.
      May 10 07:11:00 UTM02 sshguard[48221]: Now monitoring attacks.
      May 10 07:16:53 UTM02 kernel: carp: 25@vtnet3.125: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 15@vtnet3.116: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 kernel: carp: 17@vtnet3.119: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 11@vtnet3.131: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 7@vtnet3.129: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 13@vtnet3.122: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 9@vtnet3.130: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 5@vtnet3.128: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 kernel: carp: 13@vtnet3.122: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:53 UTM02 kernel: carp: 6@vtnet3.104: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 20@vtnet3.112: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 28@vtnet3.136: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 4@vtnet3.140: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 kernel: carp: 8@vtnet3.115: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 32@vtnet3.141: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 14@vtnet3.114: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 24@vtnet3.124: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 18@vtnet3.121: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 kernel: carp: 26@vtnet3.126: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 2@vtnet3.111: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 22@vtnet3.113: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 16@vtnet3.117: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 12@vtnet3.132: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 kernel: carp: 30@vtnet3.138: BACKUP -> MASTER (master timed out)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:53 UTM02 kernel: carp: 9@vtnet3.130: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:53 UTM02 kernel: carp: 5@vtnet3.128: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:53 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:54 UTM02 kernel: carp: 17@vtnet3.119: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:54 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:54 UTM02 kernel: carp: 23@vtnet3.123: BACKUP -> MASTER (master timed out)
      May 10 07:16:54 UTM02 kernel: carp: 27@vtnet3.127: BACKUP -> MASTER (master timed out)
      May 10 07:16:54 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:54 UTM02 kernel: carp: 11@vtnet3.131: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:54 UTM02 kernel: carp: 21@vtnet3.134: BACKUP -> MASTER (master timed out)
      May 10 07:16:54 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:54 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:54 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:55 UTM02 kernel: carp: 31@vtnet3.139: BACKUP -> MASTER (master timed out)
      May 10 07:16:55 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:55 UTM02 kernel: carp: 19@vtnet3.118: BACKUP -> MASTER (master timed out)
      May 10 07:16:55 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:55 UTM02 php-fpm[92097]: /rc.carpmaster: HA cluster member "(172.16.141.253@vtnet3.141): (LAN_28)" has resumed CARP state "MASTER" for vhid 32
      May 10 07:16:55 UTM02 kernel: carp: 7@vtnet3.129: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:55 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:55 UTM02 kernel: carp: 1@vtnet1.100: BACKUP -> MASTER (master timed out)
      May 10 07:16:55 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:56 UTM02 kernel: carp: 25@vtnet3.125: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:56 UTM02 kernel: carp: 29@vtnet3.137: BACKUP -> MASTER (master timed out)
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:56 UTM02 kernel: carp: 3@vtnet2.102: BACKUP -> MASTER (master timed out)
      May 10 07:16:56 UTM02 kernel: carp: 15@vtnet3.116: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:56 UTM02 kernel: carp: 13@vtnet3.122: BACKUP -> MASTER (master timed out)
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:56 UTM02 kernel: carp: 9@vtnet3.130: BACKUP -> MASTER (master timed out)
      May 10 07:16:56 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:57 UTM02 kernel: carp: 16@vtnet3.117: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:57 UTM02 kernel: carp: 5@vtnet3.128: BACKUP -> MASTER (master timed out)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:57 UTM02 check_reload_status[403]: Reloading filter
      May 10 07:16:57 UTM02 kernel: ovpns10: link state changed to UP
      May 10 07:16:57 UTM02 check_reload_status[403]: rc.newwanip starting ovpns10
      May 10 07:16:57 UTM02 kernel: carp: 17@vtnet3.119: BACKUP -> MASTER (master timed out)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:57 UTM02 kernel: carp: 12@vtnet3.132: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:57 UTM02 check_reload_status[403]: Reloading filter
      May 10 07:16:57 UTM02 kernel: ovpns15: link state changed to UP
      May 10 07:16:57 UTM02 check_reload_status[403]: rc.newwanip starting ovpns15
      May 10 07:16:57 UTM02 kernel: carp: 11@vtnet3.131: BACKUP -> MASTER (master timed out)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:57 UTM02 kernel: ovpns16: link state changed to UP
      May 10 07:16:57 UTM02 kernel: carp: 30@vtnet3.138: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:57 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:58 UTM02 check_reload_status[403]: rc.newwanip starting ovpns16
      May 10 07:16:58 UTM02 kernel: carp: 26@vtnet3.126: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:58 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:58 UTM02 check_reload_status[403]: Starting packages
      May 10 07:16:58 UTM02 kernel: carp: 2@vtnet3.111: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:58 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:58 UTM02 check_reload_status[403]: Reloading filter
      May 10 07:16:58 UTM02 check_reload_status[403]: Starting packages
      May 10 07:16:58 UTM02 kernel: carp: 22@vtnet3.113: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:58 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:58 UTM02 kernel: carp: 7@vtnet3.129: BACKUP -> MASTER (master timed out)
      May 10 07:16:58 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:59 UTM02 check_reload_status[403]: Starting packages
      May 10 07:16:59 UTM02 xinetd[95726]: Starting reconfiguration
      May 10 07:16:59 UTM02 xinetd[95726]: Swapping defaults
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19000-tcp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19000-udp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19001-tcp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19001-udp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19002-tcp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19002-udp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19003-tcp
      May 10 07:16:59 UTM02 xinetd[95726]: readjusting service 19003-udp
      May 10 07:16:59 UTM02 xinetd[95726]: Reconfigured: new=0 old=8 dropped=0 (services)
      May 10 07:16:59 UTM02 kernel: carp: 24@vtnet3.124: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:16:59 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:16:59 UTM02 radiusd[51246]: Signalled to terminate
      May 10 07:16:59 UTM02 radiusd[51246]: Exiting normally
      May 10 07:16:59 UTM02 kernel: carp: 25@vtnet3.125: BACKUP -> MASTER (master timed out)
      May 10 07:16:59 UTM02 check_reload_status[403]: Carp master event
      May 10 07:16:59 UTM02 nrpe[13403]: Starting up daemon
      May 10 07:16:59 UTM02 nrpe[13403]: Bind to port 5666 on :: failed: Address already in use.
      May 10 07:16:59 UTM02 nrpe[13403]: Bind to port 5666 on 0.0.0.0 failed: Address already in use.
      May 10 07:16:59 UTM02 nrpe[13403]: Cannot bind to any address.
      May 10 07:16:59 UTM02 nrpe[15577]: Starting up daemon
      May 10 07:16:59 UTM02 nrpe[15577]: Bind to port 5666 on :: failed: Address already in use.
      May 10 07:16:59 UTM02 nrpe[15577]: Bind to port 5666 on 0.0.0.0 failed: Address already in use.
      May 10 07:16:59 UTM02 nrpe[15577]: Cannot bind to any address.
      May 10 07:16:59 UTM02 radiusd[14497]: Debugger not attached
      May 10 07:16:59 UTM02 radiusd[22076]: tls: In order to use TLS 1.0 and/or TLS 1.1, you likely need to set: cipher_list = "DEFAULT@SECLEVEL=1"
      May 10 07:16:59 UTM02 radiusd[22076]: Loaded virtual server <default>
      May 10 07:16:59 UTM02 radiusd[22076]: Loaded virtual server default
      May 10 07:16:59 UTM02 radiusd[22076]: Ignoring "sql" (see raddb/mods-available/README.rst)
      May 10 07:16:59 UTM02 radiusd[22076]: Ignoring "ldap" (see raddb/mods-available/README.rst)
      May 10 07:16:59 UTM02 radiusd[22076]: Loaded virtual server inner-tunnel-ttls
      May 10 07:16:59 UTM02 radiusd[22076]: Loaded virtual server inner-tunnel-peap
      May 10 07:16:59 UTM02 radiusd[22076]: Ready to process requests
      May 10 07:16:59 UTM02 kernel: carp: 15@vtnet3.116: BACKUP -> MASTER (master timed out)
      May 10 07:16:59 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:00 UTM02 sshguard[48221]: Exiting on signal.
      May 10 07:17:00 UTM02 sshguard[42335]: Now monitoring attacks.
      May 10 07:17:00 UTM02 kernel: carp: 16@vtnet3.117: BACKUP -> MASTER (master timed out)
      May 10 07:17:00 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:00 UTM02 check_reload_status[403]: Syncing firewall
      May 10 07:17:00 UTM02 kernel: carp: 12@vtnet3.132: BACKUP -> MASTER (master timed out)
      May 10 07:17:00 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:01 UTM02 kernel: carp: 30@vtnet3.138: BACKUP -> MASTER (master timed out)
      May 10 07:17:01 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:01 UTM02 kernel: carp: 18@vtnet3.121: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:01 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:01 UTM02 kernel: carp: 26@vtnet3.126: BACKUP -> MASTER (master timed out)
      May 10 07:17:01 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:01 UTM02 kernel: carp: 8@vtnet3.115: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:01 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:02 UTM02 kernel: carp: 2@vtnet3.111: BACKUP -> MASTER (master timed out)
      May 10 07:17:02 UTM02 check_reload_status[403]: Carp master event
      May 10 07:17:02 UTM02 Squid_Alarm[20014]: Squid is disabled, exiting.
      May 10 07:17:02 UTM02 tail_pfb[21555]: [pfBlockerNG] Firewall Filter Service stopped
      May 10 07:17:02 UTM02 php_pfb[21866]: [pfBlockerNG] filterlog daemon stopped
      May 10 07:17:04 UTM02 kernel: carp: 32@vtnet3.141: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 14@vtnet3.114: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 1@vtnet1.100: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 20@vtnet3.112: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 28@vtnet3.136: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 4@vtnet3.140: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 6@vtnet3.104: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 15@vtnet3.116: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 25@vtnet3.125: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 7@vtnet3.129: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 11@vtnet3.131: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 17@vtnet3.119: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 5@vtnet3.128: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 9@vtnet3.130: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 13@vtnet3.122: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 29@vtnet3.137: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 19@vtnet3.118: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 31@vtnet3.139: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 21@vtnet3.134: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 27@vtnet3.127: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 23@vtnet3.123: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 22@vtnet3.113: BACKUP -> MASTER (master timed out)
      May 10 07:17:04 UTM02 kernel: carp: 3@vtnet2.102: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 22@vtnet3.113: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 2@vtnet3.111: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 26@vtnet3.126: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 30@vtnet3.138: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 12@vtnet3.132: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 kernel: carp: 16@vtnet3.117: MASTER -> BACKUP (more frequent advertisement received)
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      May 10 07:17:04 UTM02 check_reload_status[403]: Carp backup event
      
      

      Correct me if I'm wrong, but I understand that for some reason UTM02 stops receiving advertisements from UTM01 and goes to MASTER state for a while:
      kernel: carp: 23@vtnet3.123: BACKUP -> MASTER (master timed out)
      but after a few seconds it starts receiving advertisements again, so:
      kernel: carp: 23@vtnet3.123: MASTER -> BACKUP (more frequent advertisement received)
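To put numbers on how long each VHID stays MASTER, the kernel lines can be paired up per VHID and interface. This is only a minimal sketch, assuming Python 3 and a log copied off the firewall in the syslog format quoted above. The function name `master_intervals` and the `year` parameter are made up for illustration (syslog timestamps carry no year, so a placeholder is used).

```python
import re
from datetime import datetime

# Matches kernel CARP transition lines in the syslog format quoted above.
CARP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+carp:\s+"
    r"(?P<vhid>\d+)@(?P<iface>\S+):\s+(?P<old>\w+)\s+->\s+(?P<new>\w+)"
)


def master_intervals(lines, year=2000):
    """Return (vhid, iface, seconds) for each BACKUP->MASTER->BACKUP cycle.

    `year` is a placeholder assumption: the log lines do not contain one.
    """
    became_master = {}
    intervals = []
    for line in lines:
        m = CARP_RE.match(line.strip())
        if not m:
            continue  # not a CARP state transition line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (int(m["vhid"]), m["iface"])
        if m["new"] == "MASTER":
            became_master[key] = ts
        elif m["new"] == "BACKUP" and key in became_master:
            start = became_master.pop(key)
            intervals.append((key[0], key[1], (ts - start).total_seconds()))
    return intervals


# Three lines taken from the log above (VHID 23 on vtnet3.123).
sample = [
    "May 10 07:16:54 UTM02 kernel: carp: 23@vtnet3.123: BACKUP -> MASTER (master timed out)",
    "May 10 07:16:54 UTM02 check_reload_status[403]: Carp master event",
    "May 10 07:17:04 UTM02 kernel: carp: 23@vtnet3.123: MASTER -> BACKUP (more frequent advertisement received)",
]

for vhid, iface, secs in master_intervals(sample):
    print(f"vhid {vhid} on {iface} was MASTER for {secs:.0f}s")
```

Because syslog timestamps have one-second resolution, the durations are only accurate to about a second.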

      I will be grateful for any tips on what could cause this problem. If you need any information about the config, please let me know.

      Yours faithfully,
      Marek.

      • S
        SteveITS Galactic Empire @m4rek11

        @m4rek11 Have you already found:

        https://docs.netgate.com/pfsense/en/latest/troubleshooting/high-availability.html
        https://docs.netgate.com/pfsense/en/latest/troubleshooting/high-availability-virtual.html

        Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
        When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
        Upvote 👍 helpful posts!

        • M
          m4rek11 @SteveITS

          @steveits thank you for your reply. Of course I have seen the links you posted, but nothing in them matches my problem, so I decided to ask on the forum. Maybe someone has had the same problem and can share some tips.

          • P
            Przemyslaw85 @m4rek11

            @m4rek11 I have the same problem, and it did not happen on version 2.5.x.
            In my opinion, the new 2.6 release broke something.
            Router 1: No problems
            Router 2:

            May 12 09:33:16	kernel		carp: 130@em1.1030: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 160@em1.1060: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 11@em1.10: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 150@em1.1050: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 140@em1.1040: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 130@em1.1030: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 120@em1.1020: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 160@em1.1060: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 110@em1.1010: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 11@em1.10: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 14@em1.100: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 150@em1.1050: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 22@em1.500: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 140@em1.1040: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 10@em0: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 196@bce0.881: MASTER -> BACKUP (more frequent advertisement received)
            May 12 09:33:16	kernel		carp: 120@em1.1020: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 110@em1.1010: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 14@em1.100: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 196@bce0.881: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 10@em0: BACKUP -> MASTER (master timed out)
            May 12 09:33:16	kernel		carp: 22@em1.500: BACKUP -> MASTER (master timed out)
            

            I have been looking for the cause for some time, unfortunately without success.
            Changing anything on the primary, even something irrelevant to the second router (e.g. the log forwarding settings), is enough to make Router 2 become MASTER for about 1 s.

            My pfSense boxes in HA:
            Master: HP DL360G8 1x E5-2670, 64GB ECC RAM, 8x NIC (17x VLan)
            Slave: HP DL360G5, 2x E5410, 64GB ECC RAM, 6x NIC (17x VLan)

            • M
              m4rek11 @Przemyslaw85

              @przemyslaw85,

              My UTM01 does not log any CARP events the way UTM02 does, so I think we are seeing the same behavior. Another example of this strange situation: if I change anything on UTM01 (MASTER) that requires a firewall "reload", then after the reload UTM02 (BACKUP) sometimes switches to MASTER for a short time. I think that is exactly what you described.

              One thing I plan to test is changing the UTMs' network interface in the hypervisor (Proxmox) from VirtIO to e1000, and separating that VLAN so that the interface is used only for HA communication.

              If anyone has any other ideas, please let me know.

              P.S. @przemyslaw85 From your posts I saw that you are from Poland too, but since I did not post the problem in the "Polska" category, I am continuing the thread in English. In any case, thank you for your post. Maybe the problem can finally be solved.

              • P
                Przemyslaw85 @m4rek11

                @m4rek11 I have a separate LAN port for Router1-Router2 sync.
                It won't change much, but it is the recommended setup.
                Do you also see the gateway monitor reset on Router2?

                P.S. Our section is pretty quiet. Few of us use this software, so I joined in here.

                • P
                  Przemyslaw85 @Przemyslaw85

                  In the interface statistics I see "Errors In" on the LAN sync interface.
                  Interestingly, the count increases whenever the MASTER role hops to Router 2.
                  I've already changed the server, network cards and ports. The problem persists.
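One way to check whether the input-error counter really grows at the same moments as the role hops is to record the counter periodically and compare the jumps with the flap times from the log. Below is a rough sketch, assuming Python 3 and that the readings were noted by hand or by a script (for example from the `Ierrs` column of `netstat -i`). The counter values are hypothetical placeholders, not measurements from this thread, and the function names and the 60-second window are arbitrary choices.

```python
from datetime import datetime, timedelta


def error_jumps(samples):
    """Return (timestamp, delta) for every sample where the counter increased."""
    jumps = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        if cur > prev:
            jumps.append((ts, cur - prev))
    return jumps


def jumps_near_flaps(jumps, flaps, window=timedelta(seconds=60)):
    """Keep only the jumps that fall within `window` of some CARP flap."""
    return [(ts, delta) for ts, delta in jumps
            if any(abs(ts - flap) <= window for flap in flaps)]


# HYPOTHETICAL readings (time, cumulative input errors); not real data.
samples = [
    (datetime(2000, 5, 12, 9, 30, 0), 0),
    (datetime(2000, 5, 12, 9, 34, 0), 12),
    (datetime(2000, 5, 12, 9, 40, 0), 12),
]
# Flap time taken from the Router 2 log earlier in the thread
# (the year is a placeholder, since the log does not include one).
flaps = [datetime(2000, 5, 12, 9, 33, 16)]

jumps = error_jumps(samples)
print(jumps_near_flaps(jumps, flaps))
```

If most jumps line up with flaps, that would suggest the errors and the role changes share a cause, but it does not show which one triggers the other.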

                  • M
                    m4rek11 @Przemyslaw85

                    @przemyslaw85

                    What do you mean by gateway monitor reset?

                    On my switch I don't see any packet errors for that VLAN. I made the NIC changes in Proxmox that I described earlier, but the problem still persists.

                    I'm very confused about why any change that requires a firewall reload causes (not always, but often) MASTER to switch to BACKUP and BACKUP to MASTER for a few seconds.

                    Maybe it is a pfSense bug?

                    • B
                      B_IT @m4rek11

                      @m4rek11 In the earlier version 2.4.5 I sometimes (not always) saw that saving firewall rules caused one of several CARP interfaces to change from MASTER to BACKUP, but it lasted a fraction of a second and did not cause any additional problems. This was the only small CARP issue I had with that older version.
                      This Saturday I installed version 2.6.0 CE and noticed that just restarting one of the firewalls causes several minutes of "agreeing" on who is MASTER and who is BACKUP. I got logs similar to yours and @Przemyslaw85's.
                      At first I thought it only happened when restarting one of the two firewalls, because the firewalls worked fine together for more than a day, but halfway through a production day it happened again and I don't know why.

                      Our firewall consists of two physical machines connected via two switches.

                      • P
                        Przemyslaw85 @m4rek11

                        @m4rek11 I have internet connection monitoring configured. As soon as the backup router changes roles, the link status is lost.

                        @b_it Restarting the backup router also causes such problems for me.

                        • B
                          B_IT @Przemyslaw85

                          @przemyslaw85 I also have Internet connection monitoring enabled. I found this discussion at https://redmine.pfsense.org/issues/12961 regarding a CARP event storm. I will try this solution this Saturday and see what happens.

                          • M
                            m4rek11 @B_IT

                            @Przemyslaw85, I'm using a monitoring system too, and when the problem occurs, lots of hosts (VLANs) are not visible in it for a while.

                            @b_it, thank you for the link. Please let us know how your test goes.

                            • B
                              B_IT @m4rek11

                              @m4rek11 @Przemyslaw85 Sure, I will. I hope to be back with good news. Stay tuned.

                              • B
                                B_IT @B_IT

                                Last Saturday I added the two patches from the discussion I mentioned in my previous comment, and so far it looks much better. I don't see as many unnecessary messages, and both firewalls have been stable over these few days. Here are all the patches I added directly to mitigate the CARP issue:

                                Fix CARP event storm when leaving persistent CARP maintenance mode 1/2
                                https://github.com/pfsense/pfsense/commit/8a906fba5e42d391227dfc39311d02b570576d50.patch

                                Fix CARP event storm when leaving persistent CARP maintenance mode 2/2
                                https://github.com/pfsense/pfsense/commit/3c15b353c6968801cfffb7d3b30a7069d2330a3e.patch

                                While patching on Saturday I also manually added this one:
                                Fix Clicking Save & Force Update on a Dynamic DNS entry results in a GUI timeout
                                https://github.com/pfsense/pfsense/commit/bdffb77d1aa21770b23ef408ad9fba79d0825ec5.patch

                                and I applied these three patches from the recommended section:
                                Disable pf counter data preservation to temporarily work around latency when reloading large rulesets (Redmine #12827)

                                Fix Captive Portal handling of non-TCP traffic after login (Redmine #12834)

                                Fix OpenVPN dashboard widget client termination (Redmine #12817)

                                To sum up: for now I will stay with version 2.6.0 plus patches.

                                • P
                                  Przemyslaw85 @B_IT

                                  @b_it As I understand it, I need to make the changes for both "mode 1/2" and "mode 2/2".
                                  For "mode 1/2", do I have to do the steps on server 1 only, or on both?

                                  My pfSense box w HA:
                                  Master: HP DL360G8 1x E5-2670, 64GB ECC RAM, 8x NIC (17x VLan)
                                  Slave: HP DL360G5, 2x E5410, 64GB ECC RAM, 6x NIC (17x VLan)

                                  • B
                                    B_IT @Przemyslaw85

                                    @przemyslaw85 I think that every node should have the same set of patches, so I patched the first node and then the second node.

                                    The "1/2" and "2/2" labels are just my own naming convention:

                                    Fix CARP event storm when leaving persistent CARP maintenance mode 1/2
                                    Fix CARP event storm when leaving persistent CARP maintenance mode 2/2

                                    For the CARP issue, the second patch will not apply without the first one. This is the view from one node (the second has the same set of patches):
                                    4bc2fdee-0f66-45d5-a658-dfb4ca325c88-obraz.png

                                    • P
                                      Przemyslaw85 @B_IT

                                      @b_it I can confirm that the patches work.
                                      Yesterday I made a few changes to the original files using the file editor, because I didn't know there was a patches module. I had to revert to the original files from a copy made before editing.
                                      After adding the 1/2 and 2/2 patches and the Dynamic DNS patch I did not notice any improvement. Only after I added patches #12827, #12834, #12816 and #12817 can I say that the system now works as it should.


                                      • B
                                        B_IT @Przemyslaw85

                                        @przemyslaw85 It seems to me that once I started applying the CARP patches, the firewall was more responsive while I made the later changes (further patching). I didn't wait long, though; I just rebooted both nodes to be sure that all selected patches were fully applied.
                                        I have to admit that I only started more thorough testing after rebooting the firewalls (with the patch set mentioned above), so I can't be sure what really helped or how much.
                                        BTW, the patching mechanism was introduced around version 2.5, and I've learned since its beginning that I have to be careful when selecting patches.

                                        • M
                                          m4rek11 @B_IT

                                          @Przemyslaw85, @B_IT, after those changes, did you still see the CARP storm in the logs and the brief MASTER -> BACKUP, BACKUP -> MASTER transitions?

                                          • B
                                            B_IT @m4rek11

                                            @m4rek11 Looking at the logs, I see some entries from while the patches were being applied, but after patching I see only a few. They all look as they should (at least to me) and each has a reason (e.g. a rebooted node). I wouldn't call them a storm, and I definitely don't see MASTER/BACKUP flipping entries now.
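
For anyone who wants to check their own logs for this kind of flapping, here is a minimal Python sketch that counts BACKUP -> MASTER transitions per VHID and interface. It assumes the log lines have the same shape as the kernel `carp:` entries quoted in the first post. The function names are only illustrative and are not part of pfSense.

```python
import re
from collections import Counter

# Matches kernel CARP state changes in the format quoted in the first post, e.g.
# "May 10 07:16:53 UTM02 kernel: carp: 25@vtnet3.125: BACKUP -> MASTER (master timed out)"
CARP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+carp:\s+"
    r"(?P<vhid>\d+)@(?P<iface>\S+):\s+(?P<old>\w+)\s*->\s*(?P<new>\w+)"
)


def carp_transitions(lines):
    """Yield (timestamp, host, vhid, interface, old_state, new_state) per CARP state change."""
    for line in lines:
        m = CARP_RE.match(line.strip())
        if m:
            yield (m["ts"], m["host"], int(m["vhid"]), m["iface"], m["old"], m["new"])


def count_promotions(lines):
    """Count BACKUP -> MASTER events per (vhid, interface) pair."""
    counts = Counter()
    for _ts, _host, vhid, iface, old, new in carp_transitions(lines):
        if old == "BACKUP" and new == "MASTER":
            counts[(vhid, iface)] += 1
    return counts
```

Pass it the lines of the backup node's system log (wherever that is stored on your system), for example `count_promotions(open(path))` with `path` pointing at a copy of the log. A VHID that shows repeated promotions with no matching reboot or maintenance event is the pattern discussed in this thread.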

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.