Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Long Delays in CARP WAN Failover

    Scheduled Pinned Locked Moved HA/CARP/VIPs
    17 Posts 4 Posters 3.7k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • DerelictD
      Derelict LAYER 8 Netgate
      last edited by

      Definitely not normal.

      Are those typos on the IP addresses on em0_vlan31 and em0_vlan41?

      What happens when you just enter CARP maintenance mode on the primary in Status > CARP instead of yanking cables?

      I would packet capture on WAN on the secondary during a failover test and recovery and see what's really going on.

      This is a failover and a fail back on a bench WAN setup:

      $ ping 172.25.228.17
      PING 172.25.228.17 (172.25.228.17): 56 data bytes
      64 bytes from 172.25.228.17: icmp_seq=0 ttl=63 time=0.687 ms
      64 bytes from 172.25.228.17: icmp_seq=1 ttl=63 time=0.529 ms
      64 bytes from 172.25.228.17: icmp_seq=2 ttl=63 time=0.407 ms
      64 bytes from 172.25.228.17: icmp_seq=3 ttl=63 time=0.376 ms
      Request timeout for icmp_seq 4
      Request timeout for icmp_seq 5
      64 bytes from 172.25.228.17: icmp_seq=6 ttl=63 time=0.723 ms
      64 bytes from 172.25.228.17: icmp_seq=7 ttl=63 time=0.566 ms
      64 bytes from 172.25.228.17: icmp_seq=8 ttl=63 time=0.562 ms
      64 bytes from 172.25.228.17: icmp_seq=9 ttl=63 time=0.570 ms
      64 bytes from 172.25.228.17: icmp_seq=10 ttl=63 time=0.742 ms
      64 bytes from 172.25.228.17: icmp_seq=11 ttl=63 time=0.459 ms
      64 bytes from 172.25.228.17: icmp_seq=12 ttl=63 time=0.421 ms
      64 bytes from 172.25.228.17: icmp_seq=13 ttl=63 time=0.496 ms
      64 bytes from 172.25.228.17: icmp_seq=14 ttl=63 time=0.465 ms
      64 bytes from 172.25.228.17: icmp_seq=15 ttl=63 time=0.433 ms
      Request timeout for icmp_seq 16
      64 bytes from 172.25.228.17: icmp_seq=17 ttl=63 time=0.590 ms
      64 bytes from 172.25.228.17: icmp_seq=18 ttl=63 time=0.449 ms
      64 bytes from 172.25.228.17: icmp_seq=19 ttl=63 time=0.427 ms
      64 bytes from 172.25.228.17: icmp_seq=20 ttl=63 time=0.440 ms
      64 bytes from 172.25.228.17: icmp_seq=21 ttl=63 time=0.473 ms
      64 bytes from 172.25.228.17: icmp_seq=22 ttl=63 time=0.403 ms
      ^C
      –- 172.25.228.17 ping statistics ---
      23 packets transmitted, 20 packets received, 13.0% packet loss
      round-trip min/avg/max/stddev = 0.376/0.511/0.742/0.105 ms

      Chattanooga, Tennessee, USA
      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
      Do Not Chat For Help! NO_WAN_EGRESS(TM)

      1 Reply Last reply Reply Quote 0
      • R
        Ranch
        last edited by

        @Derelict:

        Are those typos on the IP addresses on em0_vlan31 and em0_vlan41?

        Yes, they were, I corrected them in my original post now.

        @Derelict:

        What happens when you just enter CARP maintenance mode on the primary in Status > CARP instead of yanking cables?

        It never misses a beat, same thing if I restart the Primary

        @Derelict:

        I would packet capture on WAN on the secondary during a failover test and recovery and see what's really going on.

        I will have to wait until this evening to do this on our live network.

        Any other thoughts or suggestions to look at while I testing further?

        1 Reply Last reply Reply Quote 0
        • R
          Ranch
          last edited by

          Sent PM with CAP

          1 Reply Last reply Reply Quote 0
          • DerelictD
            Derelict LAYER 8 Netgate
            last edited by

            What version are you running that it isn't decoding CARP as CARP  but VRRP in the packet capture?

            Please do this:

            On the SECONDARY node in BACKUP status, start a packet capture. Using Protocol, select CARP. Set the Count to something like 10000 so it doesn't stop.

            Note the time to the second if possible and unplug WAN on the primary.

            Let it fail over completely.

            Note the time to the second and reconnect WAN on Primary.

            You should see:

            CARP from the primary

            CARP from the primary should stop immediately after you unplug.

            CARP from the secondary should start nearly immediately after that.

            CARP from the primary should start nearly immediately after you plug it back in.

            CARP from the secondary should stop immediately after that.

            Problems like this are almost always something with the switch.

            Also include the system log on both nodes for the same time period.

            Happy to look at the details in a PM. The one you provided didn't really show anything other than CARP from the secondary for 10 seconds.

            It should look something like this:

            07:06:58.037963 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:06:59.039006 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:07:00.040021 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:07:01.040995 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:07:02.041957 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:07:03.043097 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=17557958983656044908
            07:07:06.434500 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:07.826450 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:09.218469 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:10.610462 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:12.002465 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:13.394449 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:14.786470 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:16.178465 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:17.570444 IP 172.25.228.19 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=100 authlen=7 counter=15385611900255070985
            07:07:17.571449 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=7980853803627081614
            07:07:18.571896 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=7980853803627081614
            07:07:19.572983 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=7980853803627081614
            07:07:20.573897 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=7980853803627081614
            07:07:21.575048 IP 172.25.228.18 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=7980853803627081614

            Chattanooga, Tennessee, USA
            A comprehensive network diagram is worth 10,000 words and 15 conference calls.
            DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
            Do Not Chat For Help! NO_WAN_EGRESS(TM)

            1 Reply Last reply Reply Quote 0
            • R
              Ranch
              last edited by

              @Derelict:

              What version are you running that it isn't decoding CARP as CARP  but VRRP in the packet capture?

              2.3.2-RELEASE-p1

              I will do the test steps today during lunch and let you know.  What is the best way to get you the log files?

              1 Reply Last reply Reply Quote 0
              • R
                Ranch
                last edited by

                Pulled the WAN from Node1 for 30 seconds and interestingly enough it failed over in about 20 seconds which is WAY less time than usual.

                Things I noted from the test Node 1 System log nor the CAP seems to even register the CARP event.

                Attached is the log files.

                CAPS_LOGS.txt

                1 Reply Last reply Reply Quote 0
                • DerelictD
                  Derelict LAYER 8 Netgate
                  last edited by

                  The logs from the two nodes aren't from the same time period.

                  Chattanooga, Tennessee, USA
                  A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                  DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                  Do Not Chat For Help! NO_WAN_EGRESS(TM)

                  1 Reply Last reply Reply Quote 0
                  • R
                    Ranch
                    last edited by

                    Sorry didn't catch that Node1 was wrong it is listed below now for 11:14:59-11:22:09

                    The others were right, right?
                    Node2 Log 11:20:39-11:21:28
                    Node1 CAP 11:20:28-11:21:54
                    Node2 CAP 11:20:26-11:21:52

                    Nov 10 11:22:09 check_reload_status updating dyndns WANGW
                    Nov 10 11:21:41 php-fpm 26622 /rc.carpmaster: HA cluster member "(10.41.41.1@em0_vlan41): (CU_STUDENT_VLAN41)" has resumed CARP state "MASTER" for vhid 13
                    Nov 10 11:21:41 php-fpm 26622 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:39 php-fpm 18153 /rc.carpmaster: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "MASTER" for vhid 10
                    Nov 10 11:21:39 php-fpm 18153 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:39 php-fpm 26987 /rc.carpmaster: HA cluster member "(10.41.31.1@em0_vlan31): (CU_STAFF_VLAN31)" has resumed CARP state "MASTER" for vhid 12
                    Nov 10 11:21:39 php-fpm 26987 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:37 php-fpm 24092 /rc.carpmaster: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "MASTER" for vhid 11
                    Nov 10 11:21:37 php-fpm 24092 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:35 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19010-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19010-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19009-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19009-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19008-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19008-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19007-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19007-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19006-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19006-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19005-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19005-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19004-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19004-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19003-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19003-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19002-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19002-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19001-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19001-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19000-udp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 19000-tcp
                    Nov 10 11:21:35 xinetd 17858 readjusting service 6969-udp
                    Nov 10 11:21:35 xinetd 17858 Swapping defaults
                    Nov 10 11:21:35 xinetd 17858 Starting reconfiguration
                    Nov 10 11:21:35 check_reload_status Reloading filter
                    Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: Configuring CARP settings finalize…
                    Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: pfsync done in 0 seconds.
                    Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: waiting for pfsync...
                    Nov 10 11:21:34 php-fpm 18153 /rc.newwanip: rc.newwanip: on (IP address: 67.239.73.182) (interface: WAN[wan]) (real interface: em1).
                    Nov 10 11:21:34 php-fpm 18153 /rc.newwanip: rc.newwanip: Info: starting on em1.
                    Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
                    Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:29 kernel ifa_add_loopback_route: insertion failed: 17
                    Nov 10 11:21:29 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)
                    Nov 10 11:21:29 check_reload_status Carp master event
                    Nov 10 11:21:28 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19010-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19010-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19009-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19009-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19008-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19008-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19007-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19007-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19006-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19006-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19005-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19005-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19004-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19004-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19003-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19003-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19002-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19002-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19001-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19001-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19000-udp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 19000-tcp
                    Nov 10 11:21:27 xinetd 17858 readjusting service 6969-udp
                    Nov 10 11:21:27 xinetd 17858 Swapping defaults
                    Nov 10 11:21:27 xinetd 17858 Starting reconfiguration
                    Nov 10 11:21:27 check_reload_status Reloading filter
                    Nov 10 11:21:27 check_reload_status rc.newwanip starting em1
                    Nov 10 11:21:27 php-fpm 24092 /rc.linkup: Hotplug event detected for WAN(wan) static IP (67.239.73.182 )
                    Nov 10 11:21:27 check_reload_status Carp master event
                    Nov 10 11:21:27 check_reload_status Carp master event
                    Nov 10 11:21:27 check_reload_status Carp master event
                    Nov 10 11:21:27 kernel carp: VHID 12@em0_vlan31: BACKUP -> MASTER (preempting a slower master)
                    Nov 10 11:21:27 kernel carp: VHID 13@em0_vlan41: BACKUP -> MASTER (preempting a slower master)
                    Nov 10 11:21:27 kernel carp: VHID 11@em0: BACKUP -> MASTER (preempting a slower master)
                    Nov 10 11:21:26 check_reload_status Reloading filter
                    Nov 10 11:21:26 check_reload_status Restarting OpenVPN tunnels/interfaces
                    Nov 10 11:21:26 check_reload_status Restarting ipsec tunnels
                    Nov 10 11:21:26 check_reload_status updating dyndns WANGW
                    Nov 10 11:21:26 check_reload_status Linkup starting em1
                    Nov 10 11:21:26 kernel em1: link state changed to UP
                    Nov 10 11:21:26 kernel carp: demoted by -240 to 0 (interface up)
                    Nov 10 11:21:26 kernel carp: VHID 10@em1: INIT -> BACKUP
                    Nov 10 11:21:26 check_reload_status Carp backup event
                    Nov 10 11:21:10 php-fpm 17632 /rc.carpbackup: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "BACKUP" for vhid 11
                    Nov 10 11:21:10 php-fpm 18153 /rc.carpbackup: HA cluster member "(10.41.31.1@em0_vlan31): (CU_STAFF_VLAN31)" has resumed CARP state "BACKUP" for vhid 12
                    Nov 10 11:21:10 php-fpm 17941 /rc.carpbackup: HA cluster member "(10.41.41.1@em0_vlan41): (CU_STUDENT_VLAN41)" has resumed CARP state "BACKUP" for vhid 13
                    Nov 10 11:21:10 php-fpm 17632 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:10 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:10 php-fpm 17941 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:07 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19010-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19010-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19009-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19009-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19008-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19008-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19007-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19007-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19006-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19006-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19005-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19005-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19004-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19004-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19003-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19003-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19002-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19002-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19001-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19001-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19000-udp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 19000-tcp
                    Nov 10 11:21:07 xinetd 17858 readjusting service 6969-udp
                    Nov 10 11:21:07 xinetd 17858 Swapping defaults
                    Nov 10 11:21:07 xinetd 17858 Starting reconfiguration
                    Nov 10 11:21:07 php-fpm 15607 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
                    Nov 10 11:21:07 php-fpm 15607 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                    Nov 10 11:21:00 check_reload_status Reloading filter
                    Nov 10 11:21:00 php-fpm 17632 /rc.linkup: Hotplug event detected for WAN(wan) static IP (67.239.73.182 )
                    Nov 10 11:20:59 check_reload_status Carp backup event
                    Nov 10 11:20:59 check_reload_status Carp backup event
                    Nov 10 11:20:59 check_reload_status Carp backup event
                    Nov 10 11:20:59 check_reload_status Linkup starting em1
                    Nov 10 11:20:59 check_reload_status Carp backup event
                    Nov 10 11:20:59 kernel carp: VHID 12@em0_vlan31: MASTER -> BACKUP (more frequent advertisement received)
                    Nov 10 11:20:59 kernel carp: VHID 13@em0_vlan41: MASTER -> BACKUP (more frequent advertisement received)
                    Nov 10 11:20:59 kernel carp: VHID 11@em0: MASTER -> BACKUP (more frequent advertisement received)
                    Nov 10 11:20:59 kernel em1: link state changed to DOWN
                    Nov 10 11:20:59 kernel carp: demoted by 240 to 240 (interface down)
                    Nov 10 11:20:59 kernel em1: RX Next to Refresh = 348
                    Nov 10 11:20:59 kernel em1: RX Next to Check = 349
                    Nov 10 11:20:59 kernel em1: RX discarded packets = 0
                    Nov 10 11:20:59 kernel em1: hw rdh = 349, hw rdt = 348
                    Nov 10 11:20:59 kernel em1: RX Queue 0 –----
                    Nov 10 11:20:59 kernel em1: Tx Descriptors avail failure = 2320
                    Nov 10 11:20:59 kernel em1: TX descriptors avail = 40
                    Nov 10 11:20:59 kernel em1: Tx Queue Status = -2147483648
                    Nov 10 11:20:59 kernel em1: hw tdh = 30, hw tdt = 1014
                    Nov 10 11:20:59 kernel em1: TX Queue 0 ------
                    Nov 10 11:20:59 kernel Interface is RUNNING and ACTIVE
                    Nov 10 11:20:59 kernel em1: Watchdog timeout Queue[0]– resetting
                    Nov 10 11:20:52 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19010-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19010-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19009-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19009-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19008-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19008-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19007-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19007-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19006-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19006-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19005-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19005-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19004-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19004-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19003-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19003-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19002-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19002-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19001-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19001-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19000-udp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 19000-tcp
                    Nov 10 11:20:52 xinetd 17858 readjusting service 6969-udp
                    Nov 10 11:20:52 xinetd 17858 Swapping defaults
                    Nov 10 11:20:52 xinetd 17858 Starting reconfiguration
                    Nov 10 11:20:51 check_reload_status Reloading filter
                    Nov 10 11:20:51 check_reload_status Restarting OpenVPN tunnels/interfaces
                    Nov 10 11:20:51 check_reload_status Restarting ipsec tunnels
                    Nov 10 11:20:51 check_reload_status updating dyndns WANGW
                    Nov 10 11:14:59 php-fpm 63346 /widgets/widgets/ipsec.widget.php: Unable to find IPsec daemon leases file. Could not display mobile user stats!

                    1 Reply Last reply Reply Quote 0
                    • DerelictD
                      Derelict LAYER 8 Netgate
                      last edited by

                      You sure you don't need something like spanning-tree portfast on your switch ports?

                      This stuff is - seriously - almost always something with the switch.

                      Chattanooga, Tennessee, USA
                      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                      Do Not Chat For Help! NO_WAN_EGRESS(TM)

                      1 Reply Last reply Reply Quote 0
                      • DerelictD
                        Derelict LAYER 8 Netgate
                        last edited by

                        Looks like it's taking quite some time for the system to detect the cable unplug event.

                        The secondary loses CARP from the primary an 11:20:39

                        Then 20 seconds later you get this on the primary:

                        Nov 10 11:20:59  kernel      em1: Watchdog timeout Queue[0]– resetting

                        At 11:21:26 the link comes back up. Failing back to master is immediate.

                        This is also strange:

                        Nov 10 11:21:34  php-fpm  18153  /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
                        Nov 10 11:21:34  php-fpm  18153  /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                        Nov 10 11:21:29  kernel      ifa_add_loopback_route: insertion failed: 17
                        Nov 10 11:21:29  kernel      carp: VHID 10@em1: BACKUP -> MASTER (master down)

                        Probably want to capture CARP on primary to see what you see there.

                        Chattanooga, Tennessee, USA
                        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                        Do Not Chat For Help! NO_WAN_EGRESS(TM)

                        1 Reply Last reply Reply Quote 0
                        • dotdashD
                          dotdash
                          last edited by

                          Do you have a lot of people unplugging your WAN cables? I've never had a cable failure on a CARP node.
                          I've seen- power supply failure, hard disk failure, OS crash… I know you want to test things, but I don't think your test is very likely to mirror an actual failure. Try putting the master in maintenance mode, or powering it off. If that works it should work in a real failure. If you really want to dig into the cable thing, watch the CARP status on each node when you unplug the WAN, preemption should trigger the rest of the CARP VIPs to go into backup after the WAN goes.

                          1 Reply Last reply Reply Quote 0
                          • R
                            Ranch
                            last edited by

                            @Derelict:

                            Looks like it's taking quite some time for the system to detect the cable unplug event.

                            The secondary loses CARP from the primary an 11:20:39

                            Then 20 seconds later you get this on the primary:

                            Nov 10 11:20:59  kernel      em1: Watchdog timeout Queue[0]– resetting

                            At 11:21:26 the link comes back up. Failing back to master is immediate.

                            This is also strange:

                            Nov 10 11:21:34  php-fpm  18153  /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
                            Nov 10 11:21:34  php-fpm  18153  /rc.carpbackup: Message sent to jwilhite@raschools.com OK
                            Nov 10 11:21:29  kernel      ifa_add_loopback_route: insertion failed: 17
                            Nov 10 11:21:29  kernel      carp: VHID 10@em1: BACKUP -> MASTER (master down)

                            Probably want to capture CARP on primary to see what you see there.

                            I uploaded the CAP for CARP from the Primary and Secondary WAN, are you needing it from another PORT?

                            Thanks for noting the strange delay, is there a way to reduce the watchdog time?  I am pretty sure my config is pretty standard, I haven't done any tweaking to this install.

                            @dotdash:

                            Do you have a lot of people unplugging your WAN cables? I've never had a cable failure on a CARP node.
                            I've seen- power supply failure, hard disk failure, OS crash… I know you want to test things, but I don't think your test is very likely to mirror an actual failure. Try putting the master in maintenance mode, or powering it off. If that works it should work in a real failure. If you really want to dig into the cable thing, watch the CARP status on each node when you unplug the WAN, preemption should trigger the rest of the CARP VIPs to go into backup after the WAN goes.

                            Although I have had a unplug problem ONCE by accident that is not the norm.  My assumption was that from the perspective of the nodes I was testing a single Switch/Port Failure.

                            I will try a power failure of Node1 after hours and then the switch power failure Node1 is on.

                            1 Reply Last reply Reply Quote 0
                            • DerelictD
                              Derelict LAYER 8 Netgate
                              last edited by

                              He has already stated that everything works fine going into and out of maintenance mode.

                              I was going to also say that this is a very odd failure test. With that hardware that might be the best that it will do. Unplugging a cable should not result in a watchdog timeout. It should result in a link down. Not sure why that is happening.

                              Chattanooga, Tennessee, USA
                              A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                              DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                              Do Not Chat For Help! NO_WAN_EGRESS(TM)

                              1 Reply Last reply Reply Quote 0
                              • R
                                Ranch
                                last edited by

                                Couldn't do a switch fail too many people left on the network.  However the power fail for the Primary Node1 worked flawlessly, never missed a beat on pings.  I will have to test this weekend or next week for the switch fail test.

                                Probably a dumb question, why does pulling the cord not simulate the same failure?

                                Below is the CAP and Log from Node2 during the Node1 fail test.

                                Node2
                                15:50:02.942083 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:02.942125 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:04.333891 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:04.333904 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:05.725931 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:05.725943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:07.117895 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:07.117909 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:08.509933 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:08.509945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:09.901886 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:09.901898 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:11.293933 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:11.293945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:12.696925 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:12.696937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:14.095942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:14.095955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:15.487932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:15.487944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:16.894938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:16.894950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:18.286944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:18.286956 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:19.678938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:19.678950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:21.070930 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:21.070941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:22.481941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:22.481953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:23.886938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:23.886955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:25.287937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:25.287948 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:26.690950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:26.690969 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:28.095943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:28.095960 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:29.493929 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:29.493941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:30.898938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:30.898951 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:32.304932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:32.304945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:33.703947 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:33.703961 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:35.101930 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:35.101942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:36.506943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:36.506955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:37.913927 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:37.913939 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:39.318943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:39.318955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:40.717936 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:40.717954 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:42.109945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:42.109957 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:43.528942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:43.528960 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:44.926941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:44.926953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:46.325926 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:46.325938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:47.717949 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:47.717966 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:49.109942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:49.109959 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:50.507937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:50.507949 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:51.913932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:51.913944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:53.317941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
                                15:50:53.317953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:54.717942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
                                15:50:54.717959 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866

                                Nov 10 15:51:30 check_reload_status Carp backup event
                                Nov 10 15:51:30 kernel carp: VHID 9@em1: MASTER -> BACKUP (more frequent advertisement received)
                                Nov 10 15:51:30 kernel carp: VHID 10@em1: MASTER -> BACKUP (more frequent advertisement received)
                                Nov 10 15:51:30 check_reload_status Carp backup event
                                Nov 10 15:51:26 check_reload_status Carp backup event
                                Nov 10 15:51:26 check_reload_status Carp backup event
                                Nov 10 15:51:26 kernel carp: VHID 12@em0_vlan31: MASTER -> BACKUP (more frequent advertisement received)
                                Nov 10 15:51:26 kernel carp: VHID 13@em0_vlan41: MASTER -> BACKUP (more frequent advertisement received)
                                Nov 10 15:51:26 kernel carp: VHID 11@em0: MASTER -> BACKUP (more frequent advertisement received)
                                Nov 10 15:51:26 check_reload_status Carp backup event
                                Nov 10 15:50:10 php-fpm 30196 /rc.carpmaster: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "MASTER" for vhid 10
                                Nov 10 15:50:10 php-fpm 30196 /rc.carpmaster: Message sent to admin@raschools.com OK
                                Nov 10 15:50:10 php-fpm 91222 /rc.carpmaster: HA cluster member "(67.239.73.178@em1): (WAN)" has resumed CARP state "MASTER" for vhid 9
                                Nov 10 15:50:10 php-fpm 91222 /rc.carpmaster: Message sent to admin@raschools.com OK
                                Nov 10 15:50:10 php-fpm 29856 /rc.carpmaster: HA cluster member "(10.41.31.1@em0_vlan31): (VLAN31_STAFF)" has resumed CARP state "MASTER" for vhid 12
                                Nov 10 15:50:10 php-fpm 29546 /rc.carpmaster: HA cluster member "(10.41.41.1@em0_vlan41): (VLAN41_CHROMEBOOKS)" has resumed CARP state "MASTER" for vhid 13
                                Nov 10 15:50:10 php-fpm 29856 /rc.carpmaster: Message sent to admin@raschools.com OK
                                Nov 10 15:50:10 php-fpm 29546 /rc.carpmaster: Message sent to admin@raschools.com OK
                                Nov 10 15:50:07 php-fpm 91222 /rc.carpmaster: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "MASTER" for vhid 11
                                Nov 10 15:50:07 php-fpm 91222 /rc.carpmaster: Message sent to admin@raschools.com OK
                                Nov 10 15:50:02 check_reload_status Carp master event
                                Nov 10 15:50:02 check_reload_status Carp master event
                                Nov 10 15:50:02 check_reload_status Carp master event
                                Nov 10 15:50:02 check_reload_status Carp master event
                                Nov 10 15:50:02 kernel carp: VHID 9@em1: BACKUP -> MASTER (master down)
                                Nov 10 15:50:02 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)
                                Nov 10 15:50:02 kernel carp: VHID 12@em0_vlan31: BACKUP -> MASTER (master down)
                                Nov 10 15:50:02 kernel carp: VHID 13@em0_vlan41: BACKUP -> MASTER (master down)
                                Nov 10 15:50:02 kernel carp: VHID 11@em0: BACKUP -> MASTER (master down)
                                Nov 10 15:50:02 check_reload_status Carp master event

                                1 Reply Last reply Reply Quote 0
                                • L
                                  lexaz77757
                                  last edited by

                                  Hello, I have the same problem with my Pfsense CARP.
                                  When I pull out the WAN cable, it takes very much time, before PfSense detects the cable failure.
                                  Are there any ways to solve the problem?
                                  Thanks

                                  1 Reply Last reply Reply Quote 0
                                  • R
                                    Ranch
                                    last edited by

                                    If it was the exact same issue I had then the real problem ended up being my testing method.  Although I never understood why yanking a cable isn't a valid test.  When I simply unplug POWER from the primary router the failover is nearly instantaneous.  I never posted a follow up on simulating a switch failure but that also failed over as expected.

                                    On another note, I have had to enter maintenance mode 5 times recently so that I could change out some other equipment inline with these.  Three out of five the switchover was nearly 30seconds and the WebInterface on both routers locks up completely which switching over.  Looking at top via ssh during the switchover doesn't show anything locked up nor maxing out RAM/CPU so not sure what the problem was there if it is related at all.

                                    1 Reply Last reply Reply Quote 0
                                    • First post
                                      Last post
                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.