Long Delays in CARP WAN Failover
-
Pulled the WAN from Node1 for 30 seconds and interestingly enough it failed over in about 20 seconds which is WAY less time than usual.
Things I noted from the test Node 1 System log nor the CAP seems to even register the CARP event.
Attached is the log files.
-
The logs from the two nodes aren't from the same time period.
-
Sorry didn't catch that Node1 was wrong it is listed below now for 11:14:59-11:22:09
The others were right, right?
Node2 Log 11:20:39-11:21:28
Node1 CAP 11:20:28-11:21:54
Node2 CAP 11:20:26-11:21:52Nov 10 11:22:09 check_reload_status updating dyndns WANGW
Nov 10 11:21:41 php-fpm 26622 /rc.carpmaster: HA cluster member "(10.41.41.1@em0_vlan41): (CU_STUDENT_VLAN41)" has resumed CARP state "MASTER" for vhid 13
Nov 10 11:21:41 php-fpm 26622 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:39 php-fpm 18153 /rc.carpmaster: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "MASTER" for vhid 10
Nov 10 11:21:39 php-fpm 18153 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:39 php-fpm 26987 /rc.carpmaster: HA cluster member "(10.41.31.1@em0_vlan31): (CU_STAFF_VLAN31)" has resumed CARP state "MASTER" for vhid 12
Nov 10 11:21:39 php-fpm 26987 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:37 php-fpm 24092 /rc.carpmaster: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "MASTER" for vhid 11
Nov 10 11:21:37 php-fpm 24092 /rc.carpmaster: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:35 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
Nov 10 11:21:35 xinetd 17858 readjusting service 19010-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19010-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19009-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19009-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19008-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19008-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19007-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19007-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19006-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19006-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19005-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19005-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19004-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19004-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19003-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19003-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19002-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19002-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19001-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19001-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 19000-udp
Nov 10 11:21:35 xinetd 17858 readjusting service 19000-tcp
Nov 10 11:21:35 xinetd 17858 readjusting service 6969-udp
Nov 10 11:21:35 xinetd 17858 Swapping defaults
Nov 10 11:21:35 xinetd 17858 Starting reconfiguration
Nov 10 11:21:35 check_reload_status Reloading filter
Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: Configuring CARP settings finalize…
Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: pfsync done in 0 seconds.
Nov 10 11:21:35 php-fpm 18153 /rc.newwanip: waiting for pfsync...
Nov 10 11:21:34 php-fpm 18153 /rc.newwanip: rc.newwanip: on (IP address: 67.239.73.182) (interface: WAN[wan]) (real interface: em1).
Nov 10 11:21:34 php-fpm 18153 /rc.newwanip: rc.newwanip: Info: starting on em1.
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:29 kernel ifa_add_loopback_route: insertion failed: 17
Nov 10 11:21:29 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)
Nov 10 11:21:29 check_reload_status Carp master event
Nov 10 11:21:28 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
Nov 10 11:21:27 xinetd 17858 readjusting service 19010-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19010-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19009-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19009-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19008-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19008-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19007-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19007-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19006-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19006-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19005-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19005-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19004-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19004-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19003-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19003-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19002-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19002-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19001-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19001-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 19000-udp
Nov 10 11:21:27 xinetd 17858 readjusting service 19000-tcp
Nov 10 11:21:27 xinetd 17858 readjusting service 6969-udp
Nov 10 11:21:27 xinetd 17858 Swapping defaults
Nov 10 11:21:27 xinetd 17858 Starting reconfiguration
Nov 10 11:21:27 check_reload_status Reloading filter
Nov 10 11:21:27 check_reload_status rc.newwanip starting em1
Nov 10 11:21:27 php-fpm 24092 /rc.linkup: Hotplug event detected for WAN(wan) static IP (67.239.73.182 )
Nov 10 11:21:27 check_reload_status Carp master event
Nov 10 11:21:27 check_reload_status Carp master event
Nov 10 11:21:27 check_reload_status Carp master event
Nov 10 11:21:27 kernel carp: VHID 12@em0_vlan31: BACKUP -> MASTER (preempting a slower master)
Nov 10 11:21:27 kernel carp: VHID 13@em0_vlan41: BACKUP -> MASTER (preempting a slower master)
Nov 10 11:21:27 kernel carp: VHID 11@em0: BACKUP -> MASTER (preempting a slower master)
Nov 10 11:21:26 check_reload_status Reloading filter
Nov 10 11:21:26 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 10 11:21:26 check_reload_status Restarting ipsec tunnels
Nov 10 11:21:26 check_reload_status updating dyndns WANGW
Nov 10 11:21:26 check_reload_status Linkup starting em1
Nov 10 11:21:26 kernel em1: link state changed to UP
Nov 10 11:21:26 kernel carp: demoted by -240 to 0 (interface up)
Nov 10 11:21:26 kernel carp: VHID 10@em1: INIT -> BACKUP
Nov 10 11:21:26 check_reload_status Carp backup event
Nov 10 11:21:10 php-fpm 17632 /rc.carpbackup: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "BACKUP" for vhid 11
Nov 10 11:21:10 php-fpm 18153 /rc.carpbackup: HA cluster member "(10.41.31.1@em0_vlan31): (CU_STAFF_VLAN31)" has resumed CARP state "BACKUP" for vhid 12
Nov 10 11:21:10 php-fpm 17941 /rc.carpbackup: HA cluster member "(10.41.41.1@em0_vlan41): (CU_STUDENT_VLAN41)" has resumed CARP state "BACKUP" for vhid 13
Nov 10 11:21:10 php-fpm 17632 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:10 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:10 php-fpm 17941 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:07 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
Nov 10 11:21:07 xinetd 17858 readjusting service 19010-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19010-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19009-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19009-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19008-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19008-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19007-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19007-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19006-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19006-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19005-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19005-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19004-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19004-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19003-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19003-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19002-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19002-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19001-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19001-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 19000-udp
Nov 10 11:21:07 xinetd 17858 readjusting service 19000-tcp
Nov 10 11:21:07 xinetd 17858 readjusting service 6969-udp
Nov 10 11:21:07 xinetd 17858 Swapping defaults
Nov 10 11:21:07 xinetd 17858 Starting reconfiguration
Nov 10 11:21:07 php-fpm 15607 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
Nov 10 11:21:07 php-fpm 15607 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:00 check_reload_status Reloading filter
Nov 10 11:21:00 php-fpm 17632 /rc.linkup: Hotplug event detected for WAN(wan) static IP (67.239.73.182 )
Nov 10 11:20:59 check_reload_status Carp backup event
Nov 10 11:20:59 check_reload_status Carp backup event
Nov 10 11:20:59 check_reload_status Carp backup event
Nov 10 11:20:59 check_reload_status Linkup starting em1
Nov 10 11:20:59 check_reload_status Carp backup event
Nov 10 11:20:59 kernel carp: VHID 12@em0_vlan31: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 11:20:59 kernel carp: VHID 13@em0_vlan41: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 11:20:59 kernel carp: VHID 11@em0: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 11:20:59 kernel em1: link state changed to DOWN
Nov 10 11:20:59 kernel carp: demoted by 240 to 240 (interface down)
Nov 10 11:20:59 kernel em1: RX Next to Refresh = 348
Nov 10 11:20:59 kernel em1: RX Next to Check = 349
Nov 10 11:20:59 kernel em1: RX discarded packets = 0
Nov 10 11:20:59 kernel em1: hw rdh = 349, hw rdt = 348
Nov 10 11:20:59 kernel em1: RX Queue 0 –----
Nov 10 11:20:59 kernel em1: Tx Descriptors avail failure = 2320
Nov 10 11:20:59 kernel em1: TX descriptors avail = 40
Nov 10 11:20:59 kernel em1: Tx Queue Status = -2147483648
Nov 10 11:20:59 kernel em1: hw tdh = 30, hw tdt = 1014
Nov 10 11:20:59 kernel em1: TX Queue 0 ------
Nov 10 11:20:59 kernel Interface is RUNNING and ACTIVE
Nov 10 11:20:59 kernel em1: Watchdog timeout Queue[0]– resetting
Nov 10 11:20:52 xinetd 17858 Reconfigured: new=0 old=23 dropped=0 (services)
Nov 10 11:20:52 xinetd 17858 readjusting service 19010-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19010-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19009-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19009-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19008-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19008-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19007-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19007-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19006-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19006-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19005-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19005-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19004-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19004-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19003-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19003-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19002-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19002-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19001-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19001-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 19000-udp
Nov 10 11:20:52 xinetd 17858 readjusting service 19000-tcp
Nov 10 11:20:52 xinetd 17858 readjusting service 6969-udp
Nov 10 11:20:52 xinetd 17858 Swapping defaults
Nov 10 11:20:52 xinetd 17858 Starting reconfiguration
Nov 10 11:20:51 check_reload_status Reloading filter
Nov 10 11:20:51 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 10 11:20:51 check_reload_status Restarting ipsec tunnels
Nov 10 11:20:51 check_reload_status updating dyndns WANGW
Nov 10 11:14:59 php-fpm 63346 /widgets/widgets/ipsec.widget.php: Unable to find IPsec daemon leases file. Could not display mobile user stats! -
You sure you don't need something like spanning-tree portfast on your switch ports?
This stuff is - seriously - almost always something with the switch.
-
Looks like it's taking quite some time for the system to detect the cable unplug event.
The secondary loses CARP from the primary an 11:20:39
Then 20 seconds later you get this on the primary:
Nov 10 11:20:59 kernel em1: Watchdog timeout Queue[0]– resetting
At 11:21:26 the link comes back up. Failing back to master is immediate.
This is also strange:
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:29 kernel ifa_add_loopback_route: insertion failed: 17
Nov 10 11:21:29 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)Probably want to capture CARP on primary to see what you see there.
-
Do you have a lot of people unplugging your WAN cables? I've never had a cable failure on a CARP node.
I've seen- power supply failure, hard disk failure, OS crash… I know you want to test things, but I don't think your test is very likely to mirror an actual failure. Try putting the master in maintenance mode, or powering it off. If that works it should work in a real failure. If you really want to dig into the cable thing, watch the CARP status on each node when you unplug the WAN, preemption should trigger the rest of the CARP VIPs to go into backup after the WAN goes. -
Looks like it's taking quite some time for the system to detect the cable unplug event.
The secondary loses CARP from the primary an 11:20:39
Then 20 seconds later you get this on the primary:
Nov 10 11:20:59 kernel em1: Watchdog timeout Queue[0]– resetting
At 11:21:26 the link comes back up. Failing back to master is immediate.
This is also strange:
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "BACKUP" for vhid 10
Nov 10 11:21:34 php-fpm 18153 /rc.carpbackup: Message sent to jwilhite@raschools.com OK
Nov 10 11:21:29 kernel ifa_add_loopback_route: insertion failed: 17
Nov 10 11:21:29 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)Probably want to capture CARP on primary to see what you see there.
I uploaded the CAP for CARP from the Primary and Secondary WAN, are you needing it from another PORT?
Thanks for noting the strange delay, is there a way to reduce the watchdog time? I am pretty sure my config is pretty standard, I haven't done any tweaking to this install.
Do you have a lot of people unplugging your WAN cables? I've never had a cable failure on a CARP node.
I've seen- power supply failure, hard disk failure, OS crash… I know you want to test things, but I don't think your test is very likely to mirror an actual failure. Try putting the master in maintenance mode, or powering it off. If that works it should work in a real failure. If you really want to dig into the cable thing, watch the CARP status on each node when you unplug the WAN, preemption should trigger the rest of the CARP VIPs to go into backup after the WAN goes.Although I have had a unplug problem ONCE by accident that is not the norm. My assumption was that from the perspective of the nodes I was testing a single Switch/Port Failure.
I will try a power failure of Node1 after hours and then the switch power failure Node1 is on.
-
He has already stated that everything works fine going into and out of maintenance mode.
I was going to also say that this is a very odd failure test. With that hardware that might be the best that it will do. Unplugging a cable should not result in a watchdog timeout. It should result in a link down. Not sure why that is happening.
-
Couldn't do a switch fail too many people left on the network. However the power fail for the Primary Node1 worked flawlessly, never missed a beat on pings. I will have to test this weekend or next week for the switch fail test.
Probably a dumb question, why does pulling the cord not simulate the same failure?
Below is the CAP and Log from Node2 during the Node1 fail test.
Node2
15:50:02.942083 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:02.942125 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:04.333891 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:04.333904 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:05.725931 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:05.725943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:07.117895 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:07.117909 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:08.509933 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:08.509945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:09.901886 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:09.901898 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:11.293933 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:11.293945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:12.696925 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:12.696937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:14.095942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:14.095955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:15.487932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:15.487944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:16.894938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:16.894950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:18.286944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:18.286956 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:19.678938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:19.678950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:21.070930 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:21.070941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:22.481941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:22.481953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:23.886938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:23.886955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:25.287937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:25.287948 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:26.690950 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:26.690969 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:28.095943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:28.095960 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:29.493929 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:29.493941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:30.898938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:30.898951 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:32.304932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:32.304945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:33.703947 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:33.703961 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:35.101930 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:35.101942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:36.506943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:36.506955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:37.913927 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:37.913939 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:39.318943 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:39.318955 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:40.717936 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:40.717954 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:42.109945 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:42.109957 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:43.528942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:43.528960 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:44.926941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:44.926953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:46.325926 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:46.325938 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:47.717949 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:47.717966 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:49.109942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:49.109959 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:50.507937 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:50.507949 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:51.913932 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:51.913944 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:53.317941 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866
15:50:53.317953 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:54.717942 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=9 advbase=1 advskew=100 authlen=7 counter=10048284322299381040
15:50:54.717959 IP 67.239.73.181 > 224.0.0.18: CARPv2-advertise 36: vhid=10 advbase=1 advskew=100 authlen=7 counter=9060617201846148866Nov 10 15:51:30 check_reload_status Carp backup event
Nov 10 15:51:30 kernel carp: VHID 9@em1: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 15:51:30 kernel carp: VHID 10@em1: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 15:51:30 check_reload_status Carp backup event
Nov 10 15:51:26 check_reload_status Carp backup event
Nov 10 15:51:26 check_reload_status Carp backup event
Nov 10 15:51:26 kernel carp: VHID 12@em0_vlan31: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 15:51:26 kernel carp: VHID 13@em0_vlan41: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 15:51:26 kernel carp: VHID 11@em0: MASTER -> BACKUP (more frequent advertisement received)
Nov 10 15:51:26 check_reload_status Carp backup event
Nov 10 15:50:10 php-fpm 30196 /rc.carpmaster: HA cluster member "(67.239.73.180@em1): (WAN)" has resumed CARP state "MASTER" for vhid 10
Nov 10 15:50:10 php-fpm 30196 /rc.carpmaster: Message sent to admin@raschools.com OK
Nov 10 15:50:10 php-fpm 91222 /rc.carpmaster: HA cluster member "(67.239.73.178@em1): (WAN)" has resumed CARP state "MASTER" for vhid 9
Nov 10 15:50:10 php-fpm 91222 /rc.carpmaster: Message sent to admin@raschools.com OK
Nov 10 15:50:10 php-fpm 29856 /rc.carpmaster: HA cluster member "(10.41.31.1@em0_vlan31): (VLAN31_STAFF)" has resumed CARP state "MASTER" for vhid 12
Nov 10 15:50:10 php-fpm 29546 /rc.carpmaster: HA cluster member "(10.41.41.1@em0_vlan41): (VLAN41_CHROMEBOOKS)" has resumed CARP state "MASTER" for vhid 13
Nov 10 15:50:10 php-fpm 29856 /rc.carpmaster: Message sent to admin@raschools.com OK
Nov 10 15:50:10 php-fpm 29546 /rc.carpmaster: Message sent to admin@raschools.com OK
Nov 10 15:50:07 php-fpm 91222 /rc.carpmaster: HA cluster member "(10.41.43.1@em0): (LAN)" has resumed CARP state "MASTER" for vhid 11
Nov 10 15:50:07 php-fpm 91222 /rc.carpmaster: Message sent to admin@raschools.com OK
Nov 10 15:50:02 check_reload_status Carp master event
Nov 10 15:50:02 check_reload_status Carp master event
Nov 10 15:50:02 check_reload_status Carp master event
Nov 10 15:50:02 check_reload_status Carp master event
Nov 10 15:50:02 kernel carp: VHID 9@em1: BACKUP -> MASTER (master down)
Nov 10 15:50:02 kernel carp: VHID 10@em1: BACKUP -> MASTER (master down)
Nov 10 15:50:02 kernel carp: VHID 12@em0_vlan31: BACKUP -> MASTER (master down)
Nov 10 15:50:02 kernel carp: VHID 13@em0_vlan41: BACKUP -> MASTER (master down)
Nov 10 15:50:02 kernel carp: VHID 11@em0: BACKUP -> MASTER (master down)
Nov 10 15:50:02 check_reload_status Carp master event -
Hello, I have the same problem with my Pfsense CARP.
When I pull out the WAN cable, it takes very much time, before PfSense detects the cable failure.
Are there any ways to solve the problem?
Thanks -
If it was the exact same issue I had then the real problem ended up being my testing method. Although I never understood why yanking a cable isn't a valid test. When I simply unplug POWER from the primary router the failover is nearly instantaneous. I never posted a follow up on simulating a switch failure but that also failed over as expected.
On another note, I have had to enter maintenance mode 5 times recently so that I could change out some other equipment inline with these. Three out of five the switchover was nearly 30seconds and the WebInterface on both routers locks up completely which switching over. Looking at top via ssh during the switchover doesn't show anything locked up nor maxing out RAM/CPU so not sure what the problem was there if it is related at all.