Suricata rules refresh causes HA Carp VIP failover and back again



  • I was trying to find out why every day I get a CARP failover and finally figured out that it happens right after Suricata rules are download and suricata is restarted (stopped part of it specifically I think).  The firewalls failover to the secondary and then back again quickly.  This happens on the two 2.4.2 p1 cluster systems that I have installed on 2 different hardware platforms (dell and the other on some gigabyte old hardware).  I was running 2.3.2 before this and didn't have the issue but after upgrading 2 different HA clusters to 2.4.2 p1, It is happening on both of them now.

    I have Suricata in 'Legacy' mode and not inline.  It seems strange that the rules change would cause this but it seems to happen every time.

    2.4.2-RELEASE-p1 (amd64)
    built on Tue Dec 12 13:45:26 CST 2017
    FreeBSD 11.1-RELEASE-p6

    suricata security 4.0.3_1 High Performance Network IDS, IPS and Security Monitoring engine by OISF.
      Package Dependencies:
        suricata-4.0.3   barnyard2-1.13_1

    Primary firewall:

    Jan 19 14:35:01 	php-fpm 	41426 	/index.php: Successful login for user 'admin' from: remote.x.x.146
    Jan 19 16:30:01 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] There is a new set of Emerging Threats Open rules posted. Downloading emerging.rules.tar.gz...
    Jan 19 16:30:09 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Emerging Threats Open rules file update downloaded successfully.
    Jan 19 16:30:09 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Snort VRT rules md5 download failed...
    Jan 19 16:30:09 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Server returned error code 422...
    Jan 19 16:30:10 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Updating rules configuration for: LAN ...
    Jan 19 16:30:17 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Enabling any flowbit-required rules for: LAN...
    Jan 19 16:30:17 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Building new sid-msg.map file for LAN...
    Jan 19 16:30:19 	SuricataStartup 	68553 	Suricata STOP for LAN(61011_igb1)...
    Jan 19 16:30:20 	check_reload_status 		Carp backup event
    Jan 19 16:30:20 	kernel 		carp: 1@igb1: MASTER -> INIT (hardware interface down)
    Jan 19 16:30:20 	kernel 		carp: demoted by 240 to 240 (interface down)
    Jan 19 16:30:20 	kernel 		igb1: link state changed to DOWN
    Jan 19 16:30:20 	check_reload_status 		Linkup starting igb1
    Jan 19 16:30:21 	php-fpm 	69318 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.2 )
    Jan 19 16:30:21 	check_reload_status 		Reloading filter
    Jan 19 16:30:22 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Suricata has restarted with your new set of rules...
    Jan 19 16:30:22 	SuricataStartup 	70868 	Suricata START for LAN(61011_igb1)...
    Jan 19 16:30:22 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] The Rules update has finished.
    Jan 19 16:30:22 	kernel 		carp: 2@igb0: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:30:22 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb0: 3
    Jan 19 16:30:22 	kernel 		carp: 3@igb2: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:30:22 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb2: 3
    Jan 19 16:30:22 	check_reload_status 		Carp backup event
    Jan 19 16:30:22 	check_reload_status 		Carp backup event
    Jan 19 16:30:22 	check_reload_status 		Syncing firewall
    Jan 19 16:30:22 	php-fpm 	67295 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:22 	php-fpm 	67295 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:23 	check_reload_status 		Carp backup event
    Jan 19 16:30:23 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:30:23 	kernel 		carp: demoted by -240 to 0 (interface up)
    Jan 19 16:30:23 	kernel 		igb1: link state changed to UP
    Jan 19 16:30:23 	check_reload_status 		Linkup starting igb1
    Jan 19 16:30:23 	php-fpm 	74332 	/rc.filter_synchronize: Beginning XMLRPC sync data to https://sync.x.x.2:8443/xmlrpc.php.
    Jan 19 16:30:23 	kernel 		carp: 2@igb0: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:30:23 	kernel 		carp: 3@igb2: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:30:23 	check_reload_status 		Carp master event
    Jan 19 16:30:23 	check_reload_status 		Carp master event
    Jan 19 16:30:24 	php-fpm 	69318 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:24 	php-fpm 	69318 	/rc.carpbackup: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "BACKUP" for vhid 3
    Jan 19 16:30:24 	php-fpm 	67295 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:24 	php-fpm 	67295 	/rc.carpbackup: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "BACKUP" for vhid 2
    Jan 19 16:30:24 	php-fpm 	67295 	/rc.carpbackup: Stopping OpenVPN server instance on WAN because of transition to CARP backup.
    Jan 19 16:30:24 	php-fpm 	67295 	OpenVPN terminate old pid: 58060
    Jan 19 16:30:24 	kernel 		ovpns1: link state changed to DOWN
    Jan 19 16:30:24 	check_reload_status 		Reloading filter
    Jan 19 16:30:24 	php-fpm 	67295 	OpenVPN PID written: 77118
    Jan 19 16:30:24 	check_reload_status 		Reloading filter
    Jan 19 16:30:24 	kernel 		ovpns1: link state changed to UP
    Jan 19 16:30:24 	php-fpm 	67295 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.2 )
    Jan 19 16:30:24 	check_reload_status 		rc.newwanip starting ovpns1
    Jan 19 16:30:24 	check_reload_status 		rc.newwanip starting igb1
    Jan 19 16:30:24 	php-fpm 	69318 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:24 	php-fpm 	69318 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:25 	php-fpm 	67295 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:25 	php-fpm 	67295 	/rc.carpmaster: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "MASTER" for vhid 2
    Jan 19 16:30:25 	php-fpm 	67295 	/rc.carpmaster: Starting OpenVPN server instance on WAN because of transition to CARP master.
    Jan 19 16:30:25 	php-fpm 	67295 	OpenVPN terminate old pid: 77118
    Jan 19 16:30:25 	kernel 		ovpns1: link state changed to DOWN
    Jan 19 16:30:25 	php-fpm 	74332 	/rc.filter_synchronize: XMLRPC reload data success with https://sync.x.x.2:8443/xmlrpc.php (pfsense.host_firmware_version).
    Jan 19 16:30:25 	php-fpm 	74332 	/rc.filter_synchronize: XMLRPC versioncheck: 17.3 -- 17.3
    Jan 19 16:30:25 	php-fpm 	74332 	/rc.filter_synchronize: Beginning XMLRPC sync data to https://sync.x.x.2:8443/xmlrpc.php.
    Jan 19 16:30:25 	check_reload_status 		Reloading filter
    Jan 19 16:30:25 	php-fpm 	69318 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:25 	php-fpm 	69318 	/rc.carpmaster: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "MASTER" for vhid 3
    Jan 19 16:30:25 	php-fpm 	67295 	OpenVPN PID written: 80509
    Jan 19 16:30:25 	kernel 		ovpns1: link state changed to UP
    Jan 19 16:30:25 	check_reload_status 		rc.newwanip starting ovpns1
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: rc.newwanip: Info: starting on ovpns1.
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: rc.newwanip: on (IP address: openvpn.x.x.129) (interface: []) (real interface: ovpns1).
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: rc.newwanip called with empty interface.
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> openvpn.x.x.129 - Restarting packages.
    Jan 19 16:30:25 	check_reload_status 		Starting packages
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: rc.newwanip: Info: starting on igb1.
    Jan 19 16:30:25 	php-fpm 	82441 	/rc.newwanip: rc.newwanip: on (IP address: lan.x.x.2) (interface: LAN[lan]) (real interface: igb1).
    Jan 19 16:30:25 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:25 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:25 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface up)
    Jan 19 16:30:25 	check_reload_status 		Carp backup event
    Jan 19 16:30:25 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:30:25 	check_reload_status 		Carp backup event
    Jan 19 16:30:26 	kernel 		carp: 1@igb1: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:30:26 	check_reload_status 		Carp master event
    Jan 19 16:30:26 	php-fpm 	69318 	/rc.newwanip: rc.newwanip: Info: starting on ovpns1.
    Jan 19 16:30:26 	php-fpm 	69318 	/rc.newwanip: rc.newwanip: on (IP address: openvpn.x.x.129) (interface: []) (real interface: ovpns1).
    Jan 19 16:30:26 	php-fpm 	69318 	/rc.newwanip: rc.newwanip called with empty interface.
    Jan 19 16:30:26 	php-fpm 	69318 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> openvpn.x.x.129 - Restarting packages.
    Jan 19 16:30:26 	check_reload_status 		Starting packages
    Jan 19 16:30:26 	php-fpm 	69318 	/rc.start_packages: Restarting/Starting all packages.
    Jan 19 16:30:26 	php-fpm 	82441 	/rc.newwanip: waiting for pfsync...
    Jan 19 16:30:27 	php-fpm 	74332 	/rc.filter_synchronize: XMLRPC reload data success with https://sync.x.x.2:8443/xmlrpc.php (pfsense.restore_config_section).
    Jan 19 16:30:27 	php-fpm 	74332 	/rc.filter_synchronize: Beginning XMLRPC sync data to https://sync.x.x.2:8443/xmlrpc.php.
    Jan 19 16:30:27 	php-fpm 	67295 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:27 	php-fpm 	69318 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:27 	php-fpm 	69318 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:27 	php-fpm 	69318 	/rc.start_packages: Restarting/Starting all packages.
    Jan 19 16:30:27 	SuricataStartup 	93639 	Ignoring additional START command since Suricata is already starting...
    Jan 19 16:30:28 	php-fpm 	67295 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:28 	php-fpm 	67295 	/rc.carpmaster: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "MASTER" for vhid 1
    Jan 19 16:30:29 	php-fpm 	74332 	/rc.filter_synchronize: XMLRPC reload data success with https://sync.x.x.2:8443/xmlrpc.php (pfsense.filter_configure).
    Jan 19 16:30:57 	php-fpm 	82441 	/rc.newwanip: pfsync done in 30 seconds.
    Jan 19 16:30:57 	php-fpm 	82441 	/rc.newwanip: Configuring CARP settings finalize...
    Jan 19 16:30:57 	check_reload_status 		Reloading filter
    Jan 19 16:31:04 	kernel 		carp: 1@igb1: MASTER -> INIT (hardware interface down)
    Jan 19 16:31:04 	kernel 		carp: demoted by 240 to 240 (interface down)
    Jan 19 16:31:04 	kernel 		igb1: link state changed to DOWN
    Jan 19 16:31:04 	kernel 		carp: 3@igb2: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:31:04 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb2: 3
    Jan 19 16:31:04 	kernel 		carp: 2@igb0: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:31:04 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb0: 3
    Jan 19 16:31:04 	check_reload_status 		Carp backup event
    Jan 19 16:31:04 	check_reload_status 		Linkup starting igb1
    Jan 19 16:31:04 	check_reload_status 		Carp backup event
    Jan 19 16:31:04 	check_reload_status 		Carp backup event
    Jan 19 16:31:06 	php-fpm 	78016 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.2 )
    Jan 19 16:31:06 	check_reload_status 		Reloading filter
    Jan 19 16:31:06 	php-fpm 	65739 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:06 	php-fpm 	65739 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:07 	php-fpm 	65739 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:07 	php-fpm 	65739 	/rc.carpbackup: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "BACKUP" for vhid 2
    Jan 19 16:31:07 	php-fpm 	65739 	/rc.carpbackup: Stopping OpenVPN server instance on WAN because of transition to CARP backup.
    Jan 19 16:31:07 	php-fpm 	78016 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:07 	php-fpm 	78016 	/rc.carpbackup: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "BACKUP" for vhid 3
    Jan 19 16:31:07 	php-fpm 	65739 	OpenVPN terminate old pid: 80509
    Jan 19 16:31:07 	kernel 		ovpns1: link state changed to DOWN
    Jan 19 16:31:07 	check_reload_status 		Carp backup event
    Jan 19 16:31:07 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:31:07 	kernel 		carp: demoted by -240 to 0 (interface up)
    Jan 19 16:31:07 	kernel 		igb1: link state changed to UP
    Jan 19 16:31:07 	check_reload_status 		Reloading filter
    Jan 19 16:31:07 	check_reload_status 		Linkup starting igb1
    Jan 19 16:31:07 	kernel 		carp: 3@igb2: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:31:07 	kernel 		carp: 2@igb0: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:31:07 	check_reload_status 		Carp master event
    Jan 19 16:31:07 	check_reload_status 		Carp master event
    Jan 19 16:31:08 	php-fpm 	82740 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.2 )
    Jan 19 16:31:08 	check_reload_status 		rc.newwanip starting igb1
    Jan 19 16:31:08 	check_reload_status 		Reloading filter
    Jan 19 16:31:08 	php-fpm 	65739 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:08 	php-fpm 	65739 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:09 	php-fpm 	82740 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:09 	php-fpm 	82740 	/rc.carpmaster: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "MASTER" for vhid 3
    Jan 19 16:31:09 	php-fpm 	82740 	/rc.newwanip: rc.newwanip: Info: starting on igb1.
    Jan 19 16:31:09 	php-fpm 	82740 	/rc.newwanip: rc.newwanip: on (IP address: lan.x.x.2) (interface: LAN[lan]) (real interface: igb1).
    Jan 19 16:31:09 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:09 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:09 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface up)
    Jan 19 16:31:09 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:31:09 	check_reload_status 		Carp backup event
    Jan 19 16:31:09 	check_reload_status 		Carp backup event
    Jan 19 16:31:09 	php-fpm 	65739 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:09 	php-fpm 	65739 	/rc.carpmaster: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "MASTER" for vhid 2
    Jan 19 16:31:09 	php-fpm 	65739 	/rc.carpmaster: Starting OpenVPN server instance on WAN because of transition to CARP master.
    Jan 19 16:31:09 	php-fpm 	65739 	OpenVPN PID written: 87267
    Jan 19 16:31:09 	check_reload_status 		Reloading filter
    Jan 19 16:31:09 	kernel 		ovpns1: link state changed to UP
    Jan 19 16:31:09 	check_reload_status 		rc.newwanip starting ovpns1
    Jan 19 16:31:10 	kernel 		carp: 1@igb1: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:31:10 	check_reload_status 		Carp master event
    Jan 19 16:31:10 	php-fpm 	82740 	/rc.newwanip: waiting for pfsync...
    Jan 19 16:31:10 	php-fpm 	78016 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:10 	php-fpm 	78016 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:10 	php-fpm 	78016 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:11 	php-fpm 	90714 	/rc.newwanip: rc.newwanip: Info: starting on ovpns1.
    Jan 19 16:31:11 	php-fpm 	90714 	/rc.newwanip: rc.newwanip: on (IP address: openvpn.x.x.129) (interface: []) (real interface: ovpns1).
    Jan 19 16:31:11 	php-fpm 	90714 	/rc.newwanip: rc.newwanip called with empty interface.
    Jan 19 16:31:11 	check_reload_status 		Reloading filter
    Jan 19 16:31:11 	php-fpm 	90714 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> openvpn.x.x.129 - Restarting packages.
    Jan 19 16:31:11 	check_reload_status 		Starting packages
    Jan 19 16:31:12 	php-fpm 	90714 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:12 	php-fpm 	90714 	/rc.carpmaster: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "MASTER" for vhid 1
    Jan 19 16:31:13 	php-fpm 	90714 	/rc.start_packages: Restarting/Starting all packages.
    Jan 19 16:31:40 	php-fpm 	82740 	/rc.newwanip: pfsync done in 30 seconds.
    Jan 19 16:31:40 	php-fpm 	82740 	/rc.newwanip: Configuring CARP settings finalize...
    Jan 19 16:31:40 	check_reload_status 		Reloading filter
    Jan 19 16:53:15 	check_reload_status 		Syncing firewall 
    

    Secondary firewall:

    Jan 19 04:31:37 	check_reload_status 		Reloading filter
    Jan 19 16:30:01 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] There is a new set of Emerging Threats Open rules posted. Downloading emerging.rules.tar.gz...
    Jan 19 16:30:07 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Emerging Threats Open rules file update downloaded successfully.
    Jan 19 16:30:07 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Snort VRT rules md5 download failed...
    Jan 19 16:30:07 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Server returned error code 422...
    Jan 19 16:30:08 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Updating rules configuration for: LAN ...
    Jan 19 16:30:14 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Enabling any flowbit-required rules for: LAN...
    Jan 19 16:30:15 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Building new sid-msg.map file for LAN...
    Jan 19 16:30:17 	SuricataStartup 	69689 	Suricata STOP for LAN(61011_igb1)...
    Jan 19 16:30:18 	check_reload_status 		Carp backup event
    Jan 19 16:30:18 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface down)
    Jan 19 16:30:18 	kernel 		carp: demoted by 240 to 240 (interface down)
    Jan 19 16:30:18 	kernel 		igb1: link state changed to DOWN
    Jan 19 16:30:18 	check_reload_status 		Linkup starting igb1
    Jan 19 16:30:19 	php-fpm 	87914 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:19 	php-fpm 	87914 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.3 )
    Jan 19 16:30:19 	check_reload_status 		Reloading filter
    Jan 19 16:30:19 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] Suricata has restarted with your new set of rules...
    Jan 19 16:30:19 	SuricataStartup 	72396 	Suricata START for LAN(61011_igb1)...
    Jan 19 16:30:19 	php-cgi 		suricata_check_for_rule_updates.php: [Suricata] The Rules update has finished.
    Jan 19 16:30:19 	check_reload_status 		Syncing firewall
    Jan 19 16:30:21 	check_reload_status 		Carp backup event
    Jan 19 16:30:21 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:30:21 	kernel 		carp: demoted by -240 to 0 (interface up)
    Jan 19 16:30:21 	kernel 		igb1: link state changed to UP
    Jan 19 16:30:21 	check_reload_status 		Linkup starting igb1
    Jan 19 16:30:22 	php-fpm 	74885 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:22 	php-fpm 	74885 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.3 )
    Jan 19 16:30:22 	check_reload_status 		rc.newwanip starting igb1
    Jan 19 16:30:22 	check_reload_status 		Reloading filter
    Jan 19 16:30:22 	kernel 		carp: 2@igb0: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:30:22 	kernel 		carp: 3@igb2: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:30:22 	check_reload_status 		Carp master event
    Jan 19 16:30:22 	check_reload_status 		Carp master event
    Jan 19 16:30:23 	php-fpm 	74885 	/rc.newwanip: rc.newwanip: Info: starting on igb1.
    Jan 19 16:30:23 	php-fpm 	74885 	/rc.newwanip: rc.newwanip: on (IP address: lan.x.x.3) (interface: LAN[lan]) (real interface: igb1).
    Jan 19 16:30:23 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:23 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:23 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:23 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface up)
    Jan 19 16:30:23 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:30:23 	check_reload_status 		Carp backup event
    Jan 19 16:30:23 	check_reload_status 		Carp backup event
    Jan 19 16:30:23 	kernel 		carp: 2@igb0: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:30:23 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb0: 3
    Jan 19 16:30:23 	kernel 		carp: 3@igb2: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:30:23 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb2: 3
    Jan 19 16:30:23 	check_reload_status 		Carp backup event
    Jan 19 16:30:23 	check_reload_status 		Carp backup event
    Jan 19 16:30:24 	php-fpm 	74885 	/rc.newwanip: waiting for pfsync...
    Jan 19 16:30:25 	php-fpm 	78224 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:25 	php-fpm 	78224 	/rc.carpmaster: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "MASTER" for vhid 2
    Jan 19 16:30:25 	php-fpm 	78224 	/rc.carpmaster: Starting OpenVPN server instance on WAN because of transition to CARP master.
    Jan 19 16:30:25 	php-fpm 	81151 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:25 	php-fpm 	81151 	/rc.carpmaster: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "MASTER" for vhid 3
    Jan 19 16:30:25 	php-fpm 	81151 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:26 	php-fpm 	78224 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:26 	php-fpm 	78224 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:26 	php-fpm 	81151 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:26 	php-fpm 	81151 	/rc.carpbackup: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "BACKUP" for vhid 2
    Jan 19 16:30:26 	php-fpm 	81151 	/rc.carpbackup: Stopping OpenVPN server instance on WAN because of transition to CARP backup.
    Jan 19 16:30:26 	kernel 		carp: 1@igb1: BACKUP -> MASTER (master timed out)
    Jan 19 16:30:26 	kernel 		carp: 1@igb1: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:30:26 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:30:26 	php-fpm 	82923 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:26 	php-fpm 	82923 	/rc.carpbackup: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "BACKUP" for vhid 3
    Jan 19 16:30:26 	check_reload_status 		Carp master event
    Jan 19 16:30:26 	check_reload_status 		Carp backup event
    Jan 19 16:30:27 	check_reload_status 		Syncing firewall
    Jan 19 16:30:27 	check_reload_status 		Reloading filter
    Jan 19 16:30:27 	php-fpm 	82923 	/xmlrpc.php: ROUTING: setting default route to external.x.x.241
    Jan 19 16:30:27 	php-fpm 	82923 	/xmlrpc.php: Resyncing OpenVPN instances.
    Jan 19 16:30:28 	php-fpm 	78224 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:28 	php-fpm 	78224 	/rc.carpmaster: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "MASTER" for vhid 1
    Jan 19 16:30:34 	php-fpm 	87990 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:30:34 	php-fpm 	87990 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:30:55 	php-fpm 	74885 	/rc.newwanip: pfsync done in 30 seconds.
    Jan 19 16:30:55 	php-fpm 	74885 	/rc.newwanip: Configuring CARP settings finalize...
    Jan 19 16:30:55 	check_reload_status 		Reloading filter
    Jan 19 16:31:01 	check_reload_status 		Carp backup event
    Jan 19 16:31:01 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface down)
    Jan 19 16:31:01 	kernel 		carp: demoted by 240 to 240 (interface down)
    Jan 19 16:31:01 	kernel 		igb1: link state changed to DOWN
    Jan 19 16:31:01 	check_reload_status 		Linkup starting igb1
    Jan 19 16:31:02 	php-fpm 	46547 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:02 	php-fpm 	46547 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.3 )
    Jan 19 16:31:02 	check_reload_status 		Reloading filter
    Jan 19 16:31:04 	check_reload_status 		Carp backup event
    Jan 19 16:31:04 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:31:04 	kernel 		carp: demoted by -240 to 0 (interface up)
    Jan 19 16:31:04 	kernel 		igb1: link state changed to UP
    Jan 19 16:31:04 	check_reload_status 		Linkup starting igb1
    Jan 19 16:31:04 	kernel 		carp: 3@igb2: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:31:04 	kernel 		carp: 2@igb0: BACKUP -> MASTER (preempting a slower master)
    Jan 19 16:31:04 	check_reload_status 		Carp master event
    Jan 19 16:31:04 	check_reload_status 		Carp master event
    Jan 19 16:31:05 	php-fpm 	48527 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:05 	php-fpm 	48527 	/rc.linkup: Hotplug event detected for LAN(lan) static IP (lan.x.x.3 )
    Jan 19 16:31:05 	check_reload_status 		rc.newwanip starting igb1
    Jan 19 16:31:05 	check_reload_status 		Reloading filter
    Jan 19 16:31:06 	php-fpm 	51030 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:06 	php-fpm 	51030 	/rc.carpmaster: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "MASTER" for vhid 2
    Jan 19 16:31:06 	php-fpm 	51030 	/rc.carpmaster: Starting OpenVPN server instance on WAN because of transition to CARP master.
    Jan 19 16:31:06 	php-fpm 	51030 	OpenVPN PID written: 52204
    Jan 19 16:31:06 	check_reload_status 		Reloading filter
    Jan 19 16:31:06 	kernel 		ovpns1: link state changed to UP
    Jan 19 16:31:06 	php-fpm 	48527 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:06 	php-fpm 	48527 	/rc.carpmaster: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "MASTER" for vhid 3
    Jan 19 16:31:06 	php-fpm 	51030 	/rc.newwanip: rc.newwanip: Info: starting on igb1.
    Jan 19 16:31:06 	php-fpm 	51030 	/rc.newwanip: rc.newwanip: on (IP address: lan.x.x.3) (interface: LAN[lan]) (real interface: igb1).
    Jan 19 16:31:06 	check_reload_status 		rc.newwanip starting ovpns1
    Jan 19 16:31:06 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:06 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:06 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:06 	kernel 		carp: 1@igb1: BACKUP -> INIT (hardware interface up)
    Jan 19 16:31:06 	kernel 		carp: 1@igb1: INIT -> BACKUP (initialization complete)
    Jan 19 16:31:06 	check_reload_status 		Carp backup event
    Jan 19 16:31:06 	check_reload_status 		Carp backup event
    Jan 19 16:31:07 	kernel 		carp: 3@igb2: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:31:07 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb2: 3
    Jan 19 16:31:07 	kernel 		carp: 2@igb0: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:31:07 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb0: 3
    Jan 19 16:31:07 	check_reload_status 		Carp backup event
    Jan 19 16:31:07 	check_reload_status 		Carp backup event
    Jan 19 16:31:07 	php-fpm 	48527 	/rc.newwanip: rc.newwanip: Info: starting on ovpns1.
    Jan 19 16:31:07 	php-fpm 	48527 	/rc.newwanip: rc.newwanip: on (IP address: 10.20.254.129) (interface: []) (real interface: ovpns1).
    Jan 19 16:31:07 	php-fpm 	48527 	/rc.newwanip: rc.newwanip called with empty interface.
    Jan 19 16:31:07 	check_reload_status 		Reloading filter
    Jan 19 16:31:07 	php-fpm 	48527 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 10.20.254.129 - Restarting packages.
    Jan 19 16:31:07 	check_reload_status 		Starting packages
    Jan 19 16:31:07 	php-fpm 	51030 	/rc.newwanip: waiting for pfsync...
    Jan 19 16:31:07 	php-fpm 	58261 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:08 	php-fpm 	48527 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:08 	php-fpm 	48527 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:08 	php-fpm 	55155 	/rc.start_packages: Restarting/Starting all packages.
    Jan 19 16:31:08 	php-fpm 	58261 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:08 	php-fpm 	58261 	/rc.carpbackup: HA cluster member "(dmz.x.x.1@igb2): (SERVICE)" has resumed CARP state "BACKUP" for vhid 3
    Jan 19 16:31:08 	php-fpm 	48527 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:08 	php-fpm 	48527 	/rc.carpbackup: HA cluster member "(external.x.x.244@igb0): (WAN)" has resumed CARP state "BACKUP" for vhid 2
    Jan 19 16:31:08 	php-fpm 	48527 	/rc.carpbackup: Stopping OpenVPN server instance on WAN because of transition to CARP backup.
    Jan 19 16:31:08 	php-fpm 	48527 	OpenVPN terminate old pid: 52204
    Jan 19 16:31:08 	kernel 		ovpns1: link state changed to DOWN
    Jan 19 16:31:09 	check_reload_status 		Reloading filter
    Jan 19 16:31:10 	kernel 		carp: 1@igb1: BACKUP -> MASTER (master timed out)
    Jan 19 16:31:10 	kernel 		carp: 1@igb1: MASTER -> BACKUP (more frequent advertisement received)
    Jan 19 16:31:10 	kernel 		ifa_maintain_loopback_route: deletion failed for interface igb1: 3
    Jan 19 16:31:10 	check_reload_status 		Carp master event
    Jan 19 16:31:10 	check_reload_status 		Carp backup event
    Jan 19 16:31:11 	php-fpm 	55155 	/rc.carpbackup: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:11 	php-fpm 	55155 	/rc.carpbackup: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
    Jan 19 16:31:11 	php-fpm 	58261 	/rc.carpmaster: Message sent to pfsensealerts@mysecretdomain.com OK
    Jan 19 16:31:11 	php-fpm 	58261 	/rc.carpmaster: HA cluster member "(lan.x.x.1@igb1): (LAN)" has resumed CARP state "MASTER" for vhid 1
    Jan 19 16:31:15 	php-fpm 	58261 	/status_ipsec.php: Successful login for user 'admin' from: remote.x.x.146
    Jan 19 16:31:38 	php-fpm 	51030 	/rc.newwanip: pfsync done in 30 seconds.
    Jan 19 16:31:38 	php-fpm 	51030 	/rc.newwanip: Configuring CARP settings finalize...
    Jan 19 16:31:38 	check_reload_status 		Reloading filter
    Jan 19 16:39:22 	check_reload_status 		Syncing firewall 
    


  • That's a little weird and unexpected!  Suricata using Legacy Mode blocking utilizes PCAP for sampling traffic on the interface.  Suricata does not directly bring the interface up or down, but it will toggle promiscuous mode on and off by default when Suricata starts and stops.  However, I would not expect that to cause the link to stop passing traffic and/or issue a "hotplug event" like I see in your logs.

    Suricata restarts itself at the end of the rules update process.  If you want, you can instead tell Suricata to simply "Live Reload" the new rules without restarting itself.  There is a checkbox either on the GLOBAL SETTINGS tab or else the INTERFACE SETTINGS tab for the interface.  I can't remember off the top of my head where that checkbox is located right now.  If you can find that checkbox, give that setting a try and see if it helps with the CARP failovers.

    You say this did not happen with 2.3.x pfSense?

    Bill



  • Yea.  I am pretty sure it was a 2.3 version before this that I was running that did not exhibit the behavior.  It was on older 2.3.2 or something like that though.  I did a bunch of upgrades around the same time and don't have the email alerts anymore to show how far back it was happening.

    I found the setting you are referring to.  We will see if that fixes the issue.  Thanks.

    Services\Suricata\Global Settings
    \Rules Update Settings
      \Live Rule Swap on Update



  • Changing Suricata config to live reload the rules stopped carp from failing over.  It does seem like Suricata was causing the issue.  I thought I didn't enable live reloading because of issues a few years ago but that was quite a few versions ago so maybe that isn't an issue anymore.  There is a note that if live reloading causes problems that you should disable live reloading.  Hopefully things keep going smoothly.

    Thanks for the help.



  • @adam65535:

    Changing Suricata config to live reload the rules stopped carp from failing over.  It does seem like Suricata was causing the issue.  I thought I didn't enable live reloading because of issues a few years ago but that was quite a few versions ago so maybe that isn't an issue anymore.  There is a note that if live reloading causes problems that you should disable live reloading.  Hopefully things keep going smoothly.

    Thanks for the help.

    Thanks for the follow-up.  Using Live Reload should be OK.  It is relatively mature now in Suricata.

    I still have no good explanation for why Suricata restarting woud cycle the network connection.  As I said earlier, the only thing it is doing with Legacy Mode blocking is starting up libpcap to get packet copies of traffic traversing the interface.  Maybe that causes something to hiccup in FreeBSD someplace and CARP sees the hiccup because maybe it disrupts traffic very briefly.  Strange issue.

    Bill



  • Just wanted to mention I am experiencing the exact same behavior on 4 of my firewalls I just upgraded to 2.4.2 all running legacy mode, and had no problems with this on 2.3.x

    Interestingly I have 2 more firewalls(with different hardware) running 2.4.0 with the exact same suricata settings and they are not causing carp failover during refresh.

    I've now enabled live swap as mentioned here, seems like that will fix it.  I'll get those other two upgraded to 2.4.2 as well and see what happens, I wonder if it's specific to the hardware or if a change in 2.4.2 is causing it.



  • @bmeeks:

    @adam65535:

    Changing Suricata config to live reload the rules stopped carp from failing over.  It does seem like Suricata was causing the issue.  I thought I didn't enable live reloading because of issues a few years ago but that was quite a few versions ago so maybe that isn't an issue anymore.  There is a note that if live reloading causes problems that you should disable live reloading.  Hopefully things keep going smoothly.

    Thanks for the help.

    Thanks for the follow-up.  Using Live Reload should be OK.  It is relatively mature now in Suricata.

    I still have no good explanation for why Suricata restarting woud cycle the network connection.  As I said earlier, the only thing it is doing with Legacy Mode blocking is starting up libpcap to get packet copies of traffic traversing the interface.  Maybe that causes something to hiccup in FreeBSD someplace and CARP sees the hiccup because maybe it disrupts traffic very briefly.  Strange issue.

    Bill

    Success! Looks like enabling "Live Swap" fixed the issue for me too. Only got past 1 "expected CARP failover event" thus far, but appears to be good.

    Thanks for the suggestion. All I did to fix it on my side is filled in the checkbox "Enable "Live Swap" reload of rules after downloading an update" on my pfsense routers and so far so good. Typically the routers appeared to fail back and forth a lot as the general system logs showed >5000 logs of CARP failover. Gladly CARP works very well, so actual impact was approx 2-5 lost pings, slight freeze on RDP sessions, but SSH sessions would continue to work as expected.

    Because I only just set this rule I have only gotten past one potential failure (update every 12 hours starting at 00:30. 00:30 did have failure, but at 11:40 I enabled the "Live Swap reload" in Suricata, and 12:30 typical CARP failover did NOT happen). In other words, I typically have two failures, one at 00:30 midnight, and a second at 12:30 noon. After changing this setting I have not had any failures.

    Crossing my fingers this was the solution :)

    PS. if helpful, my versions are:
    pfSense: 2.4.2-RELEASE
    Suricata: 4.0.3_1