Failback state killing with "Automatic" failover?
-
Running 24.03 ...
With "Automatic" failover, is it possible to kill states on a secondary gateway when failing back to the primary (i.e., primary comes back up after an outage)?
First pic below is the gateway config. For v4, WAN_DHCP is the primary and E1V95_LTEGW is the secondary. "Automatic" is selected for the default gw. Failover/failback works, with the exception that I would like states on E1V95_LTEGW to be killed on failback (to push all connections back to the primary).
Second pic is from System>Advanced>Miscellaneous, showing state-killing for lower priority gateways on recovery is selected. After recovery, however, established states on E1V95_LTEGW are not killed (not expecting gateway monitoring states to be killed, just everything else).
Do gateway groups implementing simple failover/failback -- functionally the same as Automatic -- have to be defined for state-killing to occur on recovery?
Thx.
-
After creating a gateway group for the two IPv4 gateways, and setting that to the default gateway for IPv4, states on the secondary gateway are still not being killed on failback. "Kill all states for lower priority gateways" remains set for State Killing on Gateway Recovery.
The global system Firewall State Policy is Interface Bound States. All firewall rules have State Policy of "Use global default" and Gateway set to Default.
What am I missing?
EDIT: Added a list of established states from the /24 LAN-side subnets that are still present on the failover interface more than 12 hours after the last failback. External addresses are shown as "xxx". 192.168.95.2 is the pfSense's address on the subnet connecting it to the upstream failover router.
[24.03-RELEASE][admin@pfSense.home.arpa]/root: pfctl -i igc1.95 -s state | sed -n '/192.168.[^9].*ESTABLISHED:E/s/\(> [0-9.]*\)/> xxx/p'
igc1.95 tcp 192.168.95.2:6180 (192.168.15.6:52450) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:6941 (192.168.15.157:4602) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:28372 (192.168.15.157:4605) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:16139 (192.168.15.157:4611) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:23384 (192.168.15.6:57660) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:27280 (192.168.15.174:59649) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:47178 (192.168.15.174:51165) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:21549 (192.168.15.6:57676) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:43734 (192.168.15.157:4631) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:32178 (192.168.15.157:4647) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:52273 (192.168.15.6:60880) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:24318 (192.168.15.157:4671) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:46721 (192.168.15.65:45178) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:56547 (192.168.15.65:56328) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:41714 (192.168.15.65:56332) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:42779 (192.168.15.65:51332) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:37831 (192.168.15.65:51346) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:15242 (192.168.15.65:37692) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:55777 (192.168.15.65:37706) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:10516 (192.168.15.65:51276) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:52926 (192.168.15.65:51282) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:49583 (192.168.15.65:34458) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:26432 (192.168.15.65:34462) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:11211 (192.168.15.157:4725) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:29036 (192.168.15.6:60804) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:3413 (192.168.15.65:56162) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:29568 (192.168.15.65:56170) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:60554 (192.168.15.6:57450) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:17424 (192.168.15.6:50012) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:58224 (192.168.15.6:60396) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:15079 (192.168.15.6:53064) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:11498 (192.168.15.65:49240) -> xxx:5228 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:64186 (192.168.15.174:18112) -> xxx:5228 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:8209 (192.168.15.174:65238) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:58529 (192.168.15.6:54690) -> xxx:5228 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:48435 (192.168.40.254:46213) -> xxx:2350 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:38699 (192.168.20.155:52936) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:56630 (192.168.20.155:53746) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:28937 (192.168.20.155:53748) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:8718 (192.168.20.155:53750) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:12523 (192.168.20.155:53752) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:38445 (192.168.20.155:54950) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:38223 (192.168.20.155:53622) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:42014 (192.168.15.157:4864) -> xxx:993 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:55926 (192.168.15.157:4929) -> xxx:5228 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:7759 (192.168.15.174:47377) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:57014 (192.168.15.174:47971) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:60090 (192.168.15.65:39242) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:8218 (192.168.15.174:54320) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:45767 (192.168.15.174:65171) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:34306 (192.168.15.157:4960) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:30221 (192.168.15.174:28172) -> xxx:5228 ESTABLISHED:ESTABLISHED
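For anyone wanting to summarize a listing like this rather than read it line by line, here is a minimal Python sketch that counts leftover NATed states per internal host. It assumes the output of pfctl -s state has been saved as text with one state per line; the function name count_states_by_host and the regular expression are illustrative only and not part of pfSense.

```python
import re
from collections import Counter

# Matches NATed state lines in the format shown in this post, e.g.
# igc1.95 tcp 192.168.95.2:6180 (192.168.15.6:52450) -> xxx:443 ESTABLISHED:ESTABLISHED
# The pattern is an assumption based on that format only.
STATE_RE = re.compile(
    r"^(?P<iface>\S+)\s+(?P<proto>\S+)\s+\S+\s+"
    r"\((?P<src>[0-9.]+):\d+\)\s+->\s+\S+:(?P<dport>\d+)\s+(?P<state>\S+)$"
)


def count_states_by_host(pfctl_output: str) -> Counter:
    """Count NATed states per internal (pre-NAT) source address."""
    counts = Counter()
    for line in pfctl_output.splitlines():
        match = STATE_RE.match(line.strip())
        if match:  # lines without a "(lan_ip:port)" part are skipped
            counts[match.group("src")] += 1
    return counts


# Three lines copied from the listing above.
sample = """\
igc1.95 tcp 192.168.95.2:6180 (192.168.15.6:52450) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:6941 (192.168.15.157:4602) -> xxx:443 ESTABLISHED:ESTABLISHED
igc1.95 tcp 192.168.95.2:28372 (192.168.15.157:4605) -> xxx:443 ESTABLISHED:ESTABLISHED
"""

for host, n in count_states_by_host(sample).most_common():
    print(host, n)
```

Run against the full listing, this would show which LAN hosts still hold connections on the failover interface.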
-
EDIT: Filed as #15694 on redmine.
@stephenw10, apologies for the tag, but could you do a quick review of this for sanity and pilot error?
TL;DR is that the setting to kill states on lower priority gateways after failback is not respected, both for "automatic" and gateway group failover configurations. I have a simple setup: one primary that I want to use whenever available and one secondary that I want to use only when the primary is not available.
If there's nothing obviously wrong with the setup/tests of the first two posts, I'll file a bug report.
Thanks.
-
Thanks for the feedback. Please test again with this patch applied using the System Patches package:
<removed>
-
@marcosm thanks for the speedy patch! Seems to have fixed the issue overall, but I'm still seeing an odd DNS and ICMP issue. Will get to that in a moment.
I did the following with the primary and secondary configured as a gateway group.
- applied the patch
- killed all states with the primary active, waited several minutes, listed the states on the secondary via
pfctl -i igc1.95 -s state
(first snippet below)
- forced a failover by unplugging the primary for 5 minutes
- re-plugged the primary, waited 5 minutes, listed the states on the secondary via
pfctl -i igc1.95 -s state
(second snippet below)
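The before/after comparison in these steps can also be done mechanically. The following is a rough Python sketch, assuming both pfctl listings were saved as text with one state per line. The helper name state_keys and the choice of comparison key are assumptions: NAT source ports and ICMP ids change between runs, so they are left out of the key.

```python
import re

# Parses lines in the format shown in this thread, with or without the
# "(lan_ip:port)" NAT part. The pattern is an assumption based on that format.
LINE_RE = re.compile(
    r"^\S+\s+(?P<proto>\S+)\s+(?P<fw>[0-9.]+):\d+"
    r"(?:\s+\((?P<lan>[0-9.]+):\d+\))?"
    r"\s+->\s+(?P<dst>[0-9.]+):(?P<dport>\d+)\s+\S+$"
)


def state_keys(pfctl_output):
    """Reduce each state to (proto, origin, destination, destination port)."""
    keys = set()
    for line in pfctl_output.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Origin is the LAN host if the state is NATed, else the firewall itself.
        origin = m.group("lan") or m.group("fw")
        # For ICMP the "port" is an id that changes per session, so drop it.
        dport = None if m.group("proto") == "icmp" else m.group("dport")
        keys.add((m.group("proto"), origin, m.group("dst"), dport))
    return keys


# Lines taken from the two snippets in this post (before and after failback).
before = """\
igc1.95 icmp 192.168.95.2:33486 -> 1.1.1.1:33486 0:0
igc1.95 icmp 192.168.95.2:2490 (192.168.99.65:15220) -> 1.1.1.1:2490 0:0
igc1.95 udp 192.168.95.2:32580 (192.168.99.65:42896) -> 1.1.1.1:53 MULTIPLE:SINGLE
"""
after = """\
igc1.95 icmp 192.168.95.2:42051 -> 1.1.1.1:42051 0:0
igc1.95 icmp 192.168.95.2:43412 (192.168.99.65:15220) -> 1.1.1.1:43412 0:0
igc1.95 tcp 192.168.95.2:17522 -> 1.1.1.1:853 FIN_WAIT_2:FIN_WAIT_2
igc1.95 udp 192.168.95.2:46714 (192.168.99.65:55869) -> 1.1.1.1:53 MULTIPLE:SINGLE
"""

new_flows = state_keys(after) - state_keys(before)
gone_flows = state_keys(before) - state_keys(after)
print("new after failback:", sorted(new_flows))
print("gone after failback:", sorted(gone_flows))
```

With these sample lines, the only flow that appears after failback is the DoT connection from the firewall to 1.1.1.1 on port 853.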
Good news is that the list of states on the secondary is essentially the same before and after. Prior to the patch, there were hundreds of states remaining on the secondary interface after failback (smallish network, usually running at ~800 states total). The ICMP from 192.168.95.2 to 1.1.1.1 is gateway monitoring for the secondary interface.
The unexpected things are ping and DNS/DoT related: (a) ping from devices behind the pfSense being routed via the secondary when the primary is up, (b) DNS states on the secondary before and after failover/failback for 192.168.99.65, and (c) DoT states on the secondary after failback.
192.168.99.65 is a device on a /24+VLAN that has only that device at 192.168.99.65 and the pfSense at 192.168.99.1. Rules for that subnet+VLAN all use the Default gateway. 192.168.99.65 uses Cloudflare for ping and DNS. All other subnets/VLANs are configured for DoT via unbound (pfSense DNS Resolver) against Cloudflare and Google.
When the primary is up, I'd expect all pings, DNS and DoT to go through the primary interface (excluding gateway monitoring). However, at least some of this traffic seems to go through the secondary even when the primary is active.
The DoT states show as FIN_WAIT because the lookups happen so quickly. I can trigger them at will, just can't get them to show as ESTABLISHED in the pfctl output because the connections are so short-lived and I have to switch windows after triggering the lookup.
igc1.95 icmp 192.168.95.2:33486 -> 1.1.1.1:33486 0:0
igc1.95 icmp 192.168.95.2:2490 (192.168.99.65:15220) -> 1.1.1.1:2490 0:0
igc1.95 udp 192.168.95.2:32580 (192.168.99.65:42896) -> 1.1.1.1:53 MULTIPLE:SINGLE
igc1.95 udp 192.168.95.2:17440 (192.168.99.65:60904) -> 1.1.1.1:53 MULTIPLE:SINGLE
igc1.95 udp 192.168.95.2:47480 (192.168.99.65:59636) -> 1.1.1.1:53 MULTIPLE:SINGLE

igc1.95 icmp 192.168.95.2:42051 -> 1.1.1.1:42051 0:0
igc1.95 icmp 192.168.95.2:43412 (192.168.99.65:15220) -> 1.1.1.1:43412 0:0
igc1.95 tcp 192.168.95.2:17522 -> 1.1.1.1:853 FIN_WAIT_2:FIN_WAIT_2
igc1.95 udp 192.168.95.2:46714 (192.168.99.65:55869) -> 1.1.1.1:53 MULTIPLE:SINGLE
igc1.95 udp 192.168.95.2:61749 (192.168.99.65:40048) -> 1.1.1.1:53 MULTIPLE:SINGLE
igc1.95 udp 192.168.95.2:61350 (192.168.99.65:57325) -> 1.1.1.1:53 MULTIPLE:SINGLE
igc1.95 udp 192.168.95.2:40557 (192.168.99.65:45178) -> 1.1.1.1:53 MULTIPLE:SINGLE
-
When you use an IP for gateway monitoring, a route is created for it via the gateway.
-
@marcosm said in Failback state killing with "Automatic" failover?:
When you use an IP for gateway monitoring, a route is created for it via the gateway.
Got it. I switched the secondary monitoring address to 9.9.9.9 since I don't use Quad9 for DNS resolution. The extra states on the secondary, while the primary is up, disappeared. Thanks!
[24.03-RELEASE][admin@pfSense.home.arpa]/root: pfctl -i igc1.95 -s state
igc1.95 icmp 192.168.95.2:24256 -> 9.9.9.9:24256 0:0
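Since a monitoring IP gets its own route via the monitored gateway, any service address that matches a monitor IP will follow that gateway regardless of failover state. A small Python sketch of that check follows. The function name pinned_services is hypothetical, and 8.8.8.8 stands in for the Google resolver, whose exact address is not given in this thread.

```python
import ipaddress


def pinned_services(monitor_ips, service_ips):
    """Return {gateway: monitor_ip} for monitor IPs that are also service addresses.

    Traffic to such an address would follow the monitoring route via that
    gateway, whichever gateway is currently the default.
    """
    services = {ipaddress.ip_address(ip) for ip in service_ips}
    return {
        gateway: ip
        for gateway, ip in monitor_ips.items()
        if ipaddress.ip_address(ip) in services
    }


# DNS targets: 1.1.1.1 is from the thread, 8.8.8.8 is an assumed Google address.
dns_servers = ["1.1.1.1", "8.8.8.8"]

# Secondary gateway monitored via 1.1.1.1, as in the original setup.
print(pinned_services({"E1V95_LTEGW": "1.1.1.1"}, dns_servers))

# Secondary gateway monitored via 9.9.9.9, as after the change.
print(pinned_services({"E1V95_LTEGW": "9.9.9.9"}, dns_servers))
```

The first call reports the overlap on 1.1.1.1 and the second reports none, which is consistent with the extra states disappearing after the switch to 9.9.9.9.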
-