How to configure failback for WAN1 up
- 
 I think the best way to do this is to make sure the traffic that needs to fail over properly is on its own VLAN, so that when we need to kill states it can be done for the entire interface rather than trying to pick and choose IP addresses. This could possibly include shutting the interface down and starting it back up again, unless that does not kill the states. I like the idea of an entire subnet or VLAN because I am pretty sure the command would be much simpler to run when it affects an entire subnet.
- 
 I posted a bounty for this. If no one steps up, I am looking at engaging someone to create a script of some kind (cron etc.) to get this going. If anyone is interested, please contact me. I am also looking at Upwork if no one bites here on the pfSense forum.
- 
 I am able to create the same issue, running a clean install of v2.3.1.
 2 WANs set up in a gateway group called "failover":
 WAN1 - tier 1
 WAN2 - tier 2
 LAN firewall rule specifying the gateway as the failover gateway group.

 If WAN1 goes down, all traffic fails over to WAN2 as expected. You can see this in Diag > States and can confirm it by doing a traceroute from any LAN device. When WAN1 comes back up (Status > Gateways confirms "online"), some states remain on WAN2. Standard HTTP traffic will revert to WAN1 within a few minutes. However, traffic such as VoIP/SIP remains on WAN2, and the Diag > States table confirms this. It can be left for 8hrs+ and remains the same. It's a pain having our VoIP traffic sent over the wrong WAN for any length of time.

 Hello everyone, first post here, glad to be now an active member of the community :)
 I've had a similar problem for three days now, and after searching about it, I found this (see post #21 from Chris). Maybe that's what you are running into. But the similarity ends here: my problem is that none of the traffic is sent back to WAN1 after it recovers :-\ A few explanations:

 My hardware setup: Alix board with three Gb Ethernet interfaces: re0 as WAN1, re1 as LAN, re2 as WAN2. I have a 300Mb fiber connection (from the French ISP SFR) on WAN1, connected to the fiber > Ethernet adapter provided by the ISP, with DHCP enabled so that I get my IP from them (so there is no other router between my pfSense appliance and their network). I also borrowed a 10Mb/s 4G connection (provided by another French ISP, Bouygues) at the office on a D-Link router, connected to my WAN2 interface, also as DHCP (so it gets its IP from the D-Link router).

 My software setup: pfSense 2.3.1. WAN1 and WAN2 are bonded in a Gateway Group, WAN1 as Tier 1, WAN2 as Tier 2, trigger: member down. WAN1 is the default gateway. Both monitoring IPs are external (Google DNS). The DNS servers under "System > General Setup" are a mix of Google DNS and OpenDNS, and I checked "Do not use DNS forwarder or resolver as DNS server for the firewall". I added a rule on the LAN interface to use the Gateway Group as the default gateway.

 The problem: under normal operation, all the traffic is routed through WAN1, no problem. If I unplug WAN1, the traffic is routed through WAN2, again no problem. But if I re-plug WAN1, the traffic never goes back to the Tier 1 gateway, even after hours, even after resetting the states under "Diagnostics > States". The only way to get it back to WAN1 is to unplug WAN2. It concerns all traffic (HTTP, HTTPS, VoIP, …) on every device (my computer, the smartphones, my media box, my home server).

 For information, my WAN1 interface is up, the gateway's external monitoring IP is reachable, and the DNS servers are responding.

 Did I miss something? I think I made a mistake somewhere, but after hours of research I cannot point it out... Or does anyone run into the same problem?
 Thanks everyone, and sorry about my English, it is not my native language :-[
- 
 Also, I have been doing some research. I am no Linux or FreeBSD admin, but I did find this, and there must be some way to script it: check which interface the states are connected to, and if both interfaces are up/up and the states are still connected via the failover interface, execute pfctl -k xxx.xxx.xxx.xxx/xxx to force the states to return to their primary (tier 1) interface.

 https://www.freebsd.org/cgi/man.cgi?query=pfctl&sektion=8

 From the shell I ran the following commands, and they look to have killed all the states related to either the IP or the subnet:

     pfctl -k 10.20.30.115
     pfctl -k 10.20.30.0/24

 NAME
     pfctl – control the packet filter (PF) device

 SYNOPSIS
     pfctl [-AdeghmNnOPqRrvz] [-a anchor] [-D macro= value] [-F modifier]
           [-f file] [-i interface] [-K host | network] [-k host | network |
           label | id] [-o level] [-p device] [-s modifier] [-t table -T
           command [address …]] [-x level]

 -k host | network | label | id
     Kill all of the state entries matching the specified host,
     network, label, or id.

     For example, to kill all of the state entries originating from
     ``host'':

         # pfctl -k host

     A second -k host or -k network option may be specified, which
     will kill all the state entries from the first host/network to
     the second. To kill all of the state entries from ``host1'' to
     ``host2'':

         # pfctl -k host1 -k host2

     To kill all states originating from 192.168.1.0/24 to
     172.16.0.0/16:

         # pfctl -k 192.168.1.0/24 -k 172.16.0.0/16

     A network prefix length of 0 can be used as a wildcard. To kill
     all states with the target ``host2'':

         # pfctl -k 0.0.0.0/0 -k host2

     It is also possible to kill states by rule label or state ID. In
     this mode the first -k argument is used to specify the type of
     the second argument. The following command would kill all states
     that have been created from rules carrying the label ``foobar'':

         # pfctl -k label -k foobar

     To kill one specific state by its unique state ID (as shown by
     pfctl -s state -vv), use the id modifier and as a second argument
     the state ID and optional creator ID. To kill a state with ID
     4823e84500000003 use:

         # pfctl -k id -k 4823e84500000003

     To kill a state with ID 4823e84500000018 created from a backup
     firewall with hostid 00000002 use:

         # pfctl -k id -k 4823e84500000018/2
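
 The "are states still connected via the failover interface" check described in this post could be sketched as a small filter over a text listing of the state table. This is only a minimal sketch: the function name count_states_on is hypothetical, and it assumes each line starts with the interface name and shows any NAT-ed internal address in parentheses, as in the state lines quoted elsewhere in this thread.

```shell
#!/bin/sh
# count_states_on IFACE PREFIX
# Reads state-table lines on stdin and prints how many of them are on
# IFACE (first column) and contain an address field starting with PREFIX.
# Assumption: the first whitespace-separated field is the interface name.
count_states_on() {
    awk -v ifc="$1" -v pfx="$2" '
        $1 == ifc {
            for (i = 2; i <= NF; i++) {
                f = $i
                sub(/^\(/, "", f)   # NAT-ed internal address is shown as (addr:port)
                if (index(f, pfx) == 1) { n++; break }
            }
        }
        END { print n + 0 }
    '
}
```

 On the firewall the input would presumably come from pfctl -s state, for example pfctl -s state | count_states_on igb2 10.10.30. (with igb2 and the prefix as placeholders). Whether the first column there matches the name you pass in is worth checking by eye before relying on the count.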
- 
 I posted a job on upwork since I am not getting any takers on the pfsense Bounty page… 
- 
 Had script created… Have not tested yet though.... https://forum.pfsense.org/index.php?topic=113643.0 
- 
 Just to show you what I have posted on Redmine bug #5090:

 In simple terms, take a VoIP/SIP phone service. If a connection fails over from the primary WAN1 connection to a secondary WAN2 connection, at what point should that VoIP/SIP connection be expected to fall back onto the WAN1 connection when it becomes available again? Are you saying that with state killing on failback it would move these sessions immediately?
 Or how long would/should the state remain open on the WAN2 connection?

 We are currently having real problems with this on 2 client sites set up as follows:

 WAN1 - ADSL connection just used for VoIP traffic
 WAN2 - EFM higher bandwidth connection used for all internet access, VPN etc.

 Gateway group named "EFMFirst"
 WAN2 EFM - Tier 1
 WAN1 ADSL - Tier 2

 Gateway group named "DSLFirst"
 WAN1 ADSL - Tier 1
 WAN2 EFM - Tier 2

 Firewall Rules for Voice network:
 Traffic set to Gateway: DSLFirst

 Firewall Rules for LAN network:
 Traffic set to Gateway: EFMFirst

 The problem is that if the ADSL line drops, the VoIP traffic goes onto the EFM connection. Because of the other traffic on that line the bandwidth is not enough, so we can get call quality issues. That is acceptable for a short period of time (better to have some phone service than none at all). When the ADSL line comes back online (Status > Gateways confirms this), the VoIP traffic stays on the EFM connection. Looking at the state table you can see the TCP & UDP traffic stuck to WAN2. It can be left for 24hrs and the VoIP traffic will still be on the wrong WAN. It never moves back onto the ADSL connection where it should be, so the call quality issues remain due to the lack of bandwidth.

 What would you suggest, is this truly not a bug?
 Is there not something that can force the states to re-associate with the firewall rule, and therefore the correct WAN gateway, after a specified period of time perhaps?

 Also, if you kill the 2 states for each VoIP phone in the Diagnostics > States section, they reappear straight away on the same ports and interfaces as before.
 This is done by filtering the state list by the IP address of the device. You can then see both UDP states (one on the internal network & one on the WAN). Then press the "Kill States" button. This removes the 2 states very briefly, but then they reappear, still on the wrong WAN interface.
 They have definitely been cleared, since the byte count drops back to 0KB and starts counting again.
 Surely clearing the state should have forced it to reconnect and follow the current rule and gateway group to the correct gateway??
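
 The condition being asked for here (primary WAN back online, yet states still on the backup WAN) can be written down as a tiny decision helper that a periodic job could call. This is a sketch only: should_fail_back is a hypothetical name, the status string "online" is taken from what Status > Gateways reports in this thread, and how the two inputs are obtained on a real firewall is deliberately left out.

```shell
#!/bin/sh
# should_fail_back PRIMARY_STATUS BACKUP_STATE_COUNT
# Succeeds (exit status 0) only when the primary gateway is reported
# online AND at least one state is still sitting on the backup WAN.
should_fail_back() {
    status=$1
    count=$2
    [ "$status" = "online" ] && [ "$count" -gt 0 ]
}

# Placeholder inputs for illustration; on a real box these would come
# from the gateway status and the state table respectively.
if should_fail_back online 8; then
    echo "primary is back and states remain on the backup WAN: kill them"
fi
```

 Keeping the decision separate from the kill command makes it easy to run the check from cron and log what it would do before letting it touch any states.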
- 
 Not sure why you are having an issue. Possibly it is due to your 2 rules for the same traffic. I have all my phones on a VLAN and there is only one rule, the default, which uses the gateway group. If I go and kill all the states for the phones on the backup interface, they reconnect via the correct gateway. Again, maybe something is amiss in your setup.
- 
 Yeah, I too have all the phones in a VLAN.
 The Voice Network (VLAN30) has a rule which sends traffic over the Gateway group "DSLFirst"
 The LAN Network (VLAN1) has a rule which sends traffic over the Gateway group "EFMFirst"

 That is why I have 2 rules: I have 2 networks, each using a different WAN as its primary/default. I have found that killing states works very occasionally, but most of the time the states stay open on the wrong WAN.
- 
 Do you have one rule for each interface, or 2 rules for each interface?
- 
 1 rule for each interface.
- 
 From SSH or from the GUI, try running the following command:

     pfctl -i igb0 -k 192.168.65.0/24

 where igbX is your backup interface and the subnet is the one used by your phones.
- 
 My backup WAN interface is called WAN_EFM.
 My Voice network is on 10.10.30.0/24

 I ran pfctl -i WAN_EFM -k 10.10.30.0/24 and I got the result:
 killed 0 states from 1 sources and 0 destinations.

 Yet if I look at the state table, select the Interface as WAN_EFM, and set the Filter expression to 10.10.30, I can see a whole list of UDP states, one for each phone.
 If I look at the WAN_DSL interface there are no states open for the phones.

 I'll print an output of the states below with the WAN & PBX IPs masked.

 WAN_EFM udp 135.196.xxx.xxx:42190 (10.10.30.39:14079) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.346 K / 7.021 K 4.39 MiB / 2.71 MiB
 WAN_EFM udp 135.196.xxx.xxx:9175 (10.10.30.49:58472) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.379 K / 7.045 K 4.42 MiB / 2.71 MiB
 WAN_EFM udp 135.196.xxx.xxx:47285 (10.10.30.42:25810) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.453 K / 7.131 K 4.48 MiB / 2.76 MiB
 WAN_EFM udp 135.196.xxx.xxx:20572 (10.10.30.53:59061) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.453 K / 7.125 K 4.48 MiB / 2.76 MiB
 WAN_EFM udp 135.196.xxx.xxx:4430 (10.10.30.40:12615) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.428 K / 7.106 K 4.46 MiB / 2.74 MiB
 WAN_EFM udp 135.196.xxx.xxx:25173 (10.10.30.38:50089) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.433 K / 7.111 K 4.46 MiB / 2.75 MiB
 WAN_EFM udp 135.196.xxx.xxx:36676 (10.10.30.5:57001) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.438 K / 7.093 K 4.24 MiB / 2.74 MiB
 WAN_EFM udp 135.196.xxx.xxx:20383 (10.10.30.26:12710) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 8.817 K / 8.472 K 5.27 MiB / 3.68 MiB
- 
 What if you use the actual interface instead of the label? 
- 
 pfctl -i igb2 -k 10.10.30.0/24 gives me:
 killed 0 states from 1 sources and 0 destinations

 pfctl -i opt1 -k 10.10.30.0/24 gives me:
 killed 0 states from 1 sources and 0 destinations

 I don't get it, because the EFM connection is on the physical interface igb2. Status > Interfaces gives me this:

 WAN_EFM Interface (opt1, igb2)
 Status: up
 MAC Address: 00:0d:b9:xx:xx:xx
 IPv4 Address: 135.196.xxx.xxx
 Subnet mask IPv4: 255.255.255.252
 Gateway IPv4: 135.196.xxx.xxx
 IPv6 Link Local: fe80::xxx:xxx:fe41:73f6%igb2
 MTU: 1500
 Media: 100baseTX <full-duplex>
 In/out packets: 70894297/45691236 (43.12 GiB/17.40 GiB)
 In/out packets (pass): 70894297/45691236 (43.12 GiB/17.40 GiB)

 Yet the state table is still full of states on the WAN_EFM connection and there are none on WAN_DSL, where the traffic should be going because WAN_DSL is Tier 1 in the Gateway group.
- 
 I have just reset the whole firewall state table from Diagnostics > States > Reset States. This has made no difference: connections are still on WAN_EFM even though WAN_ADSL is showing up and online.
- 
 Try removing -i and the interface. Be aware this may kill all connections for the subnet to both interfaces 
- 
 In your gateway group the 2 interfaces are on different tiers? Or same tier? 
- 
 Maybe specify the IP of the endpoint. You might have to run 2 commands, one to and one from the IPs. Based on the state below it makes sense that no states were killed: on WAN_EFM the state is listed with the WAN address as its source and the phone address only in parentheses, so a kill keyed on the phone subnet alone may not match it.

 WAN_EFM udp 135.196.xxx.xxx:42190 (10.10.30.39:14079) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.346 K / 7.021 K 4.39 MiB / 2.71 MiB

     pfctl -i igb0 -k 192.168.65.0/24 -k 135.196.xxx.xxx
     pfctl -i igb0 -k IP of Voip System -k 192.168.65.0/24

 or

     pfctl -i igb0 -k 135.196.xxx.xxx
     pfctl -i igb0 -k 192.168.65.0/24

 -k host
     Kill all of the state entries originating from the specified
     host.

 -h  Help.

 -i interface
     Restrict the operation to the given interface.

 -k host
     Kill all of the state entries originating from the specified
     host. A second -k host option may be specified, which will kill
     all the state entries from the first host to the second host.
     For example, to kill all of the state entries originating from
     host:

         # pfctl -k host

     To kill all of the state entries from host1 to host2:

         # pfctl -k host1 -k host2
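
 The "both to and from" suggestion in this post can be sketched as a helper that prints the two pfctl invocations for review instead of running them. The name kill_cmds_both_ways is hypothetical and the interface, subnet, and server address are placeholders (185.83.0.1 stands in for the masked VoIP server); only the -i and -k options quoted from the man page are used, and nothing here has been checked against a live firewall.

```shell
#!/bin/sh
# kill_cmds_both_ways IFACE SRC DST
# Prints (does not run) one pfctl command per direction:
#   states from SRC to DST, then states from DST to SRC.
kill_cmds_both_ways() {
    iface=$1
    src=$2
    dst=$3
    printf 'pfctl -i %s -k %s -k %s\n' "$iface" "$src" "$dst"
    printf 'pfctl -i %s -k %s -k %s\n' "$iface" "$dst" "$src"
}

# Example with placeholder values only.
kill_cmds_both_ways igb2 10.10.30.0/24 185.83.0.1
```

 Printing first and executing later keeps a typo in the subnet from dropping unrelated connections; once the two lines look right they can be pasted into the shell.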
- 
 Yes, my WAN gateways are on different tiers.
 To confirm a couple of things:

 My WAN_EFM connection is on opt1, igb2
 My WAN_EFM connection IP is the one starting 135.196.xxx.xxx

 My WAN_ADSL connection is on wan, pppoe0
 My WAN_ADSL connection IP is the one starting 82.152.xxx.xxx

 My LAN Interface is on lan, igb1
 This has a network of 10.10.1.0/24
 It is used for general PC & Servers

 My 30VOICELAN is on opt3, igb1_vlan30
 It has a network of 10.10.30.0/24
 It is used for all VoIP phone devices

 My External VoIP Server is hosted in a datacenter and is the IP beginning 185.83.xxx.xxx

 Gateway group named "EFMFirst"
 Tier 1 - WAN_EFM
 Tier 2 - WAN_ADSL

 Gateway group named "DSLFirst"
 Tier 1 - WAN_ADSL
 Tier 2 - WAN_EFM

 Firewall Rules for LAN network:
 Traffic set to Gateway: EFMFirst

 Firewall Rules for 30VOICELAN network:
 Traffic set to Gateway: DSLFirst

 If the WAN_ADSL connection goes down, the state table confirms that the states for the voice traffic are now going over the WAN_EFM connection (135.196.xxx.xxx). When the WAN_ADSL connection comes back UP, none of the states ever return to the WAN_ADSL connection.

 If you reset the firewall state table, all the states go back to the correct paths (LAN devices over the EFM connection and VOICELAN devices over the DSL connection)!! Resetting the firewall state table is a bit overkill, since it kills all the states on every device/connection. Shouldn't I be able to kill just the states of the 30VOICELAN devices which are going over the wrong connection (WAN_EFM)?

 Interestingly, yesterday I connected a brand new VoIP phone to the network (after having the WAN_ADSL connection down earlier that day). It connected to my hosted VoIP server through the WAN_EFM connection, even though the WAN_ADSL connection was UP and this device had never had any states on the router. Does this mean that when WAN_DSL came back up earlier that day (before I connected this new device), something in pfSense did not trigger the firewall rules/gateways to follow the correct path? The Gateway status always reports correctly: when a connection comes back UP the status reports Online, and vice versa.

 What command should I be running in pfctl to kill all of the states for devices on the 30VOICELAN network, to trigger the devices to register on the correct connection?

 If I run
```
pfctl -k 10.10.30.0/24
```
 or if I run
```
pfctl -i igb2 -k 10.10.30.0/24
```
 this tells me _0 states from 1 sources and 0 destinations_ have been killed.

 Yet if I look at the state table I can still see:

 WAN_EFM udp 135.196.xxx.xxx:29023 (10.10.30.11:38251) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 4.932 K / 4.71 K 2.94 MiB / 1.82 MiB
 WAN_EFM udp 135.196.xxx.xxx:2239 (10.10.30.54:37815) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.155 K / 4.679 K 3.07 MiB / 1.80 MiB
 WAN_EFM udp 135.196.xxx.xxx:44077 (10.10.30.46:26578) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.151 K / 4.675 K 3.07 MiB / 1.80 MiB
 WAN_EFM udp 135.196.xxx.xxx:10148 (10.10.30.22:22774) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.472 K / 4.954 K 3.26 MiB / 2.18 MiB
 30VOICELAN udp 185.83.xxx.xxx:5060 -> 10.10.30.25:41959 MULTIPLE:MULTIPLE 309 / 321 138 KiB / 196 KiB
 30VOICELAN udp 185.83.xxx.xxx:5060 -> 10.10.30.11:38251 MULTIPLE:MULTIPLE 252 / 263 99 KiB / 161 KiB
 30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.52:52783 MULTIPLE:MULTIPLE 266 / 254 163 KiB / 101 KiB
 30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.38:39870 MULTIPLE:MULTIPLE 264 / 252 161 KiB / 99 KiB
 30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.49:20139 MULTIPLE:MULTIPLE 264 / 252 161 KiB / 99 KiB

 Note the above is just a sample of the state table; there are essentially 2 states for every VoIP device (one showing on the WAN_EFM side and one showing on the 30VOICELAN side).

 What pfctl command should I be using to force all of these states to go back to the correct connections? The **Reset the firewall state table** command does the job but is not targeted enough.

 Why does a newly attached device go over the wrong WAN (following an earlier disconnection/reconnection) until the firewall state table is reset? Is this a clue as to what's going on?

 I hope that gives enough information…. :)

 Thanks
 James
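
 One way to get something more targeted than a full reset might be to pull the internal phone addresses out of the state listing for the wrong WAN and kill per host, using the single-address form of pfctl -k that an earlier post reported as working. A minimal sketch under assumptions: natted_hosts_on and kill_cmds_for_hosts are hypothetical names, the input format is assumed to match the state lines quoted in this post, the sample addresses are placeholders, and the commands are only printed, not run.

```shell
#!/bin/sh
# natted_hosts_on IFACE PREFIX
# Reads state-table lines on stdin. For lines whose first column is IFACE,
# prints each distinct internal address (the "(addr:port)" field with the
# parentheses and port removed) that starts with PREFIX.
natted_hosts_on() {
    awk -v ifc="$1" -v pfx="$2" '
        $1 == ifc {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^\(.*\)$/) {
                    addr = substr($i, 2, length($i) - 2)   # drop "(" and ")"
                    sub(/:[0-9]+$/, "", addr)              # drop ":port"
                    if (index(addr, pfx) == 1 && !(addr in seen)) {
                        seen[addr] = 1
                        print addr
                    }
                }
            }
        }
    '
}

# Turns each address on stdin into a per-host kill command, for review only.
kill_cmds_for_hosts() {
    while read -r host; do
        if [ -n "$host" ]; then
            printf 'pfctl -k %s\n' "$host"
        fi
    done
}

# Demo on two made-up lines in the same layout as the listing above.
natted_hosts_on WAN_EFM 10.10.30. <<'EOF' | kill_cmds_for_hosts
WAN_EFM udp 135.196.0.1:29023 (10.10.30.11:38251) -> 185.83.0.1:5060 MULTIPLE:MULTIPLE
30VOICELAN udp 185.83.0.1:5060 <- 10.10.30.52:52783 MULTIPLE:MULTIPLE
EOF
```

 Whether the killed states then re-establish on the Tier 1 gateway is exactly the open question in this thread, so this only narrows what gets killed; it does not by itself explain why new states keep landing on WAN_EFM.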