Gateway group tier priority not being followed
I have two pfSense routers linked by two wireless links and a Pritunl VPN link. I am using gateway groups for failover. I have set WIFILink2 as the tier 1 member, WIFILink1 as the tier 2 member, and Pritunl VPN as tier 3. No matter how I set the tier 1 and tier 2 members, WIFILink1 is always the preferred link. I have deleted the groups on both routers and rebooted both routers. In Firewall Rules, the gateway group is set as the gateway for the appropriate subnets on both routers. None of these interfaces is the default gateway.
This used to work, and the problem seems to have started after upgrading pfSense to 2.4.0-RELEASE on both routers. I cannot be sure it is related to the upgrade since I do not check it regularly. I only noticed the issue the other day, after one set of radios that had been dead for months was replaced. There are a couple of static routes on the routers, but they point to other subnets. I have also looked via the CLI for odd routes and did not see anything significant.
I have searched the forum and Google but could not find any similar problems or solutions. Any thoughts or assistance to resolve this issue would be greatly appreciated.
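To make the expected behavior concrete, here is a minimal Python sketch of how I understand tier selection in a gateway group is supposed to work: the lowest-numbered tier that still has an online member is used. This is only a model of that rule, not pfSense code; the `Member` class and the `active_members` function are made-up names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One gateway group member (hypothetical model, not a pfSense structure)."""
    name: str
    tier: int
    online: bool = True


def active_members(members):
    """Return the names of the online members in the lowest-numbered tier."""
    online = [m for m in members if m.online]
    if not online:
        return []
    best_tier = min(m.tier for m in online)
    return [m.name for m in online if m.tier == best_tier]


# Tiers as described in this post.
group = [
    Member("WIFILink2", tier=1),
    Member("WIFILink1", tier=2),
    Member("PritunlVPN", tier=3),
]

print(active_members(group))
```

With all three links up I would expect only WIFILink2 to carry traffic, and WIFILink1 to take over only when WIFILink2 is marked down. That is not what I am seeing.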
Are WIFILink1 and WIFILink2 using different subnets and gateways?
The only way I could see it breaking as you describe is if both gateways were the same, which isn't supported because the behavior is unpredictable, as you see here.
They are using different subnets and gateways. On one router WIFILink1 is 192.168.9.2/29 with a gateway of 192.168.9.1 and WIFILink2 is 192.168.9.10/29 gw 192.168.9.9.
That should be OK then, assuming there isn't a VIP or something else in the routing table declaring that as a /24 or some other larger subnet that contains both.
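If you want to sanity-check that, a small Python sketch with the standard `ipaddress` module will do it. The addresses are the ones you posted; the `192.168.9.0/24` entry is only a hypothetical example of a wider VIP or route, not something taken from your config, and the helper names are made up for illustration.

```python
import ipaddress

# Interface addresses as posted for router1.
link1 = ipaddress.ip_interface("192.168.9.2/29")
link2 = ipaddress.ip_interface("192.168.9.10/29")

# Gateways as posted.
gw1 = ipaddress.ip_address("192.168.9.1")
gw2 = ipaddress.ip_address("192.168.9.9")


def same_subnet(a, b):
    """True if the two interface networks overlap."""
    return a.network.overlaps(b.network)


def covers_both(candidate, a, b):
    """True if a candidate network contains both interface addresses."""
    net = ipaddress.ip_network(candidate)
    return a.ip in net and b.ip in net


print(link1.network, link2.network)
print("links share a subnet:", same_subnet(link1, link2))
print("gateways on-link:", gw1 in link1.network, gw2 in link2.network)

# Hypothetical wider entry (for example a VIP defined as a /24).
print("a /24 would cover both:", covers_both("192.168.9.0/24", link1, link2))
```

The two /29s themselves do not overlap, so the thing to look for in the routing table is any entry wide enough to contain both of them.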
Next step would be to post screenshots of your gateways, gateway groups, and LAN rules that show the gateway groups being used.
See below for screenshots of router1.
I have done some more testing and narrowed it down. From a PC on router1's LAN, downloads use WIFILink1 and uploads use WIFILink2. So I changed the firewall rules on both routers to use only WIFI2_GW instead of the gateway group.
Some traffic is still using WIFILink1, and I am not sure how. See the traffic graphs on router1 after I disabled the WIFILink1 interface and then enabled it on router2, with the above rules to use WIFI2_GW and WIFI_GW_2:
Maybe I am missing something in my settings or my understanding.