Multi-WAN, Multi-VLAN - gateway policy isn't working
So I'm a pfSense n00b, trying to set up a decently complex system. I'm having trouble getting certain VLANs to utilize different gateways. I'm hoping that some of you wizards will see the boneheaded mistake I can't see and have the kindness to show it to me! I'll describe the setup:
I have 2 WAN connections:
- primary (WAN), Cable internet (called AIA)
- secondary (OPTWAN), dual bonded T1s.
I have three LAN interfaces (on a trunk port from my L3 switch):
- one basic LAN,
- and two VLANs (20 and 30):
Each VLAN is attached to its own 192.168.x0.0/24 subnet running DHCP,
i.e., 192.168.20.xxx on VLAN20 and 192.168.30.xxx on VLAN30.
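For anyone following along, the addressing above can be sketched in Python. This is only an illustration of the layout described in the post, not anything pfSense runs; the name vlan_for is made up for the example, and the basic LAN subnet is left out because the post does not state it.

```python
import ipaddress

# Addressing taken from the post: VLAN20 and VLAN30 each get a /24.
VLAN_SUBNETS = {
    20: ipaddress.ip_network("192.168.20.0/24"),
    30: ipaddress.ip_network("192.168.30.0/24"),
}


def vlan_for(addr):
    """Return the VLAN ID whose subnet contains addr, or None if no VLAN matches."""
    ip = ipaddress.ip_address(addr)
    for vlan, net in VLAN_SUBNETS.items():
        if ip in net:
            return vlan
    return None


print(vlan_for("192.168.30.17"))
print(vlan_for("192.168.20.200"))
```

A client that ends up with an address outside both ranges (vlan_for returns None) would not match either VLAN's policy-routing rule.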
I have set up three gateway groups: one preferring each gateway, and one doing load balancing (LB).
I have set up a simple rule for the two VLANs; VLAN20 should use the AIA WAN, and VLAN30 should use the T1 WAN. Each should fail over to the other WAN.
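The intended behaviour of those three groups can be modelled roughly as below. This is a simplified sketch, assuming that a gateway group prefers the lowest tier number that still has a gateway marked up, and that gateways sharing a tier are balanced. The gateway names AIA_GW and T1_GW and the function pick_gateways are hypothetical, chosen only for the example.

```python
def pick_gateways(group, status):
    """Return the gateways a group would use right now.

    group:  dict mapping gateway name -> tier (1 = most preferred)
    status: dict mapping gateway name -> True if the gateway is considered up
    """
    up = {name: tier for name, tier in group.items() if status.get(name, False)}
    if not up:
        return []  # every member is down, nothing to route through
    best = min(up.values())
    return sorted(name for name, tier in up.items() if tier == best)


# Hypothetical names for the two WAN gateways described in the post.
PREFER_AIA = {"AIA_GW": 1, "T1_GW": 2}    # for VLAN20
PREFER_T1 = {"T1_GW": 1, "AIA_GW": 2}     # for VLAN30
LOAD_BALANCE = {"AIA_GW": 1, "T1_GW": 1}  # both on the same tier

both_up = {"AIA_GW": True, "T1_GW": True}
t1_down = {"AIA_GW": True, "T1_GW": False}

print(pick_gateways(PREFER_T1, both_up))
print(pick_gateways(PREFER_T1, t1_down))
print(pick_gateways(LOAD_BALANCE, both_up))
```

The point of the model: failover only happens if the status input is right. If a broken gateway is still reported as up, the group keeps sending traffic to it.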
With this setup, VLAN20 traffic is passed properly, but VLAN30 traffic times out. HOWEVER, I can ping the internet just fine from the VLAN30 interface through pfSense.
And, the system log shows traffic being passed by the rules.
Apparently, there is some setting allowing traffic through one LAN but not the other. However, the (default) firewall rules for the two WANs are identical. I checked "Allow default gateway switching" under Advanced, to no avail. I'm stumped.
What's the missing link? I'm guessing there's a firewall rule or route I don't have set up - but I can't grok it.
Anybody see it?
The VLAN20 traffic is directed to what happens to be your default gateway anyhow, so it will not be obvious if the policy-routing rule is being applied. You could pull the cable on WAN and see if VLAN20 traffic fails over.
Check Firewall->NAT->Outbound - maybe you enabled Manual Outbound NAT some time ago, and have since added the 2nd WAN. If so, you will be missing Outbound NAT rules on the 2nd WAN.
Maybe you have something in floating rules that is matching the traffic?
hmm… I checked the Outbound NAT - it shows Automatic mode, with no static mappings. The only floating rule is "allow icmp pings".
I pulled the WAN cable, and all traffic stopped. But I have verified the secondary WAN connection is up, and the gateway information is correct. The only difference I can see between the two WAN interfaces is that one is set as the system default. So I changed the default to the second WAN gateway - with no effect.
Thanks, Phil, that ruled a couple things out! Notice anything else?
Where can I look?
If you are just monitoring the local gateway of each WAN, then the system will think that both are up. But maybe the T1s link is not actually working all the way to the internet.
Maybe go to System->Routing, edit each gateway and put in a monitor IP out on the real internet (different for each gateway - e.g. 220.127.116.11 and 18.104.22.168). And/or on your Gateway Group settings, Trigger Level might be set to Member Down. In that case I think it will only consider a link down if the hardware signals on the cable actually disappear. Set it to "Packet Loss or High Latency".
Then see if T1s is reported as being up.
Maybe you will find that T1s is not really working, and now pfSense will detect that and fail it over to WAN.
Also we know that WAN (AIA) is really working. So pull the T1s cable to force T1s to be down. That really should make the VLAN30 traffic failover to WAN (AIA).
Yer a genius. I checked, and sure enough, the gateway showed as down. So I went back through, and found that the interface and the gateway were set to the same IP. :-[ Off by one digit, but enough to bork it.
Fixed, and everything started working. Awesome!!!
Moral of the story: double check your settings… and if the gateway is .42 and the IP is .41, make sure they are in the right order.
Thanks a million!
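A quick way to catch this kind of slip before it bites is to sanity-check each WAN's interface address against its gateway. Below is a small sketch using Python's ipaddress module. The function name check_wan is made up, and the 203.0.113.x addresses are documentation-range placeholders, not the real ones from this setup; only the .41/.42 endings come from the post.

```python
import ipaddress


def check_wan(interface_cidr, gateway):
    """Return a list of problems found with an interface/gateway pair."""
    iface = ipaddress.ip_interface(interface_cidr)
    gw = ipaddress.ip_address(gateway)
    problems = []
    if gw == iface.ip:
        problems.append("gateway equals the interface address")
    if gw not in iface.network:
        problems.append("gateway is outside the interface subnet")
    return problems


# Placeholder addresses: interface on .41, gateway on .42 (the correct order).
print(check_wan("203.0.113.41/29", "203.0.113.42"))
# The mistake from this thread: both set to the same address.
print(check_wan("203.0.113.42/29", "203.0.113.42"))
```

An empty list means the pair at least looks plausible. It does not prove the gateway is reachable, so a monitor IP out on the internet is still worth setting.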
Just because 42 is the meaning of life, the universe and everything, you still can't use it as the IP address of every device ;)
Glad you found the issue.
I couldn't resist looking up 22.214.171.124 - it is allocated to somewhere in South Korea. Shame they don't have a Hitchhiker's Guide site running there.