[SOLVED] Policy-Based Routing Not Consistently Going Out the Specified Gateway

Finger79

@Derelict:

You people who want to take a complicated setup like policy routing multiple openvpn connections then blame the software when it doesn't work yet obviously have no real grasp on what really needs to happen to make it work simply floor me.

In order for tagging and matching along the NO_WAN_EGRESS vein to work, EVERY packet that should go over the VPN must be tagged.

That is going to be a crap shoot without enabling "don't pull routes."

I think you're misunderstanding. The policy-based filtering floating rule works perfectly (matches the previously tagged packets). That's not what this thread is about.

@Derelict:

In order for tagging and matching along the NO_WAN_EGRESS vein to work, EVERY packet that should go over the VPN must be tagged.

Yep, no issues here. Tagging is working as expected.

@Derelict:

You people who want to take a complicated setup like policy routing multiple openvpn connections

The VPN connections are in one gateway group. The policy-route is set to send all traffic out the VPN gateway. That's working perfectly.

The only thing that's not consistently working is the policy-route for one device on the LAN to go out the normal WAN gateway.

Derelict

OK where is that rule in relation to all your other rules? You have yet to show that.

I was more commenting on the nonsense like this:

No, not DNS leaking, all traffic leaking such as through 80/443. Even though I have a "NO_WAN_EGRESS" policy-based filtering setup, it doesn't seem to work when I don't pull routes from the VPN provider. (Very not cool!)

It works perfectly when configured correctly.

Finger79

@Derelict:

OK where is that rule in relation to all your other rules? You have yet to show that.

I was more commenting on the nonsense like this:

No, not DNS leaking, all traffic leaking such as through 80/443. Even though I have a "NO_WAN_EGRESS" policy-based filtering setup, it doesn't seem to work when I don't pull routes from the VPN provider. (Very not cool!)

It works perfectly when configured correctly.

Screenshots of that posted earlier. The "Allow Web Traffic" rule sets the policy-based filtering tag "NO_WAN_EGRESS" and also sets the policy-based routing gateway to the VPN_Gateway.

The only time that traffic leaked out the naked WAN was when I told both client VPN connections to not pull routes. Then I got inconsistent results: some traffic went out the VPN, and other traffic went out the WAN. It was random.

Finger79

@luckman212:

So in your case you should probably set the default route of your pfSense to the VPN gateway, otherwise your DNS traffic will leak. Alternatively, you can set DHCP to hand out an upstream DNS server to LAN clients, e.g. 8.8.8.8.

DNS seems to work perfectly. Unbound sends all traffic through the VPN tunnels and never out the naked WAN. (That interface is unchecked.)

Derelict

You are all over the place. That rule routes traffic to PIA for, presumably, destination ports 80 and 443.

All other traffic will go out the default gateway (or the OpenVPN connection that happens to have been able to set the 0.0.0.0/1 and 128.0.0.0/1 routes, which, as you found in the other thread, will be the first OpenVPN connection without "don't pull routes" set that connects. The other one will receive errors when trying to set those routes).
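
To illustrate why the two /1 routes matter: a routing lookup prefers the most specific matching prefix, so 0.0.0.0/1 and 128.0.0.0/1 together win over the ordinary 0.0.0.0/0 default for every IPv4 destination. The sketch below is only a model of that lookup in Python. The table contents and the labels "WANGW" and "VPN" are illustrative assumptions, not output from a real pfSense box.

```python
import ipaddress

# Hypothetical routing table: (prefix, next-hop label).
# The labels are illustrative names, not real pfSense identifiers.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "WANGW"),   # ordinary default route
    (ipaddress.ip_network("0.0.0.0/1"), "VPN"),     # pulled from the VPN server
    (ipaddress.ip_network("128.0.0.0/1"), "VPN"),   # pulled from the VPN server
]


def next_hop(dst, routes=ROUTES):
    """Return the next hop of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routes if addr in net]
    # Longest prefix wins, as in a normal routing lookup.
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


# With the pulled routes present, both halves of the address space use the VPN.
print(next_hop("8.8.8.8"))
print(next_hop("203.0.113.10"))

# With "don't pull routes" set (only the default route left), WAN is used.
print(next_hop("8.8.8.8", ROUTES[:1]))
```

In this model, any firewall rule that passes traffic without naming a gateway inherits whatever this lookup returns.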

You are indicating there is a problem with some other host that is unable to go out WAN. Where is that rule in relation to all the other rules?

Not blocking out things that really don't matter might help people help you.

Finger79

@Derelict:

You are indicating there is a problem with some other host that is unable to go out WAN.

Negative. It's not unable to go out WAN. It just doesn't do it consistently.

@Derelict:

Where is that rule in relation to all the other rules?

Answered earlier:
@Finger79:

Here's some of the LAN rules (farther down the rule list) for most of the traffic. For example, the "Allow Web Traffic" rule sends all 80/443 traffic out the VPN gateway.

The .103 tablet exception rule matches first. It just doesn't consistently send traffic out the WAN. Sometimes it does, other times it doesn't. But the rule always fires.

Derelict

Is WANGW flapping?

System > Logs, Gateways

Finger79

@Derelict:

Is WANGW flapping?

System > Logs, Gateways

I don't see it mentioned anywhere in the Gateway logs.

I had to turn off WANGW gateway monitoring (meaning it's always considered "Up"). The dpinger pings may have pissed it off. I'll turn it back on and see if that helps.

Derelict

Right. If that was happening when you were seeing "random" routing then the same principles that make "NO_WAN_EGRESS" necessary would apply equally to WANGW if it was flagged as down. In that case you would need "NO_VPN_EGRESS."
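
A rough model of that principle, as a Python sketch. It assumes that a rule whose gateway is marked down gets loaded without that gateway (so the traffic follows the default route), and that an outbound floating rule on each interface blocks packets carrying a given tag. The function names, the default route being the VPN, and the "NO_VPN_EGRESS" floating rule are all assumptions made for illustration.

```python
def egress(rule_gateway, gateways_up, default_route):
    """Pick the gateway a connection leaves through.

    Assumption: a rule whose gateway is marked down is loaded without its
    gateway, so the traffic follows the default route instead.
    """
    if rule_gateway is not None and gateways_up.get(rule_gateway, False):
        return rule_gateway
    return default_route


# Hypothetical outbound floating block rules, keyed by egress gateway.
FLOATING_BLOCKS = {
    "WANGW": {"NO_WAN_EGRESS"},
    "VPN_Gateway": {"NO_VPN_EGRESS"},
}


def outcome(tag, rule_gateway, gateways_up, default_route="VPN_Gateway"):
    """Return the gateway actually used, or "blocked" if a tag rule fires."""
    out = egress(rule_gateway, gateways_up, default_route)
    if tag in FLOATING_BLOCKS.get(out, set()):
        return "blocked"
    return out


all_up = {"WANGW": True, "VPN_Gateway": True}
wan_down = {"WANGW": False, "VPN_Gateway": True}

# Tablet rule with WANGW healthy: goes out WAN as intended.
print(outcome("NO_VPN_EGRESS", "WANGW", all_up))
# WANGW flagged down, no tag: silently falls through to the VPN.
print(outcome(None, "WANGW", wan_down))
# WANGW flagged down, tagged: blocked instead of leaking out the VPN.
print(outcome("NO_VPN_EGRESS", "WANGW", wan_down))
```

The tag does not fix the routing. It only turns a silent fall-through into a visible block.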

Finger79

I get that. Fortunately, gateway flapping appears to not be the reason behind this.

When I do a fresh reboot of pfSense, the tablet traffic consistently goes out the WAN, as expected. Some time later (or some event later), it decides to go out the VPN only, even though the rule still fires that specifies WANGW. It's just ignoring the gateway in the rule but still logging the rule as having fired. Weird.

Derelict

Doubtful. There is some other reason the traffic is not matching that policy routing rule - else it would be policy routed accordingly.

Finger79

@Derelict:

Doubtful. There is some other reason the traffic is not matching that policy routing rule - else it would be policy routed accordingly.

Then why does the rule match in the Firewall Logs?

As you can see, the rule is very simple. If the source is .103 IPv4, then policy route it through WANGW.

This works perfectly after a fresh reboot of pfSense. Then like I said, after some time or some event, it no longer goes out the WANGW and goes out the VPN. Something is overriding the routing portion of the firewall rule.

Finger79

Just ran some more tests, all from Firefox on the tablet:

Shows real WAN IP
Google "What is my IP"
iplocation.com
whatismyip.net
privateinternetaccess.com
whatismyip.org
ExpressVPN.com
MXtoolbox.com
ip-address.org
iplocation.net
findipinfo.com
myipaddress.com

Shows VPN IP
TorGuard.net
DuckDuckGo "What is my IP"
whatismyipaddress.com
BearsMyIP.com
ipchicken.com
ipaddress.pro

luckman212

@Finger79:

Then why does the rule match in the Firewall Logs?

As you can see, the rule is very simple. If the source is .103 IPv4, then policy route it through WANGW.

Do you have a kill switch rule below the policy route for .103 to block all traffic? You need this in case WANGW is down, because rules will be skipped if the GW is flapping, which could explain your inconsistent results.

Derelict

When it stops working, run this:

pfctl -vvsr | grep -A3 XX.XX.XX.103

Here: I'll show you one of mine. I'm not afraid of leaking inside addresses:

$ pfctl -vvsr | grep -A3 192.168.223.6

@307(1493852191) pass in quick on igb1.223 route-to (ovpnc3 172.29.114.130) inet from 192.168.223.6 to <openvpn_lan:2> flags S/SA keep state label "USER_RULE: Route OpenVPN Addresses Through OpenVPN"
[ Evaluations: 2386 Packets: 0 Bytes: 0 States: 0 ]
[ Inserted: pid 21796 State Creations: 0 ]

Anyway, that will show the EXACT rules in the active rule set that have anything to do with that address at that specific point in time.

Finger79

@124(10000001) pass in log quick on igb1 inet from 192.168.100.103 to <negate_networks:0> flags S/SA keep state label "NEGATE_ROUTE: Negate policy routing for destination"
[ Evaluations: 2572 Packets: 0 Bytes: 0 States: 0 ]
[ Inserted: pid 54887 State Creations: 0 ]
@125(1505701172) pass in log quick on igb1 route-to (igb0 xxx.xxx.xxx.xxx public IP) inet from 192.168.100.103 to any flags S/SA keep state label "USER_RULE: Tablet Out Naked WAN"
[ Evaluations: 865 Packets: 55591 Bytes: 19139183 States: 5 ]
[ Inserted: pid 54887 State Creations: 807 ]
@126(1458032398) block return in log quick on igb1 inet from any to <pfb_africa_v4:6176> label "USER_RULE: pfb_Africa"
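
Output like this is easier to compare between the "working" and "not working" states if it is reduced to one line per rule. The following Python sketch does that for text captured from pfctl -vvsr. The regular expressions are assumptions fitted to the output shown in this thread rather than a complete parser for pf syntax, and 203.0.113.1 is a documentation placeholder standing in for the redacted public gateway address.

```python
import re

# Sample taken from the output in this thread; the gateway address is a
# placeholder (203.0.113.1) for the redacted public IP.
SAMPLE = """\
@124(10000001) pass in log quick on igb1 inet from 192.168.100.103 to <negate_networks:0> flags S/SA keep state label "NEGATE_ROUTE: Negate policy routing for destination"
  [ Evaluations: 2572  Packets: 0  Bytes: 0  States: 0 ]
  [ Inserted: pid 54887 State Creations: 0 ]
@125(1505701172) pass in log quick on igb1 route-to (igb0 203.0.113.1) inet from 192.168.100.103 to any flags S/SA keep state label "USER_RULE: Tablet Out Naked WAN"
  [ Evaluations: 865  Packets: 55591  Bytes: 19139183  States: 5 ]
  [ Inserted: pid 54887 State Creations: 807 ]
"""

RULE_RE = re.compile(r"^@(?P<num>\d+)\((?P<tracker>\d+)\)\s+(?P<body>.*)$")
ROUTE_RE = re.compile(r"route-to \((?P<iface>\S+) (?P<gw>[^)]+)\)")
LABEL_RE = re.compile(r'label "(?P<label>[^"]*)"')
COUNTER_RE = re.compile(
    r"(Evaluations|Packets|Bytes|States|State Creations):\s*(\d+)"
)


def parse_rules(text):
    """Turn pfctl -vvsr style text into a list of per-rule dictionaries."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        match = RULE_RE.match(line)
        if match:
            body = match.group("body")
            route = ROUTE_RE.search(body)
            label = LABEL_RE.search(body)
            rules.append({
                "num": int(match.group("num")),
                "tracker": int(match.group("tracker")),
                "route_to": route.group("iface") if route else None,
                "label": label.group("label") if label else "",
                "counters": {},
            })
        elif line.startswith("[") and rules:
            # Counter lines belong to the most recently seen rule.
            for key, value in COUNTER_RE.findall(line):
                rules[-1]["counters"][key] = int(value)
    return rules


for rule in parse_rules(SAMPLE):
    print(rule["num"], rule["route_to"], rule["counters"].get("Packets"),
          rule["label"])
```

A rule with a non-zero packet count and a route_to value is both matching and policy routing. Traffic that leaves by another path must therefore be matching some other rule.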

Derelict

Looks good to me unless negate_networks includes the wrong destinations, which is pretty unlikely.

Or there is a rule that matches traffic from that source address without naming it (for example, one with source any), which won't be shown there.

I'd be happy to look at /tmp/rules.debug if you want to PM it.

Finger79

Somewhat redacted and edited:

< /tmp/rules.debug removed >

Finger79

Wonder if this table should be empty:

Diagnostics -> Tables
negate_networks

No entries exist in this table.

Derelict

pass in quick on $LAN $GWPIA_TX_CHI inet proto tcp from any to $Facebook port 443 tag "NO_WAN_EGRESS" tracker 1422073736 flags S/SA keep state label "USER_RULE: Allow Facebook"
pass in quick on $LAN inet proto tcp from any to $CloudFlare port $HTTP_HTTPS tracker 1422073738 flags S/SA keep state label "USER_RULE: CloudFlare"
pass in log quick on $LAN inet from 192.168.100.103 to <negate_networks> tracker 10000001 keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in log quick on $LAN $GWWANGW inet from 192.168.100.103 to any tracker 1505701172 keep state label "USER_RULE: Asus Tablet Out Naked WAN"

Anything destined for $Facebook/443 or $CloudFlare/$HTTP_HTTPS matches those rules first and never reaches your source 192.168.100.103 rules. That is probably your problem.

Put the most specific rules at the top.
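
The effect of rule order can be modeled in a few lines of Python. This is a simplified sketch of first-match ("quick") evaluation, not pf itself. The alias contents are documentation-range placeholders, the default route is assumed to point at the VPN, and the negate_networks rule is left out because that table was reported empty.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Rule:
    label: str
    src: str                  # source network in CIDR form
    dst: str                  # destination network in CIDR form
    ports: Tuple[int, ...]    # empty tuple means any port
    gateway: Optional[str]    # None means "follow the routing table"


def first_match(rules, src, dst, port):
    """Return the first rule matching the connection, like 'quick' rules."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for rule in rules:
        if (s in ipaddress.ip_network(rule.src)
                and d in ipaddress.ip_network(rule.dst)
                and (not rule.ports or port in rule.ports)):
            return rule
    return None


def egress(rule, default_route="VPN"):
    """Gateway used: the rule's own, else the (assumed) default route."""
    return rule.gateway or default_route


# Placeholder alias contents: 192.0.2.0/24 stands in for $Facebook and
# 198.51.100.0/24 for $CloudFlare. Real aliases hold different networks.
FACEBOOK = Rule("Allow Facebook", "0.0.0.0/0", "192.0.2.0/24", (443,),
                "PIA_TX_CHI")
CLOUDFLARE = Rule("CloudFlare", "0.0.0.0/0", "198.51.100.0/24", (80, 443),
                  None)
TABLET = Rule("Asus Tablet Out Naked WAN", "192.168.100.103/32", "0.0.0.0/0",
              (), "WANGW")

ORIGINAL = [FACEBOOK, CLOUDFLARE, TABLET]
REORDERED = [TABLET, FACEBOOK, CLOUDFLARE]   # most specific rule first

TABLET_IP = "192.168.100.103"
for name, ruleset in (("original", ORIGINAL), ("reordered", REORDERED)):
    for dst in ("198.51.100.7", "203.0.113.9"):
        rule = first_match(ruleset, TABLET_IP, dst, 443)
        print(name, dst, rule.label, egress(rule))
```

In the original order, a site hosted behind the CloudFlare alias leaves through the default route while other sites leave through WANGW. That would explain why some IP-check sites reported the WAN address and others the VPN address.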