Routing magic; would someone explain this please

wtw

pfSense applications in use:

pfBlockerNG – for ad/malware site DNS filtering
DNS Resolver (Unbound) – to convert DNS requests to DNS over TLS with DNSSEC
STunnel – to route specific LAN address ports to a VPS for a stable IP address due to service’s intolerance of IP address changes
OpenVPN client to a VPS for a more general access through the VPS and a stable exit IP address
OpenVPN client to a VPN service provider, load balanced round-robin across multiple provider sites for a changing exit IP address. Sticky connections are used for single-site connection stability.
All traffic must go through the VPN service provider tunnels.
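
For anyone unfamiliar with the round-robin plus sticky combination, here is a minimal sketch of the idea in Python. It is a toy model, not how pf implements it: the gateway names and LAN addresses are made up, and real sticky entries are tied to state and expire, which this ignores.

```python
from itertools import cycle


class GatewayGroup:
    """Toy model of round-robin gateway selection with sticky connections."""

    def __init__(self, gateways):
        self._next_gateway = cycle(gateways)  # endless round-robin iterator
        self._sticky = {}                     # source IP -> pinned gateway

    def pick(self, src_ip, sticky=True):
        # With sticky enabled, a source that already has a gateway keeps it.
        if sticky and src_ip in self._sticky:
            return self._sticky[src_ip]
        gateway = next(self._next_gateway)
        if sticky:
            self._sticky[src_ip] = gateway
        return gateway


# Hypothetical gateway names and LAN hosts, for illustration only.
group = GatewayGroup(["VPN_SITE_A", "VPN_SITE_B", "VPN_SITE_C"])
first = group.pick("192.168.1.10")
second = group.pick("192.168.1.10")  # same source: stays on the same gateway
third = group.pick("192.168.1.11")   # new source: next gateway in the rotation
print(first, second, third)
```

In this model the exit IP only changes when a new source (or an unpinned connection) asks for a gateway, which is the trade-off between a changing IP address and connection stability.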

Configuration:
In order to get these to work together (at least mostly):
All VPNs have Interface and Gateway configurations and NAT Outbound entries.

1. The main VPN client must accept the “push redirect-gateway def1” from the server to route all traffic through that specific VPN. This creates two additional routing entries, 0.0.0.0/1 and 128.0.0.0/1, pointing to the first VPN service site launched. The other sites also try to create these entries but fail with an error, since they are already set. This conflicts with load balancing, as the “default” traffic always goes to that one VPN site. To try to compensate, the VPN gateway group places the first site on Tier 2 and the other two on Tier 1. Even so, traffic does not appear to be balanced across the two Tier 1 gateways.
2. The DNS Resolver automatically routes to the first VPN service tunnel, presumably via the default route entries in (1) above.
3. A LAN rule is used to route LAN traffic to the VPS VPN gateway to test this connection. Currently, all LAN net traffic uses this rule for testing. The intent is to narrow it with a source filter once it is working properly. An external IP check confirms that the traffic exits through the VPS.
4. A LAN rule (before the VPS rule) redirects SSH to the VPN group so that it avoids the VPS VPN tunnel, in order to reach the VPS itself over SSH.
5. A LAN rule (before the VPS rule) cannot be used successfully to redirect the STunnel traffic to the VPN group to avoid the VPS VPN tunnel. It appears that this bypasses the STunnel mapping. The VPS VPN must be disabled for STunnel to work.
6. A LAN rule (after the VPS rule) is used to route traffic to the VPN group so that not all traffic goes through the default VPN tunnel. This is currently only effective if the VPS VPN is disabled. Based on packet capture, the load-balanced VPN traffic appears to exit directly and not through the default VPN (no nested tunnel). However, it looks like the load-balanced VPN tunnels carry only ICMP (gateway monitor) traffic. Only about 3% of the traffic is not going out the default VPN.
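
To illustrate why the two /1 entries capture the “default” traffic, here is a rough Python sketch of longest-prefix route selection. The gateway names and the VPN server address (198.51.100.7) are placeholders, not values from my configuration; the host route stands in for the route OpenVPN keeps toward its own server through the original gateway.

```python
import ipaddress

# Simplified route table after "redirect-gateway def1" has been applied.
# Gateway names and the /32 server address are hypothetical.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "WAN"),            # original default route
    (ipaddress.ip_network("0.0.0.0/1"), "VPN_SITE_A"),     # added by def1
    (ipaddress.ip_network("128.0.0.0/1"), "VPN_SITE_A"),   # added by def1
    (ipaddress.ip_network("198.51.100.7/32"), "WAN"),      # VPN server itself
]


def lookup(dest, routes=ROUTES):
    """Return the gateway of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    matching = [(net, gw) for net, gw in routes if addr in net]
    # Longest prefix wins: /32 beats /1, and /1 beats /0.
    best_net, best_gw = max(matching, key=lambda item: item[0].prefixlen)
    return best_gw


for dest in ("8.8.8.8", "203.0.113.5", "198.51.100.7"):
    print(dest, "->", lookup(dest))
```

Every destination falls in one of the two halves, and each half is more specific than 0.0.0.0/0, so the original default route is never chosen for ordinary traffic even though it is still in the table.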

The two remaining problems are getting proper round-robin load balancing among the VPN service sites and routing the STunnel traffic properly. Everything else works as desired.

Rejecting the pushed “redirect-gateway def1” (and then “redirect-gateway”) and relying on rules to route traffic (policy routing) is not working, because as soon as a rule matches, no other rule applies, even on another interface. I could find nothing on how to route through multiple interfaces within the router. It appears that only one rule can apply to a routing state. Also, Unbound cannot be successfully policy routed from the WAN (outbound) to the VPN gateway group: the requests reach the VPN, but no replies come back. So this method allows only one level of redirection. Unbound appears to already be bound to an outgoing interface, so the rule cannot successfully route it again. I am probably not configuring this correctly.

Would someone be willing to explain what is going on, whether I can use more explicit routing, and how that would be accomplished?

Thanks.

wtw

@wtw
Maybe I just figured some of this magic out.

The default route (0.0.0.0/1 & 128.0.0.0/1 in the route table) set by the VPN client option "redirect-gateway def1" occurs after all rules and is independent of all rules (cannot be overridden by a rule).
One and only one rule can influence a connection. So, if a connection satisfies a rule requirement, however that is established (due to Quick in floating rules), no other rule can impact that connection regardless of the interface or group.
Applications like DNS Resolver and STunnel do their processing between LAN and WAN. No rule can intercept their traffic on the LAN interface-tab, or it will bypass the app's processing. The traffic must make it to the WAN, where a Floating (outbound) rule can then intercept it.
Floating rules are applied first, for whatever interface or group is specified, in list order. So, a Floating outbound Quick WAN rule will be applied before any LAN interface-tab rule, provided its conditions are met.
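
My reading of the points above can be written as a toy model in Python. This is only a sketch of my understanding, not pfSense's actual implementation: the rule names, ports, and gateway names are hypothetical, and it assumes that floating rules are checked before interface-tab rules, that a Quick match stops evaluation, and that a non-Quick match can be replaced by a later match.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]
    gateway: Optional[str] = None  # None means "fall through to the route table"
    quick: bool = True


def evaluate(packet, floating, interface_rules):
    """Return the single rule that decides this connection, or None."""
    winner = None
    # Assumption: floating rules come first, then interface-tab rules.
    for rule in floating + interface_rules:
        if rule.matches(packet):
            winner = rule
            if rule.quick:
                break  # Quick: first match wins, nothing later is consulted
    return winner


# Hypothetical rules for illustration only.
floating = [
    Rule("dns-over-tls out", lambda p: p["dst_port"] == 853, gateway="VPN_GROUP"),
    Rule("log everything", lambda p: True, quick=False),
]
lan_tab = [
    Rule("ssh to vps", lambda p: p["dst_port"] == 22, gateway="VPN_GROUP"),
    Rule("lan to vps vpn", lambda p: True, gateway="VPS_VPN"),
]

dot = evaluate({"dst_port": 853}, floating, lan_tab)
ssh = evaluate({"dst_port": 22}, floating, lan_tab)
web = evaluate({"dst_port": 443}, floating, lan_tab)
print(dot.name, ssh.name, web.name)
```

In this model exactly one rule ends up owning each connection, and only that rule's gateway matters. If the winning rule has no gateway, the route table (including the 0.0.0.0/1 and 128.0.0.0/1 entries) decides where the traffic goes.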

I was thinking of interfaces as sequential, not concurrent. While rules defined within interface-tabs and group-tabs are processed in some specific order (priority), this is not the case for Floating rules; their order is determined by list order, not interface order.
This seems to fit the evidence.

Is this correct?

wtw

@wtw
I figured this out. I will create a new post with a resolution, since it encompasses more than this specific post.
This issue is closed.