Policy-based Routing (outbound) and port forwarding (inbound) through WG tunnel

kevindd992002

@dma_pf said in Policy-based Routing through WG tunnel:

@kevindd992002 I haven't set up a site-to-site WireGuard tunnel yet so I'm not going to be much help. But I thought I'd pass this along in case you haven't seen it. Hope it helps!
https://docs.netgate.com/pfsense/en/latest/recipes/wireguard-s2s.html

Thanks, but I already read all the official documentation while I was setting this up. Coming from OpenVPN, then IPsec, and now WireGuard, I can say that WireGuard is very straightforward to set up, so the chance of messing something up is lower.

@xxgbhxx said in Policy-based Routing through WG tunnel:

@kevindd992002 I have another thread on here about my problems with WG on PF. I have a PBR rule but I can't even get the tunnel stable enough to test the rule. If I do I'll come report how it goes.

G

Thanks. I'll check your thread and maybe I can help. I have everything stable with WG except for PBR's.

@hypnosis4u2nv said in Policy-based Routing through WG tunnel:

@kevindd992002 Tom over at Lawrence Systems just put out a YouTube video for Wireguard site to site.

https://youtu.be/ZY49EAMnniY

I just watched the video. I have the exact same setup as his because one side is behind a CGNAT and the other side has a static public IP. I have the same settings except for a few things:

1. In the Peer WG Address field of each side's peer settings, I specify a single IP address without the CIDR (/32) notation. Both forms work because a /32 is also a single IP address, but a bare IP is simpler and it's what the official pfSense WG S2S article documents anyway.
2. I do not have keep alive on both sides enabled and it works just fine because of two things:
   a. If the IP you set in the Peer WG Address field is included in the IP subnet settings of the actual WG interface, a gateway is automatically created and gateway monitoring is enabled by default. The monitoring itself already acts as the keep alive. This is documented here.
   b. On the side with a static public IP, if you keep the Endpoint field blank, this pfSense box WILL NOT initiate traffic to the remote peer until the remote peer sends traffic. This is documented here.
3. For gateway monitoring to work, the peer settings Allowed IPs field should contain the IP of the remote peer's WG interface in addition to the remote-side subnets that you want to route.
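
The address checks in the points above can be sketched with Python's standard `ipaddress` module. This is only an illustration, not anything pfSense runs: the tunnel addresses (10.0.0.1/24 and 10.0.0.2) and the function name `peer_checks` are made-up assumptions, and only 192.168.20.0/24 comes from this thread.

```python
import ipaddress

def peer_checks(wg_interface, peer_wg_address, allowed_ips):
    """Illustrative checks only; all tunnel addresses passed in are hypothetical."""
    iface = ipaddress.ip_interface(wg_interface)
    # A bare address and its /32 form both parse to the same single-host network.
    peer_ip = ipaddress.ip_network(peer_wg_address).network_address
    nets = [ipaddress.ip_network(a, strict=False) for a in allowed_ips]
    return {
        # Condition described above for the gateway being created automatically.
        "peer_in_interface_subnet": peer_ip in iface.network,
        # Condition described above for gateway monitoring to get replies.
        "peer_covered_by_allowed_ips": any(peer_ip in n for n in nets),
    }

# Site A's view of the Site B peer (tunnel addresses are assumptions).
result = peer_checks(
    wg_interface="10.0.0.1/24",
    peer_wg_address="10.0.0.2",
    allowed_ips=["10.0.0.2/32", "192.168.20.0/24"],
)
print(result)
```

If the peer's tunnel IP is left out of Allowed IPs, the second check comes back false, which matches the point about gateway monitoring above.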

kevindd992002

I edited my OP and included a diagram of my setup. It's a fairly basic setup, and I have some additional weird observations: PBR, and now port forwarding too, are not working properly.

So again, for the topic related to PBR, here's what I have:

Site B LAN tab rules

Site A Outbound NAT rules

* Outbound NAT rule on WAN interface to translate packets with source IP = 192.168.20.0/24 to have a source IP = WAN address interface (static public IP)

Observations:

192.168.20.10 gets routed out Site A's WAN interface properly without any issues
192.168.20.11 does not get routed properly. I see packets reaching the Site B WG0 interface but I don't see anything in Site A's WG interface

Those PBR's are all for outbound. Now let's do inbound (port forwarding). Here's what I have on the topic related with port forwarding:

Site A Port Forward rules

Observations:

I tested first using my usual external open port test site (tool1). Everything works as expected! I can reach both 192.168.20.10 and 192.168.20.11 (Site B clients) through Site A's WAN interface. So I thought everything was working properly.
I then tested with another external open port test site (tool2) and now it's not working. The exact same behavior as the PBR issue above occurs: the inbound packets reach Site A's WG0 interface but stop there. Site B's WG0 interface never sees these packets.

So from my observations above, I can conclude that, for some odd reason, both PBR and port forwarding work with the "first" source IP and don't work with succeeding source IP's, if that even makes sense. The routing for PBR and port forwarding is basically the same, just in opposite directions. This also proves that both pfSense boxes exhibit the problem.

@jimp @stephenw10 do you have any ideas?

kevindd992002

Workaround for the PBR issue

Workaround for the port forwarding issue

I created WG0 interface outbound NAT rules to source translate inbound packets destined to 192.168.20.10:8989/192.168.20.11:8990 (from any) and it worked.

These should totally be unnecessary with WireGuard since it supports reply-to. It worked for the first local source IP (PBR issue) and external source IP test (port forwarding issue) which proves that reply-to does work. So there's got to be a bug somewhere here.
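
The WG0 outbound NAT workaround described in this post can be modelled roughly as below. This is a toy sketch, not pf's actual rule evaluation: the `Packet` class, the function name, the tunnel address 10.0.0.1 and the external source 203.0.113.5 are all hypothetical; only the two forward targets come from the post. One way to read why it helps: after translation the Site B client replies to a tunnel address, which ordinary routing can deliver without relying on reply-to.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int

# Port-forward targets named in the post (Site B clients).
FORWARD_TARGETS = {("192.168.20.10", 8989), ("192.168.20.11", 8990)}

def wg_outbound_nat(pkt: Packet, wg_address: str) -> Packet:
    """Rewrite the source to the local WG address for matching packets (from any source)."""
    if (pkt.dst, pkt.dport) in FORWARD_TARGETS:
        return replace(pkt, src=wg_address)
    return pkt  # anything else leaves WG0 untranslated

# Hypothetical external client hitting the first port forward.
inbound = Packet(src="203.0.113.5", dst="192.168.20.10", dport=8989)
translated = wg_outbound_nat(inbound, wg_address="10.0.0.1")
print(translated)
```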

AB5G

@kevindd992002 I have exactly the same issue here - https://forum.netgate.com/topic/160335/unsolved-possible-bug-wireguard-routing-weirdly/29
I'll try your fix and see if it works.

kevindd992002

@ab5g said in Policy-based Routing (outbound) and port forwarding (inbound) through WG tunnel:

@kevindd992002 I have exactly the same issue here - https://forum.netgate.com/topic/160335/unsolved-possible-bug-wireguard-routing-weirdly/29
I'll try your fix and see if it works.

Yeah, I did see your thread at one point when I was researching this issue. Let me know if my workaround solves your issue too. If it does, then I'd be inclined to say that this is a bug, but I'll try capturing packets on the outer tunnel (WAN) now and see if I notice something there. @jimp one thing to point out here is that I can easily reproduce this behavior from both sides.

hypnosis4u2nv

@kevindd992002 https://youtu.be/mXG0RShQbaw

Not sure if this helps.

kevindd992002

@hypnosis4u2nv said in Policy-based Routing (outbound) and port forwarding (inbound) through WG tunnel:

@kevindd992002 https://youtu.be/mXG0RShQbaw

Not sure if this helps.

I just finished watching the video but it's not applicable to the issue in this thread. The video explains routing through the tunnel from devices that have IP's in the transit network. This is the same as when I apply the outbound NAT workaround.

Without the workaround, the source IP's do not get translated (in both the PBR and port forward cases) and that's where the problem comes in. The routing works for only one source IP (the first one I ever tested with).

I hope that makes sense.

AB5G

@kevindd992002 That doesn't work for me. I'll put some more effort on it tomorrow.
If you remove the /32 NAT rule and capture packets on the WG interface, do you see what I captured in my tcpdump (the source getting translated to the tunnel IP), or do you not even see that?
For me the NAT rule seemed to work, just that the packets don't leave the WAN for some reason.

kevindd992002

@ab5g said in Policy-based Routing (outbound) and port forwarding (inbound) through WG tunnel:

@kevindd992002 That doesn't work for me. I'll put some more effort on it tomorrow.
If you remove the /32 NAT rule and capture packets on the WG interface, do you see what I captured in my tcpdump (the source getting translated to the tunnel IP), or do you not even see that?
For me the NAT rule seemed to work, just that the packets don't leave the WAN for some reason.

Did you mean when keeping the /32 outbound NAT rule?

With the NAT rule in place, yes, I do see the source getting translated to the local WG interface IP; the packet then leaves the WAN and shows up on the remote WG interface. Here's an example:

Site B open state:

Site A open state:

That public IP is an external IP from a torrent peer.

Without it, no SNAT happens and the packets do not leave the WAN interface.

Site B open state:

Site A NO open state:

However, I noticed that from the same source and to different destination IP's through the tunnel, some are working and some are not! Here's proof of that:

Site B open state:

Site A open state:

So the problem seems to be "selective" or intermittent. It is not only isolated to one source IP working but other source IP's also work for different destination IP's, if that makes sense. And again, this weirdness is all solved with the outbound NAT rule workaround.

hypnosis4u2nv

@kevindd992002 no problem, and yes, it makes sense. I'm hoping that watching tutorials gives you some idea of what may be causing the issue that you're experiencing. I was able to track down the issues I was experiencing because watching a somewhat unrelated topic pointed me to an option that I hadn't enabled.

kevindd992002

@hypnosis4u2nv right, thanks. Which option fixed your issue and how did it fix it? I'm curious.

AB5G

@kevindd992002 said in Policy-based Routing (outbound) and port forwarding (inbound) through WG tunnel:

So the problem seems to be "selective" or intermittent. It is not only isolated to one source IP working but other source IP's also work for different destination IP's, if that makes sense. And again, this weirdness is all solved with the outbound NAT rule workaround.

That seems to be exactly my issue too - I don't know how to replicate it though. If you know how, can you raise a bug report please.

hypnosis4u2nv

@kevindd992002 They were unrelated to WireGuard. I had connectivity and routing issues with OpenVPN after updating to 2.5.0. Certain settings that got carried over needed additional tweaks and changes to connect. Then I ran into a routing issue where everything was routing through the client even though I had policy-based rules set. After watching some videos and reading some older posts, I found the culprit and fixed it. It may help to read/watch an unrelated topic to point out something you overlooked.

kevindd992002

Ok, I did more testing today and it looks like the workaround I did was also hit and miss! It solved the problem with some of my clients, but when I add new PBR's and port forwarding rules, they don't work again! And like I said, the problem is not isolated to source IP's. It's also affecting the same client but with different destination IP's. For example, I have this PBR:

When I was troubleshooting a few days ago, this was not working until I implemented that outbound NAT workaround, so I thought all is good. When it worked, the destination Alias had these host entries:

plex.tv
www.addic7ed.com

Today, I added a third host: news.newshosting.com and it never worked. So it's working for the first two but not for the new host. So go figure.

@AB5G I'll raise a bug report today.

AB5G

@kevindd992002 try clamping the MSS under the WG interface to 1420 (if you have an Ethernet uplink) and see if that improves things. I saw on an unrelated thread that the MSS was causing some sites not to load (it still does not explain why the NAT wouldn't happen), so it's worth a try. Leave the MTU at the default.

kevindd992002

@ab5g I also read about that workaround somewhere when I was researching on this but I thought it was unrelated to my issue. I'll give it a try.

kevindd992002

@AB5G setting the MSS field to 1420 (max mss 1380) in the WG interface on both sides didn't really help. Did it solve anything for you?
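
For reference, the arithmetic behind those two numbers can be sketched as below. It assumes a 1500-byte Ethernet uplink and the usual WireGuard per-packet overhead (outer IP header + 8-byte UDP header + 32 bytes of WireGuard framing), plus the pfSense behavior of clamping to the MSS field value minus 40 (IPv4 + TCP headers). The function names are made up for illustration.

```python
WG_OVERHEAD = 32            # 4 type/reserved + 4 receiver index + 8 counter + 16 auth tag
UDP_HEADER = 8
IP_HEADER = {4: 20, 6: 40}  # outer IP header size by IP version

def wg_tunnel_mtu(link_mtu=1500, outer_ip_version=6):
    """Largest inner packet that fits in one outer packet on the uplink."""
    return link_mtu - IP_HEADER[outer_ip_version] - UDP_HEADER - WG_OVERHEAD

def effective_max_mss(mss_field):
    """Max MSS that results from the interface MSS field (value minus 40)."""
    return mss_field - 40

print(wg_tunnel_mtu())               # worst case, IPv6 outer header
print(wg_tunnel_mtu(1500, 4))        # IPv4 outer header
print(effective_max_mss(1420))       # the "max mss" shown for an MSS field of 1420
```

The worst-case (IPv6 outer) result is the 1420 figure suggested above, and entering 1420 in the MSS field gives the 1380 max MSS mentioned in this post.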

AB5G

@kevindd992002 No, it didn't. It's hit and miss (just like yours).

xparanoik

This is somewhat related: I swapped my OpenVPN client for a WireGuard tunnel, and in my PBR policy that routes certain LAN clients through the VPN I just switched the gateway from the old OpenVPN one to the new WG one. I also updated my hybrid outbound NAT rules. Everything works just like it did before with OpenVPN.

kevindd992002

Filed a bug here.