Problems with routed IPsec VTI

Ziomalski

I made an alias and was able to capture everything in 25 entries. I won't be adding more than one line every few months.

Would dynamic routes solve the NAT issue?

kevindd992002

Have you solved this problem yet? I have the same exact problem here.

Ziomalski

@kevindd992002 So NAT and VTI just do not work on pfS. I ended up using a 2nd VM behind pfS dedicated to VPN. I used VyOS and since I'm moving from Edgerouter, the config is similar.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248474

Have a look here. Those are the details of the issue. TLDR, you can have NAT+VTI working with the console commands. However, this breaks policy-based VPN. If you can deal with that, then just add the changes in console.

kevindd992002

Thanks for the reply. So I guess their documentation about Routed VTI is misleading. It says:

"There are also known issues with NAT, notably that NAT to the interface address works but 1:1 NAT or NAT to an alternate address does not work."

The description in that link you mentioned exactly describes what I'm experiencing. I do have a few policy-based routes in my LAN that use the IPsec gateway, so I guess that console workaround will break them.

If I go with a dedicated VPN router like you did, how do I go about it? Is there a docker container that accomplishes this? And how would the traffic flow be? I'm not well-versed with a dedicated VPN router but I suppose it shouldn't be that hard. Would a dedicated VPN router that's also a pfsense box work too?

kevindd992002

I consulted @jimp and confirmed that the workaround for this issue is this. The only caveat for setting that is it breaks policy-based IPsec tunnels, so if you use both route-based and policy-based then that would be a problem. As I'm using only route-based IPsec, I'm good. I'm testing now.

EDIT: So here's what I did:

Modified these:

Added these:

Restart both boxes

I did that for both ends of the tunnel and as soon as I do that, the IPsec gateway on both ends goes down. Devices on the local end can't reach/ping devices on the remote end and vice versa. The static routes are intact.

As soon as I revert my changes and restart again, routing works beautifully again but the outbound NAT issue is still there of course. Am I doing this wrong?

Ziomalski

@kevindd992002 So I did it in console/shell. If you look at my other post that you replied to, I have them listed there to copy exactly (sysctl ...). In my testing, right after applying the commands things started working. However, I don't believe they were persistent across reboots. Do you have your NAT set up correctly for that interface with the correct IP? Check a packet capture and see what IP is being used and whether packets are coming back.

As for the dedicated router workaround, I set this up virtually in Proxmox. I have my pfS VM with real interfaces along with 2 virtual interfaces for the VPN VM. So I have a VPN-LAN and a VPN-WAN interface for the nested router. In pfS I push static/gateway routes to the gateway/interface VPN-LAN and the nested VM gets its internet connection on the VPN-WAN interface. It is pretty straightforward but you have to get the concept. You have to make up some IP subnets for those interfaces for everything to be separated correctly.

If you have issues, let me know a little bit more detail on your setup and I'll try to get you some screenshots and specifics.

kevindd992002

Which post is that? Sorry, I can't find it. But are you talking about these commands?

sysctl net.enc.out.ipsec_filter_mask=0
sysctl net.enc.in.ipsec_filter_mask=0
sysctl net.inet.ipsec.filtertunnel=1
sysctl net.inet6.ipsec6.filtertunnel=1

As soon as I run those commands on one end of the tunnel, I lose connectivity to the other end of the tunnel and pinging any device on the other end does not work at all.
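Since those four commands change live kernel tunables, it may help to record the current values before touching them so the exact previous state can be put back. Below is a minimal sketch, assuming Python 3 is available on the box (the same idea works by running `sysctl -n` by hand). The helper names `read_sysctl`, `snapshot` and `revert_commands` are made up for illustration; only the tunable names come from this thread.

```python
import subprocess

# The four tunables named in this thread.
TUNABLES = [
    "net.enc.out.ipsec_filter_mask",
    "net.enc.in.ipsec_filter_mask",
    "net.inet.ipsec.filtertunnel",
    "net.inet6.ipsec6.filtertunnel",
]


def read_sysctl(name):
    """Return the current value of one tunable by running `sysctl -n <name>`."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def snapshot(names, reader=read_sysctl):
    """Map each tunable name to its current value.

    `reader` is injectable so the logic can be tried without a FreeBSD box.
    """
    return {name: reader(name) for name in names}


def revert_commands(saved):
    """Build the sysctl command lines that would restore a saved snapshot."""
    return [f"sysctl {name}={value}" for name, value in saved.items()]
```

Calling `snapshot(TUNABLES)` before applying the workaround and printing `revert_commands(...)` on the result gives a copy-paste list for rolling back, which avoids guessing what the defaults were.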

Let me explain my simple setup:

Site A (main site)

WAN interface has a public static IP
LAN interface IP is 192.168.10.1
contains these subnets:
- 192.168.10.0/24
- 192.168.30.0/24
- 192.168.40.0/24
- 192.168.55.0/24
static routes:
outbound NAT rules for servers in site B:

Site B (remote site)

WAN interface has a private dynamic IP as it is behind CGNAT
LAN interface IP is 192.168.20.1
contains this subnet: 192.168.20.0/24
static routes:
no IPsec interface outbound NAT rules. All outbound NAT rules are for the WAN interface.

The sites are connected by an IPsec routed VTI tunnel using the 10.0.2.0/30 transit network.

Site A IPsec interface is 10.0.2.1
Site B IPsec interface is 10.0.2.2
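For anyone following along, the addressing plan above can be sanity-checked with a few lines of Python. This is only a sketch using the subnets listed in this post; the function names `overlapping_pairs` and `site_for` are made up for illustration. It confirms that none of the networks overlap and that the /30 transit network has exactly the two usable addresses assigned to the IPsec interfaces.

```python
import ipaddress
from itertools import combinations

# Subnets exactly as listed in the post.
SITE_A = [ipaddress.ip_network(n) for n in (
    "192.168.10.0/24", "192.168.30.0/24",
    "192.168.40.0/24", "192.168.55.0/24",
)]
SITE_B = [ipaddress.ip_network("192.168.20.0/24")]
TRANSIT = ipaddress.ip_network("10.0.2.0/30")


def overlapping_pairs(networks):
    """Return every pair of networks that share addresses."""
    return [(a, b) for a, b in combinations(networks, 2) if a.overlaps(b)]


def site_for(address):
    """Say which part of the layout an address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in TRANSIT:
        return "transit"
    if any(ip in net for net in SITE_A):
        return "site A"
    if any(ip in net for net in SITE_B):
        return "site B"
    return "elsewhere"
```

`overlapping_pairs(SITE_A + SITE_B + [TRANSIT])` comes back empty for this layout, and `list(TRANSIT.hosts())` yields only 10.0.2.1 and 10.0.2.2, so the plan itself is consistent and the problem has to be elsewhere.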

Here's a packet capture on Site A's IPsec interface:

That's a ping attempt from a device in Site B to Site A's LAN interface IP. You can see that it reaches the far end (Site A) up to the IPsec interface. But if I do a packet capture on Site A's LAN interface, I get nothing! So the packet gets dropped in the IPsec interface somehow. This is not even a NAT issue because there is no NAT that's happening here. This is simple routing between two subnets through the IPsec tunnel.

As soon as I bring back the values of those system tunables to their default values, everything is working again except for the outbound NAT rules that I have in Site A, of course.

I have to make this workaround work first before going with the dedicated router concept, as it seems to be perfect for my use case. I don't use policy-based IPsec.

Ziomalski

@kevindd992002 It looks like you shouldn't need NAT at all. I think you just need to follow the standard guide (https://docs.netgate.com/pfsense/en/latest/vpn/ipsec/routed-vti.html) and add firewall rules on each end to permit the correct traffic. If you want all subnets to talk to each other, site B needs to allow the 10/30/40/55 subnets as the source under Firewall > Rules > IPsec. Site A needs to allow the .20 subnet. (Note the bolded part in the guide under firewall.) You can always add allow-any rules on both IPsec/VTI tabs for testing if you have issues.

The reason I was having so many issues is that I don't control the other end, and they required all traffic going over the tunnel to have 1 specific IP address as the source. In your case, you have no conflicts and can have any IPs going over the tunnel. You have no need to NAT over to the 10.0.2 subnet.

kevindd992002

@ziomalski

I have allow all in the ipsec tabs of both pfsense boxes. That is not the issue here :) Like I said, I don't have issues with routing or any firewall rules blocking anything WITHOUT that workaround. All subnets talk to each other with ZERO issues.

That is not what I was trying to solve in the first place. I forgot to mention but I access (SiteAPublicIP):62958 and 32401 from an external source, so I have port forward rules for those that point to the servers in Site B. This is where I need the outbound NAT (which are the ones I showed in my previous post) because IPsec in pfsense/freebsd does not support reply-to's. The outbound NAT rules are needed so that the return traffic from those site B servers will route through the same IPsec tunnel. This is the exact same thing you were trying to do from the very beginning.

So now I was trying to apply the workaround to solve the outbound NAT issue. But when I did that, the local subnets cannot reach each other anymore even though the static routes and firewall rules are still in place.

To sum it all up:

Without workaround

local subnets can reach each other without any issues. IPsec firewall rules are set to allow all and static routes are set up correctly.
issue with outbound NAT on site A. Return traffic from site B servers reaches the site A IPsec interface but gets dropped.

With workaround

local subnets CANNOT reach each other
outbound NAT issue still present

I hope that explains everything. Let me know if you still need any clarification about the issue.

Ziomalski

@kevindd992002 I guess I'm not sure which IP you are trying to NAT. The workaround was required for me because the far end of the tunnel would not accept my 192.168 subnet. For me, all traffic over the tunnel needs to be NATed from 192.168../24 to a single IP 10.x.y.z

In your case, your tunnel will accept any IP to traverse and so you don't need the workaround. Anything else should be handled by firewall/routes. Outbound NAT should still work on traffic that is not going over the tunnel.

My problem was that when I used NAT over the tunnel, without the workaround the return traffic would be dropped because of the pfS bug. You have a working tunnel with traffic flowing successfully so I don't see what needs NAT.

kevindd992002

Ok, so let me explain in more detail. The goal is this (let's say for 32400):

Incoming traffic:

External source IP (example, 1.1.1.1):random port -> {Site A public IP}:32400 -> port forwarded (DNAT) to site B server @ port 32400 (example 192.168.20.22:32400)

Return traffic (without site A outbound NAT):

192.168.20.22:32400 -> 1.1.1.1:random port

the destination IP is 1.1.1.1 because no outbound NAT (SNAT) happened in site A's IPsec interface
This breaks the traffic flow because return traffic gets routed out the WAN interface in site B which is the default route (asymmetric flow). This happens because IPsec in pfsense/freebsd does not support reply-to. This issue is non-existent in OpenVPN because it supports reply-to and the return traffic automatically gets routed to the same interface (OpenVPN tunnel interface)

Enter the outbound NAT in site A, specifically this one:

with that, the incoming traffic will be the same as above but the source IP is now translated to the IPsec interface IP of site A, which is 10.0.2.1.
when the site B server replies back, it replies to 10.0.2.1 which is the expected flow (symmetric flow)
the return traffic gets back to site A's IPsec interface but gets dropped because of this known IPsec NAT issue
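To make the two flows above concrete, here is a small Python model of them. It is only a sketch of the intended behaviour and does not reproduce the IPsec NAT bug itself. Assumptions: `203.0.113.10` is a placeholder for site A's public IP (the real one is not given here), the external source port 50000 is arbitrary, SNAT is simplified to rewrite only the address, and site B is assumed to route the transit network and site A's subnets over the tunnel with everything else going out the WAN default route. All function names are made up for illustration.

```python
import ipaddress
from collections import namedtuple

Packet = namedtuple("Packet", "src sport dst dport")

SITE_A_PUBLIC = "203.0.113.10"  # placeholder, not the real public IP
SERVER = "192.168.20.22"        # site B server from the example
SITE_A_VTI = "10.0.2.1"         # site A IPsec interface address

# Assumed site B routing: these destinations go over the tunnel,
# anything else follows the default route out the WAN.
TUNNEL_ROUTES = [ipaddress.ip_network(n) for n in (
    "10.0.2.0/30", "192.168.10.0/24", "192.168.30.0/24",
    "192.168.40.0/24", "192.168.55.0/24",
)]


def site_a_forward(pkt, snat):
    """Apply the port forward (DNAT) and, optionally, the outbound NAT (SNAT)."""
    out = pkt._replace(dst=SERVER)
    if snat:
        out = out._replace(src=SITE_A_VTI)
    return out


def server_reply(pkt):
    """The server answers whatever source address it saw."""
    return Packet(pkt.dst, pkt.dport, pkt.src, pkt.sport)


def site_b_egress(pkt):
    """Pick the interface site B uses for a packet, by destination only."""
    dst = ipaddress.ip_address(pkt.dst)
    return "ipsec" if any(dst in net for net in TUNNEL_ROUTES) else "wan"


def trace(snat):
    """Follow one external request to the server and return (reply, egress)."""
    inbound = Packet("1.1.1.1", 50000, SITE_A_PUBLIC, 32400)
    reply = server_reply(site_a_forward(inbound, snat))
    return reply, site_b_egress(reply)
```

`trace(snat=False)` shows the reply addressed to 1.1.1.1 leaving via site B's WAN (the asymmetric case), while `trace(snat=True)` shows it addressed to 10.0.2.1 and going back over the tunnel, which is where site A is then expected to translate it back to the original source.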

Here's actually my post about it where I show some packet captures:

https://forum.netgate.com/topic/159252/ipsec-outbound-nat-to-interface-address-reply-traffic-destination-ip-not-being-translated-back-to-original-source-ip/4

Here's another post from another user that's very similar to what I explained. His post has a diagram so you might understand it better:

https://forum.netgate.com/topic/139593/traffic-from-internet-through-ipsec-vti-not-returning-the-same-way/8

But basically, the workaround to the reply-to deficiency "should've" been the outbound NAT thing, but that isn't working either. Then we have another workaround with those sysctl commands, which @jimp said is working, but it is not working for me.

kevindd992002

And just to be clear, you are correct in that I have a working tunnel if the traffic flow is between two devices in any of the local subnets (whether in site A or in site B). If the source is external, then that's a different story as I explained above. That's why I need outbound NAT.

So both our use cases are similar. The only difference is that you needed to SNAT one whole subnet to a single IP because that was your requirement. And like you said, you also observed that the return traffic does not "return" properly to the origin because of this bug.

In my case, I am SNATting "any source to site B server" to the tunnel IP address so that the return traffic gets back to the origin (external) properly.

kevindd992002

@Ziomalski do you have any further ideas here?