Routed IPsec using if_ipsec VTI interfaces

obrienmd

Sure, I'm pretty comfortable with routing, just not the ipsec side. I'll try a shared /30.

Right now, status is very strange. Gateways are showing up (i.e. they can ping each other, and I see that in tcpdump), but when I try to ping from the CLI, even using -S (to set the source in the same subnet, just in case), I get no reply. Very odd.

States are weird: some are going through enc0, some through the actual interface.

Washington:
enc0 icmp 10.91.110.2:8558 <- 10.91.110.1:8558       0:0
ipsec1 icmp 10.91.110.2:10026 -> 10.91.110.1:10026       0:0
ipsec1 icmp 10.91.110.2:33717 -> 10.91.110.1:33717       0:0

Texas:
ipsec1 icmp 10.91.110.1:8558 -> 10.91.110.2:8558       0:0
enc0 icmp 10.91.110.1:33717 <- 10.91.110.2:33717       0:0
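
To make the split between enc0 and ipsec1 easier to see, the state lines can be tallied per interface and direction. This is only a rough sketch: it assumes each line has the shape "interface protocol address arrow address counters" as pasted above, and the function name is made up for illustration.

```python
from collections import Counter

# State lines copied from the Washington side of this post.
WASHINGTON = """\
enc0 icmp 10.91.110.2:8558 <- 10.91.110.1:8558       0:0
ipsec1 icmp 10.91.110.2:10026 -> 10.91.110.1:10026       0:0
ipsec1 icmp 10.91.110.2:33717 -> 10.91.110.1:33717       0:0
"""


def count_by_interface(state_text):
    """Count states per (interface, direction) pair."""
    counts = Counter()
    for line in state_text.splitlines():
        fields = line.split()
        # Expect: iface proto addr arrow addr counters
        if len(fields) < 5 or fields[3] not in ("->", "<-"):
            continue  # skip blank or unrecognized lines
        counts[(fields[0], fields[3])] += 1
    return counts


for (iface, direction), n in sorted(count_by_interface(WASHINGTON).items()):
    print(f"{iface:8} {direction} {n}")
```

On the Washington data this shows the inbound state on enc0 and the outbound states on ipsec1, which is the asymmetry described above.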

obrienmd

Eureka - putting them in a shared /30 seems to work. Now to see if all the GRE weirdness from previous use gets me with VTI :)

State matching: https://redmine.pfsense.org/issues/4479
Another: GREs sometimes open states before ipsec, then can't "get going" until states are cleared

jimp

@obrienmd said in Routed IPsec using if_ipsec VTI interfaces:

> Eureka - putting them in a shared /30 seems to work. Now to see if all the GRE weirdness from previous use gets me with VTI :)
>
> State matching: https://redmine.pfsense.org/issues/4479

I haven't tested TCP much but it shouldn't have the same issues.

> Another: GREs sometimes open states before ipsec, then can't "get going" until states are cleared

That you can solve by putting floating rules outbound on WAN to stop your traffic from leaking and making incorrect states.

obrienmd

Thanks for all the help thus far Jim!

This is super exciting - it seems to work great thus far. The floating rules thing was always a little iffy for us (one in 5-6 reboots would get bad states even though non-IKE/ESP traffic was forbidden), though I'm with you in principle.

One last (I hope) weird issue: firewall-originated traffic targeting anything other than the far firewall's ipsec tunnel IP goes out the ipsec interface (i.e. the route works as expected), but a dump on the far-side interface doesn't show the traffic arriving. So:

From Texas LAN host to Washington firewall - pings, services work
From Texas LAN host to Washington LAN host - pings, services work
From Texas firewall to Washington firewall ipsec tunnel IP - pings, services work
From Texas firewall to Washington firewall LAN IP - pings, services fail (see outbound in tcpdump, not inbound on far side)
From Texas firewall to Washington LAN host - pings, services fail (see outbound in tcpdump, not inbound on far side)

jimp

I'll have to set up a better test to try that out, but I found an issue with the interface numbering/reqids that I need to fix before getting back to that.

obrienmd

@jimp yep, I think I'm seeing the reqid issue myself. Every few reboots, I get this complaint and no traffic flows:

Jun 5 08:08:59	charon		12[KNL] received an SADB_ACQUIRE with policy id 2 but no matching policy found
Jun 5 08:08:59	charon		12[KNL] creating acquire job for policy {near_wan_ip}/32|/0 === {far_wan_ip}/32|/0 with reqid {0}
Jun 5 08:08:59	charon		14[CFG] trap not found, unable to acquire reqid 0
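
For spotting this after a reboot, a small script can pull the failing reqid out of the charon log. A minimal sketch, assuming the log text has already been read into a string and that the message wording matches the lines above; the function name is hypothetical.

```python
import re

# Log lines as pasted above (whitespace normalized).
LOG = """\
Jun 5 08:08:59 charon 12[KNL] received an SADB_ACQUIRE with policy id 2 but no matching policy found
Jun 5 08:08:59 charon 12[KNL] creating acquire job for policy {near_wan_ip}/32|/0 === {far_wan_ip}/32|/0 with reqid {0}
Jun 5 08:08:59 charon 14[CFG] trap not found, unable to acquire reqid 0
"""

ACQUIRE_FAIL = re.compile(r"trap not found, unable to acquire reqid (\d+)")


def failed_reqids(log_text):
    """Return the reqids charon reported it could not acquire, in log order."""
    return [int(m.group(1)) for m in ACQUIRE_FAIL.finditer(log_text)]


print(failed_reqids(LOG))
```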

jimp

If it works at all, it probably isn't the same issue. In my case I'm seeing the interface end up with one number but the reqid in the ipsec config has a different number, so no traffic ever reaches the interface due to the mismatch. That should be an all-or-nothing situation.

If you only have one P1/P2 or even only one P2 per P1 then it should be OK as-is, just by coincidence.
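
The mismatch described here can be checked mechanically once the interface names and configured reqids have been collected by hand. The sketch below assumes, as this post implies, that the number in the ipsecN interface name and the reqid are expected to be equal; the sample values are hypothetical and not taken from any real config.

```python
import re


def mismatched_reqids(pairs):
    """Return (interface, reqid) entries whose interface number differs from the reqid.

    pairs: iterable of (interface_name, configured_reqid) tuples.
    """
    bad = []
    for ifname, reqid in pairs:
        m = re.fullmatch(r"ipsec(\d+)", ifname)
        if m is None:
            continue  # not an if_ipsec interface name
        if int(m.group(1)) != reqid:
            bad.append((ifname, reqid))
    return bad


# Hypothetical values for illustration only.
observed = [("ipsec1", 1), ("ipsec2", 3)]
print(mismatched_reqids(observed))
```

With a single P1/P2 the list would typically have one entry, which fits the remark that such setups line up by coincidence.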

obrienmd

When I see those errors, it really doesn't work at all. I have connected P1s and P2s, but traffic isn't flowing at all (not the previous situation two posts up).

jimp

I just pushed some changes to how the IPsec interfaces are formed. The numbering of the interfaces has changed, so to be safe you should unassign the interface before upgrading. I'll work on some upgrade code in the morning. The way strongswan forms the reqid should now line up better with what the if_ipsec interfaces want. I still need to do some more testing.

obrienmd

Sweet - I'll test as soon as the snapshots show up and let you know how it goes!

I did notice that the official Netgate boxes I'm working with (SG-2440s mostly) seem to "see" snapshots for 2.4.4 later than the community edition installs I'm also testing with.

jimp

The factory snapshots happen on a different schedule than the CE snapshots so they won't ever be exactly the same. Close, but not the same. Also depends on how things get merged back into factory and if there are any conflicts.

jimp

Looks like it's better and worse. I can pass traffic between the hosts that failed before, but the gateways are not being generated properly, so static routes and so on won't work until I fix that up.

I see the code blocks I didn't update to the new style so I'll fix those up in the morning.

obrienmd

Thanks Jim!

jimp

OK, I just pushed the updated gateway code and it's working well for me now.

I do, however, see the same behavior you did where the firewall can't reach a routed network on the far side using the ipsec interface address as the source. It does work if I set the source to be the LAN, however.

Using the ipsecX interface address as the source:

: ping -S 10.6.106.1 10.7.0.1
PING 10.7.0.1 (10.7.0.1) from 10.6.106.1: 56 data bytes
^C
--- 10.7.0.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss

Going LAN to LAN from the firewall:

: ping -S 10.6.0.1 10.7.0.1
PING 10.7.0.1 (10.7.0.1) from 10.6.0.1: 56 data bytes
64 bytes from 10.7.0.1: icmp_seq=0 ttl=64 time=0.802 ms
64 bytes from 10.7.0.1: icmp_seq=1 ttl=64 time=0.883 ms
64 bytes from 10.7.0.1: icmp_seq=2 ttl=64 time=0.716 ms
^C
--- 10.7.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss

Routes all look correct. On the source node, traffic appears on the ipsecX and enc0 interfaces, but the counters on the child SA do not increase and no ESP leaves, so somehow it isn't making its way to that connection. I'll keep poking at it, but it's not the end of the world, since the same situation also didn't work on plain IPsec, though we hoped routed IPsec would be a cure for that.
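
To repeat the two source-address tests without retyping them, the commands can be generated from a small table. This is a sketch only: it uses the addresses from this post, assumes the FreeBSD ping flags -c (count) and -S (source address), and the helper names are made up. It prints the commands; `reachable()` is meant to be called on the firewall itself.

```python
import subprocess


def ping_cmd(source, target, count=3):
    """Build a FreeBSD-style ping command with an explicit source address."""
    return ["ping", "-c", str(count), "-S", source, target]


def reachable(source, target, count=3):
    """Run the ping and report whether it exited successfully."""
    result = subprocess.run(
        ping_cmd(source, target, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Addresses from the post: local ipsecX address, local LAN address, far-side LAN address.
CASES = [("ipsecX address", "10.6.106.1"), ("LAN address", "10.6.0.1")]
TARGET = "10.7.0.1"

for label, source in CASES:
    print(f"{label}: {' '.join(ping_cmd(source, TARGET))}")
```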

jimp

Since that firewall-to-LAN routing issue is not a flaw in the VTI code that I can see, I've split that off into https://redmine.pfsense.org/issues/8551

obrienmd

Makes perfect sense to me - as soon as the daily build hits pfSense factory -devel, I'll start testing again!

jimp

OK, I think I have that nailed down. Apparently it does not get along with pf route-to directly on the interface. It works fine for LAN traffic but not traffic exiting from the firewall itself. I pushed a fix, should be in snaps soonish.

obrienmd

Cool, thanks!

Question - and this might be sacrilege - can I set my Factory boxes to download CE snapshots? I poked around the repos, but it looks like just swapping the pfSense.conf one didn't really work out on a test box :)

jimp

@obrienmd said in Routed IPsec using if_ipsec VTI interfaces:

> Cool, thanks!
>
> Question - and this might be sacrilege - can I set my Factory boxes to download CE snapshots? I poked around the repos, but it looks like just swapping the pfSense.conf one didn't really work out on a test box :)

Not easily. Several things need to be adjusted, and it's just not worth the hassle to downgrade like that in-place. All the changes I made today, including the fix for that route-to issue, have been synchronized to Factory, so they should show up in snapshots for both CE and Factory by the morning.

obrienmd

Much appreciated.

This is going to help in a lot of places... Now I just have to get Verizon to terminate mobile private network tunnels as VTI :) Wish me luck...