IPsec VTI does not pass traffic on 2.6.0

thatsysadmin

Should've waited for it to brew a little more. Ugh. Luckily I had an OpenVPN server on the remote box to troubleshoot with.

Anyway, I upgraded to 2.6.0 just now on 2 boxes that are connected via StS VTI. The upgrade sorta went off without a hitch, except the boxes aren't able to ping each other's IPsec VTI gateway IP.

I tried resaving/rebuilding my VPN config and resaving interfaces, to no avail. Is anyone else having issues with this?

A Former User

@thatsysadmin

thatsysadmin

@silence
I saw that. Even went to rebuild the tunnel. Still no dice.

I only have the old labeling. ipsec1
The tunnel I just built had ipsec3 on both sides.
There weren't any new interfaces when I did the upgrade.

jimp

The new IPsec code in 2.6.0 has been in for quite some time and has been very thoroughly tested both internally and by the community.

The interface ipsec1 is the new format. In the past that would have been ipsec1000 or similar.

Does the IPsec status show the tunnel P1 and P2s as connected?

Do your IPsec P1 entries on both sides use the remote peer IP address or an FQDN?

What does ifconfig ipsec1 look like on both sides? (or whatever the interface name is currently)

thatsysadmin

@jimp
Was working on 2.5.2, so I'm not sure what could have happened. It connects fine, but the boxes can't ping each other's gateway IP (the 10.0.100.x address).

Local Box:

ipsec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
tunnel inet (Local IP) --> (Remote IP)
inet6 fe80::20c:29ff:fe5e:83be%ipsec1 prefixlen 64 scopeid 0x9
inet 10.0.100.1 --> 10.0.100.2 netmask 0xfffffffc
groups: ipsec
reqid: 5002
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Remote box:

ipsec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
tunnel inet (Local IP) --> (Remote IP)
inet6 fe80::20c:29ff:fe5e:83be%ipsec1 prefixlen 64 scopeid 0x9
inet 10.0.100.2 --> 10.0.100.1 netmask 0xfffffffc
groups: ipsec
reqid: 5002
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

jimp

What are your IPsec firewall rules like?

Do you see anything get dropped in the firewall log?

Any errors in the IPsec log?

Kev.i.n

I'm seeing the exact same behaviour! I thought it was just me at first.

I have pfSense connected to a UniFi security gateway using a S2S VTI. Running 2.5.2 and previous versions worked fine without issues. Following the upgrade to 2.6.0, both Phase 1 and 2 come up yet I'm not able to send anything across.

I did see the warning in the docs about the name change and confirmed this changed from ipsec1000 to ipsec1 under Interfaces > Assignments.

The firewall rules over the VPN are pretty relaxed and nothing has changed in there, so I don't believe this to be the cause.

I've since tried:

Filter Reload
Changing network port of the VTI interface to something else then back to ipsec1
Tearing down P1 of the VPN and letting it reconnect
Restarted the IPSec service (under Status > IPSec)

After re-installing the pfSense box to 2.5.2 and restoring the config, the VPN tunnel comes back up and passes traffic successfully.

When I was running 2.6.0, I did have a look at the system logs and the only thing that really popped out was the line below. Other than that, everything else looked pretty normal. (I'm no longer on 2.6.0 to verify other log entries.)

trap not found, unable to acquire reqid 5009

Hope this helps.

Kev

jimp

@kev-i-n said in IPsec VTI broken when upgrading to 2.6.0?:

trap not found, unable to acquire reqid 5009

That's normal for VTI; VTI can't use trap policies.

If you try moving to 2.6.x again, first keep a copy of your /var/etc/ipsec/swanctl.conf, then compare the copies from 2.5.2 and 2.6.0 for your configuration. Also keep a copy of the output of ifconfig -a.

For anyone else experiencing problems, it would help to see that file as well. You can redact the private info but keep as much unique as possible.

I've got about a dozen or so VTI tunnels going here in my lab with all sorts of different configurations and they all work for me.

jimp

@thatsysadmin said in IPsec VTI broken when upgrading to 2.6.0?:

ipsec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
[...]
reqid: 5002

One thing that stands out to me here is that it's ipsec1 but reqid: 5002. It should be 5001 (or ipsec2), since the interface name and the reqid should both be based on the same value. It makes me wonder what the IPsec configuration looks like both in your config.xml and in /var/etc/ipsec/swanctl.conf.
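
The name/reqid check described above can be done mechanically on saved ifconfig output. The following is a minimal Python sketch, not pfSense code: the function name `find_reqid_mismatches` is hypothetical, and the rule "ipsecN should carry reqid 5000 + N" is an assumption inferred only from the ipsec1/5001 and ipsec2/5002 pairing mentioned here.

```python
import re

# Assumption: interface ipsecN is expected to have reqid 5000 + N,
# inferred from the ipsec1/5001 and ipsec2/5002 pairing in this thread.
REQID_BASE = 5000


def find_reqid_mismatches(ifconfig_output):
    """Return (interface, actual_reqid, expected_reqid) for each mismatch."""
    mismatches = []
    current = None  # (name, index) of the ipsecN interface being read
    for line in ifconfig_output.splitlines():
        # Interface header lines look like "ipsec1: flags=8051<...> ..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            name = header.group(1)
            vti = re.fullmatch(r"ipsec(\d+)", name)
            current = (name, int(vti.group(1))) if vti else None
            continue
        reqid = re.match(r"^\s*reqid:\s*(\d+)", line)
        if reqid and current:
            name, index = current
            actual = int(reqid.group(1))
            expected = REQID_BASE + index
            if actual != expected:
                mismatches.append((name, actual, expected))
    return mismatches


sample = """ipsec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
groups: ipsec
reqid: 5002
"""
print(find_reqid_mismatches(sample))  # [('ipsec1', 5002, 5001)]
```

An empty list means every ipsecN interface in the output follows the assumed rule. A non-empty list only flags something worth comparing against config.xml and swanctl.conf; it does not by itself explain why traffic fails.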

Kev.i.n

@jimp OK, I did another upgrade and took a copy of /var/etc/ipsec/swanctl.conf.

The only differences I can see, other than some additional comments, are the reqid value and IPv6 added to remote_ts and local_ts.

Redacted a few bits but configs shown below:

2.5.2

# This file is automatically generated. Do not edit
connections {
	con1000 {
		fragmentation = yes
		unique = replace
		version = 2
		proposals = aes256-sha1-modp2048
		dpd_delay = 10s
		dpd_timeout = 60s
		rekey_time = 0s
		reauth_time = 0s
		over_time = 8640s
		rand_time = 8640s
		encap = no
		mobike = no
		local_addrs = <local_endpoint>
		remote_addrs = <remote_endpoint_fqdn>
		local {
			id = <local_endpoint>
			auth = psk
		}
		remote {
			id = <remote_endpoint>
			auth = psk
		}
		children {
			con1000 {
				dpd_action = restart
				policies = no
				life_time = 28800s
				rekey_time = 25920s
				rand_time = 2880s
				start_action = start
				remote_ts = 10.6.106.2,0.0.0.0/0
				local_ts = 10.6.106.1/30,0.0.0.0/0
				reqid = 1000
				esp_proposals = aes256-sha1-modp2048
			}
		}
	}
}
secrets {
	ike-0 {
		secret = <redacted>
		id-0 = %any
		id-1 = <remote_endpoint>
	}
}

2.6.0

# This file is automatically generated. Do not edit
connections {
	con1 {
		# P1 (ikeid 1)
		fragmentation = yes
		unique = replace
		version = 2
		proposals = aes256-sha1-modp2048
		dpd_delay = 10s
		rekey_time = 0s
		reauth_time = 0s
		over_time = 8640s
		rand_time = 8640s
		encap = no
		mobike = no
		local_addrs = <local_endpoint>
		remote_addrs = <remote_endpoint_fqdn>
		local {
			id = <local_endpoint>
			auth = psk
		}
		remote {
			id = <remote_endpoint>
			auth = psk
		}
		children {
			con1 {
				# P2 (reqid 9): Route-Based
				policies = no
				life_time = 28800s
				rekey_time = 25920s
				rand_time = 2880s
				start_action = start
				remote_ts = 10.6.106.2,0.0.0.0/0,::/0
				local_ts = 10.6.106.1/30,0.0.0.0/0,::/0
				reqid = 5001
				esp_proposals = aes256-sha1-modp2048
				dpd_action = restart
			}
		}
	}
}
secrets {
	ike-0 {
		secret = <redacted>
		id-0 = %any
		id-1 = <remote_endpoint>
	}
}
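
For anyone repeating this comparison, a mechanical diff can surface differences that are easy to miss by eye. The following is a minimal Python sketch using the standard `difflib` module; the function names and the version labels passed as defaults are illustrative, and comment lines are dropped on the assumption that only settings matter for the comparison.

```python
import difflib


def strip_comments(text):
    """Drop blank and comment-only lines so only settings are compared."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            kept.append(stripped)
    return kept


def diff_configs(old_text, new_text, old_label="2.5.2", new_label="2.6.0"):
    """Return unified-diff lines between two swanctl.conf contents."""
    return list(difflib.unified_diff(
        strip_comments(old_text),
        strip_comments(new_text),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
        n=0,  # no context lines; show only what changed
    ))


old = "children {\n    reqid = 1000\n}\n"
new = "children {\n    # P2 (reqid 9): Route-Based\n    reqid = 5001\n}\n"
for line in diff_configs(old, new):
    print(line)
```

Because indentation is stripped before comparing, the output shows changed settings but not which block they belong to, so it is best read alongside the original files.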

timboau 0

Yes, same problem here. I was running 2.5.2 (which broke viewing the IPSEC connection status).
Upgraded to 2.6.0 and 39 of 40 IPSEC connections connected first time (I hoped it was just a glitch).

Woke up this morning to a nightmare of tunnels just not connecting. About 8 were a mess and not functional (they were mostly connecting back to version 2.5.1).

Luckily it was a VM so rolled back to 2.5.2 and all tunnels came up immediately.

timboau 0

@jimp If you're interested, I'm able to fire up the VM with 2.6 on it and you can see what it's doing. It will need to be outside of business hours Australia AEST, which might suit you? This was a clean upgrade from 2.5.2 to 2.6.

thatsysadmin

@jimp

What are your IPsec firewall rules like?

Generic open rule / Allow all traffic

Do you see anything get dropped in the firewall log?

Nothing for IPsec.

Any errors in the IPsec log?

2022-02-06 00:29:54.678808+00:00 charon 29338 09[KNL] <con1|2> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found

/var/etc/ipsec/swanctl.conf for one of the routers.

# This file is automatically generated. Do not edit
connections {
	bypass {
		remote_addrs = 127.0.0.1
		children {
			bypasslan {
				local_ts = 10.0.0.0/22,fc00::/7
				remote_ts = 10.0.0.0/22,fc00::/7
				mode = pass
				start_action = trap
			}
		}
	}
	con1 {
		# P1 (ikeid 1): <redacted>
		fragmentation = yes
		unique = replace
		version = 2
		proposals = aes256gcm128-sha512-modp8192
		dpd_delay = 10s
		rekey_time = 25920s
		reauth_time = 0s
		over_time = 2880s
		rand_time = 2880s
		encap = no
		mobike = no
		local_addrs = <redacted>
		remote_addrs = <redacted>
		local {
			id = fqdn:<redacted>
			auth = psk
		}
		remote {
			id = fqdn:<redacted>
			auth = psk
		}
		children {
			con1 {
				# P2 (reqid 2)
				policies = no
				life_time = 3600s
				rekey_time = 3240s
				rand_time = 360s
				start_action = start
				remote_ts = 10.0.100.2,0.0.0.0/0,::/0
				local_ts = 10.0.100.1,0.0.0.0/0,::/0
				reqid = 5001
				esp_proposals = aes256gcm128-modp8192
				close_action = start
				dpd_action = restart
			}
		}
	}
}
secrets {
	ike-0 {
		secret = <redacted>=
		id-0 = %any
		id-1 = fqdn:<redacted>
	}
}
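
One way to cross-check a file like this against the interface is to pull out its reqid values and compare them with what ifconfig reports. The following is a minimal Python sketch with hypothetical function names. In this thread, the ifconfig output posted earlier showed reqid: 5002 while this file has reqid = 5001, although the two may have been captured at different times.

```python
import re

# Matches setting lines such as "reqid = 5001"; comment lines like
# "# P2 (reqid 2)" are not matched because they do not start with "reqid".
REQID_LINE = re.compile(r"^\s*reqid\s*=\s*(\d+)\s*$", re.MULTILINE)


def config_reqids(swanctl_text):
    """Return every reqid value set in the given swanctl.conf text."""
    return [int(value) for value in REQID_LINE.findall(swanctl_text)]


def interface_reqid_is_configured(swanctl_text, interface_reqid):
    """True if the reqid reported by ifconfig also appears in the config."""
    return interface_reqid in config_reqids(swanctl_text)


sample = """
children {
    con1 {
        # P2 (reqid 2)
        policies = no
        reqid = 5001
    }
}
"""
print(config_reqids(sample))                        # [5001]
print(interface_reqid_is_configured(sample, 5002))  # False
```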

timboau 0

@thatsysadmin The IPSEC firewalls do have quite a bit going on, however nothing specific to the sites that didn't come back online.

The sites that didn't reconnect had errors about already being connected but were trying to reconnect. (Sorry, that's hopeless, but I didn't record it. I gave up and reverted to 2.5.)

Can do some testing in 6 hours. 8pm AEST if it will help

Kev.i.n

Just to add...my config uses an FQDN as the remote gateway and I noticed @thatsysadmin also mentions fqdn in his config.

Just throwing it out there whether this has anything to do with it in comparison to @jimp's VTI tunnels in the lab.

timboau 0

@kev-i-n No FQDN in my configs (my IP/Peer address).

(They all initially connected except one on boot; they just didn't reconnect.) Stopped the IPSEC service on both and started it again.
The same sites remained down.

Generally
IKEv2
P1 AES256-GCM 128 / AES-XCBC - DH 14
P2 ESP AES128/AES256-GCM
Key groups off

DPD enabled

Unique ID (replace)
Filter IPSEC Tunnel and VTI on IPSEC tab (ENC0) anything there?

thatsysadmin

@kev-i-n
I still had the same issue with the "IP option". I changed it as a troubleshooting measure.

jimp

There appear to be a couple different problems here getting lumped together.

If your VTI interfaces connect but do not pass traffic, this is the correct thread.

If you have any other issue on 22.01/2.6.0 such as tunnels failing to connect or reconnect, please start a new thread and post your information there. Be sure to include log entries from both sides as well as the information requested previously in this thread (IPsec config for the tunnel from config.xml and swanctl.conf, ifconfig output, etc).

Now, back to connected VTIs not passing traffic:

I'm still not getting the whole picture from these and the thread has gotten a bit mixed up with the other unrelated info. With one of these VTI tunnels in a state that fails to pass traffic I need the following information:

The output of ifconfig -a for the ipsecX interface(s)
The output of setkey -D and setkey -DP
The contents of /var/etc/ipsec/swanctl.conf for the tunnel(s)
The contents of config.xml for the tunnel(s) -- both P1 and all P2s

As before, you can mask private info, but try to keep it consistent. For example, replace unique IP addresses consistently, e.g. 1.2.3.4 with A.A.A.A and 9.8.7.6 with B.B.B.B, so I can tell they are different IP addresses in the output. Mask or remove any private things like PSKs or identifiers if they are sensitive.
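
Consistent masking of this kind is easy to get wrong by hand across several files. The following is a minimal Python sketch of one way to do it for IPv4 addresses. The function name `mask_ips` and the `KEEP` set are hypothetical, and leaving 0.0.0.0 and 127.0.0.1 unmasked is an assumption about what is not sensitive. It does not touch IPv6 addresses, FQDNs, identifiers, or PSKs, which still need manual redaction.

```python
import re
import string

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Assumption: these addresses are not private and are more useful unmasked.
KEEP = {"0.0.0.0", "127.0.0.1"}


def mask_ips(text, mapping=None):
    """Replace each unique IPv4 address with A.A.A.A, B.B.B.B, and so on.

    Pass the same mapping dict for every file so that an address gets the
    same placeholder everywhere. Returns (masked_text, mapping).
    """
    if mapping is None:
        mapping = {}

    def replace(match):
        ip = match.group(0)
        if ip in KEEP:
            return ip
        if ip not in mapping:
            if len(mapping) >= len(string.ascii_uppercase):
                raise ValueError("more than 26 unique addresses")
            letter = string.ascii_uppercase[len(mapping)]
            mapping[ip] = ".".join([letter] * 4)
        return mapping[ip]

    return IPV4.sub(replace, text), mapping


masked, table = mask_ips(
    "local_addrs = 1.2.3.4\nremote_addrs = 9.8.7.6\nremote_ts = 0.0.0.0/0"
)
print(masked)
```

Reusing the returned mapping for the ifconfig, setkey, and swanctl.conf outputs from the same box keeps the placeholders consistent across all of them.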

If both sides are running pfSense software then having the info from both sides would help.

timboau 0

@jimp Hi, very happy to start a new thread. Let's call it "2.6.0 ipsec is broken - don't upgrade (yet)".

Seriously, in the last few days there appears to be a lot of activity around ipsec having issues on 2.6, with people here trying to assist and fix their broken networks (remember, this is a release version).

As a Netgate developer, how about you start a new thread for systems that have had ipsec fail after upgrading from 2.5 to 2.6? From there you can be particular about what information you want put where.

Whether it be that the VTI is broken, the tunnel is broken, or the routing is broken: hey, it just stopped working after upgrading. Most customers are coming here looking for answers on why it stopped after upgrading.

My offer is still open for you to look at a working 2.5; we can switch it to 2.6 and you can diagnose it any way you want. You can have remote access. I'm keen for this to be resolved rather than sitting back concerned I'm posting into the wrong thread about a general ipsec failure.

jimp

It is working for the vast majority of people; such a dire warning is unwarranted. It has been thoroughly tested internally and, over the last six months or more, in snapshots, including heavy use on Netgate infrastructure used by all of our employees.

Keep your thread subjects relevant, for example "IPsec VTI tunnel will not reconnect on 2.6.0" or similar. Do not assume it's happening to anyone but you, and do not assume your problem is identical to others. Only after diagnosing the problem can such a determination be made.

It's important to keep each report separate so that the details do not get confused.