No Site-to-Site VPN after upgrading CE from 2.6.0 to 2.7.0
-
Thank you very much @rcoleman-netgate
From my point of view, the OpenVPN logs on the server side are not very telling. This is probably "only" a routing problem on both sides rather than a problem with OpenVPN itself, which is what the client-side log (and only that log) seems to indicate:
Server Log
...
Jul 3 07:31:17 openvpn 72067 UDPv4 link remote: [AF_UNSPEC]
Jul 3 07:31:17 openvpn 72067 UDPv4 link local (bound): [AF_INET]127.0.0.1:1195
Jul 3 07:31:17 openvpn 72067 Preserving previous TUN/TAP instance: ovpns2
Jul 3 07:31:17 openvpn 72067 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jul 3 07:31:17 openvpn 72067 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 3 07:31:17 openvpn 72067 SIGUSR1[soft,server_poll] received, process restarting
Jul 3 07:31:17 openvpn 72067 Server poll timeout, restarting
Jul 3 07:31:16 openvpn 94298 srv1.xxx.com/xx.xx.xx.xx:1197 MULTI_sva: pool returned IPv4=192.168.19.2, IPv6=(Not enabled)
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 [srv1.xxx.xxx] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:1197
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_COMP_STUBv2=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_COMP_STUB=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_LZO_STUB=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_PROTO=990
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_CIPHERS=AES-256-GCM
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_MTU=1600
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_TCPNL=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_PLAT=freebsd
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_VER=2.6.4
...
Client Log
...
Jul 3 07:31:16 openvpn 76618 [srv1.xx.net] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:1197
Jul 3 07:31:16 openvpn 76618 TUN/TAP device ovpnc3 exists previously, keep at program end
Jul 3 07:31:16 openvpn 76618 TUN/TAP device /dev/tun3 opened
Jul 3 07:31:16 openvpn 76618 /sbin/ifconfig ovpnc3 192.168.19.2/24 mtu 1500 up
Jul 3 07:31:16 openvpn 76618 /usr/local/sbin/ovpn-linkup ovpnc3 1500 0 192.168.19.2 255.255.255.0 init
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 Initialization Sequence Completed
Jul 3 07:32:12 openvpn 22998 Server poll timeout, restarting
Jul 3 07:32:12 openvpn 22998 SIGUSR1[soft,server_poll] received, process restarting
Jul 3 07:32:12 openvpn 22998 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 3 07:32:12 openvpn 22998 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jul 3 07:32:12 openvpn 22998 Preserving previous TUN/TAP instance: ovpns4
Jul 3 07:32:12 openvpn 22998 UDPv4 link local (bound): [AF_INET]127.0.0.1:1195
Jul 3 07:32:12 openvpn 22998 UDPv4 link remote: [AF_UNSPEC]
...
Is there a way to prevent the route add command from failing?
Regards,
Michael
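
To see how often the route add error occurs and which OpenVPN process logged it, the client log could be filtered with a small script. This is only a sketch: it assumes the log lines keep the `date time openvpn PID message` layout shown above, and the function name `count_errors` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "Jul 3 07:31:16 openvpn 76618 <message>".
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+openvpn\s+(\d+)\s+(.*)$")


def count_errors(log_text):
    """Count messages starting with 'ERROR:' per (pid, message) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip "..." placeholders and unrelated lines
        _timestamp, pid, message = match.groups()
        if message.startswith("ERROR:"):
            counts[(pid, message)] += 1
    return counts


# Lines copied from the client log above.
sample = """\
Jul 3 07:31:16 openvpn 76618 /usr/local/sbin/ovpn-linkup ovpnc3 1500 0 192.168.19.2 255.255.255.0 init
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 Initialization Sequence Completed
"""

for (pid, message), n in count_errors(sample).items():
    print(f"pid {pid}: {n}x {message}")
```

The count only shows that three route additions failed in that one start-up; it does not say which routes they were, so the routing table still has to be inspected separately.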
-
Dear All,
Thank you for sharing experiences! Now, I know that I am not the only dummy who is unable to handle this.
Open issues:
- Do we have any knowledge about the reasons? Might it be possible that LAGGs as LAN are the root cause?
- How do we make Netgate aware of this? I can file a bug report, but I am unsure if anyone will take notice in due time?
- Reverting to 2.6.0 is also what I had in mind. I also use freeradius3. Is there a way to reinstall or do I need to set up a freeradius server elsewhere first?
- Is it possible to solve the problems associated with IPsec site to site? In my case the issues are that HAProxy does not see the other side (known issue) and that rsync does work across the tunnel (do not understand this, but this is a major bottleneck).
- Can anyone instruct me on using WireGuard for site-to-site VPN with HA multi-WAN behind NAT?
- Does anyone know if the non-CE variant comes without this problem?
Regards,
Michael
-
Dear All,
Trying to post a bug report bears a high risk of just being turned down:
https://redmine.pfsense.org/issues/14541
Would anyone in the forum please be so kind as to:
(a) help me understand the issue and change the configuration accordingly, and/or
(b) file a bug report in a fashion that does not get turned down?
Not having site-to-site VPN because of an upgrade is not something that is bearable for a longer period.
Regards,
Michael
-
On one side, I have 192.168.1.0/24 and 192.168.4.0/24. On the other side, I have 192.168.12.0/24. It has been this way for more than 10 years without any issues. I will try using only the 192.168.4.0/24 network on the one side and report back if that makes a difference.
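
A quick way to rule out overlapping subnets is to compare them pairwise. The sketch below uses Python's standard `ipaddress` module with the networks named in this post plus the tunnel network seen in the client log (192.168.19.2/24); the labels and the function name `find_overlaps` are only illustrative.

```python
import ipaddress
from itertools import combinations

# Networks taken from the thread; the labels are illustrative only.
networks = {
    "side A LAN 1": "192.168.1.0/24",
    "side A LAN 2": "192.168.4.0/24",
    "side B LAN": "192.168.12.0/24",
    "tunnel": "192.168.19.0/24",
}


def find_overlaps(named_networks):
    """Return pairs of labels whose networks overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in named_networks.items()}
    return [
        (a, b)
        for a, b in combinations(parsed, 2)
        if parsed[a].overlaps(parsed[b])
    ]


overlaps = find_overlaps(networks)
print(overlaps or "no overlapping networks")
```

None of these four networks overlap, so the "extremely common subnet" note in the log is a warning about 192.168.1.x in general and not evidence of a conflict between these particular sites.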
-
Dear All,
Replacing the network 192.168.1.0/24 on one side by 192.168.4.0/24 did make the "extremely common subnet address" note go away. However, routing would still not work.
Changing a connection from SSL/TLS to shared key also made the "FreeBSD route add command failed: external program exited with error status: 1" error go away. Nevertheless, routing would still not work.
@mslauria Did you start from scratch with the entire configuration (all interfaces, packages, rules ...)?? Did you just delete the OpenVPN configurations and leave other things in place? What may be mixed? Will the system not work upon mixing for example an SSL/TLS OpenVPN server for remote access with a Shared Key client for the site-to-site OpenVPN? In the past, none of this was any problem, as far as I know.
Regards,
Michael
-
@michaelschefczyk
I have the same problem after upgrading to 2.7.0.
-
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
What added commands do you have on either side?
Can you show a screen shot of your config pages from the "tunnel settings" down?
-
Thank you @chpalmer !
My "Custom options" boxes are empty on both sides.
Screenshot Server
Screenshot Client
As indicated in the initial post, I did try around with having the IPv4 remote networks empty vs. populated and with including the remote network under local network or not. This did not change the situation. I will be glad to try again, if you instruct me to.
-
@nazelus That would be great! Please state which field in the configuration GUI you are looking at. I find it difficult to understand what "addition Lan" means. I do have two LAN networks in the server side indeed (192.168.1.0/24 and 192.168.4.0/24). I will need to connect OpenVPN to connect to at least (a) the default LAN 192.168.1.0/24 and (b) the router itself (probably 127.0.0.1) so that package HAProxy can see through the tunnel.
-
So far I have upgraded two (remote) boxes from 2.6.0 to 2.7.0.
Both have OpenVPN (TLS + Passwd) connections to the central 2.6.0 server. Both upgrades went well, no VPN issues.
/Bingo
-
Dear All,
If we remain unable to clarify this quickly, my aim is to roll back to 2.6.0. I did that on the secondary CARP member in my home office: install, recover the config, unplug WAN on the first boot, change the update version back to 2.6.0, reboot again, plug in WAN and install packages manually. This does seem to work. I will try the primary unit this evening. On Saturday, I will travel to the other end of the VPN (600 km) and try it there.
I did try to file a second bug report. Jim Pingle replied:
"Please do not open duplicate issues. Keep the discussion on the forum and if there is a proven bug and not a configuration issue, then the original can be reopened.
We cannot be responsible for making sure every possible variation of OpenVPN works across every version/upgrade, especially when OpenVPN itself changes and deprecates functions/features or changes how things work. Many users have working OpenVPN tunnels on 2.7.0 and current Plus versions that have been upgraded and working for years, it's highly unlikely to be a bug, but something in your setup that isn't correct or needs adjusted to compensate for OpenVPN changes. This is not the place to track that down, that is what the forum is for.
Be sure to post complete settings for all nodes involved, not just general description of the setup."
If anyone has better chances to involve the developers than me, help would be most welcome.
I will certainly not be glad to rebuild my > 40 MB configuration from scratch. I am also unable to post the configuration publicly, for obvious reasons.
Regards,
Michael
-
@michaelschefczyk Did you try a cold start of the 2.7.0 box? My problem somehow went away after I powered it off and on again.
-
As I mentioned on Redmine, you most likely have a configuration problem that has always been wrong, but something changed on the backend and now your previously "working" settings, which happened to be incorrect in some way, have stopped working.
A few common things we have seen are:
- SSL/TLS setups where people had filled in a tunnel network on the client when they should not
- SSL/TLS setups with a /24 tunnel network where the Client-Specific Overrides were not setup correctly breaking LAN-to-LAN routing
- Static Key configurations using the wrong subnet size for the tunnel network (e.g. /24 when it should have been /30)
- Not explicitly setting the same topology on both sides
- Some other routing conflict preventing the correct entries from being in the tables
- A configuration that worked by chance before that was never correct (e.g. routes in System > Routing instead of in OpenVPN natively)
- Policy routing rules overriding the VPN and sending the client traffic in some unexpected path
Since you won't (or can't) post your settings, there isn't any way for us to really help you diagnose things, but it sounds like it's your routing/route table entries that are broken or missing. Either you are not in the correct mode (e.g. SSL/TLS with /24 tunnel network requires setting up override entries for client network routing, not static routes), or some other similar problem.
Compare your setup against the reference here: https://docs.netgate.com/pfsense/en/latest/recipes/openvpn-s2s-tls.html
If you were using a routing protocol like OSPF before, then you either had to have been using shared key, a /30 tunnel network, or maybe tap mode in certain cases.
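
Some of the points in the list above can be written down as a rough checklist in code. This is a minimal sketch, not a pfSense tool: the dictionaries and their keys (`mode`, `tunnel_network`, `topology`, `client_overrides`) are hypothetical and do not mirror pfSense's config.xml, and treating every tunnel network wider than /30 like the /24 case is an assumption.

```python
import ipaddress


def lint_site_to_site(server, client):
    """Return warnings for a few of the misconfigurations listed above.

    Both arguments are plain dicts with hypothetical keys.
    """
    warnings = []
    mode = server["mode"]  # "tls" or "shared_key"
    tunnel = ipaddress.ip_network(server["tunnel_network"])

    # Static key setups: the list above names /30 as the expected size.
    if mode == "shared_key" and tunnel.prefixlen != 30:
        warnings.append(f"shared key tunnel is /{tunnel.prefixlen}, expected /30")

    if mode == "tls":
        # The client should not have a tunnel network filled in.
        if client.get("tunnel_network"):
            warnings.append("TLS client has a tunnel network filled in")
        # Assumption: any tunnel wider than /30 needs client-specific overrides.
        if tunnel.prefixlen < 30 and not server.get("client_overrides"):
            warnings.append("TLS server has no client-specific overrides")

    # The topology should be set explicitly and identically on both sides.
    if server.get("topology") is None or server.get("topology") != client.get("topology"):
        warnings.append("topology is not explicitly the same on both sides")

    return warnings


# Illustrative configuration, not the poster's real settings.
server_cfg = {"mode": "tls", "tunnel_network": "192.168.19.0/24",
              "topology": "subnet", "client_overrides": []}
client_cfg = {"tunnel_network": "192.168.19.0/24", "topology": None}

for warning in lint_site_to_site(server_cfg, client_cfg):
    print("warning:", warning)
```

Routing conflicts and policy routing rules from the same list are not covered here, since they depend on the live routing table and firewall rules rather than on the OpenVPN settings alone.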
-
Dear All,
Please find the most relevant pages of my configuration below for comments. If other views are required, please let me know. Firewall rules as before are in place.
I think that this is in line with the tutorial. The tunnel does get established, but nothing else works.
My assumption is that the issues are due either to the certificates or to routing issues outside OpenVPN, including LAGG.
Regards,
Michael
Server config
Server override
Client config
-
The tutorial was checked against 2.7.0 and 23.05, but there may be slight wording differences.
The "Automatically generate" box only shows up when you first create a tunnel; it won't show when editing.
I just re-followed the recipe a week or two ago and confirmed it all worked, so if it doesn't work for you, something isn't matched up or wasn't followed as shown.
-
At a glance what stands out is that the server is bound to localhost so maybe your port forward for that server isn't correct so the client can't reach it. Otherwise there isn't enough info to say why it might be failing (could be certs, for example)
Also, with just the one client you probably don't want to list that client's own network as "local" to the server, since that will make the client try (and probably fail) to pull a route for its own network from the server.
Also you might try changing the TLS config so it's auth only and not auth+encryption.
If it still fails after all that, check the logs and see what it says on both sides.
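
The point about the client pulling a route for its own network can be checked mechanically. The sketch below assumes the client is the side with 192.168.12.0/24 (the thread does not say which side is which) and that the server's "local" networks are all handed to the client as routes; the function name is made up for illustration.

```python
import ipaddress


def conflicting_pushed_routes(server_local_networks, client_connected_networks):
    """Return server 'local' networks that overlap a network the client is already attached to."""
    connected = [ipaddress.ip_network(n) for n in client_connected_networks]
    conflicts = []
    for cidr in server_local_networks:
        pushed = ipaddress.ip_network(cidr)
        if any(pushed.overlaps(c) for c in connected):
            conflicts.append(cidr)
    return conflicts


# Illustrative values based on the subnets mentioned earlier in the thread.
server_local = ["192.168.1.0/24", "192.168.4.0/24", "192.168.12.0/24"]
client_connected = ["192.168.12.0/24", "192.168.19.0/24"]

print(conflicting_pushed_routes(server_local, client_connected))
```

Any network this reports is a candidate for removal from the server's local network field when there is only one client.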
-
@jimp Thank you very much!
Binding to localhost is due to Multi-WAN, following this tutorial:
https://docs.netgate.com/pfsense/en/latest/multiwan/openvpn.html#bind-to-localhost-and-setup-port-forwards
The NAT port forwards and rules were there before the upgrade. If they did not work, I guess the tunnel would not come up, which it does reliably. I would very much like to keep that for extra resilience.
I never liked adding the remote network to the local network field on the server side. I never had that in the past. I only added it because of this tutorial: https://docs.netgate.com/pfsense/en/latest/recipes/openvpn-s2s-tls.html
There it says "Enter the LAN subnets for all sites including the server: " under "IPv4 Local Network(s)". I did remove that.
I also changed from TLS Authentication and Encryption to just Authentication.
Unfortunately, that does not change the outcome.
Regards,
Michael
-
Having the remote networks in the "local" field lets the others know they can also be reached through the server, which is nice for >1 client but not needed for just one.
If it still won't form a link now you'll need to start looking at logs to see what is going on.
The server log should show a connection coming in from the client. If it doesn't, and the client process is running, then the client isn't reaching the server which could be DNS, your NAT/firewall rules, etc. The client logs should show what it's doing there.
Most other problems would show in the logs, too, like a key or cert mismatch and so on.
-
@jimp The tunnel does connect without issues and it does stay up. The logs are similar to those further up in the thread.
By my understanding, this will likely be a routing issue.
Server log:
Jul 6 22:11:43 openvpn 45966 library versions: OpenSSL 1.1.1t-freebsd 7 Feb 2023, LZO 2.10
Jul 6 22:11:43 openvpn 45966 OpenVPN 2.6.4 amd64-portbld-freebsd14.0 [SSL (OpenSSL)] [LZO] [LZ4] [PKCS11] [MH/RECVDA] [AEAD] [DCO]
Jul 6 22:11:42 openvpn 98100 Initialization Sequence Completed
Jul 6 22:11:42 openvpn 98100 UDPv4 link remote: [AF_UNSPEC]
Jul 6 22:11:42 openvpn 98100 UDPv4 link local (bound): [AF_INET]127.0.0.1:1196
Jul 6 22:11:42 openvpn 98100 /usr/local/sbin/ovpn-linkup ovpns3 1500 0 192.168.18.1 255.255.255.0 init
Jul 6 22:11:42 openvpn 98100 /sbin/ifconfig ovpns3 192.168.18.1/24 mtu 1500 up
Jul 6 22:11:42 openvpn 98100 TUN/TAP device /dev/tun3 opened
Jul 6 22:11:42 openvpn 98100 TUN/TAP device ovpns3 exists previously, keep at program end
Jul 6 22:11:42 openvpn 98100 WARNING: experimental option --capath /var/etc/openvpn/server3/ca
Jul 6 22:11:42 openvpn 98100 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jul 6 22:11:42 openvpn 98100 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 6 22:11:42 openvpn 97789 DCO version: FreeBSD 14.0-CURRENT #1 RELENG_2_7_0-n255866-686c8d3c1f0: Wed Jun 28 04:21:19 UTC 2023 root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/obj/amd64/LwYAddCr/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/sources/FreeBSD-src-REL
Jul 6 22:11:42 openvpn 97789 library versions: OpenSSL 1.1.1t-freebsd 7 Feb 2023, LZO 2.10
Jul 6 22:11:42 openvpn 97789 OpenVPN 2.6.4 amd64-portbld-freebsd14.0 [SSL (OpenSSL)] [LZO] [LZ4] [PKCS11] [MH/RECVDA] [AEAD] [DCO]
Client Log:
Jul 6 22:12:59 openvpn 5948 [srv.xxx.xxx Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:1196
Jul 6 22:12:59 openvpn 5948 Preserving previous TUN/TAP instance: ovpnc2
Jul 6 22:12:59 openvpn 5948 Initialization Sequence Completed
Jul 6 22:13:49 openvpn 22998 Server poll timeout, restarting
Jul 6 22:13:49 openvpn 22998 SIGUSR1[soft,server_poll] received, process restarting
Jul 6 22:13:49 openvpn 22998 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 6 22:13:49 openvpn 22998 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
-
OK so if the VPN is connected now that narrows things down a bit.
The route errors are probably the client failing to add unnecessary/duplicate routes but whether or not that's a problem depends on what the route table looks like in the end.
If the firewalls themselves can ping the other LANs then the OS routing is probably OK and there is more likely a problem in the local firewall rules/NAT.
There are a lot of troubleshooting suggestions for that sort of stuff at https://docs.netgate.com/pfsense/en/latest/troubleshooting/connectivity.html
But to boil that down a bit, you should check:
- Look at the OS routing table on both sides, make sure there are entries for the opposite side LAN(s) and that those routes are pointing to the correct OpenVPN interface(s).
- When you ping from the firewall make sure to ping from both the OpenVPN interface itself (default source) and again using the LAN interface as a source. That tests routing between the LANs in both directions, not just to/from the OpenVPN interface directly, which is a much different test.
- When pinging from a client on the LAN, look at its states under Diagnostics > States on both firewalls, there should be two entries on each, one as it enters the firewall and one as it exits the firewall. If something like outbound NAT is catching it, the NAT would show in these states. If the traffic is taking the wrong path, that would also show (e.g. it should go in LAN, out VPN, in VPN, out LAN).
That should give you a better idea of what's going on and what needs fixed.
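
The first check (routes for the opposite-side LANs pointing at the right OpenVPN interface) can be sketched as a small script over the text of the routing table. The sample table below is hypothetical, written in the column layout that `netstat -rn` prints on FreeBSD (Destination, Gateway, Flags, Netif); the gateway, `igb0` and `lagg0` entries are made up, and the function names are illustrative.

```python
import ipaddress

# Hypothetical routing table text, not taken from the poster's firewall.
SAMPLE_TABLE = """\
Destination        Gateway            Flags     Netif Expire
default            203.0.113.1        UGS        igb0
192.168.1.0/24     link#2             U         lagg0
192.168.19.0/24    link#9             U        ovpns3
192.168.12.0/24    192.168.19.2       UGS      ovpns3
"""


def parse_routes(table_text):
    """Map each destination network to the interface in the fourth column."""
    routes = {}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            dest = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # header line, "default", and other non-CIDR entries
        routes[dest] = fields[3]
    return routes


def check_routes(routes, expected):
    """Compare parsed routes against {cidr: interface} expectations."""
    problems = []
    for cidr, iface in expected.items():
        actual = routes.get(ipaddress.ip_network(cidr))
        if actual is None:
            problems.append(f"{cidr}: no route")
        elif actual != iface:
            problems.append(f"{cidr}: via {actual}, expected {iface}")
    return problems


# On the server side, the remote LAN should be reachable via the OpenVPN interface.
expected = {"192.168.12.0/24": "ovpns3"}
print(check_routes(parse_routes(SAMPLE_TABLE), expected) or "routes look as expected")
```

This only covers the OS routing table. The ping tests with different source interfaces and the state table checks still have to be done on the firewalls themselves.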