-
Dear All,
My situation is a two-location SOHO with pfSense on Supermicro hardware, with 2 WAN connections per location, with fixed IPs and IPv4 with NAT. There are two routers per location set up as a high-availability router based on CARP.
For 10 years, this setup served me well for a site-to-site VPN:
https://docs.netgate.com/pfsense/en/latest/recipes/openvpn-ospf.html
A variant with no OSPF and remote networks provided also worked. A single-WAN site-to-site with the server running on localhost and NAT port forwarding to localhost also worked well. I am using manual outbound NAT; switching to hybrid does not change any of the issues below.
After upgrading from CE 2.6.0 to 2.7.0, I did not regain full performance of the site-to-site VPN:
- OpenVPN
The best result I can get is that Diagnostics -> Ping on each firewall can ping all devices in the respective other LAN. Telephones using UDP SIP can also log in through the tunnel. ICMP and TCP traffic will not flow.
The following measures do not make a difference:
- IPv4 Remote network(s) empty vs. populated
- remote network included in IPv4 Local network(s) or not
- Client specific override with IPv4 Remote Network/s depending on the certificate CN or not
- Adding an OpenVPN interface and setting a static route or not.
- IPSec
Almost everything will work based on this https://docs.netgate.com/pfsense/en/latest/recipes/ipsec-s2s-psk.html
There are two substantial limitations:
- While most traffic works between most clients in each LAN, the router itself cannot see the other side. Unlike with OpenVPN, Diagnostics -> Ping cannot reach the remote LAN. The real problem is that the HAProxy package cannot reach the remote LAN, which would be important as a fallback, even though my backends always use IPs, not hostnames. This seems to be a long-standing problem (https://forum.netgate.com/topic/155514/routing-incoming-traffic-from-haproxy-to-endpoint-over-ipsec-vpn?=1688328043030 and https://forum.netgate.com/topic/153961/haproxy-lastchk-fails-with-l4out-level4-timeout-to-all-backends-located-on-the-far-side-of-ipsec-tunnel?=1688328043025). In my setup, the external address is the LAN CARP VIP in relevant scenarios.
- Some traffic types fail, most notably large rsync activities between TrueNAS servers through the tunnel:
“Connection closed by [remote IP] port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(231) [sender=3.2.7]”
- Wireguard
I am unable to get a handshake. This is possibly due to the combination of multi-WAN behind NAT and CARP. I experimented with multiple outbound NAT variants and with port forwarding UDP 51820 to localhost, the WAN CARP VIPs and even the Wireguard net, but with no success.
If nothing helps, I think I will install 2.6.0 again. Both rsync and HAProxy to the remote LAN are features I do not want to miss.
Any suggestions by anyone, please?
Regards,
Michael Schefczyk
-
@michaelschefczyk said in No Site-to-Site VPN after upgrading CE from 2.6.0 to 2.7.0:
OpenVPN
The best result I can get is that Diagnostics -> Ping on each firewall can ping all devices in the respective other LAN. Telephones using UDP SIP can also log in through the tunnel. ICMP and TCP traffic will not flow.
The following measures do not make a difference:
What's in the OVPN Logs?
-
Thank you very much @rcoleman-netgate
From my point of view, the OVPN logs are not very telling on the server side. Probably, this is "only" a routing problem on both sides and nothing associated with OpenVPN itself; this seems to be indicated on the client side (only):
Server Log
...
Jul 3 07:31:17 openvpn 72067 UDPv4 link remote: [AF_UNSPEC]
Jul 3 07:31:17 openvpn 72067 UDPv4 link local (bound): [AF_INET]127.0.0.1:1195
Jul 3 07:31:17 openvpn 72067 Preserving previous TUN/TAP instance: ovpns2
Jul 3 07:31:17 openvpn 72067 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jul 3 07:31:17 openvpn 72067 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 3 07:31:17 openvpn 72067 SIGUSR1[soft,server_poll] received, process restarting
Jul 3 07:31:17 openvpn 72067 Server poll timeout, restarting
Jul 3 07:31:16 openvpn 94298 srv1.xxx.com/xx.xx.xx.xx:1197 MULTI_sva: pool returned IPv4=192.168.19.2, IPv6=(Not enabled)
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 [srv1.xxx.xxx] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:1197
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_COMP_STUBv2=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_COMP_STUB=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_LZO_STUB=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_PROTO=990
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_CIPHERS=AES-256-GCM
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_MTU=1600
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_TCPNL=1
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_PLAT=freebsd
Jul 3 07:31:16 openvpn 94298 87.128.10.46:1197 peer info: IV_VER=2.6.4
...
Client Log
...
Jul 3 07:31:16 openvpn 76618 [srv1.xx.net] Peer Connection Initiated with [AF_INET]xx.xx.xx.xx:1197
Jul 3 07:31:16 openvpn 76618 TUN/TAP device ovpnc3 exists previously, keep at program end
Jul 3 07:31:16 openvpn 76618 TUN/TAP device /dev/tun3 opened
Jul 3 07:31:16 openvpn 76618 /sbin/ifconfig ovpnc3 192.168.19.2/24 mtu 1500 up
Jul 3 07:31:16 openvpn 76618 /usr/local/sbin/ovpn-linkup ovpnc3 1500 0 192.168.19.2 255.255.255.0 init
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 Initialization Sequence Completed
Jul 3 07:32:12 openvpn 22998 Server poll timeout, restarting
Jul 3 07:32:12 openvpn 22998 SIGUSR1[soft,server_poll] received, process restarting
Jul 3 07:32:12 openvpn 22998 NOTE: your local LAN uses the extremely common subnet address 192.168.0.x or 192.168.1.x. Be aware that this might create routing conflicts if you connect to the VPN server from public locations such as internet cafes that use the same subnet.
Jul 3 07:32:12 openvpn 22998 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jul 3 07:32:12 openvpn 22998 Preserving previous TUN/TAP instance: ovpns4
Jul 3 07:32:12 openvpn 22998 UDPv4 link local (bound): [AF_INET]127.0.0.1:1195
Jul 3 07:32:12 openvpn 22998 UDPv4 link remote: [AF_UNSPEC]
...
Is there a way to prevent the route add command from failing?
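To see at a glance how often the route step fails and whether the tunnel still comes up, the client log can be scanned with a small script. This is only a minimal sketch in Python, assuming the log lines have been copied into a string; the function name `summarize_client_log` is made up for illustration and is not part of pfSense or OpenVPN.

```python
import re

# Patterns taken verbatim from the client log lines quoted above.
ROUTE_FAIL = re.compile(r"ERROR: FreeBSD route add command failed")
INIT_DONE = re.compile(r"Initialization Sequence Completed")

def summarize_client_log(log_text):
    """Count failed route additions and note whether the tunnel still came up."""
    lines = log_text.splitlines()
    return {
        "route_add_failures": sum(1 for line in lines if ROUTE_FAIL.search(line)),
        "initialized": any(INIT_DONE.search(line) for line in lines),
    }

sample = """\
Jul 3 07:31:16 openvpn 76618 /sbin/ifconfig ovpnc3 192.168.19.2/24 mtu 1500 up
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
Jul 3 07:31:16 openvpn 76618 Initialization Sequence Completed
"""

print(summarize_client_log(sample))
```

For the excerpt above, this reports three failed route additions even though the initialization sequence completes, which matches the symptom of a tunnel that is up but does not route.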
Regards,
Michael
-
Dear All,
Thank you for sharing experiences! Now, I know that I am not the only dummy who is unable to handle this.
Open issues:
- Do we have any knowledge about the reasons? Might LAGGs as LAN be the root cause?
- How do we make Netgate aware of this? I can file a bug report, but I am unsure whether anyone will take notice in due time.
- Reverting to 2.6.0 is also what I had in mind. I also use freeradius3. Is there a way to reinstall, or do I need to set up a freeradius server elsewhere first?
- Is it possible to solve the problems associated with IPsec site-to-site? In my case, the issues are that HAProxy does not see the other side (known issue) and that rsync does not work across the tunnel (I do not understand this, but it is a major bottleneck).
- Can anyone instruct me on using Wireguard for site-to-site VPN with HA multi-WAN behind NAT?
- Does anyone know whether the non-CE variant comes without this problem?
Regards,
Michael
-
-
Dear All,
Trying to post a bug report bears a high risk of just being turned down:
https://redmine.pfsense.org/issues/14541
Would anyone in the forum please be so kind as to:
(a) help me understand the issue and change the configuration adequately, and/or
(b) file a bug report in a fashion that does not get turned down?
Losing site-to-site VPN due to an upgrade is not something that is bearable for a longer period.
Regards,
Michael
-
On one side, I have 192.168.1.0/24 and 192.168.4.0/24. On the other side, I have 192.168.12.0/24. It has been this way for more than 10 years without any issues. I will try the 192.168.4.0/24 network on the one side and report back if that makes a difference.
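One quick way to rule out a plain addressing conflict between these networks is to test them for overlap with Python's standard `ipaddress` module. The sketch below is illustrative only: the labels are made up, and the tunnel network 192.168.19.0/24 is inferred from the `ifconfig ovpnc3 192.168.19.2/24` line in the client log earlier in the thread.

```python
import ipaddress
from itertools import combinations

def find_overlaps(networks):
    """Return pairs of named networks whose address ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]

# Labels are hypothetical; the CIDRs are the ones mentioned in this thread.
networks = {
    "site A LAN 1": "192.168.1.0/24",
    "site A LAN 2": "192.168.4.0/24",
    "site B LAN": "192.168.12.0/24",
    "tunnel": "192.168.19.0/24",
}

print(find_overlaps(networks))
```

An empty list means none of the listed networks overlap, so the "extremely common subnet" note is only a warning about roaming clients and not evidence of a conflict between the two sites.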
-
Dear All,
Replacing the network 192.168.1.0/24 on one side by 192.168.4.0/24 did make the "extremely common subnet address" note go away. However, routing would still not work.
Changing a connection from SSL/TLS to shared key also made the "FreeBSD route add command failed: external program exited with error status: 1" error go away. Nevertheless, routing would still not work.
@mslauria Did you start from scratch with the entire configuration (all interfaces, packages, rules ...)?? Did you just delete the OpenVPN configurations and leave other things in place? What may be mixed? Will the system not work upon mixing for example an SSL/TLS OpenVPN server for remote access with a Shared Key client for the site-to-site OpenVPN? In the past, none of this was any problem, as far as I know.
Regards,
Michael
-
@michaelschefczyk
I have the same problem after upgrading to 2.7.0 -
Jul 3 07:31:16 openvpn 76618 ERROR: FreeBSD route add command failed: external program exited with error status: 1
What added commands do you have on either side?
Can you show a screen shot of your config pages from the "tunnel settings" down?
-
Thank you @chpalmer !
My "Custom options" boxes are empty on both sides.
Screenshot Server
Screenshot Client
As indicated in the initial post, I experimented with having the IPv4 remote networks empty vs. populated and with including the remote network under local network or not. This did not change the situation. I will be glad to try again if you instruct me to.
-
@nazelus That would be great! Please state which field in the configuration GUI you are looking at. I find it difficult to understand what "addition Lan" means. I do indeed have two LAN networks on the server side (192.168.1.0/24 and 192.168.4.0/24). I need OpenVPN to connect to at least (a) the default LAN 192.168.1.0/24 and (b) the router itself (probably 127.0.0.1) so that the HAProxy package can see through the tunnel.
-
So far I have upgraded two (remote) boxes from 2.6.0 to 2.7.0.
Both have OpenVPN (TLS + Passwd) connections to the central 2.6.0 server.
Both upgrades went well, no VPN issues.
/Bingo
-
Dear All,
If we remain unable to clarify this quickly, my aim is to roll back to 2.6.0. I did that on the secondary CARP member in my home office: install, recover the config, unplug WAN on the first boot, change the update version back to 2.6.0, reboot again, plug in WAN and install packages manually. This does seem to work. I will try the primary unit this evening. On Saturday, I will travel to the other end of the VPN (600 km) and try it there.
I did try to file a second bug report. Jim Pingle replied:
"Please do not open duplicate issues. Keep the discussion on the forum and if there is a proven bug and not a configuration issue, then the original can be reopened.
We cannot be responsible for making sure every possible variation of OpenVPN works across every version/upgrade, especially when OpenVPN itself changes and deprecates functions/features or changes how things work. Many users have working OpenVPN tunnels on 2.7.0 and current Plus versions that have been upgraded and working for years, it's highly unlikely to be a bug, but something in your setup that isn't correct or needs adjusted to compensate for OpenVPN changes. This is not the place to track that down, that is what the forum is for.
Be sure to post complete settings for all nodes involved, not just general description of the setup."
If anyone has better chances to involve the developers than me, help would be most welcome.
I will certainly not be glad to rebuild my > 40 MB configuration from scratch. I am also unable to post the configuration publicly, for obvious reasons.
Regards,
Michael
-
@michaelschefczyk Did you try a cold start of the 2.7.0 box? My problem somehow went away after I powered off and on.
-
-
As I mentioned on Redmine, you most likely have a configuration problem that has always been wrong, but something changed on the backend and now your previously "working" settings, which happened to be incorrect in some way, stopped working.
A few common things we have seen are:
- SSL/TLS setups where people had filled in a tunnel network on the client when they should not
- SSL/TLS setups with a /24 tunnel network where the Client-Specific Overrides were not setup correctly breaking LAN-to-LAN routing
- Static Key configurations using the wrong subnet size for the tunnel network (e.g. /24 when it should have been /30)
- Not explicitly setting the same topology on both sides
- Some other routing conflict preventing the correct entries from being in the tables
- A configuration that worked by chance before that was never correct (e.g. routes in System > Routing instead of in OpenVPN natively)
- Policy routing rules overriding the VPN and sending the client traffic in some unexpected path
Since you won't (or can't) post your settings, there isn't any way for us to really help you diagnose things, but it sounds like it's your routing/route table entries that are broken or missing. Either you are not in the correct mode (e.g. SSL/TLS with /24 tunnel network requires setting up override entries for client network routing, not static routes), or some other similar problem.
Compare your setup against the reference here: https://docs.netgate.com/pfsense/en/latest/recipes/openvpn-s2s-tls.html
If you were using a routing protocol like OSPF before, then you either had to have been using shared key, a /30 tunnel network, or maybe tap mode in certain cases.
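The tunnel network sizing points above can be turned into a simple sanity check. The following is a rough sketch in Python, not a pfSense tool: the mode labels `"shared_key"` and `"ssl_tls"`, the function name `check_tunnel` and the `has_client_override` flag are invented for illustration, and treating every SSL/TLS tunnel network larger than a /30 like the /24 case is an assumption.

```python
import ipaddress

def check_tunnel(mode, tunnel_cidr, has_client_override=False):
    """Return warnings for a site-to-site tunnel network (hypothetical helper)."""
    prefix = ipaddress.ip_network(tunnel_cidr).prefixlen
    warnings = []
    # Shared key: the post above says the tunnel network should be a /30.
    if mode == "shared_key" and prefix != 30:
        warnings.append(f"shared key tunnel is /{prefix}, expected /30")
    # SSL/TLS with a larger tunnel network needs a client-specific override
    # to route the remote LAN (assumption: anything larger than /30 counts).
    if mode == "ssl_tls" and prefix < 30 and not has_client_override:
        warnings.append(f"SSL/TLS tunnel is /{prefix} without a client-specific override")
    return warnings

print(check_tunnel("ssl_tls", "192.168.19.0/24", has_client_override=True))
print(check_tunnel("shared_key", "192.168.19.0/24"))
```

The second call illustrates the situation where a tunnel is switched from SSL/TLS to shared key while keeping a /24 tunnel network, which is one of the mismatches listed above.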
-
Dear All,
Please find the most relevant pages of my configuration below for comments. If other views are required, please let me know. Firewall rules as before are in place.
I think that this is in line with the tutorial. The tunnel does get established, but nothing else does work.
My assumption is, that the issues are either due to certificate stuff or to routing issues outside OpenVPN including LAGG.
Regards,
Michael
Server config
Server override
Client config
-
The tutorial was checked against 2.7.0 and 23.05, but there may be slight wording differences.
The "Automatically generate" box only shows up when you first create a tunnel; it won't show when editing.
I just re-followed the recipe a week or two ago and confirmed it all worked, so if it doesn't work for you, something isn't matched up or wasn't followed as shown.
-
At a glance what stands out is that the server is bound to localhost so maybe your port forward for that server isn't correct so the client can't reach it. Otherwise there isn't enough info to say why it might be failing (could be certs, for example)
Also, with just the one client, you probably don't want to list that client's own network as "local" to the server, since that will make the client try (and probably fail) to pull a route for its own network from the server.
Also you might try changing the TLS config so it's auth only and not auth+encryption.
If it still fails after all that, check the logs and see what it says on both sides.
-
@jimp Thank you very much!
Binding to localhost is due to Multi-WAN following this tutorial:
https://docs.netgate.com/pfsense/en/latest/multiwan/openvpn.html#bind-to-localhost-and-setup-port-forwards
The NAT port forwards and rules were there before the upgrade. If they did not work, I guess the tunnel would not come up, which it does reliably. I would very much like to keep that for extra resilience.
I never liked adding the remote network into the local network field on the server side. I never had that in the past. This was due to this tutorial: https://docs.netgate.com/pfsense/en/latest/recipes/openvpn-s2s-tls.html
There it says "Enter the LAN subnets for all sites including the server: " under "IPv4 Local Network(s)". I did remove that.
I also changed from TLS Authentication and Encryption to just Authentication.
Unfortunately, that does not change the outcome.
Regards,
Michael
-
Having the remote networks in the "local" field lets the others know they can also be reached through the server, which is nice for >1 client but not needed for just one.
If it still won't form a link now you'll need to start looking at logs to see what is going on.
The server log should show a connection coming in from the client. If it doesn't, and the client process is running, then the client isn't reaching the server which could be DNS, your NAT/firewall rules, etc. The client logs should show what it's doing there.
Most other problems would show in the logs, too, like a key or cert mismatch and so on.