Site-to-Site OpenVPN problem on 2.7.0, possibly affected by Outbound NAT
-
This week I remotely upgraded 28 pfSense boxes to version 2.7.
27 succeeded, but one crashed and I needed to replace the router with a spare one (for minimum downtime).
Only 1 of these boxes was running a site-to-site VPN, but this stopped working after upgrading to 2.7.
The VPN is up and the boxes themselves can ping each other.
The only effect this has is that it misled the monitoring of the bridged VPN.
My config is this:
Server: 192.168.16.1/23
oVPN server 1 10.0.0.0/23 <=> 192.168.0.0/23
oVPN server 2 10.0.18.0/23 <=> 192.168.18.0/23
Client1: 192.168.0.1/23
Client2: 192.168.18.1/23
-
My VPNs are up as well, and the boxes can reach the other boxes on their LAN addresses, which effectively misled the monitoring I had set up.
Did you confirm that the clients can reach each other?
-
I don't think you can set up a pfSense 2.6 with packages...
Pre-upgrade, none of the packages could be upgraded.
But it certainly is an upgrade issue...
I think the firewall used to do something that it no longer does....
-
Mmmm.... maybe you should add that the tunnel itself works and that the boxes themselves can ping each other.
It still could be an issue with OpenVPN, but my gut feeling says it's pfSense.... or FreeBSD.
I have a setup with 1 pfSense as server and 2 others as clients.
They are implying it works with their recommended setup.
Did they actually test this or do they just think it works?
Like I said... pinging from the boxes works....
-
None of us are just filling in all kinds of values and getting it to work "by chance" on 2.6, only to see it stop working the moment we upgrade to 2.7.
I followed a how-to when I set it up.
Doesn't being able to ping the boxes themselves rule out some, if not all, of the scenarios you just gave?
I have little problem sharing the relevant sections of my config.xml.
Which sections do you want besides the OpenVPN one?
-
@frater said in No Site-to-Site VPN after upgrading CE from 2.6.0 to 2.7.0:
None of us are just filling in all kinds of values and getting it to work "by chance" on 2.6, only to see it stop working the moment we upgrade to 2.7.
I followed a how-to when I set it up.
In the past it has been quite easy to misconfigure things and still have them work by chance. I have yet to find a case like this that wasn't a misconfiguration or something that needed to be changed because OpenVPN itself changed.
Please review what I've written and compare against the recipe in the docs I linked.
We've also seen people run into trouble because they followed tutorials from other places (not the pfSense docs) and those tutorials had bad recommendations or other problems.
-
I have followed the tutorial you referenced.
I threw away the previous OpenVPN server & client settings, then set up the server and then the client.
In the client setup I didn't have the options which were referenced in the tutorial.
I highlighted them in yellow.
Is it because it's a 2.6.0 tutorial?
I also didn't fill in the local/remote networks in the client setup, because it isn't mentioned in the tutorial.
I am now able to ping, from the pfSense box itself, all the IPs on the server's LAN and the server itself.
The network clients, however, still can't ping anything on the remote network.
The server can't ping anything on the OpenVPN client's network.
-
Like I wrote earlier....
The VPN is up and the router (client) can ping the LAN address of the other router (server) and the clients of the server.
The LAN-devices on the client router can't ping any LAN-address of the server router.
The server-router can ping the client router.
The server-router can ping the LAN-devices of the client router.
The LAN-devices on the server router can't ping any LAN-address of the client router.
With the tunnel up and at least one LAN-to-LAN ping possible, I think there's little reason to look at the certificates.
Server LAN IP: 192.168.17.1/23
Client LAN IP: 192.168.1.1/23
The monitoring of the server causes a ping to a device on the client's LAN (192.168.1.4) which is indeed pingable from the server router.
But still the same behaviour.... none of the LAN-devices can reach the other network.
-
@nazelus said in No Site-to-Site VPN after upgrading CE from 2.6.0 to 2.7.0:
ERROR: FreeBSD route add command failed: external program exited with error status: 1
Might be this
It wouldn't be the first time that some feature isn't working while the expert/developer insists, without testing it on the current version, that it should work.
@jimp Can you set up a VPN between 2 pfSense 2.7 boxes and test the LAN-device <=> LAN-device connectivity?
It's not as if I know the other people who are also having this problem.
It's too easy to say it's user error if you can't show it works NOW.
-
I see you did not change your VPN following the tutorial that was posted to use certificates.
Like you, I also used the "Peer to Peer Shared key" before, but if something is not working I prefer to follow a tutorial to the letter. This is why I completely set up the server and client again using "Peer to Peer SSL/TLS".
I admit it doesn't change one bit in the end where it counts, but for debugging purposes I think it's best to stick to the script.
Are you willing to set everything up from scratch?
I think I will now try to set up the 2nd client. Maybe that one does work.
-
Well, well.....
I have a little progress....
I just configured a 2nd OVPN-client and that one was able to ping to all LAN-devices from a LAN-device.
LAN-devices on the other client and on the server are still unable to reach other devices.
For completeness I should mention that I'm using /23 networks instead of /24 networks.
For 2 locations I changed the /24 to /23 because other IT companies were putting devices in the LAN with a static address.
By switching to /23 I could move a lot of the DHCP-clients out of the way of these static devices and reduce the chance that they put new ones there.
192.168.1.1/24 became 192.168.1.1/23, turning the network into 192.168.0.0/23
192.168.17.1/24 became 192.168.17.1/23, turning the network into 192.168.16.0/23
192.168.18.1/23 was a /23 from the start...
Of course I'm using /23 networks in my oVPN setup.
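For anyone who wants to double-check the /24 to /23 arithmetic above, here is a minimal sketch using Python's standard ipaddress module. The router addresses are the ones from this post; the names server, client1 and client2 and the helper describe() are only labels made up for illustration. It also reports whether the router's own address sits in the first /24 of its /23, which is the one difference between the working and non-working sites mentioned here.

```python
import ipaddress

# Router LAN addresses as given in the post; the dictionary keys are
# illustrative labels, not names taken from any configuration.
routers = {
    "server": "192.168.17.1/23",
    "client1": "192.168.1.1/23",
    "client2": "192.168.18.1/23",
}


def describe(cidr):
    """Return (network, router_is_in_first_24) for an address like 192.168.1.1/23."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # The first /24 inside the network (the "lower half" of a /23).
    first_24 = next(network.subnets(new_prefix=24))
    return network, iface.ip in first_24


for name, cidr in routers.items():
    network, in_first_24 = describe(cidr)
    print(f"{name}: {cidr} -> network {network}, router in first /24: {in_first_24}")
```

This only confirms the addressing; it says nothing about whether pfSense or OpenVPN treats these networks differently.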
I'm writing this because 192.168.18.1/23 is the one that works, and it's the only router whose own address still falls in the first /24 of its /23 (192.168.18.0/24).
Maybe there's some awkward bug where a /24 is hardcoded somewhere.
Anyhow....
I will now focus on the differences between the 2 clients.
Maybe I will remove the first client and set it up again..
-
SUCCESS
I solved it on my box and it was indeed something of a misconfiguration...
I still need to test some more, but I already have several LAN-devices that can ping other LAN-devices on remote networks.
Because the newly configured oVPN-client was (partially) working and the other oVPN-client was not, I started to focus on the network that didn't have a connection.
That's the 192.168.0.0/23 network.
I started with a grep -C5 '192.168.1.' /cf/conf/config.xml and noticed some outbound NAT rules to 192.168.1.0/24.
There are no networks like that on that server.
That router was configured with hybrid outbound NAT and when I set it up I used an existing configuration of another router, deleting everything I didn't need.
I think this outbound rule was created for a Vigor bridged modem, which this location didn't have.
When setting it up 2 years ago I deleted that interface.
It now seems that this rule in the outbound NAT settings should have been deleted manually as well.
After I deleted the 2 outbound rules and restarted all the oVPN instances it still didn't work,
so I rebooted the whole router and.... tadaaa..... it worked.
I haven't checked everything, but it feels good.
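A rough way to spot this kind of leftover is to compare the destinations of the outbound NAT rules against the networks that are supposed to be reached through the VPN. The sketch below is only an illustration under assumptions: it takes the rule destinations as a plain list typed in by hand (it does not parse config.xml), the function name rules_overlapping_vpn is made up, the 10.10.10.0/24 entry is a hypothetical harmless rule, and reading the rule as "NAT towards 192.168.1.0/24" is my interpretation of the post.

```python
import ipaddress


def rules_overlapping_vpn(rule_destinations, vpn_networks):
    """Return (rule_destination, vpn_network) pairs whose address ranges overlap."""
    hits = []
    for dest in rule_destinations:
        dest_net = ipaddress.ip_network(dest, strict=False)
        for vpn in vpn_networks:
            vpn_net = ipaddress.ip_network(vpn, strict=False)
            if dest_net.overlaps(vpn_net):
                hits.append((dest, vpn))
    return hits


# 192.168.1.0/24 is the leftover rule from the post;
# 10.10.10.0/24 is a hypothetical unrelated rule added for contrast.
rule_destinations = ["192.168.1.0/24", "10.10.10.0/24"]
# Remote LANs reached over the site-to-site tunnels, as listed in the thread.
vpn_networks = ["192.168.0.0/23", "192.168.18.0/23"]

for dest, vpn in rules_overlapping_vpn(rule_destinations, vpn_networks):
    print(f"outbound NAT rule for {dest} overlaps VPN network {vpn}")
```

Any hit is just a candidate to review by hand in the outbound NAT settings, not proof that the rule is the culprit.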
-
I didn't take a screenshot of the outbound rules for 192.168.1.0/24, and because I removed those entries I can't make one now, but my config still has this orphaned outbound ruleset.
Another network which doesn't exist anymore on this box.
I have no reason to return to the old config as the "shared key" seems to be deprecated, so I will leave it like this.
I wonder what the culprit is on your boxes.
Do take a peek at the outbound NAT rules and see if there are any orphans..
-
You are now using the certificates instead of shared key?
Not that it should matter, but it's best to start out with a recommended configuration.
server 192.168.17.1/23
client 192.168.18.1/23
client 192.168.1.1/23
Here I'm pinging 192.168.1.4 from an access point on 192.168.19.20
# ifconfig | grep 192
inet addr:192.168.19.20 Bcast:192.168.19.255 Mask:255.255.254.0
# ping -c2 192.168.1.4
PING 192.168.1.4 (192.168.1.4): 56 data bytes
64 bytes from 192.168.1.4: seq=0 ttl=61 time=18.857 ms
64 bytes from 192.168.1.4: seq=1 ttl=61 time=21.602 ms
--- 192.168.1.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 18.857/20.229/21.602 ms
-
I previously reported that site-to-site was working after I removed an outbound NAT-rule.
This turned out to be not entirely true.
To test this I logged into a device on the site "clientC" and pinged a device on "clientB".
This worked...
Device on clientC:
# ifconfig | grep 192
inet addr:192.168.19.20 Bcast:192.168.19.255 Mask:255.255.254.0
# ping -c2 192.168.1.4
PING 192.168.1.4 (192.168.1.4): 56 data bytes
64 bytes from 192.168.1.4: seq=0 ttl=61 time=18.857 ms
64 bytes from 192.168.1.4: seq=1 ttl=61 time=21.602 ms
--- 192.168.1.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 18.857/20.229/21.602 ms
I then did the same from that same pingable device on "clientB" and tried to ping the device on "clientC". This did NOT work.
Device on clientB:
[~] # ifconfig eth3 | grep 192.168.
inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.254.0
[~] # ping -c2 192.168.19.20
PING 192.168.19.20 (192.168.19.20): 56 data bytes
--- 192.168.19.20 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
ClientB itself:
ifconfig igb0 | grep 192
inet 192.168.1.1 netmask 0xfffffe00 broadcast 192.168.1.255
[2.7.0-RELEASE][root@pfSense.filmhallen.lan]/root: ping -c2 192.168.19.20
PING 192.168.19.20 (192.168.19.20): 56 data bytes
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 65ca 0 0000 3f 01 2820 10.0.16.3 192.168.19.20
64 bytes from 192.168.19.20: icmp_seq=0 ttl=62 time=20.493 ms
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 7c93 0 0000 3f 01 1157 10.0.16.3 192.168.19.20
64 bytes from 192.168.19.20: icmp_seq=1 ttl=62 time=19.468 ms
--- 192.168.19.20 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 19.468/19.980/20.493/0.512 ms
I'm getting a reply from 10.0.16.3
This is not the IP of the device on clientC (that's 192.168.19.20), but the oVPN-address of router clientB.
That source address should be translated by the router, I would think.
-
I forked this off into a new thread so it would all be together since it's likely a different issue than the post it was on before.
What do the state table entries on each firewall look like when you try those ping tests?
-
serverA = 192.168.17.1/23
clientB = 192.168.1.1/23
clientC = 192.168.18.1/23
From device 192.168.1.4/23 I'm unsuccessfully pinging 192.168.19.209/23
My guess is that the WAN-interface shouldn't be there.
From device 192.168.19.209/23 I'm successfully pinging 192.168.1.4/23
-
That WAN interface state definitely shouldn't be there, which means the two most likely causes are:
1. There is no route in the table on that firewall for 192.168.19.0/23, so it falls through to the default route and out WAN.
2. The LAN firewall rules on there have a gateway set and are forcing the traffic out WAN.
As an extra protection against 1, consider adding reject rules on the Floating tab, quick, outbound, on your WAN(s), matching a destination of private networks (either an alias or a large enough mask to catch them all, such as 192.168.0.0/16). That will stop potentially private traffic from attempting to exit the WAN. Having that set to log is probably also a good idea.
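To illustrate the first cause, here is a minimal sketch of a longest-prefix-match route lookup in Python. It is not how pfSense or FreeBSD implements routing, only a toy model of the behaviour described above. The route tables are hypothetical, and the names WAN, LAN and ovpnc1 are placeholders rather than values taken from anyone's config.

```python
import ipaddress


def lookup(routes, destination):
    """Return the interface of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matching = []
    for prefix, via in routes:
        net = ipaddress.ip_network(prefix)
        if dest in net:
            matching.append((net.prefixlen, via))
    if not matching:
        return None  # no route at all, not even a default
    # Longest prefix wins; the default route (prefix length 0) is the last resort.
    return max(matching)[1]


# Hypothetical route table on clientB with the VPN route present.
with_vpn_route = [
    ("0.0.0.0/0", "WAN"),
    ("192.168.0.0/23", "LAN"),
    ("192.168.18.0/23", "ovpnc1"),
]
# The same table with the route to the remote LAN missing.
without_vpn_route = with_vpn_route[:2]

print(lookup(with_vpn_route, "192.168.19.209"))     # ovpnc1
print(lookup(without_vpn_route, "192.168.19.209"))  # WAN
```

With the route missing, the only match left is the default route, which is why a state on WAN shows up for traffic meant for the remote LAN. A policy-routing gateway on a LAN rule (cause 2) bypasses this lookup entirely and gives the same symptom.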
-
There was/is indeed a gateway rule on the LAN, but I disabled it just now....
-
Removing the gateway rule on the LAN tab was sufficient to get rid of that WAN state.
I still can't ping 192.168.19.209 from 192.168.1.4.