Site-to-Site OpenVPN problem on 2.7.0, possibly affected by Outbound NAT
-
@nazelus said in No Site-to-Site VPN after upgrading CE from 2.6.0 to 2.7.0:
ERROR: FreeBSD route add command failed: external program exited with error status: 1
Might be this
It wouldn't be the first time that a feature stops working while the expert/developer insists that it should work, without having tested it on the current version.
@jimp Can you set up a VPN between 2 pfSense 2.7 boxes and test the LAN-device <=> LAN-device connectivity?
It's not as if I know the other people who are also having this problem.
It's too easy to say it's user error if you can't show it works NOW.
-
I see you did not switch your VPN to certificates as described in the tutorial that was posted.
Like you, I also used the "Peer to Peer Shared key" before, but if something is not working I prefer to follow the recommended tutorial to the letter. This is why I set up the server and client completely from scratch using "Peer to Peer SSL/TLS".
I admit it doesn't change one bit in the end where it counts, but for debugging purposes I think it's best to stick to the script.
Are you willing to set everything up from scratch?
I think I will now try to set up the 2nd client. Maybe that one does work.
-
Well, well.....
I have a little progress....
I just configured a 2nd OVPN-client and that one was able to ping to all LAN-devices from a LAN-device.
LAN-devices on the other client and on the server are still unable to reach other devices.
For completeness I would like to add that I'm using /23 networks instead of /24 networks.
For 2 locations I changed the /24 to /23 because other IT companies were putting devices in the LAN with a static address.
By switching to /23 I could move a lot of the DHCP clients out of the way of these static devices, with less chance that they put new ones there.
192.168.1.1/24 became 192.168.1.1/23, turning the network into 192.168.0.0/23
192.168.17.1/24 became 192.168.17.1/23, turning the network into 192.168.16.0/23
192.168.18.1/23 was a /23 from the start...
Of course I'm using /23 networks in my oVPN setup.
I'm writing this because 192.168.18.1/23 is working, and it's the only router whose network still starts at the same address (192.168.18.0) when the mask is read as /24.
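The /23 arithmetic above can be checked with Python's standard `ipaddress` module. This is only a sketch to illustrate the observation; the helper names `network_base` and `same_base_for_23_and_24` are made up for this example, and it says nothing about whether pfSense actually hardcodes a /24 anywhere.

```python
import ipaddress

# LAN addresses of the three routers as listed in this thread.
ROUTERS = ["192.168.1.1", "192.168.17.1", "192.168.18.1"]

def network_base(address: str, prefix: int) -> str:
    """Return the network that `address` belongs to for the given prefix length."""
    return str(ipaddress.ip_interface(f"{address}/{prefix}").network)

def same_base_for_23_and_24(address: str) -> bool:
    """True when the /23 and the /24 network start at the same address."""
    net23 = ipaddress.ip_interface(f"{address}/23").network
    net24 = ipaddress.ip_interface(f"{address}/24").network
    return net23.network_address == net24.network_address

for addr in ROUTERS:
    print(addr, network_base(addr, 23), network_base(addr, 24),
          same_base_for_23_and_24(addr))
```

Only 192.168.18.1 yields the same base address for both masks, which matches the pattern described above.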
Maybe there's some awkward bug where a /24 is hardcoded somewhere.
Anyhow....
I will now be focusing on the differences between the 2 clients.
Maybe I will remove the first client and set it up again..
-
SUCCESS
I solved it on my box and it was indeed something of a misconfiguration...
I still need to test some more, but I already have several LAN-devices that can ping other LAN-devices on remote networks.
Because the newly configured oVPN-client was (partially) working and the other oVPN-client was not, I started to focus on the network that didn't have a connection.
That's the 192.168.0.0/23 network.
I started with a grep -C5 '192.168.1.' /cf/conf/config.xml and noticed some outbound NAT rules to 192.168.1.0/24.
There are no networks like that on that server.
That router was configured with hybrid outbound NAT, and when I set it up I started from an existing configuration of another router, deleting everything I didn't need.
I think this outbound rule was created for a Vigor bridged modem, which this location didn't have.
When setting it up 2 years ago I deleted that interface.
It now seems that this rule in the outbound NAT settings should have been deleted manually as well.
After I deleted the 2 outbound rules and restarted all the oVPN instances it still didn't work, so I rebooted the whole router and.... tadaaa..... it worked.
I haven't checked everything, but it feels good.
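The grep above can be turned into a rough check for leftover outbound NAT rules. The sketch below is not a pfSense tool: the sample XML is a hypothetical, heavily trimmed config.xml, and the element layout it uses (`interfaces/*/ipaddr`, `subnet`, `nat/outbound/rule`, `source/network`, `destination/address`) is an assumption that should be compared with your own file. It only lists rules that name an IPv4 network not overlapping any configured interface network, as candidates for manual review.

```python
import ipaddress
import xml.etree.ElementTree as ET

# Hypothetical, trimmed config.xml: LAN on 192.168.17.1/23 plus two outbound
# NAT rules, one of which points at a network this box no longer has.
SAMPLE_CONFIG = """
<pfsense>
  <interfaces>
    <lan><ipaddr>192.168.17.1</ipaddr><subnet>23</subnet></lan>
  </interfaces>
  <nat>
    <outbound>
      <mode>hybrid</mode>
      <rule>
        <source><network>192.168.16.0/23</network></source>
        <destination><address>192.168.1.0/24</address></destination>
        <descr>old modem rule</descr>
      </rule>
      <rule>
        <source><network>192.168.16.0/23</network></source>
        <destination><any/></destination>
        <descr>LAN to WAN</descr>
      </rule>
    </outbound>
  </nat>
</pfsense>
"""

def local_networks(root):
    """Collect the networks of all interfaces that have a static IPv4 address."""
    nets = []
    for iface in root.iterfind("interfaces/*"):
        ip, bits = iface.findtext("ipaddr"), iface.findtext("subnet")
        if not ip or not bits:
            continue
        try:
            nets.append(ipaddress.ip_interface(f"{ip}/{bits}").network)
        except ValueError:
            continue  # e.g. "dhcp" instead of a static address
    return nets

def rule_networks(rule):
    """Yield every CIDR network named in a rule's source or destination."""
    for side in ("source", "destination"):
        for tag in ("network", "address"):
            text = rule.findtext(f"{side}/{tag}")
            if text:
                try:
                    yield ipaddress.ip_network(text, strict=False)
                except ValueError:
                    pass  # aliases and keywords are skipped

def orphan_candidates(xml_text):
    """Return (description, network) pairs that match no local interface."""
    root = ET.fromstring(xml_text)
    local = local_networks(root)
    hits = []
    for rule in root.iterfind("nat/outbound/rule"):
        for net in rule_networks(rule):
            if not any(net.overlaps(l) for l in local):
                hits.append((rule.findtext("descr", default=""), str(net)))
    return hits

print(orphan_candidates(SAMPLE_CONFIG))
```

A hit is not automatically wrong, since remote VPN networks are legitimately non-local, so treat the output as a list of rules to look at rather than rules to delete.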
-
I didn't take a screenshot of the outbound rules for 192.168.1.0/24, and because I removed those entries I can't make one now, but my config still has this orphaned outbound ruleset
Another network which doesn't exist anymore on this box.
I have no reason to return to the old config as the "shared key" seems to be deprecated, so I will leave it like this.
I wonder what the culprit is on your boxes.
Do take a peek at the outbound NAT rules and see if there are any orphans..
-
You are now using the certificates instead of shared key?
Not that it should matter, but it's best to start out with a recommended configuration.
server 192.168.17.1/23
client 192.168.18.1/23
client 192.168.1.1/23
Here I'm pinging 192.168.1.4 from an access point on 192.168.19.20:
# ifconfig | grep 192
inet addr:192.168.19.20 Bcast:192.168.19.255 Mask:255.255.254.0
# ping -c2 192.168.1.4
PING 192.168.1.4 (192.168.1.4): 56 data bytes
64 bytes from 192.168.1.4: seq=0 ttl=61 time=18.857 ms
64 bytes from 192.168.1.4: seq=1 ttl=61 time=21.602 ms
--- 192.168.1.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 18.857/20.229/21.602 ms
-
I previously reported that site-to-site was working after I removed an outbound NAT-rule.
This turned out to be not entirely true.
To test this I logged into a device on the site "clientC" and pinged a device on "clientB".
This worked...
Device on clientC:
# ifconfig | grep 192
inet addr:192.168.19.20 Bcast:192.168.19.255 Mask:255.255.254.0
# ping -c2 192.168.1.4
PING 192.168.1.4 (192.168.1.4): 56 data bytes
64 bytes from 192.168.1.4: seq=0 ttl=61 time=18.857 ms
64 bytes from 192.168.1.4: seq=1 ttl=61 time=21.602 ms
--- 192.168.1.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 18.857/20.229/21.602 ms
I then did the same from that pingable device on "clientB" and tried to ping the device on "clientC". This did NOT work.
Device on clientB:
[~] # ifconfig eth3 | grep 192.168.
inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.254.0
[~] # ping -c2 192.168.19.20
PING 192.168.19.20 (192.168.19.20): 56 data bytes
--- 192.168.19.20 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
ClientB itself:
ifconfig igb0 | grep 192
inet 192.168.1.1 netmask 0xfffffe00 broadcast 192.168.1.255
[2.7.0-RELEASE][root@pfSense.filmhallen.lan]/root: ping -c2 192.168.19.20
PING 192.168.19.20 (192.168.19.20): 56 data bytes
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 65ca 0 0000 3f 01 2820 10.0.16.3 192.168.19.20
64 bytes from 192.168.19.20: icmp_seq=0 ttl=62 time=20.493 ms
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 7c93 0 0000 3f 01 1157 10.0.16.3 192.168.19.20
64 bytes from 192.168.19.20: icmp_seq=1 ttl=62 time=19.468 ms
--- 192.168.19.20 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 19.468/19.980/20.493/0.512 ms
I'm getting a reply from 10.0.16.3
This is not the IP of the device on clientC (that's 192.168.19.20), but the oVPN-address of router clientB.
That source address should be translated by the router, I would think.
-
I forked this off into a new thread so it would all be together since it's likely a different issue than the post it was on before.
What do the state table entries on each firewall look like when you try those ping tests?
-
serverA = 192.168.17.1/23
clientB = 192.168.1.1/23
clientC = 192.168.18.1/23
From device 192.168.1.4/23 I'm unsuccessfully pinging 192.168.19.209/23
My guess is that the WAN-interface shouldn't be there.
From device 192.168.19.209/23 I'm successfully pinging 192.168.1.4/23
-
That WAN interface state definitely shouldn't be there, which means the two most likely causes are:
1. There is no route in the table on that firewall for 192.168.19.0/23, so it falls through to the default route and out WAN
2. The LAN firewall rules on there have a gateway set and are forcing the traffic out WAN
As an extra protection against 1, consider adding reject rules on the Floating tab, quick, outbound, on your WAN(s), matching a destination of private networks (either an alias or a large enough mask to catch them all, such as 192.168.0.0/16). That will stop potentially private traffic from attempting to exit the WAN. Having that set to log is probably also a good idea.
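The first cause can be illustrated with a small longest-prefix-match lookup. This is a minimal sketch, not how the FreeBSD routing table is implemented: the route entries and interface names (`wan`, `lan`, `ovpnc1`) are hypothetical, and 192.168.18.0/23 is used as the network that contains 192.168.19.209.

```python
import ipaddress

def lookup(routes, destination):
    """Return the (network, interface) of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, ifname) for net, ifname in routes if dest in net]
    return max(matches, key=lambda r: r[0].prefixlen) if matches else None

# Hypothetical routing table on clientB without a route to the remote LAN.
base = [
    (ipaddress.ip_network("0.0.0.0/0"), "wan"),
    (ipaddress.ip_network("192.168.0.0/23"), "lan"),
]
# The same table once the VPN has added a route for the remote network.
with_vpn = base + [(ipaddress.ip_network("192.168.18.0/23"), "ovpnc1")]

print(lookup(base, "192.168.19.209")[1])      # only the default route matches
print(lookup(with_vpn, "192.168.19.209")[1])  # the /23 route is more specific

# The suggested floating rule would match on a private destination such as:
catch_all = ipaddress.ip_network("192.168.0.0/16")
print(ipaddress.ip_address("192.168.19.209") in catch_all)
```

Note that the second cause bypasses this lookup entirely: a firewall rule with a gateway set sends matching traffic to that gateway regardless of what the routing table says.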
-
There was/is indeed a gateway rule on the LAN, but I disabled it just now....
-
Removing the gateway rule on the LAN tab was sufficient to get rid of that WAN state.
I still can't ping 192.168.19.209 from 192.168.1.4.
-
Since it appears to be making it to the VPN there, now you'd check the states, rules, and routing on the other nodes. Make sure the OpenVPN rules allow it on both the serverA and clientC firewalls, and check the states along each leg.
-
I checked the LAN firewall again and noticed an autocreated rule pfB_PRI1_v4 of pfBlockerNG.
I removed it from pfBlockerNG and it started to work.
It was the ClientC network that was able to ping itself, but couldn't be pinged....
I'm still seeing this when I ping from pfSense clientB to pfSense clientC.
I'm getting the answer from the oVPN IP if I don't give a source address:
/root: ping -c2 -S 192.168.1.1 192.168.18.1
PING 192.168.18.1 (192.168.18.1) from 192.168.1.1: 56 data bytes
64 bytes from 192.168.18.1: icmp_seq=0 ttl=63 time=11.856 ms
64 bytes from 192.168.18.1: icmp_seq=1 ttl=63 time=13.479 ms
--- 192.168.18.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.856/12.668/13.479/0.812 ms
[2.7.0-RELEASE][root@pfSense.filmhallen.lan]/root: ping -c2 192.168.18.1
PING 192.168.18.1 (192.168.18.1): 56 data bytes
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 f430 0 0000 3f 01 9acd 10.0.16.2 192.168.18.1
64 bytes from 192.168.18.1: icmp_seq=0 ttl=63 time=14.299 ms
92 bytes from 10.0.16.1: Redirect Host(New addr: 10.0.16.2)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 83fc 0 0000 3f 01 0b02 10.0.16.2 192.168.18.1
64 bytes from 192.168.18.1: icmp_seq=1 ttl=63 time=11.461 ms
--- 192.168.18.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.461/12.880/14.299/1.419 ms
-
The redirect is sort of expected due to how OpenVPN handles its routing. It usually stuffs a dummy address in the table as a destination just to make sure the traffic gets handed off to OpenVPN and then OpenVPN deals with it from there, but depending on what is hitting what it may end up getting that kind of response.
As long as the traffic goes through it's not a huge concern.
-
I still have a problem pinging 192.168.19.209 even though I can ping it from the network itself.
It's a Windows machine, so I think the problem is its firewall not accepting connections from other LANs.
I moved an AP from 192.168.19.20 to 192.168.19.210 and I was able to ping it....
I will revisit this thread if I find out it has to be something else...
-
That sounds like a local network config issue on the target system. There are some cases where Windows will only accept inbound traffic from its own subnet unless it thinks it's on a certain type of network. Like if it's set to public vs private but maybe not exactly that.
If you need to fudge that you could set up a hybrid outbound NAT rule on the LAN to make the source of traffic appear to be the local network, but that can break or complicate certain protocols. It's best to fix the local network config on the client system.
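A quick way to see why both the Windows behaviour and the NAT workaround hinge on the source address is to test addresses against the target's subnet. This is a sketch only: it models a "local subnet only" inbound scope as a plain membership test, which is an assumption about how the target's firewall is configured, and it assumes the NAT rule would translate to the clientC LAN address 192.168.18.1.

```python
import ipaddress

# clientC LAN as given in this thread.
TARGET_SUBNET = ipaddress.ip_network("192.168.18.0/23")

def accepted_by_local_only_rule(source: str) -> bool:
    """True when `source` lies inside the target's own subnet."""
    return ipaddress.ip_address(source) in TARGET_SUBNET

# A device on clientB reaching over the VPN keeps its own address.
print(accepted_by_local_only_rule("192.168.1.4"))
# With outbound NAT on the clientC LAN, the source would appear as the
# router's LAN address instead (assumed translation address).
print(accepted_by_local_only_rule("192.168.18.1"))
```

The first check fails and the second passes, which is why the NAT rule would hide the problem, at the cost of the target no longer seeing the real source address.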