2.3.1 / site-to-site: routing/pf issue after upgrade from 2.2.6

viragomann

So please check the routes on the server node (Diagnostics > Routes).
If OpenVPN is configured correctly, there has to be a route to the client's LAN using the client's OpenVPN IP as gateway.

brevilo

I know, and I did all that (see above, just confirmed it again). I'm currently experimenting with a third pair of nodes, not using CARP. I also think that I'm affected by https://redmine.pfsense.org/issues/6499…

brevilo

Ok, here's a concrete example on the test pair (no CARP, currently with subnet topology) I just mentioned.

Routing table on the server (relevant excerpt):


Destination        Gateway        Flags  Use   Mtu    Netif
192.168.10.0/24    192.168.100.2  UGS    1687  1500   ovpns1
192.168.100.0/24   192.168.100.1  UGS    0     1500   ovpns1
192.168.100.1      link#9         UHS    0     16384  lo0
192.168.100.2      link#9         UH     24    1500   ovpns1
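
As a rough cross-check of the excerpt above, the following sketch parses the four routes and performs a longest-prefix lookup. It assumes that 192.168.10.0/24 is the client LAN and 192.168.100.2 is the client's tunnel address, and it uses 192.168.10.1 as a hypothetical client LAN host. The helper names `parse_routes` and `lookup` are made up for illustration and are not part of pfSense.

```python
import ipaddress

# Routing table excerpt from the post (whitespace-separated columns:
# destination, gateway, flags, use, mtu, interface).
ROUTES = """
192.168.10.0/24   192.168.100.2  UGS  1687  1500   ovpns1
192.168.100.0/24  192.168.100.1  UGS  0     1500   ovpns1
192.168.100.1     link#9         UHS  0     16384  lo0
192.168.100.2     link#9         UH   24    1500   ovpns1
"""

def parse_routes(text):
    """Turn the excerpt into (network, gateway, flags, interface) tuples."""
    routes = []
    for line in text.strip().splitlines():
        dest, gateway, flags, _use, _mtu, netif = line.split()
        # A bare address such as 192.168.100.1 becomes a /32 host route.
        routes.append((ipaddress.ip_network(dest), gateway, flags, netif))
    return routes

def lookup(routes, address):
    """Return the most specific route matching address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [r for r in routes if addr in r[0]]
    return max(matches, key=lambda r: r[0].prefixlen, default=None)

routes = parse_routes(ROUTES)
# 192.168.10.1 is a hypothetical host on the (assumed) client LAN.
print(lookup(routes, "192.168.10.1"))
```

If the lookup returns the route via 192.168.100.2 on ovpns1, the server's kernel routing matches what viragomann asked for, which points the search at something beyond the routing table.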

When I capture ICMP on the server's and the client's OpenVPN interfaces respectively, and ping the client from the server, I get the following:

server-src: LAN / client-dst: OpenVPN: ok (request/reply seen on server and client)
server-src: LAN / client-dst: LAN: fails (request seen on server, no request seen on client)
server-src: OpenVPN / client-dst: LAN: fails (request seen on server, no request seen on client)

That means for some reason all packets (requests) targeting the client LAN are sent via the server's OpenVPN interface but they never appear at the client's OpenVPN interface. This is not a routing issue on the server itself, as the packets seem to leave the server as expected. That's why I described this as an "OpenVPN-internal issue" above…
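
The reasoning behind that conclusion can be written down as a small sketch. It only encodes the three capture results listed above. The function name `locate_drop` and the result strings are illustrative assumptions, not output of any pfSense tool.

```python
def locate_drop(seen_on_server, seen_on_client):
    """Classify one ping test by where the echo request was captured.

    seen_on_server: request captured on the server's OpenVPN interface
    seen_on_client: request captured on the client's OpenVPN interface
    """
    if seen_on_server and seen_on_client:
        return "delivered"
    if seen_on_server:
        return "lost between the tunnel endpoints"
    return "never reached the server's tunnel interface"

# (source, destination, request seen on server, request seen on client)
TESTS = [
    ("server LAN", "client OpenVPN", True, True),
    ("server LAN", "client LAN", True, False),
    ("server OpenVPN", "client LAN", True, False),
]

for src, dst, on_server, on_client in TESTS:
    print(f"{src} -> {dst}: {locate_drop(on_server, on_client)}")
```

Both failing tests fall into the middle case: the request reaches the server's tunnel interface but never shows up on the client's.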

viragomann

@brevilo:

server-src: LAN / client-dst: LAN: fails (request on server, no request on client)

Have you taken this capture at the client node or at server?

brevilo

Both, as described above. The results are in parentheses.

viragomann

And you're sure that the client node is the default gateway at the hosts behind?

brevilo

That's not the issue here. When I said "client-dst: LAN" in the tests above I meant the VPN client's LAN address/interface, so the client's LAN nodes behind that NIC aren't of interest here (but yes, they do have proper routes). Also, how should client LAN routes affect echo requests "getting lost" between VPN server and client, in the tunnel itself…?

Again, the whole setup worked just fine until I upgraded to 2.3.1.

Soyokaze

To which interface are the OpenVPN servers bound?
Which NAT setting do you use (default, hybrid, etc.)?
Do you use dynamic routing (OSPF, for example), or are all links static?

brevilo

@pan_2:

To which interface are the OpenVPN servers bound?
Which NAT setting do you use (default, hybrid, etc.)?
Do you use dynamic routing (OSPF, for example), or are all links static?

WAN (of course?)
Hybrid, with source = remote LAN and NAT = local LAN/CARP interface (and client and server respectively)
Not sure what you mean. There's no special routing apart from the OpenVPN site-to-site settings.

brevilo

Update: ok, my third test pair is working again. That particular setup got screwed up when I disabled its iroute for testing purposes :-[ So I'm now back at reply #9 (https://forum.pfsense.org/index.php?topic=113151.msg636321#msg636321), where my test rig works with 2.3.1_5 but my production rig using CARP needs further debugging.

Stay tuned…
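
For context on why a disabled iroute produces exactly the symptom from the test pair: in a multi-client OpenVPN server, the kernel route only hands packets to the tunnel interface, while OpenVPN itself needs an iroute to know which connected client owns a remote subnet. The sketch below is a toy model of that lookup, not OpenVPN code. The class `ToyOpenVPNServer`, the client name `client1`, and the host 192.168.10.1 are hypothetical. The addresses are taken from the routing table earlier in the thread, assuming 192.168.10.0/24 is the client LAN.

```python
import ipaddress

class ToyOpenVPNServer:
    """Toy model of a server's internal routing (illustration only)."""

    def __init__(self):
        self.tunnel_ips = {}  # tunnel address -> client name
        self.iroutes = []     # (network, client name)

    def connect(self, client, tunnel_ip):
        """Register a connected client and its tunnel address."""
        self.tunnel_ips[ipaddress.ip_address(tunnel_ip)] = client

    def add_iroute(self, client, network):
        """Declare that a subnet lives behind the given client."""
        self.iroutes.append((ipaddress.ip_network(network), client))

    def forward(self, dst):
        """Return the client that should receive dst, or None if dropped."""
        addr = ipaddress.ip_address(dst)
        if addr in self.tunnel_ips:
            return self.tunnel_ips[addr]
        matches = [(net, c) for net, c in self.iroutes if addr in net]
        if matches:
            # Prefer the most specific matching subnet.
            return max(matches, key=lambda m: m[0].prefixlen)[1]
        return None

server = ToyOpenVPNServer()
server.connect("client1", "192.168.100.2")
print(server.forward("192.168.100.2"))  # tunnel address is known
print(server.forward("192.168.10.1"))   # no iroute yet, so dropped
server.add_iroute("client1", "192.168.10.0/24")
print(server.forward("192.168.10.1"))   # now owned by client1
```

In this model, a ping to the client's tunnel address is delivered while a ping to the client LAN is dropped until the iroute exists, which mirrors the capture results from the test pair.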

Soyokaze

@brevilo:

WAN (of course?)

Hybrid, with source = remote LAN and NAT = local LAN/CARP interface (and client and server respectively)

Not sure what you mean. There's no special routing apart from the OpenVPN site-to-site settings.

1. Not always. Sometimes I bind it to localhost, sometimes to a VIP.
3. Then it is "static".
2. >>with source = remote LAN and NAT = local LAN/CARP interface
Err? Are you NATing your OpenVPN network to the local LAN? Or did I misunderstand something?

Maybe you could draw a network diagram? It would help a little in understanding your topology.

NB: I have plenty of S2S links, with outbound NAT (traffic redirection) and inbound NAT (D/PNAT), and everything works OK on 2.3+.

brevilo

Err? Are you NATing your OpenVPN network to the local LAN?

No, why would I? I source NAT the respective remote LAN on each side, that is the client LAN on the server side and vice versa.

The topology is a straightforward site-to-site with HA/failover: ClientLAN---Client===Server---ServerLAN (with client and server using CARP, so two nodes each with virtual IPs on LAN and WAN).

Cheers

brevilo

Darn, I think I finally found the culprit: I had an old IPsec config on those boxes (my preferred VPN solution, but I never got it to work reliably) which was inactive but not disabled. It seems that this messes up some internal routing (not reflected by the routing table). It also seems that this is a regression in 2.3.?, since 2.2.6 still doesn't have this problem! IIRC, the enable/disable implementation of IPsec changed in 2.3, so that would explain it.

That was a mean one…

Cheers