IPSEC + DNS Resolver/Domain Override + Static Route [ Solved ]

moterpent

Since the last 2.2.x release, and in every 2.3.x release, I've noticed a change in DNS resolution when using domain overrides where the DNS server for the overridden domain is on the opposite side of an IPSEC tunnel.

On earlier versions this would work fine, but only if one added a static route for the networks where the remote DNS server was located, using the IP of the LAN (or other appropriate interface) as the gateway.

With the last few versions, I've noticed that the static route causes problems with general network connectivity across the tunnel for certain clients. Windows clients seem to handle it OK, but newer Linux distributions seem to struggle. Older Linux distributions seem to be OK. For instance, Windows 7/10 and CentOS 6 work fine, while CentOS 7, recent Ubuntu (15.x, 16.x) and Mint (17.x) struggle.

Upon removing the static route, the troublesome clients start working as long as they talk via IP only. It seems pfSense still needs the static route in order for DNS resolver domain overrides to work.

So I'm faced with either:

a) Disable static route: All clients work, but none can resolve anything via DNS.

…or…

b) Enable static route: Some clients can resolve DNS and communicate with remote, while others can't communicate at all / have overwhelming packet loss.

Any thoughts and/or recommendations?

moterpent

In an attempt to revive this thread, here's some additional information. When the aforementioned static route is enabled, which still seems to be required for the DNS resolver to work across the IPSEC tunnel, I get the following when trying to ping hosts on the remote side of the tunnel. For instance, from an Ubuntu 16.04 box:

ping 172.16.59.15
PING 172.16.59.15 (172.16.59.15) 56(84) bytes of data.
From 192.168.60.2: icmp_seq=1 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=1 ttl=254 time=33.4 ms
From 192.168.60.108 icmp_seq=2 Destination Host Unreachable
From 192.168.60.108 icmp_seq=3 Destination Host Unreachable
From 192.168.60.108 icmp_seq=4 Destination Host Unreachable
From 192.168.60.2: icmp_seq=5 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=5 ttl=254 time=16.6 ms
...

Notice that 3 of every 4 ICMP/ping packets never make it through. If I try this from a Win10 or CentOS 6 host on the same source subnet, it works fine. All hosts use the same gateway (192.168.60.2, pfSense) and are in the same broadcast domain (192.168.60.0/24).
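For anyone who wants to compare clients from saved ping output, here is a minimal Python sketch. It is not part of the original troubleshooting; the function name `summarize_ping` is made up for illustration, and the sample is simply the Ubuntu 16.04 output pasted above with the trailing "..." left out.

```python
import re

# Lines copied from the Ubuntu 16.04 ping output above (trailing "..." omitted).
SAMPLE = """\
From 192.168.60.2: icmp_seq=1 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=1 ttl=254 time=33.4 ms
From 192.168.60.108 icmp_seq=2 Destination Host Unreachable
From 192.168.60.108 icmp_seq=3 Destination Host Unreachable
From 192.168.60.108 icmp_seq=4 Destination Host Unreachable
From 192.168.60.2: icmp_seq=5 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=5 ttl=254 time=16.6 ms
"""

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def summarize_ping(text):
    """Group ICMP sequence numbers by what happened to them."""
    replied, failed, redirected = set(), set(), set()
    for line in text.splitlines():
        match = SEQ_RE.search(line)
        if match is None:
            continue  # header lines, "..." and blanks carry no sequence number
        seq = int(match.group(1))
        if "bytes from" in line:
            replied.add(seq)       # an echo reply came back
        elif "Unreachable" in line:
            failed.add(seq)        # local host gave up on this probe
        elif "Redirect" in line:
            redirected.add(seq)    # the gateway sent an ICMP redirect
    return {
        "replied": sorted(replied),
        "failed": sorted(failed),
        "redirected": sorted(redirected),
    }


if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
```

On the sample above, the replies are for sequence numbers 1 and 5, which are exactly the ones that were preceded by a redirect from 192.168.60.2, and the failures are 2, 3 and 4.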

Sample ping output for the CentOS 6 box to the same host:

ping 172.16.59.15
PING 172.16.59.15 (172.16.59.15) 56(84) bytes of data.
From 192.168.60.2: icmp_seq=1 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=1 ttl=254 time=11.4 ms
From 192.168.60.2: icmp_seq=2 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=2 ttl=254 time=10.6 ms
From 192.168.60.2: icmp_seq=3 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=3 ttl=254 time=10.3 ms
From 192.168.60.2: icmp_seq=4 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=4 ttl=254 time=16.1 ms
From 192.168.60.2: icmp_seq=5 Redirect Host(New nexthop: 172.16.59.15)
64 bytes from 172.16.59.15: icmp_seq=5 ttl=254 time=16.8 ms

The redirects are still there, but the packets go through.

Now remove the static route and try again from Ubuntu 16.04:

ping 172.16.59.15
PING 172.16.59.15 (172.16.59.15) 56(84) bytes of data.
64 bytes from 172.16.59.15: icmp_seq=1 ttl=254 time=36.6 ms
64 bytes from 172.16.59.15: icmp_seq=2 ttl=254 time=35.0 ms
64 bytes from 172.16.59.15: icmp_seq=3 ttl=254 time=34.8 ms
64 bytes from 172.16.59.15: icmp_seq=4 ttl=254 time=33.1 ms
64 bytes from 172.16.59.15: icmp_seq=5 ttl=254 time=32.2 ms

Packet loss is gone and all hosts work great regardless – by IP address. Of course DNS no longer resolves for domains that live on the other side of the tunnel.

Any thoughts, suggestions or questions are very welcome. Thanks.

moterpent

After some additional study and testing I came across the following forum post:

https://forum.pfsense.org/index.php?topic=95163.0

This appears to resolve the problem in my case.

To summarize, and in the event someone else runs into this, the fix appears to be…

Go to Services -> DNS Resolver (unbound)
Scroll down to the "Outgoing Network Interfaces" section
Make sure that only the LAN interface is selected. ("All" is selected by default.)
Save and apply

I would think that there could still be issues if a person happens to have more than one LAN type interface (OPT1 etc). Fortunately, in my case, I only have a single LAN so this will do for the time being. It would be nice if there was some way to specify an interface for each domain and host override instead of assuming the same one for everything. Or at least have a way of hinting to unbound which interface to use for certain requests.

I'll continue to test this and if it holds up will mark this post as solved.

(Note: Using this method also removes the need for the static route / redirect.)
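A likely explanation for why this works, which the thread itself does not confirm: a policy-based IPSEC tunnel only carries packets whose source and destination both match the tunnel's phase 2 networks. A query that unbound sends from the LAN address matches; one sent from another interface's address does not. The Python sketch below just illustrates that matching rule. The remote prefix length, the WAN address and the use of 172.16.59.15 as the remote DNS server are assumptions for the example, not values given in the thread.

```python
import ipaddress

# Phase 2 networks. 192.168.60.0/24 comes from the thread; the remote
# prefix is an assumption (only the host 172.16.59.15 is mentioned).
LOCAL_NET = ipaddress.ip_network("192.168.60.0/24")
REMOTE_NET = ipaddress.ip_network("172.16.59.0/24")  # assumed /24

LAN_IP = "192.168.60.2"       # pfSense LAN address, from the thread
WAN_IP = "203.0.113.10"       # hypothetical WAN address (documentation range)
REMOTE_DNS = "172.16.59.15"   # stand-in for the remote DNS server


def enters_tunnel(src, dst):
    """Return True if a packet from src to dst matches the phase 2 networks."""
    return (
        ipaddress.ip_address(src) in LOCAL_NET
        and ipaddress.ip_address(dst) in REMOTE_NET
    )


if __name__ == "__main__":
    for label, src in (("LAN", LAN_IP), ("WAN", WAN_IP)):
        verdict = "matches" if enters_tunnel(src, REMOTE_DNS) else "does not match"
        print(f"query sourced from {label} ({src}) {verdict} the tunnel")
```

With these assumed values, only the query sourced from the LAN address matches, which is consistent with restricting "Outgoing Network Interfaces" to LAN.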

flowjo-mike

Hi, I am having the same issue, except changing the DNS resolver setting doesn't help at all. I am running 2.3.2, and the only way for our VPN clients to resolve LAN DNS is to manually add the DNS server to their network interface (wifi or eth)… Adding DNS to the VPN connection didn't help.

I have tried all suggestions I found in the forums, but no setting on the pfSense would work.

Is yours still working?