Domain Controller resolution over IPSec
-
I don't think this is an IPSec problem: it's a pfSense site-to-site tunnel with default settings, the IPsec firewall interface is set to allow everything, and there is full communication via IP. I can resolve computer names and everything.
I usually set up Windows domain networks with a domain name override on pfSense and let pfSense handle everything, including forwarding via custom options to NextDNS.
All this works beautifully over the local network. But a domain-joined computer on the other side of that VPN claims it can't contact a logon server to apply group policy. Even though WINS is set through DHCP to that domain controller. Even though if I manually set DNS to use that DC on the client, it works perfectly. Even though on that side of the VPN I still have a domain name override pointing back to the fully reachable DC on the other side.
So bottom line, Windows DC DNS resolution just doesn't seem to work over the VPN connection, even though the traffic itself does. I don't understand why. Hopefully someone does.
-
@CarAnalogy I should also note that the computer's firewall profile setting is set to private, the server's software firewall completely disabled for testing.
Maybe this is a Windows problem, I just don't understand how I can make it work so easily over the local network but somehow can't get it to go over the VPN.
-
I didn't expect to get much response to this one, since it's not really directly pfSense related. I wouldn't blame anyone reading it for just saying "pfft, Windows" and continuing to scroll.
However in the interest of helping anyone else that may have had the same question, I found a workaround.
There has always been some question (in my mind and experience, at least) whether Windows even notices when a second DNS server has been configured. Apparently now it does. Using DHCP to set pfSense as the primary DNS server, and the DC as the secondary DNS server seems to solve the issue.
I like to keep all DNS traffic that's not specifically Active Directory related away from the Domain Controller's DNS, but this does work to mitigate the problem I described.
-
@CarAnalogy One more quick update: checking on the domain controller, it does still appear to be handling requests only for the domain. Maybe this is intended behavior, maybe no one else will ever have this problem, but I hope this helps someone.
-
@CarAnalogy said in Domain Controller resolution over IPSec:
There has always been some question (in my mind and experience, at least) whether Windows even notices when a second DNS server has been configured. Apparently now it does. Using DHCP to set pfSense as the primary DNS server, and the DC as the secondary DNS server seems to solve the issue.
My understanding here is that whenever more than a single DNS server is configured on a Windows client (or server), it will randomly choose one of the configured servers and then try to communicate with it. If the chosen server answers, Windows will continue to use only that server for all future DNS requests until either the client is rebooted (in which case the random selection happens again) or until the chosen DNS server subsequently fails to respond. And in this case "fails to respond" means a loss of connectivity, as in pings time out and/or the DNS request itself times out. "Fails to respond" does NOT mean the DNS server answered with NXDOMAIN (non-existent domain). Any answer is considered a "response" and thus the Windows client will continue to use that server.
So, if my understanding is correct, it would seem Windows may be unable to contact one of the servers (the DC perhaps) and defaults to the second configured one. Why it fails to contact the DC first I don't know, but generally such issues are caused by either routing problems or a local firewall on the remote host.
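The "no response" versus "negative answer" distinction above can be sketched as a toy model. This is only an illustration of the rule as described in this post, not Windows' actual resolver code; the class name, the server labels, and the choice to start at the first listed server are all assumptions made for the sketch.

```python
class DnsClientModel:
    """Toy model: stay on the current server until it fails to respond."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.current = 0  # assumption: start with the first listed server

    def resolve(self, name, responders):
        # responders maps a server label to a function(name) that returns
        # an answer string, "NXDOMAIN", or None when the server is silent.
        for _ in range(len(self.servers)):
            server = self.servers[self.current]
            answer = responders[server](name)
            if answer is not None:
                # Any answer, even NXDOMAIN, counts as a response:
                # the client stays on this server.
                return server, answer
            # Only silence (a timeout) moves the client to the next server.
            self.current = (self.current + 1) % len(self.servers)
        return None, None
```

In this model a first server that answers NXDOMAIN for the AD name is never skipped, while a first server that simply does not answer is.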
-
@bmeeks That’s the funny thing, it can definitely contact everything on the other side of the firewall. It seems like it’s making a broadcast request for the domain controller and then just failing if that doesn’t work.
Windows DNS server order is a mystery to many but it does seem in this case it’s trying the first listed server for everything, and then either purposely or very quickly failing over to the domain controller for DNS lookups of the Active Directory domain. DNS on the server doesn’t show any cached queries for any outside domains, and it does show the DNS registration of the domain joined computers on the far side of the VPN.
It’s actually working perfectly in this configuration, it’s still strange to me though that somehow that DNS name override doesn’t seem to have any effect over the VPN.
-
@CarAnalogy said in Domain Controller resolution over IPSec:
@bmeeks That’s the funny thing, it can definitely contact everything on the other side of the firewall. It seems like it’s making a broadcast request for the domain controller and then just failing if that doesn’t work.
Windows DNS server order is a mystery to many but it does seem in this case it’s trying the first listed server for everything, and then either purposely or very quickly failing over to the domain controller for DNS lookups of the Active Directory domain. DNS on the server doesn’t show any cached queries for any outside domains, and it does show the DNS registration of the domain joined computers on the far side of the VPN.
It’s actually working perfectly in this configuration, it’s still strange to me though that somehow that DNS name override doesn’t seem to have any effect over the VPN.
You may have already come across this information, but there may be something helpful at this Microsoft link: https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/manage/dc-locator?tabs=dns-based-discovery. I went back and read your original post again and noticed you mentioned the error message about failing to locate a DC to retrieve and apply group policy. Perhaps there is a setting that needs tweaking in your AD site configuration?
Have you performed packet captures at each end of the VPN link? Are the DNS queries from your client never making it across the VPN to the DC DNS server, or are the queries arriving and the replies from the DC DNS getting misrouted? Captures at both ends of the tunnel would show what's actually hitting the wire. I'm not there and am not familiar with your network, but the symptoms you describe seem to be related to a routing issue with the VPN traffic and the DC. Also, are there any firewall rules limiting traffic through the VPN? Make sure those are letting all the Active Directory stuff go through unmolested.
Can you ping the IP address of the DC from the client on the VPN that says it cannot locate a DC to fetch group policy from? That would indicate whether or not there is a fundamental communications issue. If you can't ping the IP address of the DC, then other things will likely fail as well.
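Since a ping only proves ICMP gets through, a quick TCP check of the usual Active Directory ports from the remote client can complement it. The following is a minimal sketch, not a full diagnostic: the function names are made up for illustration, the port list is just the common TCP subset (DNS and Kerberos also use UDP, which this does not test), and the DC address you pass in is whatever applies on your network.

```python
import socket

# Common TCP ports a domain member talks to on a DC (not exhaustive).
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_dc(host, ports=AD_TCP_PORTS):
    """Return {service name: reachable?} for each port in the table."""
    return {name: tcp_port_open(host, port) for port, name in ports.items()}
```

Running `check_dc` with the DC's IP from the client on the far side of the VPN gives a per-service True/False table. Anything False there points at filtering or routing rather than DNS.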
-
@bmeeks that’s a good link, thank you.
Yes, I can ping it, pull up network shares, everything. Totally not a connectivity issue. Benefit of it being pfSense both sides, and following the pfSense manual recommendations, just default settings both sides, local interface selected for P2, allow everything on all protocols from everywhere on IPsec interface, all that good stuff. The VPN works perfectly.
But I think your link is heading the right direction. This feels to me like something making a broadcast request, which of course can’t go over the VPN because it just doesn’t work that way. Even though according to that article if I’m reading correctly, broadcast should be the fallback when DNS fails.
I’m a big fan of going with the default configuration as much as possible, as low touch on clients as possible, everything from the network config. Somehow it seems like Windows, in its default configuration, is taking that broadcast more seriously than the DNS resolution.
Cause remember when the windows clients are set to directly use the DC as the DNS server, it works fine. It can reach it, but for some reason doesn’t try to do that by DNS unless it’s explicitly told to.
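One way to test the "DNS versus something else" theory is to ask each DNS server directly for the SRV records that DNS-based DC location relies on, and compare the answers. Below is a rough standard-library sketch. The `_ldap._tcp.dc._msdcs.<domain>` and site-specific name patterns are the standard DC locator SRV names, but the function names, the fixed query ID, and any addresses or domain names you feed in are placeholders, and the reply parsing only reads the header (response code and answer count).

```python
import socket
import struct

SRV = 33  # DNS record type number for SRV


def dc_locator_names(domain, site=None):
    """SRV names used to find a DC, site-specific name first if a site is given."""
    names = [f"_ldap._tcp.dc._msdcs.{domain}"]
    if site:
        names.insert(0, f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


def build_query(name, qtype=SRV, query_id=0x1234):
    """Build a minimal DNS query packet (one question, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def ask(server, name, timeout=3.0):
    """Return (rcode, answer_count) from server over UDP, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_query(name), (server, 53))
        try:
            reply, _ = s.recvfrom(4096)
        except socket.timeout:
            return None
    if len(reply) < 12:
        return None  # too short to contain a DNS header
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    return flags & 0x000F, ancount  # rcode 0 = NOERROR, 3 = NXDOMAIN
```

Calling `ask` once with the pfSense address and once with the DC address, for each name from `dc_locator_names`, shows whether the domain override returns the same SRV answers the DC does. An rcode of 3 or an answer count of 0 from pfSense, with real answers from the DC, would suggest the override is where the SRV lookups are getting lost.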
-
@CarAnalogy said in Domain Controller resolution over IPSec:
Cause remember when the windows clients are set to directly use the DC as the DNS server, it works fine. It can reach it, but for some reason doesn’t try to do that by DNS unless it’s explicitly told to.
I did not realize you were NOT setting the DC as the DNS server configured on the Windows clients. I thought you were handing out the AD DNS server to the Windows clients and they still could not resolve using it.
In my view (and many others agree with this) you should ALWAYS point Windows Active Directory clients directly to the DC (or an Active Directory DNS) for DNS resolution. The AD DNS server can then be configured to resolve or forward for domains which it is not authoritative for. But you generally do not want to send Windows AD clients to a non-AD DNS server first even if you have a domain override in place.
The preferred way to do this is point the Windows AD clients directly to the AD DNS server as part of their DHCP configuration. I also highly recommend using the Windows DHCP server as well for AD setups. It just works much more seamlessly due to its ability to dynamically update the AD DNS with host IP addresses and names. Use the domain override within pfSense only to send lookups from other non-AD clients to the AD DNS should those clients need to resolve an AD host for some reason.
-
Windows DNS isn’t random, it’s closer to “last known good” but it resets after a time. At some point I ran across an MS KB article about it; I can try to dig it up if someone reminds me.
We just did this with IPSec and have with OpenVPN in the past. Don’t recall doing anything special tbh, probably gave the DNS in the IPSec config? We do normally set a domain override in DNS for the AD domain.
For remote, there is a chicken/egg thing where you have to connect the VPN to join the domain, reboot, connect again, and then switch user to log in as the domain user.
-
I remembered to look for it.
“...
The DNS client does not utilize each of the DNS servers listed in TCP/IP configuration for each query. By default, on startup the DNS client will attempt to use the server in the Preferred DNS server entry. If this server fails to respond for any reason, the DNS client will switch to the server listed in the alternate DNS server entry. The DNS client will continue to use this alternate DNS server until:
- It fails to respond to a DNS query, or:
- The ServerPriorityTimeLimit value is reached (15 minutes by default).
Note
Only a failure to respond will cause the DNS client to switch Preferred DNS servers; receiving an authoritative but incorrect response does not cause the DNS client to try another server. As a result, configuring a Domain Controller with itself and another DNS server as Preferred and Alternate servers helps to ensure that a response is received, but it does not guarantee accuracy of that response. DNS record update failures on either of the servers may result in an inconsistent name resolution experience.”
-
Thanks to @bmeeks and @SteveITS for the information and links.
Ordinarily on the local network, I let the clients use pfSense for DNS and DHCP, and use the domain override for the AD domain name, and it works fine.
I do agree with the advice about using Windows DHCP to update DNS and have in the past, but then I've had issues using the DC as the primary DNS server, and if it goes down it also takes the internet connection down which makes things trickier for me remotely.
But after this experience, I agree with your recommendations. Setting the DC as the secondary DNS directly via DHCP, while leaving pfSense as the primary DNS, seems to completely solve this issue. The clients instantly resolve the domain for logon and network policies and drives, and the domain controller doesn't show any cached DNS queries for any outside domains. And on the server I do see the computers on the far side of the VPN registered in DNS dynamically.
With regard to the link Steve posted, I'm not sure why this seems to work so well. But that's still the strange part. It's not that the domain override wasn't working, or that the DC was unreachable, it's that Windows still seemed to be trying to reach the DC via broadcast first, or some other way that somehow was not making it the full circuit across the VPN.
I do set the DC to use itself on localhost as its only DNS.
-
@CarAnalogy I looked at a client with IPSec. It has:
DNS Servers [ x ] Provide a DNS server list to clients
...checked in the Mobile Clients section. Then firewall rules need to allow DNS of course.
There are many ways to get DNS to work, and as long as the PC can see the domain they are valid. :)
-
@SteveITS these aren't actually mobile clients, this is a site to site IPSec.
But yeah I think we agree the way to go here is to specifically assign the DC as DNS one way or another. Since I control DHCP on both sides, that seems to be the way to go in this case.