clients cannot resolve any google sites (plus on other) but pfsense can
-
So, not sure if this is even a pfsense issue, but that is where I was able to figure out a workaround, and I figure there are some bright individuals here who can help point me in the right direction.
TL;DR Starting this afternoon, all the client machines at my work started to be unable to access google.com and were seeing a dns_probe_started/dns_probe_finished_bad_config error in chrome. Similar results in other browsers. Could not ping the sites from client machines but was able to ping and resolve hostnames in pfsense. After trying various troubleshooting steps I finally got things working by enabling DNS forwarding in the DNS resolver. Still not sure what is causing the sudden issue.
Read on for more details. I am at a loss as to troubleshooting steps at this point. Sounds like possible DNS redirection/hijacking? Security software (bitdefender gravityzone) has not detected anything and full scans still in progress on a few machines showing nothing.
DNS Settings at the time were as below. Anything cut off in the resolver option screen is disabled. Of note, I thought I had had the python module enabled, and when I restored the previous saved backup (after I resolved the main issue by enabling DNS forwarding in the resolver) the python module was enabled. I'm not sure how or when that became disabled.
As I said, this started this afternoon when folks noted they could not access google.com and were seeing the DNS error noted above. I also discovered remotedesktop.google.com would not resolve, as well as our company website. The company website is hosted by hostgator. Didn't check gmail, but assume it was the same. After confirming the issue on multiple clients I tried the following things which did not help:
- flushed the DNS cache and did a release/renew on my PC - didn't expect that to help but figured it was a good place to start
- turned off pfblockerng/dnsbl
- disabled DNS server override in General Setup (didn't remember enabling that to begin with - not sure whether it needs to be enabled or not)
- forgetting how the DNS resolver works (i.e. it does not use forwarding servers), I changed my DNS servers from the ISP-provided ones to Google DNS
- changed DNS Resolution behavior from default to use remote DNS servers, ignore local
Finally I had the idea of manually setting the DNS server on the client (used Google DNS servers) and that worked (but found I could not access my pfsense on the client). Seeing that, and being reminded now of how the resolver works, I enabled DNS forwarding in the resolver and reset the client to automatic config and that has worked. As noted above, I did then restore a previous backup (mainly to prove to myself that the python module had been enabled) and made the necessary changes to keep things working.
So, while things are working normally now, I am at a loss as to why this happened and not quite sure how to move forward to troubleshoot/diagnose the underlying issue. I'm pretty much self taught/learn as you go when it comes to all this, and could really use some advice/guidance here.
A few more details that are probably important. This happened at our main office and our satellite office, which have a site-to-site IPsec connection. The main office is using a netgate sg-2100 and the satellite an SG-1100. Both are on 23.01. Settings are essentially the same on both devices, except I did note the sg-1100 did have the python module enabled. On the sg-1100 I only enabled DNS forwarding in the resolver and did not restore any previous config.
Any help is greatly appreciated. Please let me know if there are any other details that would help.
-
-
@pzanga This doesn’t really explain your issue but since you said you enabled forwarding, you should disable DNSSEC. It can cause false errors when forwarding.
Also note https://redmine.pfsense.org/issues/14056
-
@SteveITS said in clients cannot resolve any google sites (plus on other) but pfsense can:
Also note https://redmine.pfsense.org/issues/14056
If forwarding is used, and forwarding is using "over TLS", then the workaround from that issue "14056" (set the NOASLR flag) might be a solution.
pfSense+ 23.05 solved this - so you probably have another good reason here to hit the upgrade button.
Btw : consider using pfSense as it was meant to be used :
So : no DNS servers - no DNS "ISP" override - and Use local DNS (127.0.0.1) .....
The image of the Resolver settings you showed above is ok - for myself, I also activated Python mode.
This is the other, bottom part :
With these settings, pfSense handles DNS in resolver mode, the "how Internet was meant to be used" mode.
clients cannot resolve any google sites (plus on other) but pfsense can
This implies that LAN clients can't use the network router as their DNS 'source'.
On the client (Windows), execute ipconfig /all
This spits out all the important network details.
It should mention the DNS server(s), which informs the LAN client device (a Windows PC) that it should use 192.168.1.1 (or the IPv6 equivalent) as the DNS source.
Next check, on the same device :
C:\Users\Gauche>nslookup google.com
Serveur : pfSense.no.way
Address: 2a01:cb19:907:a6dc::1
Réponse ne faisant pas autorité :
Nom : google.com
Addresses: 2a00:1450:4007:81a::200e
           142.250.201.174
If this command shows an error, you'll know that unbound, the resolver, wasn't listening on the pfSense LAN interface.
In that case : restart it (on pfSense).
This situation shouldn't (normally) happen - I've never seen it, but others (see here on forum) have.
Last but not least : even if your PC is using 192.168.1.1 as its DNS "source" this doesn't mean that applications running on that PC are using the system's (Windows) DNS source. They can very well choose to do things differently.
Chrome for example (I've never used it) will probably use its own known DNS servers for its DNS needs, as this browser's main goal is not doing browsing for you, but "collecting everything you do [with this application], and sending that info home". With Chrome settings you can 'force' it to use local (pfSense) DNS.
On the other hand : if the local pfSense DNS (unbound) is malfunctioning on the LAN side, Chrome's choice isn't actually that bad : DNS works : Internet pages show up.
-
@Gertjan Ah, I see the Redmine shows a target of 23.09 but the last post says it’s off in 23.05.
-
@SteveITS Thanks for that. I had overlooked DNSSEC but have disabled it now. I don't think that redmine issue is at play here, since I had not been forwarding when my issue started and not using SSL/TLS for outgoing DNS at this point.
@Gertjan said
Btw : consider using pfSense as it was meant to be used :
So : no DNS servers - no DNS "ISP" override - and Use local DNS (127.0.0.1) .....
Thank you. Since things are working currently, I plan to wait until after business hours and try to revert back to the default resolver mode as you noted, including python mode. I definitely want to use pfsense/dns as it was meant to be used. I didn't do the original configuration on our pfsense box, so not sure why the DNS servers were configured when forwarding was not enabled. I have made plenty of changes since taking over, but never thought to delete those. Without forwarding enabled (as it was configured) would the system ever use those DNS servers? Similarly, does the DNS override option being enabled have any effect if forwarding is disabled? Also, anyone know if the override feature is enabled/disabled by default? I don't recall ever touching that setting either way.
As far as the client machine, I ran ipconfig /all and it shows the pfsense box IP as the DNS source. And the nslookup showed no error. Of course, this is now with things working. When I can retest later today I will test with the previous problematic settings, if the issue recurs. I did try to ping google.com yesterday from the client when I was having issues and that showed 100% packet loss. Didn't think to try nslookup though. :(
Last but not least : even if your PC is using 192.168.1.1 as its DNS "source" this doesn't mean that applications running on that PC are using the system's (Windows) DNS source. They can very well choose to do things differently.
I do see the Chrome (and Edge) settings/flags that can be disabled to prevent them from using DoH. What is considered best practice when it comes to that? Should I disable those settings? Should I use pfsense rules to force all DNS requests to use the local DNS? Or just leave it as is (assuming that it isn't the cause of my problem).
Thanks again for the help. I will post an update when I get a chance to test the "proper" DNS config.
-
Another test : run this on the command line :
grep 'start' /var/log/resolver.log
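To see how those restarts are spread over time, rather than reading the grep output by eye, the matching lines can be counted per hour. What follows is only a rough Python sketch, not something pfSense ships: it assumes the classic syslog layout (`Mon DD HH:MM:SS host unbound[pid]: ...`) and that unbound writes `start of service` each time it starts. The function name `restarts_per_hour` is made up for illustration, and a different log timestamp format would need different bucketing.

```python
from collections import Counter


def restarts_per_hour(lines):
    """Count unbound 'start of service' lines per (month, day, hour) bucket.

    Assumes each line starts with a syslog-style 'Mon DD HH:MM:SS' timestamp.
    """
    counts = Counter()
    for line in lines:
        if "start of service" not in line:
            continue
        parts = line.split()
        if len(parts) < 3:
            continue  # too short to carry a timestamp, ignore it
        month, day, clock = parts[0], parts[1], parts[2]
        hour = clock.split(":")[0]
        counts[f"{month} {day} {hour}:00"] += 1
    return counts


# Possible use on a copy of the log (path taken from the grep command above):
# with open("/var/log/resolver.log") as log:
#     for hour, n in restarts_per_hour(log).items():
#         print(hour, n)
```

An hour bucket with a high count is the place to start looking for what keeps triggering the restarts.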
The idea is to keep the number of unbound restarts as low as possible.
A couple of times per week : ok. Many times per hour (for example) isn't fatal either, but during a restart, which can take several seconds, your network has no DNS.
@pzanga said in clients cannot resolve any google sites (plus on other) but pfsense can:
so not sure why the DNS servers were configured when forwarding was not enabled
If you didn't enter these :
then they are put there because this has been checked :
If your WAN (ISP) uses DHCP, pfSense, upon connection, uses DHCP, and this will deliver an IP, a network, a gateway and ..... one or more ISP DNS. Exactly as what happens when you connect a device (PC, whatever) to your pfSense LAN.
These ISP DNS servers are not used by the resolver (unbound).
But pfSense itself can use them if needed, for example if the first IP at the top (normally 127.0.0.1 = unbound) doesn't reply.
See : cat /etc/resolv.conf
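What matters in that file is the order of the `nameserver` lines, since the system tries them from top to bottom. As a rough illustration, here is a small Python sketch with a made-up helper name, `nameservers`. The sample contents are an assumption, not output from a real box. It lists the servers in the order the firewall itself would try them:

```python
def nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they appear in resolv.conf text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (they start with '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


# Hypothetical contents: unbound on localhost first, then one upstream server.
example = "nameserver 127.0.0.1\nnameserver 8.8.8.8\nsearch localdomain\n"
print(nameservers(example))  # ['127.0.0.1', '8.8.8.8']
```

If 127.0.0.1 is the only entry, pfSense itself has no fallback when unbound is down. If other servers follow it, the firewall can still resolve names while LAN clients cannot.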
@pzanga said in clients cannot resolve any google sites (plus on other) but pfsense can:
google.com yesterday from the client when I was having issues and that showed 100% packet loss
"packet loss" means : google.com was resolved, so 'ping' had an IP to work with. Thus DNS is ok.
But then there was no path to this IP : this means a bad connection.
@pzanga said in clients cannot resolve any google sites (plus on other) but pfsense can:
I do see the Chrome (and Edge) settings/flags that can be disabled to prevent them from using DoH. What is considered best practice when it comes to that? Should I disable those settings? Should I use pfsense rules to force all DNS requests to use the local DNS? Or just leave it as is (assuming that it isn't the cause of my problem).
Ok, you are aware that 'programs' like web browsers can do things their own way.
It's up to you to choose what you prefer to use, and what happens when and how. This info is important when you want to debug things.
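When debugging from a client, it helps to test "does the name resolve?" and "can the resolved address be reached?" as two separate steps, as in the ping discussion earlier in this post. Here is a minimal Python sketch of that idea, under stated assumptions: it uses the system resolver (so it says nothing about what a browser with its own DNS does), it tries a TCP connection to port 443 instead of ICMP ping, and the name `diagnose` and its return strings are invented for this example.

```python
import socket


def diagnose(host, port=443, resolve=socket.getaddrinfo,
             connect=socket.create_connection, timeout=3.0):
    """Classify a failure as a DNS problem or a connectivity problem."""
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "dns-failure"
    for info in infos:
        address = info[4][:2]  # (ip, port) for both IPv4 and IPv6 entries
        try:
            conn = connect(address, timeout=timeout)
        except OSError:
            continue  # this address is unreachable, try the next one
        conn.close()
        return "ok"
    return "resolved-but-unreachable"


# Example call from a client (needs a working network to be meaningful):
# print(diagnose("google.com"))
```

A result of "dns-failure" points at the resolver path (client settings, unbound), while "resolved-but-unreachable" points at routing or firewalling rather than DNS.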