DNS Resolver Sometimes Not Resolving Hosts
-
I've noticed something that I suspect may be similar so I figured I would add it to this thread (apologies if I'm misunderstanding something instead!). As some background:
I have Unbound enabled as the DNS resolver:
Network Interfaces: All
Outgoing Network Interfaces: LAN
DNS Query Forwarding: Enabled
(That's about it for the configuration; the rest is default.)
System > General Setup -
DNS Servers: [Internal DNS Server IP, using LAN gateway]
Clients on the LAN network query the internal DNS directly. I push the IP of the internal DNS to clients that connect using OpenVPN, so they should be querying the internal DNS directly as well.
As such, it might well be that I don't need to have either the forwarder or the resolver enabled on the pfSense (since nothing is really asking the pfSense directly). However, it is handy to be able to resolve internal and external FQDNs from the firewall for ping, traceroute and drill, and for this reason I assume I need to have either the forwarder or the resolver configured.
So with the background out of the way, my story:
When I use drill to query an internal FQDN, presumably Unbound forwards the request to the internal DNS as configured. However, what I'm seeing is that approximately a tenth of the time, rather than the correct resolution as reported by the internal DNS, I get an empty answer (NOERROR, zero answer records) from 127.0.0.1.
The command I'm using is:
drill fully.qualified.internal.domain
But occasionally I get:
$ drill fully.qualified.internal.domain
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 52900
;; flags: qr rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; fully.qualified.internal.domain. IN A
;; ANSWER SECTION:
;; AUTHORITY SECTION:
;; ADDITIONAL SECTION:
;; Query time: 12 msec
;; SERVER: 127.0.0.1
;; WHEN: Tue Dec 1 17:06:44 2015
;; MSG SIZE rcvd: 42
Yet nine times out of ten I see the correct resolution from the internal DNS (this is when running exactly the same drill command and just hitting 'execute' again).
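To put a number on the "one in ten" failures, something like the following could run the query repeatedly and count the empty NOERROR replies. This is only a rough sketch: the function names are made up for illustration, and it assumes drill is on the PATH and prints the usual header with an "ANSWER: n" count, as in the output above.

```python
import re
import subprocess


def answer_count(drill_output):
    """Return the ANSWER count from the flags line of drill output."""
    match = re.search(r"ANSWER:\s*(\d+)", drill_output)
    if match is None:
        raise ValueError("no ANSWER count found in drill output")
    return int(match.group(1))


def is_empty_noerror(drill_output):
    """True when the reply is NOERROR but carries no answer records."""
    return "rcode: NOERROR" in drill_output and answer_count(drill_output) == 0


def count_empty(name, runs=50):
    """Run drill `runs` times and count empty NOERROR replies (hypothetical helper)."""
    empty = 0
    for _ in range(runs):
        result = subprocess.run(["drill", name], capture_output=True, text=True)
        if is_empty_noerror(result.stdout):
            empty += 1
    return empty


# Header lines copied from the failing reply shown above.
SAMPLE = """;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 52900
;; flags: qr rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""

print(is_empty_noerror(SAMPLE))
```

Calling count_empty("fully.qualified.internal.domain") on the firewall would then give the number of blank replies out of 50 runs, which is easier to compare before and after a config change than hitting 'execute' by hand.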
Am I missing a setting somewhere?
-
You'll need to provide more details about what you've done. Are you using DNS Resolver host overrides? Domain overrides? No overrides?
-
Hi,
Sorry, here are some further details:
There are no overrides in place for either domains or hosts. Nothing additional is specified in the Advanced box of the General Settings.
The Advanced Settings tab is all defaults.
Under the Access Lists I have created a single record that contains all the subnets that will be communicating with the firewall from the LAN side (including the OpenVPN tunnel subnet) and set it with an Allow action.
-
How are you doing internal domains with no overrides? Where is the DNS zone authority?
-
The answer probably comes back either to the setup outlined above, or
I'm doing something wrong, or
my terminology might be misleading (I have the potentially confusing habit of using 'Internal DNS' to mean 'the DNS server on the internal network' {i.e. the LAN} but it just struck me that 'Internal DNS' might be read by others as meaning 'hosting DNS records on the pfSense itself' so I'll make an effort here to be a little more explicit in the answer below).
Basically no client is configured to direct DNS queries at the pfSense (2.2.5 by the way) itself. Hosts on the LAN direct DNS queries to the DNS server on the LAN. OpenVPN clients have a tunnel to the LAN and direct queries to the LAN DNS (as configured in their OpenVPN config). As such, for day to day purposes the firewall doesn't have much to do with DNS (my issue comes about when using some of the diagnostic tools built into the pfSense to reach devices via their FQDN).
For my setup, if the firewall were to have a DNS query directed at it then all it needs to know how to do is forward the request to the DNS server on the LAN which will then either have the record, or act as a recursive resolver for the query (hence why I have forwarding enabled in Unbound). The LAN DNS manages the resolution of all the private DNS hostnames (i.e. it holds the A, AAAA records etc, for hosts on the LAN domain) and if it can't resolve the FQDN directly (i.e. the query is for something in the domains that the LAN DNS is not authoritative for) then the LAN DNS will recursively resolve the query via the public DNS infrastructure.
Looking at the definitions of the two functions you mentioned:
Host Overrides allows creation of custom DNS responses/records to create new entries that do not exist in DNS outside the firewall, or to override DNS responses for other hosts.
Host overrides don't sound applicable to my situation as the records exist on the LAN DNS server (and they are valid, they do not need to be overridden).
Domain Overrides are for domains that should be queried by a specific remote server. For example, if all records for mysite.example.com exist on a private DNS server at 192.0.2.5, then a domain override can be set to forward all queries for that domain to that server.
Again, this is not required day to day since DNS queries are directed at the LAN DNS which knows what domain(s) it is authoritative for, and knows how to resolve queries for domains that it is not authoritative for.
Based on the (hopefully clearer) description of my environment, does it sound to you like I do need entries in either of these areas given that Unbound is being asked to operate as a forwarder?
EDIT:
Anyway, all of this is a bit of an aside and is hijacking the OP's thread, so I will butt out again. The intent was to mention that I too am seeing potential DNS resolution issues with Unbound, in my case when using drill on the pfSense command prompt (all the rest of my waffle was to give some context so that smarter people might be able to work out if it's a bug, or a config mistake I have made). Thanks all.
-
One thing you need to understand about a resolver… It has to walk the tree to resolve something: roots, NS for the TLD, name servers for the domain in question, etc. If you're doing a cold resolve, and especially if, say, the NS for a specific domain are on the other side of the planet from you or are just plain slow to respond, then it's quite possible a query might time out the first time if the client wants it FAST and doesn't wait long enough.
You have added just a little bit more time to that since your clients are asking your internal, who then asks unbound on pfsense.
Now to the OP issue... Here is a PROBLEM!!
10.0.0.254 (PfSense)
8.8.8.8
8.8.4.4
If you want your clients to resolve your internal hosts, then the ONLY DNS they should point to is DNS that knows about your local hosts. If you ask 8.8.8.8, it is not going to know shit about your local hosts' local IPs.
Point your clients to ONLY your internal, let your internal look up the stuff it doesn't know about from external. Pointing clients to name servers that don't contain the same info is just asking for issues since you can never be sure which dns the client will ask..
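As a quick sanity check on a list like the one above, a few lines of Python can flag which of the servers handed to clients sit outside private address space and so cannot know about local hosts. It's a sketch only: public_resolvers is a made-up name, and a private address is no guarantee that a server actually holds your local records.

```python
import ipaddress


def public_resolvers(servers):
    """Return the servers whose addresses are not in private address space."""
    return [s for s in servers if not ipaddress.ip_address(s).is_private]


# The DNS list quoted above.
handed_out = ["10.0.0.254", "8.8.8.8", "8.8.4.4"]

print(public_resolvers(handed_out))  # → ['8.8.8.8', '8.8.4.4']
```

Anything this prints is a server that will answer for twitter.com but not for your internal names, which is exactly the mix that makes lookups fail at random.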
-
Hey Guys,
I was actually waiting for an email with a reply… turns out this forum doesn't send emails haha.
Thank you all very much for your responses. I still have to read through them more carefully.
I'm away from home currently, but I'll be sure to remove Google's DNS servers from the list when a client requests an IP and such... when I return.
Does the resolver forward external queries (say twitter.com) to the DNS servers configured in system>advanced? Or is that a forwarder thing?
-
Forwarder. The resolver resolves from the root down. Described as walking the tree earlier.
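For anyone who wants to picture "walking the tree", here is a toy sketch that lists the zones a resolver works through for a name, from the root down. It does no network queries, delegation_chain is a made-up name, and a real resolver like Unbound caches heavily, so it will often skip levels it already knows about.

```python
def delegation_chain(name):
    """List the zones from the root down to `name`, in the order they are walked."""
    stripped = name.strip(".")
    if not stripped:
        return ["."]
    labels = stripped.split(".")
    chain = ["."]
    # Build "com.", then "twitter.com.", and so on, rightmost label first.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("twitter.com"))  # → ['.', 'com.', 'twitter.com.']
```

A forwarder skips all of this and simply hands the whole question to another server, which does the walking (or answers from its cache).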
-
"turns out this forum doesn't send emails haha."
What?? Yeah it does.. Did you set up notifications?
"Does the resolver forward external queries"
<rolleyes>JFC… maybe there should just be a simple test before you allow the resolver to be turned on in pfSense.. Answer a question on the difference between a resolver and a forwarder, and if they don't get it right they cannot enable it.. Maybe it should just go back to the forwarder as default, because use of an actual resolver just seems way too complicated for what is, sad to say, a large portion of the user group... It's like the basic concept has to be explained every freaking day...</rolleyes>
-
I have a similar issue, not sure… but I do point hosts to only pfsense DNS...
When I try to ask the DNS about a domain with nslookup I get:
nslookup gmail.com 192.168.0.1
;; connection timed out; no servers could be reached
However, the service is up! After rebooting TWICE… magically it resolves everything as it should.
Resolver log says:
Dec 6 11:02:27 unbound: [17320:0] info: start of service (unbound 1.5.4).
Dec 6 11:02:27 unbound: [17320:0] info: service stopped (unbound 1.5.4).
Dec 6 11:02:27 unbound: [17320:0] info: start of service (unbound 1.5.4).
Dec 6 11:02:26 unbound: [17320:0] info: service stopped (unbound 1.5.4).
Dec 6 11:02:26 unbound: [17320:0] info: start of service (unbound 1.5.4).
Dec 6 10:50:05 unbound: [58211:0] info: start of service (unbound 1.5.4).
Dec 6 10:50:05 unbound: [58211:0] info: service stopped (unbound 1.5.4).
Dec 6 10:50:04 unbound: [58211:0] info: start of service (unbound 1.5.4).
Dec 6 10:50:02 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:49:40 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:49:40 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:49:09 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:49:09 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:48:46 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:48:46 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:48:43 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:48:43 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:48:11 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:48:11 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:47:56 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:47:56 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:47:37 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:47:37 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:47:10 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:47:10 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:47:04 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:47:04 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:46:50 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:46:50 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:46:42 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:46:41 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:46:39 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:46:39 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:46:31 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:46:31 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:46:11 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:46:11 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:59 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:59 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:54 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:54 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:45 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:44 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:40 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:40 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:35 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:35 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:31 unbound: [77165:0] info: start of service (unbound 1.5.4).
Dec 6 10:45:31 unbound: [77165:0] info: service stopped (unbound 1.5.4).
Dec 6 10:45:25 unbound: [77165:0] info: start of service (unbound 1.5.4).
-
Well, this is not a case of it not returning an answer because it didn't know or couldn't find the record. That looks like you just got a timeout, because it sure looks like your unbound is starting and stopping all the time. So it was probably off when you asked.
Uncheck "Register DHCP leases in the DNS Resolver" in the resolver settings and see if that keeps it from starting and stopping every few minutes.
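To see whether the change actually helps, one option is to count the "start of service" lines in the resolver log before and after. Below is a minimal sketch of that; restart_counts is a made-up name, and it assumes log lines in the same shape as the ones pasted above.

```python
import re
from collections import Counter

# Matches the PID and the event from lines like:
# "Dec 6 11:02:27 unbound: [17320:0] info: start of service (unbound 1.5.4)."
LINE = re.compile(r"unbound: \[(\d+):\d+\] info: (start of service|service stopped)")


def restart_counts(log_lines):
    """Count 'start of service' events per Unbound PID."""
    starts = Counter()
    for line in log_lines:
        match = LINE.search(line)
        if match and match.group(2) == "start of service":
            starts[match.group(1)] += 1
    return starts


# A few lines copied from the log posted above.
sample = [
    "Dec 6 11:02:27 unbound: [17320:0] info: start of service (unbound 1.5.4).",
    "Dec 6 11:02:27 unbound: [17320:0] info: service stopped (unbound 1.5.4).",
    "Dec 6 11:02:27 unbound: [17320:0] info: start of service (unbound 1.5.4).",
    "Dec 6 10:50:05 unbound: [58211:0] info: start of service (unbound 1.5.4).",
]

print(restart_counts(sample))
```

If the count over a given stretch of log stays high after unchecking the DHCP registration option, the restarts are coming from somewhere else.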