DNS resolution for some hosts fails, but nslookup works
-
Hello,
recently I encountered a problem that I don't know how to research. That is why I hope that someone here might have an idea.
As mentioned in the title, I have a pfSense firewall with the DNS Resolver service enabled. It queries another firewall in the network, which is configured as the DNS server under System -> General Setup. This works fine, except for some hosts.
Under Diagnostics -> DNS Lookup, the FQDNs of these hosts cannot be resolved. When I run nslookup from the command prompt with the FQDN and the address of the other firewall as the name server, the FQDN resolves just fine. I don't know what might cause this, since there is no domain override for these hosts. The other firewall can't be the problem, because querying it directly from the pfSense command prompt works fine for these hosts.
The one thing these hosts have in common is that their fqdn can't be resolved by public nameservers. The other firewall is the only nameserver configured on the pfSense firewall and is able to resolve these fqdns. So I don't think this could cause the problem either.
So far I have tried enabling and disabling Forwarding Mode in the Resolver config and restarted the resolver service. I also did a packet capture and verified that in both cases (Diagnostics -> DNS Lookup and command prompt) the other firewall is queried and answers with the correct IP address for these hosts. So in both cases the pfSense gets a reply with the correct IP address.
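The two lookup paths compared above can be reproduced side by side with a small script. This is only a sketch: `ask_system` goes through the operating system's resolver path (an analogy for the GUI lookup, not what pfSense actually runs), while `ask_server` sends a hand-built query straight to one name server, like `nslookup <fqdn> <server>`. All function names here are made up for illustration.

```python
import socket
import struct
from typing import List, Tuple


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (recursion desired) for `name`."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    # QNAME ends with the zero-length root label, then QTYPE and QCLASS=IN.
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def response_summary(packet: bytes) -> Tuple[int, int]:
    """Return (rcode, answer_count) read from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return flags & 0x000F, ancount


def ask_server(name: str, server: str, timeout: float = 3.0) -> Tuple[int, int]:
    """Query `server` directly over UDP port 53, bypassing the local resolver."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        packet, _ = sock.recvfrom(4096)
    return response_summary(packet)


def ask_system(name: str) -> List[str]:
    """Resolve `name` through the OS resolver path and return IPv4 addresses."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})
```

If `ask_server(fqdn, central_firewall_ip)` reports rcode 0 with at least one answer while `ask_system(fqdn)` raises `socket.gaierror`, the upstream server is answering correctly and the failure lies somewhere in the local resolver path.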
The pfSense firewall could resolve the hosts at one point in time, when I configured the firewall rules that use these fqdns. I can't think of any changes (updates or configuration) that might have broken this. So any ideas are welcome. Unfortunately I can't give you screenshots or anything, because these are all internal addresses and I don't think that I'm allowed to post them.
Kind regards,
Klinger
-
@klinger
You aren't using a .local AD domain by any chance, are you?
I "was ordered to use .local" as my lab domain, and have regretted it ever since ....
But changing isn't really an option now ... Too many things are "infected" with that name.
Well ....
When I upgraded a few Linux (Mint) machines, I was hit by exactly that issue whenever I looked up something ending in .local.
The Linux box basically didn't forward the query to the DNS servers, since it was a .local name.
I tried many tricks, but never succeeded in resolving it via "resolver trickery".
My Linux box had lab.local as its domain name, and strangely enough I could resolve other lab.local hosts via DNS. But I couldn't resolve citrix.local hosts, unless I used dig/host/nslookup in a shell and directed the query to a specific DNS server (the same one the resolver used).
Well, I found out that they had been "tuning the systemd resolver daemon" (systemd-resolved), and now it doesn't even query DNS servers for .local ... UNLESS the .local domain is handed down via DHCP as either the domain name or in the DNS search list.
So after about two days of trickery, I caved in and added both lab.local and citrix.local to the DHCP domain search list.
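The behaviour described above can be modelled in a few lines. This is a toy model of what was observed on those machines, not the actual systemd-resolved logic: it assumes a .local name is only sent to unicast DNS when its domain appears in the search list. Function names are hypothetical.

```python
def is_mdns_name(name):
    """True if `name` is under the .local TLD, which RFC 6762 reserves for mDNS."""
    labels = name.rstrip(".").lower().split(".")
    return labels[-1] == "local"


def covered_by_search_list(name, search_domains):
    """True if `name` equals or sits below one of the search domains."""
    n = name.rstrip(".").lower()
    for domain in search_domains:
        d = domain.strip(".").lower()
        if n == d or n.endswith("." + d):
            return True
    return False


def sent_to_unicast_dns(name, search_domains):
    """Assumed rule: .local names reach the DNS servers only via the search list."""
    if not is_mdns_name(name):
        return True
    return covered_by_search_list(name, search_domains)
```

Under this model, `host.citrix.local` is not sent to the DNS servers while only `lab.local` is in the search list, and it is sent once `citrix.local` is added, which matches the workaround above.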
@johnpoz IS RIGHT , when saying DON'T USE .local it will come back to bite you.
I found out btw: that home.arpa was the recommended "private domain to use"
/Bingo
-
@bingo600 said in DNS resolution for some hosts fails, but nslookup works:
that .arpa was the recommended "private domain to use"
home.arpa ;) not just .arpa
-
@johnpoz said in DNS resolution for some hosts fails, but nslookup works:
@bingo600 said in DNS resolution for some hosts fails, but nslookup works:
that .arpa was the recommended "private domain to use"
home.arpa ;) not just .arpa
Dooh ....
Else all reverse domains would be FSCK'ed
Corrected in my post
-
@klinger I'm a bit confused by your description so apologies if I misunderstand. In the Resolver settings, a Domain Override will force it to forward queries to the other DNS server, for instance Windows AD DNS. That's the "correct" way to get it to resolve internal domains using other servers.
If that's not in use then Resolver resolves names itself. It only forwards if forwarding is enabled in which case it forwards everything.
If you have forwarding enabled (in Resolver) uncheck the option to use DNSSEC. That seems to be problematic for some people on 23.01.
System/General Setup/DNS Resolution Behavior sets whether pfSense uses itself for DNS.
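The decision described in this post can be written out as a short function. It is a simplified sketch of the rules as stated here (domain override first, then forward everything if forwarding is enabled, otherwise resolve from the roots), not Unbound's implementation. The "most specific override wins" tie-break and all names and addresses are assumptions for illustration.

```python
def pick_upstream(name, overrides, forwarding_enabled, forwarders):
    """Return (mode, servers) for a query, where mode is
    'override', 'forward' or 'iterate' (resolve from the root servers)."""
    n = name.rstrip(".").lower()
    best = None
    for domain, servers in overrides.items():
        d = domain.strip(".").lower()
        if n == d or n.endswith("." + d):
            # Assumption: the longest (most specific) matching domain wins.
            if best is None or len(d) > len(best[0]):
                best = (d, servers)
    if best is not None:
        return ("override", list(best[1]))
    if forwarding_enabled:
        return ("forward", list(forwarders))
    return ("iterate", [])
```

For example, with an override for `corp.local` pointing at a hypothetical `192.0.2.10`, a lookup of `pc1.corp.local` goes to that server regardless of the forwarding setting, while every other name either goes to the configured forwarders or is resolved from the roots.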
-
@bingo600 said in DNS resolution for some hosts fails, but nslookup works:
You aren't using .local ad domain by any chance are you ?
The problematic hosts don't use .local top level domains. They use public top level domains but are not publicly resolvable. The domains of these hosts are not controlled by my organization.
For our internal domain however, we use .local unfortunately. I've already set up domain forwarding for that, which still works.
@steveits said in DNS resolution for some hosts fails, but nslookup works:
I'm a bit confused by your description so apologies if I misunderstand. In the Resolver settings, a Domain Override will force it to forward queries to the other DNS server, for instance Windows AD DNS. That's the "correct" way to get it to resolve internal domains using other servers.
I use a domain override for our internal domain. In all cases (System and Override), the other firewall is used as the name server. Our firewalls are connected via an MPLS network, and the central firewall is connected to another MPLS network, in which the problematic hosts reside. The hosts use different domains. Configuring a bunch of domain overrides would technically be possible, but I would have to set up a "dirty" NAT masquerading rule on the central firewall.
@steveits said in DNS resolution for some hosts fails, but nslookup works:
If you have forwarding enabled (in Resolver) uncheck the option to use DNSSEC. That seems to be problematic for some people on 23.01.
Thank you for the info. Unfortunately disabling DNSSEC didn't help. I also disabled forwarding.
@steveits said in DNS resolution for some hosts fails, but nslookup works:
System/General Setup/DNS Resolution Behavior sets whether pfSense uses itself for DNS.
I use the default setting. So query 127.0.0.1 first and then fall back.
The weird thing is that in both cases the pfSense firewall gets an answer from the central firewall for the DNS lookup, in both cases with the correct A record. But only when I use nslookup with the FQDN and the central firewall specified does the pfSense report that it could resolve the FQDN.
I have to look deeper into the packet capture; maybe I'll find something regarding DNSSEC. On Saturday I'm planning to restart the firewall. I'll report back on Monday whether the reboot helped.
-
Sorry for the late response, the last two days were really busy.
The reboot unfortunately didn't help.
I haven't had time to do a deep analysis of the packet captures, but at a glance they seem identical.
-
DNS resolution for some hosts fails, but nslookup works
Start "nslookup" on the command line, no parameters.
This is what I see :
Microsoft Windows [version 10.0.22621.1265]
(c) Microsoft Corporation. Tous droits réservés.

C:\Users\gwkro>nslookup
Serveur par defaut : pfsense.my-local-network.net
Address: 2a01:cb19:beef:a6dc::1

>
(sorry, French, but you get the picture).
Every letter shown there is important !!
For example :2a01:cb19:beef:a6dc::1
That's the DNS server my PC uses when it has a DNS question.
For some reason mine is using IPv6 - but that device has another well known IP : 192.168.1.1
So, pfSense is the one to be asked when there is a DNS question.
Take note of the "my-local-network.net" part - this is very important !
Hosts do not have names like 'server'.
On networks, hosts have names like this :
"server.my-local-network.net"
nslookup makes life easy for you : it will accept "server" as a request. What happens is that it fires off a DNS request like "what is the IP of 'server' ?". This request will get a failed answer from pfSense/unbound, as the question is plain wrong.
But hey, maybe other DNS servers would have answered that DNS question with only 'server' as the question (I guess Microsoft DNS servers would have answered ;)).
Behind the scenes, nslookup won't take 'No' for an answer, and will retry :
This time, with the correct question : "what is the IP of 'server.my-local-network.net.' ?"
And now unbound can do something with it.
It knows that "my-local-network.net." is local, so it will not forward the question upstream to 8.8.8.8 or whoever, as these will know nothing about your local network (devices, whatever). Only unbound could know that.
So, here we go :
> set debug
> server
Serveur : pfSense.my-local-network.net
Address: 2a01:cb19:beef:a6dc::1

------------
Got answer:
    HEADER:
        opcode = QUERY, id = 2, rcode = NOERROR
        header flags: response, auth. answer, want recursion, recursion avail.
        questions = 1, answers = 1, authority records = 0, additional = 0

    QUESTIONS:
        server.my-local-network.net, type = A, class = IN
    ANSWERS:
    -> server.my-local-network.net
        internet address = 192.168.1.33
        ttl = 3600 (1 hour)

------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 3, rcode = NOERROR
        header flags: response, auth. answer, want recursion, recursion avail.
        questions = 1, answers = 1, authority records = 0, additional = 0

    QUESTIONS:
        server.my-local-network.net, type = AAAA, class = IN
    ANSWERS:
    -> server.my-local-network.net
        AAAA IPv6 address = 2a01:cb19:beef:a6dc::c2
        ttl = 3600 (1 hour)

------------
Nom : server.my-local-network.net
Addresses: 2a01:cb19:beef:a6dc::c2
           192.168.1.33

>
You saw I did this :
set debug
?
That's because I had questions, and I want answers.
I asked about 'server' and nslookup adds right away ".my-local-network.net" (it's lying : it added .my-local-network.net. - with a final - important ! - period at the end)
Put unbound in query debug mode, and you'll see what nslookup really asked to unbound ;)
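The suffix handling described above can be sketched as a function that lists the fully qualified names a client might try for a short name. This is an illustration only: the order in which real clients try the candidates differs between tools and operating systems, and the function name is made up.

```python
def candidate_names(name, search_suffixes):
    """List the fully qualified names a stub resolver might ask for.

    A name that already ends in a period is treated as fully qualified
    and is not modified. Otherwise each search suffix is appended, and
    the bare name (with a trailing period) is kept as a last candidate.
    """
    if name.endswith("."):
        return [name]
    candidates = [name + "." + suffix.strip(".") + "." for suffix in search_suffixes]
    candidates.append(name + ".")
    return candidates
```

Called as `candidate_names("server", ["my-local-network.net"])`, this yields the suffixed name seen in the debug output plus the bare name, while a name with a trailing period is passed through untouched.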
I'm not going to tell you that ".local" as a local domain name is bad.
Let's say it like this : that won't help you.
From this point on :
Make host overrides for every device you can find that isn't local - that is, devices that do not live under the domain of your pfSense/unbound, but in other 'internal' domains elsewhere.
Example :
Now, when I
nslookup aaaa
I will receive a fail. Secretly, nslookup will add what it knows : the domain "my-local-network.net", but that host, aaa.my-local-network.net, doesn't exist.
I have to ask the right question to get the correct answer :
C:\Users\gwkro>nslookup aaa.my-local-network.net.net
Serveur : pfSense.my-local-network.net.net
Address: 2a01:cb19:beef:a6dc::1

Nom : aaa.remote-shop-behind-vpn.tld
Address: 192.168.100.10
and this is the correct answer.
So :
Take care of your host overrides and you'll be good.
Ditch .local
Btw : maybe there is a way to let your local unbound know about other name servers and their domains. Then you don't need a host override for every device, only the 'remote' domains and the IP in the remote network where the DNS question should be sent.
-
@gertjan Thank you for your elaborate reply. But there might be some misunderstanding. First of all, yes, .local is bad. But that's our domain, for which a domain override is set up and working. The problematic hosts have well-known top level domains, and their full FQDN is used for all queries.
You mentioned the debug log level, and I totally forgot that I had enabled that.
And there I found something. I could see that for some reason the pfSense tries to contact the DNS server for that domain, even though it's not configured. For some reason this is a public IP address, although the authoritative servers have non-public IP addresses. I could also see that it tries to resolve the FQDN and after that tries the FQDN with the .local domain attached.
Unfortunately I could not replicate this log. But I think I'm going to increase the log level further (currently 3) to get more info. I verified that DNS Override in System -> General Setup is not set, which makes me wonder why it tries to reach that server.
I checked the packet captures again (nslookup, one with specified name server, the other one without) and could not find any differences. In both cases the A record is resolved. The only difference is that in one case also the AAAA record is queried.
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
pfSense tries to contact the DNS Server for that domain, even when it's not configured. For some reason this is a public ip address although the authoritative have non public ip addresses
Is this your domain, or the public domain? This is never in a million years going to work if some domain's NS are listed in rfc1918 space.. Nobody on the public internet would ever be able to resolve anything from such a domain.
What is the actual fqdn you're trying to resolve - if you don't want to make it public, please send it to me in a private message.
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
for which domain override is setup and working.
You sure - or are they just resolving it via mdns? .local is not really a valid tld, .local is reserved for mdns. I would recommend you change your local domain. The new standard is home.arpa
-
@johnpoz Thank you for your reply. However as mentioned above the problematic hosts are not ours and I can't share their fqdn. It's a network operated by the government where municipalities can exchange data and services. The hosts are in that network with public top level domains. The fqdns are only resolvable in this network. Our central firewall is connected to that network, which is why the pfSense is configured to query our central firewall. Our central firewall is the only name server configured on the pfSense.
@johnpoz said in DNS resolution for some hosts fails, but nslookup works:
I would recommend you change your local domain.
Easier said than done. I know we have to change that but it's harder than you might think, when you have 1000+ hosts and other municipalities also using services that we provide under the .local domain. However the .local domain as mentioned works.
@johnpoz said in DNS resolution for some hosts fails, but nslookup works:
Nobody on the public internet would ever be able to resolve anything from such a domain.
As mentioned it is not resolvable from the internet. Only within this government mpls network. Most of the problematic hosts use .net as the top level domain but not all. However public fqdns that use .net are resolvable.
-
@gertjan said in DNS resolution for some hosts fails, but nslookup works:
You saw I did this :
set debug
I just tried it. Didn't know about that but it was very helpful. When I use nslookup as is with the internal server 127.0.0.1, I get the following answer (shortened).
AUTHORITY RECORDS:
-> testa-de.net
    origin = ns1-eu.123ns.eu
    mail addr = hostmaster.123ns.eu
    serial = 2022062251
    refresh = 86400
    retry = 7200
    expire = 3600000
    minimum = 86400
    ttl = 3420
ADDITIONAL RECORDS:
------------
** server can't find "the problematic host"
When I specify to use the central firewall, it works as expected.
Obviously the host can't be resolved by this public DNS server, but why does the pfSense query it? I can verify via packet capture that it does query our central firewall and gets a response. So why does it try to get an authoritative answer?
I can't post the other response, because it only has non public information. However all sections that you might expect are there (Question, Answers, Authority Records and Additional Records for the Authority Records).
-
@klinger I didn't reread this thread, but if you are trying to do this on pfSense itself, there is an option under System/General Setup:
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
Only within this government mpls network.
Ok - if that is the case unbound would not resolve something that returns a rfc1918 address because of rebind protection. You would either need to turn that off completely, or set this domain as private, so that some A record that returns rfc1918 would be allowed
https://docs.netgate.com/pfsense/en/latest/services/dns/rebinding.html
-
@steveits Thanks for the reply. Didn't solve the problem unfortunately.
-
@klinger if some NS, no matter where it is (local, public internet or on some private mpls network), tries to return an rfc1918 address for some record, that would be a rebind.
And you need to tell unbound that the domain that record is in is private and rfc1918 is ok, or you have to turn off rebind protection completely.
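The rebind rule described here can be expressed as a small check. This is a simplified model, assuming only the three RFC 1918 ranges are protected (the real protected list may cover more ranges) and that a domain marked private exempts every name at or below it. In Unbound these two lists correspond to the `private-address` and `private-domain` options. The function names and the example address are hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """True if `address` is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def answer_is_dropped(name, address, private_domains):
    """Model of rebind protection: drop private addresses in answers
    unless the queried name belongs to a domain marked as private."""
    if not is_rfc1918(address):
        return False
    n = name.rstrip(".").lower()
    for domain in private_domains:
        d = domain.strip(".").lower()
        if n == d or n.endswith("." + d):
            return False
    return True
```

With an empty private-domain list, an answer such as `10.1.2.3` (a made-up address) for a host under `testa-de.net` is dropped. Adding `testa-de.net` to the list lets the same answer through, and public addresses are never affected.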
-
@johnpoz That sounds quite logical. I've tried disabling Rebind Check under System -> Advanced and restarted the resolver service. Didn't help. I also tried adding the domain as private domain in the custom options. But that didn't help either. For good measure I also disabled DNSSEC.
I think this might be the cause but I guess it must be triggered by something else.
-
@klinger what exactly are you trying to resolve? This record? testa-de.net
What NS are you pointing to for this domain, and how does this NS know you're coming from the mpls network.. Do you have a domain override set up to resolve this domain?
These are the name servers I see for that domain.
;; QUESTION SECTION:
;testa-de.net. IN NS

;; ANSWER SECTION:
testa-de.net. 86400 IN NS ns2-eu.123ns.de.
testa-de.net. 86400 IN NS ns1-eu.123ns.eu.
testa-de.net. 86400 IN NS ns4-eu.123ns.de.
testa-de.net. 86400 IN NS ns3-eu.123ns.eu.
But those are just public internet NS..
-
@johnpoz I'm trying to resolve several host names in a subdomain of testa-de.net. The only NS that is configured is our central firewall. There is no domain override configured for that domain. As mentioned there are quite a bunch of domains in that mpls network.
Today I rebooted the firewall to test whether a reboot makes the disabled DNS rebind check take effect. It didn't. I also tested this on a different pfSense firewall in "our" MPLS network. Disabling the DNS rebind check worked on that one and I could resolve the host (it also uses the central firewall as NS).
I therefore conclude that on the firewall where this problem originated, something is causing the DNS rebind check option to not work as expected.
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
The only NS that is configured is our central firewall.
This is another pfsense box or something else?
-
@johnpoz Our central firewall is a Sophos SG.
-
Today I tested several settings and found the solution. As mentioned in the previous post, the solution:
@johnpoz said in DNS resolution for some hosts fails, but nslookup works:
Ok - if that is the case unbound would not resolve something that returns a rfc1918 address because of rebind protection. You would either need to turn that off completely, or set this domain as private, so that some A record that returns rfc1918 would be allowed
https://docs.netgate.com/pfsense/en/latest/services/dns/rebinding.html
is correct. It didn't work for me, however, because in my DNS Resolver config (Services -> DNS Resolver -> General Settings) the "DNS Query Forwarding" option was unchecked. After enabling that, the solution worked. I can't tell if that's intended or not, because it's not mentioned in the docs. But at least it works for now.
I tested disabling the rebind check completely and also tested setting only a specific domain as private. Both options worked when query forwarding was enabled.
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
the "DNS Query Forwarding" option was unchecked
I really don't know how you expected to resolve some secret rfc1918 address that only resolves via mpls network if you were not forwarding to this specific NS that had those records.
I mean it would be possible to setup public dns and have some view, that only if you were hitting it from a specific IP(s) would you get the view that had the rfc1918 address in them.
Glad you got it sorted.
-
@johnpoz said in DNS resolution for some hosts fails, but nslookup works:
possible to setup public dns and have some view, that only if you were hitting it from a specific IP(s) would you get the view that had the rfc1918 address
For sake of discussion we have private IPs set up under a subdomain in use on our data center. Virtuozzo needed the hostnames but we wanted to use private IPs for that layer. Maybe we could have set it all up on pfSense, but in normal DNS we can resolve them from our office too. Meh, multiple solutions. :)
-
@johnpoz Well it worked before... But I kinda misunderstood the option. I could also confirm that without forwarding mode, the pfSense firewall queried our central firewall for name resolution (via packet capture). So either way the "correct" name server was queried.
-
@klinger said in DNS resolution for some hosts fails, but nslookup works:
the pfSense firewall queried our central firewall for name resolution (via packet capture).
Well then you were forwarding - or that NS is authoritative for the domain you were looking up. Out of the box pfsense is a resolver, it talks to the roots and works its way down to the actual NS for whatever domain you're looking for.
So no, out of the box pfsense would not query some NS on your network.. Unless the roots and the gtld servers pointed you to them..
Out of the box pfsense is a resolver..
Hey roots, what is the NS for .com
Hey gtld servers what is the NS for domain.com
Hey NS for domain.com what is the IP address of www.domain.com
you can see this with a simple dig and a +trace, this is how a resolver works.. So why would it ask some upstream NS in your network?
[23.01-RELEASE][admin@sg4860.local.lan]/root: dig www.netgate.com +trace

; <<>> DiG 9.18.8 <<>> www.netgate.com +trace
;; global options: +cmd
. 21987 IN NS d.root-servers.net.
. 21987 IN NS e.root-servers.net.
. 21987 IN NS f.root-servers.net.
. 21987 IN NS g.root-servers.net.
. 21987 IN NS h.root-servers.net.
. 21987 IN NS i.root-servers.net.
. 21987 IN NS j.root-servers.net.
. 21987 IN NS k.root-servers.net.
. 21987 IN NS l.root-servers.net.
. 21987 IN NS m.root-servers.net.
. 21987 IN NS a.root-servers.net.
. 21987 IN NS b.root-servers.net.
. 21987 IN NS c.root-servers.net.
. 21987 IN RRSIG NS 8 0 518400 20230322050000 20230309040000 951 . TN/9VuM2Q8uI9vNqRDfX/si09GNyq8dHFQdBJPG7CE935u/HbanonU99 Z/mZRM2xIt9zJd8kuvWDi9t0TTLYdFaoJ4XMcQyOQZeeZM/XfLUNBkX0 YdJqjDZD3joFSHNUpKHRF/aIZhoKwRxuAqQsiK04HXrKt3SyaGnVsUy5 kXQU05Z5HEgP8ZK3ziqLD+0bRX9uYAegL+JgEEDx421apR1xN4FY6ngF VONOKheKbl6LhSp91jfkR5LhiEyAT3PMXwfQEntHAmCyBgfw05rbSZB6 vALXQDZBcWCs/pW9VEpPx4J1DpGYhgKAa7ojk8ZDgnY3kfl/H6LplGFi qme5AQ==
;; Received 525 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms

com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 86400 IN DS 30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com. 86400 IN RRSIG DS 8 1 86400 20230323050000 20230310040000 951 . ENL8WPFbxOqXipIZr0gi73LXISv1Oc5VREINA+nwZ4SdXg72++HZvKPt q7Rlv/Zy/z8U0xsV8drSfktoc3L/vOT97I/xvBiqGBKfmcI9fZ+OI+rp aql8fl7ep0KxSsCW2snOapFvf3LeDcPop5OJtCOv0h0g6CYnLugbWdRR qiF6FDg38bx/QwpQZL0BKxD3E6/qjFOrBPuTbHkWk5P+B5SdEF9cWcsK pVy+N3wxKCvKALzxQzQ/zPja/P+8plxGzOYeiaZCFDN4wxa6433zkluG lLSTABiU6mnsmOl+0mVQWzsF0s6QgLemrtyQlGT9HJw/kZhsX8N7WnlJ sPBQ6w==
;; Received 1175 bytes from 198.97.190.53#53(h.root-servers.net) in 33 ms

netgate.com. 172800 IN NS ns1.netgate.com.
netgate.com. 172800 IN NS ns2.netgate.com.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0Q2D6NI4I7EQH8NA30NS61O48UL8G5 NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20230315042256 20230308041256 36739 com. VHsDr23lP03/xPxRbNUFC+UkSrUZ/Qr3JYHjhz7DYNOLPnzixRL+Hjv/ +kjbNiKVHYy2iGqU38XGJ4sPbvyRx8qygeTX3E7NnS4SdjnN2PKkTMAQ 42Vjxkq928qpoKPOwyn4zgcGSCZffTlNbY5IKVZacivEishoJ1j3BnVJ 1p2/N0gsLcS2GjIob2YGe7j4Lz8Aa5Rrj0s+DwlyP+BlCQ==
2U53SUOKS8OJJV178M90A8BMNI9USDVJ.com. 86400 IN NSEC3 1 1 0 - 2U54DC5VA9HQSV9DBV1IK3JD7KR4L61T NS DS RRSIG
2U53SUOKS8OJJV178M90A8BMNI9USDVJ.com. 86400 IN RRSIG NSEC3 8 2 86400 20230316045937 20230309044937 36739 com. JLAYnUTWdSkzhgKse8Qoyz0cdweJTibB9d0fQmTG1iDubISe0e/HhhBK SAdDEjqsOyV6x6bwtCVi+7HfoawJpsUgDNfYXEcgQfaXRk6TEOofhKnO mK+fVRHYsGbrBkyAfogu6KbQUAgleU65xfCmjKNaeYCDLe1Tq4FBcBLQ GstvOhDuAH5by0b1UBv+5k40jxuut/dlWfd+fxiwaxEO9A==
;; Received 717 bytes from 2001:502:1ca1::30#53(e.gtld-servers.net) in 36 ms

www.netgate.com. 60 IN CNAME 1826203.group3.sites.hubspot.net.
;; Received 124 bytes from 208.123.73.90#53(ns2.netgate.com) in 37 ms

[23.01-RELEASE][admin@sg4860.local.lan]/root:
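The walk in the trace (root, then com., then netgate.com., then the final name) follows directly from the labels of the name. A minimal sketch, assuming a possible zone cut at every label (real delegations do not have to exist at each level); the function name is made up:

```python
def delegation_chain(name):
    """List the zones an iterative resolver may be referred through,
    starting at the root and ending at the full name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]
    # Add one label at a time, from the right-most (TLD) to the left-most.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain
```

For `www.netgate.com` this gives the root, `com.`, `netgate.com.` and `www.netgate.com.`, the same steps that the three "Received ... from" lines in the trace show. A resolver in forwarding mode skips this walk and hands the whole question to the configured upstream server instead.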