Split DNS inconsistencies
-
I'm building a pfSense configuration for use on an embedded system production line. The embedded devices have very limited buffers, so they have issues with the somewhat spotty factory floor net connections. I'm using rinetd and the vhosts package to support the devices: one function requires local TCP termination, the other just needs a file hosted locally.
I've configured a virtual IP and host overrides for the two functions, so the devices on the factory floor see my pfSense box instead of the origin servers, and they seem to connect fine.
The problem I have is that some programs use this split DNS, and some don't. I've checked "Do not use the DNS Forwarder as a DNS server for the firewall", and indeed 127.0.0.1 does not appear in the dashboard list of resolvers nor in /etc/resolv.conf. host and nslookup correctly query upstream DNS, but ping, fetch and (apparently) rinetd all get the virtual IP instead of the origin server.
I've just tried enabling Strict Interface Binding in the DNS Forwarder config, binding only to LAN and my virtual IP, with no change.
I'm missing something here… any suggestions are welcome.
(I'm pretty sure I had rinetd working correctly before I installed vhosts... my next guess is that the vhosts package has somehow messed up the resolver config?)
[EDIT: 2.1-RELEASE (i386) running on an Atom D2500 board with Intel NICs]
-
It's not vhosts, I just rebuilt from scratch without either rinetd or vhosts and I still have the same issue.
dig, nslookup, and host all query upstream.
ping, fetch, etc. all hit the dns forwarder.
It's incredibly frustrating when your diagnostic tools lie to you like this.
What am I missing?
-
I've been able to install unbound and it works as expected. I'm still puzzled why dnsmasq is broken, but I'm at least out of the woods on this project.
-
Add log-queries to your DNS Forwarder advanced options to see all the queries and replies in the resolver log. This should help with figuring out what the heck is going on. Also, some sort of nslookup output or example would definitely help.
Though you have it working with unbound.
-
dig, nslookup, and host all query upstream.
ping, fetch, etc. all hit the dns forwarder.
You mean on pfsense itself? Or some other box?
Could you give an example of the problem you're seeing, with output from, say, nslookup or dig and ping, and the returned IPs?
-
Yes, on the pfsense box itself.
I've fixed the customer box by switching to unbound, let me see if I can reproduce this on my own machine.
Here we go:
[2.1-RELEASE][amilewski@ashby.milewski.org]/home/amilewski(2): host uk.milewski.org
uk.milewski.org has address 5.2.16.118
[2.1-RELEASE][amilewski@ashby.milewski.org]/home/amilewski(3): ping uk.milewski.org
PING uk.milewski.org (172.23.16.11): 56 data bytes
64 bytes from 172.23.16.11: icmp_seq=0 ttl=64 time=1.130 ms
[EDIT: I do not see queries for this host at all in /var/log/resolver.log. Should I be looking somewhere else?]
-
Is this uk.milewski.org host registered in pfSense DHCP, either with a dynamic lease or a static mapping?
What I would guess is that host is using whatever DNS pfSense is set to use, but ping might be looking at the /etc/hosts file first. It is common for applications to check a local cache and the hosts file before querying DNS, while tools like dig or host are designed to just query the DNS servers the system is set to use.
The DNS forwarder on pfSense can be set to register DHCP leases and/or static mappings in the forwarder; these are actually stored in the hosts file.
So for example in my pfsense host file you see
[2.1.1-PRERELEASE][root@pfsense.local.lan]/root(2): cat /etc/hosts
127.0.0.1 localhost localhost.local.lan
192.168.1.253 pfsense.local.lan pfsense
192.168.1.31 raspberrypi.local.lan raspberrypi
192.168.1.99 popcorn.local.lan popcorn
192.168.1.100 i5-w7.local.lan i5-w7
<snipped>
192.168.1.7 ubuntu.local.lan ubuntu
192.168.1.10 vcenter.local.lan vcenter
192.168.2.252 wrt54g.local.lan wrt54g
192.168.2.252 tomato.local.lan tomato
# dhcpleases automatically entered
192.168.1.215 w81.local.lan w81 # dynamic entry from dhcpd.leases
192.168.2.213 android-49snipped.local.lan android-49snipped # dynamic entry from dhcpd.leases
192.168.2.211 Chromecast.local.lan Chromecast # dynamic entry from dhcpd.leases
[2.1.1-PRERELEASE][root@pfsense.local.lan]/root(3):
So if you have entries from DHCP in your hosts file, that would explain why ping resolves that entry rather than what the public internet resolves, which is your 5.x address.
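The lookup order described in this post can be illustrated with a minimal Python sketch. It assumes the system resolver consults the hosts file before DNS (a "hosts: files dns" order), and it uses a plain dictionary as a stand-in for the upstream DNS answer. The function names and the sample hosts text are hypothetical; only the two addresses come from the thread.

```python
from typing import Callable, Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each hostname or alias in hosts-file text to its first listed address."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


def resolve_files_then_dns(
    name: str,
    hosts: Dict[str, str],
    dns_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Mimic a 'files dns' order: hosts file first, DNS only on a miss."""
    return hosts.get(name.lower()) or dns_lookup(name)


# Hypothetical hosts file containing an override like the one in the thread.
HOSTS_TEXT = """
127.0.0.1     localhost
172.23.16.11  uk.milewski.org uk
"""

# Stand-in for the public DNS answer (assumption: no real network query is made).
UPSTREAM = {"uk.milewski.org": "5.2.16.118"}


def upstream_dns(name: str) -> Optional[str]:
    return UPSTREAM.get(name.lower())


hosts = parse_hosts(HOSTS_TEXT)
ping_style = resolve_files_then_dns("uk.milewski.org", hosts, upstream_dns)
host_style = upstream_dns("uk.milewski.org")  # dig/host/nslookup skip the hosts file
print("ping-style lookup:", ping_style)
print("host-style lookup:", host_style)
```

The two results differ for the same name, which matches the host versus ping output earlier in the thread.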
-
Did you add the log-queries in DNS Forwarder advanced options? You are seeing queries just not the one for that host?
-
Did you add the log-queries in DNS Forwarder advanced options? You are seeing queries just not the one for that host?
Correct, I see queries from the clients on my LAN, but not for queries made from a shell on the pfsense box.
-
Feb 26 15:59:19 dnsmasq[80195]: cached test.com is 208.64.121.161
Feb 26 15:59:19 dnsmasq[80195]: query[A] test.com from 127.0.0.1
Feb 26 15:59:19 dnsmasq[80195]: cached test.com is 208.64.121.161
Feb 26 15:59:19 dnsmasq[80195]: query[A] test.com from 127.0.0.1
Feb 26 15:59:19 dnsmasq[80195]: reply test.com is 208.64.121.161
Feb 26 15:59:19 dnsmasq[80195]: forwarded test.com to 8.8.4.4
Feb 26 15:59:19 dnsmasq[80195]: forwarded test.com to 8.8.8.8
Feb 26 15:59:19 dnsmasq[80195]: query[A] test.com from 127.0.0.1
Lookups made through Diagnostics -> DNS Lookup do show up in my query log.
-
Well yeah they would go outbound if pfsense does not have a record of them - but if the host file has a record, then why would say ping use dns if it found it in the local cache or host file?
What is in your /etc/hosts file?
-
Diagnostics -> DNS Lookup with www.test.com in hosts file pointing to 192.168.55.1
Feb 26 16:36:51 dnsmasq[51271]: /etc/hosts www.test.com is 192.168.55.1
Feb 26 16:36:51 dnsmasq[51271]: query[A] www.test.com from 127.0.0.1
Feb 26 16:36:50 dnsmasq[51271]: /etc/hosts www.test.com is 192.168.55.1
Feb 26 16:36:50 dnsmasq[51271]: query[A] www.test.com from 127.0.0.1
Console host www.test.com
Feb 26 16:39:02 dnsmasq[51271]: forwarded www.test.com to 8.8.4.4
Feb 26 16:39:02 dnsmasq[51271]: forwarded www.test.com to 8.8.8.8
Feb 26 16:39:02 dnsmasq[51271]: query[MX] www.test.com from 127.0.0.1
Feb 26 16:39:02 dnsmasq[51271]: reply test.blockdos.com is NODATA-IPv6
Feb 26 16:39:02 dnsmasq[51271]: forwarded www.test.com to 8.8.4.4
Feb 26 16:39:02 dnsmasq[51271]: forwarded www.test.com to 8.8.8.8
Feb 26 16:39:02 dnsmasq[51271]: query[AAAA] www.test.com from 127.0.0.1
Feb 26 16:39:02 dnsmasq[51271]: /etc/hosts www.test.com is 192.168.55.1
Feb 26 16:39:02 dnsmasq[51271]: query[A] www.test.com from 127.0.0.1
Shouldn't the DNS forwarder still be logging the forwards at least?
[2.1-RELEASE][amilewski@ashby.milewski.org]/home/amilewski(2): host uk.milewski.org
uk.milewski.org has address 5.2.16.118
[2.1-RELEASE][amilewski@ashby.milewski.org]/home/amilewski(3): ping uk.milewski.org
PING uk.milewski.org (172.23.16.11): 56 data bytes
64 bytes from 172.23.16.11: icmp_seq=0 ttl=64 time=1.130 ms
Neither of these is in the logs? I would expect at least one of the two to be in the Forwarder log.
What is the address 5.2.16.118?
What does the address range 5.2.16.X have to do with your environment?
Am I correct in assuming this is the correct one?
What is the address 172.23.16.11?
What does the address range 172.23.16.X have to do with your environment?
This is the wrong response, right?
Do you get the same problem from a PC on the network? If so, can you wireshark it and give us an output of the DNS queries and responses?
If you do an nslookup from the pfsense console, what is the output?
[2.1-RELEASE][root@pfsense.localdomain]/usr/local/www(32): nslookup uk.milewski.org
Server: 127.0.0.1
Address: 127.0.0.1#53
Non-authoritative answer:
Name: uk.milewski.org
Address: 5.2.16.118
You need to know where the answer is coming from to figure out why.
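One way to see where an answer came from is to group the log-queries output by answer source. The following is a rough Python sketch that assumes the line format matches the resolver-log lines quoted in this thread; other dnsmasq versions may log differently. The function name is hypothetical, and the sample lines are copied from the logs above.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Matches lines such as:
#   "... dnsmasq[51271]: /etc/hosts www.test.com is 192.168.55.1"
#   "... dnsmasq[51271]: forwarded www.test.com to 8.8.4.4"
LINE = re.compile(
    r"dnsmasq\[\d+\]: (?P<action>\S+) (?P<name>\S+) (?:is|from|to) (?P<value>\S+)\s*$"
)

# Actions that mean dnsmasq supplied an answer (as opposed to query/forwarded).
ANSWER_ACTIONS = ("/etc/hosts", "cached", "reply")


def answer_sources(log_text: str) -> Dict[str, Set[str]]:
    """Return {name: set of sources that answered it} from log-queries output."""
    sources: Dict[str, Set[str]] = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE.search(line)
        if match and match["action"] in ANSWER_ACTIONS:
            sources[match["name"]].add(match["action"])
    return dict(sources)


SAMPLE_LOG = """\
Feb 26 16:39:02 dnsmasq[51271]: forwarded www.test.com to 8.8.4.4
Feb 26 16:39:02 dnsmasq[51271]: query[AAAA] www.test.com from 127.0.0.1
Feb 26 16:39:02 dnsmasq[51271]: /etc/hosts www.test.com is 192.168.55.1
Feb 26 16:39:02 dnsmasq[51271]: query[A] www.test.com from 127.0.0.1
Feb 26 15:59:19 dnsmasq[80195]: reply test.com is 208.64.121.161
Feb 26 15:59:19 dnsmasq[80195]: cached test.com is 208.64.121.161
"""

for name, found in sorted(answer_sources(SAMPLE_LOG).items()):
    print(name, "answered from:", ", ".join(sorted(found)))
```

A name that shows "/etc/hosts" as its source was answered from the hosts file rather than from upstream. A name that never appears at all suggests the lookup did not pass through dnsmasq.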
-
I checked his authoritative nameservers for that domain on the public net, and they only respond with 1 record for that uk host.
If nslookup or dig or host resolves the public address, but ping resolves some RFC 1918 address, I assume he is not asking DNS but finding it in a hosts file on the pfSense box. pfSense stores the overrides in the hosts file, and it can also store DHCP leases in the hosts file.
So even if he has no host override on his forwarder pointing to the local RFC 1918 address, if he gave the box a DHCP IP then it's quite possible it's in his /etc/hosts file, and that is where applications like ping could find it, since they look there before asking DNS.
-
I see.
"Do not use the DNS Forwarder as a DNS server for the firewall": By default localhost (127.0.0.1) will be used as the first DNS server where the DNS forwarder is enabled, so system can use the DNS forwarder to perform lookups. Checking this box omits localhost from the list of DNS servers.
With this on I do not get queries from pfsense shell shown in the resolver log either. Guess that makes sense then :'(
What are all the steps you are doing to reproduce this without the rinetd and vhosts packages?
-
Sorry, distracted by other things, working around the fact that I can't develop packages because the xmlrpc server repository has been taken off the net. >:(
Anyway… to reproduce this problem, add a host override for some external host (in above case, uk.milewski.org) pointing to an address on the LAN (above, 172.23.16.11) and check "Do not use the DNS Forwarder as a DNS server for the firewall".
But as you've pointed out, since overrides are stored in /etc/hosts, the firewall still sees overrides even if the firewall is not using the forwarder. This seems broken, as it means the "Do not use..." checkbox doesn't do a lot.
I'm guessing at this point, but it looks like it's dnsmasq that's copying the overrides into /etc/hosts, and that unbound reads the xml without messing with /etc/hosts.
-
I looked through all the code and then the dnsmasq manual. Below is what I have come up with to clarify:
- The option "Do not use the DNS Forwarder as a DNS server for the firewall" removes 127.0.0.1 from the firewall's resolv.conf only.
- If you do not want the DNS Forwarder to be used by DHCP clients, either specify DNS servers in Services -> DHCP Server or uncheck Enable in Services -> DNS Forwarder.
- "Host Overrides" are hosts-file-based overrides, written by PHP code only. Dnsmasq reads that file for lookups, though.
- If you want a DNS record rather than a host override, use "host-record=www.fudzilla.com,192.168.55.1" and you will see the expected result.
- The overrides you are using in unbound work much like these host-record entries, whereas the pfSense Host Overrides page writes hosts-file entries that dnsmasq reads for lookups. Unfortunately, that also affects other things on the firewall side.
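For more than a couple of names, the host-record lines can be generated rather than typed by hand. This is a small Python sketch under the assumption that the lines are then pasted into the DNS Forwarder advanced options. The helper name host_record_line is hypothetical, and the output follows the name,name,IPv4,IPv6 layout shown in the dnsmasq manual example.

```python
import ipaddress
from typing import Optional, Sequence


def host_record_line(
    names: Sequence[str],
    ipv4: Optional[str] = None,
    ipv6: Optional[str] = None,
) -> str:
    """Build one dnsmasq 'host-record=' option from names and optional addresses."""
    if not names or not (ipv4 or ipv6):
        raise ValueError("need at least one name and one address")
    parts = list(names)
    if ipv4:
        # ipaddress raises ValueError on a malformed address.
        parts.append(str(ipaddress.IPv4Address(ipv4)))
    if ipv6:
        parts.append(str(ipaddress.IPv6Address(ipv6)))
    return "host-record=" + ",".join(parts)


# The override from this thread, expressed as a DNS record instead of a hosts entry.
print(host_record_line(["uk.milewski.org"], ipv4="172.23.16.11"))

# The example from the dnsmasq manual, with short and long names plus both families.
print(host_record_line(["laptop", "laptop.thekelleys.org"], "192.168.0.1", "1234::100"))
```

Because these records live in dnsmasq only, they should not end up in /etc/hosts, so lookups made on the firewall itself would not see them when 127.0.0.1 is left out of resolv.conf.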
Hope that helps.
--host-record=<name>[,<name>....][<ipv4-address>],[<ipv6-address>]
Add A, AAAA and PTR records to the DNS. This adds one or more names to the DNS with associated IPv4 (A) and IPv6 (AAAA) records. A name may appear in more than one host-record and therefore be assigned more than one address. Only the first address creates a PTR record linking the address to the name. This is the same rule as is used reading hosts-files. host-record options are considered to be read before host-files, so a name appearing there inhibits PTR-record creation if it appears in hosts-file also. Unlike hosts-files, names are not expanded, even when expand-hosts is in effect. Short and long names may appear in the same host-record, eg. --host-record=laptop,laptop.thekelleys.org,192.168.0.1,1234::100
-
I am curious why this is an issue, to be honest. You're creating records that point to a local address, "172.23.16.11".
Why would you not want to use this? If you're not using it, why create it in the first place?
-
Why would you not want to use this? If you're not using it, why create it in the first place?
There are lots of reasons why you'd want to run split DNS. In my case, I need to run rinetd on the firewall in front of a real host on the internet. My LAN clients need to resolve a certain hostname to a Virtual IP on the firewall, but I need rinetd to be able to resolve the real external address of that hostname so it can connect.
-
I understand the reason for split DNS; that is not what I asked. My question is why you want pfSense to resolve the public record rather than the local private record. But now I see you're running an application on pfSense that you don't want to use the hosts file to resolve.
But why does pfSense, or apps on it, have to resolve anything? Why don't you just point rinetd at the IP you want it to forward to instead of using an FQDN? Does this FQDN change its IP?
-
But why does pfSense, or apps on it, have to resolve anything? Why don't you just point rinetd at the IP you want it to forward to instead of using an FQDN? Does this FQDN change its IP?
Circling back to this, I certainly could use a hard-coded IP. That would mean that if the service I'm using moves, we have to remember to reconfigure the pfsense box as well. This is exactly what DNS was designed to avoid.
Unbound doesn't seem very stable, so if there's a way to get this to work, perhaps by having dnsmasq use a separate file rather than /etc/hosts, that would be a win for me.