DNSBL VIP resolution from Win7
I have been trying to diagnose this issue for a couple of days now, and I think I understand what is going on, but can't seem to solve it. And if I can't solve it then I probably don't fully understand what is going on, so I am turning for help here.
I have DNSBL set up, and blocked domains resolve to the virtual IP if I query them from the pfSense box or from a Linux box on the local network. However, if I query a blocked domain from a Win7 box on the local network, it does not return the virtual IP and therefore is not blocked. An example of each nslookup query is below.
The EasyList sites are blocked when queried from the pfSense box.
If I query the same site from a test Linux box on the local network I get the same results.
[root@disect ~]# nslookup ad.doubleclick.net
If I query the same site from a Windows box on the local network I get a different result. I even made sure to flush the Windows dns cache before doing the query.
C:\Users\jeffb> nslookup ad.doubleclick.net
Network DNS is controlled by a CentOS 6 VM (taxa, 192.168.112.51) running dnsmasq. After setting up the pfSense box, all DNS queries are now forwarded to the pfSense box (192.168.112.11). The pfSense box is the only IP address listed in /etc/resolv.conf on the dnsmasq box. Logs on the dnsmasq box, and sniffing traffic on the network, show that all DNS queries on the local network are initially going to the dnsmasq box, then being forwarded to the pfSense box.
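One way to narrow down where the answer changes is to send the same A query directly to each server in the chain and compare the results. The following is a minimal sketch, assuming Python is available on the client being tested. The function names (`build_query`, `parse_a_records`, `query_server`, `compare_servers`) are made up for illustration, and it only handles plain UDP queries and A records, with no retries.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet with the recursion-desired flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # qtype, class IN


def skip_name(packet, offset):
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer occupies two bytes
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Return the IPv4 addresses found in the answer section of a response."""
    qdcount, ancount = struct.unpack("!HH", packet[4:8])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip qtype and qclass
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record; CNAMEs are skipped
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_server(server, name, timeout=3.0):
    """Send one A query to `server` over UDP port 53 and return its answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
    return parse_a_records(data)


def compare_servers(servers, name):
    """Query each server in turn; record the answers or the failure reason."""
    results = {}
    for server in servers:
        try:
            results[server] = query_server(server, name)
        except OSError as exc:  # includes timeouts and unreachable hosts
            results[server] = f"query failed: {exc}"
    return results
```

Calling `compare_servers(["192.168.112.51", "192.168.112.11"], "ad.doubleclick.net")` from a working and a non-working client would show whether the dnsmasq box and the pfSense box hand out different answers for the same name.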
The pfSense box has Resolver enabled and no DNS servers listed in the System/General Setup page. The dashboard only shows 127.0.0.1 listed for the DNS server. I can ping the virtual IP address from a Windows box, and pointing a browser from a Windows box to the virtual IP address gives the 1x1 pixel. The Resolver is not in Forwarder mode. My network is just a single subnet, with no VLANs or anything fancy.
A packet capture on the pfSense box shows the response to an nslookup request for ad.doubleclick.net from a Win7 box.
15:26:32.058779 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 64)
192.168.112.51.22671 > 192.168.112.11.53: 32274+ A? ad.doubleclick.net. (36)
15:26:32.058897 IP (tos 0x0, ttl 64, id 52473, offset 0, flags [none], proto UDP (17), length 80)
192.168.112.11.53 > 192.168.112.51.22671: 32274* 1/0/0 ad.doubleclick.net. A 10.10.10.1 (52)
15:26:32.071164 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 64)
192.168.112.51.9640 > 192.168.112.11.53: 16722+ AAAA? ad.doubleclick.net. (36)
It looks to me as though the pfSense box (192.168.112.11) sends a response back to the dnsmasq box (192.168.112.51) for the A (IPv4) query with the correct answer, the virtual IP address. The dnsmasq box then sends an AAAA (IPv6) query, and the capture shows no response to it. Doing a tracert on the Win7 box to this same domain shows that it immediately hops to the pfSense box, then stalls for roughly 5 seconds, then goes out to resolve the 18.104.22.168 address through my upstream provider.
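Pairing queries with responses in a capture like the one above can also be done mechanically, which helps when the capture is longer than a few packets. This is a rough sketch, assuming the capture is saved as verbose tcpdump text in the same format as the excerpt. The regular expressions and the `unanswered_queries` name are illustrative, and they only cover the simple query and response lines shown here, not every flag combination tcpdump can print.

```python
import re

# Query line, e.g. "... > 192.168.112.11.53: 32274+ A? ad.doubleclick.net. (36)"
QUERY_RE = re.compile(
    r"\.53: (?P<id>\d+)\S* (?P<qtype>[A-Z]+)\? (?P<name>\S+)"
)
# Response line, e.g. "192.168.112.11.53 > ...: 32274* 1/0/0 ad.doubleclick.net. A ..."
RESPONSE_RE = re.compile(
    r"\.53 > \S+: (?P<id>\d+)\S* (?P<counts>\d+/\d+/\d+)"
)

# The packet lines quoted in the post.
SAMPLE_CAPTURE = """\
15:26:32.058779 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 64)
192.168.112.51.22671 > 192.168.112.11.53: 32274+ A? ad.doubleclick.net. (36)
15:26:32.058897 IP (tos 0x0, ttl 64, id 52473, offset 0, flags [none], proto UDP (17), length 80)
192.168.112.11.53 > 192.168.112.51.22671: 32274* 1/0/0 ad.doubleclick.net. A 10.10.10.1 (52)
15:26:32.071164 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 64)
192.168.112.51.9640 > 192.168.112.11.53: 16722+ AAAA? ad.doubleclick.net. (36)
"""


def unanswered_queries(capture_text):
    """Return {query_id: (qtype, name)} for queries with no matching response."""
    pending = {}
    for line in capture_text.splitlines():
        query = QUERY_RE.search(line)
        if query:
            pending[query.group("id")] = (query.group("qtype"), query.group("name"))
            continue
        response = RESPONSE_RE.search(line)
        if response:
            # A response with the same DNS ID answers the pending query.
            pending.pop(response.group("id"), None)
    return pending


print(unanswered_queries(SAMPLE_CAPTURE))
# → {'16722': ('AAAA', 'ad.doubleclick.net.')}
```

An ID left in the result only means no response appears in the text that was parsed. A capture that was cut off early would look the same as a query that was genuinely never answered.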
I have tried a few things on the Win7 box relating to disabling and changing the IPv4 and IPv6 priorities, without getting a different result. I also tried checking the prefer IPv4 over IPv6 box on System/Advanced/Networking page on the pfSense box, without getting a different result.
So I may not know what is going on with the address resolution, but my goal is to be able to have the Windows boxes on my network resolve to the virtual IP address for the blocked sites. Can anyone explain to me what is going on here, or have any other diagnostic steps for me to try? Thanks.
The fact that I am not getting any replies, along with some recent additional debugging, suggests that this issue may be outside the scope of DNSBL, although it certainly seems to be impacting how DNSBL works (or doesn't work). Additional debugging notes below.
I noticed in the pfSense logs that there was one Win7 box on the local network that was actually using DNSBL.
From the working Win7 box, an nslookup on ad.doubleclick.net returns the virtual IP of the DNSBL. Comparing the network configurations of the working Win7 box and a non-working Win7 box, they appear to be the same. Both boxes are stock installations. One box may have more or fewer updates installed than the other, but I do not know if this is the case, or which one might be more up to date. I believe that the non-working box is more up to date than the working box, but will need to confirm it.
So I ran nslookup from the Windows command line on both boxes, using the debug mode to get additional output. I made sure to flush the dns cache before each query. The results are summarized below (in debug mode I couldn't copy and paste the output, but think I am listing the pertinent details).
Working Win7 box:
ad.doubleclick.net, type=A, class=IN
internet address = 10.10.10.1
ttl = 60 (1 min)
Non-Working Win7 box
ad.doubleclick.net, type=A, class=IN
canonical name = dart.l.doubleclick.net
ttl = 75742 (21 hours 2 mins 22 secs)
internet address = 22.214.171.124
ttl = 48 (48 secs)
I don't have enough knowledge of Windows networking and DNS to understand why these differences are occurring, but the differences are clearly affecting the DNS block list process. Not sure if anyone here can help me at this point since this is a pfSense list and not Win7 technical support. But I am still open to any suggested ideas, and even any suggestions for a good Windows DNS forum where I might be able to get help. Thanks.
I don't suspect this to be an issue with the package… There seem to be some DNS settings issues with those LAN clients... The DNS settings for the LAN clients need to be pointed to the pfSense Resolver only, in order for DNSBL to function... If you nslookup and don't get the DNSBL VIP address, then it's short circuiting somehow...
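A quick way to run that check on each LAN client is to resolve a few blocked names through the operating system's own resolver and see whether the DNSBL VIP comes back. The sketch below is only an illustration: the `check_blocked` name is made up, the VIP of 10.10.10.1 is taken from the nslookup output earlier in the thread, and it assumes Python is installed on the client.

```python
import socket

DNSBL_VIP = "10.10.10.1"  # virtual IP seen in the nslookup output in this thread


def check_blocked(names, vip=DNSBL_VIP, resolve=socket.gethostbyname_ex):
    """Map each name to True (resolves to the VIP), False, or None on failure.

    `resolve` defaults to the OS resolver; it is a parameter so the function
    can be exercised without network access.
    """
    results = {}
    for name in names:
        try:
            _hostname, _aliases, addresses = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
            continue
        results[name] = vip in addresses
    return results
```

Running `check_blocked(["ad.doubleclick.net"])` on a working and a non-working client gives a simple yes or no per name that can be compared with what nslookup reports on the same machine.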
I had one user who was using Avast (I believe) Antivirus client, and it had an option to monitor DNS queries and re-route incorrect DNS queries to the proper address and thus bypassing DNSBL...
Well, I put this issue aside to think about, and was out of the office for a few days. When I returned this morning I saw in the DNSBL log files that another Win7 box on the network is now listed as blocking sites. It is a box that was tested last week and was not working. So I tested my own Win7 box, and DNSBL is now working on it. My box had not been shut down or rebooted between the time it was tested as not working and today, when it is working.
So I don't know what to say other than there is something fishy in the Windows network stack. All of our LAN boxes get their network, DNS, DHCP, and gateway information from our dnsmasq server, so they are all configured the same. We are running the Avast Antivirus clients here, but I do not see any option to re-route incorrect DNS queries to the proper address. And we are not using a proxy here.
Anyway, I will now consider this fixed/solved. Even though the true issue is still not identified. I give up and will move on to something else.
Thanks for the input.