Automatically fill Unbound DNS cache with top hits list?
-
Haha I don't know what you want to hear man. All I can tell you is that if I sat you down at a computer on my network, had you browse under normal conditions, then pulled in a bunch of DNS queries at once, you would be able to tell the difference.
It isn't anything crazy, but it does make it feel noticeably faster.
I am definitely not worried about the few bucks more a year I'll have to pay for the CPU cycles or the kBs of data that it will take to do it. -
Well, what you really should do is identify the issue, so it can be resolved specifically and properly.
I'd start with httpWatch. The basic edition is free.
https://www.httpwatch.com/
Pretty sure Chrome has similar capabilities that can show where time is being consumed.
-
That's very cool, I'll check it out!
I don't think that there is a real problem per se; I think I just have a high-latency connection and that's life. For example, when I run dig on a lot of URLs, at a glance it looks like most resolve in ~300 ms, but some queries are in the 2-3 second range. John, by contrast, pointed out he has a normal-latency connection and is getting returns in the 30 ms range. Again, everything works fine and I wouldn't call it a significant problem at all (after all, I am still resolving). It's just something I thought I'd see if I could tweak.
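For anyone who wants to collect that kind of measurement themselves, dig reports its own per-query time, and a small loop can pull it out. This is just a sketch; the domains below are placeholders, not the list I'm actually using:

```shell
# Print dig's reported query time for each sample domain.
# Replace the placeholder domains with ones you actually care about.
for d in example.com example.org example.net; do
    t=$(dig +time=3 +tries=1 "$d" | awk '/Query time:/ {print $4}')
    echo "$d: ${t:-timeout} msec"
done
```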
You definitely can tell a difference when you go from 2-3 s in some cases to ~0 ms. While 2-3 seconds is abysmal for DNS, it's still only 2-3 seconds, and those times are the exception, not the norm. This really is something I'm just curious about, especially since it's actually a noticeable improvement on my network. I'm not suggesting this is something others should do either. For me it's educational: messing around with the commands and trying to hack something together that does what I want.
But I will check that program out, maybe there is something wrong. Thank you.
-
Hopefully the basic edition will show the DNS lookup timing. I have access to the pro edition.
-
Very cool, getting that now!
-
Alright, I got it working. It's pretty hacked together but it works (so far).
The first cron job retrieves the Majestic Million top URL list every day, pulls out only the column listing the domains, and cuts it down to the first 5000 lines. Then it prepends dig commands, limiting the timeout to 3 s (I think the default is 5 s) and allowing only one try (default = 3?). It splits the file into 50 executable files of 100 lines each and deletes all of the unnecessary intermediate files.
30 04 * * * root
fetch -o /tmp/mm.csv http://downloads.majestic.com/majestic_million.csv && cut -d , -f 3 /tmp/mm.csv | cat >> /tmp/mmf && rm /tmp/mm.csv && sed -I -e '1d;5002,$d' /tmp/mmf && rm /tmp/mmf-e && sed -I -e 's/^/dig +short +time=3 +tries=1 +ttlid /' /tmp/mmf && rm /tmp/mmf-e && split -d -l 100 /tmp/mmf /tmp/mmf && rm /tmp/mmf && chmod +x /tmp/mmf*
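For readability, the same pipeline can be sketched as a standalone script rather than a crontab one-liner. This assumes a FreeBSD userland (as on pfSense) and carries the paths and the 5000-domain cutoff over from above; it is an illustrative rewrite, not the exact job:

```shell
#!/bin/sh
# Sketch of the daily seeding job (FreeBSD/pfSense userland assumed).
CSV=/tmp/mm.csv
LIST=/tmp/mmf

# Grab the Majestic Million list; column 3 holds the domain.
fetch -o "$CSV" http://downloads.majestic.com/majestic_million.csv

# Drop the header row, keep the first 5000 domains, and prepend the
# dig invocation (3 s timeout, single try) to each line.
cut -d , -f 3 "$CSV" | sed '1d' | head -n 5000 \
    | sed 's/^/dig +short +time=3 +tries=1 /' > "$LIST"
rm "$CSV"

# Split into 50 executable chunks of 100 lines: mmf00 .. mmf49.
split -d -l 100 "$LIST" "$LIST"
rm "$LIST"
chmod +x "$LIST"[0-9][0-9]
```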
The second cron job runs the DNS query every 5 minutes.
*/5 * * * * root
/tmp/mmf00 & /tmp/mmf01 & /tmp/mmf02 & /tmp/mmf03 & /tmp/mmf04 & /tmp/mmf05 & /tmp/mmf06 & /tmp/mmf07 & /tmp/mmf08 & /tmp/mmf09 & /tmp/mmf10 & /tmp/mmf11 & /tmp/mmf12 & /tmp/mmf13 & /tmp/mmf14 & /tmp/mmf15 & /tmp/mmf16 & /tmp/mmf17 & /tmp/mmf18 & /tmp/mmf19 & /tmp/mmf20 & /tmp/mmf21 & /tmp/mmf22 & /tmp/mmf23 & /tmp/mmf24 & /tmp/mmf25 & /tmp/mmf26 & /tmp/mmf27 & /tmp/mmf28 & /tmp/mmf29 & /tmp/mmf30 & /tmp/mmf31 & /tmp/mmf32 & /tmp/mmf33 & /tmp/mmf34 & /tmp/mmf35 & /tmp/mmf36 & /tmp/mmf37 & /tmp/mmf38 & /tmp/mmf39 & /tmp/mmf40 & /tmp/mmf41 & /tmp/mmf42 & /tmp/mmf43 & /tmp/mmf44 & /tmp/mmf45 & /tmp/mmf46 & /tmp/mmf47 & /tmp/mmf48 & /tmp/mmf49 &
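If the list were kept as plain domains (one per line) instead of 50 pre-built script files, xargs could do the same 50-way fan-out in one line. This is a hypothetical alternative, not what the cron jobs above actually run:

```shell
# Fan out up to 50 concurrent dig processes over a plain domain list.
# /tmp/mmf here is assumed to hold one bare domain per line.
xargs -n 1 -P 50 dig +short +time=3 +tries=1 < /tmp/mmf > /dev/null 2>&1
```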
The initial run takes ~34s right after a flushed cache. This pretty much lines up with my ~300ms/request average (cutting out the really high time outliers).
Subsequent runs take ~13s.
CPU usage during the run is ~40%. This is significant considering my router runs an i5-2400.
Bandwidth usage is ~40kbps during the initial run.
Cache size ends up ~165× what it would be about a minute after a flush under normal usage alone.
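The cache-size numbers above can be checked directly if unbound-control is enabled; stats_noreset reports counters without zeroing them, and the cache-count lines give the number of cached messages and RRsets:

```shell
# Snapshot Unbound's cache entry counts without resetting statistics.
unbound-control stats_noreset | grep -E '^(msg|rrset)\.cache\.count'
```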
It works for me and makes my network noticeably snappier. But it also sucks down a lot of CPU for a router. That doesn't bother me on my network but this is obviously not a useful thing to do for most.
-
So it takes you 300ms to query for forum.pfsense.org from their NS??
Where are you in the world? What is your internet connection? If your latency is that bad, it is going to affect all downloads, not just DNS queries. So again, I do not see how trimming 0.3 of a second is going to make a freaking difference in your performance.
Let's see your httpWatch traces, etc.
-
What you're doing will cut down on the initial seeding time of the cache, but you can't cheat the TTLs set on the records; they have to be refetched when they expire, and that's going to be just as costly as starting from an empty cache.
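One mitigation worth mentioning: Unbound can refresh popular cache entries on its own shortly before they expire, which softens exactly this refetch cost for records that are actually in use. A minimal unbound.conf fragment (on pfSense this would go in the DNS Resolver's custom options box):

```
server:
    # Refetch frequently-used cache entries shortly before their
    # TTL runs out, so hot records rarely expire cold.
    prefetch: yes
```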
-
So it takes you 300ms to query for forum.pfsense.org from their NS??
Where are you in the world? What is your internet connection? If your latency is that bad, it is going to affect all downloads, not just DNS queries. So again, I do not see how trimming 0.3 of a second is going to make a freaking difference in your performance.
Let's see your httpWatch traces, etc.
Not sure who that question is directed at. Probably the OPer. But just shy of 300 ms is what it takes to get the IP address using the DNS resolver. This can be seen in that httpWatch screen capture. It's not simply a query to the authoritative NS; it has to walk the chain, so it adds up. Three tenths of a second is humanly perceptible, though not by much.
For me, going to the pfSense home page, the total DNS time was about 400 ms for about 4 lookups, two of which were to Google services. But by then most of the page is probably already rendered.
The OPer hasn't really given many details about the situation other than that it's slow, but snappier with DNS pre-cached. Nothing about the service, its latency, bandwidth, etc. Some httpWatch traces could potentially reveal some relevant info.
-
"It's not simply just a query to the authoritative NS. Has to walk the chain."
It only has to walk the whole chain if none of the chain is cached, but part of the chain should already be cached. Many NS records have much longer TTLs than the leaf records, etc. Once you ask the roots for the NS of the TLDs, those have a TTL of:
;; QUESTION SECTION:
;org. IN NS

;; ANSWER SECTION:
org. 86400 IN NS a0.org.afilias-nst.info.
org. 86400 IN NS b2.org.afilias-nst.org.
org. 86400 IN NS b0.org.afilias-nst.org.
org. 86400 IN NS c0.org.afilias-nst.info.
org. 86400 IN NS a2.org.afilias-nst.info.
org. 86400 IN NS d0.org.afilias-nst.org.

So you sure do not have to walk the chain to get those unless the TTL has expired.
Now, a problem you might have with pfsense.org is that they have their NS records with a very low 300-second TTL, which doesn't make a lot of sense unless they were about to change their NS.
Looks like someone forgot to update the TTL on those records, since I show the actual ns1 and ns2 having a TTL of 3600:
;; AUTHORITY SECTION:
netgate.com. 3600 IN NS ns1.netgate.com.
netgate.com. 3600 IN NS ns2.netgate.com.

So normally, when there is a low TTL on a record, you would only have to query the authoritative NS directly when it expires, not walk the whole chain again.
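The full walk being discussed can be made visible with dig's trace mode, which bypasses the local cache and queries from the roots down; the awk filter here just condenses the output to the delegation (NS) steps:

```shell
# Walk the delegation chain from the roots and list each zone's NS
# records, i.e. the steps a completely cold cache would have to take.
dig +trace forum.pfsense.org | awk '$4 == "NS" {print $1, $5}' | sort -u
```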
-
Suggest you Wireshark the DNS of an actual http://pfSense.org/ browsing session after the TTL has expired.
The attached Wireshark screen capture is of browsing to http://pfSense.org/ (in a new browser session with the site's cache and cookies cleared; not that that should matter) after having been there several times already within the past hour, and after the DNS TTL had expired.
Up the chain it goes to:
Name: a0.org.afilias-nst.info
Address: 199.19.56.1

![pfSense.org DNS.jpg](/public/imported_attachments/1/pfSense.org DNS.jpg)