Excessive DNS lookups for _http._tcp.pkg.pfsense.org after 2.3 upgrade
-
According to my OpenDNS statistics, my network made over 12,000 DNS requests for _http._tcp.pkg.pfsense.org in the 18 hours since I upgraded from 2.2.x to 2.3. dnsmasq is running on the router and this seems pretty excessive as my next highest # of requests for the whole network was <800 for *.akamaiedge.net. I'll report back if I see this again tomorrow, but is there a way to scale that back?
-
Yeah pkg apparently really, really wants to fetch that SRV record, even if it is unnecessary. We added that to DNS earlier today, so that shouldn't be happening anymore.
-
It's used by pkg to locate package mirrors if the repository specification has a mirror type of "SRV". Maybe you should change the mirror type to just "HTTP" if you're not planning on offering multiple package mirrors with geolocation support?
-
Yes, we know what it's for. We're leaving it that way as we might use the SRV in the future.
-
I just noticed another spike in DNS requests for _http._tcp.pkg.pfsense.org over the course of the last 24 hours. Only 1,654 requests this time, but this is the first "spike" since my first post on 13 April.
-
Still the SRV record it's looking up? That exists and resolves fine, so it shouldn't be doing anything repeatedly. Has a 5 minute TTL, so the most any single system should be looking it up would be 288 times in 24 hours and only that if it were trying continuously.
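For what it's worth, that ceiling is just the reporting window divided by the TTL. A minimal Python sketch of the arithmetic (the function name is made up for illustration, and it assumes a cache that strictly honours the TTL):

```python
# Upper bound on upstream lookups from one caching client, given the record's TTL.
TTL_SECONDS = 5 * 60            # 5 minute TTL, as stated above
WINDOW_SECONDS = 24 * 60 * 60   # 24 hour reporting window


def max_cached_lookups(window_seconds: int, ttl_seconds: int) -> int:
    """Most lookups a TTL-honouring cache can send upstream within the window."""
    return window_seconds // ttl_seconds


print(max_cached_lookups(WINDOW_SECONDS, TTL_SECONDS))  # 288
```

Anything far above that number from a single system suggests the answers are not being cached at all.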
-
Correct. My latest count for the 24 hours preceding this post is 25,634 SRV lookups for _http._tcp.pkg.pfsense.org. I have no packages installed at this time. Prior to yesterday, the only package I had installed was mtr-nox, but since I wasn't using it, I removed it.
![Screenshot 2016-04-22 17.35.31.png](/public/imported_attachments/1/Screenshot 2016-04-22 17.35.31.png)
-
OK, I just installed dnstop and have it listening on my public-facing interface. I run unbound (resolver), and I don't see it doing any such craziness. It's been running for a while, and while I did see that query go by, it has long since dropped off the top listing.
A simple fetch http://pkg.freebsd.org/freebsd:10:x86:64/latest/All/dnstop-20140915.txz
followed by pkg add, and I have dnstop working. Are you sure OpenDNS is giving you the correct info?
I show over 1100 total queries, with the SRV type being only 3 of them, but now I am curious as to what kind of query #0 is.
-
Having worked for OpenDNS in the past maintaining their resolver infrastructure around the world, I am quite confident that the information they are giving me is as close to 100% accurate as possible.
I'm running dnstop right now. The screenshot is about 5 minutes worth of data.
![Screenshot 2016-04-23 13.16.15.png](/public/imported_attachments/1/Screenshot 2016-04-23 13.16.15.png)
-
And 3 1/2 hours later we're seeing dns lookups for hostname.bind, id.server and . too.
![Screenshot 2016-04-23 16.42.01.png](/public/imported_attachments/1/Screenshot 2016-04-23 16.42.01.png)
-
Well, I can tell you it's something odd with your config and not an actual issue. I've been letting it run, and with over 112k queries,
pfsense.org for anything is nowhere near the top. Do you have the page open somewhere on a very fast refresh? And why would something be looking for hostname.bind or id.server?
You're saying the http query is also for SRV; well, I only show a total of 43 of those out of 112K total queries.
-
pcap those DNS requests, are you getting correct replies? A dig to OpenDNS for that SRV does get the proper reply and TTL, so it should be impossible to have that many queries piling up (assuming you're getting a correct response).
-
I don't have a pcap yet, but here is what I see from the router:
[2.3-RELEASE][xxx@xxx]/root: dig _http._tcp.pkg.pfsense.org srv @208.67.222.222
; <<>> DiG 9.10.3-P4 <<>> _http._tcp.pkg.pfsense.org srv @208.67.222.222
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1335
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_http._tcp.pkg.pfsense.org. IN SRV
;; ANSWER SECTION:
_http._tcp.pkg.pfsense.org. 220 IN SRV 10 10 80 pkg.pfsense.org.
;; Query time: 19 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Tue May 03 17:32:14 CDT 2016
;; MSG SIZE rcvd: 90

[2.3-RELEASE][xxx@xxx]/root: dig _http._tcp.pkg.pfsense.org srv
; <<>> DiG 9.10.3-P4 <<>> _http._tcp.pkg.pfsense.org srv
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17256
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_http._tcp.pkg.pfsense.org. IN SRV
;; ANSWER SECTION:
_http._tcp.pkg.pfsense.org. 300 IN SRV 10 10 80 pkg.pfsense.org.
;; Query time: 55 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue May 03 17:32:21 CDT 2016
;; MSG SIZE rcvd: 90
My OpenDNS stats for the previous 24 hours indicate almost 10K requests for _http._tcp.pkg.pfsense.org again. I'll have a pcap ready by tomorrow.
-
You're getting the right reply, it certainly seems sane.
-
I left the packet capture routine running (via diag_packet_capture.php) overnight. I stopped it this morning and downloaded the pcap, but it was only 24 bytes and contained no useful information. However, there were over 11K DNS requests for that record again during the same period in which I ran the packet capture.
![Screenshot 2016-05-04 14.15.47.png](/public/imported_attachments/1/Screenshot 2016-05-04 14.15.47.png)
-
Well, if you're saying you did 11k DNS queries for that and your pcap was empty, then clearly you were not capturing on the right interface or the right port, or someone is clearly mistaken about the number of queries that are happening ;)
-
What filter did you have on the capture? Sounds like you ended up filtering out pretty much everything.
-
I disabled dnsmasq and set up and enabled unbound, and the problem seems to have gone away. As much as I like bug hunting, I'm not going to dive into dnsmasq and figure out the why… I guess we can consider this issue closed.
-
Ah, now that makes sense.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579536
dnsmasq not caching SRV records is "by design". Seems like a really poor design to me.
Guess you must keep your dashboard up all the time? Or at least a lot.
dnsmasq will query all configured DNS servers simultaneously, so in the case of OpenDNS at least assuming you have both their IPs in there, they'll show you 2 queries per 1 that's actually done, which was doubling it.
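To illustrate how those two effects stack up, here is a rough Python model. It is not dnsmasq's actual logic: it assumes a hypothetical client asking for the record every 10 seconds, and that every uncached lookup is sent to all configured upstream servers. The function and parameter names are invented for the example.

```python
def upstream_queries(lookup_times, ttl, cache_enabled, upstream_servers):
    """Count the queries upstream resolvers would see for one record.

    lookup_times: seconds at which a local client asks the forwarder
    ttl: record TTL in seconds
    cache_enabled: whether the forwarder caches this record type
    upstream_servers: how many upstream resolvers each forwarded lookup goes to
    """
    forwarded = 0
    expires_at = None
    for t in sorted(lookup_times):
        if cache_enabled and expires_at is not None and t < expires_at:
            continue  # answered from the local cache, nothing goes upstream
        forwarded += 1
        expires_at = t + ttl
    return forwarded * upstream_servers


# Hypothetical load: one lookup every 10 seconds for 24 hours.
lookups = range(0, 24 * 60 * 60, 10)

print(upstream_queries(lookups, 300, True, 1))   # 288: cached, one upstream
print(upstream_queries(lookups, 300, False, 2))  # 17280: no SRV cache, two upstreams
```

With caching, the count stays at the TTL ceiling no matter how often the client asks; without it, every local lookup is multiplied by the number of upstream servers.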
-
@cmb:
Ah, now that makes sense.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579536
dnsmasq not caching SRV records is "by design". Seems like a really poor design to me.
Guess you must keep your dashboard up all the time? Or at least a lot.
dnsmasq will query all configured DNS servers simultaneously, so in the case of OpenDNS at least assuming you have both their IPs in there, they'll show you 2 queries per 1 that's actually done, which was doubling it.
I remember reading something about Linux where most distros would query all DNS servers and use the first response. Everyone talking about it was so proud of configuring 8+ DNS servers and getting the fastest response. They have a funny mindset in that camp.