Excessive DNS lookups for _http._tcp.pkg.pfsense.org after 2.3 upgrade



  • According to my OpenDNS statistics, my network made over 12,000 DNS requests for _http._tcp.pkg.pfsense.org in the 18 hours since I upgraded from 2.2.x to 2.3. dnsmasq is running on the router, and this seems excessive: the next-highest request count for the whole network was under 800, for *.akamaiedge.net. I'll report back if I see this again tomorrow, but is there a way to scale that back?



  • Yeah pkg apparently really, really wants to fetch that SRV record, even if it is unnecessary. We added that to DNS earlier today, so that shouldn't be happening anymore.



  • It's used by pkg to locate package mirrors if the repository specification has a mirror type of "SRV". Maybe you should change the mirror type to just "HTTP" if you're not planning on offering multiple package mirrors with geolocation support?



  • Yes, we know what it's for. We're leaving it that way as we might use the SRV in the future.



  • I just noticed another spike in DNS requests for _http._tcp.pkg.pfsense.org over the course of the last 24 hours. Only 1,654 requests this time, but this is the first "spike" since my first post on 13 April.



  • Is it still the SRV record it's looking up? That exists and resolves fine, so it shouldn't be doing anything repeatedly. It has a 5-minute TTL, so the most any single system should be looking it up is 288 times in 24 hours, and only if it were trying continuously.
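
The 288 figure above is just the number of TTL periods in a day. A minimal sketch of that arithmetic, assuming a resolver that caches the record for its full TTL (the function name is made up for illustration):

```python
def max_lookups_per_resolver(ttl_seconds: int, window_seconds: int) -> int:
    """Upper bound on upstream lookups by one caching resolver that honors the TTL."""
    # A cache that keeps the record for its full TTL asks upstream
    # at most once per TTL period.
    return window_seconds // ttl_seconds


DAY = 24 * 60 * 60  # seconds in 24 hours
ceiling = max_lookups_per_resolver(300, DAY)  # 5-minute TTL
print(ceiling)  # 288

# The 1,654 requests reported for the last 24 hours, relative to that ceiling.
print(1654 / ceiling)
```

Any count well above the ceiling suggests that something between the client and the upstream resolver is not caching the answer.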



  • Correct. My latest count for the 24 hours preceding this post is 25,634 SRV lookups for _http._tcp.pkg.pfsense.org. I have no packages installed at this time. Prior to yesterday, the only package I had installed was mtr-nox, but since I wasn't using it, I removed it.

    ![Screenshot 2016-04-22 17.35.31.png](/public/imported_attachments/1/Screenshot 2016-04-22 17.35.31.png)


  • LAYER 8 Global Moderator

    OK, I just installed dnstop and have it listening on my public-facing interface. I run unbound (resolver), and I don't see it doing any such craziness. It's been running for a while, and while I did see that query go by, it has long since dropped off the top listing.

    A simple fetch http://pkg.freebsd.org/freebsd:10:x86:64/latest/All/dnstop-20140915.txz and a pkg add, and you have dnstop working. Are you sure OpenDNS is giving you the correct info?

    I show over 1,100 total queries, with the SRV type being only 3 of them, but now I am curious what kind of query #0 is.




  • Having worked for OpenDNS in the past maintaining their resolver infrastructure around the world, I am quite confident that the information they are giving me is as close to 100% accurate as possible.

    I'm running dnstop right now. The screenshot is about 5 minutes worth of data.

    ![Screenshot 2016-04-23 13.16.15.png](/public/imported_attachments/1/Screenshot 2016-04-23 13.16.15.png)



  • And 3 1/2 hours later we're seeing dns lookups for hostname.bind, id.server and . too.

    ![Screenshot 2016-04-23 16.42.01.png](/public/imported_attachments/1/Screenshot 2016-04-23 16.42.01.png)


  • LAYER 8 Global Moderator

    Well, I can tell you it's something odd with your config and not an actual issue. I've been letting it run, and with over 112K queries, pfsense.org for anything is nowhere near the top. Do you have the page open somewhere on a very fast refresh? And why would something be looking for hostname.bind or id.server?

    You're saying the http query is also for SRV; well, I only show a total of 43 of those out of 112K total queries.






  • pcap those DNS requests, are you getting correct replies? A dig to OpenDNS for that SRV does get the proper reply and TTL, so it should be impossible to have that many queries piling up (assuming you're getting a correct response).
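
Once a capture exists, tallying the queries in it comes down to decoding the question section of each DNS message. A rough sketch, assuming the UDP payloads have already been extracted from the pcap (that step is not shown); `parse_question`, `count_queries` and `build_query` are hypothetical helpers, and `build_query` exists only to produce sample input:

```python
import struct
from collections import Counter

SRV = 33  # DNS record type number for SRV (RFC 2782)
A = 1     # DNS record type number for A


def parse_question(payload: bytes):
    """Return (qname, qtype) for the first question in a raw DNS message."""
    # Fixed 12-byte header: id, flags, qdcount, ancount, nscount, arcount.
    _id, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])
    if qdcount < 1:
        raise ValueError("message has no question section")
    labels = []
    pos = 12
    while True:
        length = payload[pos]  # each label is prefixed by its length
        pos += 1
        if length == 0:        # zero-length label terminates the name
            break
        labels.append(payload[pos:pos + length].decode("ascii"))
        pos += length
    qtype, _qclass = struct.unpack("!2H", payload[pos:pos + 4])
    return ".".join(labels) + ".", qtype


def count_queries(payloads):
    """Tally (qname, qtype) pairs across an iterable of DNS payloads."""
    return Counter(parse_question(p) for p in payloads)


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Build a minimal query message (sample input for the parser above)."""
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)  # RD flag set
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!2H", qtype, 1)  # class IN


sample = [
    build_query("_http._tcp.pkg.pfsense.org", SRV),
    build_query("_http._tcp.pkg.pfsense.org", SRV),
    build_query("pkg.pfsense.org", A),
]
for (qname, qtype), n in count_queries(sample).most_common():
    print(n, qtype, qname)
```

The parser does not handle name compression, which is fine for the question section of ordinary queries but would not be enough for decoding answers.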



  • I don't have a pcap yet, but here is what I see from the router:

    [2.3-RELEASE][xxx@xxx]/root: dig _http._tcp.pkg.pfsense.org srv @208.67.222.222
    
    ; <<>> DiG 9.10.3-P4 <<>> _http._tcp.pkg.pfsense.org srv @208.67.222.222
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1335
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;_http._tcp.pkg.pfsense.org.    IN      SRV
    
    ;; ANSWER SECTION:
    _http._tcp.pkg.pfsense.org. 220 IN      SRV     10 10 80 pkg.pfsense.org.
    
    ;; Query time: 19 msec
    ;; SERVER: 208.67.222.222#53(208.67.222.222)
    ;; WHEN: Tue May 03 17:32:14 CDT 2016
    ;; MSG SIZE  rcvd: 90
    
    [2.3-RELEASE][xxx@xxx]/root: dig _http._tcp.pkg.pfsense.org srv
    
    ; <<>> DiG 9.10.3-P4 <<>> _http._tcp.pkg.pfsense.org srv
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17256
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;_http._tcp.pkg.pfsense.org.    IN      SRV
    
    ;; ANSWER SECTION:
    _http._tcp.pkg.pfsense.org. 300 IN      SRV     10 10 80 pkg.pfsense.org.
    
    ;; Query time: 55 msec
    ;; SERVER: 127.0.0.1#53(127.0.0.1)
    ;; WHEN: Tue May 03 17:32:21 CDT 2016
    ;; MSG SIZE  rcvd: 90
    

    My OpenDNS stats for the previous 24 hours indicate almost 10K requests for _http._tcp.pkg.pfsense.org again. I'll have a pcap ready by tomorrow.
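
The two answers above carry identical SRV data and differ only in the TTL field: 220 from OpenDNS, which is consistent with a cached record counting down from 300, and 300 from the local forwarder. A small sketch that pulls those fields out of a dig answer line for comparison (`SrvAnswer` and `parse_srv_answer` are names invented here):

```python
from typing import NamedTuple


class SrvAnswer(NamedTuple):
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(line: str) -> SrvAnswer:
    """Parse one SRV line from dig's ANSWER SECTION."""
    # Layout: name TTL class type priority weight port target
    name, ttl, rr_class, rr_type, priority, weight, port, target = line.split()
    if (rr_class, rr_type) != ("IN", "SRV"):
        raise ValueError(f"not an IN SRV record: {line!r}")
    return SrvAnswer(name, int(ttl), int(priority), int(weight), int(port), target)


via_opendns = parse_srv_answer(
    "_http._tcp.pkg.pfsense.org. 220 IN      SRV     10 10 80 pkg.pfsense.org."
)
via_local = parse_srv_answer(
    "_http._tcp.pkg.pfsense.org. 300 IN      SRV     10 10 80 pkg.pfsense.org."
)

print(via_opendns.ttl, via_local.ttl)
# Everything except the TTL matches between the two answers.
print(via_opendns._replace(ttl=0) == via_local._replace(ttl=0))
```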



  • You're getting the right reply, it certainly seems sane.



  • I left the packet capture routine running (via diag_packet_capture.php) overnight. I stopped it this morning and downloaded the pcap, but it was only 24 bytes and contained no useful information. However, there were over 11K DNS requests for that record again during the same period the capture was running.

    ![Screenshot 2016-05-04 14.15.47.png](/public/imported_attachments/1/Screenshot 2016-05-04 14.15.47.png)


  • LAYER 8 Global Moderator

    Well, if you're saying you did 11K DNS queries for that and your pcap was empty, then clearly you were not capturing on the right interface or the right port, or someone is mistaken about the number of queries that are happening ;)



  • What filter did you have on the capture? Sounds like you ended up filtering out pretty much everything.



  • I disabled dnsmasq and set up and enabled unbound, and the problem seems to have gone away. As much as I like bug hunting, I'm not going to dive into dnsmasq to figure out why… I guess we can consider this issue closed.



  • Ah, now that makes sense.
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579536

    dnsmasq not caching SRV records is "by design". Seems like a really poor design to me.

    Guess you must keep your dashboard up all the time? Or at least a lot.

    dnsmasq will query all configured DNS servers simultaneously, so in the case of OpenDNS at least assuming you have both their IPs in there, they'll show you 2 queries per 1 that's actually done, which was doubling it.
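
To put a number on that doubling: a quick sketch, assuming both OpenDNS addresses are configured as upstreams and every query is sent to each of them, as the post supposes. The 25,634 figure is the 24-hour count reported earlier in the thread; the function name is made up.

```python
def estimate_local_lookups(reported_by_upstream: int, upstream_servers: int) -> int:
    """Estimate lookups actually made locally when each one fans out to every upstream."""
    # Each local lookup is counted once per upstream server that receives it.
    return reported_by_upstream // upstream_servers


lookups = estimate_local_lookups(25634, upstream_servers=2)
print(lookups)  # 12817

# Average rate over the 24-hour window, in lookups per minute.
print(lookups / (24 * 60))
```

Even after halving, the count is far above what a 5-minute TTL would allow for a resolver that cached the record.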



  • @cmb:

    Ah, now that makes sense.
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579536

    dnsmasq not caching SRV records is "by design". Seems like a really poor design to me.

    Guess you must keep your dashboard up all the time? Or at least a lot.

    dnsmasq will query all configured DNS servers simultaneously, so in the case of OpenDNS at least assuming you have both their IPs in there, they'll show you 2 queries per 1 that's actually done, which was doubling it.

    I remember reading something about Linux where most distros would query all DNS servers and use the first response. Everyone talking about it was so proud of configuring 8+ DNS servers and getting the fastest response. They have a funny mindset in that camp.



  • Ouch, that's bad… What is the situation anyway with the DNS forwarders, isn't DNSMasq a bit redundant since it's not doing anything that Unbound can't do?



  • @kpa:

    What is the situation anyway with the DNS forwarders, isn't DNSMasq a bit redundant since it's not doing anything that Unbound can't do?

    No, that's not true. dnsmasq can do things that Unbound can't, and vice versa. There are also behavior differences between them, which is why we didn't force everyone to Unbound.


  • LAYER 8 Global Moderator

    One thing off the top of my head that dnsmasq can do and unbound cannot is localized responses. I'm not aware that unbound can do that. I'm pretty sure dnsmasq will send queries to all DNS servers listed and use the fastest response. I believe the way unbound does it is sequential?

    As cmb states, there are differences for sure. dnsmasq is by design a forwarder, while out of the box unbound is meant to be a resolver. It can be put in forwarder mode, but that is not where it shines, so having both available makes for better choices in pfSense. Now if they had an authoritative DNS server like bind, that would be the home run.

