DNS over TLS forwarding howto
-
"I've haven't really found browsing to be any slower (despite the additional lookups that may be occurring)."
Not sure where that little bit of FUD started that resolving is going to be slower, in any sense that would matter, unless you're on a really high-latency connection. The only time there would be any sort of delay is a cold lookup having to walk all the way down from the roots.
So you ask Google for something: what is that, 30 ms? What does it take to fully resolve something? Let's call it 300 ms worst case. But that is only on the COLD lookup. You're talking less than three tenths of a second.
After that you're at 1 ms, since the next lookups for www.domain.tld will come from cache.
Keep in mind that was a FULL cold lookup that had to talk to the roots. Once you have looked up the NS for the TLD in question, you no longer have to ask the roots every time. When you look up newdomain.tld, the resolver goes straight to the TLD servers for the NS of newdomain.tld.
Also keep in mind that talking to the authoritative servers means you always get the FULL TTL, not whatever Google had left on its timer. Maybe Google has 10 seconds left, so you have to do the query again in 10 seconds, while your own resolver got the full 1-hour TTL. So if you make two queries in 1 hour, the forwarder has to be asked twice while your resolver only has to do the one lookup.
Once something is cached it's all moot whether you're resolving or forwarding, since it's cached. So why would you care about a few ms difference when looking up something cold? You're telling me you can tell the difference in your page load between 30 ms and 300 ms? Really?
Keep in mind that 300 ms is most likely a high exaggeration, but you can always do some testing if you're curious: just clear your unbound cache, then do a simple dig and see how long it takes to resolve.
Unless your ISP only allows you to talk to its NS, or you're on really high-latency internet (say, satellite), resolving is going to be better all the way around: privacy, security, and performance is a non-issue, to be honest. Once something has been looked up, your client caches it at the OS, the browser caches it, your local DNS caches it. You're not cold-resolving all the way down from the roots every single time.
Shoot, resolving can even be faster. Once you know the NS for domainX, looking up any records in domainX just means talking directly to that NS, which could even be closer to you in RTT or faster to respond than Google. Here, I just did a query direct to Google for forum.pfsense.org and it took 64 ms, while a query direct to the NS for pfsense.org only took 52 ms, and got back more data. Why would anyone forward, especially if privacy is a concern? You're handing one party all your info versus spreading it around between the roots, which are a globally controlled setup, and just the authoritative servers you actually want to reach. You really think they are tracking that billy's IP is going to sites xyz and selling that info?
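The TTL argument above can be sanity-checked with a toy model (illustrative numbers only; this is not unbound code, just a sketch of the cache-expiry logic):

```python
# Toy model of the cache-expiry argument above: a forwarder hands back only
# the *remaining* TTL on its own cached copy, while resolving directly against
# the authoritative server returns the record's full TTL. Numbers illustrative.

FULL_TTL = 3600          # authoritative TTL: 1 hour
UPSTREAM_REMAINING = 10  # what a busy public forwarder might have left

def upstream_queries(ttl_received, query_times):
    """Count how often the local cache must ask upstream for these lookups."""
    expires_at = None
    count = 0
    for t in query_times:
        if expires_at is None or t >= expires_at:
            count += 1                     # cache miss: query upstream
            expires_at = t + ttl_received  # cache the answer
    return count

queries = [0, 1800]  # two client lookups half an hour apart

print(upstream_queries(UPSTREAM_REMAINING, queries))  # 2: short TTL expired between lookups
print(upstream_queries(FULL_TTL, queries))            # 1: full TTL still covers both
```

With two lookups an hour apart and a nearly expired upstream TTL, the forwarder path has to go out twice while the resolver path only goes out once, which is the point being made.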
edit:
The resolver also has a setting for prefetch, which can remove the full cold lookups, or even the delay of talking to the authoritative NS. The resolver will re-resolve www.domain.tld if someone asks for it when there is 10% or less left on the TTL. So let's say the TTL is 24 hours: if someone asks for that record with less than 2.4 hours left on the TTL, the resolver refreshes it by resolving it again in the background, and the TTL is back to 24 hours. So when the next query for that record comes by, it's waiting in cache for a 1 ms response; no need to even ask Google for it at 30 ms, etc.
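The prefetch trigger described above can be sketched like this (the 10%-of-TTL window matches the behavior described in the post; the function name is illustrative, not unbound's API):

```python
# Sketch of the prefetch behavior described above: the resolver refreshes a
# cached record in the background when a query arrives with 10% or less of
# the TTL remaining.

def should_prefetch(ttl, remaining):
    """True when a lookup should trigger a background refresh."""
    return remaining <= ttl * 0.10

TTL = 24 * 3600          # a 24-hour TTL
threshold = TTL * 0.10   # 8640 s, i.e. 2.4 hours

print(threshold / 3600)                 # 2.4: hours left that open the refresh window
print(should_prefetch(TTL, 2 * 3600))   # True: only 2 h left, refresh in background
print(should_prefetch(TTL, 12 * 3600))  # False: plenty of TTL remaining
```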
-
I have tried using forwarding (to OpenDNS) and using unbound with no forwarding. I have found a negligible difference in speed (I didn't actually measure it, I just didn't notice while surfing). However, after a reboot it can be very slow using unbound (John, I'm not sure if that is what you mean by a "cold" lookup); it then speeds up.
I did play with enabling "qname-minimisation-strict" to see how it worked. I didn't do extensive testing, but I noticed it broke Skype while using the iOS app.
-
"I've haven't really found browsing to be any slower (despite the additional lookups that may be occurring)."
Not sure where that little bit of FUD started that resolving is going to be slower.. In any sense that would matter.. Unless your on a some really high latency connection. The only time that there would be any sort of delay would be a cold lookup having to walk all the way down.
So you asking google for something.. What is that 30ms.. What does it take to fully resolve something? Lets call it 300ms worst case – lets just say.. But that is only on the COLD lookup.. Your talking less then 3 tenths of a second..
After that then your 1ms since the next lookups for www.domain.tld will come from cache.
Now keep in mind that was FULL cold lookup.. Had to talk to roots.. But you have looked up the NS for the tld in question, you no longer have to go ask roots every time. you look up newdomain.tld.. It will go straight their for the NS of newdomain.tld
Keep in mind that talking to the authoritative servers means you will always get the FULL TTL... not whatever google had left on its timer.. So Maybe you got 10 seconds left, and then have to do a query again in 10 seconds. While your resolver got the full 1 hour ttl, etc. So if you got two queries in 1 hour the forwarder would have to get asked twice while you resolver would have only had to do the 1 lookup.
Once something is cached it all becomes moot if your resolving or forwarding. Since its cached.. So why would you care about a few ms difference when looking up something cold.. Your telling me you can tell the difference in your page load be it it took 30 ms or 300ms.. Really?
Keep in mind that 300ms is most likely really high exaggeration... But you could always do some testing if you curious... Just clear your unbound cache and then do a simple dig.. How long does it take to resolve..
Unless your isp is only allowing you to talk to its ns, or your on really high latency internet - say sat or something. Resolving is going to be better all the way around, privacy, security and performance is a non issue to be honest.. Since once something has been looked up.. Your client caches it at the os, the browser caches it, your local dns caches it.. Your not cold resolving every single time all the way down from roots, etc.
Shoot resolving can even be faster.. Once you know the NS for domainX, looking up any records in domainX now just mean talking directly to that NS - which could even be closer to you in RTT or faster to respond than google.. Here just did a query direct to google for forum.pfsense.org took 64ms. While query direct to the NS for pfsense.org only took 52 ms.. And got back more data.. Why anyone would forward - especially if privacy is a concern.. And just giving someone all your info vs spreading it around to just the authoritative servers your wanting to go too and the roots.. That are global controlled setup.. You really think they are tracking billy's IP is going to sites xyz and selling that info?
edit:
Resolver also has a setting or prefetch.. Which can remove the full Colds or even the talking to the authoritative NS delay out... Since the resolver will resolve www.domain.tld if someone asks for it when there is 10% or less left on the TTL.. So lets say the TTL is 24 hours.. If someone asks for that record and there is less than 2.4 hours left on the TTL then the resolver will refresh that info by resolving it again and now the TTL is back to 24 hours. In the back ground, so next query comes by for that record. Its there waiting in the 1ms response cache. No need to even go ask google for it at 30ms, etc.Thanks John for taking the time to craft this very helpful response. After reading what you wrote and doing a bit more research on my own (looking through past threads on Unbound and performance) I understand the tradeoff's far better now. This thread was also very helpful:
https://forum.pfsense.org/index.php?topic=112160.0
I think a lot of the performance concerns can be traced back to the fact that a good number of domains today have pretty short TTLs (e.g. www.cnn.com in the thread linked above). So on a small network (e.g. a home network), the TTL will expire after a few minutes, necessitating another lookup the next time someone makes a DNS request for that same website. That lookup may or may not be faster than going to, say, Google's DNS servers, which likely already have the record in question cached. I think that's where the main privacy/performance tradeoff comes in. In my opinion, adding up to a few tenths of a second of additional latency is worth it instead of sending all queries to one set of DNS servers on the internet. If speed is of utmost concern, then using Unbound with forwarding enabled (to a fast DNS server with a large cache) is probably an option to consider.
Now, it's possible to tweak the TTL values in Unbound to improve performance. Frankly, I wouldn't want to touch the minimum TTL value (the default is 0), since you then run the risk of having expired/bad records in the cache. However, what do you think about the max TTL value? I see that it is set to 86400 (1 day) by default. By running dig against a few sites on the net, I saw that the expiry on records can be higher. Would it be a bad idea to increase this value to two days (172800) to make sure that cached records don't expire prematurely?
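For reference, a sketch of what that change would look like as unbound custom options (cache-min-ttl and cache-max-ttl are real unbound settings; 172800 is just the two-day figure from the question):

```
server:
  cache-min-ttl: 0        # default: honor whatever TTL the zone publishes
  cache-max-ttl: 172800   # cap raised from 86400 (1 day) to 2 days
```

Note that cache-max-ttl only caps TTLs; it never extends a record beyond the TTL the zone publishes, so raising it only affects records whose published TTL is already above one day.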
---
Anyway, I do not want to take this thread too far off-topic. I found some additional information regarding using Unbound over TLS:
https://calomel.org/unbound_dns.html
In general, I would agree with what kejianshi posted. It seems to me the additional overhead would add enough load on servers to slow down what was intended to be a fast and lightweight protocol. That being said, hardware and connection speeds have continued to improve, so it would be cool if this could be implemented at some point. For now, I think using the Unbound resolver with forwarding disabled (and the qname-minimisation option discussed in this thread enabled) is an easy way to improve privacy by (1) limiting the amount of information that is shared during the recursive DNS lookup and (2) spreading the DNS requests across multiple servers from multiple orgs vs. just one.
Thanks again to all for this great discussion.
-
Unbound in particular makes a pretty poor client for DNS over TLS at the moment. If you look at the implementation progress, it doesn't support reusing sessions or out-of-order queries. As a server these are no issue, but as a client it means creating and tearing down a connection with each query. This matters because unbound is effectively the DNS client in this example.
FreeBSD seems really good at keeping up with unbound releases, so hopefully in the not-too-distant future this won't be the case.
-
I'm curious if anyone has an opinion on whether it might make sense to increase the max TTL value in the resolver from 1 to 2 days to potentially improve cache performance. Or is it best to keep it at 1 day or lower? Thanks again.
-
I'm curious if anyone has an opinion on whether it might make sense to increase the max TTL value in the resolver from 1 to 2 days to potentially improve cache performance. Or is it best to keep it at 1 day or lower? Thanks again.
I don't expect much to change with that. You should just enable prefetch support: when the 1-day period is close to expiring, unbound will refresh the cache before you request the record again.
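As a custom-options sketch, that maps to a single real unbound setting (the pfSense GUI also exposes it as a checkbox):

```
server:
  prefetch: yes   # refresh frequently used records before their TTL expires
```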
-
Guys, what really helps in closing the gap with Google DNS's cache hit rates is the new unbound feature called serve-expired. It is now in the pfSense GUI as one of the advanced options, so you need only tick the box.
What it does: when a DNS record is cached, even after the TTL hits 0 (expired), a new lookup from the LAN is served the cached record, while at the same time the cache is updated in the background so the next lookup after is up to date.
So you can enable the privacy stuff and use your own forwarder or do direct lookups from pfSense unbound, and this option mitigates most cold-cache issues.
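For those editing custom options directly rather than using the GUI checkbox, this is a sketch of the underlying setting (serve-expired is a real unbound option):

```
server:
  serve-expired: yes   # serve stale answers immediately, refresh in the background
```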
-
Guys, what really helps in closing the gap with Google DNS's cache hit rates is the new unbound feature called serve-expired. It is now in the pfSense GUI as one of the advanced options, so you need only tick the box.
What it does: when a DNS record is cached, even after the TTL hits 0 (expired), a new lookup from the LAN is served the cached record, while at the same time the cache is updated in the background so the next lookup after is up to date.
So you can enable the privacy stuff and use your own forwarder or do direct lookups from pfSense unbound, and this option mitigates most cold-cache issues.
So I saw this option a couple of weeks ago and started wondering if there would be any adverse impact to using it. I suppose there is the chance that the first lookup returns an incorrect result and an outdated page (or no page at all). A quick refresh/reload in the browser should fix that, though (since the cache has been refreshed in the background). Are there any security considerations one should be aware of when enabling this feature? I actually decided to try it (i.e. enable serve-expired), and so far everything is working fine. I can see this option potentially being beneficial on smaller networks with fewer clients.
This leads me to a more general question: why do a lot of major sites have short TTLs these days? Is this being done for load-balancing reasons?
Thanks in advance.
-
There is of course a chance you end up going to an invalid IP, but in my experience the chance of that happening is extremely tiny. The providers that set silly low TTLs lasting just a few seconds do so in order to redirect quickly in the event of an outage and for load-balancing purposes. I cannot remember this causing me a problem in the several weeks I have been using it.
-
How do I know if it is working?
My dig results still show port 53:
dig google.com

; <<>> DiG 9.11.2-P1 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53396
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 9

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		29	IN	A	216.58.196.14

;; AUTHORITY SECTION:
google.com.		38383	IN	NS	ns2.google.com.
google.com.		38383	IN	NS	ns3.google.com.
google.com.		38383	IN	NS	ns1.google.com.
google.com.		38383	IN	NS	ns4.google.com.

;; ADDITIONAL SECTION:
ns2.google.com.		40481	IN	A	216.239.34.10
ns2.google.com.		239457	IN	AAAA	2001:4860:4802:34::a
ns3.google.com.		62066	IN	A	216.239.36.10
ns3.google.com.		241432	IN	AAAA	2001:4860:4802:36::a
ns4.google.com.		48518	IN	A	216.239.38.10
ns4.google.com.		239690	IN	AAAA	2001:4860:4802:38::a
ns1.google.com.		62057	IN	A	216.239.32.10
ns1.google.com.		240075	IN	AAAA	2001:4860:4802:32::a

;; Query time: 76 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Mon Apr 02 13:36:49 +08 2018
;; MSG SIZE  rcvd: 303
-
server:
  ssl-upstream: yes
  do-tcp: yes
forward-zone:
  name: "."
  forward-addr: {ipv4address}@853
  forward-addr: {ipv6address}@853
This configuration causes lookup delays for me when a Domain Override is configured, perhaps because it affects how unbound tries to connect to the override server.
I don't experience the delays with this configuration:
forward-zone:
  name: "."
  forward-ssl-upstream: yes
  forward-addr: 9.9.9.9@853
  forward-addr: 2620:fe::fe@853
-
Is it possible to use both DNS over TLS AND pfBlockerNG DNSBL in custom settings? Currently I have these lines under custom settings:

server:
include: /var/unbound/pfb_dnsbl.*conf

I tried adding the TLS code under those lines but it didn't work. :(
-
Is it possible to use both DNS over TLS AND pfBlockerNG DNSBL in custom settings? Currently I have these lines under custom settings:

server:
include: /var/unbound/pfb_dnsbl.*conf

I tried adding the TLS code under those lines but it didn't work. :(
Working fine here with DNSBL configured.
"it didn't work" doesn't really give us a lot to help with/from. Are you hitting an error, is there anything useful to go off of in the DNS Resolver logs?
-
I replicated your config (removed the additional "server:" line) and it now works, but it took about 20 seconds until unbound started responding after applying the config. Thanks!
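For anyone else hitting this, a sketch of what the combined custom options might look like after that fix (untested as written; it just merges the DNSBL include and the forward-zone from the working snippets earlier in the thread, with a single server: clause):

```
server:
include: /var/unbound/pfb_dnsbl.*conf

forward-zone:
  name: "."
  forward-ssl-upstream: yes
  forward-addr: 9.9.9.9@853
  forward-addr: 2620:fe::fe@853
```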
-
There is an official Netgate guide on this feature now, following the launch of Cloudflare's service. :)
https://www.netgate.com/blog/dns-over-tls-with-pfsense.html
-
There are some improvements in the guide provided by Netgate compared to the original post. Rather than update my post with these changes, I just edited in a reference to the blog post.
I've upgraded to 2.4.4 to try out the changes, both for forwarding DNS over TLS queries and for providing it to internal hosts. So far these seem to work pretty well now that the Cloudflare/unbound compatibility issue is resolved.
-
Hi,
For getting DNS over TLS working, do you have to change the resolver to listen on 853, or would you leave that alone?
Also, would you change the firewall port-redirect rule on LAN to 853 when using pfSense as the DNS server, or not?
When I use dig google.com I get 127.0.0.1 on port 53. Is this what you would expect?

; <<>> DiG 9.11.2-P1 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5629
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		211	IN	A	x.x.x.x
google.com.		211	IN	A	x.x.x.x
google.com.		211	IN	A	x.x.x.x
google.com.		211	IN	A	x.x.x.x
google.com.		211	IN	A	x.x.x.x
google.com.		211	IN	A	x.x.x.x

;; Query time: 111 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Apr 11 16:39:42 CDT 2018
;; MSG SIZE  rcvd: 135
-
I'm noticing significant slow-down using this (i.e. DNS over TLS using unbound in pfSense) compared to running a secondary DoH or DNSCrypt proxy on local machines. For example, dnscrypt-proxy v2.0.8 now supports DoH in addition to DNSCrypt. I ran some tests from my MacBook Pro, a GhostBSD machine, an iPhone X and a Windows 10 machine; first using pfSense and then using a local proxy pointed at the same upstream server (1.0.0.1 or 9.9.9.9).
All tests show that resolving DNS via pfSense (box and specs in my sig) is at least 2 to 3 times slower than running a DoH or dnscrypt proxy directly on the same local machines, despite them being set to forward to the same external DNS servers as pfSense.
From my results, using my pfSense box for DNS (192.168.1.1) is very slow. As soon as I enable Stubby on macOS (TLS), Simple DNSCrypt on Windows (dnscrypt-proxy using DoH), or AdGuard Pro on iOS (DNSCrypt), the time to resolve is cut in half. It's still fairly quick either way, but there is a definite, noticeable difference in real-world usage. Browsing is instant with Stubby/Simple DNSCrypt/AdGuard, but takes an extra second or so after hitting enter before the site is found and loaded when running DNS via pfSense.
Initially I thought it could be a protocol difference, i.e. TLS being slower than DoH or DNSCrypt. However, Stubby on macOS uses TLS also, and it's still twice as fast as pfSense to the same DNS server (1.0.0.1 or 9.9.9.9) and for the same lookups. The pfSense hardware is easily beefy enough and doesn't break 3% CPU usage under load, so it can't be that…
So, any ideas?
-
Rainmaker, unbound has one real major weakness when used as a DNS over TLS forwarder: it does not reuse TCP sessions, so each query is a full TLS handshake. I'm willing to bet this is entirely responsible for the increased query time you are observing. Stubby supports out-of-order queries and TCP session reuse.
I know this is an item unbound has patches in progress for, but it doesn't look like a trivial change.
-
Rainmaker, unbound has one real major weakness when used as a DNS over TLS forwarder: it does not reuse TCP sessions, so each query is a full TLS handshake. I'm willing to bet this is entirely responsible for the increased query time you are observing. Stubby supports out-of-order queries and TCP session reuse.
I know this is an item unbound has patches in progress for, but it doesn't look like a trivial change.
Ah, yes. I did read that on the dnsprivacy.org website a few weeks ago, but I'd forgotten all about it. That would indeed explain it. Not a big deal for now; I'll keep the local proxies running and use pfSense as a 'backup' for roaming devices, visitors, etc. who may not be otherwise protected. Thanks so much for taking the time to reply.