Newbie - Unbound sometimes is really slow: up to 20sec
-
Hi,
my name is Nicolò and I'm new on the pfsense/freebsd platform;
I contact you because I have not had feedback on the IRC and BBcan177 wrote me on youtube to ask for help on the forum.
I used to be a OpenWRT/LEDE user since 2008 but now I moved to a 1Gbps FTTH connection at home
and a standard router can't handle the bandwidth. (At least with the std linux kernel without custom HW Nat or Qualcom fast path stuff)I tried pfsense on my old gear (i5 750, 4GB DDR3 and SSD, 2 Intel NICs and 2 by Realtek :-X)
All started fine, I was able to setup my LAGG, vlans and firewall roules to isolate my server subnet of my home subnet.
(I had a couple of problem due to the filtering way of how bridges work on pfsense, but that I have found how to solve it, now roules apply to the bridge and not on single interfaces ;))Anyway, for start I used the dns forwarder (Dnsmasq, which I used to know). All was working fine but than i dicovered the "pfblocker DNSBL" package and I moved to use the DNS resolver (unbound).
And problems started to coming out! Sometimes all works well other times i have to wait up to 15/20 seconds to be able to connect to a server.
I tried to do a dns benchmark as shown by the guys of "Lawrence Systems / PC Pickup" the result was that my pfsense box is 450/495% slower than my ISP DNS server.
and 180% slower than Public DNS servers.
Unbound is setup to use the forward option, and my DNS server are the top 2 from the namebench tests. Followed by the primary Google DNS.
I have to add that my ISP use a "fake" domain for inner traffic like administration panel and VOIP traffic so I have added a roule that if something match those domains are forwarded to a specific server.
I hope I was clear, in case you want more information ask.
have a good day,
hitech95. -
Hi !
You are aware of the fact that on a "i5" core, the unbound resolver will never ever need more a a fraction of a millisecond to pass along the 'simple' DNS request ?
Locally, and the delay of your "WAN" connection will add some more 'ms' to that.And then you made some decisions that might explain the delays.
You disabled the "world resolver" (takes the 12+ top level DNS resolver for planet earth" as a starting point to drill down to reply on your request. Nice benefit : the fastest reply will count !
No, you choose the Forwarder …. which means that "unbound", the local resolver, does nothing more as passing the DNS request one logical hop further along the path. This next DNS resolver will do the “top” resolving for you (and throwing all your request in stats so they know exactly who you are, what you do, what you want, what you see, etc).I propose that you use the default pfSense settings == use the Resolver mode – and disable all package (not that I doubt that about pfblocker – but questioning the entire DNSBL list (== a boat load of servers that lists all kind of suspected addresses and some of them are really slow to answer – pfblocker will cut-off after …. 20 seconds or so ;) ) and that you test again.
Btw : https://serverfault.com/questions/91063/how-do-i-benchmark-performance-of-external-dns-lookups
-
Hi !
You are aware of the fact that on a "i5" core, the unbound resolver will never ever need more a a fraction of a millisecond to pass along the 'simple' DNS request ?
Locally, and the delay of your "WAN" connection will add some more 'ms' to that.And then you made some decisions that might explain the delays.
You disabled the "world resolver" (takes the 12+ top level DNS resolver for planet earth" as a starting point to drill down to reply on your request. Nice benefit : the fastest reply will count !
No, you choose the Forwarder …. which means that "unbound", the local resolver, does nothing more as passing the DNS request one logical hop further along the path. This next DNS resolver will do the “top” resolving for you (and throwing all your request in stats so they know exactly who you are, what you do, what you want, what you see, etc).I propose that you use the default pfSense settings == use the Resolver mode – and disable all package (not that I doubt that about pfblocker – but questioning the entire DNSBL list (== a boat load of servers that lists all kind of suspected addresses and some of them are really slow to answer – pfblocker will cut-off after …. 20 seconds or so ;) ) and that you test again.
Btw : https://serverfault.com/questions/91063/how-do-i-benchmark-performance-of-external-dns-lookups
- I know that the problem is not the hardware.
- The wan cant be due the fact that using the dnsmasq instead the unbound all works fine
- If I disable the forwarder the time goes up, and the problem is more frequent than now.
- pfblocker is using the same list I was using on OpenWRT/LEDE with DNSmasq.
- I have already tested withoud the pfblocker and with the default settings. The problem is the randomness of the problem. It can works for hours than at same point all locks and than start working again…
I used to have a MIPS 500Mhz single core router with 64MB of ram and the DNS request was a lot more faster than this... (with the same servers) so it is why i'm asking for help...
Maybe there is something wrong with my setup! -
… than at same point all locks and than start working again...
All ?
"All" for all devices hooked up, including pfSense itself ? How long does this last ?Ok, so, I stop focusing DNS ;)
Check your entire WAN connection.My
connection works ok, - it happens that latency goes up - even triples - my ISP has had some outage the last 17 years.
I never had the impression that I had to invest in pfSense "because it was acting slowly". -
… than at same point all locks and than start working again...
All ?
"All" for all devices hooked up, including pfSense itself ? How long does this last ?Ok, so, I stop focusing DNS ;)
Check your entire WAN connection.My
connection works ok, - it happens that latency goes up - even triples - my ISP has had some outage the last 17 years.
I never had the impression that I had to invest in pfSense "because it was acting slowly".I don't know, connection established continue working but opening a new wesite take really long. Don't kno wif all devices at the same time locks up, this is only for that 20sec period i wrote before.
I'll do more tests if you want.
I'm not sure about the wan connection (it is a GPON 1Gbps link), I tested for a week with dns masq and this never happened… so I'm not sure if it is coincidence or if is Unbound.
For example today I don't had any problem...
-
Have a peek in your DNS resolver logs to see if the resolver is stuck in some sort of a restart loop. I've seen on my setup that dual stack IPv4 and IPv6 (I receive a /56 via DHCP from my provider) causes unbound to enter a restart loop if whatever script triggers it to restart triggers a restart. For example, if a new DHCP host comes online and is registered, the restart loop usually happens.
If your issue is in fact the same, I've seen it posted a few times on here without much recourse other than to disable IPv6.
-
This is just happened:
C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>ping 8.8.8.8 Esecuzione di Ping 8.8.8.8 con 32 byte di dati: Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Statistiche Ping per 8.8.8.8: Pacchetti: Trasmessi = 3, Ricevuti = 3, Persi = 0 (0% persi), Tempo approssimativo percorsi andata/ritorno in millisecondi: Minimo = 15ms, Massimo = 15ms, Medio = 15ms Control-C ^C C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>ping google.com Esecuzione di Ping google.com [216.58.205.142] con 32 byte di dati: Risposta da 216.58.205.142: byte=32 durata=9ms TTL=54 Risposta da 216.58.205.142: byte=32 durata=10ms TTL=54 Statistiche Ping per 216.58.205.142: Pacchetti: Trasmessi = 2, Ricevuti = 2, Persi = 0 (0% persi), Tempo approssimativo percorsi andata/ritorno in millisecondi: Minimo = 9ms, Massimo = 10ms, Medio = 9ms Control-C ^C C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>ping 8.8.8.8 Esecuzione di Ping 8.8.8.8 con 32 byte di dati: Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Risposta da 8.8.8.8: byte=32 durata=15ms TTL=56 Statistiche Ping per 8.8.8.8: Pacchetti: Trasmessi = 3, Ricevuti = 3, Persi = 0 (0% persi), Tempo approssimativo percorsi andata/ritorno in millisecondi: Minimo = 15ms, Massimo = 15ms, Medio = 15ms Control-C ^C C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>ping google.com Esecuzione di Ping google.com [216.58.205.142] con 32 byte di dati: Risposta da 216.58.205.142: byte=32 durata=9ms TTL=54 Risposta da 216.58.205.142: byte=32 durata=10ms TTL=54 Statistiche Ping per 216.58.205.142: Pacchetti: Trasmessi = 2, Ricevuti = 2, Persi = 0 (0% persi), Tempo approssimativo percorsi andata/ritorno in millisecondi: Minimo = 9ms, Massimo = 10ms, Medio = 9ms Control-C ^C C:\Users\Hitech95>ping google.it Impossibile trovare l'host google.it. Verificare che il nome sia corretto e riprovare. C:\Users\Hitech95>
Sorry my OS is in italian language.
As you can see sometims something get stuck resolving some DNS.
And than all start working again.Thw WAN is a PPPoE connection and according to the infos on the dashboard the wan is UP.
This is the last 50 entry in he log:
Jan 27 18:15:01 unbound 56015:0 info: 0.131072 0.262144 27 Jan 27 18:15:01 unbound 56015:0 info: 0.262144 0.524288 8 Jan 27 18:15:01 unbound 56015:0 info: 0.524288 1.000000 1 Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 1: 3 queries, 0 answers from cache, 3 recursions, 0 prefetch, 0 rejected by ip ratelimiting Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0 Jan 27 18:15:01 unbound 56015:0 info: average recursion processing time 0.074612 sec Jan 27 18:15:01 unbound 56015:0 info: histogram of recursion processing times Jan 27 18:15:01 unbound 56015:0 info: [25%]=0 median[50%]=0 [75%]=0 Jan 27 18:15:01 unbound 56015:0 info: lower(secs) upper(secs) recursions Jan 27 18:15:01 unbound 56015:0 info: 0.016384 0.032768 1 Jan 27 18:15:01 unbound 56015:0 info: 0.032768 0.065536 1 Jan 27 18:15:01 unbound 56015:0 info: 0.131072 0.262144 1 Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 2: 148 queries, 24 answers from cache, 124 recursions, 0 prefetch, 0 rejected by ip ratelimiting Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 2: requestlist max 3 avg 0.379032 exceeded 0 jostled 0 Jan 27 18:15:01 unbound 56015:0 info: average recursion processing time 0.081790 sec Jan 27 18:15:01 unbound 56015:0 info: histogram of recursion processing times Jan 27 18:15:01 unbound 56015:0 info: [25%]=0.0170394 median[50%]=0.0415902 [75%]=0.131072 Jan 27 18:15:01 unbound 56015:0 info: lower(secs) upper(secs) recursions Jan 27 18:15:01 unbound 56015:0 info: 0.000000 0.000001 18 Jan 27 18:15:01 unbound 56015:0 info: 0.000128 0.000256 1 Jan 27 18:15:01 unbound 56015:0 info: 0.000512 0.001024 1 Jan 27 18:15:01 unbound 56015:0 info: 0.002048 0.004096 1 Jan 27 18:15:01 unbound 56015:0 info: 0.004096 0.008192 5 Jan 27 18:15:01 unbound 56015:0 info: 0.008192 0.016384 4 Jan 27 18:15:01 unbound 56015:0 info: 0.016384 0.032768 25 Jan 27 18:15:01 unbound 56015:0 info: 0.032768 0.065536 26 Jan 27 18:15:01 unbound 56015:0 info: 0.065536 0.131072 12 Jan 27 18:15:01 unbound 56015:0 info: 0.131072 0.262144 20 Jan 27 18:15:01 unbound 56015:0 info: 0.262144 0.524288 11 Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 3: 87 queries, 12 answers from cache, 75 recursions, 0 prefetch, 0 rejected by ip ratelimiting Jan 27 18:15:01 unbound 56015:0 info: server stats for thread 3: requestlist max 3 avg 0.333333 exceeded 0 jostled 0 Jan 27 18:15:01 unbound 56015:0 info: average recursion processing time 0.074287 sec Jan 27 18:15:01 unbound 56015:0 info: histogram of recursion processing times Jan 27 18:15:01 unbound 56015:0 info: [25%]=0.0197973 median[50%]=0.0401408 [75%]=0.0868352 Jan 27 18:15:01 unbound 56015:0 info: lower(secs) upper(secs) recursions Jan 27 18:15:01 unbound 56015:0 info: 0.000000 0.000001 11 Jan 27 18:15:01 unbound 56015:0 info: 0.000256 0.000512 1 Jan 27 18:15:01 unbound 56015:0 info: 0.001024 0.002048 1 Jan 27 18:15:01 unbound 56015:0 info: 0.004096 0.008192 2 Jan 27 18:15:01 unbound 56015:0 info: 0.016384 0.032768 18 Jan 27 18:15:01 unbound 56015:0 info: 0.032768 0.065536 20 Jan 27 18:15:01 unbound 56015:0 info: 0.065536 0.131072 10 Jan 27 18:15:01 unbound 56015:0 info: 0.131072 0.262144 8 Jan 27 18:15:01 unbound 56015:0 info: 0.262144 0.524288 2 Jan 27 18:15:01 unbound 56015:0 info: 0.524288 1.000000 2 Jan 27 18:15:01 unbound 56015:0 notice: Restart of unbound 1.6.6. Jan 27 18:15:01 unbound 56015:0 notice: init module 0: validator Jan 27 18:15:01 unbound 56015:0 notice: init module 1: iterator Jan 27 18:15:01 unbound 56015:0 info: start of service (unbound 1.6.6). Jan 27 18:15:10 unbound 56015:1 info: failed to prime trust anchor -- DNSKEY rrset is not secure . DNSKEY IN