Unbound cant resolve domains - which exists correctly
-
Good catch @Grimson. I noticed the REFUSED, but that is a very valid point and maybe explains it.
-
Okay, now I messed something up with this reply... I'll try to recover it.
@johnpoz said in Unbound cant resolve domains - which exists correctly:
What does the resolver log say about it?
Nothing. I have an automatic reconnect of my PPPoE connection scheduled for 4 am. The last log entries are from that time; there is nothing newer.
@johnpoz said in Unbound cant resolve domains - which exists correctly:
What does unbound say it would do to look it up?
Okay, this is funny. The output:
[2.4.4-RELEASE][admin@firewall.northern-lights]/root: unbound-control -c /var/unbound/unbound.conf lookup haus-automatisierung.com
The following name servers are used for lookup of haus-automatisierung.com.
;rrset 67565 2 0 2 0
haus-automatisierung.com. 153965 IN NS ns5.kasserver.com.
haus-automatisierung.com. 153965 IN NS ns6.kasserver.com.
;rrset 7164 1 0 8 0
ns6.kasserver.com. 7164 IN A 85.13.159.101
;rrset 7164 1 0 8 0
ns5.kasserver.com. 7164 IN A 85.13.128.3
Delegation with 2 names, of which 2 can be examined to query further addresses.
It provides 2 IP addresses.
85.13.128.3 expired, rto 48657592 msec, tA 3 tAAAA 2 tother 2.
85.13.159.101 rto 120000 msec, ttl 864, ping 0 var 94 rtt 376, tA 3, tAAAA 3, tother 3, probedelay 85, EDNS 0 assumed.
This looks quite good, doesn't it?
@johnpoz said in Unbound cant resolve domains - which exists correctly:
Looks like when you queried unbound you got REFUSED
status: REFUSED
; SERVER: 10...#53(10...)
Why are you hiding rfc1918?
The *** are my doing. It feels a little weird to post internal IP addresses publicly.
-
They are RFC1918... It's like saying you live on 123 Street or 456 Street..
Everyone has rfc1918 -- you're not giving anything away... It would be like saying you live on the earth ;)
-
Sure... I don't know the reason for this discomfort ;)
Yesterday I tested the dig call without the additional parameters, and the result is a little confusing:
dig haus-automatisierung.com
; <<>> DiG 9.13.5 <<>> haus-automatisierung.com
;; global options: +cmd
;; connection timed out; no servers could be reached
If I try it with another domain:
dig forum.netgate.com
; <<>> DiG 9.13.5 <<>> forum.netgate.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51682
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;forum.netgate.com. IN A
;; ANSWER SECTION:
forum.netgate.com. 3217 IN A 208.123.73.199
;; Query time: 1 msec
;; SERVER: 10.0.1.200#53(10.0.1.200)
;; WHEN: Mon Dec 31 12:16:07 CET 2018
;; MSG SIZE rcvd: 62
Unfortunately my knowledge of DNS internals is limited, so I can't understand why I get a timeout on the first request. If the parameters +trace +all were the problem, then it should work without them, right?
In the web UI I can't find anything recorded for the time when I ran dig. The problem also occurs on Windows PCs and mobile phones.
Could the missing EDNS compliance be a reason? But if pfSense knows the resolution locally, why doesn't it pass it on to the clients? :(
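For comparing the answers side by side, here is a minimal sketch that reduces dig output to just the response status, so the same name can be checked against each resolver in turn. The helper name dig_status is made up for illustration; it only reads text on stdin and assumes the dig output format shown in this thread.

```shell
#!/bin/sh
# dig_status: hypothetical helper, reads dig output on stdin and prints
# the response status (NOERROR, SERVFAIL, REFUSED, ...) or TIMEOUT.
dig_status() {
    awk '
        /connection timed out/ { print "TIMEOUT"; found = 1; exit }
        /status:/ {
            for (i = 1; i < NF; i++) {
                if ($i == "status:") {
                    s = $(i + 1)
                    sub(/,$/, "", s)   # drop the trailing comma
                    print s
                    found = 1
                    exit
                }
            }
        }
        END { if (!found) print "UNKNOWN" }'
}

# Example use (addresses taken from the outputs in this thread):
#   dig @10.0.1.200 haus-automatisierung.com | dig_status   # from a LAN client
#   dig @127.0.0.1 haus-automatisierung.com | dig_status    # on the pfSense shell
```

Naming the server with @ takes the client's own resolver list out of the picture, which makes a timeout easier to tell apart from an error answer.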
-
@logic said in Unbound cant resolve domains - which exists correctly:
;; connection timed out; no servers could be reached
Are you sure unbound isn't just restarting, so that it's down when you try the first domain and back up when you try domain 2?
Look in your unbound log, do you have it registering dhcp - it could be restarting quite a bit..
Oh my gawd - the first subnet in the 10 range.. You're hacked now, buddy! ;) If I only knew what mask you were using then you would be in real trouble ;) ROFL...
-
@johnpoz said in Unbound cant resolve domains - which exists correctly:
@logic said in Unbound cant resolve domains - which exists correctly:
;; connection timed out; no servers could be reached
Are you sure unbound isn't just restarting, so that it's down when you try the first domain and back up when you try domain 2?
I can't find any indication of a restart. I can query other domains in a second shell while the first shell is still waiting for the timeout. And unbound shows up in top.
Now I tried dig directly on the pfSense box and got this output:
[2.4.4-RELEASE][admin@firewall.northern-lights]/var/log: dig haus-automatisierung.com
; <<>> DiG 9.12.2-P1 <<>> haus-automatisierung.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8891
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;haus-automatisierung.com. IN A
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Dec 31 12:48:42 CET 2018
;; MSG SIZE rcvd: 53
Look in your unbound log, do you have it registering dhcp - it could be restarting quite a bit..
Here are the last lines from resolver.log. They are roughly 2 hours old.
Dec 31 04:01:10 firewall unbound: [75625:0] info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Dec 31 04:01:10 firewall unbound: [75625:0] info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
Dec 31 04:01:13 firewall unbound: [38282:0] notice: init module 0: validator
Dec 31 04:01:13 firewall unbound: [38282:0] notice: init module 1: iterator
Dec 31 04:01:13 firewall unbound: [38282:0] info: start of service (unbound 1.8.1).
Dec 31 04:01:14 firewall unbound: [38282:0] info: generate keytag query _ta-4a5c-4f66. NULL IN
Dec 31 11:19:08 firewall unbound: [55523:0] notice: init module 0: validator
Dec 31 11:19:08 firewall unbound: [55523:0] notice: init module 1: iterator
Dec 31 11:19:08 firewall unbound: [55523:0] info: start of service (unbound 1.8.1).
Dec 31 11:19:10 firewall unbound: [55523:0] info: generate keytag query _ta-4a5c-4f66. NULL IN
And currently I don't register DHCP clients in unbound. I have planned to do it for static DHCP mappings, but I don't use it at the moment.
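To see at a glance how often unbound was (re)started, a minimal sketch that pulls the timestamps of the "start of service" entries out of the log. The helper name unbound_starts is made up; it reads log lines on stdin and assumes the line format shown above.

```shell
#!/bin/sh
# unbound_starts: hypothetical helper, reads resolver log lines on stdin
# and prints the timestamp of every "start of service" entry.
unbound_starts() {
    awk '/start of service/ { print $1, $2, $3 }'
}

# Example use, assuming the circular (clog) log format of pfSense 2.4.x:
#   clog /var/log/resolver.log | unbound_starts
```

A long list of closely spaced timestamps would point to frequent restarts; two entries hours apart, as in the log above, would not.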
Oh my gawd - the first subnet in the 10 range.. You're hacked now, buddy! ;) If I only knew what mask you were using then you would be in real trouble ;) ROFL...
Try /1
-
I just noticed your output
85.13.128.3 expired, rto 48657592 msec, tA 3 tAAAA 2 tother 2.
85.13.159.101 rto 120000 msec, ttl 864, ping 0 var 94 rtt 376, tA 3, tAAAA 3, tother 3, probedelay 85, EDNS 0 assumed.
You're having some serious issues talking to those NS.. Those RTO values are BAD!!!
https://nlnetlabs.nl/documentation/unbound/info-timeout/
You could try flushing the unbound cache for that domain and trying again... But yeah, you're going to have problems with those kinds of stats for the NS for that domain..
Are you having the same sort of values for other domains you have run into that don't resolve?
What do you see for other domains.. you can check here
Status / DNS Resolver
You can sort by RTO, etc.. Let's see a snip of what sort of numbers you're getting for other NS
Also notice the timeouts, etc.. tA, tAAAA - you're having huge problems talking to those NS.. So yeah, you're going to have problems resolving from them.
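For the shell instead of the GUI, a minimal sketch that picks the servers with a high rto out of the lookup output. The helper name high_rto and the 10000 msec default are assumptions chosen for illustration, not values from unbound; the flush subcommands in the comments are standard unbound-control ones.

```shell
#!/bin/sh
# high_rto: hypothetical helper, reads `unbound-control lookup` output on
# stdin and prints "address rto" for every server above the threshold.
high_rto() {
    threshold="${1:-10000}"   # msec, arbitrary default for illustration
    awk -v limit="$threshold" '
        {
            for (i = 1; i < NF; i++) {
                # $1 is the server address, the field after "rto" is msec
                if ($i == "rto" && $(i + 1) + 0 > limit + 0)
                    print $1, $(i + 1)
            }
        }'
}

# Example use on the pfSense shell:
#   unbound-control -c /var/unbound/unbound.conf lookup haus-automatisierung.com | high_rto
# Flushing the cached data for the domain and the per-server timing data:
#   unbound-control -c /var/unbound/unbound.conf flush_zone haus-automatisierung.com
#   unbound-control -c /var/unbound/unbound.conf flush_infra all
```

Running the lookup again after the flush should show whether the rto values climb back up.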
-
Hi and happy new year ;)
After a long time of trying random keywords and clicking around, I found another affected domain: http://hamsterhilfe-nrw.de/
The screenshot from the unbound status:
The result of my local dig:
dig hamsterhilfe-nrw.de
; <<>> DiG 9.13.5 <<>> hamsterhilfe-nrw.de
;; global options: +cmd
;; connection timed out; no servers could be reached
And this is a dig run directly on the pfSense box:
dig hamsterhilfe-nrw.de
; <<>> DiG 9.12.2-P1 <<>> hamsterhilfe-nrw.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45842
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;hamsterhilfe-nrw.de. IN A
;; ANSWER SECTION:
hamsterhilfe-nrw.de. 3600 IN A 85.13.152.171
;; Query time: 204 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Wed Jan 02 22:06:53 CET 2019
;; MSG SIZE rcvd: 64
Interesting: pfSense uses "8.8.8.8" here. It is one of the default DNS servers I have set under "General Setup", but I thought these servers would only be used if I enabled forwarding mode?
The second dig uses "127.0.0.1":
dig hamsterhilfe-nrw.de
; <<>> DiG 9.12.2-P1 <<>> hamsterhilfe-nrw.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23436
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;hamsterhilfe-nrw.de. IN A
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Jan 02 22:09:53 CET 2019
;; MSG SIZE rcvd: 48
But I can't reproduce that. To retry, I restarted the DNS resolver, and now every dig on the pfSense machine for this domain goes to 8.8.8.8; 127.0.0.1 is not used anymore.
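One possible explanation, stated as an assumption rather than a confirmed diagnosis: dig without an @server argument walks the nameserver entries of /etc/resolv.conf in order and moves on to the next one only when a server does not answer in time. If 127.0.0.1 is listed first and the General Setup servers after it, a fast SERVFAIL from unbound would be reported as is, while a timeout would fall through to 8.8.8.8. A minimal sketch to check the order (the helper name list_nameservers is made up):

```shell
#!/bin/sh
# list_nameservers: hypothetical helper, prints the nameserver entries of
# a resolv.conf-style file in the order they are listed.
list_nameservers() {
    file="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" { print $2 }' "$file"
}

# Example use on the pfSense shell:
#   list_nameservers
# Forcing a specific server takes the fallback out of the picture:
#   dig @127.0.0.1 hamsterhilfe-nrw.de
```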
The unbound lookup after a service restart:
unbound-control -c /var/unbound/unbound.conf lookup hamsterhilfe-nrw.de
The following name servers are used for lookup of hamsterhilfe-nrw.de.
;rrset 86389 2 0 2 0
hamsterhilfe-nrw.de. 86389 IN NS ns5.kasserver.com.
hamsterhilfe-nrw.de. 86389 IN NS ns6.kasserver.com.
;rrset 7190 1 0 8 0
ns6.kasserver.com. 7190 IN A 85.13.159.101
;rrset 7191 1 0 8 0
ns5.kasserver.com. 7191 IN A 85.13.128.3
Delegation with 2 names, of which 2 can be examined to query further addresses.
It provides 2 IP addresses.
85.13.128.3 rto 1504 msec, ttl 891, ping 0 var 94 rtt 376, tA 2, tAAAA 0, tother 0, EDNS 0 assumed.
85.13.159.101 rto 3008 msec, ttl 894, ping 0 var 94 rtt 376, tA 3, tAAAA 0, tother 0, EDNS 0 assumed.
After the restart the rto msec values increase over time and are not at 120000.
And this is a screenshot after one of my service restarts and dig runs:
-
Why do you have all those timeouts.. Looks like your connection must just be crap??
Look at those RTO times?? What are you on, SAT or something?
As to pfsense using 8.8.8.8 - why would you put those in General Setup?
-
Hi,
I resolved the problem. I installed BIND 9.11 in a Docker container and enabled only the resolver for my subnet. And everything works without any problems.
So the problem didn't seem to be my connection, but maybe the unbound configuration.
And I found it: in addition to my WAN I have some VPN gateways, and the default option "All" was selected for Outgoing Network Interfaces. Some of the VPN gateways deny DNS queries, and this is the reason for the "random" timeouts.
I don't understand why the same domains hit these "bad" gateways every single time, but after excluding these interfaces everything works like a charm.
Thank you very much for your time and help. I'm annoyed that I didn't find the problem earlier.
-
And ZERO mention of VPN in your first post wasn't very helpful either.
And you have more than 1 of them? In the unbound settings, tell it which interface(s) you want it to use for outbound queries..
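A minimal sketch for checking what that setting resolves to, assuming the GUI choice ends up as outgoing-interface: lines in the generated config (outgoing-interface is a regular unbound.conf option; the path is the one passed to unbound-control earlier in this thread, and the helper name outgoing_interfaces is made up):

```shell
#!/bin/sh
# outgoing_interfaces: hypothetical helper, prints the addresses unbound
# is configured to send its outbound queries from.
outgoing_interfaces() {
    conf="${1:-/var/unbound/unbound.conf}"
    awk '$1 == "outgoing-interface:" { print $2 }' "$conf"
}

# Example use on the pfSense shell:
#   outgoing_interfaces
```

If addresses of VPN interfaces show up in the list, queries can leave through tunnels that drop DNS.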
-
@johnpoz said in Unbound cant resolve domains - which exists correctly:
And ZERO mention of VPN in your first post wasn't very helpful either.
Sorry about that. I totally forgot that not all VPN servers accept DNS queries.
And you have more than 1 of them?
Yes, gateways was the wrong word. I mean interfaces. pfSense runs three VPN clients, and each client is assigned to its own interface. Based on the LAN client IPs, the traffic is redirected to different interfaces via firewall rules.
In the unbound settings, tell it which interface(s) you want it to use for outbound queries..
Yes, I've done this now. Far too late :(
-
I resolved the problem. I installed BIND 9.11 in a Docker container and enabled only the resolver for my subnet. And everything works without any problems.
As I have said multiple times in other threads, this is the way to solve DNS resolution issues when you are policy-routing all over the place.