DNS Resolver not resolving part 1234
-
@henkbart said in DNS Resolver not resolving part 1234:
You wrote about the setting DHCP Client Registration.
I don't see that in my settings.
You are using the new DHCP server, Kea.
So there is no 'DHCP Client Registration' issue that can influence unbound.
Check your system.log and resolver.log.
You can see in the resolver.log when unbound gets restarted.
At that precise moment, check the system.log entries.
If an interface goes down/up, this will restart unbound.
So, if you have a bad NIC somewhere, or a bad cable, this can impact unbound also.
-
@Gertjan
Thanks for clearing up the DHCP settings.
When looking at the log files, I only see that Unbound is started, with no stops until I reboot or restart the service.
The strange thing is that after rebooting pfSense, the resolver still does not resolve anything until I restart the service.
But I cannot see anything else in the log files other than the restart-service entry.
No entries in the Watchdog, and no other errors that I can find.
Strange thing....
-
grep 'start' /var/log/resolver.log
@henkbart said in DNS Resolver not resolving part 1234:
No entries in the Watchdog
Be careful with the watchdog package.
It's pretty brain dead, and is capable of restarting "unbound" while it is actually starting, creating two instances at the same moment.
The "watchdog package" is meant to be used when you are developing.
Another example: with a fully loaded (to the ceiling) pfBlockerNG, it can take a long time for unbound to start. The stupid Watchdog package can detect this and decide to "start" unbound (again) while another instance is already starting.
Big-mess time again.
Your case: unbound stops working, for whatever reason, but we'll find out, and that should be repaired.
Keep in mind: your unbound, my unbound, and unbound running on several hundred thousand other pfSense installs, and a couple of million or so other devices, run fine.
So, there is only one question to answer: "what's up with your system", and you're done.
-
@henkbart said in DNS Resolver not resolving part 1234:
I checked if i had the latest version of Unbound
Depending on when you updated to 23.09.1, you might not have. And you can't tell by looking at unbound -V.
They added 2 packages to the release on Saturday.
It seems one way you can tell is the Version entry in the System Information widget on the dashboard.
If it says Dec 9, for example:
"built on Sat Dec 9 12:57:00 EST 2023"
you will have them.
If it says Dec 6th (the first date of release), you won't.
More details here:
https://forum.netgate.com/topic/184681/after-update-2-7-2-23-09-1/11?_=1702376507123
Edit: that said, that's not likely your issue.
-
I don't use pfBlockerNG, so that could not be the case here.
When I reboot the system and look at the log, I can only see that it is started and stopped at the same time, and then restarted.
And then no stop ever.
-
@henkbart said in DNS Resolver not resolving part 1234:
And then no stop ever.
So, it's running?
Test with these 4:
ps ax | grep unbound
top
sockstat | grep 'unbound'
dig @127.0.0.1 netgate.com
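For the first of those checks, a minimal sketch that counts unbound daemons, so a double instance (the Watchdog scenario mentioned earlier) stands out. The function name `count_unbound` is made up for illustration, and it assumes the binary path seen in this thread, /usr/local/sbin/unbound.

```shell
# Hypothetical helper: count unbound daemons in `ps ax` output read from stdin.
# Assumes the daemon runs from /usr/local/sbin/unbound, as shown in this thread.
count_unbound() {
    # Require a space or end of line after the path, so "grep unbound" and
    # dhcpleases (which only mentions unbound.pid) are not counted.
    # grep -c exits non-zero when the count is 0; "|| true" hides that.
    grep -c -E '/usr/local/sbin/unbound( |$)' || true
}
```

Usage would be `ps ax | count_unbound`. A result of 1 is what you want; 0 means unbound is not running, and 2 or more suggests something started it twice.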
-
When I check all the packages:
pkg upgrade
Updating pfSense-core repository catalogue...
Fetching meta.conf: 0%
pfSense-core repository is up to date.
Updating pfSense repository catalogue...
Fetching meta.conf: 0%
pfSense repository is up to date.
All repositories are up to date.
Checking for upgrades (0 candidates): 100% 0 B 0.0kB/s 00:01
Processing candidates (0 candidates): 100% 0 B 0.0kB/s 00:01
Checking integrity... done (0 conflicting)
Your packages are up to date.
And when I check the version of Unbound I get:
unbound -V
Version 1.18.0
Configure line: --with-libexpat=/usr/local --with-ssl=/usr --enable-dnscrypt --disable-dnstap --with-libnghttp2 --with-dynlibmodule --enable-ecdsa --disable-event-api --enable-gost --with-libevent --with-pythonmodule=yes --with-pyunbound=yes ac_cv_path_SWIG=/usr/local/bin/swig LDFLAGS=-L/usr/local/lib --disable-subnet --disable-tfo-client --disable-tfo-server --with-pthreads --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/share/info/ --build=amd64-portbld-freebsd14.0
Linked libs: libevent 2.1.12-stable (it uses kqueue), OpenSSL 3.0.12 24 Oct 2023
Linked modules: dns64 python dynlib respip validator iterator
DNSCrypt feature available
BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues
-
Also:
grep -E 'start | stopped' /var/log/resolver.log
They should pair up by the PID:
Dec 2 00:31:05 name unbound[26560]: [26560:0] info: start of service (unbound 1.18.0).
Dec 4 00:31:02 name unbound[26560]: [26560:0] info: service stopped (unbound 1.18.0).
A restart would typically look like:
Nov 30 03:05:48 name unbound[48588]: [48588:0] info: start of service (unbound 1.18.0).
Nov 30 03:05:48 name unbound[48588]: [48588:0] info: service stopped (unbound 1.18.0).
Nov 30 03:05:48 name unbound[48588]: [48588:0] notice: Restart of unbound 1.18.0.
Nov 30 03:05:51 name unbound[48588]: [48588:0] info: start of service (unbound 1.18.0).
If you have a lot of open starts without a matching stop, something else is happening (but ignore the entries from the date/time when you did a system update).
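The pairing by PID can be automated. This is only a sketch under assumptions: the function name `unbound_open_starts` is invented here, and it expects log lines in the format quoted in this post. It prints each PID that logged more starts than stops, followed by the start and stop counts.

```shell
# Hypothetical helper: list unbound PIDs that logged more starts than stops.
# $1 is the path to a resolver log (this thread uses /var/log/resolver.log).
unbound_open_starts() {
    awk '
        # Pull the PID out of the "unbound[12345]:" tag on a log line.
        function pid_of(line) {
            if (match(line, /unbound\[[0-9]+\]/))
                return substr(line, RSTART + 8, RLENGTH - 9)
            return ""
        }
        /info: start of service/ { p = pid_of($0); if (p != "") starts[p]++ }
        /info: service stopped/  { p = pid_of($0); if (p != "") stops[p]++ }
        END {
            # Print: PID, number of starts, number of stops.
            for (p in starts)
                if (starts[p] > stops[p] + 0)
                    print p, starts[p], stops[p] + 0
        }
    ' "$1" | sort -n
}
```

Run it as `unbound_open_starts /var/log/resolver.log`. The instance that is running right now will always show up as one open start; it is additional open PIDs that deserve a closer look. PIDs can be reused after a reboot, so treat the result as a hint rather than proof.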
-
ps ax | grep unbound
23994 - Ss 0:01.58 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
45451 - Is 0:00.00 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d private.lan -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
10308 0 S+ 0:00.00 grep unbound
sockstat | grep 'unbound'
unbound unbound 23994 3 udp4 192.168.1.1:53 *:*
unbound unbound 23994 4 tcp4 192.168.1.1:53 *:*
unbound unbound 23994 5 udp4 127.0.0.1:53 *:*
unbound unbound 23994 6 tcp4 127.0.0.1:53 *:*
unbound unbound 23994 7 tcp4 127.0.0.1:953 *:*
unbound unbound 23994 8 stream /var/run/php-fpm.socket
unbound unbound 23994 9 dgram -> /var/run/log
unbound unbound 23994 10 stream -> [23994 11]
unbound unbound 23994 11 stream -> [23994 10]
unbound unbound 23994 12 stream /var/run/php-fpm.socket
unbound unbound 23994 13 stream -> [23994 14]
unbound unbound 23994 14 stream -> [23994 13]
unbound unbound 23994 15 stream -> [23994 16]
unbound unbound 23994 16 stream -> [23994 15]
unbound unbound 23994 17 stream -> [23994 18]
unbound unbound 23994 18 stream -> [23994 17]
dig @127.0.0.1 netgate.com
; <<>> DiG 9.18.16 <<>> @127.0.0.1 netgate.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51534
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;netgate.com. IN A
;; ANSWER SECTION:
netgate.com. 60 IN A 199.60.103.104
netgate.com. 60 IN A 199.60.103.4
;; Query time: 149 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Dec 12 12:52:33 CET 2023
;; MSG SIZE rcvd: 72
top
last pid: 67887; load averages: 0.08, 0.18, 0.17 up 0+01:52:14 12:53:09
62 processes: 1 running, 61 sleeping
CPU: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
Mem: 105M Active, 154M Inact, 504M Wired, 15G Free
ARC: 151M Total, 30M MFU, 114M MRU, 2236K Anon, 795K Header, 4133K Other
107M Compressed, 267M Uncompressed, 2.49:1 Ratio
Swap: 1024M Total, 1024M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
43007 root 1 20 0 14M 3716K CPU0 0 0:00 0.15% top
98015 root 1 20 0 22M 11M select 1 0:00 0.02% sshd
23994 unbound 4 20 0 85M 55M kqread 3 0:02 0.02% unbound
28457 root 9 20 0 53M 25M select 0 0:02 0.01% kea-dhcp4
24814 root 1 20 0 23M 8864K select 2 0:00 0.01% ntpd
81718 root 5 68 0 13M 2748K uwait 3 0:00 0.00% dpinger
410 root 1 20 0 108M 32M kqread 1 0:00 0.00% php-fpm
27047 root 1 20 0 13M 3516K bpf 3 0:00 0.00% filterlog
411 root 1 68 0 148M 52M accept 2 0:03 0.00% php-fpm
412 root 1 68 0 144M 49M accept 2 0:02 0.00% php-fpm
35166 root 1 26 0 148M 51M accept 0 0:02 0.00% php-fpm
-
What repos are you using on your pfSense device?
-
What? - the official Netgate ones -- you should never go to any other repo for updates or packages.
Maybe I'm not understanding your question?
-
@jrey
I asked because of the Unbound 1.18.0_1 version you wrote about in the other thread.
I only got 1.18.0 version so I was wondering if that had something to do with the repos.
-
@henkbart said in DNS Resolver not resolving part 1234:
I only got 1.18.0 version so I was wondering if that had something to do with the repos.
Ah.
The unbound 1.18.0_1 was slipstreamed into the release package after it was first released (so was curl), as discussed in the thread referenced.
However, as noted, even on a system that was updated after that original release (and therefore would have received the updated versions), unbound still reports itself as 1.18.0, both in the logs and when you run "unbound -V". The system's update log in that case says 1.18.0_1 was installed.
On that system the file's timestamp does reflect that it came from the build of the 9th, suggesting it should be 1.18.0_1; it just doesn't report that.
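To put the two numbers side by side, here is a rough sketch. Both function names are invented for illustration; the first expects a line from a `pkg info` listing (which prints name-version followed by a description), the second expects `unbound -V` output that starts with a "Version" line.

```shell
# Hypothetical helpers; both read command output on stdin.

# Package version from a `pkg info` line such as "unbound-1.18.0_1 Validating, ...".
pkg_unbound_version() {
    sed -n 's/^unbound-\([^ ]*\).*/\1/p' | head -n 1
}

# Version the binary itself reports, taken from the "Version 1.18.0" line of `unbound -V`.
binary_unbound_version() {
    sed -n 's/^Version \([^ ]*\).*/\1/p' | head -n 1
}
```

Usage would be `pkg info | pkg_unbound_version` and `unbound -V | binary_unbound_version`. If the first prints 1.18.0_1 while the second prints 1.18.0, that matches the behaviour described here and does not mean the update is missing.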
-
@henkbart there was a new version of unbound that you could get via pkg update and then upgrade. It was pushed into the latest 23.09.1, but it doesn't really show if you just ask unbound for its version.
[23.09.1-RELEASE][admin@sg4860.local.lan]/root: pkg info | grep unbound
unbound-1.18.0_1 Validating, recursive, and caching DNS resolver
[23.09.1-RELEASE][admin@sg4860.local.lan]/root:
[23.09.1-RELEASE][admin@sg4860.local.lan]/root: unbound -V
Version 1.18.0
Configure line: --with-libexpat=/usr/local --with-ssl=/usr --enable-dnscrypt --disable-dnstap --with-libnghttp2 --with-dynlibmodule --enable-ecdsa --disable-event-api --enable-gost --with-libevent --with-pythonmodule=yes --with-pyunbound=yes ac_cv_path_SWIG=/usr/local/bin/swig LDFLAGS=-L/usr/local/lib --disable-subnet --disable-tfo-client --disable-tfo-server --with-pthreads --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/share/info/ --build=amd64-portbld-freebsd14.0
Linked libs: libevent 2.1.12-stable (it uses kqueue), OpenSSL 3.0.12 24 Oct 2023
Linked modules: dns64 python dynlib respip validator iterator
DNSCrypt feature available
BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues
[23.09.1-RELEASE][admin@sg4860.local.lan]/root:
I don't recall ever having any issue where unbound just wouldn't resolve, but was still running. And I sure haven't seen one as of late. A good test might be to try to resolve something local, say your pfSense FQDN, via your favorite local tool: nslookup, dig, host, doggo, etc. Does that work, and just not external names? It's best to use a command-line tool because then you can see the actual response from unbound, be it NXDOMAIN, SERVFAIL, REFUSED, etc.
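If you want just the response code out of dig's output, a small sketch follows. The function name `dig_status` is made up; it relies on the "status:" field that dig prints in its HEADER line, as in the output posted earlier in this thread.

```shell
# Hypothetical helper: print the response code (NOERROR, NXDOMAIN, SERVFAIL,
# REFUSED, ...) found in dig output read from stdin.
dig_status() {
    # The HEADER line looks like: ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51534"
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}
```

For example, `dig @127.0.0.1 netgate.com | dig_status` for an external name, and the same with your firewall's own FQDN for a local one. It prints nothing when dig received no reply at all, which is itself useful to know.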
-
@jrey said in DNS Resolver not resolving part 1234:
I've never had an issue with DNS resolver stopping or "Freezing"
and
@johnpoz said in DNS Resolver not resolving part 1234:
I don't recall ever having any issue where unbound just wouldn't resolve, but was still running
yup this ^
-
@henkbart said in DNS Resolver not resolving part 1234:
45451 - Is 0:00.00 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d private.lan -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /etc/hosts
This is the one I was talking about when I mentioned the Resolver "DHCP Client Registration" check box, the option you don't have (under Services > DNS Resolver > General Settings) because you are running Kea; in that case the option doesn't show up.
Is this correct, are you using Kea?
Or dhcpd?
If you are using Kea, you should see this:
If you are using Kea, this process "/usr/local/sbin/dhcpleases" cannot, and should not, exist.
As this is the one that shoots unbound in the face every time ..... see above.
Edit:
You have this:
== Kea DHCP checked?
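To check from a shell whether that dhcpleases process is present, a rough sketch (the function name `dhcpleases_running` is invented for illustration, and the path is the one from the quoted ps output):

```shell
# Hypothetical helper: succeed if `ps ax` output on stdin lists the dhcpleases helper.
dhcpleases_running() {
    # The [/] bracket keeps this grep from matching its own entry in the ps listing.
    grep -q '[/]usr/local/sbin/dhcpleases'
}
```

Something like `if ps ax | dhcpleases_running; then echo "dhcpleases is running"; fi` would then tell you whether the process is present, which should not be the case when Kea is the DHCP server.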
-
Lucky you.
But there are a lot of people having trouble with it, including me.
That makes it difficult to pinpoint the location of the problem....