Netgate 1100 dns stops
-
@freek_box There are a few recent threads, though I have not seen this.
https://forum.netgate.com/topic/173148/slow-dns-after-22-05/
https://forum.netgate.com/topic/174698/occassionally-dns-fails-to-resolve-restarting-dns-resolver-fixes-it/
Do you see any errors in the system or resolver logs?
If you try to manually restart the DNS resolver do you see an error?
If you just resave the Resolver config does it show an error?
Steve
-
In the system logs, under DNS Resolver, I see:
Sep 22 21:12:31 unbound 99947 [99947:0] notice: init module 0: validator
Sep 22 21:12:31 unbound 99947 [99947:0] notice: init module 1: iterator
Sep 22 21:12:31 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] info: service stopped (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
Sep 22 21:12:32 unbound 99947 [99947:0] notice: Restart of unbound 1.12.0.
Sep 22 21:12:32 unbound 99947 [99947:0] notice: init module 0: validator
Sep 22 21:12:32 unbound 99947 [99947:0] notice: init module 1: iterator
Sep 22 21:12:32 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] info: service stopped (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Sep 22 21:12:32 unbound 99947 [99947:0] info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
Sep 22 21:12:32 unbound 99947 [99947:0] notice: Restart of unbound 1.12.0.
Sep 22 21:12:32 unbound 99947 [99947:0] notice: init module 0: validator
Sep 22 21:12:32 unbound 99947 [99947:0] notice: init module 1: iterator
Sep 22 21:12:32 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:34 unbound 99947 [99947:0] info: generate keytag query _ta-4f66. NULL IN
-
None of that looks like a problem. I assume at 21:12 you were not seeing a DNS issue?
Nothing in the system logs when it stops?
-
@stephenw10 said in Netgate 1100 dns stops:
None of that looks like a problem.
Let me rephrase these log lines:
Sep 22 21:12:31 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:31 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] notice: Restart of unbound 1.12.0.
Sep 22 21:12:32 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] info: service stopped (unbound 1.12.0).
Sep 22 21:12:32 unbound 99947 [99947:0] notice: Restart of unbound 1.12.0
Sep 22 21:12:32 unbound 99947 [99947:0] info: start of service (unbound 1.12.0).
That's a restart every second!
I think I know all the reasons why unbound would get restarted **.
Still, for a 1100, this is IMHO far into the danger zone.
Restarting unbound while it's already restarting: I don't know how to explain this, but something tells me there is a big chance that @Freek_Box winds up with a dead unbound process in his pfSense.
And that would produce exactly the effect needed to find himself with that "dns stops".
** Is there an interface going up and down every second?
Is a LAN device hailstorming the SG-1100 with DHCP requests?
Etc.
These two would restart unbound.
-
@gertjan said in Netgate 1100 dns stops:
would restart unbound
Also a DHCP lease renewal, if DNS registration is enabled.
v1.12 is a few pfSense versions old though...? 22.01 would have 1.13 IIRC. @Freek_Box what pfSense version are you on?
I recall there was one pfSense version that reverted unbound to an earlier version due to stability issues but I can't seem to find that in the release notes.
-
It's not actually unusual to see that logged multiple times when Unbound is restarted, and that isn't usually a sign of a problem. For example, here I manually restarted it on my edge, with the logs filtered for restarts:
Sep 23 15:57:15 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
Sep 23 15:57:17 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
Sep 23 15:57:18 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
Sep 23 15:57:20 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
Sep 23 15:57:41 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
Sep 23 15:57:42 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).
The question of how Unbound is managed is... another question!
But it should not stop responding normally after that.
Steve
-
@stephenw10
Manually restarting because, for example, you're editing the config is normal.
This is a more graphical way to look at my unbound restarts.
Most of these restarts are the result of my interaction with unbound, directly or indirectly, such as trying things out with pfBlockerNG-devel (switching between unbound and python mode, for instance) so that I can answer questions here on the forum after testing them myself.
When I'm not at work for a week, unbound won't restart in that week.
Only pfBlockerNG-devel will break that cycle.
For me, unbound 1.15.0 is very stable.
Using 22.05 on a 4100, and yes, I'm using IPv4 (two thirds of the traffic) and IPv6 (one third of the traffic).
-
Yup, it doesn't restart often for me either but when it does it logs several restarts in a row like that.
When OP checks his resolver logs I would expect it to show that as the last thing that happened. If it was showing continuous restarts that would be a much bigger issue.
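As a rough sketch of how to tell a single burst of restart messages from continuous restarts, something like the Python below could be run over a copy of the resolver log. The function names, the 60 second gap and the assumed year are illustrative choices, not anything pfSense provides, and the sample lines are the 1.15.0 ones posted earlier in the thread.

```python
import re
from datetime import datetime, timedelta

# Matches resolver log lines such as:
# "Sep 23 15:57:15 unbound 97113 [97113:0] info: start of service (unbound 1.15.0)."
START_RE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) unbound .*info: start of service"
)


def start_times(lines, year=2022):
    """Return the timestamp of every 'start of service' line."""
    times = []
    for line in lines:
        m = START_RE.match(line.strip())
        if m:
            # Syslog timestamps carry no year, so one has to be assumed.
            times.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return times


def group_bursts(times, max_gap=timedelta(seconds=60)):
    """Group timestamps into bursts; a gap longer than max_gap starts a new burst."""
    bursts = []
    for t in sorted(times):
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


sample = [
    "Sep 23 15:57:15 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
    "Sep 23 15:57:17 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
    "Sep 23 15:57:18 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
    "Sep 23 15:57:20 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
    "Sep 23 15:57:41 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
    "Sep 23 15:57:42 unbound 97113 [97113:0] info: start of service (unbound 1.15.0).",
]

# Print the first timestamp and the number of start messages for each burst.
for burst in group_bursts(start_times(sample)):
    print(burst[0], len(burst))
```

One burst with a handful of entries matches a single restart being logged several times. Many bursts spread across the day would point at something repeatedly triggering restarts.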
If the Unbound service just stops and cannot be restarted, I would expect to see some errors logged in either the Resolver or System log.
Steve
-
Running version 21.05.2-RELEASE (arm64)
-
Any reason you're not running something newer?
-
I assume I'm on the latest version?
-
@freek_box Nope, that's last year.
https://docs.netgate.com/pfsense/en/latest/releases/index.html#pfsense-plus-software
https://docs.netgate.com/pfsense/en/latest/troubleshooting/upgrades.html#upgrade-not-offered-library-errors
-
@freek_box said in Netgate 1100 dns stops:
I assume I'm on the latest version?
Add the RSS widget to your dashboard:
https://www.netgate.com/blog/pfsense-plus-software-version-22.05-now-available
so you'll have a double check on what's happening and what's available.
This is strange.
The check for available updates succeeded, but the info coming back said "21.05.2-Release", so as far as the updater is concerned you're up to date.
As already said above, use the Troubleshooting Upgrades suggestions. You will find 22.05 available.
-
I followed these steps:
Navigate to System > Updates
Set Branch to Previous stable version
Wait a few moments for the upgrade check to complete
But now it shows:
-
At the command line run:
pkg-static -d update
What error does it return?
-
DBG(1)[40213]> pkg initialized
Updating pfSense-core repository catalogue...
DBG(1)[40213]> PkgRepo: verifying update for pfSense-core
DBG(1)[40213]> Pkgrepo, begin update of '/var/db/pkg/repo-pfSense-core.sqlite'
DBG(1)[40213]> Request to fetch pkg+https://repo.netgate.com/pkg/pfSense_plus-v21_05_2_aarch64-core/meta.conf
DBG(1)[40213]> opening libfetch fetcher
DBG(1)[40213]> Fetch > libfetch: connecting
DBG(1)[40213]> Fetch: fetching from: https://repo00.atx.netgate.com/pkg/pfSense_plus-v21_05_2_aarch64-core/meta.conf with opts "i"
1082900480:error:141F0006:SSL routines:tls_construct_cert_verify:EVP lib:/var/jenkins/workspace/pfSense-build-release-tarballs/BUILD_NODE/pkg-aarch64/OS_MAJOR_VERSION/freebsd12/PLATFORM/aws/crypto/openssl/ssl/statem/statem_lib.c:283:
DBG(1)[40213]> Fetch: fetching from: https://repo00.atx.netgate.com/pkg/pfSense_plus-v21_05_2_aarch64-core/meta.conf with opts "i"
Certificate verification failed for /C=US/ST=Texas/L=Austin/O=Rubicon Communications, LLC (Netgate)/CN=repo00.atx.netgate.com
1082900480:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:/var/jenkins/workspace/pfSense-build-release-tarballs/BUILD_NODE/pkg-aarch64/OS_MAJOR_VERSION/freebsd12/PLATFORM/aws/crypto/openssl/ssl/statem/statem_clnt.c:1915:
Segmentation fault (core dumped)
-
When I do:
pkg-static clean -ay; pkg-static install -fy pkg pfSense-repo pfSense-upgrade
I get:
pkg-static: Repository pfSense missing. 'pkg update' required
pkg-static: No package database installed. Nothing to do!
Updating pfSense-core repository catalogue...
1082900480:error:141F0006:SSL routines:tls_construct_cert_verify:EVP lib:/var/jenkins/workspace/pfSense-build-release-tarballs/BUILD_NODE/pkg-aarch64/OS_MAJOR_VERSION/freebsd12/PLATFORM/aws/crypto/openssl/ssl/statem/statem_lib.c:283:
Certificate verification failed for /C=US/ST=Texas/L=Austin/O=Rubicon Communications, LLC (Netgate)/CN=repo01.atx.netgate.com
1082900480:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:/var/jenkins/workspace/pfSense-build-release-tarballs/BUILD_NODE/pkg-aarch64/OS_MAJOR_VERSION/freebsd12/PLATFORM/aws/crypto/openssl/ssl/statem/statem_clnt.c:1915:
Child process pid=4480 terminated abnormally: Segmentation fault
-
A segfault like that indicates the crypto chip is in an unreachable state. You need to completely power cycle the device to reset it: halt the device, then remove the power for 10s or so. It should update correctly once it has booted again.
https://docs.netgate.com/pfsense/en/latest/troubleshooting/upgrades.html#segmentation-fault-in-pkg
Steve
-
I have no way to unplug it. Is that a problem? The device is hours away from me.