Unbound crashing randomly after 24.03 upgrade
-
We currently have four Netgate 6100 appliances configured identically regarding services and packages installed. The only difference between them is the firewall rules defined for each office location.
One of the four seems to be having a random issue where DNS stops resolving. When I log into the firewall, it states that Unbound is not running. I start it back up, and all seems to be well again. The other three firewalls have not experienced this issue. The office this appears to be happening at is also our largest office, with the most network devices compared to the other three offices.
It appears to happen sometime over the weekend, as I have discovered each instance when I get to the office on Monday morning. Today is the third time this has happened since upgrading to 24.03. Prior to 24.03, all four devices had been running rock solid.
Question: Is there a way to pull log files beyond the 500 entries in the System Logs viewer? When I check the logs in the GUI log viewer, the entries do not go back far enough to see what is potentially causing the crash.
I searched the forums regarding this, and I did not find any recent reports of this happening. It seems most of the forum posts regarding unbound crashing are from at least a few years ago or older.
-
@mwierowski said in Unbound crashing randomly after 24.03 upgrade:
Is there a way to pull log files beyond the 500 entries in the System Logs viewer?
Go to the settings of this log viewer:
Status > System Logs > Settings
Change GUI Log Entries to something bigger.
See also, lower on the same page: Log Rotation Options.
Make the size of each log file way bigger (do check the limits against your actual drive capacity).
There is some size info on the page.
Btw: be careful with detailed unbound logging (Services > DNS Resolver > Advanced Settings). Level 3 logging, and probably also Level 2, will log a lot, depending on the DNS activity of your LANs.
A command like this (SSH or console) :
grep 'start' /var/log/resolver.log
shows you how often unbound was told to (stop and then) 'start'.
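If the log is long, it can help to bucket the restarts by hour rather than read the raw grep output. The following is only a rough sketch, assuming the log lines look like the "Restart of unbound" entries quoted later in this thread; the function name `restarts_per_hour` is made up for illustration, and the script is meant to run on any machine with Python 3 against a copy of /var/log/resolver.log.

```python
import re
from collections import Counter

# Assumed line format (taken from the entries quoted in this thread):
#   May 20 09:31:42 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
RESTART_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}) (\d{2}):\d{2}:\d{2} unbound .*notice: Restart of unbound"
)

def restarts_per_hour(lines):
    """Count 'Restart of unbound' notices per day and hour (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            day, hour = match.groups()
            counts[f"{day} {hour}:00"] += 1
    return counts

# Small demo on three lines in the assumed format.
sample = [
    "May 20 09:31:42 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).",
    "May 20 09:31:42 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.",
    "May 20 10:02:00 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.",
]
for bucket, count in restarts_per_hour(sample).items():
    print(bucket, count)
```

To use it on a real log, pass an open file instead of the sample list, for example `restarts_per_hour(open("resolver.log"))`. An hour with dozens of restarts points at something periodic, such as DHCP activity, rather than a one-off crash.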
-
It appears that unbound is restarting quite often:
May 20 09:31:42 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:31:42 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:32:03 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:32:03 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:33:38 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:33:38 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:34:35 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:34:35 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:39:23 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:39:23 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:39:29 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:39:29 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:40:44 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:40:44 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:40:44 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:40:44 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
May 20 09:42:04 unbound 74285 [74285:0] info: service stopped (unbound 1.19.3).
May 20 09:42:04 unbound 74285 [74285:0] notice: Restart of unbound 1.19.3.
I am not seeing anything indicating the reason for the restarts. Do I need to increase the logging level in order to determine why the service is restarting so often? Is this normal to have unbound stop and restart so often?
-
@mwierowski said in Unbound crashing randomly after 24.03 upgrade:
I am not seeing anything indicating the reason for the restarts.
Yeah, you don't want your unbound process getting restarted every 2 minutes or so.
That's quite ... not good.
Most probable cause: DHCP Registration in the DNS Resolver settings. Uncheck that option, as every DHCP lease will restart unbound when it is checked.
Other events that can restart packages, and also unbound: WAN going down and up again.
pfBlockerNG, if you've set it up in nervous mode: updating IP and DNSBL feeds every hour or so. I've set mine to Weekly for all of them.
Mine restarts daily or more often, but that's me 'messing' around with my pfSense.
The memory usage graphs clearly show unbound restarting.
Btw: using 24.03 on a 4200, and it's rock solid.
-
@Gertjan said in Unbound crashing randomly after 24.03 upgrade:
@mwierowski said in Unbound crashing randomly after 24.03 upgrade:
I am not seeing anything indicating the reason for the restarts.
Yeah, you don't want your unbound process getting restarted every 2 minutes or so.
That's quite ... not good.
Most probable cause: DHCP Registration in the DNS Resolver settings. Uncheck that option, as every DHCP lease will restart unbound when it is checked.
Thanks, I unchecked that option and will monitor the logs to see if that slows down the restarts.
-
@Gertjan, so far, since unchecking that option, I haven't seen a single restart of unbound. Hopefully, this will resolve the issue. Thanks again for your help.
-
@mwierowski said in Unbound crashing randomly after 24.03 upgrade:
@Gertjan, so far, since unchecking that option, I haven't seen a single restart of unbound. Hopefully, this will resolve the issue. Thanks again for your help.
I'd expect so. If you do need registration, the other option is to set a longer lease time. Clients normally renew their leases at half the lease duration. So a 1-hour lease with 30 devices works out to an average of one renewal, and therefore one unbound restart, per minute.
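The arithmetic behind that estimate can be sketched in a few lines. This is only a back-of-the-envelope model: it assumes renewals are spread evenly over time, that every renewal triggers exactly one restart while registration is enabled, and that clients renew at half the lease time. The function name `seconds_between_renewals` is made up for illustration.

```python
def seconds_between_renewals(lease_seconds, clients, renew_fraction=0.5):
    """Average gap, in seconds, between lease renewals across all clients.

    Assumes each client renews once every lease_seconds * renew_fraction
    and that renewals are evenly spread (a simplification).
    """
    if clients <= 0:
        raise ValueError("clients must be positive")
    renew_interval = lease_seconds * renew_fraction
    return renew_interval / clients

# The example from the post: 1-hour lease, 30 devices.
print(seconds_between_renewals(3600, 30))

# The same 30 devices with a 24-hour lease.
print(seconds_between_renewals(24 * 3600, 30))
```

Under these assumptions the 1-hour lease gives one renewal every 60 seconds, while a 24-hour lease stretches that to one every 1440 seconds (24 minutes), which is why a longer lease time reduces the restarts.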
I believe Netgate is working on improving this when they are further along in transitioning to Kea DHCP.
-