Unbound problems after update to latest beta
-
+1 on slow reboot and overall speed of configuring unbound…
-
The main difference is that it is not stomping all over itself to restart unbound now. It's possible it was fast before because at some point it was actually failing to restart and then something came along later in the boot process and kicked it again.
Does it actually improve if you back out just https://github.com/pfsense/pfsense/commit/38d110824c87ff60c6289c0432d55009586ceee4 ? Or do you also have to back out https://github.com/pfsense/pfsense/commit/8a0aa42c197361ebb82387e5bdc8378e5440837f to make it fast again?
I wouldn't be surprised to find out that https://github.com/pfsense/pfsense/commit/8a0aa42c197361ebb82387e5bdc8378e5440837f had a larger impact, because before that commit any calls to unbound-control were improperly formed and thus did nothing; now they actually run.
Drop a note with feedback on https://redmine.pfsense.org/issues/7326 so others following the original issue can see if they can confirm the problem and/or fix.
-
Also having unbound issues after update
-
Any way to roll back to old version for the meantime?
-
After rebooting into the new build, I can't access the webGUI until I restart the webConfigurator from SSH. Even then it works only by IP; the hostname won't work.
When I do get in, the dhcpd, ntpd, suricata, bandwidthd, and dnsbl services are all down. Also, no VPN client will connect.
My setup is effectively broken after this update.
-
That does not sound related. Start your own thread and post as much detail (console output, system log contents, etc) as possible.
If you suspect these changes are related, use the System Patches package to revert those commits.
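For anyone unsure what to feed the System Patches package: GitHub serves any commit as a patch file when ".patch" is appended to the commit URL. The snippet below is only a sketch that builds those URLs for the two commits mentioned above; the helper name is made up for illustration, and whether you enter the URL or just the commit ID in the patch entry is an assumption to check against the package's own help text.

```python
# Hypothetical helper, for illustration only: build the GitHub ".patch" URL
# for a commit in the pfSense repository.
GITHUB_REPO = "https://github.com/pfsense/pfsense"

HEX_DIGITS = set("0123456789abcdefABCDEF")


def commit_patch_url(commit_id: str, repo: str = GITHUB_REPO) -> str:
    """Return the URL at which GitHub serves the given commit as a patch."""
    commit_id = commit_id.strip()
    if not commit_id or any(ch not in HEX_DIGITS for ch in commit_id):
        raise ValueError(f"not a hex commit id: {commit_id!r}")
    return f"{repo}/commit/{commit_id}.patch"


# The two commits discussed in this thread.
COMMITS = [
    "38d110824c87ff60c6289c0432d55009586ceee4",
    "8a0aa42c197361ebb82387e5bdc8378e5440837f",
]

for commit in COMMITS:
    print(commit_patch_url(commit))
```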
-
How do I revert to an old commit with System Patches?
OK, I had to work around a bit to get the patch fetch to resolve, but I reverted those two and all is well now.
https://forum.pfsense.org/index.php?topic=133164.msg731939#msg731939
At least for me, the latest snapshot for Unbound takes down my entire network.
Thanks for the help!
-
Does it actually improve if you back out just https://github.com/pfsense/pfsense/commit/38d110824c87ff60c6289c0432d55009586ceee4 ? Or do you also have to back out https://github.com/pfsense/pfsense/commit/8a0aa42c197361ebb82387e5bdc8378e5440837f to make it fast again?
I just gave it a quick test on my secondary HA gateway. Backing out the first patch was sufficient, unbound starts fast at boot, stops and restarts fine again.
Anything else I should test?
-
Yeah same here.
First one is enough.
-
OK, I backed out that commit, next new snapshots should be OK again. We'll have to keep looking for a fix for https://redmine.pfsense.org/issues/7326, that may be hitting a few users but this seems to negatively impact more.
Those of you who had problems with unbound starting/stopping properly, what services are enabled on these firewalls? And what packages are installed?
-
Those of you who had problems with unbound starting/stopping properly, what services are enabled on these firewalls? And what packages are installed?
Thanks!
Services:
dhcpd
dpinger
ftp-proxy
ntpd
openvpn
sshd
syslogd
unbound
zabbix_agentd_lts
Packages:
FTP_Client_Proxy 0.3_3
System_Patches 1.1.6_1
zabbix-agent 0.8.9_3
-
Could everyone who had problems running unbound with the previous commit please try the attached patch.
It doesn't use unbound-control to stop unbound, but just the kill as before, plus the delay loop to be certain it stopped before moving on.
It is also available at https://redmine.pfsense.org/issues/7326#note-21
Use the system patches package to apply the change.
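To make the idea concrete, here is a rough sketch of "kill, then loop until it is really gone". This is not the attached patch (which changes services.inc); the function names, the 30-second timeout, and the half-second poll interval are all assumptions chosen for illustration.

```python
import os
import signal
import time


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_and_wait(pid, timeout=30.0, interval=0.5,
                  send_signal=os.kill, is_running=pid_is_running,
                  sleep=time.sleep) -> bool:
    """Send SIGTERM, then poll until the process exits or the timeout expires.

    Returns True if the process is gone, False if it was still running
    when the timeout ran out. The timeout and interval are assumed values.
    """
    send_signal(pid, signal.SIGTERM)
    waited = 0.0
    while is_running(pid):
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

A caller would only start the daemon again when stop_and_wait() returns True, and would log or escalate otherwise. The injectable send_signal, is_running, and sleep parameters exist only so the loop can be exercised without touching a real process.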
-
Hmm, how would I use this with the System Patches Package?
/usr/bin/patch --directory=/ -t -p1 -i /var/patches/596064c796ffb.patch --check --forward --ignore-whitespace
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/src/etc/inc/services.inc b/src/etc/inc/services.inc
|index ffc4aa8..0e2cfad 100644
|--- a/src/etc/inc/services.inc
|+++ b/src/etc/inc/services.inc
--------------------------
No file to patch. Skipping...
Hunk #1 ignored at 2235.
Hunk #2 ignored at 2261.
2 out of 2 hunks ignored while patching src/etc/inc/services.inc
done
Edit: Never mind, I removed "/src" instances and it worked.
Looks better with this patch, everything I tested seems to be as fast as always.
-
You can set the Path Strip to 2 in the system patches package entry and then you don't have to edit anything.
Once a couple more people try it out I'll commit that change, probably Monday.
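For anyone wondering why a strip count of 2 does the trick: patch's -pN option drops N leading path components from the file names in the diff header, and the result is then looked up relative to the --directory given (here /). The snippet below only approximates that behavior as a sketch; the function name is made up.

```python
def strip_components(path: str, count: int) -> str:
    """Drop `count` leading components from a diff path, roughly like patch -pN."""
    parts = path.split("/")
    if count < 0 or count >= len(parts):
        raise ValueError(f"cannot strip {count} components from {path!r}")
    return "/".join(parts[count:])


diff_path = "a/src/etc/inc/services.inc"  # file name as it appears in the diff header

for n in (1, 2):
    print(n, strip_components(diff_path, n))
# 1 src/etc/inc/services.inc
# 2 etc/inc/services.inc
```

With a strip count of 1 the file is looked for under /src/etc/inc/, which is the repository layout and produces the "No file to patch" message seen above. With 2 it resolves to /etc/inc/services.inc, which matches the result of removing the "/src" instances by hand.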
-
Just a newb question here, but after the change is committed an upgrade will implement it, correct? Just making sure I won't still have to implement the patch at that point if I'm not a fresh install. I'm currently not experiencing any issues but thought I'd ask. Thanks!
-
Just a newb question here, but after the change is committed an upgrade will implement it, correct? Just making sure I won't still have to implement the patch at that point if I'm not a fresh install. I'm currently not experiencing any issues but thought I'd ask. Thanks!
Correct. If the patch gets committed, it will be included in whatever the next snapshot is after it gets committed.
How fast it gets committed depends entirely on people who experienced the previous issues testing it and providing feedback, though.
-
Jim's first reply is correct, I was the one who did the original bug report on this.
Previously unbound was quick on boot because it didn't actually check whether it had started OK; it simply ran a kill command immediately followed by a start command (this was actually done in the WAN IP change process). The problem was that if the kill had not finished, the start command would fail, but the pfSense boot scripts were unaware of this.
My original proposed fix was to issue a shutdown command to unbound to stop it, which can take time to complete, especially when there is a large DNSBL list. Hence the delay on boot and restarts.
Better to have the delay than a service that hasn't started.
bbcan177, the developer of pfBlockerNG, also confirmed the bug.
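The race described above can be shown with a toy model. This is purely a simulation under assumed names (FakeDaemon, restart_blind, restart_waiting) and has nothing to do with pfSense's actual code: a daemon needs a few "ticks" to shut down, and a start attempted before it has finished fails.

```python
class FakeDaemon:
    """Toy daemon that needs `shutdown_ticks` ticks to finish stopping."""

    def __init__(self, shutdown_ticks: int):
        self.state = "running"
        self.shutdown_ticks = shutdown_ticks
        self._remaining = 0

    def kill(self) -> None:
        if self.state == "running":
            self.state = "stopping"
            self._remaining = self.shutdown_ticks

    def tick(self) -> None:
        """Advance time by one step; a stopping daemon gets closer to stopped."""
        if self.state == "stopping":
            self._remaining -= 1
            if self._remaining <= 0:
                self.state = "stopped"

    def start(self) -> bool:
        """Start succeeds only if the old instance has fully stopped."""
        if self.state != "stopped":
            return False
        self.state = "running"
        return True


def restart_blind(daemon: FakeDaemon) -> bool:
    """Old behavior in this model: kill immediately followed by start."""
    daemon.kill()
    return daemon.start()


def restart_waiting(daemon: FakeDaemon, max_ticks: int = 30) -> bool:
    """Kill, wait (up to max_ticks) for the stop to finish, then start."""
    daemon.kill()
    for _ in range(max_ticks):
        if daemon.state == "stopped":
            break
        daemon.tick()
    return daemon.start()


print(restart_blind(FakeDaemon(shutdown_ticks=3)))    # False
print(restart_waiting(FakeDaemon(shutdown_ticks=3)))  # True
```

The longer the shutdown takes (a large DNSBL list, for example), the longer the waiting variant blocks, which is the trade-off being discussed.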
-
Better to have the delay than a service that hasn't started.
That was my thought as well, except in practice this led to people not having any running unbound instance, or one left running but in a weird/broken state (e.g. stuck at 100% CPU).
Unbound's official docs actually say to stop it with kill, which is still what my last patch does, but it also waits ~30 to let it stop nicely before attempting to start it again.
-
Yep, I can see from the posts here that the fix hasn't worked for everyone. Hopefully these guys will test the new patch to confirm it's good for them.
-
I hope so, too. I have not received any feedback about the patch I posted here (or on the Redmine ticket). I could not reproduce the problems here, so I'd rather not commit it without verifying it works properly on firewalls that exhibited the previous problems.