[bind] very slow startup
-
Dear,
I'm experiencing a strange problem with bind.
I use pfsense as primary dns server to resolve my internal domain names.
I've a standard configuration with 1 local zone.
When I reboot my pfSense installation, BIND takes a long time to start. I've also tried a fresh installation of pfSense and BIND.
Without any configured zone it starts immediately, but after configuring one zone with just MX and DNS entries, startup takes a very long time. From syslog:
Mar 28 06:36:56 pfSense php-fpm[361]: /rc.start_packages: Restarting/Starting all packages.
Mar 28 06:39:26 pfSense php-fpm[361]: /rc.start_packages: Configuration Change: (system): BIND: Saved resulting config file for zone in xml
from resolv.log:
Mar 28 06:39:52 pfSense named[72089]: starting BIND 9.16.23 (Extended Support Version) <id:fde3b1f>
Mar 28 06:39:52 pfSense named[72089]: running on FreeBSD amd64 12.3-STABLE FreeBSD 12.3-STABLE RELENG_2_6_0-n226742-1285d6d205f pfSense
Mar 28 06:39:52 pfSense named[72089]: built with '--disable-linux-caps' '--localstatedir=/var' '--sysconfdir=/usr/local/etc/namedb' '--with-dlopen=yes' '--with-libxml2' '--with-openssl=/usr' '--with-readline=-L/usr/local/lib -ledit' '--with-dlz-filesystem=yes' '--enable-dnstap' '--disable-fixed-rrset' '--disable-geoip' '--without-maxminddb' '--without-gssapi' '--without-libidn2' '--with-json-c' '--disable-largefile' '--without-lmdb' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' '--enable-tcp-fastopen' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.3' 'build_alias=amd64-portbld-freebsd12.3' 'CC=cc' 'CFLAGS=-O2 -pipe -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -L/usr/local/lib -ljson-c -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
It seems that the system waits for something for about 1.5 minutes, then it proceeds.
I installed shellcmd and had it execute the following command during startup:
/usr/local/sbin/named -c /etc/namedb/named.conf -u bind -t /var/etc/named/
I see an error in the log:
named: chroot(): No such file or directory
Running the same command over ssh gives:
Mar 28 07:28:35 pfSense named[41515]: loading configuration from '/etc/namedb/named.conf'
Mar 28 07:28:35 pfSense named[41515]: open: /etc/namedb/named.conf: file not found
Mar 28 07:28:35 pfSense named[41515]: loading configuration: file not found
Mar 28 07:28:35 pfSense named[41515]: exiting (due to fatal error)
It seems that it needs time to populate the /var/etc/named folder...
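If the chroot really is populated late in boot, an early start attempt could at least wait for the configuration file instead of failing right away. The following is only a minimal sketch: `wait_for_file` is a hypothetical helper, the timeout values are arbitrary, and the full path assumes that the `-c` argument is resolved inside the `-t` chroot directory.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until a file exists,
# or give up after the timeout (in seconds, default 120).
wait_for_file() {
    file=$1
    timeout=${2:-120}
    waited=0
    while [ ! -f "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # file never appeared
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use from shellcmd (paths are assumptions based on the command above):
# wait_for_file /var/etc/named/etc/namedb/named.conf 180 &&
#     /usr/local/sbin/named -c /etc/namedb/named.conf -u bind -t /var/etc/named/
```

This only avoids the "No such file or directory" failure; it does not make the folder appear any sooner.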
Have you experienced this issue? Any suggestions?
Thanks
-
-
@gertjan I think it is not the same.
In the issue reported for 9.16_12 (I have 9.16_13), the problem was with reading the configuration.
In any case, after creating a link as shown:
[2.6.0-RELEASE][root@pfSense.home.arpa]/cf: ls -l
total 4
drwxr-xr-x 3 root wheel 512 Mar 28 07:52 conf
lrwxr-xr-x 1 root wheel 14 Mar 28 05:43 named -> /var/etc/named
the problem is still present.
-
@raverag
Take a look to be sure. I saw over there: BIND 9.16_12 and BIND 9.16_13, and you use 9.16.23.
I'm not using BIND myself (on pfSense), but it looks like many versions exist.
There are some issues about where BIND could/should find its config:
Here: /etc/named/
or here: /var/etc/named/
or somewhere else? Like here: '--sysconfdir=/usr/local/etc/namedb'
Yours is complaining that there is nothing here:
Mar 28 07:28:35 pfSense named[41515]: loading configuration from '/etc/namedb/named.conf'
Mar 28 07:28:35 pfSense named[41515]: open: /etc/namedb/named.conf: file not found
which seems normal to me: most if not all pfSense settings are in /usr/local/etc or could be in /var/etc/.
Try the symlink trick.
-
@gertjan Thanks
I've looked in several paths (thanks to find) but found nothing.
It seems that the location /var/etc/named is dynamically created and populated. What I've found to improve the startup is:
- create a copy of /var/etc/named (once populated) in /cf/named
- using shellcmd, start a copy of BIND that reads its configuration from /cf/named
- after the time needed for initialization, the system (re)starts BIND normally, using /var/etc/named for the chroot.
It's a quick workaround, but it lets me use DNS while pfSense initializes BIND.
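The copy step of this workaround might look roughly like the sketch below. `snapshot_chroot` is a hypothetical helper name, and the sketch assumes the tree contains only regular files and directories; anything mounted inside it (a devfs, for example) would need separate handling.

```shell
#!/bin/sh
# Hypothetical helper: copy an already-populated chroot tree to a
# persistent location, preserving ownership, modes and timestamps.
snapshot_chroot() {
    src=$1
    dst=$2
    [ -d "$src" ] || return 1       # source not populated yet
    mkdir -p "$dst" || return 1
    cp -Rp "$src"/. "$dst"/
}

# Run once after a normal boot, when /var/etc/named is complete:
# snapshot_chroot /var/etc/named /cf/named
#
# Early-boot command for shellcmd, using the copy as the chroot:
# /usr/local/sbin/named -c /etc/namedb/named.conf -u bind -t /cf/named/
```

The copy goes stale whenever the zone configuration changes, so it would have to be refreshed after each change made in the GUI.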
-
I had the same problem with BIND taking an unnecessarily long time to start up. I did the same as you at first: I created a temporary BIND configuration and used rc.custom.local to start it early in boot, which seemed to work.
Investigating further, it turned out that the pfsense bind package file (bind.inc) was calling rndc even though named was not running. The timeout in rndc is now 60 seconds (reasonably recent change), so if you have > 1 zone, it will delay by minutes.
See my other post for a suggested rndc shell wrapper workaround - that checks if named is running first: bind 9.16_13 - rndc delays. If you choose this approach, make sure to move the original rndc binary out of the way first!
If you want to see more supporting evidence, look at /usr/local/pkg/bind.inc - search for the line
// Freeze dynamic zones to prevent journal corruption
The code calls rndc without checking that named is actually running - with a resulting 60 second timeout for each dynamic zone.
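To get a feel for the scale, the delay can be estimated as zones times timeout. This is a rough lower bound that assumes exactly one timed-out rndc call per dynamic zone; `estimate_rndc_delay` is a hypothetical helper, not part of the package.

```shell
#!/bin/sh
# Hypothetical helper: estimated boot delay in seconds, assuming one
# rndc call per dynamic zone and that every call runs into the timeout.
estimate_rndc_delay() {
    zones=$1
    timeout=${2:-60}    # rndc timeout in seconds, as described above
    echo $((zones * timeout))
}

estimate_rndc_delay 25 60   # prints 1500, i.e. 25 minutes
```

With a couple of dozen zones this alone would account for a boot that appears to hang.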
-
Dear @davetick,
you have really found the issue!
Thanks for your feedback.
-
Thank you @davetick !
I've been fighting the BIND startup with no success until I found your post.
I have ~25 zones and it seemed like it would never start.
-
@gogglespisano As @davetick suggested, the fix for me was to rename /usr/local/sbin/rndc to /usr/local/sbin/rndc.orig and create a new /usr/local/sbin/rndc:
#!/bin/sh
if [ -n "`/bin/ps auxw | /usr/bin/grep "[n]amed " | /usr/bin/awk '{print $2}'`" ]; then
    /usr/local/sbin/rndc.orig "$@"
fi
Don't forget to make the new /usr/local/sbin/rndc executable (chmod +x), and remember to do it all again after any future upgrade.
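Since the steps have to be repeated after every upgrade, they could be collected in a small script. This is only a sketch: `install_rndc_wrapper` and the marker comment are hypothetical, and the wrapper body follows the one above with the backticks written as `$(...)`. The marker check keeps a second run from overwriting rndc.orig with the wrapper itself.

```shell
#!/bin/sh
# Hypothetical installer for the rndc wrapper described above.
MARKER='rndc-wrapper: skip rndc when named is not running'

install_rndc_wrapper() {
    target=${1:-/usr/local/sbin/rndc}
    [ -f "$target" ] || return 1

    # Already wrapped: leave both files untouched.
    if grep -q "$MARKER" "$target" 2>/dev/null; then
        return 0
    fi

    # Move the real binary aside, then write the wrapper in its place.
    mv -f "$target" "$target.orig" || return 1
    cat > "$target" <<EOF
#!/bin/sh
# $MARKER
if [ -n "\$(/bin/ps auxw | /usr/bin/grep '[n]amed ' | /usr/bin/awk '{print \$2}')" ]; then
    "$target.orig" "\$@"
fi
EOF
    chmod +x "$target"
}

# After installing or upgrading the BIND package:
# install_rndc_wrapper /usr/local/sbin/rndc
```

After an upgrade the package replaces the wrapper with a fresh binary, so the marker is gone and the next run moves the new binary to rndc.orig again.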
-
Based on @davetick's workaround, I've submitted a pull request to fix this problem and also make the BIND widget work again. The widget would work on an upgrade that still had the old /cf/named/ folder, but would fail on a new install of 2.6.
The code is at https://github.com/pfsense/FreeBSD-ports/pull/1163/files/
I've also attached a patch file. Please test it and let me know if you have any problems or you can comment in the pull request or associated bugs.
-
BIND 9.16_17 has been released with the patch.
-
Awesome work! I've had a look and it seems to combine a number of really good fixes. RC start/stop, the bind.inc rndc calls: all looking good.
Deployed to a test instance, it is performing really well. I will test/break/play for a few more days before deploying to prod, but it's a nice solid fix. Thanks @gogglespisano,
So nice to have a reasonable boot time without a hacky workaround
:-)
-
Thanks @davetick. It was your post that got me pointed in the right direction.
-
@gogglespisano Good job! I've tested it in my environment and it works properly.