Bind upgrade producing errors on pfsense 2.5 upgrade
-
@gertjan Ok thanks for the info, I did not know that :-)
-
@matthijs was bind turned off when you tried to remove the journal files?
-
@de0xyrib0se
No, the first thing I did was raise the SOA serial number for my (master) zones to a number higher than the one in the last .jnl zone update (I use the date serial format yyyymmddnn).
After that I logged in to the pfSense host with ssh, went to /cf/named/etc/namedb/master/mymastername/ and ran
rm *.jnl
and then restarted bind.
I don't think it is related to my issue. My problem with bind (I think) is that during startup/boot, and also during package install/reinstall, rndc tries to connect to 127.0.0.1#8953 for some reason while bind is not running at that very moment. This results in the "rndc: connect failed: 127.0.0.1#8953: timed out" message (and it tries 5 times or so, which takes a long time).
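For anyone following along, the serial bump described in this post can be sketched as a small shell helper. This is only an illustration under assumptions: next_serial is a made-up name (nothing shipped with bind or pfSense), and it assumes the serial is always a 10-digit yyyymmddnn value.

```sh
#!/bin/sh
# next_serial CURRENT [TODAY]
# Hypothetical helper: prints a yyyymmddnn serial strictly greater than CURRENT.
# TODAY (yyyymmdd) defaults to the current date.
next_serial() {
    current=$1
    today=${2:-$(date +%Y%m%d)}
    base="${today}00"
    if [ "$current" -lt "$base" ]; then
        # The old serial is from an earlier day: start today's numbering at 00.
        echo "$base"
    else
        # Already on (or past) today's date: just count up by one.
        echo $((current + 1))
    fi
}

# Usage:
#   next_serial 2021101101 20211011   # prints 2021101102
#   next_serial 2021093005 20211011   # prints 2021101100
```

The only property that matters to bind is that the new serial is higher than the old one; the date prefix is a convention for humans.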
-
This:
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
rm *.jnl
and then restarted bind
is a major no-go.
These .jnl files are database-like files in a binary format, and bind keeps them open permanently.
You can't 'delete' them while bind9 has them open for writing.
If the rm and restart had to be done (which I doubt), I would do it like this (old-fashioned Debian service handling):
service bind9 stop
Now I edit zone files, config files, whatever.
When done, I check my config and zone files:
named-checkconf -z
When there are no errors and all looks dandy:
service bind9 start
Btw: when I need to update a zone, for example to change the SOA:
root@ns311465:~# rndc freeze test-domaine.fr
root@ns311465:~# nano /etc/bind/zones/db.test-domaine.fr
root@ns311465:~# rndc reload test-domaine.fr
zone reload queued
root@ns311465:~# rndc thaw test-domaine.fr
A zone reload and thaw was started.
Check the logs to see the result.
No need to restart bind, no journal file issues.
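The freeze/edit/thaw sequence above could be wrapped in a small script, roughly like the sketch below. The names run and edit_dynamic_zone are invented for this example, the named-checkzone step is an addition that is not in the transcript, and the script only prints the commands unless DRY_RUN=0 is set. It relies on rndc thaw reloading the zone from disk, so the separate rndc reload is left out.

```sh
#!/bin/sh
# By default nothing is executed, the commands are only printed.
# Set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# edit_dynamic_zone ZONE ZONEFILE (hypothetical helper)
edit_dynamic_zone() {
    zone=$1
    zonefile=$2
    # Stop dynamic updates and flush the journal into the zone file.
    run rndc freeze "$zone" || return 1
    # Edit by hand (remember to raise the SOA serial).
    run "${EDITOR:-vi}" "$zonefile"
    # Extra safety step, not in the transcript: syntax-check the edited file.
    if ! run named-checkzone "$zone" "$zonefile"; then
        echo "zone file has errors, zone left frozen: fix it and thaw manually" >&2
        return 1
    fi
    # Reload the zone from disk and re-enable dynamic updates.
    run rndc thaw "$zone"
}

# Usage:
#   edit_dynamic_zone test-domaine.fr /etc/bind/zones/db.test-domaine.fr
```

Leaving the zone frozen when the check fails is a deliberate choice here: a frozen zone keeps serving the old data, which seems safer than thawing a file that may not load.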
Btw: journal files exist if the zone files are modified by means other than the admin editing them.
For example, when the zone contains records that are updated using RFC 2136.
Or when the zone is signed for DNSSEC.
Simple zones do not have .jnl and .jbk files.
Btw: I'm using the somewhat older
BIND 9.9.5-9+deb8u19-Debian (Extended Support Version)
-
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
@de0xyrib0se
No, the first thing I did was raise the SOA serial number for my (master) zones to a number higher than the one in the last .jnl zone update (I use the date serial format yyyymmddnn).
After that I logged in to the pfSense host with ssh, went to /cf/named/etc/namedb/master/mymastername/ and ran
rm *.jnl
and then restarted bind.
I don't think it is related to my issue. My problem with bind (I think) is that during startup/boot, and also during package install/reinstall, rndc tries to connect to 127.0.0.1#8953 for some reason while bind is not running at that very moment. This results in the "rndc: connect failed: 127.0.0.1#8953: timed out" message (and it tries 5 times or so, which takes a long time).
Shut down bind (the command is listed above) and then do the rm; you cannot remove the files while bind has a read lock on them. Restart bind afterwards and it will rebuild the journal files automatically.
This is what I did and it worked like a charm.
-
Thanks for your reply and suggestion. I did exactly as you described (I probably also stopped bind beforehand; I forgot to mention that). Your suggestion did not solve my "rndc: connect failed: 127.0.0.1#8953: timed out" message during startup/boot and package install/reinstall.
rndc tries to connect to 127.0.0.1#8953 (during startup/boot and package install/reinstall) at a moment it is not running (hence the timeout). After bind is started it is running with no problems. Also rndc runs perfectly without errors AFTER bind is started.
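One way to avoid firing rndc at a named that is not up yet is to probe first and only continue once the probe succeeds. The sketch below is a generic retry loop, not what the pfSense package actually does; wait_until_ready is a made-up name, and the rndc options in the usage comment assume the control channel on 127.0.0.1 port 8953 as in this thread.

```sh
#!/bin/sh
# wait_until_ready TRIES DELAY PROBE [ARGS...]
# Runs PROBE up to TRIES times, sleeping DELAY seconds between failed attempts.
# Returns 0 as soon as PROBE succeeds, 1 if it never does.
wait_until_ready() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Only sleep if another attempt is coming.
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (assumed address and port, adjust to your setup):
#   wait_until_ready 10 1 rndc -s 127.0.0.1 -p 8953 status \
#       && rndc -s 127.0.0.1 -p 8953 reload
```

Note that each failed rndc attempt can itself take a while to time out, so a low number of tries keeps the total wait bounded.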
-
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
rndc tries to connect to 127.0.0.1#8953 (during startup/boot and package install/reinstall) at a moment it is not running (hence the timeout). After bind is started it is running with no problems. Also rndc runs perfectly without errors AFTER bind is started.
If bind (named) is not running, rndc cannot contact it, hence the
rndc tries to connect to 127.0.0.1#8953
error.
If named is running, the error won't show (because rndc can now contact named on port 8953).
Again: running unbound and bind on the same device is something I wouldn't advise.
-
"If bind (named) is not running, rndc cannot contact it, hence the rndc tries to connect to 127.0.0.1#8953"
So why is rndc trying to connect to named (during reboot) when named is not started yet? And why is there no problem at all as soon as named is started?
"Again: running unbound and bind on the same device is something I wouldn't advise."
Why not? I have unbound and named running on separate interfaces and separate ports.
-
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
Why not? I have unbound and named running on separate interfaces and separate ports.
As long as the control ports (yours are 8953 for bind and 853 for unbound) are not conflicting, which they are not, and you 'bind' unbound to "interface 1" and named to "interface 2", they can co-exist.
The control ports are normally only bound to 127.0.0.1 or ::1.
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
So why is rndc trying to connect to named (during reboot) when named is not started yet?
I'm not using bind, the pfSense package, myself.
I don't know why and when rndc is used.
Check the logs to see if bind (named) is already started when this happens.
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
And why is there no problem at all as soon as named is started?
rndc is a program that controls the behaviour of bind (named) at run time,
like unbound-control for unbound.
rndc won't produce error messages if named is running, and it complains if it isn't.
The real question (for me) is: why is rndc executed if named isn't running yet?
Btw: like a web server or mail server, a DNS resolver + domain server like bind isn't really a service that can be made accessible with a GUI. There are just too many settings, options and different cases.
You wind up using the config files.
Take note: I use bind (named) a lot, as master domain server and several slaves, for all my domain names, DNSSEC stuff. The pfSense acme package works perfectly using RFC 2136, something bind supports very well. But I use it on my web/mail/whatever servers, all dedicated servers on the Internet, not on my local firewall.
Again, this is my opinion of course.
-
I did a complete reinstall of pfSense 2.5.2 and restored my last configuration.
It did not automatically reinstall all of the packages after the first reboot; bind/named was left out.
I got the following notice in the upper right corner:
General
Package named does not exist in current pfSense version and it has been removed. @ 2021-10-11 13:44:20
Package reinstall process finished successfully @ 2021-10-11 13:44:45
It did automatically reinstall all the other packages.
-
After the above notice/error I manually reinstalled bind/named from the package manager.
Here's the log output:
Installing pfSense-pkg-bind...
Updating pfSense-core repository catalogue...
pfSense-core repository is up to date.
Updating pfSense repository catalogue...
pfSense repository is up to date.
All repositories are up to date.
The following 5 package(s) will be affected (of 0 checked):
New packages to be INSTALLED:
bind916: 9.16.16_1 [pfSense]
fstrm: 0.6.1 [pfSense]
pfSense-pkg-bind: 9.16_11 [pfSense]
protobuf: 3.14.0,1 [pfSense]
protobuf-c: 1.4.0 [pfSense]
Number of packages to be installed: 5
The process will require 43 MiB more space.
6 MiB to be downloaded.
[1/5] Fetching pfSense-pkg-bind-9.16_11.txz: ... done
[2/5] Fetching bind916-9.16.16_1.txz: .......... done
[3/5] Fetching protobuf-c-1.4.0.txz: .......... done
[4/5] Fetching protobuf-3.14.0,1.txz: .......... done
[5/5] Fetching fstrm-0.6.1.txz: ......... done
Checking integrity... done (0 conflicting)
[1/5] Installing protobuf-3.14.0,1...
[1/5] Extracting protobuf-3.14.0,1: .......... done
[2/5] Installing protobuf-c-1.4.0...
[2/5] Extracting protobuf-c-1.4.0: .......... done
[3/5] Installing fstrm-0.6.1...
[3/5] Extracting fstrm-0.6.1: .......... done
[4/5] Installing bind916-9.16.16_1...
[4/5] Extracting bind916-9.16.16_1: .......... done
[5/5] Installing pfSense-pkg-bind-9.16_11...
[5/5] Extracting pfSense-pkg-bind-9.16_11: .......... done
Saving updated package information...
done.
Loading package configuration... done.
Configuring package components...
Loading package instructions...
Custom commands...
Executing custom_php_install_command()...done.
Executing custom_php_resync_config_command()...rndc: connect failed: 127.0.0.1#8953: timed out
rndc: connect failed: 127.0.0.1#8953: timed out
rndc: connect failed: 127.0.0.1#8953: timed out
rndc: connect failed: 127.0.0.1#8953: timed out
rndc: connect failed: 127.0.0.1#8953: timed out
rndc: connect failed: 127.0.0.1#8953: timed out
done.
Menu items... done.
Services... done.
Writing configuration... done.
Message from bind916-9.16.16_1:
--
BIND requires configuration of rndc, including a "secret"
key. The easiest, and most secure way to configure rndc is
to run 'rndc-confgen -a' to generate the proper conf file,
with a new random key, and appropriate file permissions.
The /usr/local/etc/rc.d/named script will do that for you.
If using syslog to log the BIND9 activity, and using a
chroot'ed installation, you will need to tell syslog to install
a log socket in the BIND9 chroot by running:
sysrc altlog_proglist+=named
And then restarting syslogd with: service syslogd restart
Cleaning up cache... done.
Success
-
@gertjan
I use ACME with the DNS-NSupdate method on the pfSense box, which works great.
I need the ACME certificates on the pfSense box because I also use HAProxy SSL offloading for a number of websites.
-
Everything was working fine with pfSense 2.4.5; I first encountered the bind issues after the upgrade to 2.5.0.
-
Ah, ok.
Executing custom_php_resync_config_command()...rndc: connect failed: 127.0.0.1#8953: timed out
When installed, bind (named) can't run right away. It probably needs some setup first.
And if it did run right away during the install, then, as the default is to bind its ports (853, 53) to all interfaces, it would clash with unbound right from the start.
Why rndc is used during install: I can't tell.
Maybe it is related to this:
BIND requires configuration of rndc, including a "secret" key. The easiest, and most secure way to configure rndc is to run 'rndc-confgen -a' to generate the proper conf file, with a new random key, and appropriate file permissions.
But again, like nearly any other pfSense package, the service won't run, as it needs a proper setup first.
What happens when you install bind and, while it's loading, you stop the unbound service?
bind (named) will probably start, as it can use all the ports/interfaces it needs.
After that, you fine-tune bind and restart it.
Then you can start unbound as well.
-
@gertjan
rndc-confgen -a creates a default config in /usr/local/etc/namedb, while pfSense uses a chroot environment for bind/named (in /cf/named/etc/named/).
Like I said, bind/named was working fine in pfSense 2.4.5 without issues.
The notice after a full new pfSense install and configuration restore is also not good:
"General
Package named does not exist in current pfSense version and it has been removed. @ 2021-10-11 13:44:20
Package reinstall process finished successfully @ 2021-10-11 13:44:45"
I hope the issues get fixed soon.
-
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
Package named does not exist in current pfSense version and it has been removed. @ 2021-10-11 13:44:20
'named' is called 'bind' now.
Not really an issue.
It will break the automatic reinstall, of course. You had to install it manually.
-
I think the issue is caused by something wrong in /usr/local/etc/rc.d/named
-
@matthijs said in Bind upgrade producing errors on pfsense 2.5 upgrade:
/usr/local/etc/rc.d/named
-
Configuration:
unbound is listening on:
192.168.10.1
192.168.20.1
localhost
The control port is set to 953
bind is listening on:
192.168.10.9
The control port is set to 8953
Some zones are dynamic and updated by the DHCP service
pfBlocker with DNSBL is enabled
The problem:
Upon rebooting the server, startup is really slow. This has happened during and since the upgrade to version 21 (aka 2.5).
Investigation:
Reboot the server and connect with the serial console and ssh.
When the server seems frozen, run ps -aux | grep (various names here: named/bind/rndc) on the ssh console.
We notice rndc is being started multiple times (rndc freeze/thaw commands), and each run takes a long time to complete (it seems to go over all defined bind zones). Freeze/thaw operations are related to manually editing the zone files of a dynamic zone.
We notice no new services are starting while bind is in this process loop.
Keep killing the rndc processes and the boot sequence will finish in a reasonable time.
Leave it running and pfSense will eventually finish starting (this seems to depend on the number of zones configured in bind).
Possible causes:
rndc commands are being run before bind is started (rndc cannot start bind on its own)
rndc is using the wrong port
bind is started but the control channel is on the wrong port
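To narrow down the port-related causes, it may help to check which process actually owns each control port while the boot is hanging. The filter below is a rough sketch: listener_on_port is an invented name, and it assumes the usual FreeBSD sockstat layout where the command is in the second column and the local address in the sixth.

```sh
#!/bin/sh
# listener_on_port PORT
# Reads `sockstat -4 -l` style output on stdin and prints the command name
# of every socket whose local address ends in :PORT.
# Assumption: column 2 is COMMAND and column 6 is LOCAL ADDRESS.
listener_on_port() {
    awk -v port="$1" '
        NR > 1 {
            # The port is whatever follows the last colon of the local address.
            n = split($6, parts, ":")
            if (parts[n] == port) print $2
        }'
}

# Usage on the firewall (ports taken from the configuration above):
#   sockstat -4 -l | listener_on_port 8953   # expected owner: named
#   sockstat -4 -l | listener_on_port 953    # expected owner: unbound
```

If nothing is printed for 8953 while the rndc processes are piling up, that points at the first cause (named not started yet) rather than at a wrong port.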