21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver
-
After every admin-initiated reboot, I must manually restart unbound (DNS Resolver) before pfSense will start servicing LAN DNS requests. Unbound is running, the resolver tables are empty, and the last log entry is:
Apr 17 11:48:09 gateway unbound[21674]: [21674:0] info: start of service (unbound 1.13.1).
I am not sure if this started with 21.02-p1, 21.02.2, or 21.02 itself.
This definitely was not happening prior to 21.02.
-
If it's enabled:
It starts when the system boots.
When it stops, as part of a restart process, it should leave about 20 log lines of statistics.
What packages are you using?
What type of interfaces?
-
As I stated, unbound is running after boot. It just isn’t doing anything and the resolver tables are empty. The only packages running are arpwatch, bandwidthd, and darkstat. They were all in place prior to the upgrade. I do have DHCP Registration (Register DHCP leases in the DNS Resolver) enabled so that the bandwidth monitor tables will resolve device hostnames. I suspect a race condition on startup, probably during the many unbound restarts as devices initially reconnect, that leaves unbound blocked or unstable. During my next maintenance outage I will reboot with DHCP registration disabled and see what effect that has.
-
I just upgraded from pfSense 2.4.5 to 2.5.1 tonight and I'm now experiencing this same behavior. After the upgrade (and after 3 subsequent reboots), the DNS resolver (unbound) won't resolve requests until I do a restart after a reboot. It is running, just not responding to anything.
-
@grumple said in 21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver:
It is running, just not responding to anything.
Can you run from the command line :
dig @127.0.0.1 google.com
?
-
@plfinch said in 21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver:
bandwidthd, and darkstat.
These two put a strain on the system.
I would use them only after you're sure the system is fine.
@plfinch said in 21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver:
I will reboot with DHCP registration disabled
That's always a good idea: as soon as the DHCP server(s) start, unbound could be restarted on every new lease. Check the DHCP log to see how many new leases are handled at any given time.
-
@gertjan One of the things I noted in another thread last night was the following. I have my unbound set to listen only on LOCALHOST and LAN (in the config it shows the following)
# Interface IP(s) to bind to
interface: 192.168.20.1
interface: 127.0.0.1
interface: ::1
# Outgoing interfaces to be used
outgoing-interface: 192.168.20.1
However upon a fresh reboot, when I login and look via sockstat, unbound is only listening on LOCALHOST. I didn't send it an actual query as it clearly wasn't listening on LAN which is the issue, but I will when I get a chance to reboot again. I also have it set to send queries from the LAN address as well, which may actually cause it to fail (since it would appear that the LAN address wasn't available yet when it started unbound initially), but as I said will do further testing. This was exactly how I had it configured in 2.4.5 where it worked just fine.
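The sockstat check described here can be scripted so it is easy to repeat after each reboot. The following is only a sketch: it assumes Python 3 is available on the firewall, that the text comes from something like `sockstat -4 -6 -l`, and that the columns are USER, COMMAND, PID, FD, PROTO, LOCAL ADDRESS, FOREIGN ADDRESS. The function names are made up for this example.

```python
def listening_addresses(sockstat_output, command="unbound", port=53):
    """Return the set of local addresses on which `command` listens on `port`.

    Assumes whitespace-separated sockstat columns:
    USER COMMAND PID FD PROTO LOCAL-ADDRESS FOREIGN-ADDRESS
    """
    addresses = set()
    for line in sockstat_output.splitlines():
        fields = line.split()
        # Skip the header line, short lines, and other daemons.
        if len(fields) < 6 or fields[1] != command:
            continue
        # Split on the last ":" so IPv6 addresses such as ::1 stay intact.
        host, _, local_port = fields[5].rpartition(":")
        if local_port == str(port):
            addresses.add(host)
    return addresses


def missing_addresses(sockstat_output, expected):
    """Return the expected addresses that unbound is NOT listening on."""
    return set(expected) - listening_addresses(sockstat_output)
```

Feeding it the sockstat text captured right after boot together with the expected addresses (here 127.0.0.1, ::1 and 192.168.20.1) should return the LAN address when the problem is present and an empty set after unbound has been restarted.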
-
@grumple said in 21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver:
b
It's Friday here, and I'm preparing myself to receive some local fools, so I can't be sure of anything, but:
Lately, it seems to me that it is best to use these settings:
(screenshot of the settings not preserved)
I've seen strange things when I preselect a "logical" interface for inbound and outbound for unbound.
Commands like dig and nslookup start to dump error messages to the command line.
Like "127.0.0.1" took a blow to the head.
Not sure.
-
@gertjan I can't do ALL on the outbound request (nor do I really want to expose my DNS resolver to the world either) as I have some "internal" domains that are resolved by resolvers across IPSec tunnels and thus the query has to come from the internal interface to work properly. I'll keep monitoring it for now, and will likely write a script this weekend to monitor it and restart it if it goes dead.
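A watchdog along those lines could look roughly like the sketch below. It is not a tested pfSense script. It assumes Python 3 is installed on the firewall, that the resolver is probed on its LAN address (192.168.20.1 in this thread), and that `pfSsh.php playback svc restart unbound` is an acceptable way to restart the service on your version. Check that command by hand first. The function names and the thresholds are made up for illustration.

```python
import socket
import struct
import subprocess
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet (type A, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def resolver_answers(server, name="google.com", timeout=3.0):
    """Return True if `server` sends back any DNS reply carrying our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # timeout, or port unreachable because nothing listens
            return False
    return len(reply) >= 12 and reply[:2] == query[:2]


def watch(server, restart_cmd, interval=60, failures_needed=3):
    """Probe `server` forever; run `restart_cmd` after repeated failures."""
    failures = 0
    while True:
        failures = 0 if resolver_answers(server) else failures + 1
        if failures >= failures_needed:
            subprocess.run(restart_cmd, check=False)
            failures = 0
        time.sleep(interval)
```

It would be started with something like `watch("192.168.20.1", ["pfSsh.php", "playback", "svc", "restart", "unbound"])`. Requiring several consecutive failures before restarting is a deliberate choice so that a single slow upstream answer does not bounce the resolver.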
-
I have the same problem with v2.5.1. I have to restart unbound after a reboot of the firewall.
@grumple
Do you have any VPN clients running? Try disabling them and reboot again. See if it helps.
-
Had a maintenance window...
I rebooted 5 times with DHCP Registration (Register DHCP leases in the DNS Resolver) enabled and 5 times with it disabled. All 10 times I had to restart unbound (DNS Resolver) before LAN clients could get DNS queries resolved. So the issue is not related to DHCP lease registration, and is consistently reproducible.
I did notice that the state of unbound (from the ps command) was I (idle, sleeping for longer than about 20 seconds) immediately after boot with the problem present, and no additional CPU time was accumulating (frozen at 0.22S). After restarting unbound the state was S (sleeping for less than about 20 seconds) as expected, and CPU time did accumulate on subsequent ps queries (with unbound obviously waking and doing work).
Further, as grumple noted, from sockstat I can see that when the issue is present unbound is listening on 127.0.0.1 but not on the firewall’s primary IP, as would be necessary to satisfy LAN DNS requests. After restarting unbound it is listening and responding on the firewall’s primary IP as well.
So, clearly, following reboot, unbound has stopped processing events and is permanently out to lunch on the LAN side. This behavior is new.
-
I have exactly the same issue (custom build socket 1151 PC). I have to restart unbound after every reboot of the PC - unbound service itself appears running, but none of the lan clients can resolve names until manual unbound restart. It all started after upgrade to 2.5.0 (update to 2.5.1 changes nothing)
Looks like the workaround is to select "All" for both interface settings on the Services > DNS Resolver > General Settings page. [need more testing, waiting for maintenance window]
I have the OpenVPN server enabled.
I have "Register DHCP static mappings in the DNS Resolver" option enabled.
I have "Register DHCP leases in the DNS Resolver" option disabled.
Issue is 100% repeatable.
-
Hello!
pfSense generates the unbound.conf on the fly from its config under a variety of circumstances, including when you boot (rc.bootup). Assuming you specified which network interfaces to listen on (not ALL), it will not add an interface to the config that is disabled or has no carrier.
I wonder if your LAN interface is showing no carrier for some period of time at boot. Once the interface is UP and you restart unbound, pfSense sees it and adds it to the unbound.conf...?
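One way to test this theory is to look at the generated config right after a reboot, before restarting unbound, and see which addresses made it in. A small sketch, assuming Python 3 is available and that the generated file is /var/unbound/unbound.conf (the path is an assumption, adjust if yours differs). The function name is made up for this example.

```python
def bound_interfaces(conf_text):
    """Return (listen, outgoing) address lists found in unbound.conf text."""
    listen, outgoing = [], []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Split at the first ":" only, so IPv6 values like ::1 survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "interface":
            listen.append(value)
        elif key == "outgoing-interface":
            outgoing.append(value)
    return listen, outgoing
```

If the LAN address is missing from the `interface:` list immediately after boot but present after a manual restart, that would point at the config being generated before the interface had carrier.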
I see this behavior if I don't have an active cable/device plugged into the LAN port at boot. For kicks, you could try putting a long sleep in front of the services_unbound_configure() call in rc.bootup and see if it makes a difference.
John
-
@densilent said in 21.02.02 on SG-5100 - Every Reboot Requires Restart of DNS Resolver:
I have "Register DHCP static mappings in the DNS Resolver" option enabled.
The default value. That should be ok.
Take my word for it: unchecking is even better.
Set these to "All":
-
Just to make it a bit more complicated ;-) I see exactly that behaviour every time my WAN connection has been broken and comes back. At the moment we have a not-so-stable cable connection, so I have had that several times in the last 5 days. The pfSense box was not rebooted; just the WAN connection was down and came back.
I also have pfBlockerNG running. Wondering if this might have some influence here.
-
Sorry, I'm having a bunch of network issues and I guess I posted this on the wrong thread... I meant to post this here:
I have two SG-5100s and two SG-4860s. I did an upgrade from 2.5 to 21.02.2-RELEASE on both SG-4860s and one of the SG-5100's.
I am now seeing this same unbound DNS resolver crash issue on both SG-5100s (even the one that I did not upgrade) and one of the SG-4860s.
I am also running pfBlockerNG and Suricata.
-
I still have this very annoying issue as do many others. As surmised by posts here and in other forum topics, I am guessing that something has changed in the boot sequence with 21.x/2.5.x causing unbound to start before the LAN interfaces are fully up. Thus, the resolver starts up not listening for LAN DNS requests. This happens 100% of the time for me. Every boot/reboot means LAN clients cannot access the Internet using domain names until I restart unbound. This was never an issue with prior pfsense versions.
What is the simplest temporary fix for this? I need to either delay initial unbound start or force an unbound restart sometime after boot completes. It saddens me to have to resort to a hack for something that should just work but it is either that or roll back.
Peter
-
I believe I have the very same issue, except that unbound will crash or hang for a very long time, multiple times after the firewall is up and running. Each time, name resolution fails on the LAN and I need to restart the DNS Resolver, and then things are fine. I am running 21.02.2-RELEASE on an SG-2220 that has historically been incredibly reliable until this issue.
-
I now tried the following:
System -> Routing -> Gateways:
For all gateways I checked "Disable Gateway Monitoring" and "Disable Gateway Monitoring Action"
Services -> DNS Resolver -> General Settings:
In "Network Interfaces" selected every single entry and not only "All" which I had before.
I tried:
- Reboot pfSense
- Broken WAN connection
Both times unbound did work afterwards without any manual intervention.
Not sure if it is by accident or if it changed something. So keep fingers crossed ;-)
-
I've seen similar behavior with 2.5.0 and 2.5.1: after a reboot of my pfsense box, I had to manually restart the DNS Resolver.
For me, the issue appears to have been some of my IoT zoo members: I have a bunch of Sensibo Skys to control my a/c.
With their MAC address being aa:bb:cc:dd:ee:ff, they register a client hostname with the DHCP server of "Sensibo Sky ff:ee:dd:cc:bb:aa" (yes, reverse byte order; and to increase confusion: bytes<0x10 lack the leading 0 in the hostname).
As far as I can see, ":" is not a valid character in a hostname. The DHCP server doesn't seem to mind that much; however, I had system log entries "bad name in dhcpd.leases".
Excerpt from /var/dhcpd/var/db/dhcpd.leases:
lease 10.94.0.102 {
  starts 5 2021/06/04 09:13:02;
  ends 5 2021/06/04 11:13:02;
  tstp 5 2021/06/04 11:13:02;
  cltt 5 2021/06/04 09:13:02;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet bc:dd:c2:11:f6:0f;
  client-hostname "Sensibo Sky f:f6:11:c2:dd:bc";
}
Since I've mapped these devices to fixed DHCP settings and gave them valid hostnames on the DHCP server tab, I can happily reboot my pfsense box, and it comes back up with a running and functional DNS resolver.
From that, I'd assume there are at least three culprits in my case:
- the IoT devices registering invalid names
- the DHCP server not filtering that and putting them in its lease database as is
- the DNS resolver not being able to deal with this, and apparently refusing to come up on boot.
The DNS resolver has "Register DHCP leases in DNS Resolver", and "Register DHCP static mappings in the DNS Resolver" both active.
I have no clue why the DNS resolver would work despite the issue after a manual restart.
Maybe this helps someone.
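For anyone who wants to check their own lease file for names like this, here is a rough sketch. It assumes Python 3 is available, that the lease file uses the `client-hostname "...";` syntax shown in the excerpt above, and that "valid" means the usual letters, digits and hyphens per dot-separated label. The function names are made up for this example, and the check may be stricter or looser than whatever the resolver actually applies.

```python
import re

# One DNS label: letters, digits, hyphens; no leading/trailing hyphen; max 63 chars.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
_CLIENT_HOSTNAME = re.compile(r'^\s*client-hostname\s+"([^"]*)";', re.MULTILINE)


def is_valid_hostname(name):
    """Return True if every dot-separated label of `name` is a valid DNS label."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.rstrip(".").split("."))


def bad_client_hostnames(leases_text):
    """Return the client-hostname values in dhcpd.leases text that are not valid."""
    return [
        name
        for name in _CLIENT_HOSTNAME.findall(leases_text)
        if not is_valid_hostname(name)
    ]
```

Running `bad_client_hostnames()` over the contents of /var/dhcpd/var/db/dhcpd.leases would list the entries worth giving a static mapping with a clean hostname, as described above.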