pfSense 2.5.2 CE - DNS Resolver periodically stops working
-
DNS Resolver stops working periodically (several times a week, at different times) and does not recover on its own. pfSense 2.5.2 (Community Edition), unbound 1.12.0.
I fix it by rebooting, or just by opening 'System / General Setup' and clicking the Save button at the bottom, but it crashes again a few days later.
I have tried various lists of DNS servers, with the same result.
I have DNS Resolution Behavior set to 'Use local DNS (127.0.0.1), fall back to remote DNS Servers (Default)', the 'Allow DNS server list to be overridden by DHCP/PPP on WAN' option on, DNS Resolver on, DNS Forwarder off, no DHCP, IPv4 only, a non-RFC 1918 LAN subnet (192.201.0.0/24), the bandwidthd 0.7.4_5 and ntopng 0.8.13_10 packages installed, 1 GB of RAM and a swap file of about 4 GB.
There are no relevant messages in /var/log/system.log or /var/log/resolver.log, and nothing unusual on the Dashboard after the crash. What should I do to narrow down the issue the next time it crashes?
-
@oleg_v_les Rather than rebooting can you start it in Status/Services or the DNS Resolver page?
So you're not running a pfBlocker package?
There is a Watchdog package to restart failed services, as long as you aren't running Snort or Suricata. Not a solution, but could be a faster recovery.
-
@steveits Thanks for your answer, Steve!
I just installed Service_Watchdog and hope it will help. No, I haven't tried starting the unbound service from Status/Services or the DNS Resolver page (and there would be no point from now on because of Service_Watchdog), and I don't use a pfBlocker package.
Let's see how it will help...
-
@oleg_v_les Ah! It happened again now!
Unfortunately, Watchdog did not help. But restarting unbound from Status/Services did.
By the way, during the outage I went to Diagnostics/Ping and pinged an FQDN, and it was resolved to its IP address successfully (though the same FQDN could not be resolved from a LAN host at that time).
There are many lines in /var/log/resolver.log like these:
Nov 19 12:14:57 pfSense4 unbound[52723]: [52723:1] notice: remote address is 192.201.0.208 port 53705
Nov 19 12:14:57 pfSense4 unbound[52723]: [52723:1] notice: sendmsg failed: No buffer space available
Nov 19 12:14:57 pfSense4 unbound[52723]: [52723:1] notice: remote address is 192.201.0.208 port 53705
Nov 19 12:15:56 pfSense4 unbound[52723]: [52723:0] notice: sendmsg failed: No buffer space available
Nov 19 12:15:56 pfSense4 unbound[52723]: [52723:0] notice: remote address is 192.201.0.72 port 52015
where 192.201.0.208 and 192.201.0.72 are my LAN IP addresses. But such lines also appear periodically when there is no issue (which is why I didn't mention them earlier, but they are the only suspicious thing now).
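To see whether these messages cluster around the times the resolver stops answering, the log can be tallied per minute and per client. A rough sketch in Python, assuming the log format shown in the excerpt above and assuming that each "remote address" line belongs to the "sendmsg failed" line logged just before it by the same thread (the excerpt is consistent with that, but it is not confirmed here). The function name count_failures is made up for illustration.

```python
import re
from collections import Counter

# Patterns for the two message types in the excerpt; group 1 is the thread id.
FAIL_RE = re.compile(
    r"unbound\[\d+\]: \[\d+:(\d+)\] notice: sendmsg failed: No buffer space available"
)
ADDR_RE = re.compile(
    r"unbound\[\d+\]: \[\d+:(\d+)\] notice: remote address is (\S+) port \d+"
)


def count_failures(lines):
    """Return (failures per minute, failures per client address)."""
    per_minute = Counter()
    per_client = Counter()
    pending = set()  # thread ids that logged a failure and await an address line
    for line in lines:
        fail = FAIL_RE.search(line)
        if fail:
            # First 12 characters of a syslog line: "Nov 19 12:14" (no seconds).
            per_minute[line[:12]] += 1
            pending.add(fail.group(1))
            continue
        addr = ADDR_RE.search(line)
        # Assumption: the address line follows its failure line on the same thread.
        if addr and addr.group(1) in pending:
            pending.discard(addr.group(1))
            per_client[addr.group(2)] += 1
    return per_minute, per_client


# Usage (path taken from this thread):
#   with open("/var/log/resolver.log") as f:
#       per_minute, per_client = count_failures(f)
#   print(per_minute.most_common(10), per_client.most_common(10))
```

If the busiest minutes line up with the outages, that would point at the buffer errors; if they are spread evenly, they are probably unrelated noise.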
Any ideas?
-
Are you perhaps using traffic shaping or a limiter? If so, maybe this thread from 2019 has relevance: https://forum.netgate.com/topic/144487/unbound-and-traffic-shaping-cause-sendto-failed-no-buffer-space-available.
-
@oleg_v_les You said "restarted" unbound...so it was not stopped? In that case the Watchdog won't help because it starts stopped services. You wrote above it was crashing.
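Since the Watchdog only starts services that have stopped, a resolver that is still running but not answering needs an active check instead. Below is a minimal sketch in Python that could be run from a LAN host: it sends one plain UDP query for an A record and reports whether anything came back. The server address 192.201.0.1 (assumed to be the pfSense LAN interface), the test name, and the function names are assumptions for illustration, not details from this thread.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server, name, timeout=3.0, port=53):
    """Send one UDP query; return 'answered', 'timeout', or 'error: ...'."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    # Only the transaction id is checked; the answer itself is not parsed.
    if len(reply) >= 2 and reply[:2] == query[:2]:
        return "answered"
    return "error: unexpected reply"


# Usage (hypothetical address and name):
#   print(probe("192.201.0.1", "example.com"))
```

Run from cron or a loop with timestamps, a "timeout" result while Status/Services still shows unbound as running would confirm that the service is hung rather than crashed.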
I hadn't heard of the buffer issue w/r/t shaping. I pulled up an SG-3100 router with shaping and there are 3 of those logged in the past 3 months. I did a quick search and found this Reddit post which says updating Realtek drivers was a solution...what kind of NIC do you have? Realtek gets talked down in this forum a lot, and I have seen a few posts in the last 3-6 months about updating Realtek drivers.
-
@bmeeks Thank you for the link! I am indeed using both traffic shaping and limiters, and I need them, so I don't want to switch them off. I already had some floating rules for DNS (high priority) and I have improved them according to the post you linked, but unfortunately with no result.
Actually, I don't care about the 'No buffer space available' lines in resolver.log as long as they don't cause my main issue. And it is hard to establish a link between the two, because waiting for the next DNS failure with traffic shaping and limiters off would probably take too long (it happens sometimes a few times a day, sometimes once a fortnight).
I have also changed my hardware and I’m waiting for the results… I’ll describe the details a bit later.