DNS randomly stops working
-
@sashli said in DNS randomly stops working:
dig @127.0.0.1 pfsense.org +short
Let's add some details.
Lots of details.
dig @127.0.0.1 pfsense.org +trace
But the real issue is here :
net.c:536: probing sendmsg() with IP_TOS=b8 failed: Can't assign requested address
'dig' is not part of, or related to, 'unbound'.
dig comes from the FreeBSD package 'bind-tools'.
"net.c" is one of its main source code files, and it cannot "use the IP you gave" ... it cannot use "127.0.0.1", so it cannot contact unbound over 127.0.0.1, port 53, UDP.
When I stop unbound - so no one is listening on 127.0.0.1 port 53 - and execute a
dig @127.0.0.1 pfsense.org +trace
it comes back after several seconds with a logical "connection timed out; no servers could be reached".
You have interface problems ..... and not 'unbound' problems. unbound is just another victim of the real issue.
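For anyone who wants to tell these two cases apart without dig, here is a minimal sketch. It assumes Python 3 is available on the box, and the function names are made up for illustration. It only tries to bind a UDP socket to the address, so it tests whether the address is assigned to an interface at all, not whether unbound answers on it.

```python
import errno
import socket


def describe_error(err_no):
    """Map an errno value to the failure modes discussed in this thread."""
    if err_no == errno.EADDRNOTAVAIL:
        # This is the "Can't assign requested address" case.
        return "address not usable: interface problem, not a resolver problem"
    if err_no == errno.ECONNREFUSED:
        return "address usable, but nothing is listening on that port"
    return "other error: " + errno.errorcode.get(err_no, str(err_no))


def probe_local_address(addr="127.0.0.1"):
    """Try to bind a UDP socket to addr on an ephemeral port (port 0)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((addr, 0))
        return "ok: address is assigned and usable"
    except OSError as exc:
        return describe_error(exc.errno)
    finally:
        sock.close()


if __name__ == "__main__":
    print(probe_local_address())
```

If this reports the address as not usable, restarting unbound is unlikely to help, since the problem sits below the resolver.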
-
@gertjan yes, the problem is related to interfaces...
If I try to ping 127.0.0.1 I get the same error...
Has someone opened a bug for that?
-
@gertjan thanks for your feedback, but it really looks like a problem with the localhost (127.0.0.1) interface. I just reinstalled ntopng and it also has problems starting and running services. The issue points in the same direction: "Could not connect to Redis at 127.0.0.1:6379: Can't assign requested address".
The question is: why is the localhost interface not reachable by the system itself? Any idea?
-
@sashli try to ping other interfaces from pfSense itself...
I think it is jail related or something similar...
-
@juniper just found this ticket in redmine, not sure if this can be an issue created by this gateway problem
-
@sashli I don't know...
I can add one piece of information: my pfSense installation runs in a Proxmox VE virtual machine.
-
@sashli said in DNS randomly stops working:
@juniper just found this ticket in redmine, not sure if this can be an issue created by this gateway problem
That's a special case, using IPs like 169.254.0.0/16.
There is a small patch for this - jimp posted one yesterday.
I've updated my pfSense at home (Hyper-V VM based) : it's just perfect.
I've just updated my work pfSense, the update went just fine. Again perfect.
127.0.0.1 isn't even external-driver related, as it is part of the built-in 'kernel' IP stack facilities.
Use another VM host - if you have a Windows 10 Pro somewhere, you could make one right away - or install pfSense bare metal, and you will see that there are no - there cannot be - localhost issues, as that would break everything.
Just to be sure : your issue still exists after you reset to default and only changed the password (!! did NOT import your settings !!) ?
-
The problem appears if I insert 127.0.0.1 in GENERAL SETUP---DNS Server Settings.
I used to have 127.0.0.1 there, since pfSense is the DNS resolver.
If I configure 127.0.0.1 in GENERAL SETUP---DNS Server Settings as the default pfSense DNS server, there is the problem with interface lo0.
-
@juniper Then you create a loopback??
If nothing is in General setup then it uses localhost.
-
@cool_corona no i use 127.0.0.1 as dns resolver for pfsense.
-
@juniper Leave it blank
and reboot
-
@cool_corona yes it works!
The problem appears if I use 127.0.0.1 in general setup.
If I set an external DNS and set DNS resolution behavior as you suggest, everything works fine.
Thank you.
-
@juniper exactly, it's a problem when 127.0.0.1 is listed in the DNS server list under General Setup.
-
Why do you want to add 127.0.0.1 here :
?
As you can see, I have nothing.
Because that's default : nothing.
Still, the magic is happening :
I hope (didn't test) that pfSense is intelligent enough that, when 127.0.0.1 is added here :
It will ignore that 'request' as 127.0.0.1 is already there.
Here it is :
/etc/resolv.conf .......
nameserver 127.0.0.1
search your-domain.tld
Ok, I broke my own rules and added some DNS settings myself.
but /etc/resolv.conf didn't change.
I'm missing something ....
But pfSense (unbound) works.
edit : this is a no go :
[2.5.1-RELEASE][admin@pfsense.my-domain.tld]/root: dig @127.0.0.1 google.com
net.c:536: probing sendmsg() with IP_TOS=b8 failed: Can't assign requested address
net.c:536: probing sendmsg() with IPV6_TCLASS=b8 failed: No route to host
Ok. Great.
Who calls Houston ?
Solution : remove ::1 and 127.0.0.1 from the General settings, as they were
- Useless (before)
- Breaking things (today).
So, please : don't do that ;)
edit : even after I removed ::1 and 127.0.0.1, I could not use "127.0.0.1" any more.
It was :
net.c:536: probing sendmsg() with IP_TOS=b8 failed: Can't assign requested address
all the time now.
So, guys,
I don't quite understand what I'm seeing, but I see what you see.
Unbound wasn't listening on 127.0.0.1 any more - I restarted unbound : didn't help. I had to restart "127.0.0.1" - if possible.
I had to reboot pfSense - as this is a way to 'restart' the kernel.
-
@gertjan Personally had 127.0.0.1 from before that "DNS Resolution Behavior" section existed, a couple of years at least.
If memory serves right
Guess that's because we wanted to ensure the system used its own resolver, and that only.
Was stunned when all DNS resolution stopped after upgrading to 2.5.1. Not even pkg worked :)
But some dig(ging) led to the entry removal, then restoration of DNS service.
No bug listed as of now that I can find, but @jimp could we have this case covered in the pfsense-upgrade script that gets updated when pressing "13" in the CLI? It would eliminate the 127.0.0.1 entry, as a workaround for now, in place of debugging the lovely
net.c:536: probing sendmsg() with IP_TOS=b8 failed: Can't assign requested address
bork message.
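To make the idea concrete, the check such a script would need is small. This is only a sketch under assumptions: it takes the configured DNS servers as a plain list of strings (how the upgrade script would actually read them from the pfSense configuration is not shown here), and `find_loopback_entries` is a made-up name.

```python
import ipaddress


def find_loopback_entries(servers):
    """Return the entries that point at the firewall itself (127.0.0.0/8 or ::1)."""
    loopback = []
    for entry in servers:
        try:
            address = ipaddress.ip_address(entry.strip())
        except ValueError:
            # Not a literal IP address (for example a hostname): leave it alone.
            continue
        if address.is_loopback:
            loopback.append(entry)
    return loopback


if __name__ == "__main__":
    configured = ["127.0.0.1", "::1", "8.8.8.8"]
    print(find_loopback_entries(configured))
```

Anything this returns would be a candidate for removal from the General Setup DNS server list.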
-
I am also having this problem after upgrading directly from 2.4.5 CE to 2.5.1 on a Sophos SG-210.
In my case, enabling unbound on the Service Watchdog list restarts the service, but then the CPU is pegged at 100% and resolution still doesn't happen. Restarting the firewall works. I have not yet checked the PID or socket status during an outage, but I suspect unbound crashes, thinks it's still running, but can't clean itself up.
One thing I noticed on my system is that pkg info unbound says Python is enabled, even though it is disabled in the configuration. I manually restarted after toggling Python on and off. Is this even relevant?
[2.5.1-RELEASE][admin@myfirewallnotyours]/root: pkg info unbound
unbound-1.13.1
Name           : unbound
Version        : 1.13.1
Installed on   : Thu Apr 15 03:10:26 2021 CDT
Origin         : dns/unbound
Architecture   : FreeBSD:12:amd64
Prefix         : /usr/local
Categories     : dns
Licenses       : BSD3CLAUSE
Maintainer     : jaap@NLnetLabs.nl
WWW            : https://www.nlnetlabs.nl/projects/unbound
Comment        : Validating, recursive, and caching DNS resolver
Options        :
    DNSCRYPT       : off
    DNSTAP         : off
    DOCS           : off
    DOH            : on
    ECDSA          : on
    EVAPI          : off
    FILTER_AAAA    : off
    GOST           : on
    HIREDIS        : off
    LIBEVENT       : on
    MUNIN_PLUGIN   : off
    PYTHON         : on
    SUBNET         : off
    TFOCL          : off
    TFOSE          : off
    THREADS        : on
Shared Libs required:
    libexpat.so.1
    libnghttp2.so.14
    libpython3.7m.so.1.0
    libevent-2.1.so.7
Shared Libs provided:
    libunbound.so.8
Annotations    :
    FreeBSD_version: 1202504
    cpe            : cpe:2.3:a:nlnetlabs:unbound:1.13.1:::::freebsd12:x64
    repo_type      : binary
    repository     : pfSense
Flat size      : 7.79MiB
Tried all recommendations on this post but nothing is working so far.
-
Eagerly following any threads about DNS. My watchdog is restarting Unbound all the time.
2.5.1-RELEASE (amd64)
pfBlockerNG-devel: 3.0.0_16
snort: 4.1.3_5
Telegraf: 0.9_5
Just wanted to share what I see, on the off chance it helps the group. I did notice the "already in use" error in my system logs when the watchdog is trying to start it back up:
Apr 23 10:15:04 pfsense php[92161]: servicewatchdog_cron.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1619136904] unbound[68018:0] debug: creating udp4 socket 192.168.1.1 53 [1619136904] unbound[68018:0] error: bind: address already in use [1619136904] unbound[68018:0] fatal error: could not open ports'
Apr 23 10:15:04 pfsense php[73303]: notify_monitor.php: Message sent to XXXXX@hotmail.com OK
Apr 23 10:15:01 pfsense php[92161]: servicewatchdog_cron.php: Service Watchdog detected service unbound stopped. Restarting unbound (DNS Resolver)
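If the system log has many of these, a small filter can pull out which socket the restart attempt collided on. This is a rough sketch, assuming the messages look like the ones above ("creating ... socket ADDRESS PORT" followed by "bind: address already in use"); the function name is made up, and other unbound versions may word things differently.

```python
import re

# Matches either a socket-creation message or the bind failure that follows it.
EVENT_RE = re.compile(
    r"creating (\w+) socket (\S+) (\d+)|(bind: address already in use)"
)


def bind_conflicts(log_text):
    """Return (protocol, address, port) for each socket that failed to bind."""
    conflicts = []
    last_socket = None
    for match in EVENT_RE.finditer(log_text):
        if match.group(4):
            # A bind failure refers to the most recently created socket.
            if last_socket is not None:
                conflicts.append(last_socket)
                last_socket = None
        else:
            last_socket = (match.group(1), match.group(2), int(match.group(3)))
    return conflicts


if __name__ == "__main__":
    sample = (
        "debug: creating udp4 socket 192.168.1.1 53 "
        "[1619136904] unbound[68018:0] error: bind: address already in use"
    )
    print(bind_conflicts(sample))
```

A hit here means something was still holding that address and port when the watchdog started a second unbound, which would fit the log saying the service had "stopped" while the port was not yet free.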
-
Hi, so this is a documented upstream bug.
https://redmine.pfsense.org/issues/11316
I just found out about it because I submitted a trouble ticket.
Unfortunately, until this regression is fixed, the solution is either
1) Turn off "Register DHCP leases in DNS"
2) Downgrade to 2.4.5
3) Downgrade the package
4) Use the DNS forwarder
Unfortunately 1) and 4) don't help if you need to Register DHCP in DNS in your organization.
So here's hoping the developers on unbound have an easy fix.
-
Service watchdog and unbound don't play well together.
Especially if pfblockerng is also used (since it does take time to come up)
In various situations, it ends up in unbound restart loops.
By enabling unbound python mode, and disabling dhcp integration, unbound is stable.
However, if wan ip changes due to pppoe restarting, unbound will die.
Always.
And since service watchdog is a no go for unbound, it has to be restarted manually
Yikes!
At the time of ppp restart I get this
Apr 19 11:18:24 unbound 19913 [19913:0] info: service stopped (unbound 1.13.1).
2.5.1, pfblockerng 3.0.16
-
I'm using unbound & service watchdog, and have no issues.
Not using pfblocker though.