Unbound not responding on all chosen interfaces after reboot
-
@gertjan said in Unbound not responding on all chosen interfaces after reboot:
Never leave such an interrogation open. Get it out of the way by knowing for sure what's happening.
Ask the system what NICs and ports unbound uses.
Console :
sockstat | grep 'unbound'
Here is what I see with Unbound configured to bind to Localhost and LAN after a full system restart:
sockstat | grep 'unbound'
unbound unbound 93889 3 udp4 127.0.0.1:53 *:*
unbound unbound 93889 4 tcp4 127.0.0.1:53 *:*
unbound unbound 93889 5 udp4 127.0.0.1:853 *:*
unbound unbound 93889 6 tcp4 127.0.0.1:853 *:*
unbound unbound 93889 7 udp6 ::1:53 *:*
unbound unbound 93889 8 tcp6 ::1:53 *:*
unbound unbound 93889 9 udp6 ::1:853 *:*
unbound unbound 93889 10 tcp6 ::1:853 *:*
unbound unbound 93889 11 tcp4 127.0.0.1:953 *:*
unbound unbound 93889 12 stream -> ??
unbound unbound 93889 13 stream -> ??
unbound unbound 93889 14 stream -> ??
unbound unbound 93889 15 stream -> ??
unbound unbound 93889 16 stream -> ??
unbound unbound 93889 17 stream -> ??
unbound unbound 93889 18 stream -> ??
unbound unbound 93889 19 stream -> ??
So it's as I said - unbound is not binding correctly to the LAN interface automatically on startup. Here it is after I manually restart the service:
sockstat | grep 'unbound'
unbound unbound 12666 3 udp4 192.168.0.1:53 *:*
unbound unbound 12666 4 tcp4 192.168.0.1:53 *:*
unbound unbound 12666 5 udp4 192.168.0.1:853 *:*
unbound unbound 12666 6 tcp4 192.168.0.1:853 *:*
unbound unbound 12666 7 udp4 127.0.0.1:53 *:*
unbound unbound 12666 8 tcp4 127.0.0.1:53 *:*
unbound unbound 12666 9 stream /var/run/php-fpm.socket
unbound unbound 12666 10 stream /var/run/php-fpm.socket
unbound unbound 12666 11 udp4 127.0.0.1:853 *:*
unbound unbound 12666 12 tcp4 127.0.0.1:853 *:*
unbound unbound 12666 13 udp6 ::1:53 *:*
unbound unbound 12666 14 tcp6 ::1:53 *:*
unbound unbound 12666 15 udp6 ::1:853 *:*
unbound unbound 12666 16 tcp6 ::1:853 *:*
unbound unbound 12666 17 tcp4 127.0.0.1:953 *:*
unbound unbound 12666 18 dgram -> /var/run/logpriv
unbound unbound 12666 19 stream -> ??
unbound unbound 12666 20 stream -> ??
unbound unbound 12666 21 stream -> ??
unbound unbound 12666 22 stream -> ??
unbound unbound 12666 23 stream -> ??
unbound unbound 12666 24 stream -> ??
unbound unbound 12666 25 stream -> ??
unbound unbound 12666 26 stream -> ??
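For anyone who wants to repeat this check after every reboot without comparing the listings by eye, it could be scripted. What follows is only a minimal Python sketch, not anything shipped with pfSense: it assumes the sockstat column order seen in this post (USER COMMAND PID FD PROTO LOCAL FOREIGN), and the function names and the shortened sample listings are made up for illustration.

```python
def bound_addresses(sockstat_text, command="unbound"):
    """Return the set of (proto, local_address) pairs that `command` holds.

    Assumes the column order USER COMMAND PID FD PROTO LOCAL FOREIGN.
    Unix-domain sockets (stream/dgram rows) are skipped.
    """
    found = set()
    for line in sockstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[1] != command:
            continue
        proto, local = fields[4], fields[5]
        if proto.startswith(("udp", "tcp")):
            found.add((proto, local))
    return found


def listens_on(sockstat_text, ip, port=53):
    """True if the command has both a UDP and a TCP socket on ip:port."""
    target = f"{ip}:{port}"
    protos = {proto[:3]
              for proto, local in bound_addresses(sockstat_text)
              if local == target}
    return protos == {"udp", "tcp"}


# Shortened copies of the two listings, used only as sample input.
AFTER_REBOOT = """\
unbound unbound 93889 3 udp4 127.0.0.1:53 *:*
unbound unbound 93889 4 tcp4 127.0.0.1:53 *:*
unbound unbound 93889 12 stream -> ??
"""

AFTER_RESTART = """\
unbound unbound 12666 3 udp4 192.168.0.1:53 *:*
unbound unbound 12666 4 tcp4 192.168.0.1:53 *:*
unbound unbound 12666 7 udp4 127.0.0.1:53 *:*
unbound unbound 12666 8 tcp4 127.0.0.1:53 *:*
"""

if __name__ == "__main__":
    print("LAN bound after reboot: ", listens_on(AFTER_REBOOT, "192.168.0.1"))
    print("LAN bound after restart:", listens_on(AFTER_RESTART, "192.168.0.1"))
```

On a live system the text would come from running sockstat and capturing its output; that part is left out here so the sketch stays self-contained.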
@gertjan said in Unbound not responding on all chosen interfaces after reboot:
Btw :
I've selected my unbound "Network interfaces" : unbound was still working on all local interfaces.
Yes, and it worked fine for me on other hardware as well, as I noted.
But that doesn't mean there isn't a bug in the startup sequence just because it works on some devices - clearly it doesn't work correctly on this particular hardware, and I can smell a startup script race condition from a mile away having dealt with many similar issues in the past on other systems.
Is there any in-depth documentation on the startup process for pfSense, in particular the non-standard service manager that it uses to start and stop services and to restart them when interfaces change etc ?
If there is some documentation somewhere I can have a poke around myself to see if I can work out what's going on as simply binding ALL should be considered a workaround only.
-
@dbmandrake said in Unbound not responding on all chosen interfaces after reboot:
unbound unbound 93889 3 udp4 127.0.0.1:53 *:*
unbound unbound 93889 4 tcp4 127.0.0.1:53 *:*
unbound unbound 93889 5 udp4 127.0.0.1:853 *:*
unbound unbound 93889 6 tcp4 127.0.0.1:853 *:*
unbound unbound 93889 7 udp6 ::1:53 *:*
unbound unbound 93889 8 tcp6 ::1:53 *:*
unbound unbound 93889 9 udp6 ::1:853 *:*
unbound unbound 93889 10 tcp6 ::1:853 *:*
unbound unbound 93889 11 tcp4 127.0.0.1:953 *:*
If you found that after a reboot, and you did include 'LAN' in the Network Interfaces list, then ok, that's pretty bad, not good at all.
When this happens, can you share the unbound config used ?
That's not what you see in the GUI.
It's here : /var/unbound/unbound.conf
During unbound start, what was shown in its log file ?
@dbmandrake said in Unbound not responding on all chosen interfaces after reboot:
If there is some documentation somewhere
All my opinion, of course, but no one maintains documentation of these things.
The best answer you have is : it's all (like all !) open source.
( open source is not equivalent to open documentation ^^ )
Depending what you find in the unbound.conf file, unbound's behavior could be explained.
The config file instructs what unbound should do.
During startup, this file is recreated from 'scratch' == from the general pfSense config file.
If other processes were already using port 53, then it's normal that unbound fails to listen on it. This has been seen before : people also had an instance of bind running ....
-
@gertjan said in Unbound not responding on all chosen interfaces after reboot:
If you found that after a reboot, and you did include 'LAN' in the Network Interfaces list, then ok, that's pretty bad, not good at all.
When this happens, can you share the unbound config used ?
That's not what you see in the GUI.
It's here : /var/unbound/unbound.conf
##########################
# Unbound Configuration
##########################
##
# Server configuration
##
server:
chroot: /var/unbound
username: "unbound"
directory: "/var/unbound"
pidfile: "/var/run/unbound.pid"
use-syslog: yes
port: 53
verbosity: 1
hide-identity: yes
hide-version: yes
harden-glue: yes
do-ip4: yes
do-ip6: yes
do-udp: yes
do-tcp: yes
do-daemonize: yes
module-config: "validator iterator"
unwanted-reply-threshold: 0
num-queries-per-thread: 4096
jostle-timeout: 200
infra-host-ttl: 900
infra-cache-numhosts: 10000
outgoing-num-tcp: 10
incoming-num-tcp: 10
edns-buffer-size: 1432
cache-max-ttl: 86400
cache-min-ttl: 0
harden-dnssec-stripped: yes
msg-cache-size: 4m
rrset-cache-size: 8m
num-threads: 4
msg-cache-slabs: 4
rrset-cache-slabs: 4
infra-cache-slabs: 4
key-cache-slabs: 4
outgoing-range: 4096
#so-rcvbuf: 4m
auto-trust-anchor-file: /var/unbound/root.key
prefetch: no
prefetch-key: no
use-caps-for-id: no
serve-expired: no
aggressive-nsec: no
# Statistics
# Unbound Statistics
statistics-interval: 0
extended-statistics: yes
statistics-cumulative: yes
# TLS Configuration
tls-cert-bundle: "/etc/ssl/cert.pem"
tls-port: 853
tls-service-pem: "/var/unbound/sslcert.crt"
tls-service-key: "/var/unbound/sslcert.key"
# Interface IP(s) to bind to
interface: 127.0.0.1
interface: 127.0.0.1@853
interface: ::1
interface: ::1@853
# DNS Rebinding
# For DNS Rebinding prevention
private-address: 127.0.0.0/8
private-address: 10.0.0.0/8
private-address: ::ffff:a00:0/104
private-address: 172.16.0.0/12
private-address: ::ffff:ac10:0/108
private-address: 169.254.0.0/16
private-address: ::ffff:a9fe:0/112
private-address: 192.168.0.0/16
private-address: ::ffff:c0a8:0/112
private-address: fd00::/8
private-address: fe80::/10
# Access lists
include: /var/unbound/access_lists.conf
# Static host entries
include: /var/unbound/host_entries.conf
# dhcp lease entries
include: /var/unbound/dhcpleases_entries.conf
# Domain overrides
include: /var/unbound/domainoverrides.conf
# Forwarding
forward-zone:
name: "."
forward-addr: 1.1.1.1
forward-addr: 8.8.8.8
###
# Remote Control Config
###
include: /var/unbound/remotecontrol.conf
I note that "Interface IP(s) to bind to" does not include 192.168.0.1, however after I restart the service, it does include this.
I assume that the startup script looks at the interfaces selected in the GUI config, generates a config file on the fly but leaves out this interface because it is not up yet at the time the script runs.
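The comparison made here (which `interface:` lines the generated file contains versus which addresses were expected) is easy to automate. This is a rough Python sketch under my own assumptions: the function names are invented, the sample text is just the relevant fragment of the config posted in this thread, and on a real box the text would be read from /var/unbound/unbound.conf.

```python
def configured_interfaces(conf_text):
    """Return the addresses named on `interface:` lines, in file order.

    Comment lines are ignored, an `@port` suffix is stripped, and
    duplicates (e.g. the same address with and without @853) are dropped.
    """
    addresses = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or not line.startswith("interface:"):
            continue
        value = line.split(":", 1)[1].strip()
        address = value.split("@", 1)[0]
        if address not in addresses:
            addresses.append(address)
    return addresses


def missing_interfaces(conf_text, expected):
    """Return the expected addresses that the config does not bind to."""
    present = set(configured_interfaces(conf_text))
    return [ip for ip in expected if ip not in present]


# Fragment of the config generated at boot, used as sample input.
BOOT_CONF = """\
# Interface IP(s) to bind to
interface: 127.0.0.1
interface: 127.0.0.1@853
interface: ::1
interface: ::1@853
"""

if __name__ == "__main__":
    print(configured_interfaces(BOOT_CONF))
    print(missing_interfaces(BOOT_CONF, ["127.0.0.1", "192.168.0.1"]))
```

Running it once right after boot and once after a manual service restart would show whether the LAN address only appears in the second case.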
In the system.log I can see that unbound starts after igb0 (WAN) initialises but before igb1 (LAN) initialises:
Nov 29 09:18:48 pfSense-Home check_reload_status[409]: Linkup starting igb0
Nov 29 09:18:48 pfSense-Home kernel:
Nov 29 09:18:48 pfSense-Home kernel: igb0: link state changed to UP
Nov 29 09:18:48 pfSense-Home check_reload_status[409]: rc.newwanip starting igb0
Nov 29 09:18:48 pfSense-Home php[431]: rc.bootup: Resyncing OpenVPN instances.
Nov 29 09:18:48 pfSense-Home kernel: done.
Nov 29 09:18:48 pfSense-Home php[431]: rc.bootup: [squid] Installed but disabled. Not installing 'nat' rules.
Nov 29 09:18:48 pfSense-Home kernel: pflog0: promiscuous mode enabled
Nov 29 09:18:48 pfSense-Home php[431]: rc.bootup: [squid] Installed but disabled. Not installing 'pfearly' rules.
Nov 29 09:18:48 pfSense-Home kernel: .
Nov 29 09:18:48 pfSense-Home php[431]: rc.bootup: [squid] Installed but disabled. Not installing 'filter' rules.
Nov 29 09:18:48 pfSense-Home kernel: ..
Nov 29 09:18:48 pfSense-Home kernel: .done.
Nov 29 09:18:49 pfSense-Home php[431]: rc.bootup: Default gateway setting Interface WAN_DHCP Gateway as default.
Nov 29 09:18:49 pfSense-Home php[431]: rc.bootup: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 29 09:18:49 pfSense-Home kernel: done.
Nov 29 09:18:49 pfSense-Home php-fpm[371]: /rc.newwanip: rc.newwanip: Info: starting on igb0.
Nov 29 09:18:49 pfSense-Home php-fpm[371]: /rc.newwanip: rc.newwanip: on (IP address: 77.101.155.235) (interface: WAN[wan]) (real interface: igb0).
Nov 29 09:18:50 pfSense-Home php[431]: rc.bootup: sync unbound done.
Nov 29 09:18:50 pfSense-Home kernel: done.
Nov 29 09:18:50 pfSense-Home kernel: done.
Nov 29 09:18:53 pfSense-Home check_reload_status[409]: Linkup starting igb1
Nov 29 09:18:53 pfSense-Home kernel:
Nov 29 09:18:53 pfSense-Home kernel: igb1: link state changed to UP
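The ordering argument (unbound is synced after igb0 comes up but before igb1 does) can be checked mechanically against a saved system.log. Below is a small Python sketch that does this; it is an illustration only. The function names are mine, the two search strings are copied from the log lines in this post, and since syslog timestamps carry no year an arbitrary leap year is prepended purely so the times can be compared within one boot. Month names are parsed with `%b`, which assumes an English/C locale.

```python
from datetime import datetime


def event_time(log_text, needle):
    """Return the timestamp of the first log line containing `needle`, or None."""
    for line in log_text.splitlines():
        if needle in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Nov 29 09:18:50"
            # Year 2000 is a placeholder; syslog lines do not record a year.
            return datetime.strptime(f"2000 {stamp}", "%Y %b %d %H:%M:%S")
    return None


def unbound_synced_before_link(log_text, nic):
    """True if 'sync unbound done' was logged before `nic` reported link UP.

    Returns None when either event is absent from the log text.
    """
    sync = event_time(log_text, "sync unbound done")
    link = event_time(log_text, f"{nic}: link state changed to UP")
    if sync is None or link is None:
        return None
    return sync < link


# Three lines taken from the log in this post, used as sample input.
SAMPLE_LOG = """\
Nov 29 09:18:48 pfSense-Home kernel: igb0: link state changed to UP
Nov 29 09:18:50 pfSense-Home php[431]: rc.bootup: sync unbound done.
Nov 29 09:18:53 pfSense-Home kernel: igb1: link state changed to UP
"""

if __name__ == "__main__":
    for nic in ("igb0", "igb1"):
        print(nic, unbound_synced_before_link(SAMPLE_LOG, nic))
```

A True result for the LAN NIC is consistent with the race described here, although it does not by itself prove that the missing `interface:` line is caused by it.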
Having a look at rc.linkup, on the following lines I see there is a workaround for unbound that restarts it when an interface goes DOWN, but not one for when it goes UP:
https://github.com/pfsense/pfsense/blob/master/src/etc/rc.linkup#L81-L120
This refers to the following redmine bug ticket:
https://redmine.pfsense.org/issues/13254
which also refers to this one:
https://redmine.pfsense.org/issues/12613
So it looks like a known issue, and for some reason it was only fixed for when the interface goes down but not for when it goes up, as there is no code to restart unbound when an interface it's configured to use goes up...
I'll do a bit more research through the redmine tickets and perhaps post there as I have a pretty good idea of what the issue is now.
-
Ok I've just realised that the code I'm looking at on GitHub is new and doesn't represent what is running in 2.6.0. When I look at /etc/rc.linkup on my running 2.6.0 system it does have code to restart unbound on linkup, although it doesn't seem to be working for me on startup:
        switch ($argument2) {
            case "stop":
            case "down":
                log_error("DEVD Ethernet detached event for {$iface}");
                interface_bring_down($iface);
                break;
            case "start":
            case "up":
                log_error("DEVD Ethernet attached event for {$iface}");
                log_error("HOTPLUG: Configuring interface {$iface}");
                // Do not try to readd to bridge otherwise em(4) has problems
                interface_configure($iface, true, true);
                /* Make sure gw monitor is configured */
                if ($ip6addr == 'slaac' || $ip6addr == 'dhcp6') {
                    setup_gateways_monitor();
                }
                /* restart unbound on interface recover,
                 * https://redmine.pfsense.org/issues/11547 */
                if (isset($config['unbound']['enable']) &&
                    (in_array($iface, explode(',', $config['unbound']['active_interface'])) ||
                    in_array($iface, explode(',', $config['unbound']['outgoing_interface'])))) {
                    services_unbound_configure();
                }
                break;
        }
    }
}
It looks like this whole code area is being significantly re-written for 2.7.0.
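To make the condition in that 2.6.0 excerpt easier to reason about, here it is restated as a standalone Python sketch. This is only my reading of the PHP shown in this post, not pfSense code: the function name is invented, the dictionary stands in for `$config`, and the sample values ("lan", "lo0", "wan", "opt1") are assumptions about how the interface lists are stored.

```python
def should_reconfigure_unbound(config, iface):
    """Mirror of the rc.linkup check: unbound is enabled and `iface` appears
    in either the listening or the outgoing interface list.

    `config` is a plain dict standing in for pfSense's $config array.
    """
    unbound = config.get("unbound", {})
    if "enable" not in unbound:
        return False
    active = unbound.get("active_interface", "").split(",")
    outgoing = unbound.get("outgoing_interface", "").split(",")
    return iface in active or iface in outgoing


# Hypothetical settings: listen on LAN and localhost, send queries out via WAN.
SAMPLE_CONFIG = {
    "unbound": {
        "enable": "",
        "active_interface": "lan,lo0",
        "outgoing_interface": "wan",
    }
}

if __name__ == "__main__":
    for name in ("lan", "wan", "opt1"):
        print(name, should_reconfigure_unbound(SAMPLE_CONFIG, name))
```

Note what the check cannot do: it only runs when a link "start" or "up" event is actually delivered for that interface, so if that event is missed or handled before the bootup sync finishes, the condition never gets a chance to trigger a reconfigure.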
-
About the log lines :
Nov 29 09:18:48 pfSense-Home check_reload_status[409]: Linkup starting igb0
and more
Why would your igb1 (LAN) have to handle a "link state changed to DOWN" and "link state changed to UP" this late in the startup of the system ?
LAN is, normally, a statically configured interface and gets initialized very early in the boot process.
Typically, it goes to a switch that should be 'always on'.
If unbound was told (according to the config file) to bind to 192.168.1.1 and it can't during startup, it will complain in the log file.
You didn't mention that, so, according to unbound, all is well : the config told it to bind to 127.0.0.1 and that's it.
When your system boots, an unbound.conf is created that says : listen only to local host IPv4 = 127.0.0.1. So it's the pfSense low level GUI code that fails to build a correct config file.
Or the info stored is wrong, see the config.xml file ( /cf/conf/config.xml ).
This is a not-good situation, as you've set the info differently in the GUI, on the resolver settings page.
I've read https://redmine.pfsense.org/issues/13254
And https://redmine.pfsense.org/issues/12613 which is related.
There is a solution.
Put a switch between your pfSense LAN port and your other devices.
This will stop the flapping on the LAN interface.
Also : select All, as you know by now that you can trust your firewall and no one will be able to contact your DNS resolver "from the outside", via the WAN interface(s).
So select All and call it a day ;)
Your unbound.conf interfaces list :
# Interface IP(s) to bind to
interface: 127.0.0.1
interface: 127.0.0.1@853
interface: ::1
interface: ::1@853
Let's forget about 853 (the tls-port in your config, used for DNS over TLS) :
# Interface IP(s) to bind to
interface: 127.0.0.1
interface: ::1
so unbound listens only to 127.0.0.1 IPv4 and ::1 IPv6.
Or, it should be something like (I removed the 853 stuff) (my list) :
# Interface IP(s) to bind to
interface: 192.168.1.1
interface: 2001:470:beef:5c0:2::1
interface: 192.168.100.1
interface: 192.168.100.1@853
interface: 2001:470:dead:100::1
interface: 192.168.2.1
interface: 192.168.3.1
interface: 2001:470:dead:3::1
interface: fe80::92ec:77ff:fe29:392c%igc0
interface: fe80::92ec:77ff:fe29:392e%igc2
interface: fe80::92ec:77ff:fe29:392d%igc1
interface: fe80::92ec:77ff:fe29:392c%ovpns1
interface: 127.0.0.1
interface: ::1
When I rebooted, I saw the same interfaces list, which is ok for me given the interfaces I have selected.
I am using IPv6 and IPv4 ;)
Btw : I'm using a Netgate device, the 4100, so I had to leave 2.6.0, I'm using 22.05 these days.
@dbmandrake said in Unbound not responding on all chosen interfaces after reboot:
It looks like this whole code area is being significantly re-written for 2.7.0.
Let's reserve that for future fun ^^
-
Hi,
I seem to have a similar problem.
In my case unbound is set to use Outgoing Interfaces that are VPNs.
I guess unbound ignores them as they haven't come up yet during its start.
I need to later restart the unbound service.
I can't even find proof of that.
But I've read about this behavior a couple of times.
So I guess it is definitely not related with one user with specific setup.
Thanks.
-
With unbound set to listen to "All" interfaces, it will (I should say : should) listen on all available interfaces when the system boots and unbound starts.
Also : when using an OpenVPN client, see here https://www.youtube.com/watch?v=ulRgecz0UsQ at 9 minutes and 28 seconds : an 'OpenVPN' interface has to be created so the system and unbound have another interface.
In theory, as I didn't test this (I'm not using any OpenVPN client), when the OpenVPN clients start, an 'interface' event will happen : unbound gets restarted and now it will find the OpenVPN interface and 'bind' to (use) it. Did you see such an event ?
-
@gertjan said in Unbound not responding on all chosen interfaces after reboot:
unbound gets restarted and now it will find the OpenVPN interface and 'bind' to (use) it. Did you see such an event ?
Thanks for your time.
I don't use "All" interfaces so that all DNS queries are routed to the VPNs (I have at the moment a total of 3 to test redundancy) for privacy reasons. It is said that this is the only way to prevent DNS leaks to the actual ISP at WAN.
Unbound doesn't seem to restart at all after initial boot for any reason. Watchdog is not helpful in this case as the service really is running.
I just increased log level from 3 to 5 but I don't think it will show me any new information for what I am looking for.
Will shutdown and boot again to have a new look at it.
Thanks again.
-
@gertjan
I tried removing the physical WAN cable, and with that unbound apparently sorts itself out.
I also tried removing all the VPNs except one.
Unbound starts 2 seconds before any of the interfaces come up and does nothing after that.
Thank you once more.
-
@robotox said in Unbound not responding on all chosen interfaces after reboot:
@gertjan said in Unbound not responding on all chosen interfaces after reboot:
unbound gets restarted and now it will find the OpenVPN interface and 'bind' to (use) it. Did you see such an event ?
Thanks for your time.
I don't use "All" interfaces so that all DNS queries are routed to the VPNs (I have at the moment a total of 3 to test redundancy) for privacy reasons. It is said that this is the only way to prevent DNS leaks to the actual ISP at WAN.
Why not just enable forwarder mode in the unbound configuration and tell it to forward to a DNS server on the other end of your VPN ? (Presumably head office) That way it will not try to query root servers.
Also you could use egress rules in floating rules to block outgoing queries on the WAN interface so that even if unbound tried to send DNS queries outside your VPNs they would be blocked.
Because unbound runs on the firewall itself only egress rules can block its query traffic.
-
@dbmandrake
Hi, unbound is in forwarding mode with selected Outgoing Interfaces -- not "All".
Tried with or without declaring the VPN provider's DNSs.
Floating rules are set up to prevent leaks to WAN.
As far as I understood, those would apply to the client devices only -- the firewall's setup for Unbound is managed separately.
As far as I understood, to achieve the desired solution I need to unselect "All" and select only those desired interfaces.
Thanks.
-
@dbmandrake
This is potentially wildly off topic; but are we attempting to spoof MACs?
I could potentially see it initially come up with real MAC; then swap to fake MAC and bork things.
-
Added my plea here https://redmine.pfsense.org/issues/13707.
-
I have that problem too. I'm using pfSense+ 23.01 but it happened on 2.6 too. My setup is a virtual pfSense running on qemu (Linux) with an Intel dual-port 1 Gig adapter explicitly passed through to pfSense, so it works as a bare-metal NIC.
Each and every reboot, whether of the pfSense guest or of the machine, renders Unbound unusable : it starts but it doesn't resolve a thing. Only a restart of the service helps, and it helps immediately.
My setup is similar to those described in the other places : listen on LAN and localhost, outgoing interface WAN. I tried various workarounds for this, e.g. setting the gateway monitoring address to something a few hops further away than the default ISP gateway, like 1.1.1.1, but to no avail... the issue persists.
-
Hi,
I now have an SG-2100 with 23.05.1 for the same setup and still the same problem.
Unbound fails to start as I have OpenVPNs as Outgoing Network Interfaces.
Still trying to get attention at https://redmine.pfsense.org/issues/13707.
-
While it's technically only a workaround and I would hope this will get fixed one day, the pragmatic solution is to just set "Network Interfaces" and "Outgoing Network Interfaces" to All, and simply use firewall rules to block / allow access to the DNS server from client devices.
That way, no matter how interfaces go up or down during boot or later on (including VPNs going up and down after the system has booted) unbound will always bind to all interfaces, but access will be dictated by the firewall rules for each interface.
This is how I have been running ever since reporting the issue, and in some ways blocking using firewall rules is a more explicit and secure way to prevent access to the DNS server for clients who should not have access than relying on unbound to bind to the correct interfaces in its own configuration.
Given the default firewall rules for an interface are to block, this is fairly easy because unless you add an allow rule access to the DNS server is blocked by default including attempts to query from the WAN side.
-
Consider also using Unbound ACL rules.
-
@Gertjan Much less secure, because unbound still receives and processes the packets and then decides whether they should be ignored or responded to based on its own configuration file.
If there was ever a problem like a buffer overflow found in unbound it would be vulnerable to attack from clients that are "blocked" by the ACL list but allowed by firewall rules.
Firewall rules on the other hand are absolute, and do not allow any packets to reach unbound for processing and would prevent such exploitation. So if you're going to bind to all interfaces (as in this workaround) why not just set access to unbound using firewall rules ? I would not rely on unbound's own ACLs except to allow remote subnets which are normally denied by default. I would not rely on them as a means of blocking.
-
Now that's what I call 'considering'
-
Thank you for bringing the thread back to life!
But in my case, the problem being with Outgoing Interfaces, rules won't apply to the firewall.