Unbound issues on boot
-
I have observed a difference in startup behaviour between 2.2 and 2.4, with it broken for me on 2.4.
On 2.2 on bootup if I watched the console it would take a good 10+ seconds for dns resolver to start but it would start eventually.
On 2.4 it reports that it starts without a pause, but it does not actually start, as shown by failed dns queries and the UI reporting it down. I can successfully start it in the GUI, albeit with the 10-20 second wait for it to start up.
Please let me know what information is needed from me to debug this and whether I should file a bug report.
-
Any errors in the resolver log? Or system log?
Do you have any custom advanced settings in unbound?
-
system log has a few of these which stopped after I manually started it from the GUI
dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
resolver log however seems to give the answer, it could not bind to the link local address, loads of these.
Jan 3 12:55:05 unbound 6359:2 error: can't bind socket: Can't assign requested address for fe80::
So it seems it's trying to start during boot before the link local address is online?
the custom box for unbound only has this added, which I think was auto added by pfblockerNG
server:include: /var/unbound/pfb_dnsbl.conf
on the advanced page the following are ticked.
hide identity
hide version
prefetch support
prefetch dns key support
bit 0x20 support
on standard page
is listening on
lan
wan ipv6 link local (will turn this off as I dont want it on any wan interfaces)
lan ipv6 link local
localhost
outgoing network interfaces is just WAN
dhcp registration and static dhcp are ticked
rest is default.
--edit--
I suspect it will be ok now I will test in a bit.
The bind error was the wan ipv6 link local address which I now have disabled, but of course it will bind to that address when using the default ALL setting.
-
I was going to post in dns section but will post here.
I have found what I consider a bug.
So basically as mentioned I have set unbound in the UI to make outgoing connections on the WAN.
This to me should mean both the ipv4 wan ip, and ipv6 wan ip (if one is allocated to the pfsense box).
Now the ipv6 ip is considered a lan ip by pfsense, and it seems unbound follows this and has instead set the wan link local ipv6 for outgoing queries. Given fe80 is not routable on the wider internet, this is obviously an invalid configuration.
Relevant lines.
# Outgoing interfaces to be used
outgoing-interface: 87.81.222.10
outgoing-interface: fe80::230:<censored>%igb0
The second line should actually be the ipv6 that's listed under LAN. That is the internet routable ipv6 on the router.
I discovered this bug because the "can't assign requested address" errors are still flooding my resolver log.
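To illustrate why the second line is a problem, here is a minimal Python sketch that drops link-local addresses before emitting outgoing-interface lines. It is not pfSense's actual config generator; the function names are hypothetical and the addresses are placeholders from the documentation ranges.

```python
import ipaddress


def split_zone(text):
    """Split 'fe80::1%igb0' into ('fe80::1', 'igb0'); zone is None if absent.

    Hypothetical helper, for illustration only.
    """
    address, _, zone = text.partition("%")
    return address, (zone or None)


def usable_for_outgoing(text):
    """Return True if the address could source queries beyond the local link.

    Link-local addresses (fe80::/10, 169.254.0.0/16) are only valid on their
    own segment, so they are rejected here.
    """
    address, _zone = split_zone(text)
    return not ipaddress.ip_address(address).is_link_local


# Placeholder addresses (assumptions), not the ones from the post above.
candidates = ["192.0.2.10", "fe80::1%igb0", "2001:db8::1"]

for candidate in candidates:
    if usable_for_outgoing(candidate):
        print(f"outgoing-interface: {candidate}")
```

With these placeholders only the IPv4 address and the non-link-local IPv6 address would be written out; the fe80 entry is skipped.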
-
You select addresses by interface, not by purpose. When you pick "WAN" it means the addresses on the WAN interface.
If your routable IPv6 address is on the LAN then you'd have to pick LAN as an outgoing interface, too.
-
except wan is all IPs, so if I deselected wan then the ipv4 would not be used.
Please don't tell me you are trying to claim this is not a problem? :)
First, I don't want outgoing requests on my lan interface, however that's not the real issue.
The real issue is that selecting WAN generates an invalid outgoing-interface line in the unbound config, which floods the resolver log.
I expect the error would not occur if %igb0 was removed, as that's what is actually generating the error.
-
fe80 addresses are link-local and must be scoped, so %igb0 is required. It can't be used to reach outside the segment, so it should probably not be used for that purpose. There might be a legitimate bug there, but just that.
If you chose only WAN and there is no routable IPv6 address on WAN, it can't magically guess you want LAN to be outgoing. You have to tell it exactly what you want it to do and what to use.
-
Ok I will try to explain again.
1 - WAN and WAN ipv6 link-local are separate configurable options in the GUI.
2 - I only have WAN selected for outgoing interface in the GUI, WAN ipv6 link-local is "NOT" selected.
3 - Unbound does not accept a scope as part of an ip address, it is invalid syntax.
So to me a fix would be the following.
Change WAN and LAN options to WAN IPV4 and LAN IPV4
Add WAN IPV6 (this one might not be needed as WAN ip is put on lan interface) and LAN IPV6 option(s).
Do not add the link-local ip from WAN if "WAN ipv6 link-local" is not selected.
If "WAN ipv6 link-local" is selected, do not add the scope part of the ip.
I hope this is understandable for you now. Do you prefer that I raise this as a bug, or can you yourself make the arrangements for the fix?
Also to add: when I deselected the wan ipv6 link-local for the bind options, it was removed correctly from the config file, so the behaviour is inconsistent between the 2 options, which further substantiates that this is a bug.
-
It's normally a good practice to leave things at default when unsure what they are doing - which is "all interfaces" in this case. If you "dont want outgoing requests on lan interface," then you'll have to get a routable IPv6 on your WAN, I'm afraid.
-
It's normally a good practice to leave things at default when unsure what they are doing - which is "all interfaces" in this case. If you "dont want outgoing requests on lan interface," then you'll have to get a routable IPv6 on your WAN, I'm afraid.
That may be the case, but this is a bug and broken behaviour.
If I select ALL interfaces the same problem occurs anyway, as it still incorrectly adds the scope to the syntax in the configuration file.
Unbound developers would also likely frown at what is the default on pfsense. Any sane admin only configures the required interfaces: it's more secure, simpler and less likely to give undesirable behaviour.
-
You'll have to provide far more detail than that. screenshots/XML of the unbound settings, your interface configuration/status/routing/etc.
The only case to be made so far is to prevent IPv6 link-local selected or used automatically by unbound. The other parts don't make sense.
-
The scope is required. And if that doesn't work in unbound any more, it needs to be fixed upstream. The relevant bug for this was https://redmine.pfsense.org/issues/4062 BTW.
-
Ok I will register on the redmine site, and do a detailed bug report there.
I agree the suggested changes to split off IPV4/IPV6 are not important, but the link-local issue is, so I will just concentrate on that.
-
The scope is required. And if that doesn't work in unbound any more, it needs to be fixed upstream. The relevant bug for this was https://redmine.pfsense.org/issues/4062 BTW.
they seem to think it was fixed on there a year ago :)
-
right ok, so if the default ALL is selected then none of the outgoing interface lines get populated, which does stop those errors. This may explain why it hasn't been noticed until now. I will run like this for now but do the bug report tomorrow morning.
Basically with the errored outgoing-interface the error is generated whenever an external dns lookup is performed, so the log does get quite noisy.
I also retested the same option on the network interfaces setting below, and it works correctly on that, it will not add the wan link-local.
Interface IP(s) to bind to
I didn't retest if unbound works on boot up yet, will do that tomorrow morning also.
-
I can confirm unbound is still dead on bootup, I even tested it again with both interface boxes set to the default ALL just to rule that out.
I also know that if it is left alone for long enough unbound eventually comes online by itself, I cannot say an exact time but I would say 30-60 minutes. I found this out when the box rebooted itself earlier from a panic.
The problem also still exists on the latest snapshot with the updated unbound 1.6.0 version.
I suspect it's still wan interface related, as my wan does take time to come up. Also pfblockerNG has dnsbl feeds configured.
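If the cause really is the WAN address not being up yet, one possible workaround is to wait until the address can be bound before starting the resolver. Below is a rough Python sketch of that idea. The function names are hypothetical, and this is not how rc.boot or pfSense actually starts unbound; scoped link-local addresses are not handled.

```python
import socket
import time


def can_bind(address, port=0):
    """Return True if a UDP socket can currently be bound to the address.

    Hypothetical probe: an address that is not yet configured on any
    interface makes bind() fail with an OSError.
    """
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            return False
    return True


def wait_until_bindable(address, timeout=30.0, interval=1.0, probe=can_bind):
    """Poll until the address is bindable or the timeout expires.

    Returns True as soon as the probe succeeds, False if the deadline
    passes first. The probe is injectable so the loop can be tested
    without touching real interfaces.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe(address):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A boot script could call wait_until_bindable() for each configured listen address and only start the resolver once it returns True, or fall back to logging a clear error when it returns False.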
-
Ok I can say why it's not starting but not why this behaviour is occurring.
So as you know on ipv6.
There is a wan ipv6 link-local address on the wan interface.
For some reason, and I haven't been able to find out why, when unbound is started from rc.boot it tries to use the incorrect address. I will not post my full address here, but it's wrong in one octet.
So e.g.
Correct address is fe80::<censored>:d0e5
But it's trying to bind to fe80::<censored>:d0e6
I suspect this may be a bug in the sticky DUID code made by marjohn but I am not sure.
The actual WAN ip which is on the LAN interface does end in d0e6 but not the link local ip.
When I start from the GUI this issue doesn't occur.
I found some more information, will put it in the bug report I am posting now.
-
Why the heck are you censoring fe80?
-
I suspect this may be a bug in the sticky DUID code made by marjohn but I am not sure.
The DUID only deals with dhcp6c, saving and restoring the duid, it's got diddly squat to do with anything else, methinks you need to look elsewhere! :)
-
yeah no worries, I agree with you now, that was an earlier suspicion.
I now just start up unbound using the default ANY setting to get round the issue. It seems the bootup script gets confused somehow by my configuration when setting specific interfaces. I am not comfortable with my unbound resolver listening on my internet facing ip, but it's what I will have to live with for now.