Unbound/DNS resolver with IPv6 unreliable finally solved
-
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
As most of you know the incoming queries are either resolved from cache or forwarded to an external DNS server
Exactly.
Three steps:
- If the DNS request reaches unbound, it can do something with it.
- If the DNS request has a match in the cache (TTL valid, etc.), the answer is sent back right away.
- If unknown, resolving starts, which implies using an available 'upstream' gateway.
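To make the cache-or-resolve flow above concrete, here is a toy model in Python. It is only a sketch of the idea, not how unbound is implemented: TinyResolver and the upstream callable are made-up names for illustration.

```python
import time


class TinyResolver:
    """Toy model of 'answer from cache, else resolve'. Not unbound's real logic."""

    def __init__(self, upstream, clock=time.monotonic):
        # upstream is a hypothetical callable: name -> (answer, ttl_seconds)
        self.upstream = upstream
        self.clock = clock
        self.cache = {}  # name -> (answer, expiry_time)

    def query(self, name):
        entry = self.cache.get(name)
        if entry is not None and entry[1] > self.clock():
            # Cache hit with a TTL that is still valid: answer right away.
            return entry[0], "cache"
        # Unknown or expired: resolving starts, which needs a working upstream path.
        answer, ttl = self.upstream(name)
        self.cache[name] = (answer, self.clock() + ttl)
        return answer, "resolved"
```

The point of the model: a cached name keeps answering even when the upstream path (for example a broken IPv6 route) is down, which can hide an interface problem for a while.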
With unbound extended logging enabled, and if pfBlockerNG is installed, the Python module pfb_python will also log, so you can check whether the DNS traffic actually reached unbound.
I have the impression that unbound gets blamed for failing when it was actually up and running, but there was an "interface problem". Just an example: an IPv6 prefix has changed, the client wasn't aware and is still using the now deprecated IPv6 address, and as IPv6 traffic is preferred over IPv4, the communication fails.
If a resolver action is needed, same thing: IPv6 is preferred, and if there are issues with IPv6, unbound will / might switch over to IPv4 ... and if it doesn't for some reason, well, game over.
There are people that prefer to forward instead of resolving. And of course: forwarding over TLS.
That adds a massive quantity of "TLS" code (external libraries, etc.), and the slightest bug will make it fail again.
That's why I tend to think: DNS is important to me, so I keep it simple: I resolve.
I do use DNSSEC, which is more of a parallel process to the classic DNS handling.
Something that might protect me from potential issues: all my pfSense interfaces are hooked up to devices that are UPS protected (my upstream ISP router and all my 'core' switches), so my pfSense interfaces always stay 'up'.
And the final potential issue : ISP ... and IPv6.
For some reason there can't be an ISP that implemented IPv6 according to the existing RFCs ... Mine, the biggest in France (like 16 million clients), has known IPv6 flaws.
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
You have the choice to either use the default choice of not checking the "Disable Auto-added Access Control" under unbound "Advanced Settings", or to check it and configure the access to do queries against the DNS resolver of the pfSense boxes manually.
If you choose the default access settings I believe that access to the ::1 localhost interface has been forgotten in the access rules. After I changed to manual access rules, and added both 127.0.0.1 and ::1 to the access rules for allowing queries to the localhost interface unbound has been rock solid.
Mine is unchecked.
You said in the bug post :
I could probably have checked in the source code if my assumption is correct,
so why didn't you open that file?
Not the source: look at the unbound config file.
My access_list.conf file (sourced by unbound.conf):
access-control: 127.0.0.1/32 allow_snoop
access-control: ::1 allow_snoop
access-control: 127.0.0.0/8 allow
access-control: 192.168.1.0/24 allow
access-control: 192.168.2.0/24 allow
access-control: 192.168.3.0/24 allow
access-control: 192.168.100.0/24 allow
access-control: 2a01:cb19:dead:beef::/64 allow
access-control: ::1/128 allow
::1 is there - twice !
Isn't this what you mean ?
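Spotting such doubles by eye gets tedious on a longer list. Below is a small Python sketch that normalises each netblock (so ::1 and ::1/128 compare equal) and reports netblocks that appear with more than one action. The helper names are invented for illustration, and SAMPLE is just a shortened copy of the lines above.

```python
import ipaddress
from collections import defaultdict


def parse_access_control(text):
    """Return (network, action) pairs from 'access-control:' lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments such as "#Local"
        if not line.startswith("access-control:"):
            continue
        netblock, action = line[len("access-control:"):].split()
        rules.append((ipaddress.ip_network(netblock), action))
    return rules


def duplicate_netblocks(rules):
    """Map each netblock that occurs more than once to its list of actions."""
    by_net = defaultdict(list)
    for net, action in rules:
        by_net[net].append(action)
    return {net: actions for net, actions in by_net.items() if len(actions) > 1}


SAMPLE = """\
access-control: 127.0.0.1/32 allow_snoop
access-control: ::1 allow_snoop
access-control: 127.0.0.0/8 allow
access-control: ::1/128 allow
"""

for net, actions in duplicate_netblocks(parse_access_control(SAMPLE)).items():
    print(net, actions)  # ::1/128 ['allow_snoop', 'allow']
```

Note that 127.0.0.1/32 and 127.0.0.0/8 are not reported: they overlap, but they are different netblocks.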
Btw : I'm using 25.03 beta 2 for two weeks now, rock solid.
I presume 24.11 was also good.
-
When I do the search:
find / -name access_list.conf
The name seems to be with an "s" on list -> access_lists.conf
When I search the config /usr/local/etc/unbound/unbound.conf I do not find any reference to access_lists.conf.
When I search /var/unbound/unbound.conf I do see the reference to access_lists.conf.
What is the difference between the two config files?
I also see that access_lists.conf changes to my own rules when I check "Disable Auto-added Access Control".
So when I use manual access control the:
access-control: ::1 allow_snoop
is not included in access_lists.conf
Maybe the access line:
access-control: ::1/128 allow
isn't evaluated, since "access-control: ::1 allow_snoop" comes first in access_lists.conf when the auto rules are chosen?
This should indicate that ::1 has not been forgotten, but that is not what I experience.
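On the ordering question: as far as I read unbound.conf(5), the most specific netblock match is used and the order of the access-control statements does not matter. The Python sketch below models only that documented rule; matching_actions is a made-up helper, and when two lines name the identical netblock (as with ::1 and ::1/128) it returns both actions, because which one unbound keeps in that case is something I would not assume without testing.

```python
import ipaddress


def matching_actions(rules, client):
    """rules: list of (netblock, action) strings.

    Returns the action(s) of the most specific netblock containing `client`,
    or ['refuse'] when nothing matches.
    """
    addr = ipaddress.ip_address(client)
    best_len, actions = -1, []
    for netblock, action in rules:
        net = ipaddress.ip_network(netblock)
        if net.version != addr.version or addr not in net:
            continue
        if net.prefixlen > best_len:
            best_len, actions = net.prefixlen, [action]   # more specific match wins
        elif net.prefixlen == best_len:
            actions.append(action)                        # same netblock listed twice
    return actions or ["refuse"]


AUTO_RULES = [
    ("127.0.0.1/32", "allow_snoop"),
    ("::1", "allow_snoop"),
    ("127.0.0.0/8", "allow"),
    ("192.168.1.0/24", "allow"),
    ("::1/128", "allow"),
]

print(matching_actions(AUTO_RULES, "127.0.0.1"))     # ['allow_snoop']
print(matching_actions(AUTO_RULES, "::1"))           # ['allow_snoop', 'allow']
print(matching_actions(AUTO_RULES, "192.168.1.10"))  # ['allow']
```

So for IPv4 loopback the /32 entry wins over the /8 regardless of position, while the IPv6 loopback entry is the only one in the auto list where two actions compete for exactly the same netblock.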
-
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
When I do the search:
find / -name access_list.conf
All unbound related settings files are here:
/var/unbound/
If you use the general 'search all' command, you might find the same file elsewhere. Those copies are not used.
Run the magic command:
ps aux | grep 'unbound'
...
unbound 64814 0.0 2.8 142788 114284 - Ss 10:04 1:25.10 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
...
Now you know where the actual unbound.conf file is, and all the other files it includes, like the access_list.conf file.
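If you want to pull that path out in a script rather than by eye, here is a minimal Python sketch. config_path_from_ps is a made-up helper, and the sample line is simply the ps output shown above.

```python
import shlex


def config_path_from_ps(ps_line):
    """Return the argument that follows -c in a ps output line, or None."""
    tokens = shlex.split(ps_line)
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-c":
            return tokens[i + 1]
    return None


line = ("unbound 64814 0.0 2.8 142788 114284 - Ss 10:04 1:25.10 "
        "/usr/local/sbin/unbound -c /var/unbound/unbound.conf")
print(config_path_from_ps(line))  # /var/unbound/unbound.conf
```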
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
When I search the config /usr/local/etc/unbound/unbound.conf I do not find any reference to access_lists.conf.
When I search the /var/unbound/unbound.conf I do see the reference to access_lists.conf.
What are the difference between the two config files?
pfSense is based upon FreeBSD, but isn't FreeBSD.
pfSense uses FreeBSD packages, and when you install them, they can place config files somewhere under (for example) /usr/local/.... but these are rarely used by pfSense.
All (most) of the processes that are used by pfSense have their config files kept under /var/.....
About these:
access-control: ::1 allow_snoop
access-control: ::1/128 allow
I couldn't tell you what the difference is between allow_snoop and allow, or why both ::1/128 and ::1 are there.
For me, these two, like 127.0.0.1, are only used / reached by processes running on pfSense itself that need some host name to be resolved, like the pfSense package upgrade checker.
-
It says in the web gui what the differences are between the Allow and allow_snoop:
Allow: Allow queries from hosts within the netblock defined below.
Allow Snoop: Allow recursive and nonrecursive access from hosts within the netblock defined below. Used for cache snooping and ideally should only be configured for the administrative host.
I will start testing with the allow_snoop before or after the allow in my manual access list. Then we can see if this is the root problem.
-
allow or allow_snoop, that's one thing.
But what does this mean:
access-control: ::1 allow_snoop
access-control: ::1/128 allow
as ::1 and ::1/128 are the same for me.
So, allow_snoop gets set on ::1 and then overridden by 'allow'?
Here you can see how the access_lists.conf file gets created:
/etc/inc/unbound.inc
First, "127.0.0.1/32 allow_snoop" gets thrown in and then "::1 allow_snoop".
You and I don't choose the 'allow_snoop' from the GUI here, it's hard coded.
Then, all your local known interfaces, and this includes
127.0.0.0/8 allow
and
::1/128 allow
and as said: these are the same for me.
Note that the 'allow' here is the one I set up here :
I wonder what happens if I delete these two lines :
-
It is possible to configure manually the "Allow_snoop", by choosing "Allow_snoop" under "Action" in the web gui of unbound under "Access Lists". The sequence of rules shown in the lower part of that web page are the same as the sequence of the rules in the access_lists.conf file. I'm currently testing to see if putting the snoop rule first or last has any influence on the end result, but so far I can't say that it seems to have any effect.
What I see is that after a configuration change the resolver works for a few more minutes, then is unresponsive for some minutes, and then comes back. I wonder if it is pfBlockerNG's large DNSBL lists that need to be loaded before unbound can take care of resolving again?
After this down period of some minutes it again seems to be stable, no matter if the snoop rule is first or last. The only thing I'm not able to reproduce is making the rule in access_lists.conf 100% identical to the auto created rule.
Auto created, it looks like this:
access-control: ::1 allow_snoop
but when I manually create it I can't make it in any other way than this (mask needs to be selected, and if you do not select it will be auto created):
access-control: ::1/128 allow_snoop
I guess the two should be equivalent, unless there is a bug that causes trouble for the auto rule?
-
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
It is possible to configure manually the "Allow_snoop", by choosing "Allow_snoop" u
Nope.
I selected some random "Refuse Nonlocal" :
this creates :
access-control: 127.0.0.1/32 allow_snoop
access-control: ::1 allow_snoop
access-control: 127.0.0.0/8 allow
access-control: 192.168.1.0/24 allow
access-control: 192.168.2.0/24 allow
access-control: 192.168.3.0/24 allow
access-control: 192.168.100.0/24 allow
access-control: 2a01:dead:beef:a6e2::/64 allow
access-control: ::1/128 allow
#Local
access-control: fc00::/7 refuse_non_local
access-control: fe80::/64 refuse_non_local
access-control: 10.0.0.0/24 refuse_non_local
access-control: ::ffff:0:0/96 refuse_non_local
access-control: 192.168.4.0/24 refuse_non_local
access-control: 192.168.3.0/24 refuse_non_local
access-control: 2a01:dead:beef:a6e2::/64 refuse_non_local
so everything before
#Local
didn't change.
-
Are you sure you have disabled the auto rules?
The access_lists.conf does not look like that in my case with auto rules disabled.
-
When I check this :
( which I don't have checked right now )
I have to create my own access list .... so more chances to f##k up.
I'm a "leave it to default" guy.
-
@Gertjan said in Unbound/DNS resolver with IPv6 unreliable finally solved:
I wonder what happens if I delete these two lines :
I would delete ::1/128 allow, add the /128 CIDR notation to the ::1 allow_snoop entry manually, and leave 127.0.0.1/32 allow_snoop as is.
But I agree that neither may be necessary, as my auto-generated /var/unbound/access_lists.conf contains only the ACLs I've defined via the webGUI. No loopback addresses are present.
-
@tinfoilmatt said in Unbound/DNS resolver with IPv6 unreliable finally solved:
127.0.0.1/128
Isn't that a 'syntax error' ?
127.0.0.1/32 is as far as it goes.
-
I tried to add the:
access-control: ::1/128 allow_snoop
to my manual access list over the weekend. The result was that both the primary and the secondary firewall had an unresponsive unbound service on Sunday. Today I have removed the access rule above. We will see how this goes.
Does anybody know what this rule is for?
-
Yes, 127.0.0.1/128 is wrong, and 127.0.0.1/32 is correct, but I see that the auto rule allows 127.0.0.0/8. Is that necessary? If it is, which other IP addresses in 127.0.0.0/8 are in use?
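On the prefix-length point: a quick way to sanity-check a netblock before pasting it into an access list is Python's standard ipaddress module. This is only a sketch; check_netblock is a made-up helper, and unbound's own parser may accept or reject things differently.

```python
import ipaddress


def check_netblock(text):
    """Return the normalised netblock, or an 'invalid: ...' message."""
    try:
        # strict=False tolerates host bits being set, e.g. 192.168.1.1/24
        return str(ipaddress.ip_network(text, strict=False))
    except ValueError as exc:
        return f"invalid: {exc}"


for candidate in ("127.0.0.1/32", "127.0.0.1/128", "::1", "::1/128"):
    print(candidate, "->", check_netblock(candidate))
```

An IPv4 address only has 32 bits, so /128 is rejected for 127.0.0.1, while ::1 without a mask is normalised to ::1/128.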
-
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
but I see that the auto rule allow 127.0.0.0/8. Is that necessary? In case it is which other IP addresses in the 127.0.0.0/8 are in use?
127.0.0.0/8 is a bit large, true.
Execute for example
sockstat -4 | grep '127'
to see who is using 127.a.b.c
-
I can't see any address in 127.0.0.0/8 in use other than 127.0.0.1, so I would assume it would be OK to replace 127.0.0.0/8 with 127.0.0.1/32.
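The same check can be written down so it is repeatable: given the loopback addresses you saw in use (collected by hand from the sockstat output, the list below is purely hypothetical), report any that the wide rule covers but the narrow one would not. outside_narrow_rule is a made-up helper name.

```python
import ipaddress


def outside_narrow_rule(observed, wide="127.0.0.0/8", narrow="127.0.0.1/32"):
    """Return observed addresses inside `wide` that `narrow` would no longer cover."""
    wide_net = ipaddress.ip_network(wide)
    narrow_net = ipaddress.ip_network(narrow)
    if not narrow_net.subnet_of(wide_net):  # requires Python 3.7+
        raise ValueError(f"{narrow} is not inside {wide}")
    result = []
    for text in observed:
        addr = ipaddress.ip_address(text)
        if addr in wide_net and addr not in narrow_net:
            result.append(text)
    return result


# Hypothetical addresses, not taken from a real sockstat run.
print(outside_narrow_rule(["127.0.0.1", "127.0.0.53", "192.168.1.1"]))  # ['127.0.0.53']
```

An empty result means that narrowing the rule would not cut off anything that was observed, which says nothing about software that only uses another 127.x address occasionally.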
-
Sure.
Will it make any difference ?
Not sure.
-
@strandte said in Unbound/DNS resolver with IPv6 unreliable finally solved:
After I set up monitoring I found out that the DNS resolver on the pfSense boxes often stopped for a while and then automatically started to respond to queries again, and that the problem seemed to be more pronounced for resolving via the IPv6 addresses of the pfSense boxes. Often unbound stopped responding to queries sent to the IPv6 address, but still responded to queries sent to the IPv4 address. After a while both became unresponsive. When this was the case, restarting the service made it respond to queries again, but I think it might also have started again at some point if I had not done anything. When the service had stopped, this would of course not be the case.
I honestly don't think that the unbound access control settings are related to this issue. Unless access control for unbound simply prevents its endless restarts and refreshes, which, in turn, solves one problem but clearly causes a thousand others. In fact, unbound was rock-stable for me on 24.11 and earlier. But it "broke" on the 25.03 beta because pfSense suddenly decided that now, every time it receives configuration packets (RA info) from the ISP, it needs to refresh and update all related settings, including unbound, even if no changes are detected in the settings received. When I started digging into this issue, I was surprised to see just how many requests there were to stop and restart the service, sometimes ending with it stopping and not starting again. Ideally, with proper Python module integration, everything should be much more stable, but sometimes it is not.
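For anyone who wants to reproduce the kind of monitoring described in the quote (querying the resolver separately over its IPv4 and IPv6 address), here is a minimal Python sketch using only the standard library. It is not the monitoring the original poster used; build_query and probe are made-up names, and it only checks that some reply with the matching query ID comes back.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server, name="example.com", timeout=2.0):
    """Return True if `server` answers a UDP query on port 53 within `timeout` seconds."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    query = build_query(name)
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable networks
            return False
    # A DNS header is 12 bytes; the reply must echo our query ID.
    return len(reply) >= 12 and reply[:2] == query[:2]
```

Calling probe() once with the firewall's IPv4 LAN address and once with its IPv6 address, every minute or so, and logging both results with a timestamp would show whether IPv6 really stops answering before IPv4 does, and whether the gaps line up with unbound restarts in the resolver log.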
-
@Gertjan said in Unbound/DNS resolver with IPv6 unreliable finally solved:
Isn't that a 'syntax error' ?
Yes, typo. Post edited. Thanks for pointing it out.