Slow DNS after 22.05
-
@gertjan yeah, it would scream at you if something were wrong in your config options text box.
I think I have edited out all my mistakes of using the wrong setting name in the actual post text.
As a check: look at the resolver status item in the Status menu. If you see IPv6 in the cache, then the setting is not in effect. See my examples above showing the IPv6 address in the resolver status.
Now that I have put it back to no, there is no IPv6 listed there. And I have yet to see any issues in the browser with sites, etc. But then again, I only found that 1 issue when I was allowing use of my HE tunnel.. But different users go to different sites, etc. So if there is an issue with transport over IPv6, it might be so pronounced for some users that to them DNS is just borked, while for others it could be 1 out of 100 sites or something, and a simple refresh of the page makes it work so they don't even notice it.
My recommended settings for resolving would be to
turn on prefetch
serve 0
These can help reduce any sort of first-query delay when a resolver has to start from scratch and work all the way down from the roots. Are they required? No. Are they the default? I don't think so. But prefetch allows something that is close to expiring in the cache to get a refresh before someone asks for it, rather than the cache being empty and the name having to be resolved from scratch again.
Same thing with serve 0 TTL: this allows a client asking for whatever.something.com to get the last entry that was cached for it, and then in the background unbound will resolve it again.
This can also reduce any sort of delay with the initial resolve all the way from the roots.
Keep in mind that even a full resolve all the way from the roots shouldn't take very long, but these settings should help if for whatever reason there are delays in a full resolve: long CNAME chains, connectivity issues with a specific NS in the path to the specific FQDN, etc.
I have been using these for years, and have not seen any problems with them.
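For reference, a rough sketch of how the two settings above look as unbound.conf directives. The directive names are standard unbound options; that pfSense writes exactly these lines when the matching checkboxes in the resolver's advanced settings are ticked is an assumption, and on pfSense the checkboxes are the normal way to set them.

```conf
# Sketch only: unbound.conf equivalents of "prefetch" and "serve 0".
# Assumption: pfSense generates these from its GUI checkboxes.
server:
# Refresh frequently used cache entries shortly before their TTL expires.
prefetch: yes
# Answer from an expired cache entry and re-resolve it in the background.
serve-expired: yes
```

Neither option changes how a name is resolved; both only try to hide the delay of a full resolve from the client.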
-
Isn't disabling IPv6 in unbound just masking the problem? It may be a great work-around, but it seems a little short-sighted.
This behavior changed between 22.01 and 22.05 due to unbound itself changing. Since pfSense defaults to it instead of BIND, that dependency is clear and the problem needs to be escalated properly for review.
Yes, we're in an IPv6 transition period and yes, today, you MIGHT be able to disable it and walk away, but I would consider it a bit of a head-in-the-sand response.
What is the proper channel for escalating observations to the unbound team?
-
@lohphat said in Slow DNS after 22.05:
What is the proper channel for escalating observations to the unbound team?
Hello!
Isn't the fix for the do-ip6 workaround (as derived from https://github.com/NLnetLabs/unbound/issues/670) already in the pfSense development snapshots?
John
-
@lohphat said in Slow DNS after 22.05:
Isn't disabling IPv6 in unbound just masking the problem
See the "670" link below.
Short answer : yes, of course, it's a sledge hammer solution.
But read the bug thread @redmine from unbound : it looks like buffers fill up to the max, and outstanding requests get aborted. Reading the proposed patch also makes me think the code wasn't (isn't) resilient enough. The issue might even be a bad DNS record / bad DNS setup on the DNS server side. And the result is : unbound bails out.
The "670" thread mentioned other variables you can set to bigger values so cache memory becomes bigger.
IPv6 isn't always well implemented. Or less tested. And there is more to test.
@serbus said in Slow DNS after 22.05:
Isn't the fix for the do-ip6 workaround (as derived from https://github.com/NLnetLabs/unbound/issues/670) already in the pfSense development snapshots?
If that unbound version can be back ported to 12.3 or whatever FreeBSD it has to use.
Netgate isn't using the latest and greatest. As the latest also has less known bugs ^^
@johnpoz said in Slow DNS after 22.05:
I only found that 1 issue when I was allowing use of my HE tunnel..
In the past, I had way more issues using he.net.
The web site of my own ISP, Netflix, and yes, Apple (the cloud stuff) : when I was using IPv6 ( using he.net ) pages wouldn't load, stalled, CSS was broken.
So I used pfBlockerng-devel with the No-AAAA option, so I could list sites that I wanted to talk to using IPv4 only.
That's now over :
My Netflix works well over the he.net IPv6 tunnel. More and more sites are IPv6 without any IPv4 pollution.
And remember : the IPv6 of he.net can be considered as a VPN access.
These are the access points : https://tunnelbroker.net/status.php. I took the closest to me of course, but I could take a tunnel server in the States, and then I would have a US IPv6 ?
No need to recall : some sites don't like to be accessed using VPN !? :)
@johnpoz said in Slow DNS after 22.05:
turn on prefetch
serve 0
Yeah, reading the description made me think : these are too good to be true !
I've checked these from day 1.
-
@johnpoz said in Slow DNS after 22.05:
My recommended settings for resolving would be to
turn on prefetch
serve 0
Would you recommend these instead of 'do-ip6: no' or as well as?
Happy to play around with settings and see what impact it has.
Strangely, I lost internet last night and it didn't come back on its own. I had to bounce pfSense and all my other networking gear to get it back for some reason. I did notice the dpinger gateway monitoring service had died.
I suspect all my kit was overheating because it's pretty hot here in the UK at the moment and my makeshift comms cabinet has limited airflow. One of the things on my very long list to sort out! For now I've strung a load of computer fans together and chucked them in there, which seems to be doing the job.
-
@gertjan This is the reply I was hoping for. Thank you.
It's clear there IS a problem with unbound running out of resources and thus is affecting 22.05. My hunch that all (or close to all of) the reports of broken DNS had IPv6 enabled as a common symptom has been corroborated.
The fact that there's been a significant and open unbound bug since April is interesting; that this known problem wasn't included in a "known issues" section of the pfSense release notes is of minor concern. We all know there are open issues in all products in our modern software-driven world. But we shouldn't have to make upgrade decisions blindly.
Could this be a wake-up call, so that upon pfSense releases there's an inclusion of known open issues from the upstream BSD or component (e.g. unbound) bug trackers, and we can make a better-informed decision to apply the update or not?
-
Okay all,
After all the comments and useful information provided, we can come to the conclusion that the essence lies within IPv6, DNS (unbound) and 22.05.
With that in mind I reversed my previous temporary solution of enabling query forwarding (essentially forwarding all my requests to an upstream provider). Essentially, back to basics.
It took a lot of settings changes and some headache, because my provider isn't that informative when it comes to having your own firewall/router, but I configured IPv6 to function completely. Testing this with different websites like https://ipv6-test.com/ and https://test-ipv6.com/ I could confirm that my configuration was a success.
I can fully confirm that with a non-working IPv6 configuration, or a provider that doesn't support it, you should look at eliminating IPv6 from your current config as suggested in this forum thread.
OK... I must say it has only been running for an hour... but all seems fine now.
So my suggestion would be to check if your provider supports IPv6. If so, check your settings and follow the test websites to see if you're resolving OK and your config is as expected. If yes, you are probably in the clear and smiling. If not, then:
Option 1: start all over again with your IPv6 config as I did several times (TEST TEST TEST)
Option 2: just follow the instructions for disabling IPv6 in the resolver and wait for your provider to fully support IPv6 as they should.
Basically: standard pfSense configuration with a good ISP IPv6 config.
-
@johnpoz said in Slow DNS after 22.05:
@pcol-it-admin said in Slow DNS after 22.05:
said that they had "stock" pfSense DNS resolver settings
I find this is rarely the case to be honest..
I run pfSense in a proxmox VM on a Dell workstation. These are my DNS settings, which exhibit the issue.
As previously indicated, I believe the only two "non-stock" options are the ones related to DHCP.
My problem is 100% reproducible. If I use the pfSense resolver, after a random period of time I will start to get NXDOMAIN errors in Chrome for common websites. Hitting refresh/reload a couple of times will clear the error and the page will load. It's not because I have fat fingers and am typing facebook.c0m rather than facebook.com.
This is less of an issue for me as I simply spun up a pi-hole LXC on my proxmox server and redirected all my DNS queries to the pi-hole (which has no issues with resolution), but obviously not everyone has this option.
-
@tentpiglet
Do some more testing with this option removed :
as it is perfectly normal to see NXDOMAIN popping up once in a while : unbound is restarting because of DHCP lease activity.
Add some static DHCP MAC leases for devices that you always want to have the same IP, like printers, servers, NAS etc.
-
@gertjan
My tests and conclusions, with all your and others' help, point to IPv6 as the conclusive problem of this build when IPv6 is not correctly configured.
Ruining your setup with other settings may make reverting back a lot harder for some.
-
@tentpiglet
Did you read my post?
Can you verify that your IPv6 setup is correct?
You can check in advance by forwarding all your DNS requests in the resolver to your provider's DNS servers.
When you have a working IPv6 connection you can probably revert to the basic configuration.
If not, then just use the 'do-ip6: no' option in the resolver.
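As a sketch, the workaround goes into the DNS Resolver custom options box roughly like this, using the same layout as the custom options quoted elsewhere in this thread:

```conf
# Custom options sketch: stop unbound from using IPv6 as transport.
# This is a workaround, not a fix; remove it once IPv6 resolving works.
server:
do-ip6: no
```

After saving, the resolver status page should no longer list IPv6 addresses in the cache, which is the check described earlier in the thread.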
-
@lohphat said in Slow DNS after 22.05:
The fact that there's been a significant and open unbound bug since April
.....
...upon pfSense releases that there's an inclusion of known open issues
I can only say : it looks like what's described there.
For me : it's an OpenBSD thing. If it was an 'any' BSD bug, then why specify OpenBSD ?
The bug was also closed back in April 2022.
Also : I'm using IPv6, and do not have any issues whatsoever.
What should Netgate have to do : list every closed bug from an external package in the past as "maybe not solved yet" ? That would be thousands of entries.
I saw you posted to the bug report @unbound.
You should do what has been asked many times over there : you should add complete, very detailed unbound logs, so the author can see what's up and confirm what happened.
Right now, they (the author) will say : use the unbound version with the merged solution included, and that's not possible right now.
All this IMHO of course.
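For the detailed logs asked for in the bug report, a minimal sketch of custom options that raise unbound's logging. The directive names are standard unbound.conf options; the values are only illustrative, and pfSense also has a log level setting in the resolver's advanced settings that may be the cleaner way to change verbosity. This much logging is noisy, so it is meant for a short test window.

```conf
# Sketch: temporary debug logging for unbound (values are illustrative).
server:
# 0-5; level 3 adds query-level information.
verbosity: 3
# Log each incoming query and each reply sent.
log-queries: yes
log-replies: yes
# Log the reason when a query ends in SERVFAIL.
log-servfail: yes
```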
-
@mihaifpopa said in Slow DNS after 22.05:
Anyone else experiencing this?
This has been an amazing post... I got my issues fixed with the contributions of everyone, and in that process I got to learn how to debug dns unbound issues and get IPv6 working in my lab.
@Gertjan's contributions have been great - made me want to start looking at Server Monitoring with Munin.
-
Can you verify that your IPv6 setup is correct?
ip6 is functioning. My wan has a 2001: address and clients on my network have a 2601: address. I can ping 2001:4860:4860::8888 from any of my network clients.
-
@tentpiglet did you try the tests on the suggested websites as well? They will sometimes give you a bit more insight.
-
@gertjan The bug may have been closed in April but the issue still remains open in the 22.05 pfSense distribution since it's the unpatched unbound.
The issue should remain in a "Known Issues" list until the fix makes it into the next pfSense release. If that's not going to happen, then the process which Netgate uses to determine release viability needs to review upstream issues before release if it's not going to compile them for customers to review.
If the business model relies on components from upstream providers, then doing some legwork to determine you're not inheriting problems sight-unseen seems reasonable. For example, if there are included 3rd party modules not maintained by the FreeBSD distro (e.g. dhcpd, dpinger, igmpproxy, ntpd, radvd, sshd, syslogd, unbound, watchdogd) which are installed by default, then the "What's changed" notes of each should be reviewed by the release team to see what was changed, and whether the versions included have had subsequent issues discovered, before the new versions are passed on to us.
In all the software companies I've worked in, the release team took care of watching dependencies for any OSS (or commercial) component we then redistributed in our products.
-
@lohphat
I fully agree with you.
What would have happened if I had been testing these new versions ?
For me, on my Netgate SG 4100, unbound works fine, and I'm using IPv6. Half of all DNS requests go out over IPv6.
For me, unbound runs solid for days, and gets restarted because my pfBlockerng-devel reloads it after a week ( I'm not updating non updated feeds every hour or so).
I would have said to the Netgate team : for me, these new versions, like unbound, are ok.
The thing is : there are people using settings, or hardware, that differ from what Netgate used to test.
There are situations where the error pops up.
As we all use the exact same unbound binary code, and the same pfSense code, only our settings can differ. And our uplink ....
-
I don't have any DNS issues, but just out of curiosity regarding unbound settings...
I assume that the config file is /var/unbound/unbound.conf, because the custom options get added to that file if set via the resolver settings.
Here are my IPv6 system options; also, no custom options are set under DNS Resolver.
When looking at the beginning of the unbound.conf file, the "do-ip6" is set to "no"...
##########################
# Unbound Configuration
##########################

##
# Server configuration
##
server:
chroot: /var/unbound
username: "unbound"
directory: "/var/unbound"
pidfile: "/var/run/unbound.pid"
use-syslog: yes
port: 53
verbosity: 1
hide-identity: yes
hide-version: yes
harden-glue: yes
do-ip4: yes
do-ip6: no
do-udp: yes
do-tcp: yes
do-daemonize: yes
Does pfSense actually also disable unbound IPv6 when IPv6 is disabled from System/Advanced settings?
Of course I could have tested it myself, but I didn't want to mess with my working system...
This is with pfSense Plus 22.05.
-
@mvikman do you have an actual IPv6 address?
here is from my config
##
# Server configuration
##
server:
chroot: /var/unbound
username: "unbound"
directory: "/var/unbound"
pidfile: "/var/run/unbound.pid"
use-syslog: yes
port: 53
verbosity: 1
hide-identity: no
hide-version: no
harden-glue: yes
do-ip4: yes
do-ip6: yes
do-udp: yes
do-tcp: yes
do-daemonize: yes
Lower in the config is where your custom options get set, and they can override what is set above.
# Unbound custom options
server:
do-ip6: no
private-domain: "plex.direct"
local-zone: "use-application-dns.net" always_nxdomain
-
I have IPv6 disabled and I don't have IPv6 address, my ISP doesn't support it.
The current unbound config file doesn't have that custom options section, because I haven't set any custom options.
But I tested adding the custom options and it does add them to the config file.
Just curious why unbound's "do-ip6" is set to "no" without using custom options to set it.