23.01 Upgrade unbound Issue
-
@inferno480 do you have DNSSEC checked, and are you forwarding? If so, to where?
Have you checked the date and time on your box?
-
@johnpoz Time/date seem accurate; they are NTP-synced to 2.pfsense.pool.ntp.org, and there is nothing noteworthy in the NTP logs.
"DNSSEC" is checked under General Settings, I am using DNS Forwarding and my servers are Google DNS with the two V6 servers listed first, then the V4. DNS Resolution Behavior is "Use local DNS (127.0.0.1), fall back to remote DNS Servers (Default)". None of this was modified from 22.05 and I never had a problem until 23.01.
Any suggestions on additional logging I can enable (and how), for the next time it happens? I realize the randomness can make things difficult to troubleshoot.
-
@inferno480 uncheck DNSSEC and I suspect your issues will disappear. Unbound seems more sensitive in this version when DNSSEC is used together with forwarding. As discussed in this and other threads and the pfSense troubleshooting doc, DNSSEC is irrelevant if you’re already trusting the other DNS servers to do the lookup for you.
-
If 'DNSSEC' is enabled, pfSense fetches a copy of the known-good DNSSEC root key while preparing to start unbound:
/usr/bin/su -m unbound -c '/usr/local/sbin/unbound-anchor -a /var/unbound/root.key'
If this fails, unbound will know it is using a bad copy, and bail out.
So, you could check with:
/usr/bin/su -m unbound -c '/usr/local/sbin/unbound-anchor -4 -v -a /var/unbound/root.key'
to know if your IPv4 works.
The same for IPv6:
/usr/bin/su -m unbound -c '/usr/local/sbin/unbound-anchor -6 -v -a /var/unbound/root.key'
Remember: without -v, no return message means all is well.
You could check the content and time stamp of the /var/unbound/root.key file to see for yourself.
But if you are forwarding, as said a million times by now: disable DNSSEC.
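A minimal sketch of that root.key check, assuming a healthy anchor file is non-empty and holds at least one DNSKEY record. The function name check_anchor is made up for illustration; it is not part of pfSense or unbound.

```sh
# Sketch: sanity-check a trust anchor file such as /var/unbound/root.key.
# Assumption: a usable anchor file is non-empty and contains a DNSKEY record.
check_anchor() {
    anchor_file="$1"
    if [ ! -s "$anchor_file" ]; then
        echo "missing or empty: $anchor_file"
        return 1
    fi
    if ! grep -q 'DNSKEY' "$anchor_file"; then
        echo "no DNSKEY record in: $anchor_file"
        return 1
    fi
    # ls -l prints the modification time for a manual look at the time stamp.
    ls -l "$anchor_file"
    return 0
}

# Usage on the firewall (path taken from the commands above):
# check_anchor /var/unbound/root.key
```

This only tells you the file looks plausible; whether the key itself is current is still up to unbound-anchor.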
Remember: forwarding means you don't want certified DNS answers. You've decided to trust "someone else".
That's why forwarding is not the default mode.
-
It seems there's also an IPv6 ACL bug, if set to listen on "all" interfaces, that now has a patch:
https://forum.netgate.com/topic/176989/problems-with-pfsense-ipv6-dns-function-does-it-exist/36
-
@bingo600 said in 23.01 Upgrade unbound Issue:
following this guide
https://github.com/jpgpi250/piholemanual
I see that includes OneOffDallas. (@moelassus)
It has an important typo though: the three local-zone lines have a leading space inside the quotes:
local-zone: " use-application-dns.net."
That doesn't work in my testing; needs to be:
local-zone: "use-application-dns.net."
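A quick way to spot that kind of typo in a config file is a grep for a quote followed directly by whitespace. This is only a sketch: the function name find_padded_zones is hypothetical, and the file you point it at depends on where you pasted the custom options.

```sh
# Sketch: flag local-zone lines whose quoted zone name starts with whitespace.
# Prints the offending lines with line numbers; exit status 0 means at least one was found.
find_padded_zones() {
    grep -nE '^[[:space:]]*local-zone:[[:space:]]*"[[:space:]]' "$1"
}

# Usage (the file name is a placeholder for your own config file):
# find_padded_zones my-custom-options.conf
```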
-
@steveits
Nice catch.
Did you use the new guide, or the old one I posted as a "PDF/zip" here (bottom):
https://forum.netgate.com/post/1089443
I just briefly skimmed the new guide, and it seemed very "complicated".
I have just implemented the "old guide".
/Bingo
-
@bingo600 I just looked through what was on GitHub and set it up myself. I’d already done something similar using pfBlocker’s Greatwall list.
-
@stephenw10 said in 23.01 Upgrade unbound Issue:
Mmm, it's odd because if it's enabled and fails then whatever you're forwarding to doesn't support it. So I would expect it to be an all or nothing situation.
I wonder if it was previously disabled automatically in the old Unbound version.
I'm starting to think it's TLS forwarding. We have changed over about a dozen firewalls now and all are having DNS issues with 23.01. Disabling "Use SSL/TLS for outgoing DNS Queries to Forwarding Servers" seems to be the only way to keep DNS Resolver working. We just switched all of them this morning to see if that holds up.
-
@cylosoft Interesting, let us know. I haven't noticed any DNS issues at home in a week after disabling DNSSEC while forwarding. Haven't upgraded others yet.
Either way 23.01 does have different/problematic behavior than prior versions for people, since there are a lot of posts about DNS.
To be fair I recall plenty of posts about DNS issues in 22.05, but I did not experience that in the routers we upgraded to 22.05.
@stephenw10 said in 23.01 Upgrade unbound Issue:
wonder if it was previously disabled automatically in the old Unbound version
Maybe internally to Unbound? I suggested pfSense do that in a redmine and it was declined, in the context of not wanting to disable people's security choices unexpectedly.
-
@steveits said in 23.01 Upgrade unbound Issue:
in the context of not wanting to disable people's security choices unexpectedly.
To be honest, this is a fair point of view. Many, many years ago there was a pretty lively debate about this. My point was that there is zero point in asking for DNSSEC on your end if you're forwarding. Where you forward to is either doing it or they're not; you asking for it is only going to be problematic at best.
Maybe no issue, maybe issues like people seem to be seeing more of currently. It might have to do with what domains you're doing queries for and such. Either way, even if there are no problems, you're doing extra queries for no good reason. It's not like you can ask for DNSSEC from where you forward and expect DNSSEC to actually function correctly. At least not with a service that doesn't explicitly support its clients doing such a thing, and I am not aware of any service running in such a mode. I don't even think it would be possible, to be honest.
While it's nice of pfSense to try and prevent you from shooting yourself in the foot when possible, to be fair the admin of the box/service/firewall/whatever needs to, or should, take responsibility for proper configuration. It is almost impossible to completely, let's use the term, "idiot proof" something as complex as a firewall.
I would like to see at least a clearer warning along the lines of: hey, you're probably going to have problems if you do this and forward.
Something like the warning about strict qname use.
The only caveat to keep in mind is that pfSense is used by a lot of people who are not network engineers and who might have very limited, if any, understanding of how DNS works at all, let alone DNSSEC. And many users don't even pay attention to the notes on check box items either ;)
I have worked with many a fellow engineer over the years who might be able to give you all the inner workings of BGP, or spanning tree, or IGMP, etc., but is pretty clueless about the inner workings of something as day to day as DNS ;) So it might be a bit much to expect some home user trying to secure their network to understand all the ins and outs of how DNS works. I see it all the time: users don't actually understand the difference between resolving and forwarding.
It can be a difficult situation, that is clear.
-
No issues since I last posted (~2 weeks ago), after disabling DNSSEC entirely.
-
Since upgrading to 23.01 I have been noticing strange intermittent connectivity issues, with trouble tracing it down. What I found is it seems to be linked to this same issue.
My setup is unbound forwarding to cloudflare - Tested both lists below
- 1.1.1.1, 1.0.0.1, 2606:4700:4700::1111, 2606:4700:4700::1001
- 1.1.1.2, 1.0.0.2, 2606:4700:4700::1112, 2606:4700:4700::1002
DNSSEC has always been disabled.
Forward over TLS has always been enabled.
I started noticing ServFail errors in tcpdump for DNS queries, so I enabled unbound debugging. I seem to have the same problem over IPv4 and IPv6. Below you will see logs.
Once I disabled both TLS options below, I stopped seeing the ServFail errors.
- Use SSL/TLS for outgoing DNS Queries to Forwarding Servers
- Respond to incoming SSL/TLS queries from local clients
This all started once I upgraded from 22.05 to 23.01. If I boot back into 22.05 the problem goes away with TLS enabled.
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] info: iterator operate: query mobifts.ebay.com. AAAA IN
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] info: processQueryTargets: mobifts.ebay.com. AAAA IN
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] debug: configured stub or forward servers failed -- returning SERVFAIL
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] debug: return error response SERVFAIL
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] debug: cache memory msg=1919609 rrset=1370978 infra=8062 val=0
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] debug: tcp error for address 2606:4700:4700::1001 port 853
Mar 6 08:48:41 mortis unbound[8065]: [8065:1] debug: iterator[module 0] operate: extstate:module_wait_reply event:module_event_noreply

Mar 6 08:48:41 mortis unbound[8065]: [8065:0] info: iterator operate: query api.snapkit.com. AAAA IN
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] info: processQueryTargets: api.snapkit.com. AAAA IN
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: configured stub or forward servers failed -- returning SERVFAIL
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: return error response SERVFAIL
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: cache memory msg=1919537 rrset=1370978 infra=8062 val=0
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: outnettcp got tcp error -1
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: tcp error for address 1.1.1.1 port 853
Mar 6 08:48:41 mortis unbound[8065]: [8065:0] debug: iterator[module 0] operate: extstate:module_wait_reply event:module_event_noreply
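To see whether the TCP errors in a log like this cluster on one forwarder or hit all of them, the "tcp error for address" lines can be tallied per address. A rough sketch, assuming the log text looks like the excerpt above; the function name tcp_errors_by_forwarder is made up, and /var/log/resolver.log as the log location is my assumption, so adjust it to wherever your unbound log ends up.

```sh
# Sketch: count "tcp error for address <ip> port 853" debug lines per forwarder address.
# Reads the files given as arguments, or stdin when none are given.
tcp_errors_by_forwarder() {
    awk '
        /tcp error for address/ {
            # Walk the fields so several log entries joined on one line are all counted.
            for (i = 2; i < NF; i++) {
                if ($i == "address" && $(i - 1) == "for") { count[$(i + 1)]++ }
            }
        }
        END { for (addr in count) print count[addr], addr }
    ' "$@" | sort -k1,1nr -k2,2
}

# Usage (log path is an assumption):
# tcp_errors_by_forwarder /var/log/resolver.log
```

If only the IPv6 or only the IPv4 forwarders show up, that points at one address family; if every forwarder shows up, the TLS side looks more suspect.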
-
@defunct78 said in 23.01 Upgrade unbound Issue:
Once I disabled both TLS options below, I stopped seeing the ServFail errors.
And it sounds like those were random? For me if I enabled DNSSEC again the domain didn't immediately fail so it may have been something that was random or cropped up over time.
For me (using Quad9) "Use SSL/TLS for outgoing DNS Queries to Forwarding Servers" is checked with no issues but "Respond to incoming SSL/TLS queries from local clients" has never been enabled.
-
@steveits said in 23.01 Upgrade unbound Issue:
checked with no issues but "Respond to incoming SSL/TLS queries from local clients" has never been enabled.
Local "DNS over TLS" was not supported by Windows 10 and below. When testing this one:
I had to equip my Windows 10 with special, 'hard to find' software that extends the Windows 10 DNS lookups. With this software, DNS was going out to port '853' instead of '53'.
The certificate used by unbound now becomes important: the certificate host name(s) must now match the pfSense LAN IP.
It's nice to have this one, and I'll be using it when my hat starts to have a sub-micron thickness.
@defunct78 said in 23.01 Upgrade unbound Issue:
Since upgrading to 23.01 I have been noticing strange intermittent connectivity issues, with trouble tracing it down. What I found is it seems to be linked to this same issue.
My setup is unbound forwarding to cloudflare - Tested both lists below
- 1.1.1.1, 1.0.0.1, 2606:4700:4700::1111, 2606:4700:4700::1001
- 1.1.1.2, 1.0.0.2, 2606:4700:4700::1112, 2606:4700:4700::1002
I've been using these for two weeks or so.
DNS traffic was mostly IPv6 based to these servers. I didn't notice any issues whatsoever, although I did not push the unbound log details to the max.
I'm using 23.01 on a 4100, and changed my internet access last month from VDSL (20 Mbit/s) to a whopping 'more than half a Gbit/s, symmetrical'.
So, what's good and what's right, these days, I can't really tell.
Added to that: I had to abandon the IPv6 setup I had been using for the last several years, as the new ISP router doesn't want to pass the needed protocol '6in4' (a protocol like ICMP, TCP, UDP, GRE, etc.). So I was forced to use the ISP's IPv6, leaving me with just one prefix.
I tend to say: it seems to be working OK.
Remember: IPv4 and IPv6 flow over the same wire, but are not the same.
Things that are not an issue for IPv4 can still hurt IPv6.
A good (?) way to try your IPv6 is to block all outgoing IPv4 traffic and see what happens.
You will have issues, and the bad news is that most of them won't be on your side only.
-
@defunct78 Adding more details to my post.
tcpdump on the inside shows the ServFail as stated. Enabling TLS causes these errors. Again, DNSSEC has always been disabled.
13:48:47.739211 IP (tos 0x0, ttl 64, id 57751, offset 0, flags [none], proto UDP (17), length 59, bad cksum 0 (->dab3)!) 192.168.X.254.53 > 192.168.X.24.63104: [bad udp cksum 0xbe9f -> 0xb98a!] 11684 ServFail q: AAAA? i.ebayimg.com. 0/0/0 (31)
and IPv6
13:32:22.688367 IP6 (hlim 64, next-header UDP (17) payload length: 41) XXX:XXX:XXX:30::1.53 > XXX:XXX:XXX:30:f470:14f5:f634:1308.55800: [udp sum ok] 5238 ServFail q: AAAA? ssl.gstatic.com. 0/0/0 (33)
I am not seeing errors on the WAN side, though that data is encrypted so it is a bit harder to see the content. I have tried both Quad9 and Cloudflare. I also disabled IPv6 on the client side just to isolate the issue. None of these changed the behavior.
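To get a feel for how random the failures are, the ServFail answers in a saved capture can be counted per question. A sketch only, assuming verbose tcpdump text in the same shape as the two lines above (question printed as "q: AAAA? name."); the function name servfail_by_name and the capture file name are hypothetical.

```sh
# Sketch: count ServFail responses per query type and name in tcpdump text output.
# Reads the files given as arguments, or stdin when none are given.
servfail_by_name() {
    awk '
        /ServFail/ {
            for (i = 1; i + 2 <= NF; i++) {
                # The question appears as: q: <TYPE>? <name>
                if ($i == "q:") { count[$(i + 1) " " $(i + 2)]++ }
            }
        }
        END { for (q in count) print count[q], q }
    ' "$@" | sort -k1,1nr -k2
}

# Usage (capture.txt is a placeholder for saved tcpdump output):
# servfail_by_name capture.txt
```

If the same handful of names fails over and over, that suggests a per-domain problem; failures spread evenly across names fit better with the forwarder connection itself dropping.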