Strange DNS Issue
-
I'm having a strange DNS issue that just popped up out of nowhere on my family's network, which I manage. My mother has her own WordPress site and for years has had no issue accessing it. A couple of days ago it became unreachable, as did the host. I'm using Unbound in python mode with pfBlockerng. (I've been through all the pfblocker logs to ensure it didn't get put in a spam list.) dig returns an A record but no nameservers. Chrome times out when attempting to browse to the domain.
I've bypassed pfSense entirely and everything works. The ISP DNS servers provide the nameservers and the webpage loads. Interestingly, when I tell dig to use 8.8.8.8 or 1.1.1.1, they don't provide the nameservers either. At one point, Unbound was taking seconds, returning SERVFAIL, or timing out when attempting to resolve the domain, until I enabled forwarding mode.
Running dig with +trace against Unbound provides some interesting info:
; <<>> DiG 9.20.11-1-Debian <<>> nanjones.com +trace
;; global options: +cmd
. 86126 IN NS b.root-servers.net.
. 86126 IN NS i.root-servers.net.
. 86126 IN NS g.root-servers.net.
. 86126 IN NS f.root-servers.net.
. 86126 IN NS l.root-servers.net.
. 86126 IN NS d.root-servers.net.
. 86126 IN NS e.root-servers.net.
. 86126 IN NS h.root-servers.net.
. 86126 IN NS a.root-servers.net.
. 86126 IN NS m.root-servers.net.
. 86126 IN NS c.root-servers.net.
. 86126 IN NS k.root-servers.net.
. 86126 IN NS j.root-servers.net.
. 86126 IN RRSIG NS 8 0 518400 20250813210000 20250731200000 46441 . EuulD1XC8Msqhtca+sTPLg1yyIDKKxMlDIH7VPNj6+gMqd6LIpkKebV+ Fsi4RNtqKpEd9Wm65pzJeSRSIrZIy8uCR+pah3TR35hu+I2bnbwvDr7N JZyamoikwuxxMKMEzTQi2Zhv11h89aycz7om+FKQBtO6ygE9dq4EdlEh t4uXbwJqUKWjilggvpZ/YampYQ3E3x0KyNAwW+FBzGQS/kXd2uZnAeyX 0JGDcMfuYK1JXXCw7O9JbrvY/bEADlcbtbMwFHJH5HDgwL4YYbg3qYjP t4cQxEd1au2jHmjJQQb/gR5/Qp7YygaFSiyU9zz0WF6ddQs35T9zbDt4 azYrcg==
;; Received 525 bytes from 10.0.1.1#53(10.0.1.1) in 7 ms
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 86400 IN DS 19718 13 2 8ACBB0CD28F41250A80A491389424D341522D946B0DA0C0291F2D3D7 71D7805A
com. 86400 IN RRSIG DS 8 1 86400 20250814050000 20250801040000 46441 . i4ICsCSGQ3UwnzYwyCpPiAPvm6pfJ0WR5bvA/XfjtwtGNCvHT9sm+Xl9 6v04XpRv6AQ6fgN/pNEfNw66QW9dcXsocE2Ozo0xuGfeL/+yIUppswXJ DHISRR2zsOM/DmgDGKetJYnBMTYf1+SAI0cPaW4o1OiLZe3MQDvI0Lc1 49mAtWRFEAjAM5OZIlNmH+LDQeJzoxNa5dj/vVOSrVeLqz+BPdEEZQfu AYDZR1XoLtME+VwvfPvrovDAbHrlLXIK4LtcFYhPhvseGOL3K61zsP5e yBjNBtM/9I2l5wcVRKHNZvuz7rZulsmzjT8xKgHKQiRP5IyQLS+U3XVm cKYpJw==
;; Received 1172 bytes from 192.5.5.241#53(f.root-servers.net) in 15 ms
;; UDP setup with 2001:500:856e::30#53(2001:500:856e::30) for nanjones.com failed: network unreachable.
;; no servers could be reached
;; UDP setup with 2001:500:856e::30#53(2001:500:856e::30) for nanjones.com failed: network unreachable.
;; no servers could be reached
;; UDP setup with 2001:500:856e::30#53(2001:500:856e::30) for nanjones.com failed: network unreachable.
;; UDP setup with 2001:503:39c1::30#53(2001:503:39c1::30) for nanjones.com failed: network unreachable.
nanjones.com. 172800 IN NS ns1.fistbumpmedia.com.
nanjones.com. 172800 IN NS ns2.fistbumpmedia.com.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 900 IN NSEC3 1 1 0 - CK0Q3UDG8CEKKAE7RUKPGCT1DVSSH8LL NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 900 IN RRSIG NSEC3 13 2 900 20250807002527 20250730231527 20545 com. pOAE1M0WzrJFKtlgIQj0ip94mDGQHGAEvxkjL7SsPkQ+oURqShq6m05y KggqbxYQi+Us7hpYcwdfw5JdBB89zg==
418H8RL4US2HVGB09CCP6J0K7UN89E3K.com. 900 IN NSEC3 1 1 0 - 418HJCM0G00EJ5F8R6EOHSCHDVHBODA3 NS DS RRSIG
418H8RL4US2HVGB09CCP6J0K7UN89E3K.com. 900 IN RRSIG NSEC3 13 2 900 20250808022636 20250801011636 20545 com. lJVUuNS6phzZL2gws/QiASwDs9ERtOOwOL2/Ifm92hHUS+RW2XkWnAxD lD5BioKvavx3ljYmWH4qeRSwaNqPBg==
;; Received 480 bytes from 192.42.93.30#53(g.gtld-servers.net) in 27 ms
;; communications error to 98.142.103.195#53: timed out
;; communications error to 98.142.103.196#53: timed out
;; no servers could be reached
I get the same results with 8.8.8.8 and 1.1.1.1. If I run dig with the ISP DNS servers, I don't get the same result, but the root servers state NXDOMAIN. No connection issues though.
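To make that comparison repeatable, a small loop can ask each resolver for the domain's NS records and print what comes back. This is only a sketch: the function name ns_from_resolvers is made up for illustration, and which resolver addresses to pass in is left to the caller.

```shell
# Hypothetical helper (not part of dig or pfSense): ask each resolver for the
# NS records of a domain and print one summary line per resolver.
# Usage: ns_from_resolvers DOMAIN RESOLVER...
ns_from_resolvers() {
    domain=$1
    shift
    for resolver in "$@"; do
        # +short prints only the record data; +time/+tries keep a dead
        # resolver from stalling the loop. dig exits non-zero when no
        # server could be reached.
        if answer=$(dig +short +time=3 +tries=1 "@$resolver" "$domain" NS) \
            && [ -n "$answer" ]; then
            # Join the returned names onto one line.
            echo "$resolver: $(printf '%s' "$answer" | tr '\n' ' ')"
        else
            echo "$resolver: no NS records returned"
        fi
    done
}
```

Called as ns_from_resolvers nanjones.com 10.0.1.1 8.8.8.8 1.1.1.1, it would print one line per resolver, which makes it easy to see whether they disagree.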
I've set my laptop to use the ISP DNS server behind pfSense; dig resolves it correctly but the page still times out. If I use a VPN, everything works flawlessly.
So I'm officially stumped. Here are a few screenshots of configs to help.
Unbound access list is empty.
Any help is greatly appreciated!
-
@JonesTech said in Strange DNS Issue:
I'm having a strange DNS issue that just popped up out of nowhere on my family's network, which I manage. My mother has her own WordPress site and for years has had no issue accessing it. A couple of days ago it became unreachable, as did the host. I'm using Unbound in python mode with pfBlockerng. (I've been through all the pfblocker logs to ensure it didn't get put in a spam list.) dig returns an A record but no nameservers. Chrome times out when attempting to browse to the domain.
A web site ... so bookmark this test : https://www.zonemaster.net/en/run-test and correct any issues. This is a typical 'ok' result.
These are web admin, host and DNS related. None of it concerns pfSense at this stage.
From then on :
@JonesTech said in Strange DNS Issue:
The ISP DNS servers provide the nameservers and the webpage loads.
You don't need these anymore. The ISP DNS is a thing of the past.
Your ISP has a resolver. That's the DNS tool needed to tap into the DNS resources of the Internet.
You use pfSense now, you have your own tool. (Your ISP probably uses it also - or the other one, called named or BIND)
So, go back to default, uncheck :
Actually : on this page : System > General Setup, the only thing that you had to change was :
where you might call pfSense like .. pfSense, so even that one can stay as it is.
Don't change anything else on that page, and you have a working system.
Ok, true, some ISPs are special and do (very !) 'strange' things, forcing you to make use of these options.
The default settings Netgate has chosen should work "out of the box" in most cases.
@JonesTech said in Strange DNS Issue:
Interestingly, when I tell dig to use 8.8.8.8 or 1.1.1.1, they don't provide the nameservers either.
Both 8.8.8.8 and 1.1.1.1 are also resolvers.
@JonesTech said in Strange DNS Issue:
At one point, Unbound was taking seconds, returning SERVFAIL, or timing out when attempting to resolve the domain, until I enabled forwarding mode.
Forwarding to who ?
@JonesTech said in Strange DNS Issue:
;; communications error to 98.142.103.195#53: timed out
;; communications error to 98.142.103.196#53: timed out
So it can't reach "98.142.103.196", or "98.142.103.196" doesn't want to answer for "whatever reason".
You know who this is :
[25.07-RC][root@pfSense.bhf.tld]/root: host 98.142.103.195
195.103.142.98.in-addr.arpa domain name pointer ns1.fistbumpmedia.com.
ns1.fistbumpmedia.com and ns2.fistbumpmedia.com are the domain name servers (probably from your host, the company from which you rent the domain name) - these two servers are the ones that tell the world (the Internet, actually) that "nanj*n*s.com" has the IPv4 :
[25.07-RC][root@pfSense.bhf.tld]/root: dig nanj*n*s.com +short
98.142.103.194
If ns1.fistbumpmedia.com and ns2.fistbumpmedia.com don't answer, for whatever reason, the request :
What is the IPv4 of "nanj*n*s.com" ?
then yeah, you can't reach your web site.
That said, I use pfSense (Plus, but that shouldn't matter) - the nearly released version 25.07 (shouldn't matter either) - and a 'dig' worked just fine :
[25.07-RC][root@pfSense.bhf.tld]/root: dig nanj*n*s.com +trace +nodnssec
; <<>> DiG 9.20.6 <<>> nanj*n*s.com +trace +nodnssec
;; global options: +cmd
. 55893 IN NS i.root-servers.net.
. 55893 IN NS m.root-servers.net.
. 55893 IN NS e.root-servers.net.
. 55893 IN NS g.root-servers.net.
. 55893 IN NS c.root-servers.net.
. 55893 IN NS a.root-servers.net.
. 55893 IN NS l.root-servers.net.
. 55893 IN NS j.root-servers.net.
. 55893 IN NS f.root-servers.net.
. 55893 IN NS h.root-servers.net.
. 55893 IN NS d.root-servers.net.
. 55893 IN NS b.root-servers.net.
. 55893 IN NS k.root-servers.net.
;; Received 239 bytes from 127.0.0.1#53(127.0.0.1) in 2 ms
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
;; Received 865 bytes from 192.36.148.17#53(i.root-servers.net) in 314 ms
nanj*n*s.com. 172800 IN NS ns1.fistbumpmedia.com.
nanj*n*s.com. 172800 IN NS ns2.fistbumpmedia.com.
;; Received 123 bytes from 192.31.80.30#53(d.gtld-servers.net) in 25 ms
nanj*n*s.com. 14400 IN A 98.142.103.194
;; Received 57 bytes from 98.142.103.196#53(ns2.fistbumpmedia.com) in 125 ms
My dig tells me : one of the DNS root servers (x.root-servers.net) was picked, i.root-servers.net, and it gave the list of dot com TLD servers (x.gtld-servers.net). From those, d.gtld-servers.net was picked (as the fastest for me). When d.gtld-servers.net was asked who knows about "nanj*n*s.com", it gave both (minimum 2 !) "ns1.fistbumpmedia.com" and "ns2.fistbumpmedia.com". Finally, ns2.fistbumpmedia.com answered with the A record.
Btw : I use the +nodnssec option as your domain name isn't DNSSEC protected. This gives a cleaner 'dig' result.
nanj*n*s.com, probably a multi-host 'shared' web server, is slow to reply - but be aware that I visit it from France.
Maybe a red flag : I see
... UDP setup with 2001:500:856e::30#53(2001:500:856e::30) for nanj*n*s.com failed: network unreachable.
which means IPv6 is used, and it fails. So no DNS replies.
Great, why not, but is your local (pfSense) IPv6 fully working ?
If there is one world wide mess going on right now, then it is this one : most ISPs on planet earth somehow don't implement IPv6 the correct way.
If they don't totally break it, or simply don't support it.
Or, next best : their IPv6 setup is special, not common.
I saw you've instructed the resolver (unbound) not to use IPv6
do-ip6: no
With +trace, the dig command bypasses the resolver and contacts the needed DNS servers itself. So it can use IPv6, as it is available on your pfSense ?
You can tell dig to use IPv4 only by adding the option
dig -4 ........
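As a rough sketch of that check, the trace can be rerun over IPv4 only and the "network unreachable" lines counted. Both function names below are invented for this example; -4, +trace and +nodnssec are the dig options already used in this thread.

```shell
# Hypothetical wrapper: run the same trace as before, restricted to IPv4.
# -4 tells dig to use IPv4 transport only; +nodnssec trims the DNSSEC
# records from the output.
trace_v4() {
    dig -4 +trace +nodnssec "$1"
}

# Hypothetical helper: count the "network unreachable" lines in dig output
# read from stdin. grep -c prints 0 but exits 1 when nothing matches, so
# "|| true" keeps the exit status clean for that normal case.
count_unreachable() {
    grep -c 'failed: network unreachable' || true
}
```

Running trace_v4 nanjones.com | count_unreachable and comparing the count with the one from a plain +trace run would show whether those errors are purely an IPv6 matter.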
This means you use pfBlockerng in unbound mode ?
Not really related, but switch pfBlockerng DNSBL to use "Python mode" as it is 'way better'.
And, as someone has to say it at least once a day on this forum :
If you have to use :
then you might as well make life easier on yourself by disabling (remove the check) :
Or get back to what Netgate decided for you (they are probably network experts, so you can trust them) and use the defaults :
Enable DNSSEC
Disable forwarding.
Also : to rule out any local (pfSense) issues : disable pfBlockerng while testing.
Only add things when everything works fine.
If something starts to break, you know what it is : the change you just made
-
@Gertjan Thanks for the reply.
I don't use the ISP DNS servers at all. They were only used when I bypassed pfSense entirely, and I thought it was interesting to note that the website worked while everything was bypassed. Just like while using a VPN, everything works.
I should have noted that I've been using pfSense since the 1.x.x days, so I'm fairly versed on a lot of things. This issue was just stumping me.
I use pfBlockerng DNSBL in python mode, sorry I should have clarified.
I initially didn't have DNS Query Forwarding Mode on and haven't for years. Having it off resulted in SERVFAILs or connection timeouts on digging the domain.
I didn't mention that I tried with pfBlockerng disabled and still had the same result.
Thanks again for the reply.
edit: while trying new things: I'm unable to ping the ip of the domain and the nameservers. While utilizing my VPN, I'm able to do both. So something in pfSense is blocking the connection.
edit 2: utilizing tcpdump -lnettti pflog0 and monitoring while doing a curl shows me:
00:00:01.315024 rule 77/0(match) [ridentifier 1770009693]: pass in on bce1: 10.0.1.68.35294 > 98.142.103.194.80: Flags [S], seq 781394860, win 64240, options [mss 1460,sackOK,TS val 3127155263 ecr 0,nop,wscale 7], length 0
Rule 77 corresponds to pfBlockerng's whitelist rules, where I put her domain to be safe. That's a match to pass so.......nothing blocking it.
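When a lot of traffic is scrolling past, the pflog lines can be boiled down to just the rule number, action, direction and interface. The filter below is a sketch written against the one log line shown above; the function name pflog_summary is made up, and other pflog line shapes may not match the pattern.

```shell
# Hypothetical helper: reduce pflog lines (as printed by tcpdump on pflog0)
# to "rule <number>: <action> <direction> on <interface>".
# Lines that do not look like the sample above are silently dropped.
pflog_summary() {
    sed -nE 's#.*rule ([0-9]+)/.*: (pass|block) (in|out) on ([^:]+):.*#rule \1: \2 \3 on \4#p'
}
```

Piping the tcpdump output through pflog_summary would turn the line above into "rule 77: pass in on bce1". Note that a pass entry only shows the outgoing SYN was let through by the firewall; it says nothing about whether a reply ever comes back.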
-
@JonesTech said in Strange DNS Issue:
I use pfBlockerng DNSBL in python mode, sorry I should have clarified.
Then you can remove :
as you've probably added it yourself way (way !) in the past.
Right now, that line will read/include:
-rw-r--r-- 1 root unbound 2293 Jan 22 2025 /var/unbound/pfb_dnsbl_lighty.conf
and that will error out at best, as that file isn't a DNSBL file, but the pfBlockerng DNSBL lighttpd (web server) config file.
@JonesTech said in Strange DNS Issue:
using pfSense since the 1.x.x days, so I'm fairly versed on a lot of things.
Welcome to the club. Ex M0n0wall user myself
You're probably aware that it still happens every day : you're missing the obvious, what is right in front of you all the time ...
@JonesTech said in Strange DNS Issue:
Having it off resulted in SERVFAILs or connection timeouts on digging the domain
pfSense, with its WAN, should have (in theory) a constant upstream connection.
So, the resolver, unbound, should be able to 'poll' all the needed servers all the time.
That said, it can do so only if it's really running all the time.
If the resolver is down (about to restart again) then during that time : SERVFAIL.
So the question is : is it running all the time ?
Answer :
cat /var/log/resolver.log | grep 'start'
In a perfect world, nothing shows up.
In our world, reality :
The admin was saving some setting ?
A network plug LAN WAN etc was going down and/or up ?
pfBlocker was set to anti social mode (updates IP and DNSBL every hour !)
and the dreaded, decade-long issue : you are using ISC DHCP and selected "register every lease into the DNS" - and xx new leases and renewals are happening every hour, so unbound gets restarted at chain gun rate ?
Again : during a restart unbound / the resolver is temporarily unavailable. That means : "SERVFAIL".
One of your many pfSense admin missions : stop this from happening.
Or : know that when it happens it has to happen and live with it ^^
At least :
Update pfBlockerng once a week (lists are most often not updated very often anyway).
Never ever make pfSense interfaces go down : connect every device (upstream ISP, downstream LAN ports) to switches and power these with the same power source as pfSense : a UPS.
The ISC DHCP lease issue : upgrade to 2.8.1 or 25.07-RC (for now), use kea and done.
-
@Gertjan said in Strange DNS Issue:
Then you can remove :
Actually, with Safesearch enabled in pfBlockerng, it drops pfb_dnsbl.safesearch.conf in /var/unbound, so that line is necessary.
Welcome to the club. Ex M0n0wall user myself
You're probably aware that it still happens every day : you're missing the obvious, what is right in front of you all the time ...
It has to be right in front of me but all I see is pitch black! lol
pfSense, with its WAN, should have (in theory) a constant upstream connection.
So, the resolver, unbound, should be able to 'poll' all the needed servers all the time.
That said, it can do so only if it's really running all the time.
If the resolver is down (about to restart again) then during that time : SERVFAIL.
So the question is : is it running all the time ?
Answer :
cat /var/log/resolver.log | grep 'start'
Zero restarts. I had a similar thought process and had checked before. That was recently, though, and I haven't seen any more timeouts during a dig.
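For anyone repeating that check, the same grep can report a count instead of the matching lines. This is a minimal sketch: the function name count_unbound_starts is hypothetical, and the default path is simply the one from the command quoted above.

```shell
# Hypothetical helper: count the lines mentioning "start" in the resolver log.
# Usage: count_unbound_starts [LOGFILE]
count_unbound_starts() {
    log=${1:-/var/log/resolver.log}
    # grep -c prints 0 but exits 1 when nothing matches; "|| true" keeps
    # the function's exit status clean for that normal case.
    grep -c 'start' "$log" || true
}
```

A result of 0 corresponds to the "perfect world" case; a steadily growing number between two runs would point at repeated resolver restarts.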
Update pfBlockerng once a week (lists are most often not updated very often anyway).
Never ever make pfSense interfaces go down : connect every device (upstream ISP, downstream LAN ports) to switches and power these with the same power source as pfSense : a UPS.
The ISC DHCP lease issue : upgrade to 2.8.1 or 25.07-RC (for now), use kea and done.
Updates are every few days and I've been on kea since it debuted.
Any other thoughts on how to troubleshoot this? I'm pulling my hair out lol. Thanks again for the reply!
-
If you're using forwarding mode then disable DNSSEC.
-
@JonesTech said in Strange DNS Issue:
;; communications error to 98.142.103.195#53: timed out
;; communications error to 98.142.103.196#53: timed out
Those are the 2 name servers for that domain - if you can not talk to them, then yeah, you're going to have a hard time getting an IP for a record they are serving.
nanjones.com. 86400 IN NS ns2.fistbumpmedia.com.
nanjones.com. 86400 IN NS ns1.fistbumpmedia.com.
Could be a peering problem your ISP is currently having. But yeah, if you are resolving and can not talk to the owning NS for a domain, you're not going to be able to resolve anything from them.
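One way to test exactly that is to skip every resolver and ask the owning name servers directly. The loop below is only a sketch: check_auth_ns is a made-up name, and the addresses to pass in would be the two from the timeout messages above.

```shell
# Hypothetical helper: query each authoritative name server directly,
# bypassing any resolver, and report whether it answered.
# Usage: check_auth_ns DOMAIN NAMESERVER...
check_auth_ns() {
    domain=$1
    shift
    for ns in "$@"; do
        # -4 forces IPv4, +norecurse asks the server only for what it
        # knows itself, +time/+tries bound the wait per server.
        if answer=$(dig -4 +norecurse +short +time=3 +tries=1 "@$ns" "$domain" A) \
            && [ -n "$answer" ]; then
            echo "$ns: answered"
        else
            echo "$ns: no answer"
        fi
    done
}
```

Running check_auth_ns nanjones.com 98.142.103.195 98.142.103.196 from behind pfSense and again over the VPN would show whether the name servers are unreachable only on one path, which is what a peering problem would look like.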
-
@johnpoz said in Strange DNS Issue:
Could be a peering problem your ISP is currently having. But yeah, if you are resolving and can not talk to the owning NS for a domain, you're not going to be able to resolve anything from them.
I came to the same conclusion, as it's now miraculously working! I knew I had dotted all my i's and crossed my t's, and coming up with nothing on my end led me to believe it was something upstream.
Thanks to everyone that chimed in!