Strange issue - not sure how to fix
-
Spoke too soon - feedly.com resolved in firefox (still using the firefox internal cloudfare DNS lookup) but not safari at the same time - i.e. it could resolve when 1.1.1.1 was used but not unbound. As soon as pfBlocker NG finished its update, firefox could not resolve feedly.com any longer.
-
Tried another packet capture while browsing to feedly.com, pinging feedly.com, and performing an nslookup via the command prompt. During all of that, feedly.com never appeared in the UDP port 53 packet capture, even though other domains did appear as expected (and resolved without issue) - including websites that I have never accessed in the past. The problem does seem to be limited to feedly. I remain mystified.
-
@pfguy2018 said in Strange issue - not sure how to fix:
notice that this is at exactly the same time that pfBlocker NG is updating itself/cron job.
Dude that restarts unbound! And yes that will cause you problems with resolving. As to not seeing anything for feedly in your sniff - because once its cached you don't have to go lookup it up again..
How freaking often do you have pfblocker updating?
Why don't you turn off that firefox nonsense?
Here lets do this - how long has unbound been up?
[2.4.4-RELEASE][admin@sg4860.local.lan]/root: unbound-control -c /var/unbound/unbound.conf status version: 1.9.1 verbosity: 1 threads: 4 modules: 2 [ validator iterator ] uptime: 555956 seconds options: control(ssl) unbound (pid 83160) is running... [2.4.4-RELEASE][admin@sg4860.local.lan]/root:
-
Here is another data point. The issue I noted above where I was able to resolve feedly.com while pfBlockerNG was reloading its lists, but was then unable to resolve it once the cron job finished made me look harder at pfBlockerNG. I found an alias in the firewall rules called pfBlockerNGSuppress, which contained feedly.com as one of the addresses. As soon as I removed feedly from that alias and reloaded, feedly.com resolved immediately. Still not sure how this all fits together, but it seems to be working now, and I will monitor to see if it fails again or whether this has solved the problem.
PfBlocker is set to update hourly, which is the default setting that it came with. Would you suggest a lesser frequency?
I have disabled Firefox's use of cloudfare's DNS servers, as you suggested. Out of interest, is your distaste for the use of these servers based on security concerns, or privacy issues, or some other factor?
EDIT:
Here is the output of that command. Note that I had just reloaded the firewall rules when I deleted the entry in the Suppress list as noted above.
version: 1.9.1 verbosity: 3 threads: 4 modules: 2 [ validator iterator ] uptime: 7515 seconds options: control(ssl) unbound (pid 71227) is running...
-
Its all three - I don't want firefox using anything different for dns than what I tell it too! Period.. That that think its ok to do this is just utter madness.. And they are not fooling anyone with there - our users are too freaking stupid, and we are doing it for their own good nonsense..
I have had TRR disabled well before they turned it on.. There is a long running thread about it here..
From that your unbound has only been up for bit over 2 hours... Every time unbound restarts you loose all caching.. So no you shouldn't be restarting it.. Sure and the F not every hour..
I only uses pfblocker for aliases... I don't let it block anything on its own..
edit: Here is the TRR thread.. Been running since 2018
https://forum.netgate.com/topic/133679/heads-up-be-aware-of-trusted-recursive-resolver-trr-in-firefoxYou understand if firefox is bypassing your dns - none of your filtering is of any use at all..
-
Here is what I don't get though:
-
Feedly.com WAS able to be resolved (in non-firefox browsers and portable devices) ONLY when pfBlockerNG was running its cron job (i.e. pfBlockerNG was "down" at the time). Since the only DNS resolver on my network (other than for firefox at that time) is Unbound, does this not suggest that pfBlockerNG was blocking the DNS resolution requests?
-
Why is it that removing the feedly entry from the pfBlockerNGSuppress alias fixed this problem?
Yes, I understand that firefox was bypassing the DNS (not anymore since I disabled that setting though, following your suggestion). I will check out the thread you linked to.
-
-
If you have questions about pfblocker doing dns blocking.. Your best is to ask in that section. I don't use it for that, I only use it for geoip alias lists that I use in my own rules..
-
Fair enough. I appreciate your diligent help over the last couple of days with this vexing issue. Along the way, I have learned a lot thanks to you. Much appreciated. I will continue to monitor this to see if the problem recurs, and will post again if it does. Thanks again.
-
No problem - glad finally figured out the issue.
-
@johnpoz said in Strange issue - not sure how to fix:
No problem - glad finally figured out the issue.
... hopefully! I am keeping my fingers crossed.
-
Looks like I spoke too soon. Problem recurred. And the output of the dig feedly.com + trace command is showing that the root servers are not being reached.
; <<>> DiG 9.12.2-P1 <<>> feedly.com +trace ;; global options: +cmd . 86400 IN NS b.root-servers.net. . 86400 IN NS a.root-servers.net. . 86400 IN NS e.root-servers.net. . 86400 IN NS m.root-servers.net. . 86400 IN NS l.root-servers.net. . 86400 IN NS j.root-servers.net. . 86400 IN NS g.root-servers.net. . 86400 IN NS c.root-servers.net. . 86400 IN NS h.root-servers.net. . 86400 IN NS d.root-servers.net. . 86400 IN NS k.root-servers.net. . 86400 IN NS f.root-servers.net. . 86400 IN NS i.root-servers.net. . 86400 IN RRSIG NS 8 0 518400 20200307170000 20200223160000 33853 . GN9hZh6mOFruU2IWiP4EIvALgU6uQLlXo748wScmwsJYCcmPiPFT6y2q NnsJfg06OrI2qhZueL0NNtcZ5W9hGLFff3nzUcOETUnEWcbW4MwIRWDx VQ4MVMmsnIhWM3BCQdA5hG0eIALwJ+9q3aUe+lHhORN98lpYxfs+tx73 A+GgmNZUm4Coz44hmhJ6G+mM0mYsMLZ1oAvDH/exgo/VExwEA9P3xyRQ b5H09yJdc0cdmygbD8R1L/yjyQUlnyKLOC8ZQ3bpei9NKRXWqv5p29cn pwt4AiaAuZNkCVQA9SIWIKdFVrBh40NsO+RDpEcmh84r30wTVm+qYGT4 PItLag== ;; Received 525 bytes from 127.0.0.1#53(127.0.0.1) in 4256 ms ;; connection timed out; no servers could be reached
But I am able to connect to Feedly because there is a cached entry in Unbound - at least until the cache expires. Back to being mystified.
-
Any chance this NAT rule could be causing the problem?
Source any/any
Destination (non-local DNS resolver)/port 53
Redirect to 127.0.0.1:53Would this be blocking access to the root servers?
-
No your just redirecting say dns to 8.8.8.8 from a client to pfsense..
What unbound restarting when you tested that? Did you sniff while you did that, did you see anything go out? Did you see answers? Look to see how long unbound was running after that..
You sure your not just restarting unbound all the time.. Do you have it registering dhcp?
-
Then I am stumped. I really don't understand what is blocking access to the root servers.
No unbound was not restarting when I posted the last snippet last night.
Here is the output for unbound now:version: 1.9.1 verbosity: 1 threads: 4 modules: 2 [ validator iterator ] uptime: 33792 seconds options: control(ssl) unbound (pid 79706) is running...
(I rebooted my pfsense last night after my previous post - it has been running since then.)
Yes, Unbound is registering DHCP. Could that be the issue here? I have always had it set up like that (for many years) with none of these problems.
My suspicion is that the answer here lies somewhere with pfBlockerNG and its updates and the effect on DNS resolver. But I am not sure exactly where, or how to troubleshoot or resolve it beyond disabling pfBlocker. For now, I have reduced the frequency of pfBlocker NG updates/cron jobs to once daily, which will make it easier for me to figure out if the failure of DNS resolution coincides with that process.
-
Removing dhcp leases from Unbound will prevent it restarting anywhere near as often. If that stops this issue happening it will be a clue at least.
You can hit a critical point when the size of the lists pfBlocker is putting into Unbound makes it slow to start and the number of clients creating dhcp leases ends up running together and bad things happen. Though usually that a lot. Of both.Steve
-
@stephenw10 said in Strange issue - not sure how to fix:
Removing dhcp leases from Unbound will prevent it restarting anywhere near as often. If that stops this issue happening it will be a clue at least.
You can hit a critical point when the size of the lists pfBlocker is putting into Unbound makes it slow to start and the number of clients creating dhcp leases ends up running together and bad things happen. Though usually that a lot. Of both.Steve
I have not observed frequent Unbound restarts, so I am not sure whether that is part of the issue. (I do have DHCP leases in Unbound, but these leases have been there for years and have not previously caused any issues for me).
As I noted earlier, I reduced the frequency of pfBlockerNG updates to once daily. The first such update occurred at midnight last night, and this AM, I was in fact able to access feedly.com (good). However, when I checked the log for pfBlockerNG, every single feed had the following message:
Downloading update [ 02/25/20 00:26:26 ] . cURL Error: 7 Retry in 5 seconds... . cURL Error: 7 Retry in 5 seconds... . cURL Error: 7 Retry in 5 seconds... .. unknown http status code | 0
i.e. no feeds could be updated. I know for a fact that after I reboot pfSense, the pfBlockerNG list updates work, because I have previously checked the log at that time to ensure that the lists updated. Is this evidence that pfSense can't reach the root servers to resolve DNS? And if so, does this get me any closer to identifying the issue in my case? I should point out that I don't have a ton of lists active in pfBlocker, and my pfSense has plenty of RAM available.
output of dig feedly.com +trace command right now:
; <<>> DiG 9.12.2-P1 <<>> feedly.com +trace ;; global options: +cmd . 86361 IN NS d.root-servers.net. . 86361 IN NS c.root-servers.net. . 86361 IN NS i.root-servers.net. . 86361 IN NS l.root-servers.net. . 86361 IN NS b.root-servers.net. . 86361 IN NS g.root-servers.net. . 86361 IN NS e.root-servers.net. . 86361 IN NS m.root-servers.net. . 86361 IN NS f.root-servers.net. . 86361 IN NS h.root-servers.net. . 86361 IN NS a.root-servers.net. . 86361 IN NS k.root-servers.net. . 86361 IN NS j.root-servers.net. . 86361 IN RRSIG NS 8 0 518400 20200309050000 20200225040000 33853 . H9pXeT3s7yemEb6BFdL+eF38hP6nN2fY+S5UD+yM/06AdAlAVOo2wEBW oPAeplPIb1Zb7qNunqUQIHn+FDZqt79A43Poa3OevXxsQERuBcat6IkX v4H6/b3pGsA40KH1GfcbAnz8QlBhHrGIoFlgrQnvltIt7k9pCqA+D3iq YjMz/3dTlePWDQggN3Rao/nn6ZLO+4LF+FM52XTkCgXFPU5Ska6qzm5G 2gDOI7dpyk5DhHj2uc6IrsZWjwgG897Ba1G1fKZxUt8A208O58m4tbs1 ccrHmkfeVdQOwWOBOPCsnYQLtY8BVGL35U6VkGRdqU4U6jeGEFD0A2Pe QfIDkg== ;; Received 525 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms ;; connection timed out; no servers could be reached
(feedly.com is reachable from my network - DNS resolver must be serving up a cached address).
Unbound has been up for a while (33+ hours, i.e. since my last reboot):
version: 1.9.1 verbosity: 1 threads: 4 modules: 2 [ validator iterator ] uptime: 119718 seconds options: control(ssl) unbound (pid 79706) is running...
Other data:
The following name servers are used for lookup of . ;rrset 85994 13 1 11 5 . 85994 IN NS d.root-servers.net. . 85994 IN NS c.root-servers.net. . 85994 IN NS i.root-servers.net. . 85994 IN NS l.root-servers.net. . 85994 IN NS b.root-servers.net. . 85994 IN NS g.root-servers.net. . 85994 IN NS e.root-servers.net. . 85994 IN NS m.root-servers.net. . 85994 IN NS f.root-servers.net. . 85994 IN NS h.root-servers.net. . 85994 IN NS a.root-servers.net. . 85994 IN NS k.root-servers.net. . 85994 IN NS j.root-servers.net. . 85994 IN RRSIG NS 8 0 518400 20200309050000 20200225040000 33853 . H9pXeT3s7yemEb6BFdL+eF38hP6nN2fY+S5UD+yM/06AdAlAVOo2wEBWoPAeplPIb1Zb7qNunqUQIHn+FDZqt79A43Poa3OevXxsQERuBcat6IkXv4H6/b3pGsA40KH1GfcbAnz8QlBhHrGIoFlgrQnvltIt7k9pCqA+D3iqYjMz/3dTlePWDQggN3Rao/nn6ZLO+4LF+FM52XTkCgXFPU5Ska6qzm5G2gDOI7dpyk5DhHj2uc6IrsZWjwgG897Ba1G1fKZxUt8A208O58m4tbs1ccrHmkfeVdQOwWOBOPCsnYQLtY8BVGL35U6VkGRdqU4U6jeGEFD0A2PeQfIDkg== ;{id = 33853} ;rrset 86035 1 0 8 3 j.root-servers.net. 86035 IN A 192.58.128.30 ;rrset 86035 1 0 8 3 j.root-servers.net. 86035 IN AAAA 2001:503:c27::2:30 ;rrset 86035 1 0 8 3 k.root-servers.net. 86035 IN A 193.0.14.129 ;rrset 86035 1 0 8 3 k.root-servers.net. 86035 IN AAAA 2001:7fd::1 ;rrset 86035 1 0 8 3 a.root-servers.net. 86035 IN A 198.41.0.4 ;rrset 86035 1 0 8 3 a.root-servers.net. 86035 IN AAAA 2001:503:ba3e::2:30 ;rrset 86035 1 0 8 3 h.root-servers.net. 86035 IN A 198.97.190.53 ;rrset 86035 1 0 8 3 h.root-servers.net. 86035 IN AAAA 2001:500:1::53 ;rrset 86034 1 0 8 3 f.root-servers.net. 86034 IN A 192.5.5.241 ;rrset 86035 1 0 8 3 f.root-servers.net. 86035 IN AAAA 2001:500:2f::f ;rrset 86034 1 0 8 3 m.root-servers.net. 86034 IN A 202.12.27.33 ;rrset 86034 1 0 8 3 m.root-servers.net. 86034 IN AAAA 2001:dc3::35 ;rrset 86034 1 0 8 3 e.root-servers.net. 86034 IN A 192.203.230.10 ;rrset 86034 1 0 8 3 e.root-servers.net. 86034 IN AAAA 2001:500:a8::e ;rrset 86034 1 0 8 3 g.root-servers.net. 86034 IN A 192.112.36.4 ;rrset 86034 1 0 8 3 g.root-servers.net. 86034 IN AAAA 2001:500:12::d0d ;rrset 86034 1 0 8 3 b.root-servers.net. 86034 IN A 199.9.14.201 ;rrset 86034 1 0 8 3 b.root-servers.net. 86034 IN AAAA 2001:500:200::b ;rrset 86034 1 0 8 3 l.root-servers.net. 86034 IN A 199.7.83.42 ;rrset 86034 1 0 8 3 l.root-servers.net. 86034 IN AAAA 2001:500:9f::42 ;rrset 86034 1 0 8 3 i.root-servers.net. 86034 IN A 192.36.148.17 ;rrset 86034 1 0 8 3 i.root-servers.net. 86034 IN AAAA 2001:7fe::53 ;rrset 86034 1 0 8 3 c.root-servers.net. 86034 IN A 192.33.4.12 ;rrset 86034 1 0 8 3 c.root-servers.net. 86034 IN AAAA 2001:500:2::c ;rrset 86034 1 0 8 3 d.root-servers.net. 86034 IN A 199.7.91.13 ;rrset 86034 1 0 8 3 d.root-servers.net. 86034 IN AAAA 2001:500:2d::d Delegation with 13 names, of which 0 can be examined to query further addresses. It provides 26 IP addresses. 2001:500:2d::d not in infra cache. 199.7.91.13 rto 315 msec, ttl 494, ping 3 var 78 rtt 315, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:2::c not in infra cache. 192.33.4.12 rto 199 msec, ttl 494, ping 11 var 47 rtt 199, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:7fe::53 not in infra cache. 192.36.148.17 rto 344 msec, ttl 494, ping 44 var 75 rtt 344, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:9f::42 not in infra cache. 199.7.83.42 rto 195 msec, ttl 494, ping 11 var 46 rtt 195, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:200::b not in infra cache. 199.9.14.201 rto 308 msec, ttl 112, ping 36 var 68 rtt 308, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:12::d0d not in infra cache. 192.112.36.4 not in infra cache. 2001:500:a8::e not in infra cache. 192.203.230.10 rto 195 msec, ttl 494, ping 3 var 48 rtt 195, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:dc3::35 not in infra cache. 202.12.27.33 rto 360 msec, ttl 494, ping 8 var 88 rtt 360, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:2f::f not in infra cache. 192.5.5.241 rto 229 msec, ttl 494, ping 1 var 57 rtt 229, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:500:1::53 not in infra cache. 198.97.190.53 rto 204 msec, ttl 494, ping 12 var 48 rtt 204, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:503:ba3e::2:30 not in infra cache. 198.41.0.4 rto 145 msec, ttl 494, ping 13 var 33 rtt 145, tA 0, tAAAA 0, tother 0, EDNS 0 probed. 2001:7fd::1 not in infra cache. 193.0.14.129 not in infra cache. 2001:503:c27::2:30 not in infra cache. 192.58.128.30 rto 163 msec, ttl 494, ping 15 var 37 rtt 163, tA 0, tAAAA 0, tother 0, EDNS 0 probed.
-
What I posted above would seem to confirm that the root servers are reachable by my system. So I am not sure why the dig feedly.com +trace command yields the output I posted above. But feedly is still reachable via cached DNS:
; <<>> DiG 9.12.2-P1 <<>> feedly.com ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18657 ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;feedly.com. IN A ;; ANSWER SECTION: feedly.com. 0 IN A 104.20.59.241 feedly.com. 0 IN A 104.20.60.241 ;; Query time: 0 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Tue Feb 25 07:24:23 EST 2020 ;; MSG SIZE rcvd: 71
-
@johnpoz said in Strange issue - not sure how to fix:
dig @h.root-servers.net com NS
This was just now.
dig: couldn't get address for 'h.root-servers.net': not found
; <<>> DiG 9.12.2-P1 <<>> @198.97.190.53 com ns ; (1 server found) ;; global options: +cmd ;; connection timed out; no servers could be reached
I am really stumped by all this.
-
Because unbound not running.. Why don't you turn off pfblocker for a bit and see if you continue to have issues.
-
@johnpoz said in Strange issue - not sure how to fix:
Because unbound not running.. Why don't you turn off pfblocker for a bit and see if you continue to have issues.
My previous post showed that unbound IS running. That's what makes this so perplexing.
I am at work now, but when I return home, I am going to give your suggestion of disabling pfBlocker a shot to see what I can discover.