Unbound queries to root server via VPN being refused but work when via WAN
This morning, I awoke to find that DNS queries weren't getting resolved. After going through the usual (check ISP, check VPN, reboot pfsense, etc), I was able to determine that DNS queries that are going through ExpressVPN aren't working. Changing the outbound interface on resolver from VPN to WAN "fixed" it.
Digging deeper, I captured DNS packets while unbound was restarting. When the outbound interface was set to WAN, all of the queries had responses and all was well. However, when I switched the outbound interface to VPN, all of the queries came back with a "Refused NS" response.
I contacted ExpressVPN to let them know, but they pointed me towards this forum. I looked through the recent posts to see if I could find anything similar that anyone else was experiencing, but none popped out.
So...is anyone else running into this issue? Any thoughts on what I can do at my end, besides my WAN "work-around"?
Btw, I wasn't certain if I should post this under DHCP and DNS or OpenVPN, so please let me know if I should move it over to the other category...
Here is some additional info, in case anyone is interested. I still haven't figured out the underlying issue, but I have put this back on ExpressVPN's lap to figure out what's changed in the last few days.
When comparing the working WAN queries to the refused VPN queries, I noticed that the link-layer frame type shows Ethernet (1) - along with src & dest MAC addresses - when on WAN, but shows NULL/Loopback (15) when on VPN. This makes sense, as VPNs try to mask the originator info. As such, I suspect that the root servers, in an effort to improve security and prevent various attacks, have started refusing to reply to non-Ethernet link-layer requests. Unfortunately, I was unable to find any info on this through various searches or forums.
In the meantime, I've switched unbound to forward DNS queries to a public DNS server via VPN. It's not ideal, but I felt it was a better option than running my queries in the open over WAN. I'm open to alternatives and/or feedback on my choice. And, yes, I know that DNS requests only provide a limited set of info to my ISP, but still...
Is anyone running unbound in standard mode through a VPN and not seeing this issue? If so, could you tell me which VPN service you are using?
here is what i do
- create an alias for the devices you want to travel over the tunnel.
- under Firewall - Rules - LAN, add the alias here but change the gateway to your VPN tunnel.
- under Services - DHCP Server, build static mappings for each device, then under DNS servers add only ExpressVPN's DNS servers.
Thanks for your suggestion, but I may not have explained my issue clearly.
I do not have any problems talking/using various public DNS servers (e.g. google, opendns, cloudflare, etc) through ExpressVPN. I verified this by changing the resolver to "forwarding" mode, thus utilizing the DNS servers from the "General Setup" page.
The problem is specific to when the resolver starts up and talks to the 13 root servers to build out its map. If I set the resolver's outbound interface to WAN, no problems. However, if I set it to VPN, then the root servers return "REFUSED" codes and will not service my DNS queries.
Since my last update, I have verified that this is an ExpressVPN-specific issue. I received a trial license for AirVPN, which I set up on pfsense. Then, after changing the resolver's outbound interface to AirVPN and running in standard (non-forwarding) mode, everything works. I have provided this feedback to ExpressVPN as well, so we'll see if they can resolve (no pun intended) the issue...
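For anyone who wants to reproduce this outside of unbound, here is a minimal Python sketch that sends the same kind of "NS for ." query to a root server from a chosen local address and reports the response code. The function names are made up for illustration, 198.41.0.4 is a.root-servers.net, and binding to the VPN interface's address is only an assumption about how to steer the query out of the tunnel; routing on your box may differ.

```python
import socket
import struct

# Response codes from the low 4 bits of the DNS header flags field
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_root_ns_query(txid: int, edns: bool = True) -> bytes:
    """Build a non-recursive 'NS .' query, optionally with an EDNS0 OPT record."""
    arcount = 1 if edns else 0
    # Header: ID, flags (all zero = standard query, no recursion), QD/AN/NS/AR counts
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, arcount)
    # Question: root name (single zero byte), QTYPE=NS (2), QCLASS=IN (1)
    question = b"\x00" + struct.pack("!HH", 2, 1)
    opt = b""
    if edns:
        # OPT RR: root name, TYPE=41, CLASS=UDP payload size, TTL=0, RDLEN=0
        opt = b"\x00" + struct.pack("!HHIH", 41, 4096, 0, 0)
    return header + question + opt


def rcode_name(message: bytes) -> str:
    """Return the response code name from a raw DNS message."""
    flags = struct.unpack("!H", message[2:4])[0]
    rcode = flags & 0x000F
    return RCODES.get(rcode, f"RCODE{rcode}")


def query_root(source_ip: str, server: str = "198.41.0.4",
               timeout: float = 3.0) -> str:
    """Send the query over UDP from source_ip; return the rcode name or 'TIMEOUT'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((source_ip, 0))  # assumption: this address belongs to the VPN interface
        sock.sendto(build_root_ns_query(0xBE66), (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
    return rcode_name(reply)
```

Calling `query_root("10.141.0.138")` with the tunnel address from your own capture should distinguish an active REFUSED from a silent drop.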
After tinkering with the setup more, it appears the issue has to do with specific VPN IP addresses. It appears that the DNS root servers are also using geo-restrictions...
It all comes down to your security and diversity needs. You don't want to run unbound in resolver mode on your WAN: why? Do you suspect your ISP of reading all the DNS queries that go out to the multitude of different addresses involved in resolving (root zones, SOA servers, etc.)? If not, why is it more secure to throw this into a VPN tunnel to a provider that you also have to trust not to pull any BS or throw a wrench in the works, and then resolve DNS from there (in cleartext)?
Also, with forwarding (securely for that matter, via TLS or HTTPS for example), the same is true for the provider operating the DNS forwarder. So it all comes down to: who do you trust not to f*** you up in routing your traffic ;) As for myself, I don't put much trust in the various VPN companies, as there's been a lot of shady business in that category. Your ISP, on the other hand, has more to lose from the payment/customer perspective than some cheap VPN provider, and also has to route so much traffic that a generic "log all" is very improbable (and often enough simply illegal). Central forwarding to one of the big ones (google, quad9, cloudflare) bears the risk of being dependent on their (SPoF) servers/cluster.
So it's up to you to decide who earns your trust :)
Thanks for your reply and you are right!
As I was in a bit of a "panic", I was just trying to set things back up to work as they did when first deployed ~2y ago. I hadn't had a chance to update my DNS setup since then, but after looking at it some more, I realize I need to rethink my DNS setup in regards to who I "trust" and, also, to determine whether it continues to make sense to run unbound as a resolver via WAN or as a forwarder to my ISP, VPN, or one of the big guys.
Time to sharpen the pencil...
It appears that the DNS root servers are also using geo-restrictions...
Nonsense.. While they might block an IP or range of IPs that were actively attacking them... They would not block some IP just because it was from country xyz..
Fair enough - perhaps "geo-restricting" was an inaccurate description.
I only meant to say that the root servers appear to be restricting responses to certain IP addresses belonging to my VPN provider. This was somewhat unexpected as it had been working properly for ~2y, thus it took some time to track down the issue. Even changing my VPN "location" did not appear to (initially) resolve the problem, hence my "geo-restriction" comment. However, after much trial and error, I was able to find a VPN "location" that worked.
Unfortunately, the location is not an ideal egress point to the internet, so I've been reworking my DNS strategy to take advantage of alternate upstream DNS servers instead of going directly to the root servers.
IP addresses belonging to my VPN provider.
Do you blame them ;) heheheh I would block all these nonsense vpn providers as well ;)
But more likely it's not them blocking but just a shitty vpn provider.. They are prob blocking you from using your own dns because they want you to use their dns.. How else would they make a profit on selling that info ;)
So when you query one of the roots directly - you get a REFUSED back?
example here is a direct query to a root for ns for .com
$ dig @220.127.116.11 com. ns

; <<>> DiG 9.14.3 <<>> @18.104.22.168 com. ns
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55319
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;com.                     IN      NS

;; AUTHORITY SECTION:
com.                      172800  IN  NS    a.gtld-servers.net.
com.                      172800  IN  NS    b.gtld-servers.net.
com.                      172800  IN  NS    c.gtld-servers.net.
com.                      172800  IN  NS    d.gtld-servers.net.
com.                      172800  IN  NS    e.gtld-servers.net.
com.                      172800  IN  NS    f.gtld-servers.net.
com.                      172800  IN  NS    g.gtld-servers.net.
com.                      172800  IN  NS    h.gtld-servers.net.
com.                      172800  IN  NS    i.gtld-servers.net.
com.                      172800  IN  NS    j.gtld-servers.net.
com.                      172800  IN  NS    k.gtld-servers.net.
com.                      172800  IN  NS    l.gtld-servers.net.
com.                      172800  IN  NS    m.gtld-servers.net.

;; ADDITIONAL SECTION:
a.gtld-servers.net.       172800  IN  A     22.214.171.124
b.gtld-servers.net.       172800  IN  A     126.96.36.199
c.gtld-servers.net.       172800  IN  A     188.8.131.52
d.gtld-servers.net.       172800  IN  A     184.108.40.206
e.gtld-servers.net.       172800  IN  A     220.127.116.11
f.gtld-servers.net.       172800  IN  A     18.104.22.168
g.gtld-servers.net.       172800  IN  A     22.214.171.124
h.gtld-servers.net.       172800  IN  A     126.96.36.199
i.gtld-servers.net.       172800  IN  A     188.8.131.52
j.gtld-servers.net.       172800  IN  A     184.108.40.206
k.gtld-servers.net.       172800  IN  A     220.127.116.11
l.gtld-servers.net.       172800  IN  A     18.104.22.168
m.gtld-servers.net.       172800  IN  A     22.214.171.124
a.gtld-servers.net.       172800  IN  AAAA  2001:503:a83e::2:30
b.gtld-servers.net.       172800  IN  AAAA  2001:503:231d::2:30
c.gtld-servers.net.       172800  IN  AAAA  2001:503:83eb::30
d.gtld-servers.net.       172800  IN  AAAA  2001:500:856e::30
e.gtld-servers.net.       172800  IN  AAAA  2001:502:1ca1::30
f.gtld-servers.net.       172800  IN  AAAA  2001:503:d414::30
g.gtld-servers.net.       172800  IN  AAAA  2001:503:eea3::30
h.gtld-servers.net.       172800  IN  AAAA  2001:502:8cc::30
i.gtld-servers.net.       172800  IN  AAAA  2001:503:39c1::30
j.gtld-servers.net.       172800  IN  AAAA  2001:502:7094::30
k.gtld-servers.net.       172800  IN  AAAA  2001:503:d2d::30
l.gtld-servers.net.       172800  IN  AAAA  2001:500:d937::30
m.gtld-servers.net.       172800  IN  AAAA  2001:501:b1f9::30

;; Query time: 15 msec
;; SERVER: 126.96.36.199#53(188.8.131.52)
;; WHEN: Wed Jul 24 21:18:23 Central Daylight Time 2019
;; MSG SIZE rcvd: 828
What happens when you do that, do you get just a time out, or do you actively get sent back REFUSED?
If it just times out, did you try via TCP?
They are prob blocking you from using your own dns because they want you to use their dns
Actually, in all cases, I used the same VPN provider, but connecting to different servers (locations) produces different results. For example, when I started unbound while connected to one of my VPN provider's servers (location A), I captured the following packets:
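A rough Python sketch of the TCP variant, in case it helps with that test. DNS over TCP carries the same message as UDP but prefixes it with a two-byte length. The helper names and the transaction ID here are arbitrary, and the server address is left for the caller to supply.

```python
import socket
import struct

# "NS ." query without EDNS: 12-byte header (ID 0x1234, QDCOUNT=1) + 5-byte question
ROOT_NS_QUERY = struct.pack("!HHHHHH", 0x1234, 0, 1, 0, 0, 0) + b"\x00\x00\x02\x00\x01"


def tcp_frame(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes, since a TCP recv() may return fewer."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply arrived")
        data += chunk
    return data


def query_over_tcp(server: str, message: bytes = ROOT_NS_QUERY,
                   timeout: float = 3.0) -> bytes:
    """Send one framed query to server:53 over TCP and return the raw reply message."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(tcp_frame(message))
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)
```

The low four bits of bytes 2-3 of the returned message hold the response code (5 means REFUSED), so a reply over TCP that is still refused would rule out a UDP-only filter.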
1 0.000000 10.141.0.138 184.108.40.206 DNS 60 Standard query 0xbe66 NS <Root> OPT
2 0.064303 220.127.116.11 10.141.0.138 DNS 49 Standard query response 0xbe66 Refused NS <Root>
3 0.064380 10.141.0.138 18.104.22.168 DNS 60 Standard query 0x9976 NS <Root> OPT
4 0.133864 22.214.171.124 10.141.0.138 DNS 49 Standard query response 0x9976 Refused NS <Root>
and so on, from all 13 root servers (multiple times).
But if I change the VPN connection to a different server (location B), then unbound starts properly with the following:
1 0.000000 10.70.0.118 126.96.36.199 DNS 70 Standard query 0x9176 NS <Root> OPT
2 0.069589 188.8.131.52 10.70.0.118 DNS 1139 Standard query response 0x9176 NS <Root> NS i.root-servers.net NS h.root-servers.net NS c.root-servers.net NS e.root-servers.net ... OPT
I truncated the response for brevity, but all of the info is returned, including A and AAAA records. The complete response is also received from the other root servers as well, so unbound comes up and is happy! Note, it took a bit of trial/error to find which of my VPN's servers (hence IPs) worked and which ones didn't. Needless to say, I couldn't pinpoint it by country, site, etc. It was definitely a mixed bag!
Thus, my conclusion has been that queries from certain IP addresses are being refused by the root servers, but I'm open to other interpretations...
Standard query 0xbe66 NS <Root> OPT
What kind of query is that?? that is not a valid domain.. So yeah it would be refused...
here is a query.
What did you actually ask them for?? Ah - ok, I'm getting tired ;) you're just asking them for . -- yeah that should respond.
edit: I'd be curious why the ones that were refused are only 60 in length, which is pretty small, but the one that got an answer was 70?
Yup...note that these came straight from packet captures (promiscuous mode, filtered by host address) from pfsense on the VPN interface during a restart of unbound. I just pulled my capture files during my troubleshooting and copy/pasted them to my note.
Been doing this a really long time, never seen roots REFUSE a query.. The IPs you're coming from must be "bad"
In regards to the length:
I just realized the 70 length request is not one that went through the VPN, but through a proxy on the WAN. The standard length includes the frame type and MAC address info, which the VPNs strip before sending on (which also explains how they "hide" you), resulting in the 60 length.
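As a sanity check on those numbers, here is a small Python sketch that adds up the header sizes. It assumes a 14-byte Ethernet II header on the WAN capture, a 4-byte NULL/Loopback header on the VPN capture, an IPv4 header without options, and a query consisting of the DNS header, the root question, and an empty OPT record. None of those sizes are read from the capture files themselves.

```python
ETHERNET_HEADER = 14  # dst MAC (6) + src MAC (6) + EtherType (2)
NULL_HEADER = 4       # NULL/Loopback capture header: 4-byte protocol family
IPV4_HEADER = 20      # assuming no IP options
UDP_HEADER = 8
DNS_HEADER = 12
ROOT_QUESTION = 5     # root name (1) + QTYPE (2) + QCLASS (2)
OPT_RECORD = 11       # root name (1) + TYPE (2) + CLASS (2) + TTL (4) + RDLEN (2)


def frame_length(link_header: int, dns_payload: int) -> int:
    """Total captured frame length for a DNS message carried over UDP/IPv4."""
    return link_header + IPV4_HEADER + UDP_HEADER + dns_payload


query = DNS_HEADER + ROOT_QUESTION + OPT_RECORD  # "NS <Root> OPT" request
refusal = DNS_HEADER + ROOT_QUESTION             # reply echoing the question, no OPT

print(frame_length(ETHERNET_HEADER, query))  # 70
print(frame_length(NULL_HEADER, query))      # 60
print(frame_length(NULL_HEADER, refusal))    # 49
```

Under those assumptions the 10-byte gap between 60 and 70 is exactly the difference between the two link-layer headers, and the 49-byte refusals match a bare header plus question, so the DNS payload itself looks identical in both captures.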
I have a capture that goes through the VPN and is 60 length and working, but I'll need to dig it out. The net result is the same - unbound comes up.
And, yes, it doesn't surprise me that some IPs are being marked as "bad", even by the root servers. As VPNs use the same IP for multiple clients, it's likely that some of their IPs have been used for nefarious means, resulting in their being blocked, refused, etc.