DNS Resolver Timeouts
-
I have two pfsense boxes in two different places with the same ISP. Both sites are connected by VPN.
- Site A has a static public IP subscription (WAN assigned a public IP). Clients use pihole as a DNS server and pihole forwards to the pfsense DNS resolver.
- Site B is behind a CGNAT (WAN assigned a private IP). It has the same pihole setup as site A.
Site A's DNS resolver (no forwarding) works perfectly. Very seldom do I see timeouts in Status -> DNS Resolver
Site B's DNS resolver, however, is experiencing tons of timeouts, but these happen randomly within a day. You can immediately notice this when Internet access is halted for all devices in the network. In the past (a few months ago), using unbound and forwarding to, say, 8.8.8.8 or 1.1.1.1 solved the problem, but now even with forwarding enabled I get the same timeout issue. When doing a packet capture on the WAN interface, I see packets from the WAN IP to DNS servers but don't see replies. For example:
https://www.dropbox.com/s/pxkgn3usii38qrn/packetcapture%281%29.cap?dl=0 -> pcap when unbound is set as a resolver. Try filtering on the IP address 205.251.194.154 and it should show no replies.
https://www.dropbox.com/s/ile9lc2wmfjvmhi/packetcapture%282%29.cap?dl=0 -> pcap when unbound is set to forward to 1.1.1.1
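The "queries leave the WAN but no replies come back" check can also be done offline against a saved capture. What follows is only a minimal sketch, assuming the file is a classic pcap (not pcapng) with Ethernet framing and untagged IPv4/UDP traffic. The function names are made up for illustration and are not part of pfSense or tcpdump.

```python
import struct


def _parse_dns_udp(frame):
    """Return (src, dst, sport, dport, txid, is_response) or None.

    Assumption: plain Ethernet + IPv4 + UDP, no VLAN tags, no fragments.
    """
    if len(frame) < 14 + 20 + 8 + 12:
        return None
    if frame[12:14] != b"\x08\x00":  # EtherType must be IPv4
        return None
    if frame[14] >> 4 != 4 or frame[23] != 17:  # IP version 4, protocol UDP
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    src = ".".join(str(b) for b in frame[26:30])
    dst = ".".join(str(b) for b in frame[30:34])
    udp = 14 + ihl
    if len(frame) < udp + 8 + 12:
        return None
    sport, dport = struct.unpack("!HH", frame[udp:udp + 4])
    if 53 not in (sport, dport):
        return None
    # DNS header starts right after the 8-byte UDP header.
    txid, flags = struct.unpack("!HH", frame[udp + 8:udp + 12])
    return src, dst, sport, dport, txid, bool(flags & 0x8000)


def unanswered_dns_queries(pcap_bytes):
    """List (server_ip, txid) for every UDP/53 query that never got a reply."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    pending = {}  # (client, client_port, server, txid) -> True
    offset = 24   # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        parsed = _parse_dns_udp(frame)
        if parsed is None:
            continue
        src, dst, sport, dport, txid, is_response = parsed
        if not is_response and dport == 53:
            pending[(src, sport, dst, txid)] = True
        elif is_response and sport == 53:
            # A reply mirrors the query's addresses and ports.
            pending.pop((dst, dport, src, txid), None)
    return [(server, txid) for (_c, _p, server, txid) in pending]
```

Reading one of the downloaded captures with `open(path, "rb").read()` and passing the bytes to `unanswered_dns_queries` should list the servers that never answered. Retransmitted queries reuse or change transaction IDs depending on the resolver, so treat the count as a rough indicator only.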
This has been happening for almost a year now and my ISP's solution is for me to subscribe to a static IP plan, which does not make sense! I decided to create a new thread but this is basically a continuation of my thread here.
When I forward to the DNS server in site A (on the other end of the tunnel), I don't see these timeouts but DNS resolution sometimes fails. I'm guessing it's because DNS through a tunnel is not that reliable?
Just for the heck of it, I tried using dnsmasq overnight and it seems to be better. Although I cannot really confirm because there are no stats for dnsmasq unlike what you have in unbound in Status -> DNS Resolver. It's just that dnsmasq seems to have not acted up during my testing. Could just be a coincidence and that I need to wait more to conclude more accurately, I don't know.
At this point, what is my next best step? I'm really tired of talking to my ISP's customer support agents as they are pretty useless. Their network team does not even want to talk directly to customers, it's insane. I still want to continue troubleshooting this to get a more conclusive idea of what's happening. Any tips please?
Thanks.
-
@kevindd992002 said in DNS Resolver Timeouts:
and pihole forwards to the pfsense DNS resolver.
Almost the same setup everywhere (in our sys) and there is no problem.
WITHOUT Pi-hole! +++edit: of course!
I know everyone is stoning me now, but why don't we forget this Pi-hole stuff (on a serious NGFW) anymore... = pfBlocker.......tatatta
(I think it's also run on a crappy container and/or on a Raspberry Pi)
@kevindd992002 "I'm guessing it's because DNS through a tunnel is not that reliable?"
so this surprised me and why?
+++edit2:
Seriously, Pi-hole also has a place, but pfSense offers it all in one, pls. take advantage
-
@daddygo Let's not get into the pihole vs. pfblocker debate, please. Pihole is not the issue here. It has been serving me well for the past few months. I have no complaints.
As for your surprise regarding DNS through a tunnel not being reliable, I don't know the answer to that. It's exactly why I said "I'm guessing": what I experience is that even though there are no timeouts when using the DNS through the tunnel interface, I do get DNS resolution errors when browsing the Internet. Refreshing the page multiple times usually solves the issue.
On the same note, I've been running dnsmasq in pfsense now for a couple of hours without any issues. So I'm not sure what the difference is between dnsmasq and unbound with forwarding. Why am I seeing a difference in performance when both do forwarding? My ultimate goal here is to use unbound as a resolver but I'm having major timeout problems when doing that.
-
@kevindd992002 said in DNS Resolver Timeouts:
Let's not get into the pihole vs. pfblocker debate,
OK
Believe me there should be no problem with DNS through the tunnel, if it were, many of us would sweat here...
Have you ever tried to analyze the transmission parameters of your tunnel?
and yes I am definitely a fan of using Unbound and everything else that is already provided in one environment (NGFW), because the parts are coordinated with each other
-
@daddygo said in DNS Resolver Timeouts:
@kevindd992002 said in DNS Resolver Timeouts:
Let's not get into the pihole vs. pfblocker debate,
OK
Believe me there should be no problem with DNS through the tunnel, if it were, many of us would sweat here...
Have you ever tried to analyze the transmission parameters of your tunnel?
and yes I am definitely a fan of using Unbound and everything else that is already provided in one environment (NGFW), because the parts are coordinated with each other
Right, that's what I thought too. There shouldn't be any issues with the tunnel for whatever type of traffic.
What are the "transmission parameters"? Sorry, I'm not an expert with VPN but I know the basics. How do you propose I start with my analysis?
Yes, I'm also a fan of unbound. I want to maintain my own DNS Resolver and I don't want to rely on forwarding client DNS queries to other DNS servers, but I really don't have a choice at this point because of this very issue.
I still don't understand why dnsmasq seems to be working perfectly fine though. I'm not experiencing any timeouts with it so far (a few days into the testing now).
-
@kevindd992002 said in DNS Resolver Timeouts:
How do you propose I start with my analysis?
I don't know how far apart the endpoints are and how many ISPs are in this ...
but I would also take short- and longer-term measurements (on tunnel network):
short: iperf3
https://docs.netgate.com/pfsense/en/latest/packages/iperf.html
for long-term discipline I use this, free for 5 endpoints:
https://emcosoftware.com/ping-monitor
(I set it up and let it run for hours)
although I still suspect it will be a different issue than the tunnel itself
DNS:
this pairing is best in your case
https://github.com/synackray/dns-load-generator
https://www.wireshark.org/
-
@daddygo said in DNS Resolver Timeouts:
@kevindd992002 said in DNS Resolver Timeouts:
How do you propose I start with my analysis?
I don't know how far apart the endpoints are and how many ISPs are in this ...
but I would also take short- and longer-term measurements (on tunnel network):
short: iperf3
https://docs.netgate.com/pfsense/en/latest/packages/iperf.html
for long-term discipline I use this, free for 5 endpoints:
https://emcosoftware.com/ping-monitor
(I set it up and let it run for hours)
although I still suspect it will be a different issue than the tunnel itself
DNS:
this pairing is best in your case
https://github.com/synackray/dns-load-generator
https://www.wireshark.org/
Yeah, I don't have issues with using iperf3 across the tunnel. Pfsense also has gateway monitoring for the IPsec tunnel (routed VTI) and long term ping monitoring seems stable.
Using the DNS server across the tunnel is just my last resort. My real goal here is for each site to use their own DNS resolvers. Do you have any ideas why I'm getting a lot of timeouts when DNS resolver is enabled in site B? Can an ISP block DNS requests to external DNS servers if it's a resolver (as opposed to a forwarder)?
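One way to put numbers on the resolver-versus-forwarder question is to send the same names straight from a site B host to an authoritative server (as a resolver would) and to a public recursive server (as a forwarder would), then compare how often each goes unanswered. This is a rough sketch using only the Python standard library. The helper names `build_query`, `udp_exchange` and `timeout_rate` are invented for this example, and a higher loss rate on one path would only be a hint, not proof of ISP filtering.

```python
import socket
import struct


def build_query(name, txid, recursion_desired):
    """Build a minimal DNS query for an A record (no EDNS)."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit only
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def udp_exchange(server, payload, timeout=2.0):
    """Send one UDP datagram to server:53; return the reply or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (server, 53))
        try:
            reply, _addr = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return reply


def timeout_rate(server, names, recursion_desired, exchange=udp_exchange):
    """Fraction of queries with no reply carrying the matching transaction ID."""
    lost = 0
    for txid, name in enumerate(names, start=1):
        reply = exchange(server, build_query(name, txid, recursion_desired))
        if reply is None or reply[:2] != struct.pack("!H", txid):
            lost += 1
    return lost / len(names)
```

For example, `timeout_rate("198.41.0.4", names, False)` queries a root server the way a resolver would (it answers with a referral), while `timeout_rate("1.1.1.1", names, True)` mimics a forwarder. Running both in a loop over several hours at site B, and once at site A for comparison, would show whether one path is consistently worse.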
-
@kevindd992002 said in DNS Resolver Timeouts:
Do you have any ideas why I'm getting a lot of timeouts when DNS resolver is enabled in site B?
So, honestly not, because there is little info - here Wireshark can help solve this
@kevindd992002 "Can an ISP block DNS requests to external DNS servers if it's a resolver (as opposed to a forwarder)?"
it would be very annoying and unfair (from ISP, but we have already seen a crow on a stick)
a packet capture also shows this
-
@daddygo said in DNS Resolver Timeouts:
@kevindd992002 said in DNS Resolver Timeouts:
Do you have any ideas why I'm getting a lot of timeouts when DNS resolver is enabled in site B?
So, honestly not, because there is little info - here Wireshark can help solve this
@kevindd992002 "Can an ISP block DNS requests to external DNS servers if it's a resolver (as opposed to a forwarder)?"
it would be very annoying and unfair (from ISP, but we have already seen a crow on a stick)
a packet capture also shows this
What can Wireshark provide that the packet captures (from pfsense's tcpdump) I already provided don't? If you check the OP again, you'll see that I have packet captures there.
-
Might not be related.
But when I had unbound "DNS issues",
I had "Register DHCP leases" ticked in unbound.
That made unbound restart every time a DHCP event happened, and made my system unusable.
Untick that DHCP registration if it is set.
/Bingo
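To check whether restarts are actually frequent, the resolver log can be scanned for Unbound's "start of service" message. The snippet below is just a sketch: it assumes syslog-style lines beginning with a timestamp like `Dec 20 10:15:01`, and the log location (often `/var/log/resolver.log` on recent pfSense versions) is an assumption to verify on your own box.

```python
import re
from collections import Counter

# Unbound logs "info: start of service (unbound <version>)." each time it starts.
START_OF_SERVICE = re.compile(r"info: start of service \(unbound")


def restarts_per_hour(log_text):
    """Count Unbound starts grouped by the 'Mon DD HH' prefix of each log line.

    Assumption: lines begin with a classic syslog timestamp ("Dec 20 10:15:01").
    """
    counts = Counter()
    for line in log_text.splitlines():
        if START_OF_SERVICE.search(line):
            counts[line[:9]] += 1  # e.g. "Dec 20 10"
    return counts
```

Many starts per hour would support the DHCP-registration theory; a handful per day would point elsewhere.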
-
@bingo600 Yeah, tried that already, didn't make a difference. In Site A, where unbound is working perfectly, I have that checked and the service restart is so fast that it is barely noticeable. These are both home sites and it's not like the leases of my few DHCP clients are always expiring.
-
@kevindd992002 said in DNS Resolver Timeouts:
What can Wireshark provide that the packet capture
you can see an online and / or real-time scan on the Wireshark screen - when you launch an action
do you pass this on site B?
(for this installation, for example, the DNS goes through a tunnel)
translate this, of course, into your example (GW A site)
-
@daddygo said in DNS Resolver Timeouts:
@kevindd992002 said in DNS Resolver Timeouts:
What can Wireshark provide that the packet capture
you can see an online and / or real-time scan on the Wireshark screen - when you launch an action
do you pass this on page B?
(for this installation, for example, the DNS goes through a tunnel)
translate this, of course, into your example (GW A site)
Sorry, what? What do you mean by "page B"?
Let's forget about the tunnel for now. Like I said, that is my last resort/workaround. Let's treat site B as an independent site without an S2S VPN. My goal here is to simply use unbound on site B as a resolver (not forwarder) without any issues.
-
@kevindd992002 said in DNS Resolver Timeouts:
Let's treat site B as an independent site without an S2S VPN
Okay then we misunderstand each other...
can you draw a quick diagram of what you want to achieve?
A site pfSense A
B site pfSense B
or exactly what
-
@daddygo said in DNS Resolver Timeouts:
@kevindd992002 said in DNS Resolver Timeouts:
Let's treat site B as an independent site without an S2S VPN
Okay then we misunderstand each other...
can you draw a quick diagram of what you want to achieve?
A site pfSense A
B site pfSense B
or exactly what
It looks like it, yes.
So I have two sites that are connected through IPsec VPN, yes, but I just gave that information here because it was one of the tests I had (using the DNS resolver on the far end of the tunnel).
Site A (main site)
- WAN interface has a public static IP
- no problems with being a DNS resolver (without forwarding)
Site B (remote site)
- WAN interface is assigned a private IP since it is behind a CGNAT
- when DNS resolver (without forwarding) is set, tons of timeouts are seen in Status -> DNS Resolver and the whole network is affected, browsing is very intermittent
- when DNS resolver (with forwarding to 1.1.1.1, or 8.8.8.8, or even to the ISP's own DNS servers) is set, same behavior, lots of timeouts. I must say though, that this was my workaround before like a few months ago and it worked. For some reason, it is also timing out these past few days I tested.
- when DNS Forwarder (dnsmasq) is enabled instead, everything is working properly. It's been almost two days without any issues.
- as soon as I go back to using DNS resolver (unbound), then the problem is immediately back
-
@kevindd992002 said in DNS Resolver Timeouts:
It looks like it, yes.
so now I understand: in terms of your question, it has nothing to do with A - B.
in summary:
- The pfSense installation at site B gets DNS timeouts when Unbound is used
- in addition, it is behind CGNAT
Can you do a test with this for both conditions? (Unbound / Forwarder):
https://www.grc.com/dns/benchmark.htm
Finally, can you show your Unbound settings, such as:
-
You got it.
I have to get back to you after Christmas for that benchmark test (which I'm familiar with as I used it before). I'm physically at site A right now and while troubleshooting another issue with IPsec, I accidentally lost access to site B's pfsense and no one is physically there to undo what I did.
As for the settings, they are exactly the same as the unbound settings I have at site A, and here they are:
I don't have a DNS server in the DNS settings under General because I don't need one. I'm using unbound as a "resolver" so it queries the root hints directly. In the settings that you've shown, it looks like you're using unbound as a forwarder too, why?
-
How is this not the same exact problem you had before.. If you have a shit isp, then you have a shit isp..
Your previous thread showed loss on your isp.. If either of these sites is on that isp, or whatever isp they have is losing packets.. Then yes you can have issues, be it dns or anything else.
Doesn't matter if you forward or tunnel or whatever.. If your isp sucks it sucks.. Nothing pfsense can do about that.
Previously you had sniffs showing traffic leaving your wan, with no answer.. There is nothing pfsense can do to fix that..
-
@kevindd992002 said in DNS Resolver Timeouts:
it looks like you're using unbound as a forwarder too, why?
Forwarding Mode to 1.1.1.1 = general tab
as I try to achieve more privacy and greater security
CloudFlare / 853 DoT
-
@daddygo said in DNS Resolver Timeouts:
as I try to achieve more privacy and greater security
Well that sure isn't doing anything about that..