Specific client suddenly cant access the web but can reach LAN clients or gateway
-
Hello,
One of my clients all of a sudden seems no longer capable of reaching the web in a general sense. Trying to reach any website in the web browser errors out immediately with something like:
    Hmm. We're having trouble finding that site. We can't connect to the server at www.youtube.com.
However, services running on that client can reach my other clients or servers, whether on the same subnet or on another local subnet, so I know this client has not totally lost network connectivity.
It can also ping any local machines (from any subnets currently configured in pfsense).
However, when I try to ping or nslookup an external site such as google, it fails:
    ping www.google.ca
    ping: www.google.ca: Name or service not known
    nslookup google.ca
    ;; Got SERVFAIL reply from 127.0.0.53
    Server: 127.0.0.53
    Address: 127.0.0.53#53
    ** server can't find google.ca: SERVFAIL
Obviously I suspected DNS resolution issues, but it seems to resolve local machines just fine (for example, my pfsense machine):
    nslookup net-fwl01.mydomain
    Server: 127.0.0.53
    Address: 127.0.0.53#53
    Non-authoritative answer:
    Name: net-fwl01.mydomain
    Address: 192.168.110.1
    resolvectl status
    Global
    Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
    resolv.conf mode: stub
    Link 2 (enp3s0)
    Current Scopes: DNS
    Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
    Current DNS Server: 192.168.210.1
    DNS Servers: 192.168.210.1
    DNS Domain: mydomain
I cannot (for the life of me) find anything in pfsense's routing or firewall logs that could point to a possible culprit.
This client is a media center type of machine, and runs 24/7 without any auto-updates. Yesterday it was working just fine, but tonight when we tried to watch some online content it wouldn't budge...
That doesn't make any sense to me... How would I go about troubleshooting this? I cannot easily install any tools on it since it can't connect to anything online (even apt-get fails with "Could not resolve .........")
-
@pftdm007 is 192.168.210.1 the correct DNS for that system (seen in the resolvectl status output)? Your firewall is 192.168.110.1 (of course it may be just an interface on another VLAN).
Does DNS lookup work if you specify an external DNS server (if your firewall rules allow it), like
    nslookup google.ca 1.1.1.1
And can you ping an external address, again
    ping -c 3 1.1.1.1
If none of the above works, then you would check the routing table of the client with
    ip route show
Check if the default gateway is what you expect.
-
If "192.168.210.1" is the IP of a pfSense 'LAN' network (the original LAN, an OPTx, or a VLAN), you can inform nslookup to test that IP using a given DNS server:
    nslookup google.com 192.168.210.1
This will force nslookup to use 192.168.210.1.
a) If you get an answer, an IP, you know the issue is local on your device, as pfSense (192.168.210.1) was answering.
b) No answer means pfSense wasn't answering on 192.168.210.1 and you'll have to investigate on the pfSense side.
-
Hello guys, thanks for the replies!
@patient0 said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
is 192.168.210.1 the correct DNS for that system (seen in the resolvectl status output)?
Yes it is, it's the "gateway" of the subnet 192.168.210.0 that client is part of... 192.168.110.1 is the LAN interface of pfsense. I can elaborate on topology if needed!
@patient0 said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
Does DNS lookup work if you specify an external DNS server (if your firewall rules allows it) like nslookup google.ca 1.1.1.1
    nslookup google.ca 1.1.1.1
    ;; Got SERVFAIL reply from 1.1.1.1
    Server: 1.1.1.1
    Address: 1.1.1.1#53
    ** server can't find google.ca: SERVFAIL
@patient0 said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
And can you ping an external address, again ping -c 3 1.1.1.1
    ping -c 3 1.1.1.1
    PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
    64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=3.19 ms
    64 bytes from 1.1.1.1: icmp_seq=2 ttl=59 time=2.66 ms
    64 bytes from 1.1.1.1: icmp_seq=3 ttl=59 time=3.10 ms
    --- 1.1.1.1 ping statistics ---
    3 packets transmitted, 3 received, 0% packet loss, time 2003ms
    rtt min/avg/max/mdev = 2.663/2.985/3.192/0.231 ms
@patient0 said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
ip route show
    ip route show
    default via 192.168.210.1 dev enp3s0 proto dhcp metric 20100
    169.254.0.0/16 dev enp3s0 scope link metric 1000
    192.168.210.0/24 dev enp3s0 proto kernel scope link src 192.168.210.40 metric 100
@Gertjan said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
you can inform nslookup to test that IP using a given DNS server :
    nslookup google.com 192.168.210.1
    ;; Got SERVFAIL reply from 192.168.210.1
    Server: 192.168.210.1
    Address: 192.168.210.1#53
    ** server can't find google.com: SERVFAIL
Please note that I repeated all of the above commands on my desktop PC, which is connected to the same subnet as my faulty client, and got identical results. However, this PC has complete web connectivity.
I conclude the issue is on pfsense's side!? It worries me a lot because these systems (including pfsense) have been untouched for at least a year and always worked just fine... I will gather some settings and data and post back!
-

This part tells me that the command "nslookup" executed on the media device can't contact "1.1.1.1" so it can ask this "1.1.1.1" to resolve "google.ca".
Or, the second part :

tells me that, from your media device, ICMP packets (== 'ping') can reach 1.1.1.1.
So, is pfSense blocking UDP and/or TCP packets with destination port 53 and destination IP 1.1.1.1 that come into the 192.168.210.1/24 interface?
What are the firewall rules on this pfSense 192.168.210.1/24 interface?
Repeat the "nslookup google.ca 1.1.1.1" command, but before hitting enter, set up a packet capture on pfSense:

where, instead of "LAN", you choose your 192.168.210.1/24 network/interface.
Beware: a lot of traffic will show up.
It's probably wise to enter the IP of your media device as a filter criterion so only DNS traffic from this device will be captured. You can do that here:

-
@pftdm007 said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
Got SERVFAIL reply from 1.1.1.1
SERVFAIL is not an "I cannot talk to 1.1.1.1" error, which is what would happen if a firewall rule were blocking DNS. That screams he is doing interception of DNS, and wherever he redirects it to had a failure and returned the error to the client.
Since he got the same error when talking directly to what is supposed to be his DNS, i.e. the pfsense IP on that interface, that screams he has some issue with unbound.
As to this
However, this PC has complete web connectivity.
That screams to me the browser is using doh for its dns.
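The difference between a SERVFAIL reply and no reply at all can be checked without nslookup. Below is a minimal Python sketch, assuming only the standard library is available on the client; the helper names (`build_query`, `rcode_of`, `probe`) are made up for illustration and are not part of any tool mentioned in this thread. It sends one plain UDP query on port 53 and reports either the response code the server returned or that nothing came back.

```python
import socket
import struct

# Names for the DNS response codes most relevant here (RFC 1035).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_of(reply: bytes) -> str:
    """Return the response code name stored in the low 4 bits of the flags."""
    flags = struct.unpack("!H", reply[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def probe(server: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to server:53 and describe what came back, if anything."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            # Silence is what a dropped or blocked query looks like.
            return "no reply (timed out)"
        except OSError as exc:
            return f"no reply ({exc})"
    # Any rcode, including SERVFAIL, means something answered the query.
    return f"reply received, rcode={rcode_of(reply)}"
```

Calling `probe("192.168.210.1", "google.ca")` and `probe("1.1.1.1", "google.ca")` from the affected client would show whether each address stays silent or answers with an error code. A reply that appears to come from 1.1.1.1 does not prove that 1.1.1.1 itself produced it; a redirect rule on the firewall can answer on its behalf.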
-
I confirm the issue is on pfsense and not on the "faulty" client, because I moved a laptop to this subnet and bingo, it has the same issues. I also noticed the same issue on other subnets, so it is definitely a larger issue than I thought.
Basically only my desktop PC continues functioning properly because it is included in a firewall alias which has a pass ALL rule on its subnet (see screenshot below). Why did this issue start yesterday when all was fine for over a year? Who knows...

Anyways...
@Gertjan said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
Beware: a lot of traffic will show up.
Packet capture from pfsense: Pastebin
@johnpoz said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
That screams he is doing interception of dns, and where he redirects it to had a failure, and returned to the client the error.
OK time for some pfsense details:
Subnet DHCP:

System DNS:

DNS Forwarder: DISABLED
DNS Resolver:

Firewall rules on the subnet where the "faulty" client (and my desktop PC) are:

NAT rules:

@johnpoz said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
That screams to me the browser is using doh for its dns.
See above, my desktop PC has a "special" place (i.e. bypasses everything in FW).
-
@johnpoz said in Specific client suddenly cant access the web but can reach LAN clients or gateway:
Screams he has some issue with unbound.
In the DNS resolver logs, I see a lot of the following, not sure if related or not to my issues:
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:1] notice: ssl handshake failed 8.8.8.8 port 853
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:2] notice: ssl handshake failed 8.8.8.8 port 853
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:1] notice: ssl handshake failed 1.1.1.1 port 853
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:2] notice: ssl handshake failed 1.1.1.1 port 853
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:1] notice: ssl handshake failed 1.1.1.1 port 853
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:2] notice: ssl handshake failed 1.1.1.1 port 853
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:2] notice: ssl handshake failed 8.8.8.8 port 853
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:1] notice: ssl handshake failed 8.8.8.8 port 853
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
    Nov 14 13:33:47 unbound 45122 [45122:2] error: ssl handshake cert error: hostname mismatch
    Nov 14 13:33:47 unbound 45122 [45122:2] notice: ssl handshake failed 8.8.8.8 port 853
    Nov 14 13:33:47 unbound 45122 [45122:1] error: ssl handshake failed crypto error:0A000086:SSL routines::certificate verify failed
Not sure if this will help, but last year when I configured pfsense my intention was to "intercept" DNS requests (both on 53 and 853) from the clients and redirect them to their gateway IP (subnet IP) so pfsense can handle the DNS requests via the DNS servers specified in the general setup page...
-
@pftdm007 you are trying to do DoT forwarding. That looks like a problem creating your DoT connections with Google and Cloudflare.
Did you put the hostnames in when you set up the forwarding?
If it were me, I would tear that down and just let unbound resolve normally - once that is working again, you could redo your DoT forwarding if you want.
But you can not intercept DoT traffic (853). The whole point of DoT and DoH is validating that you're talking to who you think you're talking to - so you would have to forge your cert to use their names, etc. And clients don't normally use DoT (853), they use DoH (443). Name servers are normally the things that use DoT - to forward to other name servers.
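The hostname validation described above is the same check behind the "hostname mismatch" lines in the unbound log. A minimal Python sketch of a DoT-style TLS handshake follows, assuming a machine that is allowed to reach port 853; `dot_context` and `dot_handshake` are hypothetical helper names, and `cloudflare-dns.com` / `dns.google` are the names those providers commonly document for 1.1.1.1 and 8.8.8.8, so verify them before relying on them.

```python
import socket
import ssl


def dot_context() -> ssl.SSLContext:
    """TLS context with the standard-library defaults.

    create_default_context() requires a valid certificate chain and checks
    that the certificate matches the hostname the caller expects.
    """
    return ssl.create_default_context()


def dot_handshake(ip: str, hostname: str, timeout: float = 5.0) -> str:
    """Connect to ip:853 and validate the certificate against hostname."""
    ctx = dot_context()
    try:
        with socket.create_connection((ip, 853), timeout=timeout) as raw:
            # server_hostname is the name the certificate must be valid for.
            with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
                return f"handshake ok ({tls.version()})"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain is untrusted or the name does not match.
        return f"certificate rejected: {exc.verify_message}"
    except OSError as exc:
        return f"connection failed: {exc}"
```

For example, `dot_handshake("1.1.1.1", "cloudflare-dns.com")` should succeed on an unfiltered connection, while passing a name the certificate does not cover is rejected. A box that intercepts port 853 and answers with its own certificate fails the same way, which is why redirecting DoT does not work without forging certificates.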
-
@johnpoz OK I removed all the NAT rules and removed all traces of firewall rules that were auto-created.
I manually created a rule allowing traffic (TCP/UDP) from subnet (all ports) to 192.168.210.40:53 (hoping DNS resolution using the gateway's IP will be allowed).
I think traffic to gateway on port 53 works because if I try "nslookup google.com 192.168.210.1" on the "faulty" client I see:

If I try "nslookup google.ca 1.1.1.1" on the "faulty" client I see :

Which makes sense because the FW rule is configured to allow only traffic on port 53 going to the gateway's IP, nothing else (1.1.1.1 or whatever...)

(Gateway_Services_Port alias = 53,123)
Then I removed port 853 from all aliases/rules.
I think this is as simple as it can be, but it still doesn't work.
In the pfsense FW logs I see tons of blocked traffic to 224.0.0.251:5353, but I believe this is multicast DNS, which AFAIK I don't use...
I looked in my config logs: this pfsense machine was last modified 8 months ago. Everything was just fine until yesterday. That's frustrating to say the least...
-
@pftdm007 did you change your unbound to not forward? That is where your major problem is. You're trying to forward to Google and Cloudflare, and it's having an issue trying to do DoT (853) because your hostname is not matching.
You need to setup the hostname here

If you are going to do dot forward.
But I would just turn off forwarding completely - until you get dns working.

-
@johnpoz OK now it's working.
To be honest I am not sure what happened between 2 days ago and yesterday for all of my setup to bug up like that.
If I understand correctly, now with forwarding mode disabled in unbound, things work a bit like this:
- Client gets an IP from pfsense
- Pfsense informs client of the IP to use for DNS resolution (in my case the gateway IP)
- Client wants to resolve something.com, it sends the request to the gateway's IP on port 53
- pfsense gets the request and passes it to unbound
- unbound resolves the domain name using the DNS servers configured in pfsense's general setup page
- pfsense returns the resolution to the client
-
@pftdm007 not quite - if you are not in forwarder mode, unbound resolves what was asked from the roots down. It doesn't hand the query off to another resolver - it resolves vs forwards.
And it's not so much that pfsense passes it to unbound: unbound is listening on 53, and as long as your firewall rules allow it, unbound will get the query directly.
When you resolve, you don't need anything in the general setup at all. If pfsense itself needs to resolve something it will ask itself (unbound) via the loopback address 127.0.0.1.
The only time something like 8.8.8.8 would be used, if you have it in general setup, is if pfsense itself wanted to look up something and unbound wasn't answering, or if you were in forwarding mode, be that either native (just 53) or DoT mode (853, with the connection encrypted via TLS).
Now that you know normal DNS works, you could go back to forwarding if you want. I'm personally not a fan, but sure, if you want to forward, forward. The only thing I would suggest if you forward is to uncheck DNSSEC. It can only be problematic when you forward: where you forward to either does DNSSEC already or it doesn't. If it doesn't, telling unbound to do DNSSEC is just going to cause extra queries, and could cause problems.
Also, forwarding to different services can be problematic as well, especially if they do filtering, since the filtering could be different. You don't really know which one a query will be forwarded to when you have more than one service, so you are not sure which filtering you would get. If you forward, it's best to pick one.
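The resolve-versus-forward distinction above can be illustrated with a toy model. The Python sketch below is not how unbound is implemented: the delegation table, server labels, and the 203.0.113.10 address (a documentation range) are all made-up data, and no network traffic is involved. It only shows that a resolver follows referrals from the root downward, while a forwarder hands the whole question to one upstream and relays whatever comes back.

```python
# Hypothetical delegation data, NOT real DNS records.
DELEGATIONS = {
    "root": {"com.": "tld-com"},
    "tld-com": {"something.com.": "auth-something"},
}
ANSWERS = {
    "auth-something": {"www.something.com.": "203.0.113.10"},
}


def resolve(name):
    """Resolver mode: start at the root and follow referrals to the answer.

    Returns (address, list of servers asked, in order).
    """
    server = "root"
    path = [server]
    while True:
        if name in ANSWERS.get(server, {}):
            return ANSWERS[server][name], path
        # Find a delegation whose zone is a suffix of the name on a label boundary.
        referral = next(
            (nxt for zone, nxt in DELEGATIONS.get(server, {}).items()
             if ("." + name).endswith("." + zone)),
            None,
        )
        if referral is None:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = referral
        path.append(server)


def forward(name, upstream):
    """Forwarder mode: ask a single upstream and relay its answer.

    The forwarder never sees the referral chain; it depends on the upstream.
    """
    address, _ = upstream(name)
    return address, ["upstream"]
```

In this model `resolve("www.something.com.")` asks three servers in turn (root, then the TLD, then the authoritative server), whereas `forward("www.something.com.", resolve)` records a single hop. If that one upstream fails or rejects the connection, as in the DoT handshake errors earlier in the thread, the forwarder has nothing to return.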