Windows Clients cannot access the internet, very strange unexpected DNS problem.
-
I have some OSPF routers/switches with different subnets behind pfSense. I have a static route pointing back to the internal network, and a default route (0.0.0.0 0.0.0.0) on the ASBR pointing to the firewall's LAN interface IP, which I propagated through all OSPF routers/switches.
All clients/routers/switches can access pfSense and vice versa, so routing is definitely working.
Part of the network resembles the diagram I have taken from the Netgate manual. I cannot access the internet from a client that is at least two hops away, but such clients do reach the pfSense GUI.
Wireshark shows the following while I open the web browser: it seems that the client at 10.216.64.29 queries the name server, which is the local DNS server in pfSense at 10.216.64.18, but the DNS response does not come through. Why not? DNS on pfSense should work out of the box, right? It doesn't make any sense.
I have a VDSL modem in transparent (bridged) mode, and under NAT I have allowed the OSPF routes (summary route) to the modem address.
Under Rules I allowed the OSPF routes (summary route) to everything (*).
Normally this should work.
Normally DNS works out of the box, I don't know what else I can double check?
I see under the interface status that all IP addresses are in place, and pinging the provider's external DNS address works from routers and clients. From the output of a DNS lookup of external hostnames on pfSense, I cannot deduce that there is a problem there. Can someone help me?
Thank you,
-
@IrixOS That image is from the static route doc page. If you haven’t created a route on pfSense to send 192.168.2.0/24 backwards to the internal router, it will send the response the only place it knows, its default gateway (WAN).
-
@SteveITS I don't get it, please clarify.
Thank you so much.
-
@IrixOS if 192.168.1.1 doesn’t know where 192.168.2.0/24 is, that destination is unknown to it, so it sends the traffic to its default gateway. Check Diagnostics/Routes and you’ll see there’s no route for 192.168.2.0/24.
“Static routes are used when hosts or networks are reachable through a router other than the default gateway. The firewall knows about the networks directly attached to it, and it reaches all other networks as directed by the routing table. In networks where an internal router connects additional internal subnets, a static route must be defined for those networks to be reachable.”
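The behaviour described in that doc quote can be sketched as a longest-prefix-match lookup. This is only an illustration in Python, not how pfSense implements routing; the table contents, the hop labels, and the internal router address 192.168.1.2 are assumptions made up for the example.

```python
import ipaddress

def lookup(routes, dst):
    """Return the next hop of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routes if addr in net]
    # Longest prefix wins; the default route (/0) matches every address.
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

# Hypothetical table: the directly attached LAN plus a default route via WAN.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "WAN gateway"),
    (ipaddress.ip_network("192.168.1.0/24"), "directly attached (LAN)"),
]

# Without a static route, replies to 192.168.2.x follow the default route.
without_static = lookup(routes, "192.168.2.10")

# Add a static route for the downstream subnet via the internal router
# (192.168.1.2 is an assumed address, not taken from the thread).
routes.append((ipaddress.ip_network("192.168.2.0/24"), "192.168.1.2"))
with_static = lookup(routes, "192.168.2.10")

print(without_static)
print(with_static)
```

A summary route that covers the downstream subnets has the same effect as the single static route here, as long as the destination really falls inside the summary prefix.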
-
I already have a static route and defined gateway, the summary route is the route for all internal networks.
It just doesn't make sense.
-
@SteveITS All routes are correct from client to pfSense and vice versa. I just connected a laptop to the pfSense in a /30 subnet, and DNS works!
Behind the routers it doesn't. I am beginning to think there is a compatibility issue between pfSense and Cisco equipment.
It doesn't make sense,
Please help,
-
@IrixOS if you have downstream networks that want to query pfsense (unbound) the ACLs on unbound have to allow for it.
Out of the box, the unbound ACLs automatically include all locally attached networks. If you have some downstream network you route to, I am not sure that happens.. most likely not.. and that could explain your problem..
edit:
Instead of looking at a sniff, why not just do a query from this Windows box with your favorite DNS tool: nslookup, dig, host, doggo, dnslookup, etc.. If the IP doing the query is not in the ACL... oh wait, that is not a downstream network. You have 64.29 asking 64.18.. I assume those are on the same network..
You're getting back a failure, which I assume is SERVFAIL.. Can you do a directed query from this .29? And do you have some other device on this 10.216.64 that can also do queries, and is it working?
Is all DNS to pfSense failing, say from a 64.30 box or something else on 64.x, or does just this one .29 box get SERVFAIL for www.bing.com?
Your sniff is odd... you show 64.29 talking to 64.18, but your route shows 10.216/16 is downstream.. Is your pfSense IP at .18 on, say, a /30 or something, so that 10.216.64.29 is on a different network than 10.216.64.18? You're going to run into problems with routing like that.
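The question of whether .29 and .18 share a network can be checked with a few lines of Python. This is a minimal sketch that assumes /30 masks on both addresses; the thread has not confirmed the masks at this point.

```python
import ipaddress

# Assumed masks: both the pfSense LAN address and the client sit in a /30.
pfsense_net = ipaddress.ip_interface("10.216.64.18/30").network
client_net = ipaddress.ip_interface("10.216.64.29/30").network

pfsense_hosts = [str(h) for h in pfsense_net.hosts()]
client_hosts = [str(h) for h in client_net.hosts()]
same_network = pfsense_net == client_net

print(pfsense_net, pfsense_hosts)
print(client_net, client_hosts)
print("same network:", same_network)
```

With those masks the two addresses land in different /30 networks, so the client would be a routed (downstream) network from the point of view of pfSense rather than a directly attached one.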
-
@johnpoz Thank you, I appreciate you joining the thread.
To be clear, 64.29 is the client's IP address, part of a local route (L); it sits in a /30 subnet, so 10.216.64.29-10.216.64.30. If you advertise it into OSPF, it gets routed.
Between the OSPF router (ASBR) and pfSense I have a /30 subnet, so 10.216.64.17-10.216.64.18; that's the network that directly connects to pfSense, which has the LAN interface IP of 10.216.64.18. I have a static route configured in the form of a summary route (which includes all possible internal routes) pointing back to the internal OSPF network, and it uses 10.216.64.17 as its downstream gateway address.
On the ASBR (the router that points to pfSense), I have configured a default route, ip route 0.0.0.0 0.0.0.0 10.216.64.18, and propagated it to every downstream OSPF router.
Routing works from the client to the firewall and vice versa; it is just that DNS anomaly, it just doesn't make sense. Actually I am doing nothing special, it's just routing. How can it possibly be that I have been stuck with this for so long?
I put the whole pfSense config in mothballs and started from scratch with a /30 subnet connecting directly to a laptop; then DNS works and I can access the internet. It doesn't make sense.
Either I am stupid, or there is definitely some issue with pfSense. It did work in the past with similar Cisco routing and no DNS issues; I don't remember which version that was.
I survived a lot of network troubleshooting, but this silly horseshit problem cannot be, really come on. It should work out of the box.
I think pfSense doesn't like Cisco equipment and OSPF. Feel free to ask me questions, I hope the above is clear enough,
Waiting for more possible answers,
Actually it is the same principle as in the diagram, advertise all routes and that's what I did.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
10.216.64.29-10.216.64.30.
If unbound is listening on that .17/30, then for any downstream network, be it a /24 or another /30, etc.. you would need to make sure the ACLs allow queries from that downstream network.
I disabled the auto ACLs and set my own... You can see here I just made it easy on myself and created an ACL that allows all of RFC 1918 space... so I could fire up a network on any RFC 1918 range and be able to query unbound from it.
-
@IrixOS
To amplify what @johnpoz is saying: the ACL (access control list) for unbound limits what IP addresses can ask for DNS lookups. Think of it as a type of "firewall" for unbound. pfSense will automatically create ACL entries for unbound to cover all of the directly attached networks. But if you have downstream networks that are not defined directly on pfSense that are attempting to query unbound on pfSense, then you will need to create custom ACL entries to allow that traffic (in addition to pfSense firewall rules that allow those downstream IP addresses past the pfSense firewall).
-
@bmeeks I just filled in the fields of the ACL, already gone through that, I won't budge. My summary route is 10.216.0.0/17, so I filled that in and disabled the auto-rules.
Any other ideas?
Feel free to ask me any question. Thank you,
-
@IrixOS I am not sure if the manual ACLs work if you have auto ACLs set... So it's quite possible that if auto is still enabled and you create your own, it might not be working.. Did you restart unbound after creating the ACLs?
A simple directed query would make it easier to see whether you're getting refused (ACL not allowing) or a SERVFAIL.
$ dig @192.168.9.253 www.bing.com
; <<>> DiG 9.16.48 <<>> @192.168.9.253 www.bing.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55311
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.bing.com. IN A
;; ANSWER SECTION:
www.bing.com. 18539 IN CNAME www-www.bing.com.trafficmanager.net.
www-www.bing.com.trafficmanager.net. 539 IN CNAME www-bing-com.dual-a-0001.a-msedge.net.
www-bing-com.dual-a-0001.a-msedge.net. 539 IN CNAME dual-a-0001.a-msedge.net.
dual-a-0001.a-msedge.net. 539 IN A 13.107.21.200
dual-a-0001.a-msedge.net. 539 IN A 204.79.197.200
;; Query time: 1 msec
;; SERVER: 192.168.9.253#53(192.168.9.253)
;; WHEN: Mon Feb 26 11:52:32 Central Standard Time 2024
;; MSG SIZE rcvd: 184
-
@johnpoz that IP you are mentioning, is that your clients IP?
Auto-rules disabled and added 10.217.0.0/17 (summary) to the ACL,
When doing dig this is the output:
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
@johnpoz that IP you are mentioning, is that your clients IP?
Auto-rules disabled and added 10.217.0.0/17 (summary) to the ACL,
When doing dig this is the output:
Is that 10.217.0.0/17 the entry you added to the ACL? If so, then by my calculations the IP that is querying (10.216.64.29) is not within that subnet. Is 10.216.64.29 one of the Windows clients?
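That subnet arithmetic can be double-checked with Python's `ipaddress` module. The sketch below only models "is the source IP inside any ACL network"; the helper name `allowed` is hypothetical and this is not how unbound itself evaluates its access-control entries.

```python
import ipaddress

def allowed(client, acl_networks):
    """True if the client address falls inside any of the ACL networks."""
    addr = ipaddress.ip_address(client)
    return any(addr in ipaddress.ip_network(n) for n in acl_networks)

entry_as_posted = ipaddress.ip_network("10.217.0.0/17")
first, last = entry_as_posted[0], entry_as_posted[-1]

# The entry as posted versus the 10.216.0.0/17 summary named earlier.
posted_ok = allowed("10.216.64.29", ["10.217.0.0/17"])
summary_ok = allowed("10.216.64.29", ["10.216.0.0/17"])

print(first, "-", last)
print("10.217.0.0/17 allows 10.216.64.29:", posted_ok)
print("10.216.0.0/17 allows 10.216.64.29:", summary_ok)
```

If the ACL really contains 10.217.0.0/17, the client at 10.216.64.29 is outside it; the 10.216.0.0/17 summary mentioned earlier in the thread would cover it.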
-
Yeah 10.217/17 would be 10.217.0.0 - 10.217.127.255, so you are correct 10.216.64.x would not be allowed.
But that error looks like firewall rule with a reject or something, not unbound acl refusing you.. which would look like this..
Here I temp removed 192.168/16 from my ACL, and then did a query..
You're getting an error that you couldn't even talk to 64.29, and from your sniff I thought the DNS server your .29 client was asking was .18..
In that command you're asking 64.29; isn't that your Windows client? So yeah, I would expect it not to answer a DNS query.
-
@johnpoz:
Duh! You are correct. I didn't even notice he appears to have run the DNS query from a pfSense session (if the 10.216.64.29 client is in fact a Windows machine).
@IrixOS:
Now, if the 10.216.64.29 Windows target is a Microsoft AD Controller/DNS server, then you also may have an issue with the Windows firewall on the server. It will automatically drop inbound traffic that is not from the local subnet (if the firewall is enabled, which it is by default in Windows these days). I don't think you can fully troubleshoot your problem at the pfSense firewall. You need to run a DNS query via nslookup or dig from a client on the network where you are having DNS problems. The returned error code will then be the clue to the real problem.
By the way, since the default pfSense firewall rules allow the firewall to go anywhere, that "connection refused" message is likely coming from the target device (the 10.216.64.29 machine). I would initially suspect a local firewall to be the cause of the refused connection.
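If neither nslookup nor dig is handy on the affected client, the same directed query can be approximated with the Python standard library. This is a rough sketch, not a replacement for those tools: it sends one A-record query over UDP and reports only the response code. The function names are hypothetical, and the reading of each outcome in the note below is a rule of thumb rather than a diagnosis.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def rcode_of(response):
    """Extract the response code from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, str(code))

def directed_query(server, name, timeout=3.0):
    """Send one query to server:53 over UDP and return the rcode or an error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "TIMEOUT"
        except OSError as exc:
            return f"ERROR ({exc})"
    return rcode_of(data)
```

Run on the affected client, `directed_query("10.216.64.18", "www.bing.com")` would query the pfSense LAN address from the thread. Roughly: REFUSED points toward the unbound ACL, SERVFAIL toward the resolver being unable to resolve the name, and TIMEOUT toward packets being dropped or replies not finding their way back.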
-
@bmeeks Yes 10.216.64.29 is the ip address of the client, the Local Route (L) in routing table is the /30 subnet 10.216.64.29-10.216.64.30 and this subnet is advertised into ospf.
Pardon me, it is 10.216.0.0/17, not 217, (the summary route of all internal OSPF routes) that is in the ACL, and that didn't work.
-
@johnpoz Sorry my mistake it is 10.216.0.0/17 I configured in the ACL, didn't work, wireshark outputs the error.
-
@IrixOS well, you still did a query to .29, which, no, I wouldn't expect to answer unless you were running DNS on it.
Let's see a basic nslookup from this Windows client.
And then you could put it into debug mode to get more info..
Here
$ nslookup
Default Server: sg4860.home.arpa
Address: 192.168.9.253
> www.bing.com
Server: sg4860.home.arpa
Address: 192.168.9.253
Non-authoritative answer:
Name: dual-a-0001.a-msedge.net
Addresses: 13.107.21.200
    204.79.197.200
Aliases: www.bing.com
    www-www.bing.com.trafficmanager.net
    www-bing-com.dual-a-0001.a-msedge.net
> set debug
> www.bing.com
Server: sg4860.home.arpa
Address: 192.168.9.253
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 7, rcode = NXDOMAIN
        header flags: response, auth. answer, want recursion, recursion avail.
        questions = 1, answers = 0, authority records = 0, additional = 0
    QUESTIONS:
        www.bing.com.home.arpa, type = A, class = IN
------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 8, rcode = NXDOMAIN
        header flags: response, auth. answer, want recursion, recursion avail.
        questions = 1, answers = 0, authority records = 0, additional = 0
    QUESTIONS:
        www.bing.com.home.arpa, type = AAAA, class = IN
------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 9, rcode = NOERROR
        header flags: response, want recursion, recursion avail.
        questions = 1, answers = 5, authority records = 0, additional = 0
    QUESTIONS:
        www.bing.com, type = A, class = IN
    ANSWERS:
    -> www.bing.com
        canonical name = www-www.bing.com.trafficmanager.net
        ttl = 16365 (4 hours 32 mins 45 secs)
    -> www-www.bing.com.trafficmanager.net
        canonical name = www-bing-com.dual-a-0001.a-msedge.net
        ttl = 1665 (27 mins 45 secs)
    -> www-bing-com.dual-a-0001.a-msedge.net
        canonical name = dual-a-0001.a-msedge.net
        ttl = 1665 (27 mins 45 secs)
    -> dual-a-0001.a-msedge.net
        internet address = 13.107.21.200
        ttl = 1665 (27 mins 45 secs)
    -> dual-a-0001.a-msedge.net
        internet address = 204.79.197.200
        ttl = 1665 (27 mins 45 secs)
------------
Non-authoritative answer:
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 10, rcode = SERVFAIL
        header flags: response, want recursion, recursion avail.
        questions = 1, answers = 0, authority records = 0, additional = 0
    QUESTIONS:
        www.bing.com, type = AAAA, class = IN
------------
Name: dual-a-0001.a-msedge.net
Addresses: 13.107.21.200
    204.79.197.200
Aliases: www.bing.com
    www-www.bing.com.trafficmanager.net
    www-bing-com.dual-a-0001.a-msedge.net
>
-
@IrixOS:
The next step in my opinion is to attempt an nslookup or dig query from one of the impacted clients. You can't run the dig command from the pfSense box and target that Windows machine. Unless that Windows machine is a DNS server, it will never respond to the query.
Edit: I see @johnpoz beat me to posting a reply by a few seconds...