Windows Clients cannot access the internet, very strange unexpected DNS problem.
-
All of a sudden and as expected...
What can I say. Actually I have been struggling with this same issue for quite some time, even years. Now I cannot even VPN anymore.
Didn't touch anything. I bet DNS works if I connect directly to the pfSense LAN IP with a laptop in a /30 subnet. I don't quite get its relation with DNS. I can't imagine this issue existing on a corporate network.
I am facing another issue along with the internet connectivity issue.
I am a megalomaniac with the crazy idea of building a Cisco three-tier network model in my home. I ended up with three Catalyst 3750 and four 4948 series switches, plus a couple of virtual servers.
The sound in each room is crazy, it's highly overkill, but I like the sound.
I looked up the power consumption on the specification sheet: up to 212 watts per unit. According to an AI chat, it's excessive and I am very scared for the energy bill. I got an energy bill a few years ago for about 3000 euros. The lady said I had the average power consumption of a company. I looked at my stuff, turned white and shut off the whole network with the differential switch. It turned out to be an administrative mistake by the company.
Well, this might all turn into reality with the current setup, what do you think? Normally home switches consume a lot less. I don't know what to do right now. I am planning to deploy an Apache web server for commercial purposes, and I am beginning to wonder if it's all worth it. pfSense didn't come free either; there was also the hardware.
All connections and settings are in place, so the fault cannot be my own; it should work unless there is some unknown compatibility issue between Cisco and Netgate. It is just a hypothesis, and I am very careful in saying this. I see no relation. I remember that before my recent network refit I blamed the old hardware for connection problems through pfSense because it was failing. I lost my remote Windows desktops too many times over VPN; when the spinning circle appeared in the middle, I knew there was a connection problem somehow, somewhere.
No, at the moment I am connected directly to the VDSL modem. Too bad...
pfSense can't access the net anymore, even after a reboot. I did what you asked; the screenshots show the result before the reboot...
-
@IrixOS if you cannot even ping googledns, your internet is not working, so no, you wouldn't be able to resolve anything. Not even sure why you're asking about DNS problems when clearly you don't even have internet access working.
-
@johnpoz I am quite sure the internet works because openvpn works, but DNS from the inside to the outside does not. Yesterday nothing worked. I slept on it for a night, manually stopped the DNS daemon the next day and rebooted, and everything came back up and DNS now works.
I am 100% sure there is something with the pfsense dns resolver, definitely.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
internet work because openvpn works,
So you're routing traffic over a VPN service? That could cause issues with resolving for sure, many of those services only allow their DNS to be used, etc. Make sure your DNS does not route over your VPN.
Unbound has zero to do with you pinging googledns, i.e. 8.8.8.8. Clearly in your post you could not talk to them.
-
@johnpoz Look, when the square computer icon at the bottom turns into a globe icon, then I know there is a problem.
So three things occur:
- No internet access in the browser
- The SERVFAIL message in dnslookup from both the client and dnslookup in pfsense.
- From both the client and pfsense at the command line, ping to 8.8.8.8 fails with the TTL error.
The VPN was just to test if i can access the firewall and beyond because you talked about my loss of ip connectivity as well.
All of a sudden internal clients are not able to resolve. I tried to reproduce the error and, as expected, after a couple of days I did.
I stopped and restarted the unbound daemon and suddenly I have that square icon at the bottom of Windows again and I'm online.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
From both the client and pfsense at the command line, ping to 8.8.8.8 fails with the TTL error.
This error you described in the quoted text really sounds like either an ISP issue or something going weird with your VPN setup.
If you can't get a reply from a ping command directly to an IP address, then your basic Layer 2/3 connectivity is broken for the client you are trying the ping command from. At that point DNS and unbound are totally and completely out of the picture.
You may be attacking this problem from the wrong end. Instead of worrying about unbound, you need to see first what is happening to Layer 2/3 connectivity (that is, why is a ping to an outside IP address not working?). The unbound daemon should not break Layer 2/3 connectivity for a client.
Think about this logically and troubleshoot in a logical manner.
- When the problem occurs, don't restart anything. First try a simple ping <pfSense_LAN_IP_address>. Does that work?
- Next try ping 8.8.8.8. Does that work?
If neither of the above work, then most certainly DNS resolving is going to be broken and Windows is going to show the globe icon (for no Internet). At that point you need to be troubleshooting Layer 2/3 connectivity to see why the basic ping to a hard-coded address is not working.
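The two checks above can be scripted so they run the same way every time the problem shows up. This is only a rough Python sketch, not anything that ships with pfSense: the function names ping, diagnose and run_checks are made up for illustration, and it relies on the exit code of the system ping command, which is a coarse signal (on Windows some error replies may still return exit code 0, so read the actual ping output as well).

```python
import platform
import subprocess


def ping(host: str, count: int = 2) -> bool:
    """Return True if the system ping command exits with code 0 for host."""
    # Windows ping takes -n for the packet count; Unix-like systems take -c.
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, str(count), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # No ping binary available, or it hung: treat as "no reply".
        return False
    return result.returncode == 0


def diagnose(gateway_ok: bool, external_ok: bool) -> str:
    """Map the two ping results onto where to look next."""
    if not gateway_ok:
        return "local: no reply from the pfSense LAN address, check the path between client and firewall"
    if not external_ok:
        return "upstream: firewall reachable but the outside address is not, check routing, WAN, ISP or VPN"
    return "ok: basic connectivity works, only now look at DNS and unbound"


def run_checks(gateway: str, external: str = "8.8.8.8") -> str:
    """Ping the firewall first, then an outside address, and report."""
    return diagnose(ping(gateway), ping(external))
```

Call run_checks with your own pfSense LAN address while the problem is occurring and before restarting anything, then note which of the three answers you get.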
-
@bmeeks Yes, not being able to ping an external IP address from a client is strange, even though all connections are set and working and the firewall is reachable... There must be some ISP issue... I can hardly believe it's the internal routing.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
@bmeeks Yes, not being able to ping an external IP address from a client is strange, even though all connections are set and working and the firewall is reachable... There must be some ISP issue... I can hardly believe it's the internal routing.
Then I would concentrate all my troubleshooting efforts on figuring out why external connectivity is broken at the basic Layer 2/3 level. Could be something with routing, could certainly be an ISP issue, or it might be the VPN setup in some fashion.
Only after you can 100% reliably ping an external IP address all the way through the network should you start looking at DNS and unbound issues.
-
The client is connected to a switch configured with a local route (L) that is advertised into OSPF, and the default route is propagated to all OSPF routers from the ASBR that is directly connected to pfsense.
I also had this issue on a past network setup, but with SVIs at that time.
You could be right, it's either the Cisco hardware or some ISP issue. The thing is, if I connect a laptop or a PC directly to the LAN interface in a /30 subnet, then it works. Programming the switch is very straightforward. What else can I do to troubleshoot with the tools that exist in Cisco IOS?
-
@IrixOS from your post above you show a TTL expired from 10.216.64.17. What device is this? Is it upstream of pfsense, or some router on your network?
That normally points to a routing loop..
Also you could have some asymmetrical routing going on.. Which depending on what is talking to what, and if there is a stateful firewall in the mix.. Stateful firewalls don't like asymmetrical routing because there is no state, etc.. or with only seeing one side of the traffic the state can expire depending.
But @bmeeks is right on the money (as always) you need to troubleshoot your connectivity issues before you go looking to what can be wrong with unbound.. Unbound is not going to function as it should if your connectivity is broken... And not being able to ping 8.8.8.8 screams of connectivity problem!!
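To make the routing-loop idea concrete: a TTL expired reply is sent by whichever router had the packet when its TTL reached zero, and in a loop the packet bounces between the same devices until that happens. In a traceroute this shows up as the same hop addresses repeating. Below is a small Python sketch of that check, under stated assumptions: the function name find_repeated_hops is made up, the hop list is typed in by hand from a traceroute, and in the example only 10.216.64.17 comes from this thread while the other addresses are invented.

```python
from typing import List, Optional


def find_repeated_hops(hops: List[Optional[str]]) -> List[str]:
    """Return hop addresses seen more than once, in order of first repeat.

    A hop of None stands for a router that did not answer ("* * *").
    """
    seen = set()
    repeated = []
    for hop in hops:
        if hop is None:
            continue
        if hop in seen and hop not in repeated:
            repeated.append(hop)
        seen.add(hop)
    return repeated


# Invented example path; only 10.216.64.17 appears in the thread.
example_path = [
    "10.0.0.1",
    "10.216.64.17",
    "10.216.64.18",
    "10.216.64.17",
    "10.216.64.18",
]
looping = find_repeated_hops(example_path)
```

If the list comes back non-empty for a real traceroute to 8.8.8.8, the repeated addresses are the devices handing the packet back and forth, and their routing tables are the place to look.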
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
The client is connected to a switch configured with a local route (L) that is advertised into OSPF, and the default route is propagated to all OSPF routers from the ASBR that is directly connected to pfsense.
I also had this issue on a past network setup, but with SVIs at that time.
You could be right, it's either the Cisco hardware or some ISP issue. The thing is, if I connect a laptop or a PC directly to the LAN interface in a /30 subnet, then it works. Programming the switch is very straightforward. What else can I do to troubleshoot with the tools that exist in Cisco IOS?
It sounds to me that you may have a routing problem. And that problem may take a little bit to manifest itself as all the network equipment does its OSPF stuff. That's not my area of networking strength. @johnpoz will be much more help there as he does this kind of stuff all the time.
But I do know that these routing protocols are dynamic in that the devices participating periodically recheck the paths to calculate the shortest one. On the surface it seems that at some point they calculate something that is "suboptimal" in terms of staying connected. Restarting and/or disconnecting a port would force a new OSPF algorithm run, and on that run they calculate correctly but then get lost again later and the cycle repeats.
-
@bmeeks Yes Cisco use that 'suboptimal' term in all their concepts all the f* time
-
@bmeeks good insight.. Depending for sure - you could get different paths taken, or path could change - it would all come down to the actual setup.. And if there is even multiple paths that could be taken..
But yeah, you could be on to something with the routing changing as to why you're seeing the issue sometimes and not others.
-
@johnpoz Actually I am not doing anything special here. I just added some Cisco switches behind a pfsense box and did everything according to pfsense and Cisco regulations, so it should work. Yes, I have multiple paths in the form of EtherChannels, but the client is only two hops away from pfsense. Even one hop away from pfsense is probably going to give the same issue; directly connected, I know it will probably work.
It is probably not the ISP, because when I connect the PC directly to the VDSL modem there is no issue. It has to be the Cisco hardware.
To my knowledge there is nothing further you can do in Cisco IOS to troubleshoot the problem with the current network condition. Now, frankly, has this firewall ever been tested with Cisco hardware or OSPF in general?
If I have to throw everything into mothballs, that would be very lame if you ask me...
-
@bmeeks
Don't bother with text, text is wrong, just the network model applies.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
Don't bother with text, text is wrong, just the network model applies.
But do all the link aggregations still apply? If so, that's a lot of places where something can go weird with a slight misconfiguration.
I also found a long thread from 2020 from a user that was having LACP issues with pfSense that were apparently never resolved. Have a look here: https://forum.netgate.com/topic/158534/lacp-not-working.
-
Yes they still apply, but I have nothing configured on the left side of the Catalyst in the middle yet. The right side of the Catalyst in the middle is configured.
Just consider the spot with 'I am here'. From there zooooom over PO1 to the ASBR (the switch that is directly connected to pfsense) and zooooom over PO2 to pfsense.
That's about it. OSPF is configured in between, all routes are advertised, and on the ASBR a default route 0.0.0.0 0.0.0.0 is configured with the pfsense IP as its next-hop address. A static route in pfsense points back to the internal network in the form of a summary route, so all connections are there.
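One thing that can be checked on paper is whether the summary route on the firewall really covers every internal subnet, since replies to any subnet outside the summary would have no way back. A minimal Python sketch using the standard ipaddress module follows; the function name uncovered_subnets is made up and all prefixes in the example are invented placeholders, not the real addressing plan.

```python
import ipaddress
from typing import List


def uncovered_subnets(summary: str, subnets: List[str]) -> List[str]:
    """Return the subnets that do NOT fall inside the summary prefix."""
    summary_net = ipaddress.ip_network(summary)
    return [
        s for s in subnets
        if not ipaddress.ip_network(s).subnet_of(summary_net)
    ]


# Invented example: a /20 summary and two internal subnets.
missing = uncovered_subnets(
    "10.216.64.0/20",
    ["10.216.64.16/30", "10.216.80.0/24"],
)
```

In this made-up case 10.216.80.0/24 lies outside the /20, so return traffic for it would need its own static route or a wider summary.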
It should work according to regulations.
-
@IrixOS said in Windows Clients cannot access the internet, very strange unexpected DNS problem.:
Yes they still apply, but I have nothing configured on the left side of the Catalyst in the middle yet. The right side of the Catalyst in the middle is configured.
Just consider the spot with 'I am here'. From there zooooom over PO1 to the ASBR (the switch that is directly connected to pfsense) and zooooom over PO2 to pfsense.
That's about it. OSPF is configured in between, all routes are advertised, and on the ASBR a default route 0.0.0.0 0.0.0.0 is configured with the pfsense IP as its next-hop address. A static route in pfsense points back to the internal network in the form of a summary route, so all connections are there.
It should work according to regulations.
As I mentioned previously, this part of networking is not my strong point. I understand the basic concepts, but in my old job I never had to actually fully design something like this. At my company we had the equivalent of @johnpoz engineers who designed the links. My job was primarily cybersecurity and firewalls, client/server software installation, configuration and administration, and various types of system programming. I interacted very frequently with the link-layer stuff and even did most of the firmware updates on equipment at my sites, but I was not heavy into the design phase.
-
@bmeeks
You mentioned that past thread about LACP. I know the LACP aggregate to pfsense is working, at least from watching the pings.
JohnPoz advised me last year to do a Wireshark capture to see what is going on. From the Wireshark output I have the feeling, though I am not sure, that there is some point where DNS doesn't come through.
And that is the ip address of the LACP aggregate (10.216.64.17) as shown in the network diagram.