Not able to access websites/network connection issues
-
Thanks for the response! I called the ISP and they said everything looked good on their end. They ran tests, reported no packet loss, asked whether I had a firewall installed, and implied the problem was likely firewall related, although I never told them I had a firewall.
During the issues, the websites I tried connecting to would mention something about DNS prior to timing out.
These are the only packages I have installed.
Here are the results from pinging the Google DNS servers.
Here are the results from pinging google, which was successful, then pinging messicks.com, which failed with 100% packet loss.
Here are some of the logs from earlier failures.
Here's a recent log.
Thanks for the help! Much appreciated!!!
-
@LMorefield your log starts out with 93/96% packet loss.
WAN blocks are basically irrelevant as inbound traffic is blocked by default, and should be. I always turn off logging of the default block rules to clean up the logs.
Not every web site is pingable. The question would be, is it normally pingable. But irrelevant here, I think; with 90% loss barely anything is getting through.
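For a site that ignores ping, a TCP connect to the web port is a fairer test than ICMP. A minimal Python sketch of that check follows; the function name tcp_reachable, port 443, and the 3 second timeout are my own choices for illustration, nothing pfSense-specific:

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    Any failure (DNS error, refused connection, timeout) is reported as False,
    so this only answers "can I open a connection", not why it failed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.gaierror and socket.timeout are both subclasses of OSError.
        return False
```

Calling tcp_reachable("messicks.com") from a LAN client would tell you whether the web server answers even though it drops pings. With heavy packet loss on the WAN it may well come back False some of the time.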
-
@SteveITS said in Not able to access websites/network connection issues:
Not every web site is pingable.
@LMorefield : yep, "messicks.com" doesn't reply to ping either.
But this :
is a "mess".
Where is this "192.168.0.1" ? pfSense can barely reach it. Same for "fe80::626d:c7ff:feb6:c30f".
The WAN plug, or cable, to the upstream device is ok ??
And guess what happens when 'dpinger' gets too many 'ping' losses ?? It will reset the connection.
If IPv4 "WAN" gets reset / recreated, IPv6 will go down with it. And so on ...
And are you using a "lagg0" type interface on the WAN side ?
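To put rough numbers on the dpinger point: a gateway monitor boils down to comparing the loss over a recent window of probes against a threshold. The sketch below is not dpinger's code. The function names and the 20% alarm threshold are assumptions for illustration; the real thresholds are whatever is configured for the gateway under System > Routing.

```python
def loss_percent(sent: int, received: int) -> float:
    """Percentage of probes in the window that got no reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


def gateway_in_alarm(sent: int, received: int, alarm_threshold: float = 20.0) -> bool:
    """True when loss over the window reaches the (assumed) alarm threshold."""
    return loss_percent(sent, received) >= alarm_threshold
```

With only 7 replies out of 100 probes, which is the 93% loss at the start of the posted log, gateway_in_alarm(100, 7) is True under that assumed threshold.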
-
I think 192.168.0.1 is coming from the ISP. I'm not sure why it's not able to be reached. Things have hit the fan this morning unfortunately.
This is the interface on the WAN side.
Any idea how to open things up and let them through? Some of the connections required to do my daily tasks are not able to connect to the network.
-
@LMorefield said in Not able to access websites/network connection issues:
I think 192.168.0.1 is coming from the ISP
Your "Internet" connection is running out of your house ?
Or do you have an ISP 'modem' or ISP router in front of your pfSense ?
I don't know what 'lagg' is, except from what I 'heard'.
But combining LAN and WAN together doesn't rhyme at all with your "Not able to access websites/network connection issues", as only network experts would use lagg stuff. Very IMHO, of course.
what is lagg : I wasn't far off. Please use some words to explain why you would need this lagg ?
I always thought that lagg was used to combine several WAN type interfaces to gain throughput. An equivalent setup has to be made on the other side of the lagged interfaces (I guess).
-
I agree with @Gertjan here, why the lagg0 configuration?
And you most definitely should not put both your LAN and WAN on the same lagg. No wonder you are having so many issues.
lagg interfaces are to increase throughput by aggregating multiple physical links. So, for example, you could combine two physical LAN ports into a lagg and theoretically have twice the throughput. But that is true only if you have a matching lagg at the other endpoint (for instance, a managed Ethernet switch properly configured).
Here is the documentation on lagg interfaces: https://docs.netgate.com/pfsense/en/latest/interfaces/lagg.html. Read through that and you should immediately see your mistake. Here is the key line:
...combines multiple physical interfaces together as one logical interface.
You have connected your LAN and WAN together via the common lagg0 virtual interface! No wonder your firewall is completely confused. And, by the way, you are not filtering traffic at all in that configuration. You essentially have no firewall, but you have messed-up routing. Because of that heavily broken routing, hardly anything can get back to your LAN devices. That's why you are having all the problems.
Remove the lagg0 virtual interface and use a plain vanilla LAN and separate plain vanilla WAN interface. Then things will improve.
-
@Gertjan said in Not able to access websites/network connection issues:
@LMorefield said in Not able to access websites/network connection issues:
I think 192.168.0.1 is coming from the ISP
Your "Internet" connection is running out of your house ?
Or do you have an ISP 'modem' or ISP router in front of your pfSense ?
Coax into ISP modem, into Netgate/pfSense.
I don't know what 'lagg' is, except from what I 'heard'.
But combining LAN and WAN together doesn't rhyme at all with your "Not able to access websites/network connection issues", as only network experts would use lagg stuff. Very IMHO, of course.
what is lagg : I wasn't far off. Please use some words to explain why you would need this lagg ?
I always thought that lagg was used to combine several WAN type interfaces to gain throughput. An equivalent setup has to be made on the other side of the lagged interfaces (I guess).
When I received this device, it was set up this way. I factory reset it through the console and these are the factory default settings, or at least I thought they were. I didn't change the settings, only the IP address pool/ranges.
@bmeeks said in Not able to access websites/network connection issues:
I agree with @Gertjan here, why the lagg0 configuration?
Same, it was configured this way following the factory reset through the console.
And you most definitely should not put both your LAN and WAN on the same lagg. No wonder you are having so many issues.
lagg interfaces are to increase throughput by aggregating multiple physical links. So, for example, you could combine two physical LAN ports into a lagg and theoretically have twice the throughput. But that is true only if you have a matching lagg at the other endpoint (for instance, a managed Ethernet switch properly configured).
This is something I'd be interested in down the road, increasing throughput.
Here is the documentation on lagg interfaces: https://docs.netgate.com/pfsense/en/latest/interfaces/lagg.html. Read through that and you should immediately see your mistake. Here is the key line:
...combines multiple physical interfaces together as one logical interface.
You have connected your LAN and WAN together via the common lagg0 virtual interface! No wonder your firewall is completely confused. And, by the way, you are not filtering traffic at all in that configuration. You essentially have no firewall, but you have messed-up routing. Because of that heavily broken routing, hardly anything can get back to your LAN devices. That's why you are having all the problems.
Remove the lagg0 virtual interface and use a plain vanilla LAN and separate plain vanilla WAN interface. Then things will improve.
Thank you! Again, I received the device in this configuration. I would love to make it as simple as possible and make it plain vanilla. Is this straightforward to do? If so, I'm going to shut the network down and do it asap!
Thank you both again!
-
I'm not seeing where I can change the configuration.
-
The XG 7100 is an older appliance, first introduced in 2018 from what I found in a search. So, I am assuming you obtained this appliance used, or did you purchase it new directly from Netgate?
I am tagging @stephenw10 in this thread. He is the Netgate hardware guru.
The XG 7100 is a special animal in that it has an integrated Ethernet switch. Not all of the ports on the device are fully independent ports. Some are part of the built-in switch.
Have you read all through the section in the XG 7100 docs on the switch configuration? Here is that link: https://docs.netgate.com/pfsense/en/latest/solutions/xg-7100-1u/.
I do not believe that lagg0 configuration is valid with both LAN and WAN in the same lagg. That makes no sense. I can perhaps see those particular switch ports being grouped as a lagg, but that lagg should not contain both LAN and WAN ports. So, in other words, all the lagg0 ports are LAN or all are WAN, but never both LAN and WAN.
Here is an example switch reconfiguration from the docs to separate out the WAN interface from the default lagg0 virtual interface: https://docs.netgate.com/pfsense/en/latest/solutions/xg-7100-1u/configuring-the-switch-ports.html#switch-configuration-examples. You could try following this example to isolate at least the WAN port from the lagg0.
-
https://docs.netgate.com/pfsense/en/latest/solutions/xg-7100-1u/switch-overview.html#switch-lagg
“In the default configuration, two VLANs are used to create the ETH1 WAN interface and ETH2-8 LAN interface:
WAN: VLAN 4090
LAN: VLAN 4091”
Looks like they are set up as VLANs on an 8 port switch, similar to the 1100. The VLAN isolates the port used for WAN and leaves the rest on LAN.
So while it looks a bit odd, seems to be normal.
If 192.168.0.1 is the ISP router, seems like it has a problem, bad patch cable, something causing packet loss.
-
@SteveITS said in Not able to access websites/network connection issues:
https://docs.netgate.com/pfsense/en/latest/solutions/xg-7100-1u/switch-overview.html#switch-lagg
“In the default configuration, two VLANs are used to create the ETH1 WAN interface and ETH2-8 LAN interface:
WAN: VLAN 4090
LAN: VLAN 4091”
Looks like they are set up as VLANs on an 8 port switch, similar to the 1100. The VLAN isolates the port used for WAN and leaves the rest on LAN.
So while it looks a bit odd, seems to be normal.
If 192.168.0.1 is the ISP router, seems like it has a problem, bad patch cable, something causing packet loss.
Yeah, that is definitely weird to me with wrapping LAN and WAN in the lagg0 interface but with VLANs. There is a big red DANGER warning in the docs that says "do not delete the lagg0 interface". So, it must be some special requirement for the XG 7100 hardware to function.
I agree the box is having problems getting out reliably (or at least getting replies back in). At first I thought it would be all confused with the lagg0 setup, but apparently that's something unique to the XG 7100.
-
This is what happens when it starts malfunctioning
Then it says the web address could've been mistyped. I tried screenshotting that as well but the website loaded before I could switch tabs back.
Any idea where to look or what to troubleshoot to fix this?
-
@LMorefield said in Not able to access websites/network connection issues:
This is what happens when it starts malfunctioning
Then it says the web address could've been mistyped. I tried screenshotting that as well but the website loaded before I could switch tabs back.
Any idea where to look or what to troubleshoot to fix this?
This error from a web browser means it was unable to connect to the requested website. There can be two fundamental reasons for that.
The first thing that must work correctly is the DNS lookup: the domain name must be translated into an actual IP address before a connection can happen. So, your web browser asks its configured DNS server to find the IP address for the domain name "www.thechunkychef.com". That domain name resolves to three IP addresses:
173.239.8.164 173.239.5.6 74.206.228.78
Once it receives the IP address (or addresses in this case), the browser attempts a connection to the IP.
So, the first place for failure is that the DNS lookup fails, and so the browser has no IP address to use. The second place for failure is that the ISP connection is faulty and the connection to the IP address fails (or, back to the first problem, the connection is so bad that the DNS lookup attempt failed because it could not reach the Internet to query the domain name servers).
You need to determine which issue you are having: DNS failures or network connectivity failures -- or both (network failures will naturally lead to DNS lookup failures).
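If it helps, the two failure points can be told apart from any LAN client with a few lines of Python. This is only a sketch under my own assumptions: the names resolve_ipv4, can_connect and diagnose are made up for illustration, and port 443 with a 3 second timeout is an arbitrary choice.

```python
import socket
from typing import Callable, List


def resolve_ipv4(host: str) -> List[str]:
    """Step 1: ask the system resolver for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(ip: str, port: int, timeout: float) -> bool:
    """Step 2: try to open a TCP connection to one resolved address."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host: str, port: int = 443, timeout: float = 3.0,
             resolve: Callable[[str], List[str]] = resolve_ipv4,
             connect: Callable[[str, int, float], bool] = can_connect) -> str:
    """Return 'dns_failure', 'connect_failure' or 'ok' for the given host."""
    try:
        addresses = resolve(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "dns_failure"
    if not addresses:
        return "dns_failure"
    if any(connect(ip, port, timeout) for ip in addresses):
        return "ok"
    return "connect_failure"
```

Running diagnose("www.thechunkychef.com") a few times while the problem is happening should show which step breaks. Keep in mind that a "dns_failure" result can still be caused by a bad WAN link, since the resolver itself needs the Internet connection.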
Examine the pfSense system log under STATUS > SYSTEM LOGS. Do you see messages about WAN interface alarms and packet loss? If you do, then your ISP connection is at fault. That could be a bad port on your firewall, it could be a bad RJ45 connecting cable, or it could be a problem with your ISP equipment or connection.
Looking at the log entries you supplied earlier, it certainly looks to me that your ISP connection is sporadic. That might be a bad cable, or their support folks might be sandbagging you when they claim there is no problem on their side.
There is a process in the default pfSense setup called dpinger. This is a program that constantly sends an ICMP ping request to the configured default gateway. In your case that gateway IP appears to be 192.168.0.1. So long as that gateway IP replies to the ping request in a timely manner, dpinger assumes everything is fine. But if the gateway does not respond to the ping request in the timeout window configured, then dpinger assumes the interface is down and it will attempt to restart it. This is all configured under the SYSTEM > ROUTING menu for gateway monitoring. If your ISP connection is flaky, or if the configured gateway gets busy and does not reply to the pings in a timely manner, dpinger can be triggered to restart things. During that restart your WAN connection will go away and then come back. That disrupts things on your firewall including the unbound DNS Resolver that looks up IP addresses for domain names.
At first I was shocked by the lagg0 setup, but upon further reading in the docs it appears that's a normal but unique thing for the XG 7100.
-
@bmeeks
Recent logs
It's weird that it worked fine for 1 week, then yesterday it started malfunctioning.
-
@LMorefield said in Not able to access websites/network connection issues:
It's weird that it worked fine for 1 week, then yesterday it started malfunctioning.
Then that points a finger directly at either your ISP or a bad cable.
First thing to do is swap out the cable connecting the WAN port of your firewall to the ISP's equipment. Next thing I would do, if that does not correct the problem, is get back on the phone with the ISP support and try to convince them it's their problem.
You have actual packet loss there. It's not enough to trigger dpinger to restart the interface, but it is significant loss. That loss is enough to cause issues with regular network traffic such as DNS name resolutions.
-
@bmeeks said in Not able to access websites/network connection issues:
At first I was shocked by the lagg0 setup, but upon further reading in the docs it appears that's a normal but unique thing for the XG 7100.
Thanks for the info.
-
I changed a few things today and seem to have fixed the issue. Now the issue will be trying to figure out which one actually fixed it (at least for the time being).
I was still having the same intermittent issues after making a brand new cat-6 ethernet cable that passed when tested. There are multiple lines of internet entering my org. The ISP is the same, however, the accounts are different. I used the ethernet from a different account. The speeds are much slower, but it's a different account, different modem, etc.
I disabled the DHCP6 Server
I changed the IPv6 Configuration type from "Tracking" to "none"
Here are my logs:
System General
Gateways
Firewall
I'll leave the current ethernet (not the original) connected through Tuesday 11/28, as that's the next day we'll have everyone in and the system will be loaded. If it works through then, I'll revert to the old ethernet and get on the horn with the ISP ensuring they fix the issue.
Any feedback on the changes I've made regarding IPv6 and DHCP6? Thanks in advance! Happy Thanksgiving!
-
If your ISP does not provide IPv6 service, then certainly disable those settings. But if your ISP provides an IPv6 connection, enabling that in pfSense is fine. However, if you are not skilled in the networking art, it may be better not to attempt to configure IPv6, because it seems each ISP has its own unique "quirks" in its implementation of that protocol.
The other thing I notice in your logs is that you seem to have the "Block Private Networks" setting enabled under INTERFACES > WAN. Your default gateway looks to be in RFC 1918 space (192.168.0.1), so you definitely would want to uncheck that option.
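For anyone unsure whether their own gateway falls in RFC 1918 space, here is a quick Python check. It is a small sketch using the standard ipaddress module; the helper name in_rfc1918 is just something I made up for illustration.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address lies in one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    # IPv6 addresses are never RFC 1918; check the version before matching.
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)
```

in_rfc1918("192.168.0.1") returns True, which is why the gateway in this thread counts as a private address from the WAN's point of view.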