Pfsense internet goes down all the time
-
Is the ISP modem in bridge mode?
If not, start there.
-
@gertjan That would be the easiest solution, because "Ziggo" already dug open 40 meters of street and replaced the whole cable. That one was damaged, though.
Thanks! I'll post my findings.
@Cool_Corona It is.
-
@unf0rg0tt3n said in Pfsense internet goes down all the time:
So it could be a nicked or damaged cat 6a cable?
Trying a new cable is the easiest thing to do. However, if the cable is bad, you will have no connection, a connection at only 100 Mb/s, or lots of errors and poor throughput. You can get cheap testers to verify the cable is wired properly.
-
@jknott Thanks for your reply!
The cable is good (according to the tester I used). About the errors:
enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:25:90:dc:0f:3a  txqueuelen 1000  (Ethernet)
        RX packets 621161334  bytes 713693829981 (664.6 GiB)
        RX errors 1  dropped 6  overruns 0  frame 1
        TX packets 411474807  bytes 208509699544 (194.1 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 16  memory 0xdfd00000-dfd20000
There aren't many: 1 RX error with 6 dropped.
This is on the physical hypervisor interface.
pfSense reports 0 errors.
-
Yeah, that seems fine. You could also check the Status > Interfaces page in pfSense in case it's some intermittent fault that test didn't show.
Perhaps the cable is wrapped around your AC compressor and its suppression has failed, for example.
The gateway logs like this are bad:
Jun 21 07:13:45 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 14389us stddev 2120us loss 22%
Jun 21 07:21:25 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 14821us stddev 2999us loss 5%
Jun 21 07:26:40 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 15642us stddev 6878us loss 21%
Jun 21 07:34:18 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 14966us stddev 3562us loss 5%
Jun 21 07:37:33 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 13933us stddev 1854us loss 21%
Jun 21 07:45:12 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 15617us stddev 4664us loss 6%
But note that is packet loss only. Typically you would see a large increase in latency too if it was a saturation issue.
Your modem may still have an IP you can connect to in bridge mode, just not in the WAN subnet. You might need to add a VIP on WAN to connect to it.
You are monitoring 1.1.1.1 which is generally a good idea but it might be interesting to switch that back to the gateway IP to see if that also fails.
If you have only one WAN you should disable the gateway monitoring action on that gateway (not the monitoring itself); there is no point restarting everything when there is no secondary WAN to fail over to.
Steve
-
Mmm, I also note those packet loss events are almost exactly the same length, ~7m40s.
That seems too consistent to be something like a bad cable.
Steve
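
To put numbers on that observation, here is a minimal Python sketch that pairs each dpinger Alarm line with the following Clear line and reports how long the alarm lasted. The log text is copied from the lines posted above. The year is not in the syslog timestamps, so `year=2021` is an assumption, and the regular expression only covers the line layout seen in this thread.

```python
import re
from datetime import datetime

# Gateway log lines as posted in this thread.
LOG = """\
Jun 21 07:13:45 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 14389us stddev 2120us loss 22%
Jun 21 07:21:25 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 14821us stddev 2999us loss 5%
Jun 21 07:26:40 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 15642us stddev 6878us loss 21%
Jun 21 07:34:18 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 14966us stddev 3562us loss 5%
Jun 21 07:37:33 dpinger 84199 WAN_DHCP 1.1.1.1: Alarm latency 13933us stddev 1854us loss 21%
Jun 21 07:45:12 dpinger 84199 WAN_DHCP 1.1.1.1: Clear latency 15617us stddev 4664us loss 6%
"""

# Matches only the layout shown above; other dpinger messages are skipped.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d) dpinger \d+ (?P<gw>\S+) (?P<target>\S+): "
    r"(?P<state>Alarm|Clear) latency \d+us stddev \d+us loss (?P<loss>\d+)%$"
)


def outage_durations(log_text, year=2021):
    """Return one timedelta per Alarm -> Clear pair, in log order."""
    durations = []
    alarm_start = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        # Syslog timestamps carry no year, so one is supplied (assumed).
        stamp = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
        if match["state"] == "Alarm":
            alarm_start = stamp
        elif alarm_start is not None:
            durations.append(stamp - alarm_start)
            alarm_start = None
    return durations


for duration in outage_durations(LOG):
    print(duration)
```

For the six lines above this gives 7m40s, 7m38s and 7m39s, which is the consistency being pointed out.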
-
@stephenw10 said in Pfsense internet goes down all the time:
Mmm, I also note those packet loss events are almost exactly the same length, ~7m40s.
That seems too consistent to be something like a bad cable.
Steve
Yeah, I guess soft reboot times.
@stephenw10 said in Pfsense internet goes down all the time:
Perhaps the cable is wrapped around your AC compressor and its suppression has failed, for example.
AC compressor? We don't have AC, haha. But the cable is bundled with 2 other ones linked to APs.
-
I have no AC either but you get the idea.
Whatever is causing a problem may not be present all the time. The test you ran looks good, but the pfSense interface stats are monitoring it all the time, so if it's something intermittent it should show there.
@unf0rg0tt3n said in Pfsense internet goes down all the time:
Yeah, I guess soft reboot times.
That seems like a very long time for anything to reboot. The modem?
I would want to be sure whether the modem is or isn't rebooting.
Steve
-
@stephenw10 I don't know if it might take that long. It feels like forever: https://imgur.com/a/jMjwwez
-
It may really take that long if it loses upstream sync. It's a cable modem?
It could be it's not rebooting at all and only loses upstream sync. pfSense is connected to it directly, not via a switch?
The logs you posted do not show it losing link on the WAN, just the gateway going down. If it really rebooted I would expect to see the link go down.
Have you checked to see if the modem has an admin interface IP you might be able to access?
Steve
-
@stephenw10 There isn't an interface IP. It's an Arris TG2492LG (cable modem).
The link might not go down because it's a VM? In that case there is always link?
I might be wrong.
According to the webpage it takes 5 minutes to fully reboot the modem. pfSense might take another 2 to see that it's actually up.
Also, no switch; it is directly attached to the modem via UTP.
-
Some quick googling shows there's a good chance that modem still has a management interface at 192.168.100.1 when it's in bridge mode. So if you're not using that subnet already I suggest adding an IPAlias VIP on the WAN interface at, for example, 192.168.100.10/24 and then trying to ping it from pfSense.
If that works add a manual outbound NAT rule on the WAN to allow LAN clients to access that IP via the VIP.
See: https://docs.netgate.com/pfsense/en/latest/recipes/modem-access.html
Steve
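
Before adding the VIP, the two conditions above (the VIP must sit in the modem's management subnet, and that subnet must not already be in use) can be sanity-checked with Python's standard `ipaddress` module. This is only a sketch: the function names are made up for illustration, and the LAN subnet `192.168.1.0/24` is a hypothetical value, not something taken from this thread.

```python
import ipaddress


def vip_reaches_modem(vip_cidr: str, modem_ip: str) -> bool:
    """True if the modem IP is inside the VIP's subnet and is not the VIP itself."""
    vip = ipaddress.ip_interface(vip_cidr)
    modem = ipaddress.ip_address(modem_ip)
    return modem in vip.network and modem != vip.ip


def clashes_with_lan(vip_cidr: str, lan_cidr: str) -> bool:
    """True if the VIP's subnet overlaps a subnet that is already in use."""
    vip_net = ipaddress.ip_interface(vip_cidr).network
    return vip_net.overlaps(ipaddress.ip_network(lan_cidr))


# VIP and modem address suggested in the post above.
print(vip_reaches_modem("192.168.100.10/24", "192.168.100.1"))
# Hypothetical LAN subnet; substitute your own.
print(clashes_with_lan("192.168.100.10/24", "192.168.1.0/24"))
```

If the first check is true and the second is false, the suggested VIP should be usable as described.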
-
@stephenw10 said in Pfsense internet goes down all the time:
Some quick googling shows there's a good chance that modem still has a management interface at 192.168.100.1 when it's in bridge mode. So if you're not using that subnet already I suggest adding an IPAlias VIP on the WAN interface at, for example, 192.168.100.10/24 and then trying to ping it from pfSense.
If that works add a manual outbound NAT rule on the WAN to allow LAN clients to access that IP via the VIP.
See: https://docs.netgate.com/pfsense/en/latest/recipes/modem-access.html
Steve
Didn't quite get how to get that done. Can't add an interface. But I'm on the phone with my ISP, and they saw it go down twice, for 7 minutes and 40 seconds each time. They tried to blame my internal network, but that can't be the cause, because it is their modem that goes offline. I even built a new pfSense box and moved it directly next to the modem, and that worked for 10 minutes before it went down. So an all-new box and all-new cables.
-
Yeah, you don't need to assign an interface for a DHCP modem, just add a VIP on the existing WAN so pfSense has an IP in the same subnet as the modem management IP.
-
Why are you using 1.1.1.1 as your monitor? Does your gateway not answer ping?
The problem with using something like 1.1.1.1 is that they may throttle you. They get hit with so many packets from you, even zero-sized data pings, that they might say F this guy and firewall you for X amount of time. Now pfSense sees loss on its monitor and goes through trying to recover, etc. Something on a constant cycle of X amount of time could point to something like this happening.
Whenever possible, you should check that you have connectivity to the internet by checking the first IP you hit going to the internet, i.e. your ISP gateway.
If they have that not answering pings for some asinine reason, then move up the stream. But picking something like 1.1.1.1 could have you think you're offline when all it is is a peering problem between your ISP and that network, or an issue with their routing, and nothing really wrong with the overall internet.
You could always just tell pfSense that the connection is always up, i.e. turn off the gateway action, and see if the internet actually still works when monitoring fails.
I have this set because sometimes I saturate my connection with HUGE downloads filling up my pipe, and the monitor would sometimes fail the pings when doing this. So just turn off the action, and you can check whether monitoring is showing packet loss while you're actually still working. That could point to either something upstream, or your monitoring IP throttling you because you talk to them too much. By default I think pfSense sends 2 pings a second. Maybe 1.1.1.1 doesn't like that.
Just throwing ideas out there is all.
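
The reasoning above boils down to comparing loss toward the ISP gateway with loss toward a far-away target. A rough Python sketch of that decision table follows. The function name, the wording of the results and the 20% threshold are illustrative assumptions, not pfSense behaviour.

```python
def diagnose(gateway_loss_pct: float, remote_loss_pct: float,
             threshold_pct: float = 20.0) -> str:
    """Guess where the problem sits from loss measured to two monitor targets.

    gateway_loss_pct: loss toward the first hop (the ISP gateway).
    remote_loss_pct:  loss toward a distant target such as 1.1.1.1.
    threshold_pct:    illustrative cut-off for calling a target "bad".
    """
    gateway_bad = gateway_loss_pct >= threshold_pct
    remote_bad = remote_loss_pct >= threshold_pct
    if gateway_bad and remote_bad:
        return "local link, modem, or ISP access network"
    if remote_bad:
        return "upstream of the ISP gateway, or the remote target is limiting pings"
    if gateway_bad:
        return "ISP gateway is dropping pings; the internet path itself looks fine"
    return "no significant loss"


# Loss figure taken from the Alarm lines earlier in the thread, assuming the
# gateway itself had stayed clean during that window (not actually measured).
print(diagnose(gateway_loss_pct=0.0, remote_loss_pct=22.0))
```

Only the combination where both targets fail points at the local side, which is why monitoring the gateway IP for a while is a useful cross-check.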
-
@stephenw10 said in Pfsense internet goes down all the time:
Yeah, you don't need to assign an interface for a DHCP modem, just add a VIP on the existing WAN so pfSense has an IP in the same subnet as the modem management IP.
So I figured it out thanks to you guys. This gives me so much more information, like:
The modem does a reboot:
27-06-2021 11:42:40 notice REGISTRATION COMPLETE - Waiting for Operational status;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:42:32 warning MIMO Event MIMO: Stored MIMO=-1 post cfg file MIMO=-1;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
01-01-1970 01:01:26 notice Cable Modem Reboot - due to power reset;CM-MAC=<MAC>;CMTS-MAC=00:00:00:00:00:00;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:29:35 notice REGISTRATION COMPLETE - Waiting for Operational status;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:29:27 warning MIMO Event MIMO: Stored MIMO=-1 post cfg file MIMO=-1;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
01-01-1970 01:01:25 notice Cable Modem Reboot - due to power reset;CM-MAC=<MAC>;CMTS-MAC=00:00:00:00:00:00;CM-QOS=1.1;CM-VER=3.0;
Which I didn't initiate.
Also this one is fairly interesting:
01-01-1970 01:01:41 critical No Ranging Response received - T3 time-out;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
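
For anyone wanting to pull the same evidence out of a longer modem event log, here is a small Python sketch that counts the reboot entries and measures the time between successive registrations. It assumes the line layout shown above (date, time, severity, then a message ending at the first semicolon). The 01-01-1970 timestamps are presumably written before the modem has its clock set after a restart, so they are not used for timing.

```python
from datetime import datetime

# Event log lines as posted above (newest first), MAC addresses redacted.
EVENTS = """\
27-06-2021 11:42:40 notice REGISTRATION COMPLETE - Waiting for Operational status;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:42:32 warning MIMO Event MIMO: Stored MIMO=-1 post cfg file MIMO=-1;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
01-01-1970 01:01:26 notice Cable Modem Reboot - due to power reset;CM-MAC=<MAC>;CMTS-MAC=00:00:00:00:00:00;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:29:35 notice REGISTRATION COMPLETE - Waiting for Operational status;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
27-06-2021 11:29:27 warning MIMO Event MIMO: Stored MIMO=-1 post cfg file MIMO=-1;CM-MAC=<MAC>;CMTS-MAC=<MAC>;CM-QOS=1.1;CM-VER=3.0;
01-01-1970 01:01:25 notice Cable Modem Reboot - due to power reset;CM-MAC=<MAC>;CMTS-MAC=00:00:00:00:00:00;CM-QOS=1.1;CM-VER=3.0;
"""


def parse_events(text):
    """Return a list of (timestamp, severity, message) tuples."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        date_s, time_s, severity, rest = line.split(" ", 3)
        stamp = datetime.strptime(f"{date_s} {time_s}", "%d-%m-%Y %H:%M:%S")
        message = rest.split(";", 1)[0]  # drop the CM-MAC/CMTS-MAC fields
        events.append((stamp, severity, message))
    return events


events = parse_events(EVENTS)
reboots = [e for e in events if "Cable Modem Reboot" in e[2]]
registrations = sorted(e[0] for e in events if e[2].startswith("REGISTRATION COMPLETE"))
gaps = [later - earlier for earlier, later in zip(registrations, registrations[1:])]

print(f"reboot entries: {len(reboots)}")
for gap in gaps:
    print(f"time between registrations: {gap}")
```

On the lines above this finds two reboot entries and 13 minutes 5 seconds between the two registrations.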
@johnpoz said in Pfsense internet goes down all the time:
You could always just tell pfSense that the connection is always up, i.e. turn off the gateway action, and see if the internet actually still works when monitoring fails.
Yes! Thanks, I've set that up now. No ping, only monitoring for online status.
-
@stephenw10 said in Pfsense internet goes down all the time:
You are monitoring 1.1.1.1 which is generally a good idea but it might be interesting to switch that back to the gateway IP to see if that also fails.
If you have only one WAN you should disable the gateway monitoring action on that gateway (not the monitoring itself); there is no point restarting everything when there is no secondary WAN to fail over to.
Yup. That^.
I've never seen 1.1.1.1 throttle ping responses, but they certainly could. They are under no obligation to respond to ping at all.
But, yeah, if you can see the modem really is rebooting, there is nothing pfSense could do to cause that.
Steve
-
There is a good chance that the DOCSIS error (T3 time-out) is caused by noise on the line between your modem and the cable provider's CMTS. The modem's ranging requests then go unanswered, which shows up as packet loss. The only way to solve this is to contact your provider. They usually totally ignore this kind of T3 time-out. Usually there are several T3 time-out failures; some of them are harmless, some cause the modem to reboot.
-
Okay,
Everyone, thank you so much for helping me out this far! It isn't a pfSense issue.
I will go to my ISP, again... but this time with hard evidence (thanks to you all).
-
@unf0rg0tt3n said in Pfsense internet goes down all the time:
It isn't a pfsense issue.
Before you seal the 'pointing finger' in concrete, you could do one more test:
Exclude pfSense for a while, hook up the modem directly to a PC and set it up for a connection.
Or use another 'off-the-shelf' router.
You should be seeing the same issue.