WAN loses connection randomly every 24-36 hours - tried everything
-
Hi all,
My pfSense box seems to lose the WAN connection at random times, at 24-36 hour intervals, continuously. It looks like it loses the WAN address.
This issue just started a few weeks ago; otherwise I've been running pfSense for 5+ years without any issues.
Can anyone help me solve this weird issue? I'm at my wits' end :(
Thanks
Jim
I have seen multiple similar forum threads and tried various things:
DHCP setting changes on WAN to FreeBSD default and other custom timings
Manually setting the speed on the WAN NIC
Swapping ethernet cables to new ones
Removed all custom tweaks
Reinstalled pfSense (applied previous config xml)
Changed gateway monitor IP to 8.8.8.8
Swapped LAN ports on the modem
Scouring logs till I'm blind
Timeline and useful information:
July 6th 10:27: Internet goes down again
Jul 6 10:27:08 dpinger WAN_DHCP 8.8.8.8: Alarm latency 19669us stddev 3716us loss 21%
July 6th 10:35: I restart the modem
Jul 6 10:35:00 pfSense dhclient[1007]: igb0 link state up -> down
This seems to be my DHCP server at the ISP:
DHCPREQUEST on igb0 to 80.62.121.174 port 67
I seem to be receiving DHCP OFFER, but when internet is down 0.0.0.0 is shown
cat /var/log/dhcpd.log | grep 80.62.121.174
Jul 6 08:50:54 pfSense dhclient[1007]: DHCPREQUEST on igb0 to 80.62.121.174 port 67
Jul 6 08:50:54 pfSense dhclient[1007]: DHCPACK from 80.62.121.174
I use MANUAL outbound NAT, as I have selective routing for my VPN (worked fine for multiple years)
I have a STATIC ip assigned via DHCP reservation at my ISP
NO static routes set up
NO traffic shaper
How to resolve:
Rebooting my cable modem solves the problem
Rebooting the pfSense box solves the problem
Hardware:
Qotom 6x1Gbit Intel i211 NIC / Intel i7-7500U
Cable modem Sagemcom Fast 3890v3 in BRIDGE MODE
1000/100Mbit connection
Log files:
dhcp logs:
https://pastebin.com/qPKSyvRv
Gateway log:
https://pastebin.com/NEK3WbSS
General log:
https://pastebin.com/fAyZnmuY
/var/log/dhclient.leases.igb0: MOST RECENT version
lease { interface "igb0"; fixed-address 2.106.185.197; option subnet-mask 255.255.255.192; option routers 2.106.185.193; option domain-name-servers 193.162.153.164,194.239.134.83; option host-name "x1-6-40-62-31-0b-a7-d9"; option domain-name "webspeed.dk"; option interface-mtu 576; option broadcast-address 255.255.255.255; option dhcp-lease-time 259200; option dhcp-message-type 5; option dhcp-server-identifier 80.62.121.174; renew 2 2020/7/7 18:50:54; rebind 3 2020/7/8 21:50:54; expire 4 2020/7/9 06:50:54; }
lease { interface "igb0"; fixed-address 192.168.100.10; next-server 192.168.100.1; option subnet-mask 255.255.255.0; option routers 192.168.100.1; option dhcp-lease-time 30; option dhcp-message-type 5; option dhcp-server-identifier 192.168.100.1; renew 1 2020/7/6 08:36:21; rebind 1 2020/7/6 08:36:40; expire 1 2020/7/6 08:36:51; }
lease { interface "igb0"; fixed-address 192.168.100.10; next-server 192.168.100.1; option subnet-mask 255.255.255.0; option routers 192.168.100.1; option dhcp-lease-time 30; option dhcp-message-type 5; option dhcp-server-identifier 192.168.100.1; renew 1 2020/7/6 08:36:28; rebind 1 2020/7/6 08:36:47; expire 1 2020/7/6 08:36:58; }
lease { interface "igb0"; fixed-address 192.168.100.10; next-server 192.168.100.1; option subnet-mask 255.255.255.0; option routers 192.168.100.1; option dhcp-lease-time 30; option dhcp-message-type 5; option dhcp-server-identifier 192.168.100.1; renew 1 2020/7/6 08:36:59; rebind 1 2020/7/6 08:37:18; expire 1 2020/7/6 08:37:29; }
lease { interface "igb0"; fixed-address 2.106.185.197; option subnet-mask 255.255.255.192; option routers 2.106.185.193; option domain-name-servers 193.162.153.164,194.239.134.83; option host-name "x1-6-40-62-31-0b-a7-d9"; option domain-name "webspeed.dk"; option interface-mtu 576; option broadcast-address 255.255.255.255; option dhcp-lease-time 252802; option dhcp-message-type 5; option dhcp-server-identifier 80.62.121.174; renew 2 2020/7/7 19:44:13; rebind 3 2020/7/8 22:04:12; expire 4 2020/7/9 06:50:54; }
/var/log/dhclient.leases.igb0: PREVIOUS version
lease { interface "igb0"; fixed-address 2.106.185.197; option subnet-mask 255.255.255.192; option routers 2.106.185.193; option domain-name-servers 193.162.153.164,194.239.134.83; option host-name "x1-6-40-62-31-0b-a7-d9"; option domain-name "webspeed.dk"; option interface-mtu 576; option broadcast-address 255.255.255.255; option dhcp-lease-time 257956; option dhcp-message-type 5; option dhcp-server-identifier 80.62.121.174; renew 1 2020/7/6 05:03:03; rebind 2 2020/7/7 07:55:13; expire 2 2020/7/7 16:52:41; }
lease { interface "igb0"; fixed-address 2.106.185.197; option subnet-mask 255.255.255.192; option routers 2.106.185.193; option domain-name-servers 193.162.153.164,194.239.134.83; option host-name "x1-6-40-62-31-0b-a7-d9"; option domain-name "webspeed.dk"; option interface-mtu 576; option broadcast-address 255.255.255.255; option dhcp-lease-time 245012; option dhcp-message-type 5; option dhcp-server-identifier 80.62.121.174; renew 1 2020/7/6 06:50:54; rebind 2 2020/7/7 08:22:10; expire 2 2020/7/7 16:52:40; }
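As a worked check on the timers in the lease files above, here is a small Python sketch. It takes the first ISP lease from the MOST RECENT file (dhcp-lease-time 259200) and works out where its renew and rebind times fall within the lease. The helper name `timer_fractions` is made up for illustration, and the sketch assumes the lease started at the expire time minus dhcp-lease-time, which the file itself does not state.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the lease file, without the leading weekday number.
STAMP_FORMAT = "%Y/%m/%d %H:%M:%S"

def timer_fractions(lease_seconds, renew, rebind, expire):
    """Return (renew, rebind) as fractions of the lease, measured from its assumed start."""
    lease = timedelta(seconds=lease_seconds)
    expire_at = datetime.strptime(expire, STAMP_FORMAT)
    # Assumption: the lease began exactly one lease time before it expires.
    start = expire_at - lease
    renew_at = datetime.strptime(renew, STAMP_FORMAT)
    rebind_at = datetime.strptime(rebind, STAMP_FORMAT)
    return (renew_at - start) / lease, (rebind_at - start) / lease

# Values copied from the first lease in the MOST RECENT file.
fractions = timer_fractions(
    259200,
    "2020/7/7 18:50:54",
    "2020/7/8 21:50:54",
    "2020/7/9 06:50:54",
)
print(fractions)  # (0.5, 0.875)
```

For this lease that is a renew attempt 36 hours in and a rebind attempt 63 hours in, which lines up with the default T1 and T2 values (0.5 and 0.875 of the lease time) in RFC 2131.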
-
There are a bunch of leases from the modem shown: 192.168.100.10.
You should set the dhcp client to reject leases from 192.168.100.1.
That usually only happens when the modem loses sync with the cable, so it implies the modem is losing sync. Possibly some upstream issue.
I would use hybrid outbound NAT mode there but it shouldn't be affecting this.
Steve
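For anyone who wants to check their own lease file for the same pattern, here is a minimal Python sketch. It assumes the lease layout shown in the first post; `private_server_leases` and the abridged `SAMPLE` text are illustrative names and data, not part of pfSense or dhclient. It lists every lease whose DHCP server identifier is a private address, which is how the modem-issued leases show up.

```python
import ipaddress
import re

# One "lease { ... }" block; the bodies in this file contain no nested braces.
LEASE_RE = re.compile(r"lease\s*\{(.*?)\}", re.DOTALL)
# The two fields needed: the leased address and the server that issued it.
FIELD_RE = re.compile(r"(fixed-address|option dhcp-server-identifier)\s+([0-9.]+);")

def private_server_leases(text):
    """Return (leased_address, server) pairs for leases issued by a private-range server."""
    hits = []
    for body in LEASE_RE.findall(text):
        fields = dict(FIELD_RE.findall(body))
        server = fields.get("option dhcp-server-identifier")
        address = fields.get("fixed-address")
        if server and ipaddress.ip_address(server).is_private:
            hits.append((address, server))
    return hits

# Abridged from the lease file in the first post (most options omitted).
SAMPLE = '''
lease { interface "igb0"; fixed-address 2.106.185.197; option dhcp-server-identifier 80.62.121.174; }
lease { interface "igb0"; fixed-address 192.168.100.10; option dhcp-server-identifier 192.168.100.1; }
'''

print(private_server_leases(SAMPLE))
```

To run it against a real file, read the file contents into a string and pass that in place of `SAMPLE`.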
-
Thanks for your suggestion. I will try to set that, however I find it weird that this just started happening. I have not changed anything for a long time.
-
Yup. And that can be some upstream problem. Cable degraded causing the modem to lose sync for example.
-
Called my ISP, they're looking into the issue as well. Will report back when/if something happens again.
-
Now there's new stuff in the logs:
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 kernel arpresolve: can't allocate llinfo for 2.106.185.193 on igb0
Jul 6 15:12:34 check_reload_status rc.newwanip starting igb0
Jul 6 15:12:34 check_reload_status Restarting ipsec tunnels
Jul 6 15:12:35 php-fpm 78624 /rc.newwanip: rc.newwanip: Info: starting on igb0.
Jul 6 15:12:35 php-fpm 78624 /rc.newwanip: rc.newwanip: on (IP address: 2.106.185.197) (interface: WAN[wan]) (real interface: igb0).
Jul 6 15:12:35 dhcpleases /etc/hosts changed size from original!
Jul 6 15:12:35 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Jul 6 15:12:36 dhcpleases /etc/hosts changed size from original!
Jul 6 15:12:36 php-fpm 78624 /rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 2.106.185.193
Jul 6 15:12:36 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Jul 6 15:12:37 dhcpleases kqueue error: unknown
Jul 6 15:12:38 check_reload_status updating dyndns wan
Jul 6 15:12:39 php-fpm 27344 /rc.dyndns.update: phpDynDNS (): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Jul 6 15:12:40 dhcpleases /etc/hosts changed size from original!
Jul 6 15:12:40 php-fpm 67720 /interfaces.php: Removing static route for monitor 8.8.8.8 and adding a new route through 2.106.185.193
Jul 6 15:12:40 check_reload_status Reloading filter
Jul 6 15:12:40 php-fpm 67720 /interfaces.php: Creating rrd update script
Jul 6 15:12:40 snmpd 84244 disk_OS_get_disks: adding device 'ada0' to device list
Jul 6 15:12:41 php-fpm 78624 /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1594041161] unbound[22452:0] error: bind: address already in use [1594041161] unbound[22452:0] fatal error: could not open ports'
-
@jim82 said in WAN loses connection randomly every 24-36 hours - tried everything:
Called my ISP, they're looking into the issue as well. Will report back when/if something happens again.
@stephenw10 is 100% spot on. Likely upstream issue with your provider.
When you have a cable modem, your pfSense WAN is configured to use DHCP to obtain its IP address. The cable modem is the provider of that DHCP address. The cable modem operates in two modes. The normal mode, when it is properly synchronized with the CMTS (cable modem termination system) equipment of your cable ISP, is for the cable modem to pass the IP it gets from the CMTS on to pfSense using DHCP. However, when an issue occurs out in the cable TV coax plant such that your modem loses sync with the CMTS, then the cable modem goes into "private DHCP mode" and issues attached devices a local non-routable IP address (typically in the 192.168.100.0/24 subnet).
So long as pfSense sees an IP address on its WAN, it thinks things are fine and tries to use it. But that 192.168.100.x IP is not good for the internet, so nothing works properly. Ideally, when the modem loses sync and goes into that private DHCP mode, it should tell pfSense when it gets back in touch with the CMTS and obtains a proper IP address and let pfSense restart the WAN interface with the new address. Unfortunately that can sometimes not happen properly and pfSense winds up stuck with the non-routable private address.
Telling pfSense to ignore DHCP offers from the cable modem's private DHCP pool (that 192.168.100.0/24 pool) can prevent pfSense from getting "stuck" on the private non-routable IP. That still won't stop your upstream issue that is the root cause, but it will prevent pfSense from getting stuck on one of those non-routable IP addresses whenever loss of sync happens.
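The reject rule described above boils down to a check on who sent the offer. Here is a rough Python sketch of that decision. It is not pfSense's or dhclient's actual code: `accept_offer` and `REJECTED_SERVERS` are made-up names, and matching on the server address 192.168.100.1 is an assumption taken from the suggestion earlier in this thread.

```python
import ipaddress

# Servers whose offers should be ignored (here, the modem's built-in DHCP server).
REJECTED_SERVERS = {ipaddress.ip_address("192.168.100.1")}

def accept_offer(server_id, rejected=REJECTED_SERVERS):
    """Return True if an offer from this DHCP server should be accepted."""
    return ipaddress.ip_address(server_id) not in rejected

# (server identifier, offered address) pairs as seen in the lease file.
offers = [
    ("192.168.100.1", "192.168.100.10"),  # modem's private offer while out of sync
    ("80.62.121.174", "2.106.185.197"),   # ISP's offer once the modem is back in sync
]

accepted = [offered for server, offered in offers if accept_offer(server)]
print(accepted)
```

With the modem's server on the reject list, the client ignores the private offer and keeps waiting until the ISP's server answers, so it never settles on the 192.168.100.x address.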
-
You are on a YouSee connection.
The Sagem modem must be put in bridge mode and be fully transparent. YouSee support can take care of that.
Next, configure DHCP on pfSense and leave it at that.
Then it works.
-
@bmeeks Thanks a lot for the in-depth explanation. That sure makes sense. Hoping my ISP is able to see why it's losing sync.
-
Current status:
Talked to a technician from my ISP today. He reports that the Sagem modem apparently has issues with intermittent "freeze-ups" when OFDM channels are activated on the upstream for the DOCSIS 3.1 standard.
For now, he has deactivated those channels on my connection, which apparently removes the "freeze" problem. Fingers crossed, that it will also remove my sync problem.
New firmware is being worked on.
-
Huh, that sounds fun. But that's a good response from an ISP, most would never admit any fault exists.
Steve
-
So far, so good. It's been 48+ hours since the last dropout. Looks like the removal of the OFDM upstream channels solved the issue. Now I'm waiting for the new firmware to arrive.
Not a single error in the logs since. Thanks for your help so far.
-
14 days after and my connection is rock solid. Thanks for your help.