Router Locking Up (maybe due to excessive lan traffic?)
-
Hmm, nothing shown there at all. No responses from any DHCP server. That was during some outage I assume?
-
@stephenw10
Right, that was about the time the network failure occurred yesterday. I don't know exactly when it started failing, but I was troubleshooting prior to having to power cycle at 21:47 (see log in yesterday's post).
Mar 13 19:44:30 router dhcpd[74227]: DHCPDISCOVER from 30:e9:50:8e:e2:91 (Sweeper_p100) via igb2.31
Mar 13 19:44:30 router dhcpd[74227]: DHCPOFFER on 10.31.11.235 to 30:e9:50:8e:e2:91 (Sweeper_p100) via igb2.31
Mar 13 19:44:31 router dhcpd[74227]: reuse_lease: lease age 1510 (secs) under 25% threshold, reply with unaltered, existing lease for 10.31.11.235
Mar 13 19:44:31 router dhcpd[74227]: DHCPREQUEST for 10.31.11.235 (10.31.11.1) from 30:e9:50:8e:e2:91 (Sweeper_p100) via igb2.31
Mar 13 19:48:49 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:48:51 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:48:54 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:48:59 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:04 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:10 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:22 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:31 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:39 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:49:50 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:50:13 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:50:29 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:50:59 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:51:47 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:52:31 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:52:54 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:53:04 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:53:17 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:53:24 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:53:31 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:53:45 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:54:17 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:54:29 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:54:39 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:54:51 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:55:13 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:55:38 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:55:45 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:55:56 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:56:11 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:56:22 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:56:32 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:56:47 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:57:17 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:57:53 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:58:34 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 19:58:56 router dhcpd[74227]: DHCPREQUEST for 10.11.11.64 from 00:15:88:6b:bb:4d via igb2.11
Mar 13 19:58:56 router dhcpd[74227]: DHCPACK on 10.11.11.64 to 00:15:88:6b:bb:4d via igb2.11
Mar 13 19:59:28 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 20:00:36 router dhclient[84101]: DHCPREQUEST on igb1 to 192.0.0.1 port 67
Mar 13 20:01:32 router dhcpd[74227]: DHCPREQUEST for 10.111.11.138 from 82:58:13:8d:90:9a (Pixel-8) via igb2.111
Mar 13 20:01:32 router dhcpd[74227]: DHCPACK on 10.111.11.138 to 82:58:13:8d:90:9a (Pixel-8) via igb2.111
Then there are a lot of the following DHCP log entries just prior to the power cycle:
Mar 13 21:29:49 router dhclient[9249]: FAIL
Mar 13 21:30:04 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 1
Mar 13 21:30:05 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:30:07 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 5
Mar 13 21:30:12 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 6
Mar 13 21:30:18 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 15
Mar 13 21:30:33 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 18
Mar 13 21:30:51 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 12
Mar 13 21:31:03 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:31:05 router dhclient[50960]: No DHCPOFFERS received.
Mar 13 21:31:05 router dhclient[50960]: No working leases in persistent database - sleeping.
Mar 13 21:31:05 router dhclient[53331]: FAIL
Mar 13 21:31:20 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:31:22 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 4
Mar 13 21:31:26 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 4
Mar 13 21:31:30 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 10
Mar 13 21:31:40 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 13
Mar 13 21:31:53 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 16
Mar 13 21:32:06 router dhcpd[82618]: uid lease 10.111.11.126 for client 48:a9:44:91:ec:4f is duplicate on 10.111.11.0/24
Mar 13 21:32:06 router dhcpd[82618]: DHCPREQUEST for 10.111.11.51 from 48:e9:44:91:ec:4f via igb2.111
Mar 13 21:32:06 router dhcpd[82618]: DHCPACK on 10.111.11.51 to 48:a9:44:91:ec:4f via igb2.111
Mar 13 21:32:09 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 12
Mar 13 21:32:13 router dhcpd[82618]: DHCPREQUEST for 10.0.11.121 from 78:a9:58:dd:69:cf (MstrCloset) via igb2
Mar 13 21:32:13 router dhcpd[82618]: DHCPACK on 10.0.11.121 to 78:a9:58:dd:69:cf (MstrCloset) via igb2
Mar 13 21:32:18 router dhcpd[82618]: DHCPREQUEST for 10.11.11.62 from dc:a6:32:9a:88:8c via igb2.11
Mar 13 21:32:18 router dhcpd[82618]: DHCPACK on 10.11.11.62 to dc:a6:32:9a:88:8c via igb2.11
Mar 13 21:32:21 router dhclient[50960]: No DHCPOFFERS received.
Mar 13 21:32:21 router dhclient[50960]: No working leases in persistent database - sleeping.
Mar 13 21:32:21 router dhclient[22]: FAIL
Mar 13 21:32:36 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 1
Mar 13 21:32:37 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:32:39 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:32:41 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 5
Mar 13 21:32:44 router dhcpd[82618]: DHCPREQUEST for 10.11.11.91 from 00:e9:9d:db:a6:54 via igb2.11
Mar 13 21:32:44 router dhcpd[82618]: DHCPACK on 10.11.11.91 to 00:e9:9d:db:a6:54 via igb2.11
Mar 13 21:32:46 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 9
Mar 13 21:32:55 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 16
Mar 13 21:33:11 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 7
Mar 13 21:33:18 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 11
Mar 13 21:33:29 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 8
Mar 13 21:33:37 router dhclient[50960]: No DHCPOFFERS received.
Mar 13 21:33:37 router dhclient[50960]: No working leases in persistent database - sleeping.
Mar 13 21:33:37 router dhclient[79182]: FAIL
Mar 13 21:33:52 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:33:54 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 4
Mar 13 21:33:58 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 11
Mar 13 21:34:09 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 11
Mar 13 21:34:20 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 12
Mar 13 21:34:32 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 15
Mar 13 21:34:47 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 6
Mar 13 21:34:53 router dhclient[50960]: No DHCPOFFERS received.
Mar 13 21:34:53 router dhclient[50960]: No working leases in persistent database - sleeping.
Mar 13 21:34:53 router dhclient[36776]: FAIL
Mar 13 21:35:08 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 1
Mar 13 21:35:09 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 13 21:35:11 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 5
Mar 13 21:35:16 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 10
Mar 13 21:35:18 router dhcpd[82618]: DHCPREQUEST for 10.0.11.33 from 78:a9:58:46:f9:44 via igb2
Mar 13 21:35:18 router dhcpd[82618]: DHCPACK on 10.0.11.33 to 78:a9:58:46:f9:44 via igb2
Mar 13 21:35:26 router dhclient[50960]: DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 13
Mar 13 21:47:13 router dhclient[30088]: PREINIT
I just noticed the following in the log above; the client is our printer. However, there is no duplicate in the current DHCP leases.
Mar 13 21:32:06 router dhcpd[82618]: uid lease 10.111.11.126 for client 48:a9:44:91:ec:4f is duplicate on 10.111.11.0/24
Mar 13 21:32:06 router dhcpd[82618]: DHCPREQUEST for 10.111.11.51 from 48:e9:44:91:ec:4f via igb2.111
Mar 13 21:32:06 router dhcpd[82618]: DHCPACK on 10.111.11.51 to 48:a9:44:91:ec:4f via igb2.111
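To get a quick count of how often the WAN client went unanswered, the log excerpts above could be summarised with something like the following. This is only a rough sketch: the function name `summarize_dhclient` is made up for illustration, and it assumes the dhclient message formats look exactly like the lines pasted above.

```shell
#!/bin/sh
# Summarise WAN-side dhclient activity from syslog-style lines on stdin.
# Only dhclient (client) messages are counted; dhcpd (LAN server) lines are ignored.
# Assumption: message wording matches the excerpts in this thread.
summarize_dhclient() {
    awk '
        /dhclient\[[0-9]+\]: DHCPDISCOVER/           { discover++ }
        /dhclient\[[0-9]+\]: DHCPREQUEST/            { request++ }
        /dhclient\[[0-9]+\]: DHCPACK/                { ack++ }
        /dhclient\[[0-9]+\]: No DHCPOFFERS received/ { nooffer++ }
        END {
            printf "discover=%d request=%d ack=%d no_offers=%d\n",
                   discover, request, ack, nooffer
        }
    '
}
```

Usage would be along the lines of `summarize_dhclient < /var/log/dhcpd.log`, where the log path is an assumption and may differ between versions. Many requests or discovers with `ack=0` would match what the excerpts show: the WAN side never answers.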
-
Hmm, so it appears that when this goes down there are no responding DHCP servers on the WAN. So either the modem stops both bridging the ISP DS-Lite connection and handing out its own leases, or the NIC in pfSense is not actually sending the DHCP requests (or not seeing the responses).
Are you able to try to capture traffic on the WAN side?
Since you're already behind double NAT you might just try disabling the IP pass through in the modem. That would avoid the IP/subnet change. It might also prevent the modem failing if that's what is happening.
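For the WAN-side capture, a minimal sketch of a DHCP-only tcpdump on the router itself might look like this. The interface `igb1` is taken from the dhclient lines in the log; the output path and the helper name `dhcp_capture_cmd` are placeholders. Note that a capture taken on the router only shows what the OS hands to and receives from the NIC driver, so it cannot fully rule out the NIC itself.

```shell
#!/bin/sh
# Build a tcpdump command that captures only DHCP traffic on the WAN NIC.
# Assumptions: igb1 is the WAN interface (as in the dhclient log lines);
# /tmp/wan_dhcp.pcap is a placeholder output path.
WAN_IF="${WAN_IF:-igb1}"
PCAP="${PCAP:-/tmp/wan_dhcp.pcap}"

dhcp_capture_cmd() {
    # -n: no name resolution, -i: interface, -w: write raw packets to a file
    echo "tcpdump -n -i $WAN_IF -w $PCAP udp port 67 or udp port 68"
}

# Print the command for review; run it manually as root once it looks right.
dhcp_capture_cmd
```

The resulting file can be opened in Wireshark later to see whether DISCOVER/REQUEST packets leave and whether any OFFER/ACK comes back.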
-
@stephenw10
It would take some time for me to set up, but I could probably set up Wireshark on a RasPi and a small switch to log packets (I think that's what you're suggesting). There is a way to disable IP passthrough in the modem, so I'll try that next.
-
Yep, an ideal test there would be capturing packets from a switch mirror port, but that's clearly quite involved to get in place.
That would prove it's not an issue with the NIC though.
-
@stephenw10 So I'm back at this.
I had all but decided it's time to buy a new router, but had to go out of town, so I threw together the script described below, executed by cron every 12 minutes. That was about 3 weeks ago. The router hasn't had any issue until today. The only reason it was an issue was that I was testing the back-up connection by killing power to the primary WAN modem. Failover was fine, but when I brought the primary WAN back online the LAN comms would die. Looking at the log generated by my script, it appears the script has cycled the WAN several times over the last few weeks, which makes me think the script mostly works, but I'm not sure why, or what problem it's resolving by cycling the WAN.
gwstat=$(pfSsh.php playback gatewaystatus)
WAN_STATE=$(echo "$gwstat" | awk '/'$GW_ID'/ { print $7}')
if [ "$WAN_STATE" = "online" ]; then exit; fi
echo "WAN Cycling on $varDate" >> "$log"
# Turn off modem using Hue Appliance Plug (doubtful this works, because LAN comms are down so the Hue Hub is probably unreachable. Just now modified script to log output)
/usr/local/bin/curl -X PUT -H "Content-Type: application/json" -d '{"on": false}' "$url" >> "$log"
ifconfig igb0 down
# Sleep, then bring up modem
sleep 30
/usr/local/bin/curl -X PUT -H "Content-Type: application/json" -d '{"on": true}' "$url" >> "$log"
ifconfig igb0 up
-
Hmm, well if you kill the WAN connection deliberately that script is going to continually try to cycle the WAN modem. Though I'm not sure why that would prevent it coming back up or otherwise kill the LAN side connectivity.
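One way to stop a cron-driven script from cycling the modem on every run while the WAN is deliberately down would be a simple cooldown guard. The sketch below is hypothetical: `should_cycle`, `STAMP` and `COOLDOWN` are illustrative names that are not part of the script above, and the one-hour default is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical guard: allow a WAN cycle only if the previous one was long enough ago.
# STAMP and COOLDOWN are illustrative names, not part of the original script.
STAMP="${STAMP:-/tmp/wan_cycle.stamp}"
COOLDOWN="${COOLDOWN:-3600}"   # seconds that must pass between permitted cycles

should_cycle() {
    now=$(date +%s)
    last=0
    if [ -f "$STAMP" ]; then
        last=$(cat "$STAMP" 2>/dev/null)
        # Treat an empty or non-numeric stamp as "never cycled".
        case "$last" in
            ''|*[!0-9]*) last=0 ;;
        esac
    fi
    # Refuse if the previous cycle is more recent than the cooldown.
    if [ $((now - last)) -lt "$COOLDOWN" ]; then
        return 1
    fi
    # Record this cycle and allow it.
    echo "$now" > "$STAMP"
    return 0
}
```

In the script, a line such as `should_cycle || exit 0` placed just before the curl/ifconfig steps would skip the cycle when one already happened within the cooldown window.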
-
One thing that hasn't been mentioned here is Memtest. I’d suggest running Memtest to see if there are any memory errors.
Second, test another power supply. I recently had a client with the same problem due to a faulty 12V rail.
Regards
-
Network just went down again, kind of, and through no action of mine. This time, I was able to load internet webpages through my fail-over connection (so that was working). I could ping the router over the LAN. I really don't think the primary WAN was down, as once I power cycled it was online. I also confirmed that the power plug was on (so everything was getting power and should be online.)
However, the router GUI was reporting a 504 error (white page with black text served by nginx). I was able to SSH into the router, so I tried to restart PHP-FPM. This allowed me to load the GUI login, but after entering credentials the browser waited, waited, waited and then went back to the 504 error. I ended up power cycling.
Per @VioletDragon's suggestion, I just replaced the power supply. Earlier in the troubleshooting process, I think I already tried another new power supply (maybe 3rd's the charm), and I had already run memtest overnight with no errors reported.
-
@Ximulate Just out of interest, can you share your settings under System -> Advanced -> Network Interfaces?
Are you routing at Layer 2 or 3? Has the MTU been left at the default 1500 (no jumbo frames)?
Have you tested a fresh install?
Regards.
-
@VioletDragon
Hardware Checksum Offloading [Unchecked - enabled]
Hardware TCP Segmentation Offloading [Checked - disabled]
Hardware Large Receive Offloading [Checked - disabled]
hn ALTQ support [checked - enabled]
ARP Handling [unchecked - do not suppress]
Reset All States [unchecked - do not reset]
-
@VioletDragon
MTU is blank on all interfaces, so I assume default / 1500.
Insofar as I understand OSI, it's all Layer 3. It's all firewall rules, no Ethernet rules.
No, I haven't tried a fresh install. I guess I should do that.
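Rather than assuming the MTU from the blank GUI fields, it could be read from the interfaces directly. A small sketch, assuming FreeBSD-style `ifconfig` output where each interface header line contains `flags=` and `mtu <value>`; the helper name `list_mtu` is made up for illustration.

```shell
#!/bin/sh
# Print "<interface> <mtu>" for each interface header in ifconfig output on stdin.
# Assumption: header lines look like "igb0: flags=...<...> metric 0 mtu 1500".
list_mtu() {
    awk '
        /^[A-Za-z0-9_.]+: flags=/ {
            name = $1
            sub(/:$/, "", name)          # strip the trailing colon from the name
            for (i = 2; i < NF; i++)
                if ($i == "mtu") print name, $(i + 1)
        }
    '
}
```

Usage: `ifconfig | list_mtu`. Any VLAN or physical interface reporting something other than 1500 would be worth a closer look.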