Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Router Locking Up (maybe due to excessive lan traffic?)

    Scheduled Pinned Locked Moved General pfSense Questions
    64 Posts 6 Posters 5.6k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • stephenw10S
      stephenw10 Netgate Administrator @Ximulate
      last edited by

      @Ximulate said in Router Locking Up (maybe due to excessive lan traffic?):

      Feb 25 08:03:15 router dhclient[20807]: DHCPREQUEST on igb1 to 255.255.255.255 port 67
      Feb 25 08:03:15 router dhclient[20807]: DHCPACK from 192.168.225.1
      Feb 25 08:03:15 router dhclient[20807]: unknown dhcp option value 0x78
      Feb 25 08:03:15 router dhclient[32362]: REBOOT
      Feb 25 08:03:15 router dhclient[33631]: Starting add_new_address()
      Feb 25 08:03:15 router dhclient[33858]: ifconfig igb1 inet 192.0.0.2 netmask 255.255.255.224 broadcast 192.0.0.31
      Feb 25 08:03:16 router dhclient[34940]: New IP Address (igb1): 192.0.0.2
      Feb 25 08:03:16 router dhclient[35854]: New Subnet Mask (igb1): 255.255.255.224
      Feb 25 08:03:16 router dhclient[36318]: New Broadcast Address (igb1): 192.0.0.31
      Feb 25 08:03:16 router dhclient[36763]: New Routers (igb1): 192.0.0.1
      Feb 25 08:03:16 router dhclient[37940]: Adding new routes to interface: igb1

      The only potential problem I see there is the two completely different subnets shown there for the igb1 WAN. That implies it may be pulling an IP from a different DHCP server at some point.

      X 1 Reply Last reply Reply Quote 0
      • X
        Ximulate @stephenw10
        last edited by

        @stephenw10 OK, I may have just found something.

        192.168.255.1 is the new cell modem web interface. I vaguly recall when setting up, I couldn't access the modem's use interface & at some point in the trouble shooting I added a VIP 192.168.255.2 which I guess worked. I removed that VIP just now, & instead of getting the modem user interface I'm getting the wISP AirOS config portal.

        I guess I'm going to try to change the default address used by the modem. I'm still not sure where the 192.0.0.X is coming from.

        X 1 Reply Last reply Reply Quote 0
        • X
          Ximulate @Ximulate
          last edited by Ximulate

          So I changed the default IP for the modem's user interface. I still had to add a VIP to the secondary WAN interface to access the modem's interface.

          After rebooting, I noticed the gateway for the cell modem is 192.0.0.1 but its now offline.
          Edit: After several minutes (say 5), the gateway IP updated to what looks like a proper address & is reporting online.

          1 Reply Last reply Reply Quote 0
          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Ok so it could be the cell modem serving it's own subnet via DHCP if it loses cell signal. You might have to reject leases from it to prevent that.
            192.0.0.1 could be really what the ISP is using even if they probably shouldn't!

            X 1 Reply Last reply Reply Quote 0
            • X
              Ximulate @stephenw10
              last edited by

              Thanks I added 192.0.0.1 to "Reject leases from" in the interface. For kicks, I decided to reboot the cell modem. Though the Primary WAN was up, connectivity if the network went to pot. Here's the logs, filtered for "192.0"

              Feb 26 15:41:12 	php-fpm 	37729 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
              Feb 26 15:40:37 	php-fpm 	14962 	8.8.8.8|192.0.0.2|GW_Cellular|306.312ms|389.629ms|0.0%|online|delay
              Feb 26 15:40:26 	php-fpm 	37729 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
              Feb 26 15:39:43 	php-fpm 	401 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
              Feb 26 15:39:41 	php-fpm 	37729 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
              Feb 26 15:39:07 	kernel 		arpresolve: can't allocate llinfo for 192.0.0.1 on igb1
              Feb 26 15:38:59 	rc.gateway_alarm 	72234 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              Feb 26 15:38:59 	php-fpm 	401 	/rc.newwanip: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
              Feb 26 15:38:59 	php-fpm 	37729 	/rc.dyndns.update: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
              Feb 26 15:38:59 	php-fpm 	36581 	/rc.filter_configure_sync: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
              Feb 26 15:38:58 	php-fpm 	401 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
              Feb 26 15:38:19 	php-fpm 	36581 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
              Feb 26 15:38:16 	php-fpm 	401 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
              Feb 26 15:37:34 	php-fpm 	36581 	/rc.newwanip: The command '/usr/local/bin/dpinger -S -r 0 -i GW_Cellular -B 192.0.0.2 -p /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.pid -u /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock -C "/etc/rc.gateway_alarm" -d 1 -s 2500 -l 5000 -t 60000 -A 5000 -D 350 -L 15 8.8.8.8 >/dev/null' returned exit code '1', the output was ''
              Feb 26 15:37:34 	rc.gateway_alarm 	12798 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              Feb 26 15:36:50 	php-fpm 	36581 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
              Feb 26 15:36:50 	php-fpm 	54361 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 192.168.5.145 -> 192.0.0.2 - Restarting packages.
              Feb 26 15:36:07 	rc.gateway_alarm 	92783 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              Feb 26 15:36:07 	php-fpm 	54361 	/rc.newwanip: dpinger: cannot connect to status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock - No such file or directory (2)
              Feb 26 15:36:05 	php-fpm 	54361 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
              Feb 26 15:35:58 	php-fpm 	54361 	8.8.8.8|192.0.0.2|GW_Cellular|51.916ms|16.283ms|54%|down|highloss
              Feb 26 15:35:28 	php-fpm 	54361 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
              Feb 26 15:34:45 	rc.gateway_alarm 	10879 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
              Feb 26 15:34:40 	php-fpm 	54361 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
              
              1 Reply Last reply Reply Quote 0
              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                If anything I would expect 192.0.0.X to be the real connection and 192.168.225.1 to be something local. However that isn't the modem subnet it's gui seems to be using.

                So if that fails I'd try refusing leases from 192.168.225.1 instead.

                X 2 Replies Last reply Reply Quote 0
                • X
                  Ximulate @stephenw10
                  last edited by

                  I reached out to the cellular modem manufacturer, who was helpful. Apparently some of my modem config was wrong, so that now appears to be straighten out. However, I'm continuing to experience issues.
                  https://wirelessjoint.com/viewtopic.php?t=4191

                  Reviewing the logs from the last two lock-ups, I see the following happening several minutes beforehand. I was not in this morning, but saw my blink cams & several other devices went offline. A few hours later, the alarm monitoring company called to report a com failure (which means the alarm was able to communicate for some time after the issue started.) Also in the logs I've noticed that both gateways report packetloss/offline within a few seconds of each other.

                  Mar  2 09:43:43 router unbound[12219]: [12219:1] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
                  Mar  2 09:43:43 router unbound[12219]: [12219:1] notice: ssl handshake failed 10.111.11.118 port 53295
                  Mar  2 09:44:22 router unbound[12219]: [12219:3] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
                  Mar  2 09:44:22 router unbound[12219]: [12219:3] notice: ssl handshake failed 10.111.11.115 port 62052
                  Mar  2 09:44:22 router unbound[12219]: [12219:2] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
                  Mar  2 09:44:22 router unbound[12219]: [12219:2] notice: ssl handshake failed 10.111.11.115 port 62053
                  Mar  2 09:45:08 router filterdns[36239]: merge_config: configuration reload
                  Mar  2 09:45:08 router filterdns[36239]: 	Adding Action: pf table: networkABC host: abc.duckdns.org
                  Mar  2 09:45:08 router filterdns[36239]: 	Adding Action: pf table: network123 host: 123.duckdns.org
                  [More of the above, then]
                  Mar  2 09:46:08 router filterdns[36239]: failed to resolve host ntp.org will retry later again.
                  Mar  2 09:46:08 router filterdns[36239]: failed to resolve host abc.duckdns.org will retry later again.
                  Mar  2 09:46:08 router filterdns[36239]: failed to resolve host 123.duckdns.org will retry later again.
                  
                  1 Reply Last reply Reply Quote 0
                  • X
                    Ximulate @stephenw10
                    last edited by

                    @stephenw10
                    It seems like something in the router is preventing IPs (not just DNS resolving) from being found

                    1 Reply Last reply Reply Quote 0
                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      What are those hosts at 10.111.11.115 and 10.111.11.118?

                      X 1 Reply Last reply Reply Quote 0
                      • X
                        Ximulate @stephenw10
                        last edited by

                        @stephenw10
                        .115 is an iPhone. Interesting thing here is that device would not have been on the network at that time (09:44). This episode & the one before, .115 was in the logs reporting the same thing minutes before the issue started (it would have been on the network the previous episode.)

                        Not sure yet what .118 is. It might be another iPhone.

                        1 Reply Last reply Reply Quote 0
                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Is it possible the system clock is wrong?

                          X 1 Reply Last reply Reply Quote 0
                          • X
                            Ximulate @stephenw10
                            last edited by

                            @stephenw10 no, the time is correct

                            X 1 Reply Last reply Reply Quote 0
                            • X
                              Ximulate @Ximulate
                              last edited by Ximulate

                              This morning all seemed fine (smart TV was working, not complaints from the alarm, etc) until I logged into my desktop PC and it would not load local (i.e. pfSense GUI) or WWW pages. I tried to SSH into the router, but no joy.

                              The logs are relatively quite from midnight until the time I power cycled the router. I did not see any "ssl handshake failed crypto error."

                              There were several "filterdns 18089 Adding Action: pf table: XYZ host: xxx.xxx.xxx.xxx" prior to rebooting. I've seen this in the logs prior to other failures.

                              I also noticed serveral ntpd logs like this:

                              Mar 6 08:31:31 	ntpd 	87972 	Soliciting pool server 45.83.234.123
                              

                              Tried "ntpq -c pe" per stack exchange post, which if I understand correctly st:16 means out of sync:

                              =============================================================================
                              0.pfsense.pool. .POOL. 16 p - 64 0 0.000 +0.000 0.000
                              1.pool.ntp.org .POOL. 16 p - 64 0 0.000 +0.000 0.000
                              2.pool.ntp.org .POOL. 16 p - 64 0 0.000 +0.000 0.000
                              3.pool.ntp.org .POOL. 16 p - 64 0 0.000 +0.000 0.000
                              *65-100-46-166.d .SOCK. 1 u 38 128 377 74.742 +2.404 1.990
                              +ns1.your-site.c 216.218.254.202 3 u 58 128 377 72.907 +1.203 5.012
                              +104.156.246.53 204.9.54.119 2 u 103 128 377 40.948 -0.402 6.028

                              Dashboard shows correct time

                              X stephenw10S 2 Replies Last reply Reply Quote 0
                              • X
                                Ximulate @Ximulate
                                last edited by

                                Dashboard shows correct time

                                NTP service is enabled with
                                0.pfsense.pool.ntp.org
                                1.pool.ntp.org
                                2.pool.ntp.org
                                3.pool.ntp.org

                                Same timeservers are input into System > General > Timeservers
                                I don't see any firewall rules that would block NTP requests.
                                I'm disabling NTP Server, as I don't think I'm using it.
                                I'm assuming the other timeservers listed in the ntpq results are requests from LAN devices

                                1 Reply Last reply Reply Quote 0
                                • stephenw10S
                                  stephenw10 Netgate Administrator @Ximulate
                                  last edited by

                                  @Ximulate said in Router Locking Up (maybe due to excessive lan traffic?):

                                  I logged into my desktop PC and it would not load local (i.e. pfSense GUI) or WWW pages. I tried to SSH into the router, but no joy.

                                  How was it failing? Is it a DNS resolution failure? The services actually stopped responding on the firewall?

                                  Progressively failing services like that could be a disk issue. Do you see gaps in the logging after recovering access?

                                  X 1 Reply Last reply Reply Quote 0
                                  • X
                                    Ximulate @stephenw10
                                    last edited by Ximulate

                                    @stephenw10

                                    No, the IP addresses appear to being dropped as if dhcp is failing or devices are not able to see other devices. In other words, if I type the pfSense router IP address into the browser it does not load... the browser does not see the pfSense gui. Once this happens, the only way I'm able to recover access is power cycling the router.

                                    At one point, I had my laptop connected to the serial console of the router. I was usually able to access the command menu that way. Occasionally, I could RPD to the laptop to access the command menu but that would normally not work either.

                                    I think I've tried this already, but I think I'll manually set the the IP address of my desktop & laptop to see if they still communicate next time the network fails. Currently pfSense is handling out static leases to my desktop & a few other items, and dynamic to the rest.

                                    On the rare occasion that I catch the network acting up but can get to the router gui, I have not seen any failing services. I have also tried the pfsense tools in the CLI lile "playback restartallwan" without success. Reboot was required.

                                    1 Reply Last reply Reply Quote 0
                                    • stephenw10S
                                      stephenw10 Netgate Administrator
                                      last edited by

                                      Was the console responsive if you were at the laptop connected to it directly?

                                      X 1 Reply Last reply Reply Quote 0
                                      • X
                                        Ximulate @stephenw10
                                        last edited by Ximulate

                                        @stephenw10 To the best of my recollection, at least within the last few weeks, the console has always been accessable via serial.

                                        1 Reply Last reply Reply Quote 0
                                        • stephenw10S
                                          stephenw10 Netgate Administrator
                                          last edited by

                                          Ok then I'd try to connect out from it when this happens and see what (if anything) still works.

                                          X 1 Reply Last reply Reply Quote 0
                                          • X
                                            Ximulate @stephenw10
                                            last edited by

                                            @stephenw10 maybe I misunderstood your last question. When the network/router fails, I have been able to access the console via serial connection but devices on the network/router still do not communicate. I've tried restarting php, restarting the web configurator, using the playback scripts in the tools... none of those resolve the issue, except rebooting

                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.