Netgate Discussion Forum

    Router Locking Up (maybe due to excessive lan traffic?)

      Ximulate @Ximulate

      So I changed the default IP for the modem's user interface. I still had to add a VIP to the secondary WAN interface to access the modem's interface.

      After rebooting, I noticed the gateway for the cell modem is 192.0.0.1 but it's now offline.
      Edit: After several minutes (say 5), the gateway IP updated to what looks like a proper address and is reporting online.

        stephenw10 Netgate Administrator

        Ok so it could be the cell modem serving its own subnet via DHCP if it loses cell signal. You might have to reject leases from it to prevent that.
        192.0.0.1 could be really what the ISP is using even if they probably shouldn't!

          Ximulate @stephenw10

          Thanks, I added 192.0.0.1 to "Reject leases from" in the interface. For kicks, I decided to reboot the cell modem. Though the primary WAN was up, connectivity of the network went to pot. Here are the logs (newest first), filtered for "192.0":

          Feb 26 15:41:12 	php-fpm 	37729 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
          Feb 26 15:40:37 	php-fpm 	14962 	8.8.8.8|192.0.0.2|GW_Cellular|306.312ms|389.629ms|0.0%|online|delay
          Feb 26 15:40:26 	php-fpm 	37729 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
          Feb 26 15:39:43 	php-fpm 	401 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
          Feb 26 15:39:41 	php-fpm 	37729 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
          Feb 26 15:39:07 	kernel 		arpresolve: can't allocate llinfo for 192.0.0.1 on igb1
          Feb 26 15:38:59 	rc.gateway_alarm 	72234 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
          Feb 26 15:38:59 	php-fpm 	401 	/rc.newwanip: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
          Feb 26 15:38:59 	php-fpm 	37729 	/rc.dyndns.update: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
          Feb 26 15:38:59 	php-fpm 	36581 	/rc.filter_configure_sync: dpinger: status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock not found
          Feb 26 15:38:58 	php-fpm 	401 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
          Feb 26 15:38:19 	php-fpm 	36581 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.0.0.2 - Restarting packages.
          Feb 26 15:38:16 	php-fpm 	401 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
          Feb 26 15:37:34 	php-fpm 	36581 	/rc.newwanip: The command '/usr/local/bin/dpinger -S -r 0 -i GW_Cellular -B 192.0.0.2 -p /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.pid -u /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock -C "/etc/rc.gateway_alarm" -d 1 -s 2500 -l 5000 -t 60000 -A 5000 -D 350 -L 15 8.8.8.8 >/dev/null' returned exit code '1', the output was ''
          Feb 26 15:37:34 	rc.gateway_alarm 	12798 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
          Feb 26 15:36:50 	php-fpm 	36581 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
          Feb 26 15:36:50 	php-fpm 	54361 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 192.168.5.145 -> 192.0.0.2 - Restarting packages.
          Feb 26 15:36:07 	rc.gateway_alarm 	92783 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
          Feb 26 15:36:07 	php-fpm 	54361 	/rc.newwanip: dpinger: cannot connect to status socket /var/run/dpinger_GW_Cellular~192.0.0.2~8.8.8.8.sock - No such file or directory (2)
          Feb 26 15:36:05 	php-fpm 	54361 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
          Feb 26 15:35:58 	php-fpm 	54361 	8.8.8.8|192.0.0.2|GW_Cellular|51.916ms|16.283ms|54%|down|highloss
          Feb 26 15:35:28 	php-fpm 	54361 	/rc.newwanip: rc.newwanip: on (IP address: 192.0.0.2) (interface: WANSEC[opt6]) (real interface: igb1).
          Feb 26 15:34:45 	rc.gateway_alarm 	10879 	>>> Gateway alarm: GW_Cellular (Addr:192.0.0.1 Alarm:down RTT:0ms RTTsd:0ms Loss:100%)
          Feb 26 15:34:40 	php-fpm 	54361 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through 192.0.0.1
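
As a rough way to quantify the flapping in an excerpt like the one above, the sketch below counts three kinds of events in exported system log lines. The regular expressions are assumptions derived only from the lines quoted in this post, and `summarize` is a hypothetical helper name, not part of pfSense.

```python
import re
from collections import Counter

# Assumed patterns, based only on the log lines quoted in this thread.
PATTERNS = {
    # rc.newwanip firing for an interface that just received an address
    "newwanip": re.compile(r"rc\.newwanip: on \(IP address: [\d.]+\)"),
    # dpinger-triggered gateway alarms
    "gateway_alarm": re.compile(r"Gateway alarm: \S+ \(Addr:[\d.]+ Alarm:\w+"),
    # package restarts caused by a detected address change
    "ip_change": re.compile(
        r"IP change or dynamic WAN reconnection - [\d.]+ -> [\d.]+"
    ),
}


def summarize(lines):
    """Count how many log lines match each assumed event pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it the system log exported as text (one entry per line) gives a quick count of how often the secondary WAN was renegotiating in a given window.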
          
            stephenw10 Netgate Administrator

            If anything I would expect 192.0.0.X to be the real connection and 192.168.225.1 to be something local. However, that isn't the subnet the modem's GUI seems to be using.

            So if that fails I'd try refusing leases from 192.168.225.1 instead.

              Ximulate @stephenw10

              I reached out to the cellular modem manufacturer, who was helpful. Apparently some of my modem config was wrong, so that now appears to be straightened out. However, I'm continuing to experience issues.
              https://wirelessjoint.com/viewtopic.php?t=4191

              Reviewing the logs from the last two lock-ups, I see the following happening several minutes beforehand. I was not in this morning, but saw my Blink cams and several other devices went offline. A few hours later, the alarm monitoring company called to report a com failure (which means the alarm was able to communicate for some time after the issue started). Also, in the logs I've noticed that both gateways report packet loss/offline within a few seconds of each other.

              Mar  2 09:43:43 router unbound[12219]: [12219:1] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
              Mar  2 09:43:43 router unbound[12219]: [12219:1] notice: ssl handshake failed 10.111.11.118 port 53295
              Mar  2 09:44:22 router unbound[12219]: [12219:3] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
              Mar  2 09:44:22 router unbound[12219]: [12219:3] notice: ssl handshake failed 10.111.11.115 port 62052
              Mar  2 09:44:22 router unbound[12219]: [12219:2] error: ssl handshake failed crypto error:0A000416:SSL routines::sslv3 alert certificate unknown
              Mar  2 09:44:22 router unbound[12219]: [12219:2] notice: ssl handshake failed 10.111.11.115 port 62053
              Mar  2 09:45:08 router filterdns[36239]: merge_config: configuration reload
              Mar  2 09:45:08 router filterdns[36239]: 	Adding Action: pf table: networkABC host: abc.duckdns.org
              Mar  2 09:45:08 router filterdns[36239]: 	Adding Action: pf table: network123 host: 123.duckdns.org
              [More of the above, then]
              Mar  2 09:46:08 router filterdns[36239]: failed to resolve host ntp.org will retry later again.
              Mar  2 09:46:08 router filterdns[36239]: failed to resolve host abc.duckdns.org will retry later again.
              Mar  2 09:46:08 router filterdns[36239]: failed to resolve host 123.duckdns.org will retry later again.
              
                Ximulate @stephenw10

                @stephenw10
                It seems like something in the router is preventing IPs (not just DNS resolving) from being found

                  stephenw10 Netgate Administrator

                  What are those hosts at 10.111.11.115 and 10.111.11.118?

                    Ximulate @stephenw10

                    @stephenw10
                    .115 is an iPhone. The interesting thing here is that the device would not have been on the network at that time (09:44). In this episode and the one before, .115 was in the logs reporting the same thing minutes before the issue started (it would have been on the network during the previous episode).

                    Not sure yet what .118 is. It might be another iPhone.

                      stephenw10 Netgate Administrator

                      Is it possible the system clock is wrong?

                        Ximulate @stephenw10

                        @stephenw10 no, the time is correct

                          Ximulate @Ximulate

                          This morning all seemed fine (smart TV was working, no complaints from the alarm, etc.) until I logged into my desktop PC and it would not load local (i.e. pfSense GUI) or WWW pages. I tried to SSH into the router, but no joy.

                          The logs are relatively quiet from midnight until the time I power cycled the router. I did not see any "ssl handshake failed crypto error."

                          There were several "filterdns 18089 Adding Action: pf table: XYZ host: xxx.xxx.xxx.xxx" entries prior to rebooting. I've seen this in the logs prior to other failures.

                          I also noticed several ntpd logs like this:

                          Mar 6 08:31:31 	ntpd 	87972 	Soliciting pool server 45.83.234.123
                          

                          I tried "ntpq -c pe" per a Stack Exchange post. If I understand correctly, st 16 means out of sync:

                          remote           refid            st t when poll reach   delay   offset  jitter
                          ===============================================================================
                          0.pfsense.pool.  .POOL.           16 p    -   64     0   0.000   +0.000   0.000
                          1.pool.ntp.org   .POOL.           16 p    -   64     0   0.000   +0.000   0.000
                          2.pool.ntp.org   .POOL.           16 p    -   64     0   0.000   +0.000   0.000
                          3.pool.ntp.org   .POOL.           16 p    -   64     0   0.000   +0.000   0.000
                          *65-100-46-166.d .SOCK.            1 u   38  128   377  74.742   +2.404   1.990
                          +ns1.your-site.c 216.218.254.202   3 u   58  128   377  72.907   +1.203   5.012
                          +104.156.246.53  204.9.54.119      2 u  103  128   377  40.948   -0.402   6.028
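
For reading output like this, a minimal sketch of a parser follows. It relies on two things about `ntpq` peer listings: a leading `*` marks the peer currently selected for synchronization, and rows with refid `.POOL.` at stratum 16 are placeholders for the configured pool names rather than real peers. The function names are hypothetical, and treating only `*`, `+`, `-` and `#` as tally characters is a simplifying assumption.

```python
TALLY_CHARS = "*+-#"  # assumption: only these tally codes are handled


def parse_peers(text):
    """Parse whitespace-separated ntpq peer rows into dictionaries."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # A peer row has 10 columns; skip the header and separator lines.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        remote, tally = fields[0], ""
        if remote[0] in TALLY_CHARS:
            tally, remote = remote[0], remote[1:]
        peers.append({
            "tally": tally,
            "remote": remote,
            "refid": fields[1],
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),  # reach is printed in octal
        })
    return peers


def is_synced(peers):
    """True if some peer is the selected system peer below stratum 16."""
    return any(p["tally"] == "*" and p["stratum"] < 16 for p in peers)


def pool_placeholders(peers):
    """Rows that only represent a configured pool name."""
    return [p for p in peers if p["refid"] == ".POOL."]
```

Applied to the listing above, this would report the four pool names as placeholders and treat the `*` row as the active sync source, which is consistent with the dashboard showing the correct time.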

                          Dashboard shows correct time

                            Ximulate @Ximulate

                            Dashboard shows correct time

                            NTP service is enabled with
                            0.pfsense.pool.ntp.org
                            1.pool.ntp.org
                            2.pool.ntp.org
                            3.pool.ntp.org

                            Same timeservers are input into System > General > Timeservers
                            I don't see any firewall rules that would block NTP requests.
                            I'm disabling NTP Server, as I don't think I'm using it.
                            I'm assuming the other timeservers listed in the ntpq results are requests from LAN devices

                              stephenw10 Netgate Administrator @Ximulate

                              @Ximulate said in Router Locking Up (maybe due to excessive lan traffic?):

                              I logged into my desktop PC and it would not load local (i.e. pfSense GUI) or WWW pages. I tried to SSH into the router, but no joy.

                              How was it failing? Is it a DNS resolution failure? The services actually stopped responding on the firewall?

                              Progressively failing services like that could be a disk issue. Do you see gaps in the logging after recovering access?

                                Ximulate @stephenw10

                                @stephenw10

                                No, the IP addresses appear to be dropped, as if DHCP is failing or devices are not able to see other devices. In other words, if I type the pfSense router IP address into the browser it does not load... the browser does not see the pfSense GUI. Once this happens, the only way I'm able to recover access is power cycling the router.

                                At one point, I had my laptop connected to the serial console of the router. I was usually able to access the command menu that way. Occasionally, I could RDP to the laptop to access the command menu, but that would normally not work either.

                                I think I've tried this already, but I think I'll manually set the IP address of my desktop and laptop to see if they still communicate next time the network fails. Currently pfSense is handing out static leases to my desktop and a few other items, and dynamic leases to the rest.

                                On the rare occasion that I catch the network acting up but can get to the router GUI, I have not seen any failing services. I have also tried the pfSense tools in the CLI like "playback restartallwan" without success. A reboot was required.

                                  stephenw10 Netgate Administrator

                                  Was the console responsive if you were at the laptop connected to it directly?

                                    Ximulate @stephenw10

                                    @stephenw10 To the best of my recollection, at least within the last few weeks, the console has always been accessible via serial.

                                      stephenw10 Netgate Administrator

                                      Ok then I'd try to connect out from it when this happens and see what (if anything) still works.

                                        Ximulate @stephenw10

                                        @stephenw10 maybe I misunderstood your last question. When the network/router fails, I have been able to access the console via serial connection but devices on the network/router still do not communicate. I've tried restarting php, restarting the web configurator, using the playback scripts in the tools... none of those resolve the issue, except rebooting

                                          stephenw10 Netgate Administrator

                                          Right but can you ping out from the console to external targets? By IP and FQDN? What about internal targets?
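
One way to make that test repeatable is sketched below: ping an external IP, resolve an external name, ping a LAN host, and label the combination. This is only an illustration, assuming Python 3 and a `ping` binary that accepts `-c` are available wherever it runs. The function names and the short result labels are hypothetical, and the targets passed to `run_checks` are placeholders to be chosen by whoever runs it.

```python
import socket
import subprocess


def ping(host, count=3):
    """True if the system ping reports at least one reply from host."""
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), host],
            capture_output=True,
            timeout=count * 5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def resolves(name):
    """True if the name resolves to at least one address."""
    try:
        return bool(socket.getaddrinfo(name, None))
    except OSError:
        return False


def classify(ext_ip_ok, dns_ok, lan_ok):
    """Map the three check results to a short label."""
    if not ext_ip_ok and not lan_ok:
        return "isolated"  # nothing answers in either direction
    if not ext_ip_ok:
        return "wan"       # LAN answers, external IP does not
    if not lan_ok:
        return "lan"       # external IP answers, LAN host does not
    if not dns_ok:
        return "dns"       # both paths work, name resolution fails
    return "ok"


def run_checks(ext_ip, ext_name, lan_ip):
    """Run all three checks and return the label from classify()."""
    return classify(ping(ext_ip), resolves(ext_name), ping(lan_ip))
```

A "lan" result during a failure would suggest the problem sits on the LAN side of the firewall rather than in DNS or the WAN path.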

                                            Ximulate @stephenw10

                                            @stephenw10
                                            I had to go back to the first post to refresh my memory, but yes I did also try pinging back then
                                            https://forum.netgate.com/post/1152732

                                            When I can get into the GUI, I don't see any issues in the dashboard like a down WAN, or CPU or memory issues. Most of the time, I don't notice 'til it's too late, so I can't connect to the GUI. I have connected my laptop to the router using the console. I've tried various options in the menu, including restarting PHP, the web configurator, tools like "playback restartallwan" and others, to no avail. The one interesting thing is, although the LAN devices aren't connecting, I can, from the console, sometimes ping external IPs like 9.9.9.9 OK but 8.8.8.8 might not respond. Internal LAN devices don't respond to ping either.

                                            Now a lot has transpired since that post, so I'll try to ping next time. However, I do think I'm going to have the same or similar results. I just reconnected my laptop via serial to the console so it's ready to go as soon as I can get to it.

                                            BTW... Thank you for hanging in there with me on this!

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.