Netgate Discussion Forum

    Router hanging after 21.05 upgrade

    Official Netgate® Hardware
    7 Posts 2 Posters 886 Views
    • sdm900

      Morning

      I upgraded my Netgate SG-1000 to 21.05 last week, and since then it has been hanging and effectively crashing. I've had to power cycle it about 4 times.

      It lasts about 48 hours, then I notice the web interface hangs and I can't ssh to it. BUT packets are still being routed and filtered.

      About another 24 hours later, it hangs entirely and stops passing packets.

      Any idea?

      Thanks.

      The last few lines of system.log just prior to the last crash (this morning)

      Jul  6 07:27:03 home nginx: 2021/07/06 07:27:03 [error] 89464#100108: *1520 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.0.0.158, server: , request: "POST /widgets/widgets/interface_statistics.widget.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.0.0.1", referrer: "https://10.0.0.1/index.php"
      Jul  6 07:32:23 home nginx: 2021/07/06 07:32:23 [error] 89464#100108: *1522 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.0.0.158, server: , request: "POST /widgets/widgets/dyn_dns_status.widget.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.0.0.1", referrer: "https://10.0.0.1/index.php"
      Jul  6 07:37:43 home nginx: 2021/07/06 07:37:43 [error] 89464#100108: *1524 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.0.0.158, server: , request: "POST /getstats.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.0.0.1", referrer: "https://10.0.0.1/index.php"
      Jul  6 07:57:19 home nginx: 2021/07/06 07:57:19 [error] 89464#100108: *1526 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.0.0.158, server: , request: "GET /index.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.0.0.1", referrer: "https://10.0.0.1/status_interfaces.ph
      

      I rebooted the router before it stopped passing packets.

      The previous hang where it stopped passing packets and needed to be rebooted

      Jul  3 08:33:48 home ppp[13666]: [wan_link0] Link: reconnection attempt 692
      Jul  3 08:33:48 home ppp[13666]: [wan_link0] PPPoE: Connecting to 'Tangerine'
      Jul  3 08:33:57 home ppp[13666]: [wan_link0] PPPoE connection timeout after 9 seconds
      Jul  3 08:33:57 home ppp[13666]: [wan_link0] Link: DOWN event
      Jul  3 08:33:57 home ppp[13666]: [wan_link0] LCP: Down event
      Jul  3 08:33:57 home ppp[13666]: [wan_link0] Link: reconnection attempt 693 in 4 seconds
      Jul  3 08:34:01 home ppp[13666]: [wan_link0] Link: reconnection attempt 693
      Jul  3 08:34:01 home ppp[13666]: [wan_link0] PPPoE: Connecting to 'Tangerine'
      Jul  3 08:34:10 home ppp[13666]: [wan_link0] PPPoE connection timeout after 9 seconds
      Jul  3 08:34:10 home ppp[13666]: [wan_link0] Link: DOWN event
      Jul  3 08:34:10 home ppp[13666]: [wan_link0] LCP: Down event
      Jul  3 08:34:10 home ppp[13666]: [wan_link0] Link: reconnection attempt 694 in 4 seconds
      Jul  3 08:34:14 home ppp[13666]: [wan_link0] Link: reconnection attempt 694
      Jul  3 08:34:14 home ppp[13666]: [wan_link0] PPPoE: Connecting to 'Tangerine'
      Jul  3 08:34:23 home ppp[13666]: [wan_link0] PPPoE connection timeout after 9 seconds
      Jul  3 08:34:23 home ppp[13666]: [wan_link0] Link: DOWN event
      Jul  3 08:34:23 home ppp[13666]: [wan_link0] LCP: Down event
      Jul  3 08:34:23 home ppp[13666]: [wan_link0] Link: reconnection attempt 695 in 1 seconds
      Jul  3 08:34:24 home ppp[13666]: [wan_link0] Link: reconnection attempt 695
      Jul  3 08:34:24 home ppp[13666]: [wan_link0] PPPoE: Connecting to 'Tangerine'
      

      I have THOUSANDS of these messages.
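
To put a number on "thousands", the repeated ppp messages can be tallied with a short script. This is only a rough sketch: it assumes a plain-text copy of system.log and matches just the two message shapes quoted above. The function name `summarize_pppoe_retries` and the regular expressions are illustrative, not part of pfSense.

```python
import re
from typing import Iterable, Optional

# Patterns match the ppp message shapes quoted in the post above.
# The "$" skips the "reconnection attempt N in M seconds" scheduling lines.
ATTEMPT_RE = re.compile(r"ppp\[\d+\]: \[(\S+)\] Link: reconnection attempt (\d+)$")
TIMEOUT_RE = re.compile(r"ppp\[\d+\]: \[(\S+)\] PPPoE connection timeout after (\d+) seconds")


def summarize_pppoe_retries(lines: Iterable[str]) -> dict:
    """Count PPPoE timeouts and track the highest reconnection attempt seen."""
    timeouts = 0
    highest_attempt: Optional[int] = None
    for raw in lines:
        line = raw.strip()
        match = ATTEMPT_RE.search(line)
        if match:
            attempt = int(match.group(2))
            if highest_attempt is None or attempt > highest_attempt:
                highest_attempt = attempt
            continue
        if TIMEOUT_RE.search(line):
            timeouts += 1
    return {"timeouts": timeouts, "highest_attempt": highest_attempt}


# Four lines copied from the log excerpt above.
sample = [
    "Jul  3 08:33:48 home ppp[13666]: [wan_link0] Link: reconnection attempt 692",
    "Jul  3 08:33:57 home ppp[13666]: [wan_link0] PPPoE connection timeout after 9 seconds",
    "Jul  3 08:33:57 home ppp[13666]: [wan_link0] Link: reconnection attempt 693 in 4 seconds",
    "Jul  3 08:34:01 home ppp[13666]: [wan_link0] Link: reconnection attempt 693",
]
print(summarize_pppoe_retries(sample))
```

Against a full log, the same function can be fed an open file object, for example `summarize_pppoe_retries(open("system.log"))`, where the file name is a placeholder for wherever the log was copied to.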

      • sdm900 @sdm900

        GRRR... I can't post more information, the forum says my reply is spam??

        "Flagged as spam by akismet"

        • sdm900

          One more piece of information... my routing became unresponsive and I noticed that miniupnpd was using a lot of CPU.

          miniupnpd had used 15 minutes of CPU time since I rebooted 9 hours ago, which seems excessive.

          I've restarted the service and now it's using no CPU...
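
For scale, the figures in this post (15 minutes of CPU time over 9 hours of uptime) can be turned into an average share of one core. The helper below is a hypothetical illustration of that arithmetic only. An average like this says nothing about whether the load came in bursts.

```python
def average_cpu_share(cpu_minutes: float, uptime_hours: float) -> float:
    """Return CPU time as a percentage of wall-clock time on one core."""
    if uptime_hours <= 0:
        raise ValueError("uptime_hours must be positive")
    return 100.0 * cpu_minutes / (uptime_hours * 60.0)


# 15 minutes of CPU time over 9 hours of uptime, as reported above.
print(f"{average_cpu_share(15, 9):.1f}%")  # prints 2.8%
```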

          • sdm900 @sdm900

            I have now caught miniupnpd using a lot of CPU time several times. When I go and look at the UPnP status (rules), there are none.

            A restart returns it to normal.

            • sdm900 @sdm900

              OK, this is looking like

              https://forum.netgate.com/topic/164178/upnp-broken-on-21-05/11

              • stephenw10 (Netgate Administrator)

                Unless you're seeing those errors in the routing log it may not be.

                The logs you showed above are PPPoE failing to connect. Unrelated to UPnP but could cause miniupnpd to use far more CPU than usual.

                Is the parent NIC linked? Does PPPoE succeed at all?

                Steve

                • sdm900 @stephenw10

                  Yes, I saw the miniupnpd errors in my logs... but every time I try to paste them into this ticket, it is refused, claiming it's spam. Hence my GRRR comment up the chain :)

                  I've applied the fix in the upnp-broken ticket and will see how it goes.

                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.