Netgate Discussion Forum

    XG-7100 goes unresponsive & core dump

    Official Netgate® Hardware
    11 Posts 4 Posters 1.2k Views
    • B
      brcisna
      last edited by stephenw10

      Hello All,

      School environment: 2 x Netgate XG-7100, one in each of 2 buildings.

      One Netgate XG-7100, put in place 3 months ago, will randomly go 'unresponsive' at almost the same time on some days. It may go 7 days with no problems, or it may go two days, but it always happens when people are arriving at school.
      Also, it ran with no problems with captive portal disabled until the school year actually started. Captive portal was enabled before the first school day. When I say unresponsive, I mean no SSH, no web UI, no ping, etc. The only option is a power cycle. It is located in such a way that it is very difficult to get a serial cable / laptop hooked up to it when it has hit this state, to see if anything can be accessed via the serial console.
      There are about 800 users on this during the day. It always happens at about 8:00-8:30 am. It has never stopped after this time.
      I put the XG-7100 on a different battery backup just to eliminate that possibility, and still had some 'unresponsive' days.

      Two days ago the Netgate finally did a core dump and rebooted on its own. In the saved 'info' file there is a mention of 'Panic String: spin lock held too long'. For some reason I cannot untar the actual textdump.tar.0 in order to post it here for examination; File Roller reports an unsupported format.
      As an unscientific guess, I am going to disable the captive portal and run it that way for a week.

      We have a second Netgate XG-7100 at another school building using Captive Portal with about 600 users and this machine has been rock solid.

      Can anyone clue me in on how to untar the textdump.tar.0 file for posting here? I tried renaming it to textdump.tar.gz, but it says unsupported format.

      Thank You

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        You should just be able to remove the .0 extension and the result is a tar ball. PM it to me if you don't want it public.

        Steve
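A minimal sketch of the rename-and-extract step described above. It assumes the dump is a plain, uncompressed tar archive (which is how FreeBSD textdumps are normally written, and would explain why renaming it to .tar.gz fails). The function name `unpack_textdump` and the paths in the usage note are illustrative only.

```shell
#!/bin/sh
# Sketch: unpack a crash dump archive that was saved as textdump.tar.0.
# unpack_textdump is a hypothetical helper name, not a pfSense command.

unpack_textdump() {
    # $1 = path to the dump (e.g. textdump.tar.0), $2 = output directory
    src="$1"
    out="$2"
    mkdir -p "$out" || return 1
    # Copy without the numeric suffix so GUI archivers recognise it as a tarball.
    cp "$src" "$out/textdump.tar" || return 1
    # tar itself ignores the file name: list the contents, then extract them.
    tar -tf "$out/textdump.tar" || return 1
    tar -xf "$out/textdump.tar" -C "$out"
}
```

With the dump copied to a workstation, this might be called as `unpack_textdump ./textdump.tar.0 ./textdump`, after which the extracted text files can be attached or pasted. Both paths are assumptions.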

        • B
          brcisna
          last edited by

          Hi Steve,

          Thanks for offering to take a look at the core dump files. I am showing my stupidity. I don't see any place on a profile to PM or to attach the core dump file. I only see "Start a chat"?

          Thanks

          • GertjanG
            Gertjan @brcisna
            last edited by Gertjan

            @brcisna said in XG-7100 goes unresponsive & core dump:

            I don't see any place on a profile to PM

            Do not paste the dump into the chat box.
            For example, use pastebin.com to upload the file, and PM him the link, something like https://pastebin.com/qxdg9QKX

            Btw : click on his name or avatar, and then :

            [screenshot]

            No "help me" PMs please. Use the forum, the community will thank you.
            Edit: and where are the logs?

            • B
              brcisna
              last edited by

              Hi Gertjan,

              Thanks for the response and info. I thought maybe I was missing something in the user profiles. I'm used to the old-school "PM this person" in the user profile details. I'll do exactly as you suggested here.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                Yes, sorry. I'm old school too, still using 'PM'. NodeBB only has chat there.

                It would probably actually be better to open a ticket with us in support for this and attach it there: https://go.netgate.com

                Much easier to point our developers at it once it's in the ticket system.

                Steve

                • B
                  brcisna
                  last edited by

                  Steve,

                  OK, thanks. I will post the core dump files at the URL you posted.
                  This is one of those deals that is hard to pinpoint, at least for me anyway. The XG-7100 'just stopped', and at that point only a re-power will revive it. If it weren't situated in such a rat's nest of a confined area in the server room, I would have liked to set up a serial cable / laptop on it before rebooting it. With the actual core dump, someone there may see something, hopefully. We actually have an extra new XG-7100 to put into place for this very purpose, but we want to see what the root problem is with this XG-7100.

                  Thanks again

                  • B
                    brcisna
                    last edited by

                    Update:

                    Let the Netgate XG-7100 run for two weeks with captive portal disabled. Ran like a champ.
                    This week I re-enabled Captive Portal; it ran fine the first day.
                    On the second day at 8:15 am there was no internet and I was unable to SSH into it. Had to re-power the unit to get the internet back up. No hints in the logs of what might have happened.
                    What is so odd is that every time this has happened it has been at about 8:15 am, but it did survive the first day. This router is on a new heavy-duty UPS dedicated to the router ONLY.

                    Thank You

                    • DerelictD
                      Derelict LAYER 8 Netgate
                      last edited by Derelict

                      I would look at any out-of-band options for access to the node. Like the console.

                      You can take a status output from the command line if you can get access to the console.

                      php /usr/local/www/status.php && cp /tmp/status_output.tgz /root

                      Even if console access is difficult, it will probably be required.
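As a sketch of how the capture above could be kept across several incidents, the helper below copies the archive to a timestamped name so a later run does not overwrite an earlier one. It assumes, as the command above implies, that status.php leaves its archive at /tmp/status_output.tgz; the function name `save_status_output` is hypothetical.

```shell
#!/bin/sh
# Sketch: keep each status archive under a timestamped file name.
# save_status_output is a hypothetical helper, not part of pfSense.

save_status_output() {
    # $1 = archive produced by status.php, $2 = directory to keep copies in
    src="$1"
    dest_dir="$2"
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/status_output-$stamp.tgz" || return 1
    # Print the saved path so the caller can note it or copy it off the box.
    echo "$dest_dir/status_output-$stamp.tgz"
}
```

On the console this might be used as `php /usr/local/www/status.php && save_status_output /tmp/status_output.tgz /root`.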

                      Chattanooga, Tennessee, USA
                      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                      Do Not Chat For Help! NO_WAN_EGRESS(TM)

                      • B
                        brcisna
                        last edited by

                        Hi Derelict,

                        Thank you for the response. I am showing my stupidity. What do you mean by 'out-of-band options'?
                        Once I run the php command at the console, what would I be looking for?
                        I will certainly give this a try.

                        Thanks again

                        • DerelictD
                          Derelict LAYER 8 Netgate
                          last edited by

                          Meaning like the serial console or possibly an interface without the captive portal on it. It is unknown whether your issue has anything to do with the captive portal at this point.


                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.