Netgate Discussion Forum

    SG3100 keeps locking up after latest update

    Official Netgate® Hardware
    • T
      tuser11 @stephenw10
      last edited by

      @stephenw10 just an FYI, our lockups have continued. We even swapped SG-3100 boxes (we had a cold spare). So it looks like the issue either isn't related to v23.01 (we are on 23.05.1 now) or we're just seeing a higher number of lockups. https://forum.netgate.com/topic/182065/troubleshooting-repeated-sg-3100-lockups

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Are you still seeing the same disk errors?

        • T
          tuser11 @stephenw10
          last edited by

          @stephenw10 Nope, no disk errors

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            Hmm, so it just stops responding entirely? Nothing on the console? Can you log the console output across an outage to see if anything is shown when it happens?

            • M
              michmoor LAYER 8 Rebel Alliance
              last edited by

              Just curious, but can you downgrade to a 22.x release and see if the problem follows?

              Firewall: NetGate,Palo Alto-VM,Juniper SRX
              Routing: Juniper, Arista, Cisco
              Switching: Juniper, Arista, Cisco
              Wireless: Unifi, Aruba IAP
              JNCIP,CCNP Enterprise

              • T
                tuser11 @stephenw10
                last edited by

                @stephenw10 Nope, the USB console is unresponsive when connected and stays that way until after a reboot. I'm not sure how to log console output across an outage; I don't have a computer I can leave connected to the unit. I was hoping remote syslog would give me all the data I needed in this situation, but I was wrong.
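
Even when remote syslog records nothing about the lockup itself, it can still narrow down when the box went silent. The following is a minimal sketch, not a tested tool: it assumes each line of the remote syslog file starts with an ISO 8601 timestamp followed by a space (many syslog servers use a different format, so the parsing would need adjusting), and `largest_gap` is a hypothetical helper name.

```python
from datetime import datetime


def largest_gap(lines):
    """Return (start, end) of the longest silence between consecutive
    timestamped lines, or None if fewer than two timestamps are found."""
    stamps = []
    for line in lines:
        token = line.split(" ", 1)[0]
        try:
            # Assumption: the first field is an ISO 8601 timestamp.
            stamps.append(datetime.fromisoformat(token))
        except ValueError:
            continue  # skip lines without a parseable leading timestamp
    best = None
    for earlier, later in zip(stamps, stamps[1:]):
        if best is None or later - earlier > best[1] - best[0]:
            best = (earlier, later)
    return best
```

Feeding it the lines of the firewall's remote log should return the last entry before the lockup and the first entry after the reboot, which gives a window to compare against UPS events or other equipment logs.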

                • T
                  tuser11 @michmoor
                  last edited by

                  @michmoor while I agree it's a fair step of elimination (owner discussed this as soon as we realized the issues are across 2 different SG-3100 units), I'm against downgrading and not having latest security patches on a device we depend on for network security.

                  • S
                    SteveITS Galactic Empire @tuser11
                    last edited by

                    @tuser11 Two devices does make it seem unlikely to be hardware. But yet specific to your setup since others haven't had the issue. Our 3100 has 32 days' uptime because that's when I installed 23.05.1.

                    OK new random thought: is there a chance of a ground loop? Do you have two buildings connected with a wire? Or two power feeds/grounds in a building?

                    Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                    When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                    Upvote 👍 helpful posts!

                    • T
                      tuser11 @SteveITS
                      last edited by tuser11

                      @SteveITS funny you should mention a ground loop issue. I do not know if we have one. Basic equipment and APC equipment show no ground or wiring faults. All tests we've done ourselves and by a third-party electrician revealed stable power, correct wiring, and solid ground. We even had the main outside disconnect panel replaced at the beginning of this year to resolve some corrosion-related problems. Sometimes we have random voltage issues that cause different UPSs in the building to go on backup or cycle between AC/DC as if stuck in a power loop.

                      Yes, we do have 2 buildings physically connected and connected with a wire. Our electrical problems are... well, let's just say I'm grateful for so many UPSs and for APC being so kind about honoring their warranty, considering the failure rate here is abnormal.

                      Could a power problem sneak past our UPS and damage our equipment? We haven't had any abnormal hardware failures downstream of the UPSs beyond the SG-3100. Dell PowerEdge Servers, Switches, NAS, all good.

                      Or better yet, are you thinking there may be a ground issue with some other equipment that the SG-3100 is sensitive to?

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        It's possible. Does the 2nd 3100 still stop responding if you run it somewhere else?

                        Logging the console output would require something on site connected to it. That could be something else that happens to be there, such as a server, or perhaps a laptop or a RasPi.
                        How practical that is would depend on how often this happens, I guess.
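
As a rough illustration of what that connected machine could run, here is a minimal sketch that copies console output to a file with a timestamp on every line, so the last message before a lockup can be dated. It makes several assumptions: the console adapter shows up on a Linux host as `/dev/ttyUSB0`, the port has already been set to 115200 8N1 (for example with `stty -F /dev/ttyUSB0 115200 raw -echo`), and the names `stamp_lines` and `log_console` are made up for this example.

```python
import io
from datetime import datetime, timezone


def stamp_lines(src, dst, clock=lambda: datetime.now(timezone.utc)):
    """Copy lines from a binary stream to a text stream, prefixing each
    line with a UTC timestamp. Returns the number of lines copied."""
    count = 0
    for raw in iter(src.readline, b""):
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        dst.write(f"{clock().isoformat(timespec='seconds')} {text}\n")
        dst.flush()  # keep the file current in case the logging host dies too
        count += 1
    return count


def log_console(device="/dev/ttyUSB0", log_path="console.log"):
    """Append timestamped console output to log_path until interrupted.
    Assumption: the serial port speed was configured beforehand."""
    with open(device, "rb", buffering=0) as port, open(log_path, "a") as log:
        stamp_lines(port, log)
```

Calling `log_console()` from a small script left running under `screen` or `tmux` would be one way to use it. Only one program can usefully read the port at a time, so an interactive terminal session on the same port would have to be closed first.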

                        • S
                          SteveITS Galactic Empire @tuser11
                          last edited by

                          @tuser11 said in SG3100 keeps locking up after latest update:

                          Yes, we do have 2 buildings physically connected and connected with a wire

                          If they have separate grounds, then what happens is, that wire carries the voltage difference between the grounds. We actually measured voltage on one once but it was long ago. IIRC it was burning out switch ports. Fiber is ideal to connect buildings.

                          That said, the first two sites I found say it's only an issue with shielded twisted pair and UTP is not a problem, which isn't my recollection at all, but it's been a long time since I've run into a situation not using fiber.

                          https://www.truecable.com/blogs/cable-academy/how-to-fix-a-ground-loop
                          https://networkencyclopedia.com/ground-loop/

                          Often an actual Ethernet cable isn't fed through a UPS, and even if it is, the UPS is likely looking for a 1000 V surge, not 10 volts.

                          • T
                            tuser11 @stephenw10
                            last edited by tuser11

                            @stephenw10 the other SG-3100 is offline in a box. I'm going to see if I can connect a USB cable from the SG-3100 to one of the Dell servers and pass it through to a virtual machine that I can leave a PuTTY session running on. As long as I have a static IP, I can plug my laptop into the same switch and get access to that virtual machine even if the SG-3100 goes offline. If that doesn't work, I'll set up a Pi or similar.

                            • T
                              tuser11 @SteveITS
                              last edited by

                              @SteveITS said in SG3100 keeps locking up after latest update:

                              https://www.truecable.com/blogs/cable-academy/how-to-fix-a-ground-loop

                              Good tip, I will look into this more, as I just set up an Ethernet run to an outdoor building near the house at home. Would hate to see these issues start creeping up at home.

                              • N
                                netplumbers
                                last edited by

                                I had some locking up issues a while back on my sg-3100 that turned out to be a bad power supply. When you swapped for your spare, did you also change PSUs?

                                • T
                                  tuser11 @netplumbers
                                  last edited by

                                  @netplumbers Good tip. I don't remember. I looked through my logs and I have entries for changing the hard drive and later changing the SG-3100 unit, but no notes about changing the power supply. With that said, I'm going to assume I didn't and change it anyway tomorrow.

                                  • T
                                    tuser11 @netplumbers
                                    last edited by

                                    @netplumbers Unfortunately the power supply swap was already done and just wasn't in my notes. When I went to swap the power supply today, the supply in the box from the old system had our internal asset ID written on it from the old system. So the power supply in production is the newest supply with less than 4 months of use.

                                    • T
                                      tuser11
                                      last edited by

                                      I have a PuTTY session running against the SG-3100's USB console, passed through to a Linux virtual machine. I'm logged into the pfSense box via PuTTY so I have somewhere to look the next time the system locks up. Connecting USB after a lockup results in an unresponsive PuTTY session, but I'm hoping a session that is already open will still show some output on the next lockup.

                                      @stephenw10 Do I just leave the prompt as is after login, or is there a command I should run to stream something?

                                      • stephenw10S
                                        stephenw10 Netgate Administrator
                                        last edited by

                                        Anything shown should be pushed to the console whatever is happening at that time.

                                        • T
                                          tuser11 @stephenw10
                                          last edited by tuser11

                                          @stephenw10 Hello, today there was another lockup. I had the console connected to a virtual machine for many days waiting for it to lock up again. When it did, I logged into the VM to look at the console that was connected to the SG-3100, and there was no output about the event. The last message on the screen was a notice that I had successfully logged in via VPN many hours before.

                                          Can this box be easily locked up via a DDoS attack? How could that be identified when there are always lots of blocked IP addresses? I have logs up until the lockup.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            It could exhaust the state table perhaps but that would not stop it responding at the console. Also you would see the states rising in the monitoring graphs after rebooting.
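
For anyone who wants a record of the state count that is independent of the graphs, here is a minimal sketch that pulls the current number of states out of the text printed by `pfctl -si`. It is only a parsing helper under assumptions: it expects the usual "State Table" section with an indented "current entries" line, `current_states` is a hypothetical name, and how the output gets collected (a cron job, an SSH call from another host) is left open.

```python
import re


def current_states(pfctl_si_output):
    """Extract the 'current entries' count from the State Table section
    of `pfctl -si` output. Returns None if the section is not found."""
    in_state_table = False
    for line in pfctl_si_output.splitlines():
        if line.startswith("State Table"):
            in_state_table = True
            continue
        if in_state_table:
            match = re.match(r"\s+current entries\s+(\d+)", line)
            if match:
                return int(match.group(1))
            if line and not line[0].isspace():
                break  # an unindented line means the section has ended
    return None
```

Logging that number with a timestamp every minute or so to a remote host would show whether states were climbing toward the limit just before a lockup.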

                                            Were you able to try Ctrl+T at the console?

                                            If it was a drive error the console would be full of errors showing that.

                                            A hard lock like that with nothing logged at all is more likely a hardware problem IMO.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.