Netgate Discussion Forum

    SG-4860 Crashing with umass0 disconnecting

    18 Posts 5 Posters 1.9k Views
    • C
      calmor15014
      last edited by calmor15014

      I apologize if this is not the right forum for this, please direct me to the appropriate one in that case.

      I have an SG-4860 that fails at random, and the failures are becoming more frequent. First DHCP stops working, then eventually the entire device stops responding (web interface unresponsive, no IPv6 network functionality). The kernel log shows the following messages:

      Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: at uhub1, port 4, addr 4 (disconnected)
      Aug 28 16:45:38 (hostname withdrawn) kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
      Aug 28 16:45:38 (hostname withdrawn) kernel: da0: <Generic Ultra HS-COMBO 1.98> s/n 000000225001 detached
      Aug 28 16:45:38 (hostname withdrawn) kernel: (da0:umass-sim0:0:0:0): Periph destroyed
      Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: detached
      Aug 28 16:45:40 (hostname withdrawn) kernel: ugen0.4: <Generic Ultra Fast Media> at usbus0
      Aug 28 16:45:40 (hostname withdrawn) kernel: umass0 on uhub1
      Aug 28 16:45:40 (hostname withdrawn) kernel: umass0: <Generic Ultra Fast Media, class 0/0, rev 2.00/1.98, addr 4> on usbus0
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: <Generic Ultra HS-COMBO 1.98> Removable Direct Access SCSI device
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: Serial Number 000000225001
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: 40.000MB/s transfers
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: 29184MB (59768832 512 byte sectors)
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: quirks=0x2<NO_6_BYTE>
      

      I am running OpenVPN, Squid, SquidGuard, avahi, and nut. The drive is not full (2% of 25GB), logs are all on a remote logging server. While the device is operating, none of the operating parameters (CPU, memory, HDD) approach even 50% of max.

      It seems like the detach/reattach of the SCSI drive is causing errors in the device. It doesn't appear to have a specific cause (usage, time, etc.).

      Is this a case of a failing drive?
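For anyone who wants to check whether these events follow a pattern (time of day, interval), here is a minimal Python sketch that pulls disconnect/reattach pairs for umass0 out of a saved kernel log. The function name `find_reattach_cycles` and the `year` parameter are made up for illustration; syslog timestamps carry no year, so the default year is only a placeholder. The sample lines are copied from the excerpt above.

```python
import re
from datetime import datetime

# Lines copied from the log excerpt in this post.
SAMPLE_LOG = """\
Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: at uhub1, port 4, addr 4 (disconnected)
Aug 28 16:45:38 (hostname withdrawn) kernel: da0: <Generic Ultra HS-COMBO 1.98> s/n 000000225001 detached
Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: detached
Aug 28 16:45:40 (hostname withdrawn) kernel: umass0 on uhub1
Aug 28 16:45:40 (hostname withdrawn) kernel: da0: 29184MB (59768832 512 byte sectors)
"""

# timestamp, hostname (may contain spaces here), kernel message
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (.*?) kernel: (.*)$")


def find_reattach_cycles(log_text, device="umass0", year=2000):
    """Return (disconnect_time, reattach_time) pairs for `device`.

    `year` is an assumption: syslog lines do not include one.
    """
    cycles = []
    pending = None  # time of the last unmatched disconnect
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, _host, msg = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if msg.startswith(f"{device}:") and msg.endswith("(disconnected)"):
            pending = when
        elif msg.startswith(f"{device} on ") and pending is not None:
            cycles.append((pending, when))
            pending = None
    return cycles


if __name__ == "__main__":
    for gone, back in find_reattach_cycles(SAMPLE_LOG):
        print(f"disconnected {gone:%b %d %H:%M:%S}, back after {(back - gone).total_seconds():.0f}s")
```

Run against a longer log, the list of pairs makes it easier to see whether the disconnects cluster around particular times or loads. It only matches the message shapes shown in this thread, so other kernel versions may need a different pattern.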

      Thanks in advance!

      • jimpJ
        jimp Rebel Alliance Developer Netgate
        last edited by

        It could be that the onboard storage has a problem. You can pop another disk in there and install to that, though.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • C
          calmor15014
          last edited by

          Thanks for your response!

          I was leaning toward a spotty drive, but I've never seen a storage device failure manifest itself this way. A complete disconnect/reconnect seems odd; usually I'd see read/access errors or something similar, but these are the only messages in any log that seem out of the ordinary until services start failing completely. I have some concern that it's a device controller failure.

          Is there any way to validate the storage issue prior to disassembling the box? I'm not so familiar with flash/SSD diagnostics.

          The unit has been in a rack with tons of ventilation and max environmental temps around 22C. It's lived a pretty easy life, mechanically-speaking. I do have the occasional power fluctuation, but these disconnection events occur seemingly at random.

          • jimpJ
            jimp Rebel Alliance Developer Netgate
            last edited by

            That device is the soldered-on eMMC storage device in the 4860 so there really isn't much to do in the way of diagnostics for it. Somehow the controller is losing contact with the storage. The fact that the device appears to disconnect despite being permanently connected is concerning because it likely means the device itself is failing in some way.

            • C
              calmor15014
              last edited by

              I don't have the device to disassemble at the moment but it looks like there are some SATA connections on the board based on photos online. Can I disable the internal eMMC device and use a SATA external drive instead?

              • jimpJ
                jimp Rebel Alliance Developer Netgate
                last edited by

                There is an mSATA connector inside. You can install an mSATA drive and it will boot from there. You don't have to disable the eMMC.

                • kiokomanK
                  kiokoman LAYER 8
                  last edited by kiokoman

                  if it's this one

                  4860.jpg

                  i don't see the emmc soldered-on
                  are you sure?
                  maybe you only need to clean the contacts

                  ̿' ̿'\̵͇̿̿\з=(◕_◕)=ε/̵͇̿̿/'̿'̿ ̿
                  Please do not use chat/PM to ask for help
                  we must focus on silencing this @guest character. we must make up lies and alter the copyrights !
                  Don't forget to Upvote with the 👍 button for any post you find to be helpful.

                  • jimpJ
                    jimp Rebel Alliance Developer Netgate
                    last edited by

                    That is an mSATA disk.

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      If that SG-4860 is still in warranty you should open a ticket with us: https://go.netgate.com

                      Steve

                      • C
                        calmor15014 @stephenw10
                        last edited by

                        Thanks, but unfortunately I bought it in 2017.

                        I had a laptop platter drive and an mSATA cable lying around, so I installed it today to see if that remedies the issue. It's back up and running, so I'll monitor it. If it solves the problem, I'll probably install an SSD.

                        Thanks for your response and support. I've had great experience with everyone from Netgate.

                        If it's motherboard related, do you sell replacements?

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          If the eMMC has failed it would require a replacement board. If you open a ticket we can quote you for that.

                          I recommend going the mSATA route though, it will be a lot less expensive and we have seen that work reliably in similar cases. A bad eMMC is not an indication anything else on the board will fail.

                          Steve

                          • C
                            calmor15014 @kiokoman
                            last edited by

                            @kiokoman Sorry, I should have mentioned it's actually the SG-4860-1U; I forgot there was a pretty significant hardware difference between the two. I don't have the device shown; on mine, everything is soldered onto the motherboard.

                            I do have three mSATA connectors in front of the CPU, however, and was able to add a new disk, install pfSense, and give it a test.

                            I expect at some point to see the same error messages, as the eMMC is still connected, but isn't being used for anything. Hopefully, it will keep working as normal afterward though. pfSense shows the correct size for the root disk so I know it's not using the eMMC as the system device.
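Besides comparing disk sizes, the root device can be read from the output of `mount`. Below is a minimal Python sketch that parses FreeBSD-style `mount` output and checks whether `/` sits on da0. The helper names `root_source` and `on_device` are made up for illustration, and the sample output is hypothetical, not taken from this unit. It only handles plain `/dev/...` sources; GPT labels or ZFS datasets would need to be resolved to a disk some other way.

```python
import re

# Hypothetical `mount` output, for illustration only.
SAMPLE_MOUNT = """\
/dev/ada0p2 on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
"""


def root_source(mount_output):
    """Return the source mounted on '/', or None if not found."""
    for line in mount_output.splitlines():
        source, sep, rest = line.partition(" on ")
        # rest looks like "/ (ufs, local, ...)"; keep the mount point only
        if sep and rest.split(" (", 1)[0] == "/":
            return source
    return None


def on_device(source, device="da0"):
    """True if a /dev path such as /dev/da0s1a belongs to `device`."""
    name = source.rsplit("/", 1)[-1]
    # Require a non-digit (or end of name) after the device so da0 != da01,
    # and anchor at the start so ada0 is not mistaken for da0.
    return re.match(rf"{re.escape(device)}(\D|$)", name) is not None


if __name__ == "__main__":
    src = root_source(SAMPLE_MOUNT)
    print(src, "-> on da0" if src and on_device(src) else "-> not on da0")
```

On a real box the input would be the text from running `mount`; if the root source does not belong to da0, later da0 detach messages should not affect the running system.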

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by stephenw10

                              You have three SATA connectors on the board, for using regular SATA drives. The 1U also has SATA power connectors on the PSU.

                              There is only one mSATA socket and two mPCIe sockets. Just FYI if you use that.
                              https://docs.netgate.com/pfsense/en/latest/solutions/sg-4860-1u/msata-installation.html

                              Steve

                              • DerelictD
                                Derelict LAYER 8 Netgate
                                last edited by Derelict

                                mSATA. They are like $20 on Amazon.

                                https://www.amazon.com/TCSUNBOW-MSATA-60GB-Solid-Machine/dp/B077YWJVXB/

                                By far your cheapest option and it will be "snappier" than it was on eMMC.

                                Chattanooga, Tennessee, USA
                                A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                                DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                                Do Not Chat For Help! NO_WAN_EGRESS(TM)

                                • C
                                  calmor15014 @stephenw10
                                  last edited by

                                  @stephenw10 yep, turns out I don't know what I'm talking about. :) I definitely used one of the SATA connectors with the laptop platter drive and the power supply connector. It's been up for about 24 hours now and working normally so far. No kernel messages after bootup complete.

                                  If it seems to be okay for a couple weeks I'll probably order the mSATA device as I'm sure it will be faster than an old 320GB laptop hard drive.

                                  Thanks again for everyone's help!

                                  • C
                                    calmor15014
                                    last edited by

                                    Turns out the kernel logged that same sequence of messages last night, but as expected, the device continued to operate normally as none of the services are relying on the eMMC. Seems like the motherboard is operating normally. Another week or so of solid operation, and I'll look for an mSATA.

                                    Thanks again!

                                    • kiokomanK
                                      kiokoman LAYER 8
                                      last edited by kiokoman

                                      when you have the time, try to clean it with some isopropyl alcohol and a toothbrush; it does a fair job of getting rid of both water-based (oxide) and oil-based contaminants that can cause intermittent connections. if that's not enough, a reballing/reflow would be necessary, but for that you need a technical expert able to do it.
                                      ... or just ignore it and mount an msata

                                      ̿' ̿'\̵͇̿̿\з=(◕_◕)=ε/̵͇̿̿/'̿'̿ ̿
                                      • C
                                        calmor15014 @kiokoman
                                        last edited by

                                        @kiokoman As mentioned above, on the SG-4860-1U that I have, there is no contact surface to clean - the eMMC is directly soldered to the PCB. Aside from trying to resolder it, there isn't much to do, and at that point it's too big a risk vs. the mSATA and ignoring kernel messages. I could probably change the config to avoid mounting da0 in the first place if it gets that frequent/irritating.

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.