    Netgate Discussion Forum

    SG-4860 Crashing with umass0 disconnecting

    Official Netgate® Hardware
    • C
      calmor15014 last edited by calmor15014

      I apologize if this is not the right forum for this, please direct me to the appropriate one in that case.

      I have an SG-4860 that fails at random, and the failures are becoming more frequent. First, DHCP stops working, then eventually the entire device stops working (web interface unresponsive, no IPv6 network functionality). Kernel error messages are as follows:

      Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: at uhub1, port 4, addr 4 (disconnected)
      Aug 28 16:45:38 (hostname withdrawn) kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
      Aug 28 16:45:38 (hostname withdrawn) kernel: da0: <Generic Ultra HS-COMBO 1.98> s/n 000000225001 detached
      Aug 28 16:45:38 (hostname withdrawn) kernel: (da0:umass-sim0:0:0:0): Periph destroyed
      Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: detached
      Aug 28 16:45:40 (hostname withdrawn) kernel: ugen0.4: <Generic Ultra Fast Media> at usbus0
      Aug 28 16:45:40 (hostname withdrawn) kernel: umass0 on uhub1
      Aug 28 16:45:40 (hostname withdrawn) kernel: umass0: <Generic Ultra Fast Media, class 0/0, rev 2.00/1.98, addr 4> on usbus0
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: <Generic Ultra HS-COMBO 1.98> Removable Direct Access SCSI device
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: Serial Number 000000225001
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: 40.000MB/s transfers
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: 29184MB (59768832 512 byte sectors)
      Aug 28 16:45:40 (hostname withdrawn) kernel: da0: quirks=0x2<NO_6_BYTE>
      

      I am running OpenVPN, Squid, SquidGuard, avahi, and nut. The drive is not full (2% of 25GB), logs are all on a remote logging server. While the device is operating, none of the operating parameters (CPU, memory, HDD) approach even 50% of max.

      It seems like the detach/reattach of the SCSI drive is causing errors on the device. There doesn't appear to be a specific cause (usage, time, etc.).
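
      To check whether these events cluster by time of day or load, the detach/reattach pairs can be pulled out of the kernel log. The following is a minimal Python sketch, not something that ships with pfSense. The names `find_cycles` and `parse_ts`, the regular expressions, and the `year` argument (syslog timestamps carry no year) are assumptions based only on the log lines quoted above.

```python
import re
from datetime import datetime

# Syslog prefix as it appears in the excerpt above: "Aug 28 16:45:38 <host> kernel: "
PREFIX = r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+.*?kernel:\s+"

# "umass0: at uhub1, port 4, addr 4 (disconnected)"
DETACH = re.compile(PREFIX + r"(?P<dev>umass\d+): at .*\(disconnected\)\s*$")
# "umass0 on uhub1"
ATTACH = re.compile(PREFIX + r"(?P<dev>umass\d+) on uhub\d+\s*$")


def parse_ts(ts, year):
    # Syslog timestamps omit the year, so the caller supplies one (assumption).
    return datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S")


def find_cycles(lines, year):
    """Return (device, detached_at, reattached_at) for each detach/reattach pair."""
    pending = {}   # device name -> time of the most recent unmatched detach
    cycles = []
    for raw in lines:
        line = raw.strip()
        m = DETACH.match(line)
        if m:
            pending[m["dev"]] = parse_ts(m["ts"], year)
            continue
        m = ATTACH.match(line)
        if m and m["dev"] in pending:
            cycles.append((m["dev"], pending.pop(m["dev"]), parse_ts(m["ts"], year)))
    return cycles


# Three lines copied from the excerpt above; the year is a placeholder.
SAMPLE = """\
Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: at uhub1, port 4, addr 4 (disconnected)
Aug 28 16:45:38 (hostname withdrawn) kernel: umass0: detached
Aug 28 16:45:40 (hostname withdrawn) kernel: umass0 on uhub1
"""

cycles = find_cycles(SAMPLE.splitlines(), year=2020)
for dev, down, up in cycles:
    print(dev, "detached at", down.time(), "reattached after",
          (up - down).total_seconds(), "seconds")
```

      Running the same function over the full file on the remote logging server, instead of `SAMPLE`, would give a list of timestamps to compare against power events or service failures.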

      Is this a case of a failing drive?

      Thanks in advance!

      • jimp
        jimp Rebel Alliance Developer Netgate last edited by

        It could be that the onboard storage has a problem. You can pop another disk in there and install to that, though.

        • C
          calmor15014 last edited by

          Thanks for your response!

          I was leaning toward a spotty drive, but I've never seen a storage device failure manifest itself in this way. A complete disconnect/reconnect seems odd; usually I'd see read/access errors or something. These are the only errors in any log that seem out of the ordinary until services start failing completely. I have some concern that it's a device controller failure.

          Is there any way to validate the storage issue prior to disassembling the box? I'm not so familiar with flash/SSD diagnostics.

          The unit has been in a rack with tons of ventilation and max environmental temps around 22C. It's lived a pretty easy life, mechanically-speaking. I do have the occasional power fluctuation, but these disconnection events occur seemingly at random.

          • jimp
            jimp Rebel Alliance Developer Netgate last edited by

            That device is the soldered-on eMMC storage device in the 4860 so there really isn't much to do in the way of diagnostics for it. Somehow the controller is losing contact with the storage. The fact that the device appears to disconnect despite being permanently connected is concerning because it likely means the device itself is failing in some way.

            • C
              calmor15014 last edited by

              I don't have the device to disassemble at the moment but it looks like there are some SATA connections on the board based on photos online. Can I disable the internal eMMC device and use a SATA external drive instead?

              • jimp
                jimp Rebel Alliance Developer Netgate last edited by

                There is an mSATA connector inside. You can install an mSATA drive and it will boot from there. You don't have to disable the eMMC.

                • kiokoman
                  kiokoman LAYER 8 last edited by kiokoman

                  if it's this one

                  4860.jpg

                  i don't see the eMMC soldered on
                  are you sure?
                  maybe you only need to clean the contacts

                  • jimp
                    jimp Rebel Alliance Developer Netgate last edited by

                    That is an mSATA disk.

                    • stephenw10
                      stephenw10 Netgate Administrator last edited by

                      If that SG-4860 is still in warranty you should open a ticket with us: https://go.netgate.com

                      Steve

                      • C
                        calmor15014 @stephenw10 last edited by

                        Thanks, but unfortunately I bought it in 2017.

                        I had a laptop platter drive and an mSATA cable lying around, so I installed it today to see if that remedies the issue. It's back up and running, so I'll monitor it. If it solves the problem, I'll probably install an SSD.

                        Thanks for your response and support. I've had great experience with everyone from Netgate.

                        If it's motherboard related, do you sell replacements?

                        • stephenw10
                          stephenw10 Netgate Administrator last edited by

                          If the eMMC has failed it would require a replacement board. If you open a ticket we can quote you for that.

                          I recommend going the mSATA route though, it will be a lot less expensive and we have seen that work reliably in similar cases. A bad eMMC is not an indication anything else on the board will fail.

                          Steve

                          • C
                            calmor15014 @kiokoman last edited by

                            @kiokoman Sorry, I should have mentioned it's actually the SG-4860-1U; I forgot there was a pretty significant hardware difference between the two. I don't have the device shown; on mine, everything is soldered onto the motherboard.

                            I do have three mSATA connectors in front of the CPU, however, and was able to add a new disk, install pfSense, and give it a test.

                            I expect at some point to see the same error messages, as the eMMC is still connected, but isn't being used for anything. Hopefully, it will keep working as normal afterward though. pfSense shows the correct size for the root disk so I know it's not using the eMMC as the system device.

                            • stephenw10
                              stephenw10 Netgate Administrator last edited by stephenw10

                              You have three SATA connectors on the board, for using regular SATA drives. The 1U also has SATA power connectors on the PSU.

                              There is only one mSATA socket and two mPCIe sockets. Just FYI if you use that.
                              https://docs.netgate.com/pfsense/en/latest/solutions/sg-4860-1u/msata-installation.html

                              Steve

                              • Derelict
                                Derelict LAYER 8 Netgate last edited by Derelict

                                mSATA. They are like $20 on Amazon.

                                https://www.amazon.com/TCSUNBOW-MSATA-60GB-Solid-Machine/dp/B077YWJVXB/

                                By far your cheapest option and it will be "snappier" than it was on eMMC.

                                • C
                                  calmor15014 @stephenw10 last edited by

                                  @stephenw10 yep, turns out I don't know what I'm talking about. :) I definitely used one of the SATA connectors with the laptop platter drive and the power supply connector. It's been up for about 24 hours now and working normally so far. No kernel messages after bootup complete.

                                  If it seems to be okay for a couple weeks I'll probably order the mSATA device as I'm sure it will be faster than an old 320GB laptop hard drive.

                                  Thanks again for everyone's help!

                                  • C
                                    calmor15014 last edited by

                                    Turns out the kernel logged that same sequence of messages last night, but as expected, the device continued to operate normally as none of the services are relying on the eMMC. Seems like the motherboard is operating normally. Another week or so of solid operation, and I'll look for an mSATA.

                                    Thanks again!

                                    • kiokoman
                                      kiokoman LAYER 8 last edited by kiokoman

                                      when you have the time, try to clean it with some isopropyl alcohol and a toothbrush. it does a fair job of getting rid of both water-based (oxide) and oil-based contaminants that can cause an intermittent connection. if that's not enough, a reballing/reflow would be necessary, but for that you need a technical expert able to do it.
                                      ... or just ignore it and mount an mSATA

                                      • C
                                        calmor15014 @kiokoman last edited by

                                        @kiokoman As mentioned above, on the SG-4860-1U that I have, there is no contact surface to clean - the eMMC is directly soldered to the PCB. Aside from trying to resolder it, there isn't much to do, and at that point it's too big a risk vs. the mSATA and ignoring kernel messages. I could probably change the config to avoid mounting da0 in the first place if it gets that frequent/irritating.


                                        © 2021 Rubicon Communications, LLC | Privacy Policy