SG-3100 hardware check



  • Hi there,

    I have an SG-3100 running 2.4.5p1 that I've been trying to configure for a while as a home firewall. It has the optional 32GB M.2 SSD fitted. It's been going fairly well, although lately I've noticed the WebGUI slowing down (taking 10-30 seconds to update the page after applying a change, for example).

    Last week the WebGUI locked up completely. I disconnected the network and connected a single PC directly to LAN1, and also tried the other LAN ports. The SG-3100 wouldn't respond to ping on its usual IP address. I tried with a different PC as well.

    SSH was still responsive, so I restored an old, known-working backup config, but it did nothing; the WebGUI still wouldn't log in. I've reset to factory defaults and now it's working again.

    How can I check that the hardware and software installation is OK before continuing to invest time in configuring it?

    Is it worth upgrading to 2.5.0 to reduce the chance of this happening again?

    Also the temperature generally runs 60-70 degrees Celsius - this seems pretty hot but no error is shown; is that normal?

    I tried running a fsck - output is below but it doesn't mean anything to me as I don't have Linux experience.

    I need to be confident the SG-3100 is reliable; I'm looking to purchase 3 more for elsewhere.

    I am based in the UK; is this something I should be referring to Amicatech as the UK distributor for? I've not found them very responsive in my limited dealings with them so far.

    Thanks for any advice you can give.

    Shell Output - fsck

    ** /dev/diskid/DISK-4AC907960DFB00000205s2a (NO WRITE)
    ** SU+J Recovering /dev/diskid/DISK-4AC907960DFB00000205s2a

    USE JOURNAL? no

    ** Skipping journal, falling through to full fsck

    SETTING DIRTY FLAG IN READ_ONLY MODE

    UNEXPECTED SOFT UPDATE INCONSISTENCY
    ** Last Mounted on /
    ** Root file system
    ** Phase 1 - Check Blocks and Sizes
    INCORRECT BLOCK COUNT I=29708 (8 should be 0)
    CORRECT? no

    INCORRECT BLOCK COUNT I=29709 (8 should be 0)
    CORRECT? no

    ** Phase 2 - Check Pathnames
    ** Phase 3 - Check Connectivity
    ** Phase 4 - Check Reference Counts
    UNREF FILE I=29631 OWNER=root MODE=100666
    SIZE=0 MTIME=Aug 29 17:26 2020
    CLEAR? no

    UNREF FILE I=29724 OWNER=root MODE=100600
    SIZE=209 MTIME=Sep 1 10:09 2020
    RECONNECT? no

    CLEAR? no

    UNREF FILE I=29736 OWNER=root MODE=100600
    SIZE=0 MTIME=Sep 1 08:24 2020
    RECONNECT? no

    CLEAR? no

    UNREF FILE I=29748 OWNER=root MODE=100600
    SIZE=0 MTIME=Sep 1 10:09 2020
    RECONNECT? no

    CLEAR? no

    ** Phase 5 - Check Cyl groups
    FREE BLK COUNT(S) WRONG IN SUPERBLK
    SALVAGE? no

    SUMMARY INFORMATION BAD
    SALVAGE? no

    BLK(S) MISSING IN BIT MAPS
    SALVAGE? no

    22288 files, 285614 used, 7273657 free (3289 frags, 908796 blocks, 0.0% fragmentation)
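
    For readers who, like the poster, find the fsck output hard to interpret, here is a minimal Python sketch that pulls two numbers out of it: the share of the file system in use (from the summary line) and how many repair prompts were answered "no" in a read-only pass. The function names `percent_used` and `declined_repairs` are made up for illustration, and the sketch assumes the "used" and "free" figures are reported in the same units.

```python
import re

# Summary line copied from the fsck output above.
SUMMARY = ("22288 files, 285614 used, 7273657 free "
           "(3289 frags, 908796 blocks, 0.0% fragmentation)")

SUMMARY_RE = re.compile(r"(\d+) files, (\d+) used, (\d+) free")
# Repair prompts that appear in the output above, each answered "no".
DECLINED_RE = re.compile(
    r"^[ \t]*(?:CORRECT|CLEAR|RECONNECT|SALVAGE)\? no[ \t]*$",
    flags=re.MULTILINE,
)


def percent_used(summary_line: str) -> float:
    """Return used / (used + free) as a percentage.

    Assumes both figures are in the same units.
    """
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not an fsck summary line")
    used, free = int(match.group(2)), int(match.group(3))
    return 100.0 * used / (used + free)


def declined_repairs(fsck_output: str) -> int:
    """Count repair prompts answered 'no', i.e. problems left unfixed."""
    return len(DECLINED_RE.findall(fsck_output))
```

    Calling `percent_used(SUMMARY)` gives the fraction of the root file system in use, and passing the saved fsck text to `declined_repairs` shows how many problems the read-only run found but did not repair.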



  • Hi,

    You have a backup of the config, right?

    Following https://www.youtube.com/watch?v=4DKr1Dvan5I, I recommend that you actually repair the file system. Your run was read-only (NO WRITE), so answer "yes" instead of "no" at each of the repair prompts.

    A complete reinstall 'from scratch' will also reinitialize the file system.

    Hooking it up to a UPS so the power isn't removed while it's running also helps a great deal.
    The proper way to shut it down before power removal is Diagnostics > Halt System (a console / SSH equivalent exists).



  • Hi Gertjan,

    Thank you for the quick reply. I'd seen the Netgate videos before but hadn't spotted this one. Very helpful. I've run it 5 times and it was clean on all except the second pass, when an error was corrected, so hopefully that reduces the chance of future errors.

    I've tried to be careful to halt the system every time. Coming from domestic routers it was a shock when I learnt that it can't handle power interruptions well. Thankfully we generally have very reliable power here (1 noticeable power cut every few years). I'll consider a UPS, although they can be a pain and I've seen a fair few systems where they've caused more grief than they've fixed (regular battery replacement springs to mind...).

    I'll keep working on re-programming the new config now and hope this issue doesn't recur. I use auto config backup and also save out manual backups after any major changes.

    Thanks again for your help; really appreciated.



  • Hello!

    I have deployed a couple of SG-3100s. I initially set them up with 32GB drives (TS32GMTS400S). The thought at the time was that I might need the space down the road.

    Long story short, after numerous problems during setup and break-in, I ditched the extra drives. I don't know if it was a bad batch of drives, or if the Netgate just didn't like them.

    I am just running basic edge installs with pfb, snort, openvpn, and a few other support packages. No feed hoarding. Nothing fancy.

    I am only using about 20% of the built-in drive space and don't think I will need more.

    The temps you are seeing appear to be normal and not an issue.

    I do run a UPS with nut/apcupsd. I have heard that running with RAM drives can help with stability. Hopefully a pro will chime in.

    The Netgates have been rock solid and worry free.

    John



  • Hi John,

    Thanks for your thoughts. I chose to add the SSD (genuine Netgate, not third party) to allow space for packages under the advice of the distributor who said I might run out of space with packages like snort etc. otherwise - i.e. same thinking as you. Good to know you haven't run out of space. I presume when you ditched the drives you had to re-install from scratch?

    I'll certainly think twice before spec'ing them again. I'll call the distributor and see if there are any known issues with the drives.

    Thanks again,

    Tim


  • Netgate Administrator

    @noisybloke said in SG-3100 hardware check:

    Also the temperature generally runs 60-70 degrees Celsius - this seems pretty hot but no error is shown; is that normal?

    That is the expected range for the on-die sensor in the SoC in the SG-3100. They all run 60-70C.

    If it seemed to be getting progressively slower, was there any indication of the RAM being exhausted or the drive filling?

    I'm not aware of any issue with M.2 SATA drives in them, but a bad drive could affect anything, I guess. Generally they help speed up anything that involves drive reads/writes; they are significantly faster than eMMC. That includes opening pages in the GUI.

    The issues we usually see are out-of-control logging. It's easy to enter some bad values and end up with a package not rotating its logs.
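
    One way to look for a log that is not being rotated is to list the largest files under the log directory. The following is only a sketch: `largest_files` is a made-up helper name, `/var/log` is an assumed location, and it presumes a Python 3 interpreter is available wherever it is run.

```python
import os
from typing import List, Tuple


def largest_files(root: str, top_n: int = 10) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, path) for the top_n largest files under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # The file vanished (e.g. was rotated) or cannot be read.
                continue
    # Largest first; ties fall back to comparing the path strings.
    return sorted(sizes, reverse=True)[:top_n]
```

    For example, `largest_files("/var/log", top_n=5)` returns the five biggest files; one that keeps growing between runs is a candidate for a misconfigured package.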

    And, yeah, I would agree with running an SSD if you want any sort of substantial logging from Snort or Squid etc. Or indeed anything that requires frequent drive writes.

    Steve



  • Hi Steve,

    Thanks for your input and tips. No, I never saw any increase on the memory (circa 10%, max 20% approx) or disc usage (circa 5%) indicators on the dashboard. It increased a bit when I tried installing a few packages, but not by much.
    Great to know there isn't a known issue with the M.2 SATA drives.

    My impression is that running Snort or Squid has little value these days but Suricata could be worthwhile. It's some way down the road for me; I want to build it step by step and ensure the core functionality is rock solid first.

    Thanks a lot,
    Tim



  • @noisybloke said in SG-3100 hardware check:

    Coming from domestic routers it was a shock when I learnt that it can't handle power interruptions well.

    These domestic routers do not have a file system like the one you would find on a PC or NAS.
    pfSense could be run from ROM with minimal dynamic data storage and some NVRAM for the config, but in that case upgrading would be far more complicated, and there would be no more packages and no more dynamic data views. It would become just another SOHO router.

    Rip out the power cable of your PC: after a couple of times your PC will complain, if it still boots.

    @noisybloke said in SG-3100 hardware check:

    (1 noticeable power cut every few years)

    You are wired up yourself? ;)
    A blackout that kills all the lights is just one example of a power outage. The ones that 'hurt' a system are far more common.

    Btw: still, power issues rarely actually kill a device physically. It's just wrong data getting written, or data written to the wrong place, or something like that. Rebuilding (reformatting) the disk will take care of things. Just make sure your config is saved regularly. One of my PCs runs a small program that logs in using SSH, executes the 'Diagnostics > Backup & Restore', retrieves the complete config, saves the file and logs out. A set-it-and-forget-it installation.
    Take note of the "Netgate Device ID" and the 'Device key', which are useful for retrieving a backup of what has been sent to Netgate's remote backup storage; see Services > Auto Configuration Backup > Restore.
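
    The backup program itself is not shown in the thread. As a rough sketch of the same idea, the Python below swaps in a simpler technique: instead of driving 'Diagnostics > Backup & Restore', it copies the configuration file with `scp` and gives each copy a timestamped name. Everything specific here is an assumption for illustration: key-based SSH access for the chosen user, the config living at `/cf/conf/config.xml`, and the helper names `backup_target`, `scp_command` and `fetch_config`.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import List

# Assumed location of the running configuration on the firewall.
REMOTE_CONFIG = "/cf/conf/config.xml"


def backup_target(dest_dir: Path, now: datetime) -> Path:
    """Build a timestamped local file name, e.g. config-20200901-100900.xml."""
    return dest_dir / f"config-{now:%Y%m%d-%H%M%S}.xml"


def scp_command(host: str, user: str, target: Path) -> List[str]:
    """Build the scp invocation that copies the remote config to target."""
    return ["scp", "-q", f"{user}@{host}:{REMOTE_CONFIG}", str(target)]


def fetch_config(host: str, user: str, dest_dir: Path) -> Path:
    """Copy the config from the firewall; raises CalledProcessError on failure."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = backup_target(dest_dir, datetime.now())
    subprocess.run(scp_command(host, user, target), check=True)
    return target
```

    A scheduled job on the PC could then call something like `fetch_config("192.168.1.1", "admin", Path("pfsense-backups"))`, where the address, user name and directory are placeholders to adapt.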

