SG1100 Failing to boot after power loss (UPS protected)
-
Hello
I have a ticket with support but thought I'd open this up to the community to see if anyone else has come across a similar issue.
Context:
We run a fleet of SG1100s (20 or so) and all of them are protected by UPS.
Issue:
One of the sites (generator for mains power) has an issue where each day when the gennie is fired up, and the UPS is turned on, all devices come online except the SG1100.
The SG1100 powers up, but the serial console simply displays "#". From there you can press enter and nothing happens except "#" on a new line. No networking works through the SG1100 at this point. To my surprise, you're able to enter "reboot" and the device reboots and begins to properly function. I'm unsure what other commands would work. Manually cycling power also makes the device actually work.
We managed to do some further testing with the SG1100. Here are our findings.
• Replaced SG1100 with a brand-new unit. Issue persists.
• Replaced power supply with brand new power supply. Issue persists.
• Replaced USB serial cable with brand new serial cable. Issue persists.
• Removed USB serial cable altogether. Issue resolved.
Since drafting this message, the issue has presented again at one of our other sites. This other site is online 24/7 but had a power failure, so the issue is less frequent there.
Has anyone else experienced a similar issue?
Cheers
Vinny
-
@vinnyzuk Sounds like it’s sitting at a shell prompt. Guessing the option to require a password is disabled (default, IIRC) but it should be at the menu.
I would connect the console beforehand to see what is being shown before that point.
To be clear, is the 1100 actually losing power?
Or is it something connected to it that loses power, maybe?
-
Thanks for the quick response Steve.
Yes, the 1100 is losing power. The breakers for the UPS output are shut off before the generator is shut down. The breakers are then turned back on once generator power is restored.
It only just occurred to me that the PC that is connected via serial is booting up at the same time as the Netgate, so I have been missing the serial output from before the PC boots.
I'll plug a laptop into the serial and see what the output is from the beginning.
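For capturing the output from the very first byte, something like the rough Python sketch below could be left running on the laptop before the Netgate is powered on. It assumes the third-party pyserial package and a console speed of 115200 baud; the function names and the console.log file name are made up for illustration. Non-printable bytes are written as \xNN escapes so any stray characters on the line show up in the log.

```python
import time
from typing import Callable, Iterable, Iterator


def format_line(line: bytes, when: float) -> str:
    """Render one console line, escaping non-printable bytes as \\xNN."""
    text = "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in line)
    return f"{when:.3f} {text}"


def timestamp_lines(chunks: Iterable[bytes],
                    clock: Callable[[], float] = time.time) -> Iterator[str]:
    """Split a raw byte stream into lines, each prefixed with a timestamp."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield format_line(line.rstrip(b"\r"), clock())
    if buffer:
        # Emit whatever is left, e.g. a prompt with no trailing newline.
        yield format_line(buffer, clock())


def read_port(port: str, baud: int = 115200) -> Iterator[bytes]:
    """Yield raw chunks from a serial port until interrupted (needs pyserial)."""
    import serial  # third-party: pip install pyserial

    with serial.Serial(port, baudrate=baud, timeout=1) as conn:
        while True:
            yield conn.read(256)  # returns b"" when the 1 s timeout expires


def main(port: str, log_path: str = "console.log") -> None:
    """Print and append every console line to a log file. Stop with Ctrl+C."""
    with open(log_path, "a") as log:
        for entry in timestamp_lines(read_port(port)):
            print(entry)
            log.write(entry + "\n")
            log.flush()
```

Calling main("COM4") on Windows, or main with whatever device name the adapter gets on Linux or macOS, would start the capture. A line is only written once its newline arrives, so a bare "#" prompt would not appear in the log until more output follows it.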
Regardless, I still don't understand why pulling the SG1100 power plug and plugging it back in, or typing "reboot", resolves the issue.
-
If the console is attached permanently I'd guess it's sending some rogue character and interrupting the boot.
If you enter "?" at the # prompt, what does it return?
-
@stephenw10 I'll give that a go and let you know what I find. Won't be for a few days. Will update here.
-
Update
For some further context, the site has an Intel NUC mini PC attached to the comms system to allow for remote troubleshooting.
Did some more in-depth troubleshooting. Learned that the issue only occurs when serial is plugged into the NUC specifically, and the NUC is off when the Netgate is switched on.
o Does not happen with my laptop in place of the NUC.
o Does not happen if the NUC is already booted.
o Happens even with everything unplugged from the NUC (including the PSU) except the serial cable.
Because the issue only happens with the NUC when the NUC hasn't booted yet, there is no way to see the serial output as the Netgate boots. I don't suppose the Netgate can log that internally for later viewing?
As for why this happens, I can only assume some weird power-off circuitry in the NUC upsets the Netgate's boot process. I've tried changing various BIOS settings related to USB power and such, but nothing seems to make a difference.
Another weird issue that was very intermittent was PuTTY saying "COM4 Error 5- Access denied" despite a fresh boot and running as admin. After a few minutes this would go away.
-
You could try adding a boot delay to the 1100 so the NUC has finished booting before it starts.
Though if it sends an escape sequence that can interrupt the boot, that could cause a problem at any time. The "#" implies the root prompt, which is an odd place to end up.
The 1100 logs everything once the kernel starts, but it doesn't log input at the console.