Netgate 1100 not rebooting
-
Hmm, did you try rebooting on a completely clean install of 23.09.1?
-
@stephenw10 It turned out that the culprit was apcupsd, which somehow interfered with rebooting. Once I removed it, 24.03-RC rebooted normally.
I used apcupsd only because nut didn't find/connect to the UPS on the 1100.
-
Ah, nice catch!
Not seeing a bug for that, is it a known issue?
-
If this problem is specific to the 1100, then it might have been unknown. It might not be so common to connect a UPS to an 1100, I don't know. Should be easy to reproduce.
-
After uninstalling apcupsd and then rebooting the 1100 successfully, I thought I was done with the reboot problem. Then I read about someone with an 1100 and apcupsd having no problems with rebooting and thought I would check if my 1100 would still reboot. It turned out it did not, so apcupsd was not the culprit after all.
Tried the revised RC, though I was pretty sure it wouldn't change anything. It did not. Then I reinstalled 23.09.1 again and tried to reboot without first restoring the config. It did not reboot.
I have had the 1100 for less than three years, is it failing already? ** Edit: I've used RAM disk from the beginning **
What do flickering lights of the ethernet connectors indicate? Earlier I tried another power brick, but it didn't change anything.
Here's the recovery and reboot attempt log: putty.txt
-
Hmm, nothing there looks especially unusual, even the Bad MBR logs.
@pfsjap said in Netgate 1100 not rebooting:
What do flickering lights of the ethernet connectors indicate?
How exactly do you mean 'flickering'? Can you capture that in a video?
-
@stephenw10 Here's a short clip: Netgate 1100 - Trim.7z
-
Hmm, no that's not good. You shouldn't see the port LEDs light without anything connected. Does it do that continually?
-
@stephenw10 Continually after it starts? For as long as I have had patience to wait.
A reboot behaves like a shutdown: to make it boot again I have to unplug power, wait and then replug power. Sometimes I have to repeat the unplug-wait-replug cycle several times until booting succeeds. The longer I wait in between, the more likely booting succeeds.
-
Sorry, I mean does it start doing that immediately after powering it on or sometime later? Like before or after POST. If it starts flickering immediately that's probably a hardware issue. If it starts after a minute or so it could be the drivers.
-
After a reboot the LEDs stay off, but start flickering after about 4 minutes and 15 seconds (I don't know how consistent this time is, I only have one data point). At first it is kind of random, but over time it becomes like in that video clip.
Usually I don't wait for this long, but unplug power right away, then wait and replug again. After plugging in power the flickering may start right away, or not. Anyway, I just try cycling power again.
If power has been unplugged long enough, 1100 boots up right away.
-
What uboot version do you have?:
[24.03-RC][root@1100-3.stevew.lan]/root: kenv | grep smbios.bios
smbios.bios.reldate="10/07/2021"
smbios.bios.vendor="U-Boot"
smbios.bios.version="2018.03-devel-18.12.3-gc9aa92c-dirty"
-
@stephenw10 Seems to be same as in your post above:
[24.03-RC][admin@pfSense1100.localdomain]/root: uname -a
FreeBSD pfSense1100.localdomain 15.0-CURRENT FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Tue Apr 16 00:38:13 UTC 2024 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/aarch64/f8EaPNPx/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBSD-src-plus-RELENG_24_03/arm64.aarch64/sys/pfSense arm64
[24.03-RC][admin@pfSense1100.localdomain]/root: kenv | grep smbios.bios
smbios.bios.reldate="10/07/2021"
smbios.bios.vendor="U-Boot"
smbios.bios.version="2018.03-devel-18.12.3-gc9aa92c-dirty"
[24.03-RC][admin@pfSense1100.localdomain]/root:
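For comparing U-Boot builds between two units, pulling out just the version string might save some squinting. This is only a sketch: the function name uboot_version is made up for illustration, and it assumes nothing beyond the kenv line format shown in this thread.

```sh
#!/bin/sh
# Hypothetical helper; the name is illustrative only.
# Reads kenv-style lines (name="value") on stdin and prints the value of
# smbios.bios.version, or nothing if that variable is not present.
uboot_version() {
    sed -n 's/^smbios\.bios\.version="\(.*\)"$/\1/p'
}
```

On the firewall it would be fed like `kenv | uboot_version`. As far as I know, `kenv smbios.bios.version` prints the same value directly, so the helper is mostly useful for output saved to a file or pasted from a post.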
-
Hmm, try interrupting the boot at the loader. Will it reboot from there:
+---- Welcome to Netgate pfSense Plus ----+
|                                         |
|  1. Boot Multi user [Enter]             |
|  2. Boot Single user                    |
|  3. Escape to loader prompt             |
|  4. Reboot                              |
|  5. Cons: Dual (Serial primary)         |
|                                         |
|  Options:                               |
|  6. Kernel: default/kernel (1 of 2)     |
|  7. Boot Options                        |
|  8. Boot Environments                   |
|                                         |
+-----------------------------------------+
Exiting menu!
Type '?' for a list of commands, 'help' for more detailed help.
OK reboot
Will it reset from the uboot prompt?:
U-Boot 2018.03-devel-18.12.3-gc9aa92c-dirty (Oct 07 2021 - 18:20:55 -0300)
Model: Netgate 1100
       CPU     1200 [MHz]
       L2      800 [MHz]
       TClock  200 [MHz]
       DDR     750 [MHz]
DRAM:  1 GiB
Comphy chip #0:
Comphy-0: USB3          5 Gbps
Comphy-1: PEX0          2.5 Gbps
Comphy-2: SATA0         6 Gbps
SATA link 0 timeout.
AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
flags: ncq led only pmp fbss pio slum part sxs
PCIE-0: Link down
MMC:   sdhci@d0000: 0, sdhci@d8000: 1
Loading Environment from SPI Flash... SF: Detected mx25u3235f with page size 256 Bytes, erase size 64 KiB, total 4 MiB
OK
Model: Netgate 1100
Net:   eth0: neta@30000 [PRIME]
Read - switch port: 0x1, page: 0x0, reg: 0x0, val: 0xFFFF
Read - switch port: 0x2, page: 0x0, reg: 0x0, val: 0xFFFF
Switch Ports Disabled
Hit any key to stop autoboot:  0
Marvell>> reset
If it won't do either of those it's almost certainly a hardware fault.
-
@stephenw10 Will try when I have a chance, but I need more information (I'm not familiar with operating at this level):
- I know how to interrupt autoboot; it brings up the menu above. How do I get to the OK prompt from there? Escape key? ** Edit: nah, it's the 3rd option in the menu **
- To get the Marvell prompt I have used the recovery media in the USB port. Can I get to this prompt some other way? What should reset do?
-
Yup you can use menu option 3 or just hit Esc to reach the loader prompt.
The uboot (Marvell>>) prompt is before the loader. Just hit any key at the console (except return!) when you see:
Hit any key to stop autoboot: 0
That's part of the reinstall procedure, if you've done that you should have no problem. You don't need a USB drive.
-
@stephenw10 It didn't do either; both resulted in flickering LEDs. At this point not rebooting is just an inconvenience, but of course the device's condition may (and will) get worse with time.
This is not about rebooting, but can you tell me the correct way to run fsck in single user mode? This is not working:
Enter full pathname of shell or RETURN for /bin/sh:
# fsck -y /
fsck: cannot open `pfSense/ROOT/default': No such file or directory
# bectl list
BE                     Active Mountpoint Space Created
default                NR     /          2.36G 2024-04-17 09:56
default_20240417095629 -      -          968M  2023-12-06 23:26
#
Thank you for your help!
-
fsck doesn't run on ZFS. You shouldn't need to but you can scrub the zpool. See:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html
For example:
[24.03-RC][root@1100-3.stevew.lan]/root: zpool status
  pool: pfSense
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
config:
        NAME         STATE     READ WRITE CKSUM
        pfSense      ONLINE       0     0     0
          mmcsd0s3a  ONLINE       0     0     0
errors: No known data errors
[24.03-RC][root@1100-3.stevew.lan]/root: zpool scrub pfSense
[24.03-RC][root@1100-3.stevew.lan]/root: zpool status
  pool: pfSense
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub in progress since Fri Apr 19 14:06:16 2024
        1.49G / 1.49G scanned, 0B / 1.49G issued
        0B repaired, 0.00% done, no estimated completion time
config:
        NAME         STATE     READ WRITE CKSUM
        pfSense      ONLINE       0     0     0
          mmcsd0s3a  ONLINE       0     0     0
errors: No known data errors
It can take a while but runs in the background.
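If you'd rather not keep re-running zpool status by hand, something like the following could check that the root filesystem really is ZFS (so fsck doesn't apply) and then poll until the scrub is done. It's a rough sketch, not an official Netgate procedure: the function names root_fstype, scrub_in_progress and wait_for_scrub are made up here, the 30 second default interval is arbitrary, and it assumes the status text contains "scrub in progress" while a scrub is running, as in the output above.

```sh
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the filesystem type of / from `mount -p` style lines on stdin
# (fields: device, mountpoint, fstype, options, dump, pass).
root_fstype() {
    awk '$2 == "/" { print $3; exit }'
}

# Succeed if the `zpool status` text on stdin reports a running scrub.
scrub_in_progress() {
    grep -q 'scrub in progress'
}

# Poll `zpool status` for the given pool until no scrub is running,
# then print the final status so any repaired errors are visible.
wait_for_scrub() {
    pool=$1
    interval=${2:-30}   # seconds between polls; arbitrary default
    while zpool status "$pool" | scrub_in_progress; do
        sleep "$interval"
    done
    zpool status "$pool"
}
```

On the box itself that would be roughly `[ "$(mount -p | root_fstype)" = zfs ] && zpool scrub pfSense && wait_for_scrub pfSense`. Recent OpenZFS versions also have `zpool wait -t scrub`, which I believe does the waiting part on its own.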