Netgate 1100 not rebooting
-
@stephenw10 Rebooting 23.09.1 from the console fails:
FreeBSD/arm64 (pfSense1100.localdomain) (ttyu0)
Netgate 1100 Netgate Device ID: ***
Serial: ***
*** Welcome to Netgate pfSense Plus 23.09.1-RELEASE (arm64) on pfSense1100 ***
Current Boot Environment: default_20240407162645
Next Boot Environment: default_20240407162645
WAN (wan) -> mvneta0.4090 -> v4/DHCP4: 91.155.14.186/23
LAN (lan) -> mvneta0.4091 -> v4: 192.168.99.1/24
OPT (opt1) -> mvneta0.4092 -> v4: 172.16.1.1/24
0) Logout (SSH only)                  9) pfTop
1) Assign Interfaces                  10) Filter Logs
2) Set interface(s) IP address        11) Restart webConfigurator
3) Reset webConfigurator password     12) PHP shell + Netgate pfSense Plus tools
4) Reset to factory defaults          13) Update from console
5) Reboot system                      14) Disable Secure Shell (sshd)
6) Halt system                        15) Restore recent configuration
7) Ping host                          16) Restart PHP-FPM
8) Shell
Enter an option: 5
Netgate pfSense Plus will reboot. This may take a few minutes, depending on your hardware.
Do you want to proceed?
Y/y: Reboot normally
R/r: Reroot (Stop processes, remount disks, re-run startup sequence)
Enter: Abort
Enter an option: y
Netgate pfSense Plus is rebooting now.
Stopping package apcupsd...done.
Stopping /usr/local/etc/rc.d/pfb_dnsbl.sh...done.
Stopping /usr/local/etc/rc.d/pfb_filter.sh...done.
pflog0: promiscuous mode disabled
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 done
All buffers synced.
Uptime: 4d20h8m34s
Khelp module "ertt" can't unload until its ref
I also couldn't get the 1100 to boot, so I reinstalled 23.09.1 again. Having nothing to lose at this point, I updated to 24.03-RC. The 1100 didn't reboot during the update, but it did boot after having been unplugged from power for a while. The update finished after that boot, but I had the same problem as on the 6100 with unbound not starting, so here is another reboot attempt and the boot after that:
Netgate 1100 Netgate Device ID: **
Serial: **
*** Welcome to Netgate pfSense Plus 24.03-RC (arm64) on pfSense1100 ***
Current Boot Environment: default
Next Boot Environment: default
WAN (wan) -> mvneta0.4090 -> v4/DHCP4: 91.155.14.186/23
LAN (lan) -> mvneta0.4091 -> v4: 192.168.99.1/24
OPT (opt1) -> mvneta0.4092 -> v4: 172.16.1.1/24
0) Logout (SSH only)                  9) pfTop
1) Assign Interfaces                  10) Filter Logs
2) Set interface(s) IP address        11) Restart GUI
3) Reset admin account and password   12) PHP shell + Netgate pfSense Plus tools
4) Reset to factory defaults          13) Update from console
5) Reboot system                      14) Disable Secure Shell (sshd)
6) Halt system                        15) Restore recent configuration
7) Ping host                          16) Restart PHP-FPM
8) Shell
Enter an option: 5
Netgate pfSense Plus will reboot. This may take a few minutes, depending on your hardware.
Do you want to proceed?
Y/y: Reboot normally
R/r: Reroot (Stop processes, remount disks, re-run startup sequence)
Enter: Abort
Enter an option: y
Netgate pfSense Plus is rebooting now.
Stopping package apcupsd...done.
Stopping /usr/local/etc/rc.d/pfb_dnsbl.sh...done.
Stopping /usr/local/etc/rc.d/pfb_filter.sh...done.
pflog0: promiscuous mode disabled
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 done
All buffers synced.
Uptime: 17m31s
During USB recovery the console displayed these errors:
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 1 for devices... 2 USB Device(s) found
scanning usb for storage devices... 1 Storage Device(s) found
18022 armada-3720-netgate-1100.dtb
18022 armada-3720-sg1100.dtb
12944 armada-3720-netgate-2100.dtb
12944 armada-3720-sg2100.dtb
System Volume Information/
132485 config.xml
5 file(s), 1 dir(s)
2097152 bytes read in 104 ms (19.2 MiB/s)
18022 bytes read in 17 ms (1 MiB/s)
## Starting EFI application at 07000000 ...
Card did not respond to voltage select!
Scanning disk sdhci@d0000.blk...
Disk sdhci@d0000.blk not ready
Scanning disk sdhci@d8000.blk...
bad MBR sector signature 0x0000
bad MBR sector signature 0x0000
. . .
bad MBR sector signature 0x0000
bad MBR sector signature 0x0000
Scanning disk usb_mass_storage.lun0...
Found 5 disks
-
None of those errors is a concern; all 1100s show them.
Do you have any additional hardware attached to that 1100?
-
@stephenw10 The console, OPT and WAN ports were connected, and a UPS was also connected to the USB3 port when I tried to reboot 23.09.1 above. Then I disconnected the UPS, and connected it again after the update and the boots above.
This reboot problem started around the same time I connected the UPS to the 1100, but that may just be coincidental.
-
Hmm, did you try rebooting on a completely clean install of 23.09.1?
-
@stephenw10 It turned out that the culprit was apcupsd, which somehow interfered with rebooting. Once I removed it, 24.03-RC rebooted normally.
I used apcupsd only because NUT didn't find or connect to the UPS on the 1100.
-
Ah, nice catch!
Not seeing a bug for that, is it a known issue?
-
If this problem is specific to the 1100, then it might have been unknown. It might not be that common to connect a UPS to an 1100, I don't know. It should be easy to reproduce.
-
After uninstalling apcupsd and then rebooting the 1100 successfully, I thought I was done with the reboot problem. Then I read about someone with an 1100 and apcupsd having no problems with rebooting, and thought I would check whether my 1100 would still reboot. It turned out it did not, so apcupsd was not the culprit after all.
I tried the revised RC, though I was pretty sure it wouldn't change anything. It did not. Then I reinstalled 23.09.1 again and tried to reboot without first restoring the config. It did not reboot.
I have had 1100 for less than three years, is it failing already? ** Edit: I've used RAM disk from the beginning **
What do flickering lights of the ethernet connectors indicate? Earlier I tried another power brick, but it didn't change anything.
Here's the recovery and reboot attempt log: putty.txt
-
Hmm, nothing there looks especially unusual, even the Bad MBR logs.
@pfsjap said in Netgate 1100 not rebooting:
What do flickering lights of the ethernet connectors indicate?
How exactly do you mean 'flickering'? Can you capture that in a video?
-
@stephenw10 Here's a short clip: Netgate 1100 - Trim.7z
-
Hmm, no that's not good. You shouldn't see the port LEDs light without anything connected. Does it do that continually?
-
@stephenw10 Continually after it starts? For as long as I have had patience to wait.
A reboot behaves like a shutdown: to make it boot again I have to unplug power, wait, and then replug it. Sometimes I have to repeat the unplug-wait-replug cycle several times until booting succeeds. The longer I wait in between, the more likely it is that booting succeeds.
-
Sorry, I mean does it start doing that immediately after powering it on, or some time later? Like before or after POST. If it starts flickering immediately, that's probably a hardware issue. If it starts after a minute or so, it could be the drivers.
-
After a reboot the LEDs stay off, but start flickering after about 4 minutes and 15 seconds (I don't know how consistent this time is, I only have one data point). At first it is kind of random, but over time it becomes like in that video clip.
Usually I don't wait for this long, but unplug power right away, then wait and replug again. After plugging in power the flickering may start right away, or not. Anyway, I just try cycling power again.
If power has been unplugged long enough, 1100 boots up right away.
-
-
What uboot version do you have?:
[24.03-RC][root@1100-3.stevew.lan]/root: kenv | grep smbios.bios
smbios.bios.reldate="10/07/2021"
smbios.bios.vendor="U-Boot"
smbios.bios.version="2018.03-devel-18.12.3-gc9aa92c-dirty"
-
@stephenw10 Seems to be same as in your post above:
[24.03-RC][admin@pfSense1100.localdomain]/root: uname -a
FreeBSD pfSense1100.localdomain 15.0-CURRENT FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Tue Apr 16 00:38:13 UTC 2024 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/aarch64/f8EaPNPx/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBSD-src-plus-RELENG_24_03/arm64.aarch64/sys/pfSense arm64
[24.03-RC][admin@pfSense1100.localdomain]/root: kenv | grep smbios.bios
smbios.bios.reldate="10/07/2021"
smbios.bios.vendor="U-Boot"
smbios.bios.version="2018.03-devel-18.12.3-gc9aa92c-dirty"
[24.03-RC][admin@pfSense1100.localdomain]/root:
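For anyone comparing U-Boot versions between units, here is a minimal sketch that pulls just the version string out of the kenv output shown above. The function name `uboot_version` is made up for illustration; it only filters text on stdin and assumes the `smbios.bios.version="..."` line format seen in this thread.

```sh
#!/bin/sh
# Hypothetical helper: print the value of smbios.bios.version
# from `kenv` output supplied on stdin. Prints nothing if the key is absent.
uboot_version() {
    sed -n 's/^smbios\.bios\.version="\(.*\)"$/\1/p'
}
```

On the firewall itself this would be used as `kenv | uboot_version`, and the single-line result from two devices can then be compared by eye or with a plain string comparison.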
-
Hmm, try interrupting the boot at the loader. Will it reboot from there:
+---- Welcome to Netgate pfSense Plus ----+
1. Boot Multi user [Enter]
2. Boot Single user
3. Escape to loader prompt
4. Reboot
5. Cons: Dual (Serial primary)
Options:
6. Kernel: default/kernel (1 of 2)
7. Boot Options
8. Boot Environments
+-----------------------------------------+
Exiting menu!
Type '?' for a list of commands, 'help' for more detailed help.
OK reboot
Will it reset from the uboot prompt?:
U-Boot 2018.03-devel-18.12.3-gc9aa92c-dirty (Oct 07 2021 - 18:20:55 -0300)
Model: Netgate 1100
CPU 1200 [MHz]
L2 800 [MHz]
TClock 200 [MHz]
DDR 750 [MHz]
DRAM: 1 GiB
Comphy chip #0:
Comphy-0: USB3 5 Gbps
Comphy-1: PEX0 2.5 Gbps
Comphy-2: SATA0 6 Gbps
SATA link 0 timeout.
AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
flags: ncq led only pmp fbss pio slum part sxs
PCIE-0: Link down
MMC: sdhci@d0000: 0, sdhci@d8000: 1
Loading Environment from SPI Flash... SF: Detected mx25u3235f with page size 256 Bytes, erase size 64 KiB, total 4 MiB
OK
Model: Netgate 1100
Net: eth0: neta@30000 [PRIME]
Read - switch port: 0x1, page: 0x0, reg: 0x0, val: 0xFFFF
Read - switch port: 0x2, page: 0x0, reg: 0x0, val: 0xFFFF
Switch Ports Disabled
Hit any key to stop autoboot: 0
Marvell>> reset
If it won't do either of those it's almost certainly a hardware fault.
-
@stephenw10 I will try when I have a chance, but I need more information (I'm not familiar with operating at this level):
- I know how to interrupt autoboot; it brings up the menu above. How do I get to the OK prompt from there? Escape key? ** Edit: nah, it's the 3rd option in the menu **
- To get the Marvell prompt I have used the recovery media in the USB port. Can I get to this prompt some other way? What should reset do?
-
-
Yup you can use menu option 3 or just hit Esc to reach the loader prompt.
The uboot (Marvell>>) prompt comes before the loader. Just hit any key at the console (except Return!) when you see:
Hit any key to stop autoboot: 0
That's part of the reinstall procedure, so if you've done that you should have no problem. You don't need a USB drive.
-
@stephenw10 It didn't do either; both resulted in flickering LEDs. At this point not rebooting is just an inconvenience, but of course the device's condition may (and will) get worse with time.
This is not about rebooting, but can you tell me what's the correct way to run fsck in single user mode? This is not working:
Enter full pathname of shell or RETURN for /bin/sh:
# fsck -y /
fsck: cannot open `pfSense/ROOT/default': No such file or directory
# bectl list
BE                     Active Mountpoint Space Created
default                NR     /          2.36G 2024-04-17 09:56
default_20240417095629 -      -          968M  2023-12-06 23:26
#
Thank you for your help!
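A hedged note on the fsck question above: the name in the error (`pfSense/ROOT/default`) and the working `bectl list` suggest the root filesystem is a ZFS dataset, and fsck is a tool for UFS-style filesystems, not ZFS. On ZFS the usual integrity check is a pool scrub. The sketch below only prints a suggested command and runs nothing itself. The function names are hypothetical, and it assumes that the pool name is the part of the dataset name before the first "/".

```sh
#!/bin/sh
# Sketch only: choose a consistency-check command from the root filesystem type.
# Function names are hypothetical; nothing here runs a check by itself.

# Read `mount -p` style lines on stdin (device, mount point, type, options, ...)
# and print "<device> <type>" for the entry mounted on "/".
root_entry() {
    awk '$2 == "/" { print $1, $3; exit }'
}

# Print the command that would check the given device for the given type.
# For ZFS the pool name is assumed to be the dataset name up to the first "/".
suggest_check() {
    device=$1
    fstype=$2
    case "$fstype" in
        zfs) echo "zpool scrub ${device%%/*}" ;;
        ufs) echo "fsck -y $device" ;;
        *)   echo "no suggestion for filesystem type: $fstype" >&2; return 1 ;;
    esac
}
```

At the single-user prompt this could be called as `suggest_check $(mount -p | root_entry)`. If the root entry is `pfSense/ROOT/default` with type `zfs`, it prints `zpool scrub pfSense`. After starting a scrub, `zpool status pfSense` reports its progress and any errors found.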