pfSense+ 24.11 on my 1100 reboots every couple of hours
-
I was unable to upgrade my 1100 from 24.03 to 24.11 (it failed in many spectacular ways), so I did a fresh install and restored my config from backup. Now the router reboots itself every few hours. There is nothing in the log other than the usual bootup sequence messages.
Which logs can I easily examine to learn what's going on? I am NOT a Unix or BSD expert, but I can follow directions easily enough.
-
Wow, the current cycle lasted 30 whole minutes before rebooting again.
Dec 1 14:23:03 router php-cgi[735]: rc.bootup: The command '/usr/local/sbin/strongswanrc stop' returned exit code '1', the output was 'strongswan not running? (check /var/run/daemon-charon.pid).'
Dec 1 14:23:02 router kernel: ....
Dec 1 14:23:02 router check_reload_status[731]: Updating all dyndns
Dec 1 14:23:01 router kernel: done.
Dec 1 14:23:00 router kernel: done.
Dec 1 14:23:00 router php-cgi[735]: rc.bootup: NTPD is starting up.
Dec 1 13:56:55 router kernel: done.
Dec 1 13:56:54 router kernel: done.
Dec 1 13:56:54 router php-cgi[735]: rc.bootup: sync unbound done.
Just decided to reboot because, reasons. I was watching YouTube.
-
Do you see an alert or crash report after it reboots?
What's shown in the system logs immediately before it reboots?
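For anyone who would rather not scroll for it, here is a minimal sketch of one way to pull out the lines logged just before each boot. It assumes the system log has been saved as a plain-text file in oldest-first order (the GUI view may be sorted the other way) and uses the kernel's `---<<BOOT>>---` line as the marker for a fresh boot. The function name and the `context` parameter are made up for illustration.

```python
from collections import deque

BOOT_MARKER = "---<<BOOT>>---"  # kernel line that marks the start of a boot

def lines_before_boot(lines, context=5, marker=BOOT_MARKER):
    """Return, for each boot marker, the `context` lines logged just before it.

    Assumes `lines` is in chronological (oldest-first) order.
    """
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            results.append(list(recent))  # snapshot of what preceded this boot
            recent.clear()
        else:
            recent.append(line)
    return results
```

Passing an open file object as `lines` works, since files iterate line by line. If the window before a boot contains only routine messages, that in itself suggests the system went down without getting a chance to log anything.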
-
@stephenw10 No, no error in the web UI. And I included a snippet of the log above. 13:56:55, done. 14:23, starting to reboot. This is from Status/System Logs/System/General. Is there some other log I should be looking at?
One other point of note is I had left the bootable USB drive attached (which contains the installer). I didn't think leaving it there would cause any harm as the boot sequence normally ignores it. I have since unplugged it.
-
Got another reboot:
Uptime 00 Hour 27 Minutes 09 Seconds
Dec 1 18:56:25 router kernel: .done.
Dec 1 18:56:25 router kernel: ....
Dec 1 18:56:25 router check_reload_status[680]: Updating all dyndns
Dec 1 18:56:24 router kernel: done.
Dec 1 18:56:23 router kernel: done.
Dec 1 18:56:23 router php-cgi[684]: rc.bootup: NTPD is starting up.
Dec 1 14:23:54 router kernel: done.
Dec 1 14:23:53 router kernel: done.
Dec 1 14:23:53 router php-cgi[684]: rc.bootup: sync unbound done.
about 25 minutes ago. No errors in the web UI.
-
Hmm, what log are you taking that from? It looks odd.
Are you running ZFS? Ram disks?
-
@stephenw10 In the web GUI, from Status/System Logs/System/General, visible from status_logs.php
I am running ZFS and a RAM disk, configured by default.
It rebooted again overnight, about 2 hours ago.
-
Which RAM disk are you running? If it's /var you are probably losing logs when it restarts. Try disabling that at least as a test so you can get the full logs across a reboot.
-
@stephenw10 OK disabled my RAM disks, no new interesting info in the logs after the last reboot:
Dec 3 12:31:31 router kernel: .done.
Dec 3 12:31:30 router kernel: ...
Dec 3 12:31:30 router check_reload_status[633]: Updating all dyndns
Dec 3 12:31:30 router kernel: done.
Dec 3 12:31:28 router kernel: done.
Dec 3 12:31:28 router php-cgi[637]: rc.bootup: NTPD is starting up.
Dec 3 12:31:28 router kernel: done.
Dec 3 12:23:43 router php-fpm[1550]: <remainder removed to avoid spam detector>
Router rebooted at 12:30.
-
Hmm, and that last line was nothing interesting?
The next step would be to connect the serial console to something and log the output. If anything is shown it will be there.
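Since the box may sit for hours before it reboots, it helps to timestamp the console output on the machine doing the logging. A minimal sketch, assuming your terminal program can write the serial output to a pipe or stdout; the function name is made up for illustration, and this only stamps lines, it does not open the serial port itself:

```python
from datetime import datetime

def stamp_lines(stream, out, now=datetime.now):
    """Copy each line from `stream` to `out`, prefixed with the time it arrived."""
    for line in stream:
        out.write(f"{now().isoformat(timespec='seconds')} {line.rstrip()}\n")
        out.flush()  # flush per line so the log is current if the logger is interrupted
```

For example, `stamp_lines(sys.stdin, open("console.log", "a"))` (after `import sys`; the file name is arbitrary) appends stamped lines to a file as they arrive on stdin.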
If it reboots every few hours that should at least be relatively easy.
-
A couple more reboots overnight. The only interesting thing is that snort was active at the time:
1:21am:
Dec 4 01:21:22 router kernel: GDB: debug ports: uart
Dec 4 01:21:22 router kernel: ---<<BOOT>>---
Dec 4 01:21:22 router syslogd: kernel boot file is /boot/kernel/kernel
Dec 4 01:21:29 router snort[49050]: [120:32:1] (http_inspect) RANGE FIELD NOT PRESENT IN GET METHOD, BUT RESPONSE WITH PARTIAL CONTENT [Classification: Unknown Traffic] [Priority: 3] {TCP} 23.32.75.29:80 -> 73.140.138.66:64173
Dec 4 01:21:29 router snort[49050]: [120:32:1] (http_inspect) RANGE FIELD NOT PRESENT IN GET METHOD, BUT RESPONSE WITH PARTIAL CONTENT [Classification: Unknown Traffic] [Priority: 3] {TCP} 23.32.75.35:80 -> 73.140.138.66:48862
5:20am:
Dec 4 05:20:22 router kernel: GDB: debug ports: uart
Dec 4 05:20:22 router kernel: ---<<BOOT>>---
Dec 4 05:20:22 router syslogd: kernel boot file is /boot/kernel/kernel
Dec 4 05:20:16 router snort[97846]: [120:32:1] (http_inspect) RANGE FIELD NOT PRESENT IN GET METHOD, BUT RESPONSE WITH PARTIAL CONTENT [Classification: Unknown Traffic] [Priority: 3] {TCP} 23.53.122.218:80 -> 73.140.138.66:4258
Dec 4 05:20:16 router snort[97846]: [120:32:1] (http_inspect) RANGE FIELD NOT PRESENT IN GET METHOD, BUT RESPONSE WITH PARTIAL CONTENT [Classification: Unknown Traffic] [Priority: 3] {TCP} 23.53.122.199:80 -> 73.140.138.66:7185
-
@DaveWh Snort on an 1100?
I am using a few 1100s now and they're very memory constrained. I wouldn't be surprised if your issues are in some way related to memory pressure.
If possible, can you disable Snort, reboot (to clear up memory), then monitor?
-
@michmoor OK will do. I had snort running on 24.03 just fine...
-
Even if it uses all the CPU cycles it still shouldn't cause a crash or a reboot!
-
@stephenw10 snort uninstalled, got a reboot, nothing interesting in the log:
*** Welcome to Netgate pfSense Plus 24.11-RELEASE (arm64) on router ***
Current Boot Environment: default
Next Boot Environment: default

WAN (wan) -> mvneta0.4090 -> v4/DHCP4: 73.140.138.66/23
LAN (lan) -> mvneta0.4091 -> v4: 192.168.1.1/24
OPT (opt1) -> mvneta0.4092 ->
WG_VPN (opt2) -> tun_wg0 -> v4: 10.200.0.1/24

0) Logout / Disconnect SSH 9) pfTop
1) Assign Interfaces 10) Filter Logs
2) Set interface(s) IP address 11) Restart GUI
3) Reset admin account and password 12) PHP shell + Netgate pfSense Plus tools
4) Reset to factory defaults 13) Update from console
5) Reboot system 14) Enable Secure Shell (sshd)
6) Halt system 15) Restore recent configuration
7) Ping host 16) Restart PHP-FPM
8) Shell
Enter an option: TIM-1.0
WTMI-devel-18.12.1-1a13f2f
WTMI: system early-init
SVC REV: 5, CPU VDD voltage: 1.237V
NOTICE: Booting Trusted Firmware
NOTICE: BL1: v1.5(release):1f8ca7e-dirty (Marvell-devel-18.12.2)
NOTICE: BL1: Built : 18:22:47, Oct 7 2021
NOTICE: BL1: Booting BL2
NOTICE: BL2: v1.5(release):1f8ca7e-dirty (Marvell-devel-18.12.2)
NOTICE: BL2: Built : 18:22:52, Oct 7 2021
NOTICE: BL1: Booting BL31
NOTICE: BL31: v1.5(release):1f8ca7e-dirty (Marvell-devel-18.12.2)
NOTICE: BL31: Built : 18U-Boot 2018.03-devel-18.12.3-gc9aa92c-dirty (Oct 07 2021 - 18:20:55 -0300)
Model: Netgate 1100
CPU 1200 [MHz]
L2 800 [MHz]
TClock 200 [MHz]
DDR 750 [MHz]
DRAM: 1 GiB
"Enter an option: " was the prompt, and then when the reboot cycle began, it output "TIM-1.0".
-
Hmm, that looks like a hardware issue TBH. It's just rebooting with no output at all.
However, if it is, that would still happen in 24.03.