Help debugging a reboot problem on 23.01
-
I don't remember exactly when it started happening, or whether it was already present in the beta, but when I try to reboot the system via the GUI, the page just hangs loading after I confirm and select the boot environment, and then a 504 Gateway Time-out appears.
The log only shows the process starting, services stopping and then restarting, and then the timeout:
Apr 15 07:37:38 nginx 2023/04/15 07:37:38 [crit] 61860#100296: *1 SSL_write() failed (13: Permission denied) while processing HTTP/2 connection, client: 192.168.77.3, server: 0.0.0.0:443
Apr 15 07:37:38 nginx 2023/04/15 07:37:38 [error] 61860#100296: *1 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 192.168.77.3, server: , request: "POST /diag_reboot.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "192.168.77.1", referrer: "https://192.168.77.1/diag_reboot.php"
192.168.77.3 is the PC from which the reboot was initiated via the web GUI.
When I tried option 5 in the terminal, I got:
Enter an option: 5
Netgate pfSense Plus will reboot. This may take a few minutes, depending on your hardware.
Do you want to proceed?
Y/y: Reboot normally
R/r: Reroot (Stop processes, remount disks, re-run startup sequence)
S: Reboot into Single User Mode (requires console access!)
Enter: Abort
Enter an option: Y
Netgate pfSense Plus is rebooting now.
Stopping package nut...done.
Stopping package suricata...done.
Stopping /usr/local/etc/rc.d/dyndns.sh...done.
And it just gets stuck there.
When I select option 8, type reboot and hit Enter, the system reboots immediately without any problem.
How can I analyze what prevents the machine from rebooting?
-
Hmm. Nothing in the system log?
If you hit Ctrl+T at the console after using option 5, does it show a process waiting?
What packages do you have installed?
Steve
-
@stephenw10
I pressed Ctrl+T several times.
The process is rc.initial.reboot:
Apr 16 06:38:25 php-cgi 86649 rc.initial.reboot: Stopping all packages.
Packages installed:
I don't see anything unusual in the log from when the process starts:
Apr 16 07:50:09 php-cgi 58772 notify_monitor.php: Message sent to <>@gmail.com OK
Apr 16 07:50:08 root 64435 PPPoE connection does not have a valid IP address
Apr 16 07:50:08 upsmon 20436 Signal 15: exiting
Apr 16 07:50:08 php-cgi 56261 rc.initial.reboot: Stopping all packages.
I have disabled everything else that was spamming the logs: Suricata, pfBlockerNG...
Halt is also not working.
I don't know whether this is helpful or not:
Output from truss -fae -o /tmp/truss.txt php-cgi /etc/rc.initial.reboot
truss_1.txt
kdump kdump.txt
-
Hmm, I would have bet on Suricata but obviously not if it still does that with it disabled.
Do you have external notifications set up?
It looks like it might be trying and failing to send a notice that the firewall is rebooting.
-
@stephenw10
Do you mean System/Advanced/Notifications?
I have only E-Mail notifications set.
I disabled notifications and uninstalled Suricata.
Nothing changed.
I have a second firewall that reboots just fine. The config is similar...
-
Hmm. What hardware is this running on?
Though it seems like a software issue, since the native reboot works fine.
-
@stephenw10
I just started removing all the additional scripts running from /usr/local/etc/rc.d
and found that removing dyndns.sh does help.
I can now reboot the system without problems. The script was:
#!/usr/local/bin/bash
while true; do
    IP_ADDRESS=$(ifconfig pppoe0 | grep "inet " | awk '{print $2}')
    if [ -z "$IP_ADDRESS" ]; then
        # PPPoE connection does not have a valid IP address
        logger "PPPoE connection does not have a valid IP address"
    else
        # PPPoE connection has a valid IP address
        /etc/rc.dyndns.update
        logger "PPPoE connection has a valid IP address, force DYNDNS"
    fi
    sleep 3600
done
I don't really remember whether this is a manual script I installed some years ago or whether it is part of pfSense+, but it is the same on the secondary firewall and it just works there… I cannot explain what exactly triggers this reboot issue. My clean VM simply does not have any scripts… but… it's not PPPoE and there are no hosts configured…
OK. I changed it to:
#!/usr/local/bin/bash
case "$1" in
start)
    while true; do
        IP_ADDRESS=$(ifconfig pppoe0 | grep "inet " | awk '{print $2}')
        if [ -z "$IP_ADDRESS" ]; then
            # PPPoE connection does not have a valid IP address
            logger "PPPoE connection does not have a valid IP address"
        else
            # PPPoE connection has a valid IP address
            /etc/rc.dyndns.update
            logger "PPPoE connection has a valid IP address, force DYNDNS"
        fi
        sleep 3600
    done
    ;;
stop)
    exit 0
    ;;
esac
exit 0
And reboot works just fine... so it is possible that some time ago I created this problem myself, and it turned out to be so hard to debug. Anyway, thank you for trying to help me!
Edited:
Yes, it was definitely a manually added script, added because for some reason DynDNS was not being updated.
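A possible explanation, for anyone landing here later: the console output above shows pfSense calling /usr/local/etc/rc.d/dyndns.sh during shutdown, presumably with a stop argument, since handling stop is what fixed it. The original script ignored its arguments, so that call ran the endless loop as well. Note that the revised script still runs the loop in the foreground on start, and stop does not end a loop that is already running. Below is a minimal sketch of one way to handle both, assuming a POSIX sh. The PID file path, the function names, and the INTERVAL and IFACE variables are made up for illustration and are not part of pfSense.

```shell
#!/bin/sh
# Hypothetical sketch: PIDFILE, INTERVAL, IFACE and all function names are assumptions.
PIDFILE="${PIDFILE:-/var/run/dyndns_loop.pid}"
INTERVAL="${INTERVAL:-3600}"
IFACE="${IFACE:-pppoe0}"

# One check: same logic as the loop body in the script from this thread.
check_once() {
    ip=$(ifconfig "$IFACE" 2>/dev/null | awk '/inet /{print $2; exit}')
    if [ -z "$ip" ]; then
        logger "PPPoE connection does not have a valid IP address"
    else
        /etc/rc.dyndns.update
        logger "PPPoE connection has a valid IP address, force DYNDNS"
    fi
}

# Endless loop; on SIGTERM it kills the pending sleep and exits cleanly.
run_loop() {
    sleep_pid=""
    trap '[ -n "$sleep_pid" ] && kill "$sleep_pid" 2>/dev/null; exit 0' TERM
    while true; do
        check_once
        sleep "$INTERVAL" &
        sleep_pid=$!
        wait "$sleep_pid"
    done
}

dyndns_ctl() {
    case "$1" in
    start)
        # Run the loop in the background and remember its PID.
        run_loop &
        echo $! > "$PIDFILE"
        ;;
    stop)
        # Signal the recorded loop (if any) and return immediately.
        if [ -f "$PIDFILE" ]; then
            kill "$(cat "$PIDFILE")" 2>/dev/null || true
            rm -f "$PIDFILE"
        fi
        ;;
    *)
        echo "usage: $0 {start|stop}" >&2
        return 1
        ;;
    esac
}

# Dispatch only when an argument was given.
if [ $# -gt 0 ]; then
    dyndns_ctl "$1"
fi
```

Invoked with start, this returns immediately and leaves the loop running in the background; invoked with stop, it signals that loop, removes the PID file, and returns, so a shutdown sequence waiting on the script should not block. I have not tested this on pfSense itself.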