Crashes while upgrading to 24.03 from the last stable
-
Hmm, the backtrace there is not very helpful unfortunately:
db:1:pfs> bt
Tracing pid 12 tid 100013 td 0xfffff800016e4740
kdb_enter() at kdb_enter+0x33/frame 0xfffffe0010784da0
kbdmux_intr() at kbdmux_intr+0x3d/frame 0xfffffe0010784dc0
taskqueue_run_locked() at taskqueue_run_locked+0x182/frame 0xfffffe0010784e40
taskqueue_run() at taskqueue_run+0x68/frame 0xfffffe0010784e60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe0010784ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0010784f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0010784f30
--- trap 0xa5a5a5a5, rip = 0, rsp = 0, rbp = 0xa5a5a5a5a5a5a5a5 ---
But also there is no panic shown. It almost looks like it was manually interrupted:
<118>Checking Snort Subscriber rules md5 file... done.
<118>There is a new set of Snort Subscriber rules posted.
<118>Downloading snortrules-snapshot-29200.tar.gz...
KDB: enter: manual escape to debugger
Is it possible that something was connected to it and interrupted it at that point?
-
Is it possible that something was connected to it and interrupted it at that point?
I think that's possible. It was taking longer than usual to update and restart, as near as I could tell. When I started watching the console I was seeing a static screen that was not changing, and I pressed Enter on the keyboard after a couple of minutes. Could this have interrupted the upgrade?
-
It could, depending on where it was. But I still wouldn't expect it to reach the debugger like that.
-
@stephenw10 Would a hardware issue be able to cause a problem like this one? I built this server years ago and it is using an ancient OCZ SSD that could be about to fail at any time. I did move all logging to a remote syslog server to preserve it as much as possible.
-
I would expect a drive failure to be far more obvious than this. It looks more like a hardware interrupt has been triggered somehow.
Have you seen any further issues since the upgrade completed?
-
@stephenw10 I have not. It's been running just fine since yesterday.
-
Hmm, well it's odd but I wouldn't be too concerned since it did complete the upgrade and there was seemingly no panic.
If you see anything further we can look at any crash reports.
-
Hi @stephenw10, the problem can't be completely ignored... any time my router reboots, it has a decent chance of freezing on reboot (my guess is it freezes about 2/3rds of the time). Fortunately it's a VM so I don't need physical access, but still, a reliable reboot is pretty core to a reliable system, especially for remote access situations.
Mike
-
I agree, but what you're seeing is a completely different problem, which is why I forked it to a new thread.
-
After consulting with our devs here it seems this could in fact be caused by a bad or failing SSD. It's hitting whilst trying to run a checksum on the downloaded Snort ruleset.
So I would suggest that SSD has reached the end of its useful life!
-
@stephenw10 I'll have to bring the thing down and swap drives then, thanks!
-
@stephenw10 Just swapped the ancient OCZ drive with a much better one today. I just used dd to 1:1 copy data across and it's booting fine, but do I need to care about the "The backup GPT table is not on the end of the device" message? Obviously it's because the new drive is larger than the old one, but will this have any impact on pfSense?
-
No, not unless you have some boot issue that requires the use of the secondary table.
However, you can use growfs to expand the filesystem to fill the disk if you wish. Run:
touch /root/force_growfs
then reboot and it should grow the filesystem during the next boot.
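On the backup GPT message itself: FreeBSD's gpart has a recover verb that rewrites the secondary table at the real end of the disk, which is the usual fix after a dd copy onto a larger drive. A rough sketch follows, assuming the new SSD shows up as ada0 (that device name is an assumption, so check the output of gpart show first). relocate_backup_gpt is a made-up wrapper name, used so that nothing runs until you call it.

```shell
#!/bin/sh
# Hypothetical helper: show the partition table, then ask gpart to
# repair it, which relocates the backup GPT to the end of the disk.
relocate_backup_gpt() {
    disk="$1"
    # gpart is part of the FreeBSD base system; bail out elsewhere.
    command -v gpart >/dev/null 2>&1 || {
        echo "gpart not found (not a FreeBSD system?)" >&2
        return 1
    }
    # Stop here if the disk name is wrong or the table cannot be read.
    gpart show "$disk" || return 1
    gpart recover "$disk"
}
```

You would run it as root with the disk name as the only argument, for example relocate_backup_gpt ada0 if that is what your drive is called.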