Crashes while upgrading to 24.03 from the last stable

Dobby_

I would install 2.7.2 CE, upgrade to 23.09.01, and then to the latest
24.03 release from this morning (that is what I did).

Then install the packages you need and restore your backup
again.

In my eyes that would be faster than all the other "workaround" stuff!

stephenw10

Hmm, the backtrace there is not very helpful unfortunately:

db:1:pfs> bt
Tracing pid 12 tid 100013 td 0xfffff800016e4740
kdb_enter() at kdb_enter+0x33/frame 0xfffffe0010784da0
kbdmux_intr() at kbdmux_intr+0x3d/frame 0xfffffe0010784dc0
taskqueue_run_locked() at taskqueue_run_locked+0x182/frame 0xfffffe0010784e40
taskqueue_run() at taskqueue_run+0x68/frame 0xfffffe0010784e60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe0010784ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0010784f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0010784f30
--- trap 0xa5a5a5a5, rip = 0, rsp = 0, rbp = 0xa5a5a5a5a5a5a5a5 ---

But also there is no panic shown. It almost looks like it was manually interrupted:

<118>Checking Snort Subscriber rules md5 file... done.
<118>There is a new set of Snort Subscriber rules posted.
<118>Downloading snortrules-snapshot-29200.tar.gz...
KDB: enter: manual escape to debugger

Is it possible something was connected to it, interrupting it at that point?

rayrayrayraydog

Is it possible something was connected to it, interrupting it at that point?

I think that's possible. It was taking longer than usual to update and restart, as near as I could tell. When I started watching the console I was seeing a static screen that was not changing, and I pressed Enter on the keyboard after a couple of minutes. Could this have interrupted the upgrade?

stephenw10

It could, depending on where it was. But I still wouldn't expect it to reach the debugger like that.

rayrayrayraydog

@stephenw10 Would a hardware issue be able to cause a problem like this one? I built this server years ago and it is using an ancient OCZ SSD that could be about to fail at any time. I did move all logging to a remote syslog server to preserve it as much as possible.

stephenw10

I would expect a drive failure to be far more obvious than this. It looks more like a hardware interrupt has been triggered somehow.

Have you seen any further issues since the upgrade completed?

rayrayrayraydog

@stephenw10 I have not. It's been running just fine since yesterday.

stephenw10

Hmm, well it's odd but I wouldn't be too concerned since it did complete the upgrade and there was seemingly no panic.
If you see anything further we can look at any crash reports.

mikebenna

Hi @stephenw10, the problem can't be completely ignored... any time my router reboots, it has a decent chance of freezing on reboot (my guess is it freezes about 2/3rds of the time). Fortunately it's a VM so I don't need physical access, but still, a reliable reboot is pretty core to a reliable system, especially for remote access situations.

Mike

stephenw10

I agree but what you're seeing is a completely different problem. Which is why I forked it to a new thread.

stephenw10

After consulting with our devs here it seems this could in fact be caused by a bad or failing SSD. It's hitting whilst trying to run a checksum on the downloaded Snort ruleset.

So I would suggest that SSD has reached the end of its useful life!
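
For anyone wanting a rough sanity check of their own, the sketch below hashes the same file several times and reports whether the digest stays stable. It is only an illustration, not what pfSense or the Snort package actually runs: the function names file_md5 and stable_md5 are made up for this example, and repeated reads may be served from the OS cache rather than the disk, so a clean result does not prove the drive is healthy.

```shell
#!/bin/sh
# Hash a file with whichever MD5 tool is present
# (md5 on FreeBSD/pfSense, md5sum on most Linux systems).
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    else
        md5 -q "$1"
    fi
}

# Hypothetical helper: hash the file $2 times (default 3).
# Prints the digest if every round agrees, or MISMATCH if any round differs.
stable_md5() {
    file="$1"
    rounds="${2:-3}"
    first="$(file_md5 "$file")"
    [ -n "$first" ] || return 2          # file could not be hashed at all
    i=1
    while [ "$i" -lt "$rounds" ]; do
        if [ "$(file_md5 "$file")" != "$first" ]; then
            echo "MISMATCH"
            return 1
        fi
        i=$((i + 1))
    done
    echo "$first"
}
```

Usage would be something like: stable_md5 /path/to/snortrules-snapshot-29200.tar.gz 5 (the path is a placeholder; the file name is the one from the log above).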

rayrayrayraydog

@stephenw10 I'll have to bring the thing down and swap drives then, thanks!

rayrayrayraydog

@stephenw10 Just swapped the ancient OCZ drive with a much better one today. I just used dd to 1:1 copy data across and it's booting fine, but do I need to care about the "The backup GPT table is not on the end of the device" message? Obviously it's because the new drive is larger than the old one, but will this have any impact on pfSense?

stephenw10

No, not unless you have some boot issue that requires the use of the secondary table.

However you can try to use growfs to fill the disk if you wish. Run: touch /root/force_growfs then reboot and it should fill it during the next boot.
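
If you would rather clear the backup GPT message itself, FreeBSD's gpart has a recover verb that rewrites the table metadata, including the backup copy at the end of the disk. The following is a cautious sketch under assumptions: ada0 is only a guess at the device name, and the run helper and APPLY variable are invented here so that the script prints the commands by default instead of touching the disk.

```shell
#!/bin/sh
# Assumption: the new SSD shows up as ada0. Check the real name first.
DISK="${DISK:-ada0}"

# Hypothetical dry-run wrapper: print each command, and only execute it
# when APPLY=1 is set (running gpart for real needs root).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

run gpart show "$DISK"      # inspect the table; a misplaced backup header is typically flagged [CORRUPT]
run gpart recover "$DISK"   # rewrite the GPT metadata so the backup table sits at the end of the disk
run gpart show "$DISK"      # confirm the flag is gone
```

Run it once as-is to see what would be executed, then again with APPLY=1 once the device name is confirmed. Taking a config backup first is sensible, since this writes to the partition table.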