CE 2.7.1 upgrade failure on a Watchguard XTM 5
I have been running pfSense on a WatchGuard XTM 5 for about 5 years, with heavy packages like pfBlockerNG and Snort.
I have been through several major upgrades over the years and never had any kind of issue, and I usually do not uninstall packages when upgrading to a minor version.
I was on 2.7.0 and upgraded to 2.7.1, so I did not uninstall the packages. I mention it because it might be my mistake and the root cause of my issue.
See the upgrade log attached (pfsense_bug_upgrade.txt),
which ended with:
The process will require 2 MiB more space.
[1/1] Upgrading pfSense-kernel-pfSense from 2.7.0 to 2.7.1...
[1/1] Extracting pfSense-kernel-pfSense-2.7.1: .......... done
===> Keeping a copy of current kernel in /boot/kernel.old
>>> Removing unnecessary packages... done.
System is going to be upgraded. Rebooting in 10 seconds.
Success
I let it reboot for at least 20 minutes and it was still not pingable. Unfortunately I had no serial console cable with me, so I tried the only (desperate) thing I could do: power off and then power on. Same issue, obviously.
It took me 3 hours to finally find a console cable and examine what the problem was.
See the whole boot sequence attached (pfsense_bug_boot.txt).
This is what I saw at the end:
2023-11-25T15:12:15.366010+01:00 - init 428 - - getty repeating too quickly on port /dev/ttyv3, sleeping 30 secs
2023-11-25T15:12:15.374576+01:00 - init 430 - - getty repeating too quickly on port /dev/ttyv1, sleeping 30 secs
2023-11-25T15:12:15.375309+01:00 - init 433 - - getty repeating too quickly on port /dev/ttyv0, sleeping 30 secs
2023-11-25T15:12:15.374867+01:00 - init 432 - - getty repeating too quickly on port /dev/ttyv7, sleeping 30 secs
2023-11-25T15:12:15.374380+01:00 - init 431 - - getty repeating too quickly on port /dev/ttyv2, sleeping 30 secs
2023-11-25T15:12:15.444662+01:00 - init 436 - - getty repeating too quickly on port /dev/ttyu0, sleeping 30 secs
I googled "getty repeating too quickly on port", found some reports here, tried a few things that I don't remember, but nothing worked.
I tried booting from kernel.old: same issue.
At that point I had no more time to lose in restoring my server's connectivity, so I decided to reinstall completely and flash a fresh 2.7.0.
I preferred to reinstall 2.7.0 because at that moment I wasn't sure whether the issue was in 2.7.1 itself, such as some incompatibility with the XTM 5 hardware that Netgate had not detected.
After reinstalling 2.7.0 it booted successfully, and then I restored my configuration. I decided to give the 2.7.1 upgrade another try, but this time I uninstalled all packages before starting it.
And this time it succeeded.
I reinstalled my packages and had no issue there either.
It now runs smoothly on the latest version.
So I don't really have an issue anymore, but I am giving feedback here as it might be helpful.
Hmm, not seen that before but I'd guess it was ultimately filesystem damage from powering off a UFS install:
WARNING: / was not properly dismounted
WARNING: /: mount pending error: blocks 832 files 2172
@stephenw10 the router was not powered off after the installation. The script rebooted and that's all. I did hard shut it down (removed the power cable) after I realised that I had an issue (because I didn't have a console cable, as explained). Maybe the damage comes from that action, but then we have no explanation for the upgrade failure.
Yes, without the console output after the first reboot it's really hard to know exactly what happened. The getty output could be related but it could also just be from filesystem damage.
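For anyone triaging a similar saved console capture, here is a minimal Python sketch that looks for the two signatures discussed in this thread: the UFS "was not properly dismounted" warning and the repeated getty messages. The function name `triage` and both regular expressions are assumptions of mine, built only from the messages quoted above; this is not a pfSense tool, and a match does not prove which problem came first.

```python
import re
from collections import Counter

# Assumed patterns, derived only from the log lines quoted in this thread.
UFS_DIRTY = re.compile(r"WARNING: (\S+) was not properly dismounted")
GETTY_LOOP = re.compile(r"getty repeating too quickly on port ([^\s,]+)")


def triage(log_text):
    """Scan saved console output for the two symptoms seen here.

    Returns a tuple (dirty_mounts, getty_counts):
      dirty_mounts: sorted list of mount points flagged as not cleanly unmounted
      getty_counts: Counter mapping each tty port to its number of getty messages
    """
    dirty_mounts = sorted(set(UFS_DIRTY.findall(log_text)))
    getty_counts = Counter(GETTY_LOOP.findall(log_text))
    return dirty_mounts, getty_counts
```

To use it, read a saved capture (for example the text of pfsense_bug_boot.txt) into a string and pass it to `triage`. An unclean mount point in the first result would be consistent with the filesystem-damage theory, while getty messages on their own leave the cause open.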