Fresh install of 2.7 on new hardware then restoring settings from my old hardware running 2.5.2. PHP errors, system crashes and reboots, very unstable
-
I just purchased some new hardware with plans to make it my new dedicated pfsense box.
I have an install on 2.5.2 that has been running great for a long time.
I made a backup of the config on my 2.5.2 install, but it includes the packages.
I started with a fresh install of 2.7 on the new hardware. From there I restored the backup and started to get frequent PHP errors. I did some research and it seems that could be because I didn't remove the packages before the restore. I manually went into the XML and removed the packages, then did another fresh install of 2.7 and restored the edited config.
Yesterday I noticed I was still getting PHP errors. I also had a major crash and reboot. When I looked at the logs, it appeared that even though I had removed the packages from the XML, the machine was still reaching out through the pfBlockerNG package to update my lists. That caused the whole system to crash.
Unfortunately I'm really struggling to get this new hardware to stay as stable as my old gear. Is there a better way to do this without starting all over? I don't want to have to redo all my VLANs, static IPs and traffic rules.
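For anyone wanting to repeat the "remove the packages from the XML" step without hand-editing, here is a minimal Python sketch. It assumes the package data lives in a top-level `<installedpackages>` element under the `<pfsense>` root of the backup; the sample config below is made up for illustration and is not a real backup. It only drops that one section, so anything elsewhere in the config that still refers to package data is left untouched.

```python
import xml.etree.ElementTree as ET


def strip_installed_packages(xml_text: str) -> str:
    """Return the config XML with top-level <installedpackages> sections removed.

    Assumption: package settings sit in <installedpackages> directly under
    the root element. Everything else (VLANs, rules, aliases) is kept as-is.
    """
    root = ET.fromstring(xml_text)
    for section in root.findall("installedpackages"):
        root.remove(section)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened config used only to exercise the function.
sample = """<pfsense>
  <vlans><vlan><tag>10</tag></vlan></vlans>
  <installedpackages><package><name>pfBlockerNG</name></package></installedpackages>
</pfsense>"""

cleaned = strip_installed_packages(sample)
print(cleaned)
```

Work on a copy of the backup file and keep the original, so the unmodified config can still be restored.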
-
Removing packages should not be required in that scenario. What errors were you seeing exactly?
Restoring your old config should be fine. If the interfaces are different you will need to re-assign them of course.
-
@stephenw10
Yes, understood. My interfaces are all set correctly, VLANs are working, and traffic is flowing normally. Unfortunately I didn't download all the logs/errors, but here are a few:
PHP Fatal error: Uncaught DivisionByZeroError: Division by zero in /usr/local/www/includes/functions.inc.php:219
Stack trace:
#0 /usr/local/www/includes/functions.inc.php(33): mem_usage()
#1 /usr/local/www/getstats.php(40): get_stats(Array)
#2 {main}
thrown in /usr/local/www/includes/functions.inc.php on line 219
Dump header from device: /dev/nvd0p3
Architecture: amd64
Architecture Version: 4
Dump Length: 238080
Blocksize: 512
Compression: none
Dumptime: 2023-09-21 17:06:19 -0500
Hostname: pfSense.localdomain
Magic: FreeBSD Text Dump
Version String: FreeBSD 14.0-CURRENT #1 plus-RELENG_23_05_1-n256108-459fc493a87: Wed Jun 28 04:26:04 UTC 2023
root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_05_1-main/obj/amd64/f2Em2w3l/v
Panic String: page fault
Dump Parity: 1257990990
Bounds: 0
Dump Status: good
-
If you have that crash report you can upload the file dump here: https://nc.netgate.com/nextcloud/index.php/s/oHkedWmPJQFWTRw
-
@stephenw10 I've uploaded a textdump file
-
This looks like a filesystem error:
<118>Bootup complete
UFS /dev/ufsid/65148cf762e575fb (/) cylinder checksum failed: cg 207, cgp: 0xb01051 != bp: 0xae7eac13
UFS /dev/ufsid/65148cf762e575fb (/) cylinder checksum failed: cg 207, cgp: 0x4ade4a9d != bp: 0xae7eac13
/: inode 15624960: check-hash failed
/: inode 15624960: check-hash failed
/: inode 15624960: check-hash failed
UFS /dev/ufsid/65148cf762e575fb (/) cg 152: bad magic number 0x0 should be 0x90255
panic: softdep_deallocate_dependencies: dangling deps
cpuid = 0
time = 1695846014
KDB: enter: panic
That could be a bad drive. But the first thing to do is just run a few FSCK loops:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html
Steve
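When there are several crash dumps to go through, a quick way to see whether they all point at the filesystem is to count the UFS messages in each one. The Python sketch below does that with plain substring counts. The signature strings are taken from the excerpt quoted above; the function name and the idea of treating these four strings as "filesystem signatures" are assumptions for illustration, not an official list.

```python
# Substrings copied from the crash excerpt quoted in this thread.
UFS_SIGNATURES = (
    "cylinder checksum failed",
    "check-hash failed",
    "bad magic number",
    "panic: softdep_",
)


def count_ufs_signatures(dump_text: str) -> dict:
    """Count how often each filesystem-related message appears in a dump."""
    return {sig: dump_text.count(sig) for sig in UFS_SIGNATURES}


# The message buffer lines from the uploaded textdump, as quoted above.
sample = """<118>Bootup complete
UFS /dev/ufsid/65148cf762e575fb (/) cylinder checksum failed: cg 207, cgp: 0xb01051 != bp: 0xae7eac13
UFS /dev/ufsid/65148cf762e575fb (/) cylinder checksum failed: cg 207, cgp: 0x4ade4a9d != bp: 0xae7eac13
/: inode 15624960: check-hash failed
/: inode 15624960: check-hash failed
/: inode 15624960: check-hash failed
UFS /dev/ufsid/65148cf762e575fb (/) cg 152: bad magic number 0x0 should be 0x90255
panic: softdep_deallocate_dependencies: dangling deps
"""

counts = count_ufs_signatures(sample)
for signature, hits in counts.items():
    print(f"{hits:3d}  {signature}")
```

A dump with zero hits across the board suggests the crash has a different cause and is worth reading separately.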
-
@stephenw10 I ran a few fsck loops and the filesystem keeps being reported as marked dirty.
I also ran a reboot from the console (option 5) with F, and that seems to come back clean, but when I run fsck after that I get errors such as bad magic number and unexpected soft update inconsistency.
I also booted into a live instance of Ubuntu and ran fsck there, as well as the file system check from the disk utility, and it checked out as OK.
-
It could potentially be a bad drive then. But make sure you run fsck at least 5 times from single user mode in pfSense to be sure.
-
@stephenw10 Yes I ran it 5 times, a couple of times now LOL.
Seems like the results are very inconsistent.
Does a dirty drive mean it's bad or has errors?
-
If the filesystem cannot be cleaned or keeps becoming corrupt then yes, that could be a bad drive. But I have seen the same thing on older hardware with simply a loose connector.
-
So I purchased a new NVMe drive to test that out. I'm getting immediate errors again, even with a fresh install of 2.7 followed by a restore with no installed packages.
I've added new dump files to the Nextcloud share.
I also noticed this in the logs and finally captured it.
As I stated, I removed the packages from the XML, but this shows up as though the system is trying to pull down info as part of pfBlocker:
Oct 13 15:06:19 php-fpm 86183 /rc.update_urltables: : ERROR: could not update pfB_PRI1_v4 content from https://127.0.0.1:443/pfblockerng/pfblockerng.php?pfb=pfB_PRI1_v4
-
That pfBlocker error can sometimes happen at boot before the lists have been fetched. It's not normally an issue.
Those crash reports all show filesystem issues still. It could be an issue with the board firmware. Is it running the latest BIOS?
Do you see those errors before you restore the old config?
Steve
-
No errors on the old machine at all.
I've purchased another machine to try/test and just did a fresh install and system restore.
Just got this error on the new-new hardware. Looks like it's similar to the other machine
Crash report begins. Anonymous machine information:
amd64
14.0-CURRENT
FreeBSD 14.0-CURRENT #1 RELENG_2_7_0-n255866-686c8d3c1f0: Wed Jun 28 04:21:19 UTC 2023 root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/obj/amd64/LwYAddCr/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/sources/FreeBSD-src-REL
Crash report details:
PHP Errors:
[13-Oct-2023 16:37:00 America/Chicago] PHP Fatal error: Uncaught TypeError: Unsupported operand types: string / int in /etc/inc/util.inc:2409
Stack trace:
#0 /etc/inc/pfsense-utils.inc(1902): get_memory()
#1 /usr/local/www/includes/functions.inc.php(104): pfsense_default_state_size()
#2 /usr/local/www/includes/functions.inc.php(35): get_pfstate()
#3 /usr/local/www/getstats.php(40): get_stats(Array)
#4 {main}
thrown in /etc/inc/util.inc on line 2409
No FreeBSD crash data found.
-
That's this: https://redmine.pfsense.org/issues/14648
That has nothing to do with the drive. It's usually a one-time error if you see it.
-
@stephenw10
Newest one:
Crash report begins. Anonymous machine information:
amd64
14.0-CURRENT
FreeBSD 14.0-CURRENT #1 RELENG_2_7_0-n255866-686c8d3c1f0: Wed Jun 28 04:21:19 UTC 2023 root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/obj/amd64/LwYAddCr/var/jenkins/workspace/pfSense-CE-snapshots-2_7_0-main/sources/FreeBSD-src-REL
Crash report details:
PHP Errors:
[13-Oct-2023 17:11:36 America/Chicago] PHP Fatal error: Uncaught DivisionByZeroError: Division by zero in /usr/local/www/includes/functions.inc.php:212
Stack trace:
#0 /usr/local/www/includes/functions.inc.php(33): mem_usage()
#1 /usr/local/www/getstats.php(40): get_stats(Array)
#2 {main}
thrown in /usr/local/www/includes/functions.inc.php on line 212
No FreeBSD crash data found.
-
@mc866 said in Fresh install of 2.7 on new hardware then restoring settings from my old hardware running 2.5.2. PHP errors, system crashes and reboots, very unstable:
functions.inc.php:212
That's the same bug. Did you try the patch on the bug report?
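Both PHP errors quoted in this thread come from arithmetic on memory values ("Division by zero" in mem_usage() and "string / int" in get_memory()). One way errors like these can arise is when a memory query returns an empty string or zero; that is an assumption here, not a reading of the pfSense source. The Python sketch below shows the general defensive pattern of validating the inputs before dividing. It is not the patch from the bug report, and the function and parameter names are hypothetical.

```python
def mem_usage_percent(total_pages, free_pages):
    """Return used memory as a whole-number percentage.

    Returns None instead of raising when the inputs are empty,
    non-numeric, or when the total is zero or negative.
    """
    try:
        total = int(total_pages)
        free = int(free_pages)
    except (TypeError, ValueError):
        # Covers None and strings such as "" that cannot be parsed.
        return None
    if total <= 0:
        # Guard against the division-by-zero case.
        return None
    return round(100 * (total - free) / total)


print(mem_usage_percent(1000, 250))   # normal case
print(mem_usage_percent(0, 0))        # zero total
print(mem_usage_percent("", 10))      # empty string from a failed query
```

Returning None lets the caller decide what to display when the reading is unavailable, rather than aborting the whole page.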
-
You can use the System Patches package; just enter the commit ID as the patch.
https://docs.netgate.com/pfsense/en/latest/development/system-patches.html
-
Awesome thanks!