New 2.7.2CE Install on AMD/Realtek Hardware Intermittent kernel panic on boot but not while running/booted up.
-
@N8LBV The memory and CPU both test fine outside of pfSense.
-
@N8LBV This is pretty much a basic default install; I changed very little.
The panic first appeared after the 2nd or 3rd reboot.
Most of the time it boots fine.
When it does panic, it happens fairly early in the boot process.
Single 500GB NVMe SSD, ZFS, GPT/UEFI.
-
I had an appliance with a damaged Intel NIC port that would fail to boot 1 out of 10 times. I disabled the defective NIC with a loader hint:
- pkg install nano
- nano /boot/loader.conf.local
- add the line hint.igb.3.disabled=1
If you suspect it has something to do with the Realtek NIC, you could do something similar:
- disable the NIC in the BIOS
- add hint.re.0.disabled=1 to /boot/loader.conf.local
- or add it as a tunable under System -> Advanced -> System Tunables
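The loader.conf.local edit above can also be scripted. This is only a minimal sketch, not an official pfSense tool: add_hint is a hypothetical helper name, and re.0 is assumed to be the onboard Realtek port, so check the device names on your own box first.

```shell
#!/bin/sh
# add_hint FILE LINE
# Append LINE to FILE unless an identical line is already present,
# so running it twice does not duplicate the hint.
add_hint() {
    file="$1"
    line="$2"
    touch "$file"
    if ! grep -qxF -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (as root on the firewall; a reboot is needed for the hint to apply):
# add_hint /boot/loader.conf.local 'hint.re.0.disabled=1'
```

The example call is left commented out so that nothing under /boot is touched until you have confirmed the device name.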
-
@elvisimprsntr It does not appear to be the NIC, but I could pull the Intel server NIC out and disable the onboard NIC.
All 3 NICs are working perfectly when the system is booted up.
And they worked solidly for a week in Windows with heavy CPU and network loads.
I also have run memtest for hours and OCCT in Windows for hours on various memory and CPU tests.
I'm very confident the hardware is not failing in any way.
It appears to crash right after the hard disk driver changes hands from UEFI to the kernel.
The "hdac1: <AMD Raven HDA Controller>" line appears, then it fails.
-
@N8LBV said in New 2.7.2CE Install on AMD/Realtek Hardware Intermittent kernel panic on boot but not while running/booted up.:
AMD Raven HDA Controller
At the moment I'm chasing this lead: LINK
-
Are you sure that's a drive controller and not an HD audio device you can just disable?
-
@N8LBV Per the link above, it looks like this was fixed in a FreeBSD 14 pre-release, but somehow not in pfSense 2.7.2.
Disabling the onboard sound chip in the BIOS may have fixed this for me.
Being tested now.
-
@stephenw10 I was not sure of anything at the time I posted the image.
But yes, it is the audio controller, which I now have disabled for further testing, and I expect this to be the fix.
It looks like this issue was fixed in a FreeBSD 14 pre-release in 2023.
So I don't understand why what appears to be the same issue is back.
Or have we just discovered a new variant or very similar issue that has not been patched?
-
It looks like it wasn't committed until after 2.7.2 was branched:
https://cgit.freebsd.org/src/commit/?id=901d81c3e0f43cb0e4e10bb42ab9f0a71cfcda0a
It's in devel now: https://github.com/pfsense/FreeBSD-src/commit/015daf5221f7588b9258fe0242cee09bde39fe21
So it will be in the next release. It's in Plus 24.03.
But you should disable any audio devices in a pfSense install anyway. They can only cause problems!
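One way to confirm the audio controller is really out of the picture after disabling it in the BIOS is to look for hdac lines in the boot messages. A rough sketch, assuming the driver logs lines that start with hdacN: as in the panic screen earlier in the thread; hda_lines is a hypothetical helper name.

```shell
#!/bin/sh
# hda_lines: read boot messages on stdin and print only the lines
# logged by an HDA audio controller (hdac0:, hdac1:, ...).
# Prints nothing if no such controller attached.
hda_lines() {
    grep -E '^hdac[0-9]+:'
}

# Example (on the firewall, after a reboot):
# dmesg | hda_lines
```

If the example prints nothing, no HDA controller attached on this boot. By analogy with the NIC hints above, a loader hint such as hint.hdac.1.disabled=1 might also keep the driver from attaching even if the BIOS setting gets reset, but that is an untested assumption here.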
-
@stephenw10 Yeah agreed and this is a good fix for now.
At least I fully know what is going on now as well.
It was a bit sloppy of me to leave an audio interface enabled.
But I figured it didn't matter, as pfSense shouldn't load any drivers for the audio interface as far as I know.
This problem could come back if the BIOS is ever reset or defaulted for any reason, but at least I will know what to do in that rare event.
ASUS consumer motherboards like to mildly overclock some settings by default.
This occasionally results in a failed boot and a "hit F1" to load defaults message, which would re-enable the audio.
I'm not worried about it, and it will obviously get fixed in a future update.
I already knew what I was getting into building a box on a consumer AMD light gaming motherboard LOL.
Incidentally the Realtek NIC is doing great!
A little slower than the dual Intel server NIC (as expected) that is also installed.
And the Realtek is only going to be used for very rare management of the web interface and SSH.
I did put it through a series of long, multi-hour, high-bandwidth tests to see if I could make it fail,
and it did just fine before I took it out of the routing mix.
The plan was to disable it and install another NIC if it gave any problems.
The hit was minimal and could have been far worse!
Thanks!!