Cannot boot kernel with SMP enabled
-
Hi guys,
I can't boot the kernel (pfSense 2.4.5-p1) with SMP enabled on the following hardware:
- Zotac H67 wifi
- Intel i5 2500k
It seems to be an issue specifically isolated to FreeBSD and its handling of SMP in relation to my motherboard. I've updated the BIOS to the latest revision. If the number of Active Cores exceeds 1, the kernel will not boot.
panic: AP# 1 (PHY# 2) failed!
or
panic: AP#2 (PHY# 4) failed!
However, with only 1 core active pfSense will install & operate as intended with zero problems.
I've tried almost literally every BIOS option I can think of, one by one. Disabling ACPI does not resolve the issue:
set hint.acpi.0.disabled=1
With that set, the kernel states:
local APIC required
Booting from UEFI mem-stick yields even less joy:
In an effort to troubleshoot, I've been trying to get standard FreeBSD 12.x to boot with all four cores active, but to no avail. DragonFly BSD, however, boots perfectly with all cores enabled. I note they have done extensive rewriting of FreeBSD code with SMP in mind, and it seems to work beautifully. I wonder if pfSense will one day shift to DragonFly BSD?
With that in mind though, I tracked down the following article which suggests that interrupts could be the issue:
https://neilpa.me/2019-01-02-freebsd-uhk-panic/
I believe this could be a potential root cause of my problem too. I am not experienced enough to know how to compile pfSense from the GitHub repo with my own patch applied to the FreeBSD source. Is that a viable option? I'd certainly like to try, if anyone is able to guide me through the process and suggest appropriate software to compile with, preferably on Windows or Linux (I have more powerful machines running these), or on FreeBSD or similar on the machine itself if required. I know pfSense itself does not include compiler tools, for good reason.
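For reference, here is a rough sketch of what building a patched kernel looks like on stock FreeBSD (not pfSense, whose build process is a separate matter). It assumes a FreeBSD machine with the matching source tree already in /usr/src. The patch path, the kernel.test directory name and the DRY_RUN switch are placeholders of mine, not anything from the linked article. Installing under a separate KODIR leaves the stock /boot/kernel untouched, and nextboot selects the test kernel for the next boot only.

```shell
#!/bin/sh
# Sketch only: build a patched FreeBSD kernel and install it beside the stock one.
# All paths below are assumptions; adjust them to your own system.
SRC="${SRC:-/usr/src}"                  # FreeBSD source tree matching the installed release
PATCH="${PATCH:-/root/smp-fix.patch}"   # hypothetical patch file
KERNCONF="${KERNCONF:-GENERIC}"         # kernel configuration to build
KODIR="${KODIR:-/boot/kernel.test}"     # install here instead of /boot/kernel

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_patched_kernel() {
    run cd "$SRC" || return 1
    # -p1 assumes a git-style diff with a/ and b/ path prefixes.
    run patch -p1 -i "$PATCH" || return 1
    run make -j4 buildkernel KERNCONF="$KERNCONF" || return 1
    run make installkernel KERNCONF="$KERNCONF" KODIR="$KODIR" || return 1
    # Boot the test kernel once; later boots fall back to the default kernel.
    run nextboot -k "$(basename "$KODIR")"
}
```

Running DRY_RUN=1 build_patched_kernel first prints the commands without touching anything; calling build_patched_kernel as root then performs the build.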
Running with 1 core is not acceptable to me as it's a huge waste of a still capable chip. I'm tied to the hardware in large part because it's something I already have, and it is of a reasonable spec. My connection is gigabit and I intend to run WireGuard and lots of other things, so I'd like it to have some poke rather than swapping to other cheaper hardware.
I've invested many, many hours into researching the issue myself, so I would really appreciate any guidance that anyone can offer.
Edit:
Currently downloading the latest daily build to see if there are any changes there, but I think it's unlikely.
-
@tomlawesome
Try with pfSense 2.5.0
-
Thanks, haha. Just edited to say that's what I'm about to do!
-
Regrettably, 2.5.0 shares the same fate, as I expected:
panic: AP# 1 (PHY# 2) failed!
I think my only hope is a patched pfSense (FreeBSD) kernel and self-compiling.
-
Is this a "FreeBSD used by pfSense" issue or a FreeBSD issue?
I mean, do any of these work on your hardware?
-
@gertjan None of the older builds will boot, but I had missed the FreeBSD 13 development build, so will give that a shot now.
-
@tomlawesome Regrettably, no dice with FreeBSD 13 either.
Also, re-posting original links somewhere they won't auto-delete after a few days so that they're there for the future:
1
2
-
SMT enabled in that CPU could be a security risk, I'm not sure if this problem applies to routers/firewalls, as they are only passing traffic..
MDS - Microarchitectural Data Sampling
TAA - Transactional Asynchronous Abort
Reference:
https://www.intel.com/content/www/us/en/architecture-and-technology/mds.html
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/
So, I would take a deep look into that, to confirm if the security risk really applies to pfsense, before enabling SMT.
Note that MDS and TAA cannot be patched, only way to fix is disabling SMT.
I spent some days reading about this last year, I'm not an expert ok? I could be wrong about this.
-
@mcury said in Cannot boot kernel with SMP enabled:
SMT enabled in that CPU could be a security risk, I'm not sure if this problem applies to routers/firewalls, as they are only passing traffic..
MDS - Microarchitectural Data Sampling
TAA - Transactional Asynchronous Abort
Reference:
https://www.intel.com/content/www/us/en/architecture-and-technology/mds.html
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/
So, I would take a deep look into that, to confirm if the security risk really applies to pfsense, before enabling SMT.
Note that MDS and TAA cannot be patched, only way to fix is disabling SMT.
I spent some days reading about this last year, I'm not an expert ok? I could be wrong about this.
That looks different to Symmetric MultiProcessing - but I'm no expert either!
-
@tomlawesome Oh, SMP, somehow I read SMT in your topic, ehhe, don't know how that happened..
SMT would be Simultaneous multithreading
-
@mcury said in Cannot boot kernel with SMP enabled:
@tomlawesome Oh, SMP, somehow I read SMT in your topic, ehhe, don't know how that happened..
SMT would be Simultaneous multithreading
Easy mistake to make! No harm done, and I learned something in the process!