24.03 install failed in 1 out of 3
-
Hello again, coming back to provide an update on this situation:
So far I have been totally unable to upgrade these pfSense VMs to the latest 24.03.
After @stephenw10 kindly reviewed the NDIs, the update branches appeared again.
So I made two different backups: I took a VM snapshot and also backed up the configuration.
The first attempt was an upgrade of the existing pfSense VM to 24.03, following the normal update process.
Failed:
The same error as before.
Then I moved on to making a clean install. I first tried netgate-installer-v1.0-RC-amd64-20240919-1435.iso, an image downloaded from Netgate.
This installer first asks you to select the WAN interface, then its configuration, then the LAN interface and its configuration.
So I configured WAN as static, with all settings correct and the upstream gateway set.
From here onwards it always fails.
Since the only odd setting here is "use local resolver", I tried different configurations:
use local resolver: true
This option shows the first issue: after selecting "local resolver: true" you are forced to add an IP address as the resolver address. I added 127.0.0.1; it failed.
use local resolver: false
I did two tests with this setting: in the first I configured my hosting provider's name resolver, in the second I configured Cloudflare's DNS.
In all of these attempts, the result is always one and the same:
However, if I exit the installer and go to the command prompt, the interface is working correctly: I am perfectly able to resolve addresses, and I am able to ping external addresses.
As you can see in the image below, I ping google.com and it resolves, and I am also able to reach ews.netgate.com, a test requested by @stephenw10.
(I am thinking that the issue here is that the interface is getting a 1500 MTU when it should be 1400. But the installer does not provide a method to set the MTU, as far as I am aware. I could set the MTU manually from the command prompt, but I don't know how to go back to the installer from there.)
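One way to check the MTU theory from the command prompt is a ping with the Don't Fragment bit set. The payload has to be sized so the whole packet equals the MTU under test: the MTU minus 20 bytes of IPv4 header (no options) and 8 bytes of ICMP header. The sketch below only does that arithmetic and prints a command line; it assumes FreeBSD's ping, where -D sets Don't Fragment and -s sets the payload size, and the helper names are made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo")
    return payload


def df_ping_command(host: str, mtu: int) -> str:
    # Assumption: FreeBSD ping flags (-D = Don't Fragment, -s = payload size,
    # -c = packet count). Other systems use different flags.
    return f"ping -D -c 3 -s {icmp_payload_for_mtu(mtu)} {host}"


if __name__ == "__main__":
    for mtu in (1400, 1500):
        print(mtu, "->", df_ping_command("google.com", mtu))
```

If the 1372-byte payload (MTU 1400) gets replies and the 1472-byte payload (MTU 1500) does not, that would support the idea that the path only carries 1400-byte packets. It would not by itself prove that the MTU is what breaks the installer.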
Ok so in the meanwhile @stephenw10 kindly suggested to test with the 2.7.2 install. So I downloaded the 2.7.2 ISO and installed CE.
After install I restored the config and the upgrade from 2.7.2 to 23.09.1 is offered.
I upgrade from 2.7.2 to 23.09.1 and it goes flawlessly.
After being on 23.09.1 I am offered the upgrade to 24.03. So I do, and the result:
And I'm back to square one.
(In the meanwhile my deepest thanks to @stephenw10 for putting up with me and providing all the help he could)
-
Duplicating messages:
Are you able to get a crash report after that panic?
Are you using xn NICs?
-
@stephenw10
No, unfortunately, from what I was able to see I am not able to get a crash report after the panic.
The NICs show as xn, yes.
# lspci | grep -E -i --color 'network|ethernet'
23:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
-
Here's the backtrace:
db> bt
Tracing pid 0 tid 100000 td 0xffffffff8303de40
kdb_enter() at kdb_enter+0x33/frame 0xffffffff83f0c890
panic() at panic+0x43/frame 0xffffffff83f0c8f0
trap_fatal() at trap_fatal+0x40f/frame 0xffffffff83f0c950
trap_pfault() at trap_pfault+0x4f/frame 0xffffffff83f0c9b0
calltrap() at calltrap+0x8/frame 0xffffffff83f0c9b0
--- trap 0xc, rip = 0xffffffff8128c005, rsp = 0xffffffff83f0ca88, rbp = 0xffffffff83f0cad0 ---
xen_start32() at xen_start32+0x5/frame 0xffffffff83f0cad0
xenpci_attach() at xenpci_attach+0x207/frame 0xffffffff83f0cb10
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cb60
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cb90
pci_attach() at pci_attach+0xcb/frame 0xffffffff83f0cbd0
acpi_pci_attach() at acpi_pci_attach+0x17/frame 0xffffffff83f0cc10
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cc60
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cc90
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x42f/frame 0xffffffff83f0ccf0
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cd40
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cd70
acpi_probe_children() at acpi_probe_children+0x237/frame 0xffffffff83f0cdd0
acpi_attach() at acpi_attach+0x972/frame 0xffffffff83f0ce60
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0ceb0
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cee0
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cf30
bus_generic_new_pass() at bus_generic_new_pass+0x127/frame 0xffffffff83f0cf60
root_bus_configure() at root_bus_configure+0x36/frame 0xffffffff83f0cf90
configure() at configure+0x9/frame 0xffffffff83f0cfa0
mi_startup() at mi_startup+0x1c8/frame 0xffffffff83f0cff0
db>
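In a backtrace like this the frames are listed innermost first, so the frames printed right after the "--- trap" marker are the function that faulted and its caller. As a rough sketch (the regular expressions and the function name are assumptions written for this exact output format, not part of any FreeBSD tool), those frames can be pulled out like this:

```python
import re

# Matches one frame, e.g. "xenpci_attach() at xenpci_attach+0x207/frame 0x..."
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame 0x[0-9a-f]+")
# Matches the trap marker, e.g. "--- trap 0xc, rip = 0x..., rsp = ... ---"
TRAP_RE = re.compile(r"--- trap (0x[0-9a-f]+), rip = (0x[0-9a-f]+).*? ---")


def faulting_frames(backtrace: str, count: int = 2):
    """Return (trap number, first `count` frames listed after the trap marker)."""
    trap = TRAP_RE.search(backtrace)
    if trap is None:
        return None, []
    frames = FRAME_RE.findall(backtrace[trap.end():])
    return int(trap.group(1), 16), frames[:count]


# Excerpt copied from the backtrace above.
SAMPLE = (
    "calltrap() at calltrap+0x8/frame 0xffffffff83f0c9b0 "
    "--- trap 0xc, rip = 0xffffffff8128c005, rsp = 0xffffffff83f0ca88, "
    "rbp = 0xffffffff83f0cad0 --- "
    "xen_start32() at xen_start32+0x5/frame 0xffffffff83f0cad0 "
    "xenpci_attach() at xenpci_attach+0x207/frame 0xffffffff83f0cb10 "
    "device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cb60"
)

if __name__ == "__main__":
    trap_number, frames = faulting_frames(SAMPLE)
    print("trap", trap_number)
    for name, offset in frames:
        print(f"  {name}+{offset}")
```

On the excerpt this reports trap 12 with xen_start32 as the faulting function, called from xenpci_attach.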
-
Aha, this is good. So it's something Xen specific by the looks of it. Let's see...
-
Are you able to get the console output leading up to the panic so we can see what was attaching?
I note that 24.03 is built on FreeBSD 15 and 23.09.X is FreeBSD 14, so there could be an incompatibility there. What version of Xen (or XCP) are you using?
-
@stephenw10 hi
I can try to either do a screen recording or halt the VM prior to boot, plug the console and get the output.
I'll get back to this.
Now I have a small question about this FreeBSD versioning:
FreeBSD 14.1 was released in June 2024.
The official FreeBSD 15 release schedule points to a release in December 2025.
How exactly are we already on FreeBSD 15 here? How ready for production is it?
The Xen version is the latest:
# cat /etc/os-release
NAME="XCP-ng"
VERSION="8.2.1"
ID="xenenterprise"
ID_LIKE="centos rhel fedora"
VERSION_ID="8.2.1"
PRETTY_NAME="XCP-ng 8.2.1"
ANSI_COLOR="0;31"
HOME_URL="http://xcp-ng.org/"
BUG_REPORT_URL="https://github.com/xcp-ng/xcp"
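Note that /etc/os-release reports the XCP-ng release, which is a separate number from the version of the Xen hypervisor itself. For anyone collecting this from several hosts, a minimal sketch of reading the fields out of that output follows; the function name is made up, and it assumes the simple KEY="value" form shown above.

```python
import shlex


def parse_os_release(text: str) -> dict:
    """Parse KEY="value" pairs as found in /etc/os-release into a dict."""
    fields = {}
    # shlex strips the quotes and keeps quoted values with spaces together.
    for token in shlex.split(text):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Fields copied from the output above.
SAMPLE = (
    'NAME="XCP-ng" VERSION="8.2.1" ID="xenenterprise" '
    'ID_LIKE="centos rhel fedora" VERSION_ID="8.2.1" '
    'PRETTY_NAME="XCP-ng 8.2.1"'
)

if __name__ == "__main__":
    info = parse_os_release(SAMPLE)
    print(info["NAME"], info["VERSION_ID"])
```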
-
For our purposes, ready. I've run it up to our devs. Let's see what they say.
Unfortunately, as I say, I don't think any of them are running Xen/XCP any longer.
-
You're not using any special packages or modules for Xen I assume?
Like xe-guest-utilities?
-
@stephenw10 No, I am not; it's quite a plain install.
Mind if I ask why you were asking about the xn network interfaces? Do they have some known issues?
By default none of the VMs had guest utilities installed.
Yesterday, on my last attempt, I installed xe-guest-utilities to see if that would make some difference, but nothing.
-
I asked about xn because it's an unusual NIC type. There are default configs for some NIC types like em and igb, and none for xn, hn, virtio, etc. If you had the hypervisor configured to present e1000 NICs it might have behaved differently.
It appears to be an issue when trying to attach something Xen specific but it's not clear just from the backtrace what that is. It may be possible to simply disable it.
-
Alright, so I'll try to come back and present better output of what happens prior to the crash. I'm completely unable to do it right now, but I'll still try to do it today.
-
Thanks, that should help a lot.
-
Hello again,
I'm sorry but it was impossible for me to do this last week.
In the meanwhile I proceeded and captured the whole boot from the Boot Screen to the crash.
Autoboot in 0 seconds. [Space] to pause
Loading kernel...
/boot/kernel/kernel text=0x19eec0 text=0xff4c38 text=0x17e3db4 data=0x180 data=0x22d718+0x3d18e8 0x8+0x1cb0f0+0x8+0x1da290
Loading configured modules...
/boot/entropy size=0x1000
/boot/kernel/zfs.ko size 0x5ea9a0 at 0x35a7000
/boot/kernel/opensolaris.ko size 0x1e2f0 at 0x3b92000
/boot/kernel/cryptodev.ko size 0x7718 at 0x3bb1000
can't find '/etc/hostid'
staging 0x73600000-0x779e3000 (not copying) tramp 0x779e3000 PT4 0x779e4000
Start @ 0xffffffff8039f000 ...
EFI framebuffer information:
addr, size 0xf0000000, 0x240000
dimensions 1024 x 768
stride 1024
masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0x00000000
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2024 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024
root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBSD-src-plus-RELENG_24_03/amd64.amd64/sys/pfSense amd64
FreeBSD clang version 17.0.6 (https://github.com/llvm/llvm-project.git llvmorg-17.0.6-0-g6009708b4367)
VT(efifb): resolution 1024x768
Hyper-V Version: 0.0.0 [SP0]
Features=0x870<APIC,HYPERCALL,VPINDEX,TMFREQ>
PM Features=0x0 [C0]
Features3=0x8<PCPUDPE>
CPU: AMD Ryzen 5 3600 6-Core Processor (3593.36-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x870f10 Family=0x17 Model=0x71 Stepping=0
Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
Features2=0xfed83203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x40001f3<LAHF,CMP,CR8,ABM,SSE4A,MAS,Prefetch,DBE>
Structured Extended Features=0x219c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA>
Structured Extended Features2=0x400004<UMIP,RDPID>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x1005<CLZERO,XSaveErPtr,IBPB>
Hypervisor: Origin = "Microsoft Hv"
real memory = 2143289344 (2044 MB)
avail memory = 2012315648 (1919 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <Xen HVM>
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0: MADT APIC ID 1 != hw id 0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47
TCP_ratelimit: Is now initialized
ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff80750310, 0) error 1
ipw_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_ibss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_ibss_fw, 0xffffffff807503c0, 0) error 1
ipw_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_monitor: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_monitor_fw, 0xffffffff80750470, 0) error 1
iwi_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_bss: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_bss_fw, 0xffffffff80770010, 0) error 1
iwi_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_ibss: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_ibss_fw, 0xffffffff807700c0, 0) error 1
iwi_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_monitor: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_monitor_fw, 0xffffffff80770170, 0) error 1
random: entropy device external interface
wlan: mac acl policy registered
kbd1 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
netgate0: <unknown hardware>
smbios0: <System Management BIOS> at iomem 0x7f3cc000-0x7f3cc01e
smbios0: Version: 2.8, BCD Revision: 2.8
acpi0: <Xen>
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc1a0-0xc1af at device 1.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
uhci0: <Intel 82371SB (PIIX3) USB controller> port 0xc180-0xc19f irq 23 at device 1.2 on pci0
usbus0 on uhci0
pci0: <bridge> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3042000-0xf3042fff at device 2.0 on pci0
vgapci0: Boot video device
xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem 0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x2dee022
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8128c005
stack pointer = 0x28:0xffffffff83f0da88
frame pointer = 0x28:0xffffffff83f0dad0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
rdi: 0000000000000000 rsi: ffffffff83f0da98 rdx: 0000000000000009
rcx: 0000000000001800 r8: 0000000000000007 r9: 0000000000000002
rax: 0000000002dee022 rbx: fffff800016fc000 rbp: ffffffff83f0dad0
r10: 0000000000000000 r11: ffffffff83f0d8f4 r12: ffffffff82d5aee0
r13: fffff800017c0690 r14: fffff800016fc600 r15: 0000000000001800
trap number = 12
panic: page fault
cpuid = 0
time = 1
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at kdb_enter+0x33: movq $0,0x235af42(%rip)
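The useful part of a capture like this is the last device that started attaching before the fault. A small sketch of finding it automatically is below. The regular expression is an assumption based on the "name0: <description>" lines in this log and the function name is made up; it works whether the capture is one message per line or run together.

```python
import re

# Device announcements in this log look like "xenpci0: <Xen Platform Device> ..."
ATTACH_RE = re.compile(r"\b([a-z_]+\d+): <([^>]+)>")


def last_device_before_trap(log: str):
    """Return (device, description) of the last device announced before the trap."""
    trap_at = log.find("Fatal trap")
    if trap_at == -1:
        return None  # no panic in this capture
    matches = ATTACH_RE.findall(log[:trap_at])
    return matches[-1] if matches else None


# Excerpt copied from the capture above.
SAMPLE = (
    "pci0: <bridge> at device 1.3 (no driver attached) "
    "vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,"
    "0xf3042000-0xf3042fff at device 2.0 on pci0 "
    "vgapci0: Boot video device "
    "xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem "
    "0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0 "
    "Fatal trap 12: page fault while in kernel mode"
)

if __name__ == "__main__":
    print(last_device_before_trap(SAMPLE))
```

On the excerpt this returns xenpci0 (Xen Platform Device), which lines up with the xenpci_attach frame in the earlier backtrace.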
Hope this helps.
-
Ah, yup hopefully that will help.