Failed 23.01 upgrade of 7100 pair
-
I have a pair of 7100s set up with CARP/HA, with ETH8 as the SYNC interface.
I upgraded the secondary first, with no issues.
But after the primary was upgraded, ETH8 shows no carrier and the primary reboots frequently.
The textdump.tar/panic.txt file just displays "page fault".
The last trap in textdump.tar/msgbuf.txt is:
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 10
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff813187ba
stack pointer = 0x0:0xfffffe0130908560
frame pointer = 0x0:0xfffffe0130908560
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 31445 (nginx)
rdi: fffff80142664114 rsi: 0 rdx: 41f
rcx: 41f r8: 1 r9: 9
rax: fffff80142664114 rbx: fffff801c8606900 rbp: fffffe0130908560
r10: 53f562b3 r11: 4c r12: fffff80077efc100
r13: 0 r14: fffff80141b6e400 r15: 1
trap number = 12
panic: page fault
cpuid = 2
time = 1678898975
KDB: enter: panic
After reverting the primary back to 22.05, ETH8 is active again.
Any ideas on what is the matter with the primary?
-
There is not enough information there to determine what happened; we need to see all of the files from the crash dump, not just that part.
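For anyone gathering the same data, here is a minimal sketch of reading every file out of a textdump.tar and summarizing the last trap in msgbuf.txt. The function names (`read_textdump`, `last_trap_fields`) are made up for illustration, and the parsing assumes the "key = value" layout seen in the excerpt above; it is not a Netgate tool.

```python
import io
import tarfile


def read_textdump(fileobj):
    """Return {member_name: text} for every regular file in a textdump tar."""
    files = {}
    with tarfile.open(fileobj=fileobj) as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                files[member.name] = data.decode("utf-8", errors="replace")
    return files


def last_trap_fields(msgbuf_text):
    """Collect 'key = value' lines that follow the last 'Fatal trap' line."""
    lines = msgbuf_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if line.startswith("Fatal trap"):
            start = i  # keep overwriting so we end on the last trap
    if start is None:
        return {}
    fields = {"trap": lines[start]}
    for line in lines[start + 1:]:
        if line.startswith("panic:"):
            fields["panic"] = line[len("panic:"):].strip()
            break
        key, sep, value = line.partition("=")
        # Skip register lines (no '=') and continuation lines (empty key).
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields
```

Usage would be along the lines of opening the file with `open("textdump.tar", "rb")`, passing it to `read_textdump`, and then calling `last_trap_fields` on the msgbuf.txt entry. The full set of files is still what should be attached for analysis.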
-
I created a support ticket and was told that in 23.01 MDI/MDI-X was turned off for the ETH ports.
Other than that, I was told to re-image the device and restore the config afterwards.
I will need to schedule this, but I'll update both here and of course the case.
-
The MDI-X issue wouldn't cause a panic; it would cause a loss of link on ports that need a different type of cable (straight-through vs. crossover patches).
Reinstalling may or may not help with a panic, but without the full crash dump it's hard to speculate more.
-
The full crash report was on your ticket and it shows you are hitting this:
https://redmine.pfsense.org/issues/13938
Either of the two workarounds shown there should prevent it.
Steve
-
The patch to disable sendfile is in the recommended patches list in the system patches package, so that would be the easiest way to apply it.
-
Thanks @jimp and @stephenw10 !
I can confirm that the sendfile patch removes the issue with panic.
Now is there a way to re-enable MDI-X for the ETH ports?
If not, what was the reason for removing that functionality?
-
It should not have been removed. It's a bug and has been fixed for 23.05, but it would require a rebuild to fix in 23.01.
-
OK, I'll do sync over a VLAN until 23.05 arrives.
Again, thanks for your help!
-
That will work. Or just swap the link out for a cross-over cable.