pfsense panic after "sonewconn: pcb [address] Listen queue overflow"
-
I have a NETGATE 1541 which had been running well, but it suddenly stopped responding late last night. After checking, I found it was a kernel panic.
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (1 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (17 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (3 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (9 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (5 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (3 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (9 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (1 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (12 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (7 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (5 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (2 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (8 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (3 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (14 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (9 occurrences)
<7>sonewconn: pcb 0xfffff802f7491988: Listen queue overflow: 49 already in queue awaiting acceptance (6 occurrences)
Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address = 0x0
fault code = supervisor read data, page not present
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read instruction, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xfffffe00a6b7ba08
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address = 0x0
fault code = supervisor read instruction, page not present
frame pointer = 0x28:0xfffffe00a6b7ba40
code segment = base 0x0, limit 0xfffff, type 0x1b
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xfffffe00a6b08a78
frame pointer = 0x28:0xfffffe00a6b08ad0
instruction pointer = 0x20:0xffffffff83d20708
stack pointer = 0x28:0xfffffe00a13d8ac0
frame pointer = 0x28:0xfffffe00a13d8ac0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled,
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 42 (zio_write_intr_0_8)
trap number = 12
panic: page fault
cpuid = 2
time = 1679593982
KDB: enter: panic
dumpfile:
textdump.tar info.0 -
panic:
db:0:kdb.enter.default> show registers
cs 0x20
ds 0x3b ll+0x1a
es 0x3b ll+0x1a
fs 0x13
gs 0x1b
ss 0x28 ll+0x7
rax 0x12
rcx 0x1
rdx 0xfffffe00a6b7b6b0
rbx 0xffffffff816654e6
rsp 0xfffffe00a6b7b7b0
rbp 0xfffffe00a6b7b7c0
rsi 0xa
rdi 0xffffffff835048f8 uart_console+0x10
r8 0
r9 0x20
r10 0xffffffff832a1578 vt_conswindow
r11 0x133 ll+0x112
r12 0xffffffff815829bc
r13 0xfffffe00a6b7b940
r14 0x100 ll+0xdf
r15 0xfffff800233c1740
rip 0xffffffff80dd2247 kdb_enter+0x37
rflags 0x86 ll+0x65
kdb_enter+0x37: movq $0,0x28feec6(%rip)
db:0:kdb.enter.default> run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo> show alllocks
No such command; use "help" to list available commands
db:1:lockinfo> show lockedvnods
Locked vnodes
db:0:kdb.enter.default> show pcpu
cpuid = 2
dynamic pcpu = 0xfffffe0080e30140
curthread = 0xfffff800233c1740: pid 42 tid 100423 "zio_write_intr_0_8"
curpcb = 0xfffff800233c1ce0
fpcurthread = none
idlethread = 0xfffff80005669000: tid 100005 "idle: cpu2"
curpmap = 0xffffffff83690da8
tssp = 0xffffffff8371af70
commontssp = 0xffffffff8371af70
rsp0 = 0xfffffe00a6b7bcc0
kcr3 = 0x80000000040e6002
ucr3 = 0xffffffffffffffff
scr3 = 0x3f247e813
gs32p = 0xffffffff83721788
ldt = 0xffffffff837217c8
tss = 0xffffffff837217b8
tlb gen = 13359505
curvnet = 0
db:0:kdb.enter.default> bt
Tracing pid 42 tid 100423 td 0xfffff800233c1740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00a6b7b7c0
vpanic() at vpanic+0x194/frame 0xfffffe00a6b7b810
panic() at panic+0x43/frame 0xfffffe00a6b7b870
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe00a6b7b8d0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00a6b7b930
calltrap() at calltrap+0x8/frame 0xfffffe00a6b7b930
--- trap 0xc, rip = 0, rsp = 0xfffffe00a6b7ba08, rbp = 0xfffffe00a6b7ba40 ---
??() at 0/frame 0xfffffe00a6b7ba40
zio_done() at zio_done+0x7fe/frame 0xfffffe00a6b7bad0
zio_execute() at zio_execute+0xad/frame 0xfffffe00a6b7bb20
taskqueue_run_locked() at taskqueue_run_locked+0x144/frame 0xfffffe00a6b7bb80
taskqueue_thread_loop() at taskqueue_thread_loop+0xd2/frame 0xfffffe00a6b7bbb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00a6b7bbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00a6b7bbf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
-
When it panicked, I was running `make makesum` in the ports tree on another FreeBSD machine to update the dependency packages of gitlab-ce, which pulls a large number of Go packages at high frequency.
Shortly after that, the IP of one of my NICs became unreachable. The strange thing is that the IP on another NIC kept working normally. Then, after some time, it apparently panicked.
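For what it's worth, the sonewconn messages just mean some daemon's accept queue filled up faster than it was draining. On FreeBSD (and so pfSense) you can see which listening socket is backlogged and what the system-wide cap is with something like the following sketch (run in a shell on the firewall; the exact columns vary by release):

```shell
# List listening sockets with their current and maximum accept-queue
# depths (the qlen/maxqlen columns) to identify which service is
# falling behind.
netstat -Lan

# System-wide limit on a socket's listen backlog. On recent FreeBSD
# this is kern.ipc.soacceptqueue; kern.ipc.somaxconn is kept as a
# compatibility alias on older releases.
sysctl kern.ipc.soacceptqueue
```

This only identifies the victim, not the cause — in a case like this the overflow is more likely a symptom of the box stalling than of genuine connection load.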
-
Hmm, the last thing shown there is a ZFS write thread. If it was unable to write to the drive, then whatever is accepting connections there could also have been unable to, eventually filling the buffers and causing that listen queue overflow.
However, usually that would also prevent writing the crash report. Are you able to reproduce this?
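If you still have the crash data, the textdump tarball holds the full ddb session and the kernel message buffer from just before the panic, which may show disk errors. A sketch, assuming the default /var/crash location:

```shell
# Unpack the textdump; per textdump(4) it contains ddb.txt, msgbuf.txt,
# panic.txt, version.txt (and usually config.txt).
cd /var/crash
tar -xf textdump.tar

cat panic.txt    # one-line panic reason
less msgbuf.txt  # kernel messages leading up to the panic (look for
                 # CAM/ahci/nvme errors that would indicate a failing disk)
```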
Steve