pf_test: kif == NULL, if_xname on multi-WAN and "Reset all states if WAN IP Address changes"
-
The good news is it hasn't failed on me again since the one time, 15 minutes after the original upgrade.
So... scary start, but stable for nearly 24 hours now.
Hopefully that release is SOON.
-
Re-tested on the latest build
2.5.0-DEVELOPMENT (amd64)
built on Fri Apr 03 19:36:42 EDT 2020
FreeBSD 12.0-RELEASE-p10
I don't know exactly what was merged or patched, but I can no longer crash my system with my test sequence. I'll check later what changes were made on GitHub.
-
Hmm... looks like this came back after I configured CARP, started playing with PPPoE on the secondary firewall, and re-plugged the main Ethernet cable that comes from the ISP (PPPoE).
Crash report begins. Anonymous machine information:
amd64
12.1-STABLE
FreeBSD 12.1-STABLE f1de4082be8(devel-12) pfSense
Crash report details:
No PHP errors found.
..........................................
<118>Starting CRON... done.
<118> Starting package Cron...done.
<118> Starting package System Patches...done.
<118> Starting package Service Watchdog...done.
<118> Starting package nut...done.
<118> Starting package Shellcmd...done.
<118> Starting package Backup...done.
<118> Starting package iperf...done.
<118> Starting /usr/local/etc/rc.d/shutdown.nut.sh...done.
<118>pfSense 2.5.0-DEVELOPMENT amd64 Fri May 22 07:43:46 EDT 2020
<118>Bootup complete
<6>ix0: link state changed to DOWN
<6>ix0: link state changed to UP
<6>ix0: link state changed to DOWN
<6>ix0: link state changed to UP
<6>gif0: link state changed to DOWN
<6>ng0: changing name to 'pppoe0'
<6>gif0: link state changed to DOWN
<6>gif0: link state changed to UP
<6>gif0: link state changed to DOWN
<6>gif0: link state changed to UP
<6>gif0: link state changed to DOWN
pf_test: kif == NULL, if_xname
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname @��
<6>ng0: changing name to 'pppoe0'
pf_test: kif == NULL, if_xname @��
pf_test: kif == NULL, if_xname
<6>gif0: link state changed to DOWN
<6>gif0: link state changed to UP
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address = 0x70
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f25037
stack pointer = 0x28:0xfffffe0095dd2370
frame pointer = 0x28:0xfffffe0095dd23b0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 80273 (unbound)
trap number = 12
panic: page fault
cpuid = 2
time = 1590480096
KDB: enter: panic
The crash occurs only on the primary firewall.
Full log is here -
I also got the messages from the topic title on the backup firewall while experimenting with PPPoE (same link):
May 26 19:34:51 kernel
May 26 19:34:51 root 54561 PPPoE put down
May 26 19:34:51 kernel
May 26 19:34:50 kernel
May 26 19:34:50 kernel
May 26 19:34:49 kernel
May 26 19:34:49 kernel
May 26 19:34:48 kernel
May 26 19:34:48 kernel
May 26 19:34:47 kernel
May 26 19:34:47 kernel
May 26 19:34:47 kernel
May 26 19:34:46 kernel
May 26 19:34:46 kernel
May 26 19:34:46 kernel
May 26 19:34:45 kernel
May 26 19:34:45 kernel
May 26 19:34:44 kernel pf_test: kif == NULL, if_xname
May 26 19:34:44 kernel pf_test: kif == NULL, if_xname
May 26 19:34:43 kernel pf_test: kif == NULL, if_xname
May 26 19:34:43 kernel pf_test: kif == NULL, if_xname
May 26 19:34:42 kernel
May 26 19:34:42 kernel
May 26 19:34:42 kernel pf_test: kif == NULL, if_xname
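In case it helps anyone watching for these messages: below is a minimal Python sketch (not part of pfSense) that tallies "pf_test: kif == NULL" entries per second from a system log export. It assumes one entry per line in the "May 26 19:34:44 kernel ..." format shown above; the function name count_kif_null and the sample lines are only for illustration.

```python
import re
from collections import Counter

# Matches e.g. "May 26 19:34:44 kernel pf_test: kif == NULL, if_xname"
# and captures the timestamp. The layout is assumed from the log above.
KIF_NULL = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+kernel\s+pf_test: kif == NULL"
)

def count_kif_null(lines):
    """Return a Counter mapping timestamp -> number of kif == NULL messages."""
    counts = Counter()
    for line in lines:
        match = KIF_NULL.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "May 26 19:34:44 kernel pf_test: kif == NULL, if_xname",
        "May 26 19:34:44 kernel pf_test: kif == NULL, if_xname",
        "May 26 19:34:43 kernel pf_test: kif == NULL, if_xname",
        "May 26 19:34:42 kernel",
    ]
    for stamp, n in sorted(count_kif_null(sample).items()):
        print(stamp, n)
```

A burst of these around a PPPoE link flap would at least show whether the messages line up with the interface going down.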
-
Maybe it's https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230498
-
I've played around a bit: 2.4.5-RELEASE is still not affected, while 2.5 crashes most of the time (though not every time) when I issue 'killall mpd5' or even 'rc.linkup stop wan'.
The configuration is the same: PPPoE as WAN, DHCP as WAN2, no IPv6, and only failover configured, with PPPoE as tier 1 and DHCP as tier 2.
I don't think unbound is the real cause of the crash: I stopped the unbound service, repeated the sequence (disconnecting and reconnecting the WAN port cable multiple times), and got another dump.
crash00.txt
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x0
fault code = supervisor read instruction, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xfffffe0074d614a8
frame pointer = 0x28:0xfffffe0074d615f0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (0))
trap number = 12
panic: page fault
cpuid = 3
time = 1590747377
KDB: enter: panic
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234296 looks very similar.
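For comparing dumps like the two in this thread side by side, here is a rough Python sketch that pulls the key fields out of a "Fatal trap" block and converts the panic "time" value (assumed to be Unix epoch seconds) to UTC. The function name summarize_panic and the field names are made up for illustration; this is not a pfSense or FreeBSD tool.

```python
import re
from datetime import datetime, timezone

# Patterns for fields as they appear in the trap output quoted in this thread.
# re.S lets them work whether the dump is one long line or split across lines.
FIELDS = {
    "fault_addr": r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)",
    "fault_code": r"fault code\s*=\s*(supervisor \w+ \w+)",
    "instruction_pointer": r"instruction pointer\s*=\s*(0x[0-9a-fA-F]+:0x[0-9a-fA-F]+)",
    "process": r"current process\s*=\s*(.+?)\s+trap number",
    "time": r"\btime\s*=\s*(\d+)",
}

def summarize_panic(text):
    """Extract selected fields from a trap dump; missing fields become None."""
    summary = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text, re.S)
        summary[name] = match.group(1) if match else None
    if summary["time"] is not None:
        # Assumption: the panic time is seconds since the Unix epoch.
        stamp = datetime.fromtimestamp(int(summary["time"]), tz=timezone.utc)
        summary["time_utc"] = stamp.isoformat()
    return summary

if __name__ == "__main__":
    dump = (
        "fault virtual address = 0x70 "
        "fault code = supervisor read data, page not present "
        "instruction pointer = 0x20:0xffffffff80f25037 "
        "current process = 80273 (unbound) trap number = 12 "
        "panic: page fault cpuid = 2 time = 1590480096"
    )
    for key, value in summarize_panic(dump).items():
        print(f"{key}: {value}")
```

Run against both dumps, this makes the difference easy to see: the first is a data read at 0x70 in the context of unbound, the second an instruction fetch at 0x0 in swi4: clock.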
-
Is the patch from https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234296 included in the latest pfSense 2.5 snapshots? I mean https://svnweb.freebsd.org/base?view=revision&revision=343787
-
If it was marked fixed in 12.0-RELEASE then yes, that would be in 2.5.0. 2.5.0 snapshots are on 12.1-STABLE now, so well past that point.
-
@jimp
Can you suggest anything?
-
Nothing I'm aware of for that, I'm afraid.
-
@jimp
Do you think this fatal trap could be due to the radix_mpath option being enabled in the kernel?
Any chance of getting a snapshot with this option disabled?
-
It's possible. There are other problems with RADIX_MPATH as well but I'm not sure if we're going to look into fixing them or back that out.