Fatal trap 12: page fault while in kernel mode when connecting PPPoE
-
In case it matters, I have reconfigured the LAGGs on both nodes so that the member order and interface names match exactly, but this did not change the behavior. So far I can't reproduce this in VMs. However, one of the VMs did crash once some time ago while I was trying to replicate a different pf bug. Unfortunately I did not save that crash, but it was similar: fatal trap 12, with the same two details:
fault virtual address = 0x50
and
fault code = supervisor read data, page not present
-
@w0w Can you try disabling pfsync?
-
@cmcdonald
The last time I disabled pfsync, the crashes stopped. But I need to re-test it.
-
@cmcdonald
Yes, it looks like the problem is limited to the “Synchronize states” option.
-
@w0w I've had a look at that dump, and while I think I've identified what's going wrong I do not understand how we can end up in that situation.
It'd be interesting to get a full core dump (as opposed to these text dumps). Are you up for reproducing the problem and sharing a core dump (along with the exact version you triggered the crash on, of course)?
Short version: add a device for a swap partition, ideally at least as large as system RAM. A USB stick should work. (Note you'll lose all data on the stick!)
If the USB (or other) swap device is da0, do:
gpart destroy -F da0
gpart create -s gpt da0
gpart add -t freebsd-swap da0
Add
/dev/da0p1 none swap sw 0 0
to /etc/fstab.
Edit /etc/pfSense-ddb.conf and change the
script kdb.enter.default
entry to
script kdb.enter.default=bt ; show registers ; dump ; reset
Reboot.
Future panics should dump a kernel core to the swap partition, which will get saved to /var/crash on the next boot. Those files (along with an exact version number of the system this happened on) should let us dig a bit deeper.
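Before rebooting, it may be worth double-checking the two edits described in this post. The following is only a minimal sketch, not an official pfSense tool: the helper names check_swap_entry and check_ddb_script are hypothetical, and the default paths and the da0p1 device are simply the ones used in the steps in this post.

```shell
#!/bin/sh
# Sketch: sanity-check the crash-dump setup before rebooting.
# check_swap_entry and check_ddb_script are hypothetical helper names.

# Succeeds if the fstab file has an uncommented swap line for the device.
# Usage: check_swap_entry DEVICE [FSTAB]
check_swap_entry() {
    dev="$1"
    fstab="${2:-/etc/fstab}"
    awk -v dev="$dev" '
        /^[ \t]*#/ { next }                      # skip comment lines
        $1 == dev && $3 == "swap" { found = 1 }  # device and FS type match
        END { exit found ? 0 : 1 }
    ' "$fstab"
}

# Succeeds if the kdb.enter.default script contains a "dump" command.
# Usage: check_ddb_script [DDB_CONF]
check_ddb_script() {
    conf="${1:-/etc/pfSense-ddb.conf}"
    awk '
        /^script[ \t]+kdb\.enter\.default=/ {
            sub(/^[^=]*=/, "")                   # keep only the command list
            n = split($0, cmds, ";")
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", cmds[i])  # trim whitespace
                if (cmds[i] == "dump") found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$conf"
}
```

With the functions loaded in a shell, `check_swap_entry /dev/da0p1 && check_ddb_script && echo "dump setup looks OK"` should print the message only if both edits are in place. After the reboot, `swapinfo` should also list the new swap device.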
-
@w0w What if you restrict pfsync updates to one direction only (primary to secondary, or vice versa) instead of bi-directional syncing?
-
@cmcdonald That's what I did last time
It looks like the crashes stopped, but it may need further testing; I'm not sure.
@kprovost
I have sent you some links to the core dumps privately.
-
@w0w Which sync path did you disable (primary to secondary, or secondary to primary)?
-
@cmcdonald
Secondary to primary.
-
https://redmine.pfsense.org/issues/14804
Just for reference: the problem is solved.