Kernel Panic on Temporarily Disable CARP with ixgbe driver
-
Could be a use-after-free.
pfSense 2.2.6 is based on FreeBSD 10.1-RELEASE-p25, which carries mainly security fixes from the 10 branch but is missing some other fixes.
https://svnweb.freebsd.org/base?view=revision&revision=277625 should fix it.
Yes, that's what I thought. That patch already appears to be in the devel branch, though: https://github.com/pfsense/FreeBSD-src/commit/f72184af7f1b19f99893f951a64a22f22ec344ba
I tried a beta build of 2.3 last week and hit the same problem. Are the beta snapshots built from the devel branch?
-
I guess it is.
Can you ssh into the pfSense box and run "uname -a" in the shell?
-
Can you ssh into the pfSense box and run "uname -a" in the shell?
FreeBSD <redacted> 10.2-STABLE FreeBSD 10.2-STABLE #317 58b7eab(devel): Fri Jan 15 04:28:46 CST 2016 root@pfs23-amd64-builder:/usr/home/pfsense/pfsense/tmp/obj/usr/home/pfsense/pfsense/tmp/FreeBSD-src/sys/pfSense amd64
-
https://github.com/pfsense/FreeBSD-src/commit/f72184af7f1b19f99893f951a64a22f22ec344ba#diff-2a75ab8f3cf1e4838de5abd9c14a1870
seems to be in there, if that's the tree the beta is built from.
-
Yeah, it looks like it is. The commit hash in uname (58b7eab) is from the devel branch.
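For anyone wanting to do the same check on their own box, here is a rough sketch of pulling the short commit hash and branch out of the version string. The sample string is copied from the uname output quoted above; the sed pattern simply assumes the "#<build> <hash>(<branch>):" layout seen there, which may not hold for other builds.

```shell
# Sample taken from the uname output in this thread; on a live system
# the same field would come from `uname -v`.
uname_out='FreeBSD 10.2-STABLE #317 58b7eab(devel): Fri Jan 15 04:28:46 CST 2016'

# Assumed layout: "#<build number> <hex hash>(<branch>):"
pattern='.*#[0-9][0-9]* \([0-9a-f][0-9a-f]*\)(\([^)]*\)).*'

# \1 is the short commit hash, \2 is the branch name in parentheses.
hash=$(printf '%s\n' "$uname_out" | sed -n "s/$pattern/\1/p")
branch=$(printf '%s\n' "$uname_out" | sed -n "s/$pattern/\2/p")

echo "commit: $hash  branch: $branch"
```

The hash can then be looked up in the pfsense/FreeBSD-src repository to see which branch contains it.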
As I mentioned earlier, this sounds similar to https://forum.pfsense.org/index.php?topic=55433.0, which wasn't actually solved, just worked around by using a different NIC. I've traced the code back from carp_forus(), which attempts to grab the lock, but once I get back to the ixgbe driver it gets too complicated for me, and I haven't managed to find what may be freeing the ifp pointer. The CARP code in question: https://github.com/pfsense/FreeBSD-src/blob/945ed01c4bae06169f63978e43029c04d4abd731/sys/netinet/ip_carp.c#L1126
-
I should add that ether_input() does check that ifp isn't a NULL pointer, but maybe there's a race condition here where something else is clearing it: https://github.com/pfsense/FreeBSD-src/blob/5aba7ffcfb97d9b6f4ce464de77b02ad4d7b8ad3/sys/net/if_ethersubr.c#L628
-
Did you try
hw.pci.enable_msix=0
btw?
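In case it helps, a small sketch of setting the tunable so it sticks across reboots. The helper name set_tunable is made up for this post, and the example writes to a scratch file; on a real pfSense box the target would normally be /boot/loader.conf.local, and the value only takes effect after a reboot. The quoted value follows the variable="value" form described in loader.conf(5).

```shell
# Hypothetical helper (not part of pfSense or FreeBSD): set NAME="VALUE"
# in a loader.conf-style file, replacing any earlier line for NAME.
set_tunable() {
    file=$1
    name=$2
    value=$3
    touch "$file"
    # Keep every line whose key differs from NAME, then append the new setting.
    awk -F= -v n="$name" '$1 != n' "$file" > "$file.tmp"
    printf '%s="%s"\n' "$name" "$value" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Demonstrated on a scratch file rather than the real /boot/loader.conf.local.
conf=$(mktemp)
set_tunable "$conf" hw.pci.enable_msix 0
set_tunable "$conf" hw.pci.enable_msix 0   # a second call must not duplicate the line
cat "$conf"
```

Rewriting the file instead of blindly appending keeps repeated runs from piling up duplicate lines.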
-
Did you try
hw.pci.enable_msix=0
Yep, that worked! What's the impact of disabling MSI-X, though? It would be nice not to have to disable MSI-X, and to get to the bottom of the bug.
-
MSI-X is an extension to MSI which, AFAIK, implements a separate capability structure and offers more vectors:
https://en.wikipedia.org/wiki/Message_Signaled_Interrupts
Now that it works without MSI-X, you could try a different slot for the ixgbe card (HP should have a best-practice document for that).
You could also try updating the server's BIOS. Maybe the MSI-X setup is somewhat borked.
-
Now that it works without MSI-X, you could try a different slot for the ixgbe card (HP should have a best-practice document for that).
You could also try updating the server's BIOS. Maybe the MSI-X setup is somewhat borked.
Server BIOS is up to date, running the latest release from HP which came out last month. I'll see if I can try a different slot. On a somewhat related note, I had to disable x2APIC in the BIOS for the machine to boot. Not sure if that's a BIOS or FreeBSD issue.