Random kernel panic and restart on 2.7.2
-
Hmm, that backtrace is huge:
db:0:kdb.enter.default> bt
Tracing pid 0 tid 100008 td 0xfffffe0020516000
kdb_enter() at kdb_enter+0x32/frame 0xfffffe002036acd0
vpanic() at vpanic+0x163/frame 0xfffffe002036ae00
panic() at panic+0x43/frame 0xfffffe002036ae60
dblfault_handler() at dblfault_handler+0x1ce/frame 0xfffffe002036af20
Xdblfault() at Xdblfault+0xd7/frame 0xfffffe002036af20
--- trap 0x17, rip = 0xffffffff80f6bc74, rsp = 0xfffffe001d7d2000, rbp = 0xfffffe001d7d2000 ---
ipsec6_checkpolicy() at ipsec6_checkpolicy+0x4/frame 0xfffffe001d7d2000
ipsec6_common_output() at ipsec6_common_output+0x28/frame 0xfffffe001d7d2040
ip6_output() at ip6_output+0x102/frame 0xfffffe001d7d2260
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d22c0
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d2490
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d24c0
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d24f0
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d2530
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d25a0
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d25c0
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d2660
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d26c0
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d2700
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d2920
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d2980
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d2b50
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d2b80
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d2bb0
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d2bf0
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d2c60
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d2c80
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d2d20
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d2d80
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d2dc0
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d2fe0
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d3040
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d3210
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d3240
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d3270
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d32b0
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d3320
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d3340
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d33e0
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d3440
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d3480
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d36a0
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d3700
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d38d0
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d3900
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d3930
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d3970
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d39e0
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d3a00
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d3aa0
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d3b00
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d3b40
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d3d60
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d3dc0
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d3f90
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d3fc0
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d3ff0
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d4030
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d40a0
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d40c0
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d4160
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d41c0
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d4200
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d4420
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d4480
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d4650
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d4680
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d46b0
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d46f0
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d4760
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d4780
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d4820
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d4880
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d48c0
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d4ae0
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d4b40
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d4d10
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d4d40
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d4d70
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d4db0
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d4e20
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d4e40
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d4ee0
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d4f40
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d4f80
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d51a0
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d5200
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d53d0
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d5400
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d5430
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d5470
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d54e0
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d5500
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d55a0
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d5600
ip6_forward() at ip6_forward+0x99c/frame 0xfffffe001d7d5700
pf_refragment6() at pf_refragment6+0x18d/frame 0xfffffe001d7d5760
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d5930
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d5960
pfil_mbuf_fwd() at pfil_mbuf_fwd+0x38/frame 0xfffffe001d7d5990
ip6_forward() at ip6_forward+0x3fd/frame 0xfffffe001d7d5a90
ip6_input() at ip6_input+0xa57/frame 0xfffffe001d7d5b70
netisr_dispatch_src() at netisr_dispatch_src+0x22c/frame 0xfffffe001d7d5bc0
ether_demux() at ether_demux+0x149/frame 0xfffffe001d7d5bf0
ether_nh_input() at ether_nh_input+0x36e/frame 0xfffffe001d7d5c50
netisr_dispatch_src() at netisr_dispatch_src+0xaf/frame 0xfffffe001d7d5ca0
ether_input() at ether_input+0x69/frame 0xfffffe001d7d5d00
iflib_rxeof() at iflib_rxeof+0xc46/frame 0xfffffe001d7d5e00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe001d7d5e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x14e/frame 0xfffffe001d7d5ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe001d7d5ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe001d7d5f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001d7d5f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Do the crashes always show that? Or at least very similar to that?
It looks similar to this: https://redmine.pfsense.org/issues/14431 Though I'd expect to see something logged due to an interface or link going down. You didn't remove anything from the msgbuf output?
Steve
-
@stephenw10 except for PIDs, they are near identical from one crash to another.
I'll take a look at your link.
Thank you!
-
I would guess it's triggered by an IPsec tunnel carrying IPv6, if you have one?
Of course that should not panic...
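For anyone who wants to check their own dump for the same pattern, here is a minimal Python sketch that pulls the function names out of a ddb backtrace and reports which ones repeat. The regular expression is an assumption based on the frame format in the trace above (`name() at name+0xNN/frame 0xADDR`), the helper names are made up for illustration, and the sample is just thirteen consecutive frames copied from that trace. Treat it as a rough aid for spotting recursion, not an official diagnostic.

```python
import re
from collections import Counter

# Assumed frame format, taken from the trace in this thread:
#   name() at name+0xOFFSET/frame 0xADDRESS
# \s+ is used between tokens so that wrapped or flattened text still matches.
FRAME_RE = re.compile(r"(\w+)\(\)\s+at\s+\1\+0x[0-9a-f]+/frame\s+0x[0-9a-f]+")

# Thirteen consecutive frames copied from the backtrace posted above.
SAMPLE = """
ip6_output() at ip6_output+0x102/frame 0xfffffe001d7d2260
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d22c0
pf_test6() at pf_test6+0x153b/frame 0xfffffe001d7d2490
pf_check6_out() at pf_check6_out+0x43/frame 0xfffffe001d7d24c0
pfil_mbuf_out() at pfil_mbuf_out+0x38/frame 0xfffffe001d7d24f0
enc_hhook() at enc_hhook+0x262/frame 0xfffffe001d7d2530
hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe001d7d25a0
ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe001d7d25c0
ipsec6_perform_request() at ipsec6_perform_request+0x76/frame 0xfffffe001d7d2660
ipsec_transmit() at ipsec_transmit+0x170/frame 0xfffffe001d7d26c0
ip6_output_send() at ip6_output_send+0xe3/frame 0xfffffe001d7d2700
ip6_output() at ip6_output+0x1d57/frame 0xfffffe001d7d2920
pf_refragment6() at pf_refragment6+0x1ab/frame 0xfffffe001d7d2980
"""


def frame_names(backtrace: str) -> list[str]:
    """Return the function name of every frame, in the order printed."""
    return [match.group(1) for match in FRAME_RE.finditer(backtrace)]


def repeated_frames(backtrace: str, min_count: int = 2) -> dict[str, int]:
    """Return the functions that appear at least min_count times."""
    counts = Counter(frame_names(backtrace))
    return {name: n for name, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    for name, n in repeated_frames(SAMPLE).items():
        print(f"{name}: {n} frames")
```

Run against the full trace above, the same block of eleven frames between ip6_output() and ip6_output_send() shows up again and again, which would be consistent with a packet looping between pf and the IPsec output path until the kernel stack runs out.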
-
Well, it seems the issue could well be related to that bug!
Let me give some context for better understanding.
This is my home-office firewall.
I do a lot of remote work as IT manager for some customers, and I connect to the internet via a dual 4G setup (two different providers, pointing at different BTSs), since I live in a remote location (not a great choice for an IT guy, but hey...).
So I've set up many IPsec VPNs, with dual tunnels for every customer, using routed VTI and BGP dynamic routing via FRR.
With this setup, I can continue to work even when one of my WANs goes down or loses packets (which is quite common).
All of this is IPv4 only (there is no IPv6 connectivity from the mobile providers here).
But I also have the same connection scheme to a location with IPv6, from which I take a /64 subnet to my house for experimental purposes.
So one of the two tunnels to this location also routes this IPv6 subnet to my house, on a second IPsec Phase 2.
The whole system worked very well, but after learning about this bug, I think it could be that during a connection drop, even a brief one, some IPv6 traffic tries to pass through the offline IPsec interface...
This setup has actually been working for more than a year, but I previously tunneled via OpenVPN and only recently switched this IPv6 routing to IPsec.
I'll try shutting down the IPv6 stack entirely and see if that fixes it.
Thank you!
-
Yeah, that seems likely from that backtrace. If it always happens in less than 12 hours, that should be easy enough to test.
I've never been able to replicate that panic locally, which means it's far more difficult to pin down.
-
Pretty frequent, so I think we'll know by tomorrow
Feb 8 17:36:18 root 26063 Bootup complete
Feb 8 16:49:44 root 428 Bootup complete
Feb 8 14:47:03 root 21766 Bootup complete
Feb 8 13:45:17 root 93044 Bootup complete
Feb 8 07:10:15 root 60172 Bootup complete
Feb 8 00:14:39 root 77642 Bootup complete
Feb 7 15:36:30 root 45258 Bootup complete
Feb 7 14:53:23 root 98846 Bootup complete
Feb 7 13:40:11 root 15021 Bootup complete
Feb 7 12:38:21 root 49059 Bootup complete
Feb 7 10:34:49 root 34494 Bootup complete
Feb 7 10:16:51 root 75312 Bootup complete
Feb 7 08:35:36 root 62958 Bootup complete
Feb 7 07:53:43 root 54954 Bootup complete
Feb 6 23:09:37 root 60224 Bootup complete
Feb 6 22:28:15 root 96734 Bootup complete
Feb 6 21:22:09 root 802 Bootup complete
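To put a number on "pretty frequent", here is a small Python sketch that works out the time between consecutive "Bootup complete" entries in the log above. The log lines carry no year, so the year is an explicit assumption, and the function names are only illustrative. The gap between two boots is only an approximation of the uptime before each crash, since it also includes the reboot itself.

```python
from datetime import datetime, timedelta

ASSUMED_YEAR = 2024  # assumption: the syslog lines above do not include a year

BOOT_LOG = """\
Feb 8 17:36:18 root 26063 Bootup complete
Feb 8 16:49:44 root 428 Bootup complete
Feb 8 14:47:03 root 21766 Bootup complete
Feb 8 13:45:17 root 93044 Bootup complete
Feb 8 07:10:15 root 60172 Bootup complete
Feb 8 00:14:39 root 77642 Bootup complete
Feb 7 15:36:30 root 45258 Bootup complete
Feb 7 14:53:23 root 98846 Bootup complete
Feb 7 13:40:11 root 15021 Bootup complete
Feb 7 12:38:21 root 49059 Bootup complete
Feb 7 10:34:49 root 34494 Bootup complete
Feb 7 10:16:51 root 75312 Bootup complete
Feb 7 08:35:36 root 62958 Bootup complete
Feb 7 07:53:43 root 54954 Bootup complete
Feb 6 23:09:37 root 60224 Bootup complete
Feb 6 22:28:15 root 96734 Bootup complete
Feb 6 21:22:09 root 802 Bootup complete
"""


def boot_times(log: str, year: int = ASSUMED_YEAR) -> list[datetime]:
    """Parse the timestamp of every 'Bootup complete' line, oldest first."""
    times = []
    for line in log.splitlines():
        if "Bootup complete" not in line:
            continue
        month, day, clock = line.split()[:3]
        stamp = f"{year} {month} {day} {clock}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return sorted(times)


def boot_gaps(times: list[datetime]) -> list[timedelta]:
    """Return the time elapsed between each pair of consecutive boots."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = boot_gaps(boot_times(BOOT_LOG))
print(f"{len(gaps)} intervals, longest {max(gaps)}, shortest {min(gaps)}")
print("all under 12 hours:", max(gaps) < timedelta(hours=12))
```

With the entries above, the longest interval between two boots is under nine hours, so an uptime comfortably beyond 12 hours after a configuration change would be a reasonable sign that the change made a difference.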
-
I have been crashing pretty regularly on 2.7.2 too, so I am curious what I can search for in my crash dumps to see if this is similar. Or which files would someone like to view?
-
The most telling line in the backtrace is probably:
ip6_output()
Though that doesn't always appear, as you can see in the bug report, where it happens on PPP links.
-
Uptime: 15h 44m
I'll keep an eye on it today too, but it seems that disabling IPv6 subnet tunneling solved the problem, so it's probably the same anomaly.
Do you know if a fix for this problem is also planned for pfSense CE?
As a workaround for my specific problem, I could try to restore that routing on OpenVPN tunnels as before (which had never given this problem), but it would be nice if it were solved.
Thanks,
Edoardo
-
@EdoFede said in Random kernel panic and restart on 2.7.2:
I could try to restore that routing on OpenVPN tunnels as before
That would be a good test. I would expect both VPN types to go down at the same time, so IPv6 sessions over both should behave similarly. So if it doesn't panic over OpenVPN, then it's handling that differently, which could be a clue.
-
I'm trying to replicate the IPv6 routing setup on OpenVPN, but something doesn't work as expected.
I'm not doing it exactly the same way as before (which worked, tunneling both IPv4 and IPv6), because I've already set up IPv4 over IPsec + BGP for this site and don't want to break the whole setup.
I'm trying to route only IPv6 traffic over the OpenVPN tunnel (which has IPv4 endpoints, as before), but something is wrong, I think in the routing.
I'm able to ping6 Google from the "remote" firewall via the tunnel, but not on the internal "remote" IPv6 network.
I'll investigate and let you know if I can reproduce the issue on OpenVPN as well.
I don't think that will happen anyway, because nothing like this has ever happened to me before with IPv6 and OpenVPN.
Bye!
Edo
-
Sounds like a missing iroute at the server end.
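For context, an OpenVPN server running in multi-client mode keeps its own internal table of which client owns which prefix, and `iroute` (or `iroute-ipv6` for IPv6) is the directive that adds a subnet behind a client to that table. The Python sketch below is only a toy model of that lookup, not OpenVPN's actual code: the class, the client name, and the addresses (taken from the 2001:db8::/32 documentation range) are all hypothetical. It shows why the firewall's own tunnel address can be reachable while hosts in the subnet behind it are not.

```python
import ipaddress
from typing import Optional


class ServerRoutingSketch:
    """Toy model of a VPN server's per-client routing table (illustrative only)."""

    def __init__(self) -> None:
        # Each entry maps a prefix to the client that owns it.
        self.routes: list[tuple[ipaddress.IPv6Network, str]] = []

    def learn_tunnel_address(self, client: str, address: str) -> None:
        """Record the client's own tunnel address as a /128 host route."""
        self.routes.append((ipaddress.IPv6Network(address), client))

    def add_iroute(self, client: str, prefix: str) -> None:
        """Record a subnet that sits behind the client (what iroute provides)."""
        self.routes.append((ipaddress.IPv6Network(prefix), client))

    def owner_of(self, address: str) -> Optional[str]:
        """Return the client owning the address, or None if no route matches."""
        ip = ipaddress.IPv6Address(address)
        matches = [(net, client) for net, client in self.routes if ip in net]
        if not matches:
            return None
        # Longest-prefix match, as in ordinary routing.
        return max(matches, key=lambda entry: entry[0].prefixlen)[1]


server = ServerRoutingSketch()
server.learn_tunnel_address("remote-fw", "2001:db8:ffff::2")  # hypothetical tunnel address

# Without an iroute, only the firewall's tunnel address is known to the server.
print(server.owner_of("2001:db8:ffff::2"))  # remote-fw
print(server.owner_of("2001:db8:1::10"))    # None: host in the LAN behind the client

# After adding the internal /64 as an iroute, the LAN host resolves to the client.
server.add_iroute("remote-fw", "2001:db8:1::/64")
print(server.owner_of("2001:db8:1::10"))    # remote-fw
```

In this model, traffic for the internal /64 has no owner until the iroute entry exists, which is roughly what a missing iroute looks like from the outside: the tunnel endpoint works, the network behind it does not.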
-
Yeah, missing iroute!
Now fixed, thanks!
Now I'll monitor the system and let you know if the issue also happens on OpenVPN.
Bye,
Edo
-
It seems to be stable on OpenVPN... no reboots or crashes so far.
The only setup difference from the IPsec configuration is that on IPsec I had to enter the default route manually (route -6 add default <tunnel endpoint>), because for some strange reason it was not set automatically (even though I selected the gateway as default in the routing menu).
I'll write if it happens again, but I would say the problem only seems to be present on IPsec.