Netgate Discussion Forum

    pfSense Active CARP Member Crashed: aesni_process -> crypto_dispatch ...

    IPsec
    22 Posts 4 Posters 1.7k Views
      jimp Rebel Alliance Developer Netgate

      That doesn't quite look like the other thread you mentioned, which I believe was also tied to https://redmine.pfsense.org/issues/8070. Your crash has a somewhat different backtrace, and those cases were using AES-GCM, not AES-256.

      I'm not finding anything else that lines up exactly with what you are seeing here on current versions of FreeBSD.

      Does the behavior change if you disable AES-NI/cryptodev?

      Could you possibly try a 2.4.5-RC snapshot and see if it happens there?

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

        monotypeTattoo

        Thank you Jimp.
        We had a recurrence last night, 12 days after the first incident. We are going to leave it another 12 days and then upgrade to 2.4.5.

          monotypeTattoo @monotypeTattoo

          This has happened again, still on 2.4.4-p3:

          Tracing pid 12 tid 100124 td 0xfffff80004760000
          pf_test() at pf_test+0x1d24/frame 0xfffffe00003000c0
          pf_check_out() at pf_check_out+0x1d/frame 0xfffffe00003000e0
          pfil_run_hooks() at pfil_run_hooks+0x90/frame 0xfffffe0000300170
          ip_output() at ip_output+0xb1d/frame 0xfffffe00003002a0
          ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfffffe00003002f0
          esp_output_cb() at esp_output_cb+0xeb/frame 0xfffffe0000300350
          aesni_process() at aesni_process+0x151/frame 0xfffffe0000300400
          crypto_dispatch() at crypto_dispatch+0x140/frame 0xfffffe0000300440
          esp_output() at esp_output+0x5cc/frame 0xfffffe00003004e0
          ipsec4_perform_request() at ipsec4_perform_request+0x37f/frame 0xfffffe0000300580
          ipsec4_forward() at ipsec4_forward+0x5a/frame 0xfffffe00003005b0
          ip_forward() at ip_forward+0x221/frame 0xfffffe0000300650
          ip_input() at ip_input+0x72a/frame 0xfffffe00003006b0
          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0000300700
          ether_demux() at ether_demux+0x173/frame 0xfffffe0000300730
          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0000300790
          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe00003007e0
          ether_input() at ether_input+0x26/frame 0xfffffe0000300800
          igb_rxeof() at igb_rxeof+0x6e1/frame 0xfffffe0000300890
          igb_msix_que() at igb_msix_que+0x110/frame 0xfffffe00003008e0
          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0000300920
          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0000300970
          fork_exit() at fork_exit+0x83/frame 0xfffffe00003009b0
          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003009b0
          

          That's the third time, 19 days since the last crash.
          We have started rolling 2.4.5 out across our sites, and I dare say this firewall will be upgraded to 2.4.5 before it crashes again.

          Will advise whether the upgrade resolves the problem.

            monotypeTattoo

            We upgraded to 2.4.5 on Monday (2020-05-04).
            Tonight the primary firewall crashed again. The stack trace looks very similar from line 7 onwards.

            We are going to turn off AES-NI acceleration later on tonight.

            Tracing pid 12 tid 100135 td 0xfffff8000470f620
            kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0000336e10
            vpanic() at vpanic+0x19b/frame 0xfffffe0000336e70
            panic() at panic+0x43/frame 0xfffffe0000336ed0
            trap_pfault() at trap_pfault/frame 0xfffffe0000336f20
            trap_pfault() at trap_pfault+0x49/frame 0xfffffe0000336f80
            trap() at trap+0x29d/frame 0xfffffe0000337090
            calltrap() at calltrap+0x8/frame 0xfffffe0000337090
            --- trap 0xc, rip = 0xffffffff80e89c3b, rsp = 0xfffffe0000337160, rbp = 0xfffffe0000337280 ---
            ip_output() at ip_output+0x12fb/frame 0xfffffe0000337280
            ipsec_process_done() at ipsec_process_done+0x1c7/frame 0xfffffe00003372d0
            esp_output_cb() at esp_output_cb+0xea/frame 0xfffffe0000337330
            aesni_process() at aesni_process+0x151/frame 0xfffffe00003373e0
            crypto_dispatch() at crypto_dispatch+0x14d/frame 0xfffffe0000337410
            esp_output() at esp_output+0x601/frame 0xfffffe00003374b0
            ipsec4_perform_request() at ipsec4_perform_request+0x38c/frame 0xfffffe0000337550
            ipsec4_forward() at ipsec4_forward+0x5a/frame 0xfffffe0000337580
            ip_forward() at ip_forward+0x230/frame 0xfffffe0000337620
            ip_input() at ip_input+0x724/frame 0xfffffe00003376b0
            netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe0000337700
            ether_demux() at ether_demux+0x15b/frame 0xfffffe0000337730
            ether_nh_input() at ether_nh_input+0x32c/frame 0xfffffe0000337790
            netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe00003377e0
            ether_input() at ether_input+0x26/frame 0xfffffe0000337800
            igb_rxeof() at igb_rxeof+0x6d5/frame 0xfffffe0000337890
            igb_msix_que() at igb_msix_que+0x101/frame 0xfffffe00003378e0
            intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0000337920
            ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0000337970
            fork_exit() at fork_exit+0x83/frame 0xfffffe00003379b0
            fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003379b0
            
            FreeBSD 11.3-STABLE #236 21cbb70bbd1(RELENG_2_4_5): Tue Mar 24 15:26:53 EDT 2020     root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-245/obj/amd64/YNx4Qq3j/build/ce-crossbuild-245/sources/FreeBSD-src/sys/pfSense
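
One way to check how closely two backtraces match is to compare function names frame by frame, starting from the outermost frame. The following is only a rough sketch: the helper names `parse_frames` and `shared_call_path` are hypothetical, the regular expression assumes the ddb line format shown in this thread, and the sample data is the innermost part of the 2.4.4-p3 trace and the 2.4.5 trace (after the trap frame) posted above.

```python
import re

# Matches ddb backtrace lines such as
#   "ip_output() at ip_output+0x12fb/frame 0xfffffe0000337280"
# The "+0x..." offset is optional ("trap_pfault() at trap_pfault/frame ...").
FRAME_RE = re.compile(
    r"^\s*(\w+)\(\) at \1(?:\+0x([0-9a-f]+))?/frame 0x[0-9a-f]+\s*$"
)


def parse_frames(trace_text):
    """Return (function, offset) tuples, innermost frame first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            offset = int(match.group(2), 16) if match.group(2) else 0
            frames.append((match.group(1), offset))
    return frames


def shared_call_path(trace_a, trace_b):
    """Function names both traces share, walking inward from the outermost frame."""
    names_a = [name for name, _ in parse_frames(trace_a)][::-1]
    names_b = [name for name, _ in parse_frames(trace_b)][::-1]
    shared = []
    for a, b in zip(names_a, names_b):
        if a != b:
            break
        shared.append(a)
    return shared[::-1]  # back to innermost-first order


# Innermost frames copied from the two traces in this thread.
TRACE_244P3 = """\
pf_test() at pf_test+0x1d24/frame 0xfffffe00003000c0
pf_check_out() at pf_check_out+0x1d/frame 0xfffffe00003000e0
pfil_run_hooks() at pfil_run_hooks+0x90/frame 0xfffffe0000300170
ip_output() at ip_output+0xb1d/frame 0xfffffe00003002a0
ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfffffe00003002f0
esp_output_cb() at esp_output_cb+0xeb/frame 0xfffffe0000300350
aesni_process() at aesni_process+0x151/frame 0xfffffe0000300400
"""

TRACE_245 = """\
ip_output() at ip_output+0x12fb/frame 0xfffffe0000337280
ipsec_process_done() at ipsec_process_done+0x1c7/frame 0xfffffe00003372d0
esp_output_cb() at esp_output_cb+0xea/frame 0xfffffe0000337330
aesni_process() at aesni_process+0x151/frame 0xfffffe00003373e0
"""

if __name__ == "__main__":
    for name in shared_call_path(TRACE_244P3, TRACE_245):
        print(name)
```

Only the function names are compared, because the offsets within each function differ between the 2.4.4-p3 and 2.4.5 kernels and cannot be matched directly.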
            
              monotypeTattoo

              We turned off AES-NI, but that capped IPsec VPN throughput at 8 Mb/s, which seemed very drastic.
              We have turned AES-NI back on.

              We've got physically identical firewalls ready to go in at one of our other sites. Once we have those in situ, we will look at firmware upgrades and tweaking AES ciphers.

                monotypeTattoo

                Quick update: we experienced an identical crash on 2020-05-08.
                Will keep you apprised of developments as we hopefully roll out firmware upgrades in the coming weeks.

                  jimp Rebel Alliance Developer Netgate

                  Under VPN > IPsec, Advanced settings tab, do you have Asynchronous Cryptography checked? If so, try unchecking it.

                    monotypeTattoo

                    Hi Jimp,

                    We don't have the asynchronous cryptography option checked.

                    Thank you.

                      monotypeTattoo

                      Another crash late last night.
                      Different stack trace this time.

                      We haven't made any changes yet, as we have been waiting for the physically identical firewalls at our other site to get some bedding-in time.

                      Tracing pid 12 tid 100067 td 0xfffff8000435e000
                      kdb_enter() at kdb_enter+0x3b/frame 0xfffffe003e10e4a0
                      vpanic() at vpanic+0x19b/frame 0xfffffe003e10e500
                      panic() at panic+0x43/frame 0xfffffe003e10e560
                      trap_pfault() at trap_pfault/frame 0xfffffe003e10e5b0
                      trap_pfault() at trap_pfault+0x49/frame 0xfffffe003e10e610
                      trap() at trap+0x29d/frame 0xfffffe003e10e720
                      calltrap() at calltrap+0x8/frame 0xfffffe003e10e720
                      --- trap 0xc, rip = 0xffffffff80e8127a, rsp = 0xfffffe003e10e7f0, rbp = 0xfffffe003e10e870 ---
                      ip_input() at ip_input+0x5da/frame 0xfffffe003e10e870
                      swi_net() at swi_net+0x143/frame 0xfffffe003e10e8e0
                      intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe003e10e920
                      ithread_loop() at ithread_loop+0xe7/frame 0xfffffe003e10e970
                      fork_exit() at fork_exit+0x83/frame 0xfffffe003e10e9b0
                      fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003e10e9b0
                      
                        monotypeTattoo

                        I'm not sure whether this will help in shining any light on the issue, but I've collated the relevant contents of msgbuf.txt from each crash:

                        2020-05-26

                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 7; apic id = 07
                        
                        fault virtual address	= 0x3800
                        
                        Fatal trap 12: page fault while in kernel mode
                        fault code		= supervisor write data, page not present
                        cpuid = 3; apic id = 03
                        instruction pointer	= 0x20:0xffffffff80e89c3b
                        fault virtual address	= 0x1
                        stack pointer	        = 0x0:0xfffffe000037d160
                        
                        frame pointer	        = 0x0:0xfffffe000037d280
                        
                        Fatal trap 12: page fault while in kernel mode
                        
                        cpuid = 1; 
                        apic id = 01
                        Fatal trap 12: page fault while in kernel mode
                        fault virtual address	= 0x800
                        cpuid = 5; apic id = 05
                        fault virtual address	= 0x1
                        fault code		= supervisor read data, page not present
                        fault code		= supervisor write data, page not present
                        instruction pointer	= 0x20:0xffffffff80e8127a
                        stack pointer	        = 0x0:0xfffffe003e10e7f0
                        frame pointer	        = 0x0:0xfffffe003e10e870
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 12 (swi1: netisr 6)
                        trap number		= 12
                        panic: page fault
                        cpuid = 5
                        KDB: enter: panic
                        

                        2020-05-16

                        Fatal trap 12: page fault while in kernel mode
                        
                        
                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 2; apic id = 02
                        fault virtual address	= 0x1000
                        fault code		= supervisor write data, page not present
                        
                        
                        
                        Fatal trap 12: page fault while in kernel mode
                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 6; apic id = 06
                        fault virtual address	= 0x3000
                        fault code		= supervisor write data, page not present
                        
                        instruction pointer	= 0x20:0xffffffff80e89c3b
                        stack pointer	        = 0x28:0xfffffe0000373160
                        frame pointer	        = 0x28:0xfffffe0000373280
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 12 (irq288: igb2:que 6)
                        trap number		= 12
                        panic: page fault
                        cpuid = 6
                        KDB: enter: panic
                        

                        2020-05-08

                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 2; apic id = 02
                        fault virtual address	= 0x1000
                        
                        fault code		= supervisor write data, page not present
                        
                        instruction pointer	= 0x20:0xffffffff80e89c3b
                        Fatal trap 12: page fault while in kernel mode
                        stack pointer	        = 0x28:0xfffffe000034b160
                        frame pointer	        = 0x28:0xfffffe000034b280
                        
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        Fatal trap 12: page fault while in kernel mode
                        processor eflags	= interrupt enabled, cpuid = 6; apic id = 06
                        fault virtual address	= 0x3000
                        fault code		= supervisor write data, page not present
                        instruction pointer	= 0x20:0xffffffff80e89c3b
                        stack pointer	        = 0x28:0xfffffe0000373160
                        frame pointer	        = 0x28:0xfffffe0000373280
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= resume, IOPL = 0
                        current process		= 12 (irq284: igb2:que 2)
                        trap number		= 12
                        panic: page fault
                        cpuid = 2
                        KDB: enter: panic
                        

                        2020-05-06

                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 0; apic id = 00
                        fault virtual address	= 0x0
                        fault code		= supervisor write data, page not present
                        instruction pointer	= 0x20:0xffffffff80e89c3b
                        stack pointer	        = 0x28:0xfffffe0000337160
                        frame pointer	        = 0x28:0xfffffe0000337280
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 12 (irq282: igb2:que 0)
                        trap number		= 12
                        panic: page fault
                        cpuid = 0
                        KDB: enter: panic
                        

                        2020-04-14

                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 4; apic id = 04
                        fault virtual address	= 0x1
                        fault code		= supervisor read data, page not present
                        instruction pointer	= 0x20:0xffffffff80f447a4
                        stack pointer	        = 0x28:0xfffffe00002ffe90
                        frame pointer	        = 0x28:0xfffffe00003000c0
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 12 (irq286: igb2:que 4)
                        

                        2020-03-15

                        Fatal trap 12: page fault while in kernel mode
                        
                        
                        cpuid = 0; 
                        Fatal trap 12: page fault while in kernel mode
                        
                        Fatal trap 12: page fault while in kernel mode
                        cpuid = 7; apic id = 07
                        cpuid = 5; 
                        apic id = 05
                        
                        Fatal trap 12: page fault while in kernel mode
                        fault virtual address	= 0x1
                        cpuid = 1; fault code		= supervisor read data, page not present
                        apic id = 01
                        instruction pointer	= 0x20:0xffffffff80f447a4
                        stack pointer	        = 0x28:0xfffffe0000309e90
                        frame pointer	        = 0x28:0xfffffe000030a0c0
                        code segment		= base 0x0, limit 0xfffff, type 0x1b
                        			= DPL 0, pres 1, long 1, def32 0, gran 1
                        fault virtual address	= 0x1
                        fault code		= supervisor read data, page not present
                        instruction pointer	= 0x20:0xffffffff80f447a4
                        stack pointer	        = 0x28:0xfffffe000031de90
                        frame pointer	        = 0x28:0xfffffe000031e0c0
                        fault virtual address	= 0x1
                        processor eflags	= interrupt enabled, resume, IOPL = 0
                        current process		= 12 (irq287: igb2:que 5)
                        

                        We are going to do some benchmarking with OpenVPN today and see if we can use it instead of IPsec for a couple of weeks. We're also discussing swapping the master and standby firewalls around, in order to rule a hardware problem in or out.
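
Because several CPUs faulted at once, the msgbuf lines above are interleaved and hard to compare by eye. A small script can group the crashes by faulting instruction pointer instead. This is only a sketch: the function names and `KNOWN_SYMBOLS` are hypothetical helpers, the sample text is a heavily abbreviated excerpt of the collation above, and the two symbol mappings are read from the `--- trap 0xc, rip = ... ---` lines and the frame that follows them in the 2.4.5 backtraces posted earlier in this thread.

```python
import re
from collections import defaultdict

# "instruction pointer = 0x20:0xffffffff80e89c3b" -> capture the address part.
IP_RE = re.compile(r"instruction pointer\s*=\s*0x20:(0x[0-9a-f]+)")
# A line holding only a date such as "2020-05-26" starts a new crash section.
DATE_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2})\s*$")

# Address-to-symbol pairs taken from the trap lines in the 2.4.5 backtraces
# in this thread. Addresses from the 2.4.4-p3 kernel are left unresolved.
KNOWN_SYMBOLS = {
    "0xffffffff80e89c3b": "ip_output+0x12fb",
    "0xffffffff80e8127a": "ip_input+0x5da",
}


def crashes_by_instruction_pointer(collated_text):
    """Map each faulting instruction pointer to the sorted crash dates it appears under."""
    current_date = None
    seen = defaultdict(set)
    for line in collated_text.splitlines():
        date_match = DATE_RE.match(line)
        if date_match:
            current_date = date_match.group(1)
            continue
        ip_match = IP_RE.search(line)
        if ip_match and current_date:
            seen[ip_match.group(1)].add(current_date)
    return {ip: sorted(dates) for ip, dates in seen.items()}


def describe(ip):
    """Return the symbol for an address if the thread provides one."""
    return KNOWN_SYMBOLS.get(ip, "unresolved")


# Abbreviated excerpt of the collated msgbuf.txt contents.
SAMPLE = """\
2020-05-08

instruction pointer = 0x20:0xffffffff80e89c3b
instruction pointer = 0x20:0xffffffff80e89c3b

2020-05-06

instruction pointer = 0x20:0xffffffff80e89c3b

2020-04-14

instruction pointer = 0x20:0xffffffff80f447a4
"""

if __name__ == "__main__":
    for ip, dates in crashes_by_instruction_pointer(SAMPLE).items():
        print(ip, describe(ip), ", ".join(dates))
```

Addresses are only comparable within the same kernel build, so the crashes before the 2020-05-04 upgrade to 2.4.5 should be grouped separately from those after it.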

                          mazafak

                          We are seeing exactly the same problem: panics/reboots every 5-7 days.
                          Setup: 2.4.4-p3, CARP, VLANs, site-to-site IPsec (aesni in use). The hardware was fully replaced to eliminate hardware-related problems.
                          May 16

                          db:0:kdb.enter.default>  show pcpu
                          cpuid        = 3
                          dynamic pcpu = 0xfffffe044c606380
                          curthread    = 0xfffff8000c429000: pid 12 "irq351: ixl3:q3"
                          curpcb       = 0xfffffe0451afa400
                          fpcurthread  = none
                          idlethread   = 0xfffff8000835b620: tid 100006 "idle: cpu3"
                          curpmap      = 0xffffffff82b85998
                          tssp         = 0xffffffff82bb6948
                          commontssp   = 0xffffffff82bb6948
                          rsp0         = 0xfffffe0451afa400
                          gs32p        = 0xffffffff82bbd1a0
                          ldt          = 0xffffffff82bbd1e0
                          tss          = 0xffffffff82bbd1d0
                          db:0:kdb.enter.default>  bt
                          Tracing pid 12 tid 100284 td 0xfffff8000c429000
                          pf_test() at pf_test+0x1d24/frame 0xfffffe0451af9880
                          pf_check_out() at pf_check_out+0x1d/frame 0xfffffe0451af98a0
                          pfil_run_hooks() at pfil_run_hooks+0x90/frame 0xfffffe0451af9930
                          ip_output() at ip_output+0xb1d/frame 0xfffffe0451af9a60
                          ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfffffe0451af9ab0
                          esp_output_cb() at esp_output_cb+0xeb/frame 0xfffffe0451af9b10
                          aesni_process() at aesni_process+0x151/frame 0xfffffe0451af9bc0
                          crypto_dispatch() at crypto_dispatch+0x140/frame 0xfffffe0451af9c00
                          esp_output() at esp_output+0x5cc/frame 0xfffffe0451af9ca0
                          ipsec4_perform_request() at ipsec4_perform_request+0x37f/frame 0xfffffe0451af9d40
                          ipsec4_forward() at ipsec4_forward+0x5a/frame 0xfffffe0451af9d70
                          ip_forward() at ip_forward+0x221/frame 0xfffffe0451af9e10
                          ip_input() at ip_input+0x72a/frame 0xfffffe0451af9e70
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451af9ec0
                          ether_demux() at ether_demux+0x173/frame 0xfffffe0451af9ef0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451af9f50
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451af9fa0
                          ether_input() at ether_input+0x26/frame 0xfffffe0451af9fc0
                          vlan_input() at vlan_input+0x215/frame 0xfffffe0451afa070
                          ether_demux() at ether_demux+0x15c/frame 0xfffffe0451afa0a0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451afa100
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451afa150
                          ether_input() at ether_input+0x26/frame 0xfffffe0451afa170
                          ixl_rxeof() at ixl_rxeof+0x47b/frame 0xfffffe0451afa210
                          ixl_msix_que() at ixl_msix_que+0x42/frame 0xfffffe0451afa260
                          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0451afa2a0
                          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0451afa2f0
                          fork_exit() at fork_exit+0x83/frame 0xfffffe0451afa330
                          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0451afa330
                          

                          May 21

                          db:0:kdb.enter.default>  run lockinfo
                          db:1:lockinfo> show locks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show alllocks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show lockedvnods
                          Locked vnodes
                          db:0:kdb.enter.default>  show pcpu
                          cpuid        = 4
                          dynamic pcpu = 0xfffffe044c610380
                          curthread    = 0xfffff8000ccf6620: pid 12 "irq352: ixl3:q4"
                          curpcb       = 0xfffffe0451aff400
                          fpcurthread  = none
                          idlethread   = 0xfffff8000835b000: tid 100007 "idle: cpu4"
                          curpmap      = 0xffffffff82b85998
                          tssp         = 0xffffffff82bb69b0
                          commontssp   = 0xffffffff82bb69b0
                          rsp0         = 0xfffffe0451aff400
                          gs32p        = 0xffffffff82bbd208
                          ldt          = 0xffffffff82bbd248
                          tss          = 0xffffffff82bbd238
                          db:0:kdb.enter.default>  bt
                          Tracing pid 12 tid 100285 td 0xfffff8000ccf6620
                          ip_input() at ip_input+0x60e/frame 0xfffffe0451afee70
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451afeec0
                          ether_demux() at ether_demux+0x173/frame 0xfffffe0451afeef0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451afef50
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451afefa0
                          ether_input() at ether_input+0x26/frame 0xfffffe0451afefc0
                          vlan_input() at vlan_input+0x215/frame 0xfffffe0451aff070
                          ether_demux() at ether_demux+0x15c/frame 0xfffffe0451aff0a0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451aff100
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451aff150
                          ether_input() at ether_input+0x26/frame 0xfffffe0451aff170
                          ixl_rxeof() at ixl_rxeof+0x47b/frame 0xfffffe0451aff210
                          ixl_msix_que() at ixl_msix_que+0x42/frame 0xfffffe0451aff260
                          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0451aff2a0
                          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0451aff2f0
                          fork_exit() at fork_exit+0x83/frame 0xfffffe0451aff330
                          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0451aff330
                          

                          May 24

                          db:0:kdb.enter.default>  run lockinfo
                          db:1:lockinfo> show locks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show alllocks
                          No such command; use "help" to list available commands
                          db:1:lockinfo>  show lockedvnods
                          Locked vnodes
                          db:0:kdb.enter.default>  show pcpu
                          cpuid        = 0
                          dynamic pcpu = 0x898380
                          curthread    = 0xfffff8000c88f620: pid 12 "irq339: ixl2:q0"
                          curpcb       = 0xfffffe0451a16400
                          fpcurthread  = none
                          idlethread   = 0xfffff8000834c000: tid 100003 "idle: cpu0"
                          curpmap      = 0xffffffff82b85998
                          tssp         = 0xffffffff82bb6810
                          commontssp   = 0xffffffff82bb6810
                          rsp0         = 0xfffffe0451a16400
                          gs32p        = 0xffffffff82bbd068
                          ldt          = 0xffffffff82bbd0a8
                          tss          = 0xffffffff82bbd098
                          db:0:kdb.enter.default>  bt
                          Tracing pid 12 tid 100263 td 0xfffff8000c88f620
                          ip_output() at ip_output+0x1418/frame 0xfffffe0451a15a60
                          ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfffffe0451a15ab0
                          esp_output_cb() at esp_output_cb+0xeb/frame 0xfffffe0451a15b10
                          aesni_process() at aesni_process+0x151/frame 0xfffffe0451a15bc0
                          crypto_dispatch() at crypto_dispatch+0x140/frame 0xfffffe0451a15c00
                          esp_output() at esp_output+0x5cc/frame 0xfffffe0451a15ca0
                          ipsec4_perform_request() at ipsec4_perform_request+0x37f/frame 0xfffffe0451a15d40
                          ipsec4_forward() at ipsec4_forward+0x5a/frame 0xfffffe0451a15d70
                          ip_forward() at ip_forward+0x221/frame 0xfffffe0451a15e10
                          ip_input() at ip_input+0x72a/frame 0xfffffe0451a15e70
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451a15ec0
                          ether_demux() at ether_demux+0x173/frame 0xfffffe0451a15ef0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451a15f50
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451a15fa0
                          ether_input() at ether_input+0x26/frame 0xfffffe0451a15fc0
                          vlan_input() at vlan_input+0x215/frame 0xfffffe0451a16070
                          ether_demux() at ether_demux+0x15c/frame 0xfffffe0451a160a0
                          ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451a16100
                          netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451a16150
                          ether_input() at ether_input+0x26/frame 0xfffffe0451a16170
                          ixl_rxeof() at ixl_rxeof+0x47b/frame 0xfffffe0451a16210
                          ixl_msix_que() at ixl_msix_que+0x42/frame 0xfffffe0451a16260
                          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0451a162a0
                          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0451a162f0
                          fork_exit() at fork_exit+0x83/frame 0xfffffe0451a16330
                          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0451a16330
                          
                            monotypeTattoo

                            Hello mazafak,

                            What hardware are you using?
                            When you replaced the hardware, was it a like-for-like replacement, or did you change the specification in any way?

                            Thanks
                            M

                              monotypeTattoo

                              Probably also worth mentioning, we are using two Intel I350 Quad Port 1GbE network cards in each of our pfSense boxes.

                                mazafak

                                First it was a Dell R430 with an Intel X520 10GbE card (ix).
                                Now it is a Supermicro with an Intel X710 10GbE card (ixl).

                                The configuration was moved from the first host to the next; the crashes and reboots persisted.
                                Here is a new one from 5/28:

                                db:0:kdb.enter.default>  run lockinfo
                                db:1:lockinfo> show locks
                                No such command; use "help" to list available commands
                                db:1:lockinfo>  show alllocks
                                No such command; use "help" to list available commands
                                db:1:lockinfo>  show lockedvnods
                                Locked vnodes
                                db:0:kdb.enter.default>  show pcpu
                                cpuid        = 2
                                dynamic pcpu = 0xfffffe044c5fc380
                                curthread    = 0xfffff8000c67b000: pid 12 "irq350: ixl3:q2"
                                curpcb       = 0xfffffe0451af5400
                                fpcurthread  = none
                                idlethread   = 0xfffff8000834b000: tid 100005 "idle: cpu2"
                                curpmap      = 0xffffffff82b85998
                                tssp         = 0xffffffff82bb68e0
                                commontssp   = 0xffffffff82bb68e0
                                rsp0         = 0xfffffe0451af5400
                                gs32p        = 0xffffffff82bbd138
                                ldt          = 0xffffffff82bbd178
                                tss          = 0xffffffff82bbd168
                                db:0:kdb.enter.default>  bt
                                Tracing pid 12 tid 100283 td 0xfffff8000c67b000
                                ip_output() at ip_output+0x1418/frame 0xfffffe0451af4a60
                                ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfffffe0451af4ab0
                                esp_output_cb() at esp_output_cb+0xeb/frame 0xfffffe0451af4b10
                                aesni_process() at aesni_process+0x151/frame 0xfffffe0451af4bc0
                                crypto_dispatch() at crypto_dispatch+0x140/frame 0xfffffe0451af4c00
                                esp_output() at esp_output+0x5cc/frame 0xfffffe0451af4ca0
                                ipsec4_perform_request() at ipsec4_perform_request+0x37f/frame 0xfffffe0451af4d40
                                ipsec4_forward() at ipsec4_forward+0x5a/frame 0xfffffe0451af4d70
                                ip_forward() at ip_forward+0x221/frame 0xfffffe0451af4e10
                                ip_input() at ip_input+0x72a/frame 0xfffffe0451af4e70
                                netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451af4ec0
                                ether_demux() at ether_demux+0x173/frame 0xfffffe0451af4ef0
                                ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451af4f50
                                netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451af4fa0
                                ether_input() at ether_input+0x26/frame 0xfffffe0451af4fc0
                                vlan_input() at vlan_input+0x215/frame 0xfffffe0451af5070
                                ether_demux() at ether_demux+0x15c/frame 0xfffffe0451af50a0
                                ether_nh_input() at ether_nh_input+0x32b/frame 0xfffffe0451af5100
                                netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfffffe0451af5150
                                ether_input() at ether_input+0x26/frame 0xfffffe0451af5170
                                ixl_rxeof() at ixl_rxeof+0x47b/frame 0xfffffe0451af5210
                                ixl_msix_que() at ixl_msix_que+0x42/frame 0xfffffe0451af5260
                                intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0451af52a0
                                ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0451af52f0
                                fork_exit() at fork_exit+0x83/frame 0xfffffe0451af5330
                                fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0451af5330
                                
                                
                                • M
                                  monotypeTattoo
                                  last edited by

                                  Hi Mazafak,

                                  Any chance that you could let us know what model CPUs/chipsets you have reproduced the issue with?
                                  Also the ciphers/hash algorithms used?

                                  Thank you.

                                  I've made a mistake in the initial post. We are using AES256-GCM, not AES256-CBC.

                                  • M
                                    mazafak
                                    last edited by

                                    I took the IPsec settings from:
                                    https://docs.netgate.com/pfsense/en/latest/vpn/scaling.html (scroll down to "Optimal Encryption Settings")

                                    The Supermicro motherboard is an X11SDV-8C-TP8F; that's the one crashing in the same way as the Dell R430.

                                    What is interesting is that on the other side of the IPsec tunnel we have another pfSense box running on a Super Micro XG-1537. It doesn't have a VLAN/LACP setup, and it has been rock solid for over a year.

                                    So it could be the VLAN code that's crashing for me. I will try upgrading to 2.4.5-p1 once it is available to see whether it is any better.
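
To check whether the VLAN path really is common to the crashes, the function names can be pulled out of each ddb backtrace and compared. This is only a minimal sketch, not a diagnostic tool: `frame_functions` and `shared_functions` are hypothetical helper names, and the regular expression assumes nothing beyond the `name() at name+0xoff/frame 0xaddr` layout seen in the traces posted in this thread.

```python
import re

# Matches ddb "bt" lines such as:
#   vlan_input() at vlan_input+0x215/frame 0xfffffe0451af5070
# The "+0x..." offset is optional because some frames omit it.
FRAME_RE = re.compile(
    r"^\s*(\w+)\(\) at \w+(?:\+0x[0-9a-f]+)?/frame 0x[0-9a-f]+"
)


def frame_functions(trace_text):
    """Return the function names in a ddb backtrace, innermost frame first."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:  # skips prompts, "--- trap ..." markers, blank lines
            names.append(match.group(1))
    return names


def shared_functions(trace_a, trace_b):
    """Return functions from trace_a, in order, that also occur in trace_b."""
    in_b = set(frame_functions(trace_b))
    seen = set()
    shared = []
    for name in frame_functions(trace_a):
        if name in in_b and name not in seen:
            seen.add(name)
            shared.append(name)
    return shared
```

With a trace pasted into a string, `"vlan_input" in frame_functions(trace)` shows whether that particular crash passed through the VLAN code at all, and `shared_functions(trace_a, trace_b)` lists the frames two crashes have in common.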

                                    • M
                                      monotypeTattoo
                                      last edited by

                                      Thank you Mazafak.

                                      We had another crash last night:

                                      Tracing pid 12 tid 100065 td 0xfffff80004359000
                                      kdb_enter() at kdb_enter+0x3b/frame 0xfffffe003e1044a0
                                      vpanic() at vpanic+0x19b/frame 0xfffffe003e104500
                                      panic() at panic+0x43/frame 0xfffffe003e104560
                                      trap_pfault() at trap_pfault/frame 0xfffffe003e1045b0
                                      trap_pfault() at trap_pfault+0x49/frame 0xfffffe003e104610
                                      trap() at trap+0x29d/frame 0xfffffe003e104720
                                      calltrap() at calltrap+0x8/frame 0xfffffe003e104720
                                      --- trap 0xc, rip = 0xffffffff80e8127a, rsp = 0xfffffe003e1047f0, rbp = 0xfffffe003e104870 ---
                                      ip_input() at ip_input+0x5da/frame 0xfffffe003e104870
                                      swi_net() at swi_net+0x143/frame 0xfffffe003e1048e0
                                      intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe003e104920
                                      ithread_loop() at ithread_loop+0xe7/frame 0xfffffe003e104970
                                      fork_exit() at fork_exit+0x83/frame 0xfffffe003e1049b0
                                      fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003e1049b0
                                      
                                      • A
                                        astabing
                                        last edited by

                                        We had very similar crashes on different pfSense 2.4.4 machines, along with "kernel: [zone: pf frag entries] PF frag entries limit reached" messages in the logs.

                                        The crashes stopped after we enabled "Enable MSS clamping on VPN traffic" in the IPsec advanced settings. Maybe that's your case too?
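
For context on why clamping can matter: MSS clamping keeps TCP segments small enough that the ESP-encapsulated packet still fits in the path MTU, so it does not need to be fragmented. Below is a rough sketch of that arithmetic. Every byte count in it is an assumption (IPv4 without options, ESP in tunnel mode, AES-GCM with an 8-byte IV and a 16-byte ICV, 4-byte padding alignment), and `max_inner_mss` is a hypothetical helper, not a pfSense setting.

```python
def max_inner_mss(path_mtu=1500, nat_t=False):
    """Estimate the largest TCP MSS that avoids fragmenting ESP packets.

    All sizes below are assumptions for IPv4 + ESP tunnel mode + AES-GCM.
    """
    outer_ip = 20      # outer IPv4 header, no options
    esp_header = 8     # SPI (4 bytes) + sequence number (4 bytes)
    iv = 8             # AES-GCM explicit IV
    icv = 16           # AES-GCM integrity check value
    udp_encap = 8 if nat_t else 0  # UDP header when NAT-T is in use
    esp_trailer = 2    # pad length byte + next header byte
    align = 4          # encrypted payload is padded to a 4-byte boundary
    inner_ip = 20      # inner IPv4 header, no options
    tcp = 20           # TCP header, no options

    overhead = outer_ip + udp_encap + esp_header + iv + icv
    # Room left for the inner packet, its padding, and the ESP trailer.
    room = path_mtu - overhead
    room -= room % align  # round down to the alignment boundary
    inner_packet = room - esp_trailer
    return inner_packet - inner_ip - tcp


print(max_inner_mss())            # plain ESP
print(max_inner_mss(nat_t=True))  # ESP encapsulated in UDP (NAT-T)
```

Under these assumptions a 1500-byte path MTU leaves an MSS a little above 1400 without NAT-T and a little below it with NAT-T; TCP options or a smaller path MTU reduce it further.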

                                        • M
                                          monotypeTattoo @astabing
                                          last edited by monotypeTattoo

                                          @astabing said in pfSense Active CARP Member Crashed: aesni_process -> crypto_dispatch ...:

                                          We had very similar crashes on different pfSense 2.4.4 machines, along with "kernel: [zone: pf frag entries] PF frag entries limit reached" messages in the logs.

                                          The crashes stopped after we enabled "Enable MSS clamping on VPN traffic" in the IPsec advanced settings. Maybe that's your case too?

                                          We've never seen "kernel: [zone: pf frag entries] PF frag entries limit reached" in our logs. We enabled MSS clamping last week to resolve an IPsec throughput issue and had another crash within 48 hours of making that change, so it didn't resolve the crashes.

                                          We've raised a FreeBSD Bugzilla report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951.

                                          Thanks

                                          • M
                                            monotypeTattoo
                                            last edited by

                                            Looks like this may be fixed:

                                            https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951#c16

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.