pfSense 2.4.1 - IKEv2 IPsec tunnel under load crashes whole firewall VM



  • Hello pfSense team,

    As it is preferred to open a forum topic before filing a bug, I am doing so. After upgrading from the latest 2.3 release to 2.4.1, my pfSense XenServer VM keeps crashing whenever I transfer a large amount of data through the IPsec tunnel. I would like to collect some crash data, but the VM does not seem to be able to create a crash file at all. Any hint on where to look for one is welcome.

    The tunnel is established between two pfSense VMs: one runs on ESXi 5.5 on a CPU without AES-NI; the other (the one that crashes) runs on XenServer 7.0 on a CPU with AES-NI. I can provide full configuration details. For now I have worked around the issue by using an OpenVPN tunnel instead.

    Logs from the incident show a loss of connectivity and a simultaneous reboot:

    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense syslogd: kernel boot file is /boot/kernel/kernel
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: Copyright (c) 1992-2017 The FreeBSD Project.
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: The Regents of the University of California. All rights reserved.
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: FreeBSD 11.1-RELEASE-p2 #6 r313908+7eae9364d25(RELENG_2_4): Sun Oct 22 17:32:35 CDT 2017
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: root@buildbot2.netgate.com:/builder/ce-241/tmp/obj/builder/ce-241/tmp/FreeBSD-src/sys/pfSense amd64
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
    Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
    Oct 31 10:46:19 pfSense kernel: VT(vga): text 80x25
    etc…

    Thanks,
    GyroK
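
In the excerpt above, the line "kernel boot file is /boot/kernel/kernel" marks the point where the system came back up. A minimal sketch for finding such unexpected boots in a saved copy of the system log could look like this. The function name `find_boot_events` and the timestamp pattern are illustrative assumptions, not part of pfSense.

```python
import re

# Marker that syslogd writes at boot; taken from the log excerpt in the post.
BOOT_MARKER = "kernel boot file is"

# Classic BSD syslog timestamp, e.g. "Oct 31 10:46:19" (assumed log format).
TIMESTAMP = re.compile(r"^\s*([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s")


def find_boot_events(log_text):
    """Return the timestamp of every log line that carries the boot marker."""
    events = []
    for line in log_text.splitlines():
        if BOOT_MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            events.append(match.group(1))
    return events


sample = """\
Oct 31 10:46:19 pfSense syslogd: sendto: Network is unreachable
Oct 31 10:46:19 pfSense syslogd: kernel boot file is /boot/kernel/kernel
Oct 31 10:46:19 pfSense kernel: VT(vga): text 80x25
"""

print(find_boot_events(sample))
```

Comparing the reported timestamps with the times of large transfers through the tunnel is one way to check whether the reboots correlate with load.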



  • I have the exact same issue. I think it also happened on 2.4.0.
    I am not using VMs; it happens on bare metal, on a Supermicro A1SRi-2558F.

    I can reproduce the problem by just copying some files through the IPsec tunnel.
    Luckily I do have crash dumps; three are attached to this post. And of course, they have also been sent via the automatic crash dump thingy.

    crashdumps.zip
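
On the earlier question of where crash files end up: FreeBSD's savecore writes dumps to /var/crash by default, and the sketch below assumes pfSense keeps that default. It only lists whatever regular files are present; the helper name `list_crash_files` is hypothetical, and an empty result may simply mean that no dump device was configured.

```python
import os


def list_crash_files(directory="/var/crash"):
    """Return (name, size_in_bytes) for each regular file in the dump directory."""
    try:
        names = sorted(os.listdir(directory))
    except OSError:
        # Directory missing or unreadable: report nothing rather than fail.
        return []
    entries = []
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


for name, size in list_crash_files():
    print(f"{name}\t{size} bytes")
```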



  • I disabled the AES-NI CPU-based crypto acceleration and rebooted. So far this seems to work.



  • I have the same issue on an SG-2440 unit.
    As soon as gigabytes of data are flowing through the IPsec tunnel, the unit crashes within a few minutes.
    On the SG-2440, too, disabling AES-NI (System/Advanced/Misc) seems to prevent the crashes.
    This behavior was introduced in version 2.4.0; release 2.3.4-p1 worked fine.



  • Hello pfSense team,

    I did some research and found the following bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219356

    The behavior described there matches, so it is the same bug: after changing the encryption from AES-GCM to AES, the tunnel is as stable as it was in pfSense 2.3.

    Looks like a regression …

    Regards,

    GyroK



  • Who should be able to fix this bug?
    Is it the pfsense team, or should this be fixed by the FreeBSD developers?



  • @RMB:

    Is it the pfsense team, or should this be fixed by the FreeBSD developers?

    It has been fixed, in FreeBSD 11-STABLE, so this particular fix might get imported into pfSense. Don’t know.
    Doesn’t seem to be much activity, so I’ll dump this in the pfSense bug tracker, because, well, I think we can safely say it’s a bug.



  • Great, thanks!



  • Any news on this bug?
    The problem is still there in version 2.4.2.
    I have to disable AES-NI to prevent a kernel panic during load through an IPSec tunnel.



  • Nope, but feel free to comment on the redmine bug repo:
    https://redmine.pfsense.org/issues/8070

    Or find someone with a support contract that can complain.  ::)



  • Having what I believe is this issue since moving to 2.4.x
    Here is a picture of the console with the Kernel crash.
    No log available.
    Reverted back to version 2.3.x and the problem has not occurred as of yet.




  • One clarification on my setup: I am using a Supermicro motherboard with pfSense installed directly to the hard drive. No VM software involved.



  • Anyone know if this release has a fix for this issue?

    2.4.3-DEVELOPMENT (amd64)
    built on Tue Mar 13 10:14:21 CDT 2018
    FreeBSD 11.1-RELEASE-p7

    I see this is a patched version of FreeBSD, and there was a reference to ipsec fixes in the release notes, but it wasn’t clear if this fixed this same issue.



  • @Tacoma:

    Anyone know if this release has a fix for this issue?

    2.4.3-DEVELOPMENT (amd64)
    built on Tue Mar 13 10:14:21 CDT 2018
    FreeBSD 11.1-RELEASE-p7

    I see this is a patched version of FreeBSD, and there was a reference to ipsec fixes in the release notes, but it wasn’t clear if this fixed this same issue.

    Unfortunately, this bug is still present in the following software version:

    2.4.3-RELEASE (amd64)
    built on Mon Mar 26 18:02:04 CDT 2018
    FreeBSD 11.1-RELEASE-p7

    GCM mode cannot be used on machines with AES-NI.

    Regards,
    GyroK


  • Administrator

    To claim it’s unusable in general is untrue. The crash must be specific to a certain combination of hardware, traffic load, and/or pattern of traffic.

    Loads of people are using AES-NI and AES-GCM without crashing, including just about every Netgate employee from our home firewalls.



  • Can confirm this is occurring for me on two different systems.
    Both are running on ESXi 6.5, one on DL380 G8, the other on DL380 G9.
    NIC type is vmxnet3, open-vm-tools installed on both.
    Phase 1: AES128-GCM / 128 / SHA1 / DH2
    Phase 2: AES128-GCM / AES-XCBC / no PFS

    Hard crash with a reboot within 5 minutes of starting a continuous iperf run, sometimes on one side, sometimes on both.

    Switching to any non-AES-NI algorithms kills throughput, but doesn’t hard crash.

    My AES-related dmesg output:

    ```
    dmesg | grep -i aes

    Features2=0xffba2203<sse3,pclmulqdq,ssse3,cx16,pcid,sse4.1,sse4.2,x2apic,popcnt,tscdlt,aesni,xsave,osxsave,avx,f16c,rdrand,hv>
    aesni0: <aes-cbc,aes-xts,aes-gcm,aes-icm> on motherboard
    ```
    
    I'll do some more testing this weekend when there's not as much production traffic flowing but for right now I'm knocked back down to plain AES.
    
    It does indeed make pfSense unusable for installations requiring decent IPSec interconnect speeds. Considering this issue I'll likely move to VyOS for my concentrators.
    
    Has anyone tried the patch from the FreeBSD bug report linked earlier in this thread?
    
    Edit: both systems are running 2.4.3-RELEASE.
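
The dmesg output quoted in the post above carries two separate facts: the CPU advertises the AES-NI flag, and the aesni driver has attached and lists AES-GCM among its algorithms. A small sketch for extracting both from saved dmesg text follows. The function names are illustrative, and the parsing assumes the line layout shown in the post.

```python
import re

# Matches the driver attach line, e.g. "aesni0: <aes-cbc,aes-gcm> on motherboard".
AESNI_LINE = re.compile(r"aesni\d+:\s*<([^>]*)>", re.IGNORECASE)


def aesni_algorithms(dmesg_text):
    """Return the lowercased algorithm names advertised by the aesni driver."""
    match = AESNI_LINE.search(dmesg_text)
    if not match:
        return set()
    return {name.strip().lower() for name in match.group(1).split(",") if name.strip()}


def cpu_reports_aesni(dmesg_text):
    """Return True if a CPU Features2 line lists the aesni flag."""
    for line in dmesg_text.splitlines():
        if "Features2=" not in line:
            continue
        flags = re.search(r"<([^>]*)>", line)
        if flags and "aesni" in [f.strip().lower() for f in flags.group(1).split(",")]:
            return True
    return False


sample = """\
Features2=0xffba2203<sse3,pclmulqdq,ssse3,cx16,pcid,sse4.1,sse4.2,x2apic,popcnt,tscdlt,aesni,xsave,osxsave,avx,f16c,rdrand,hv>
aesni0: <aes-cbc,aes-xts,aes-gcm,aes-icm> on motherboard
"""

print(cpu_reports_aesni(sample), "aes-gcm" in aesni_algorithms(sample))
```

A system where both checks are true and the tunnel uses AES-GCM matches the configuration reported as crashing in this thread; that alone does not prove the system is affected.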

 
