IPSec Issues 2.2.3 and 2.2.4



  • Hi folks,

    I bought one of the pfSense branded firewalls, the SG-4860. I'm starting to regret it. I have used old computers with Intel NICs up until this one and never had these kinds of problems.

    It arrived with 2.2.2. I set up an IPSec tunnel to my Palo Alto firewall, and it just worked. I then bridged the remaining ports to make this behave like a home router (single WAN, multiple LAN ports). I did some throughput testing and easily saturated the 50 down / 3 up link I had.

    Then came the update to 2.2.3. I was running AES-NI and, like everyone else, got no traffic over the tunnel. This was very frustrating because I had sent this home to another city with my CEO. I looked like a fool. He brought it back, I put it on the bench, and I couldn't find anything wrong. I started looking around the forums and didn't find anything until last night: it looks like 2.2.3 broke IPSec when running AES-256, which is what I am running.

    So I applied the 2.2.4 patch this evening. Traffic immediately came up over the tunnel, but I see strange throughput issues. Downloading a large ISO over the tunnel on the 50/3 pipe, it starts around 47-50 Mbit/sec, bounces around for roughly the first 400MB of the ISO, then drops to a solid, unwavering 7.6Mbit/sec until that transfer completes. If I start another one, it repeats the same behavior.

    MBUF looks good (I'm set to 1,000,000), and nothing else appears to be grossly wrong. The tunnel is pretty standard: AES-256, SHA1, DH5, and the P2s are the same.

    I plug in my old core 2 box, also running 2.2.4, no issues at all, exactly the same tunnel.

    I'm considering rolling the firewall back to 2.2.2, but there have been a couple of important security updates.

    So are these SG-4860s just lemons? Or, having used old computers as firewalls for years, am I just spoiled by "plain reliable hardware"?

    TIA



  • Certainly not lemons, they're your best bet for ensuring hardware-specific things will "just work" on every version. If you had good experiences using old recycled desktops, you're lucky. Generally that's where people have issues because of flaky old hardware.

    The AES-NI regression in 2.2.3 was an unfortunate oversight, where all our AES-CBC test setups didn't have AES-NI, and all our production and test systems with AES-NI were only using AES-GCM. Our test procedures have been updated to ensure coverage of all the possible combinations there so such things won't recur, and it's of course fixed in 2.2.4.

    There aren't any general performance issues with IPsec or AES-NI either. I have 100 Mb of bandwidth from home to office over Internet, with a 4860 at home, and can max that out consistently no issue.

    The symptoms you describe sound like the situation where MSS clamping will help. Try enabling that on the IPsec advanced tab at 1400 and see what the behavior is like.
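For context on why a clamp near 1400 is a common suggestion, here is a minimal sketch of the ESP tunnel-mode overhead arithmetic for the tunnel described above (AES-CBC with HMAC-SHA1). The function name and parameters are hypothetical, and the sketch assumes a 1500-byte path MTU, IPv4 inside and outside, no IP or TCP options, and the 96-bit truncated HMAC-SHA1 ICV. It does not model how pfSense applies the value entered in the GUI.

```python
def max_tcp_mss(link_mtu=1500, block=16, iv=16, icv=12, nat_t=False):
    """Largest TCP payload that fits in one ESP tunnel-mode packet.

    Hypothetical helper for illustration only. Defaults model AES-CBC
    (16-byte block and IV) with HMAC-SHA1-96 (12-byte ICV).
    """
    outer_ip = 20                    # outer IPv4 header, no options
    udp_encap = 8 if nat_t else 0    # UDP header when NAT-T is in use
    esp_header = 8                   # SPI (4) + sequence number (4)
    esp_trailer = 2                  # pad length + next header bytes
    inner_headers = 40               # inner IPv4 (20) + TCP (20), no options

    # Bytes left for the encrypted portion after fixed overhead.
    room = link_mtu - outer_ip - udp_encap - esp_header - iv - icv
    # The encrypted portion must be a whole number of cipher blocks.
    ciphertext = (room // block) * block
    # The inner packet shares the ciphertext with the 2-byte ESP trailer.
    inner_packet = ciphertext - esp_trailer
    return inner_packet - inner_headers


if __name__ == "__main__":
    print("AES-CBC + SHA1:          ", max_tcp_mss())
    print("AES-CBC + SHA1 with NAT-T:", max_tcp_mss(nat_t=True))
    print("3DES-CBC + SHA1:         ", max_tcp_mss(block=8, iv=8))
```

Under these assumptions the largest unfragmented TCP segment is a little under 1400 bytes for AES-CBC, and lower again when NAT-T is in use, so segments sized for a plain 1500-byte MTU would not fit in a single ESP packet.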



  • Just enabled from work, will repeat my testing from home tonight and report back.



  • Seeing the same thing, works fine for a bit, then flatlines just over 7.6Mbps. Rebooted it twice just to make sure.



  • Are you able to switch it to AES-GCM to see if that makes any difference? Depends on whether the other side supports it.

    If not, I don't expect it'll make any difference, but try disabling AES-NI and see if that changes the behavior any.



    No, only an aes128-CCM16 (nothing GCM). Otherwise just AES variants, 3DES. Is GCM all that and a bag of chips? I'm not familiar with it.

    No difference with AES-NI disabled, if anything, a bit slower but same behavior (that was the first thing I tested on the new 2.2.4).



  • Those things were to try to narrow down the problem. That pretty much eliminates all those possibilities.

    GCM is significantly faster with AES-NI than CBC. It's always preferable with AES-NI where available for that reason. It isn't an answer to a problem like that, just another thing that might possibly help narrow down the cause. Switching to 3DES helps similarly in narrowing down the cause. With 3DES, did it behave more or less the same? 3DES is much slower, though you probably have enough headroom on that CPU that it'll max out your connection fine.

    Under System>Advanced, Networking, do you still have TSO and LRO disabled? Both are disabled by default. Enabling them can cause similar sounding issues.

    Does your other hardware definitely not have issues still?



  • OK, I'll try 3DES tonight and report back.

    Yes, TSO/LRO still disabled (on all three tunnels). Besides increasing MBUF buffers, doing the bridge, the unit is pretty much stock.

    And yes, I have two other tunnels on generic hardware. One on a dell desktop, Core i5, 4gb ram, small SSD, running AES-NI with an Intel Quad port PCIe nic, just using two nics, still running 2.2.2 that I don't want to upgrade because it's a 2 hour flight away. My other one is what I use at home, it's a circa 2009 Core 2 duo, 2GB ram, old 80gb rotational disk using the onboard nic for lan (I think it's a lower end intel), and a single port PCI Intel GT nic. This one running 2.2.4. It does not have AES-NI support.

    Both running full speed of their local connection capabilities (50/50, and 50/3 respectively). I can move around large files without issue/slowdown/etc. Neither have the MSS clamping set.

    I did testing on the 4860 before it left me with the 2.2.2 firmware, and it had no issues. Is it reasonably easy to roll back to 2.2.2? That would answer if there is a hardware problem or something with the current point version of pfSense.



  • TSO and LRO are global to the system, not a per-tunnel config.

    I thought from your earlier description you had an old system of some sort that you were swapping between at the same location and same config as the 4860, is that not the case? The ones running at other locations wouldn't be relevant.

    In that case it's safe to downgrade to 2.2.2 for testing that circumstance. You can use the manual update under System>Firmware with:
    https://files.pfsense.org/mirror/updates/old/pfSense-Full-Update-2.2.2-RELEASE-amd64.tgz

    It will complain that your config revision is newer, but that's OK in this specific case.



  • That's correct, the Core 2 system is at my house, so is the 4860. I've been swapping back and forth between those two systems just to validate that there is nothing upstream or network related.

    The third system is in the other city.

    I'll try 3DES tonight, failing that, I'll put the 2.2.2 on.



  • Sorry, didn't get to testing last night, will try tonight.



  • Tested 3DES with and without MSS clamping, even worse throughput than AES, about 4.7Mbit/sec.

    Tried AES-256, get about 9.6Mbit/sec.

    Rolling back to 2.2.2 right now.



  • So yeah, there is something seriously wrong with builds 2.2.3 and 2.2.4 with respect to IPsec.

    Rolled back to 2.2.2, and my throughput goes back to maxing out the circuit (50 down/3up). I've attached a screenshot proving this.

    Given that these SG boxes sold on the pfSense store are supposed to "just work" on every version of pfSense, what options do I have here? Do I submit a ticket of some sort? It's easy to reproduce.

    With regards,



  • Netgate

    @rain:

    No, only an aes128-CCM16 (nothing GCM). Otherwise just AES variants, 3DES. Is GCM all that and a bag of chips? I'm not familiar with it.

    No difference with AES-NI disabled, if anything, a bit slower but same behavior (that was the first thing I tested on the new 2.2.4).

    AES-CCM isn't a great mode for IPSec.  In fact, the only support I can find in the FreeBSD kernel for it is in the wireless code, so I'm confused how you've configured to use it.  (AES-CCM gets used a lot in 802.11.)

    If you don't want to use AES-GCM, have you tried AES-CBC-128 with HMAC-SHA1, because that's the bog-standard "best practice" until you get concerned with the strength of SHA1 and a 128-bit key length.

    In fact, I can't find any support for using AES-CCM in the IPSec subsystem in FreeBSD.  Here are the auth and encryption tokens that 'setkey' will recognize.  These are copy-pasta straight from the source code.

    /* authentication alogorithm */
    hmac-md5
    hmac-sha1
    keyed-md5
    keyed-sha1
    hmac-sha2-256
    hmac-sha2-384
    hmac-sha2-512
    hmac-ripemd160
    aes-xcbc-mac
    tcp-md5
    null

    /* encryption alogorithm */
    des-cbc
    3des-cbc
    null
    simple
    blowfish-cbc
    cast128-cbc
    des-deriv
    des-32iv
    rijndael-cbc
    aes-ctr
    camellia-cbc

    Nor can I find any support in the GUI for AES-CCM.

    BTW, the only modes registered with the AES-NI module are:
    AES-CBC
    AES-ICM
    AES-GCM
    AES-GHASH (128, 192, 256 bit)
    AES-XTS

    That said, AES-NI isn't going to help much for modes with a separate HMAC (basically all but AES-GCM) because the pass over the packet with the HMAC will dominate the time to encode/decode the packet before transmit/reception.

    This is why AES-GCM is a 'win' with AES-NI.

    I have ZERO doubt that 3DES is slower than AES.

    Please send the output of "ipsec statusall".  I don't suggest posting it here in the forum.  Since you purchased these from the pfSense store, you have support.  Open a ticket.  If it's a bug that we've somehow missed, then I'll ensure that you don't "use" that ticket.


  • Netgate

    SG-2220 (yes, they do exist, C2358 2 cores @ 1.7GHz) at home.
    C2758 (8 cores @ 2.4GHz) as VPN gateway at work.
    Both running pfSense software version 2.2.4

    1Gbps link from home, 1Gbps link at work, what happens between those two is good, but not ideal.

    Jims-MacBook-Pro:~ jim$ ping -c 3 nfs4
    PING nfs4.pfmechanics.com (172.27.32.4): 56 data bytes
    64 bytes from 172.27.32.4: icmp_seq=0 ttl=61 time=4.352 ms
    64 bytes from 172.27.32.4: icmp_seq=1 ttl=61 time=4.434 ms
    64 bytes from 172.27.32.4: icmp_seq=2 ttl=61 time=4.860 ms

    --- nfs4.pfmechanics.com ping statistics ---
    3 packets transmitted, 3 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 4.352/4.549/4.860/0.223 ms
    Jims-MacBook-Pro:~ jim$ ssh nfs4
    Last login: Sat Aug  1 15:48:30 2015 from 172.21.0.26
    FreeBSD 10.1-RELEASE-p5 (GENERIC) #0: Tue Jan 27 08:55:07 UTC 2015

    [jim@nfs4 ~]$ rm testfile
    [jim@nfs4 ~]$ dd if=/dev/random of=testfile bs=1k count=200k
    204800+0 records in
    204800+0 records out
    209715200 bytes transferred in 6.281192 secs (33387802 bytes/sec)
    [jim@nfs4 ~]$ ls -l testfile
    -rw-r--r--  1 jim  netgate  209715200 Aug  1 15:49 testfile
    [jim@nfs4 ~]$ exit
    logout
    Connection to nfs4 closed.
    Jims-MacBook-Pro:~ jim$ scp nfs4:testfile /tmp/testfile
    testfile                                          100%  200MB  22.2MB/s  00:09   
    Jims-MacBook-Pro:~ jim$ tcsh
    [Jims-MacBook-Pro:~] jim% repeat 10 sftp nfs4:testfile /dev/null
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  16.7MB/s  00:12   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  16.7MB/s  00:12   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  18.2MB/s  00:11   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  28.6MB/s  00:07   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    Connected to nfs4.
    Fetching /usr/home/jim/testfile to /dev/null
    /usr/home/jim/testfile                            100%  200MB  25.0MB/s  00:08   
    [Jims-MacBook-Pro:~] jim%
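To compare the transfer rates in the transcript above with link speeds quoted in Mbit/sec elsewhere in this thread, here is a small conversion sketch. The helper name is hypothetical, and it assumes the scp/sftp progress meter reports binary units (MiB/s). That assumption is consistent with the 209715200-byte file being shown as 200MB, but it is still an assumption.

```python
MIB = 1024 * 1024  # bytes per MiB (assumed unit behind scp's "MB/s")


def mib_per_s_to_mbit_per_s(rate_mib_s):
    """Convert a rate in MiB/s to decimal megabits per second."""
    return rate_mib_s * MIB * 8 / 1_000_000


if __name__ == "__main__":
    # Rates copied from the scp/sftp lines in the transcript above.
    for rate in (22.2, 25.0, 16.7, 18.2, 28.6):
        print(f"{rate:5.1f} MB/s is about {mib_per_s_to_mbit_per_s(rate):6.1f} Mbit/s")
```

On that reading, the 25.0MB/s runs work out to a little over 200 Mbit/sec across the VPN.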

    ![Screen Shot 2015-08-01 at 4.06.37 PM.png](/public/imported_attachments/1/Screen Shot 2015-08-01 at 4.06.37 PM.png)


  • Netgate

    And a little longer test
    [jim@nfs4 ~]$ dd if=/dev/random of=testfile bs=1k count=2000k
    2048000+0 records in
    2048000+0 records out
    2097152000 bytes transferred in 64.266291 secs (32632224 bytes/sec)
    [jim@nfs4 ~]$ ls -l testfile
    -rw-r--r--  1 jim  netgate  2097152000 Aug  1 16:10 testfile
    [jim@nfs4 ~]$ exit

    ![Screen Shot 2015-08-01 at 4.21.22 PM.png](/public/imported_attachments/1/Screen Shot 2015-08-01 at 4.21.22 PM.png)


  • Netgate

    and, now that I've recovered the nuc from last night's 1hr+ power hit at work…

    Note that running across a LAN is faster, but no VPN.
    jim@nucatwork:~ % sudo scp jim@nfs4:testfile /usr/local/www/apache24/data/
    Password for jim@nfs4:
    testfile                                          100% 2000MB  87.0MB/s  00:23   
    jim@nucatwork:~ %

    ![Screen Shot 2015-08-01 at 5.10.57 PM.png](/public/imported_attachments/1/Screen Shot 2015-08-01 at 5.10.57 PM.png)



  • @jwt:

    AES-CCM isn't a great mode for IPSec.  In fact, the only support I can find in the FreeBSD kernel for it is in the wireless code, so I'm confused how you've configured to use it.  (AES-CCM gets used a lot in 802.11.)

    I didn't, likely my post wasn't clear. Those are options on the other end of the tunnel, a Palo Alto Networks 3000 series. The response was to let CMB know what other options I have available to try. Although I appreciate your long response!

    I sent an email to the support from the store asking how to use that support ticket, but I have not yet heard back. That was last week Monday when I sent it. I'd love to report back that I got great support on this hardware.



  • I don't doubt you are getting that on different hardware. I'm getting a lot better on some of my old home built hardware from the scrap heap. But not an apples to apples comparison.

    Once I rolled back to 2.2.2 I'm getting reasonable performance from the tunnel. With nothing else changing except moving to 2.2.3 or 2.2.4 the tunnel fails to pass traffic and passes it terribly slow respectively. Not exactly "just works".


  • Netgate

    @rain:

    I don't doubt you are getting that on different hardware. I'm getting a lot better on some of my old home built hardware from the scrap heap. But not an apples to apples comparison.

    It's pretty close, actually.  I'm quite familiar with the SG-4860.  If anything, the 2220 is slower, and that was the point.  It's really straight-forward to get > 200Mbps using AES-GCM with AES-NI.

    If I'd wanted to quote lab performance, I've seen > 1.5Gbps using fairly modern Xeons.  But the SG-2220 is slower than what you're using.

    @rain:

    Once I rolled back to 2.2.2 I'm getting reasonable performance from the tunnel. With nothing else changing except moving to 2.2.3 or 2.2.4 the tunnel fails to pass traffic and passes it terribly slow respectively. Not exactly "just works".

    Have you turned off AES-NI?


  • Netgate

    @rain:

    @jwt:

    AES-CCM isn't a great mode for IPSec.  In fact, the only support I can find in the FreeBSD kernel for it is in the wireless code, so I'm confused how you've configured to use it.  (AES-CCM gets used a lot in 802.11.)

    I didn't, likely my post wasn't clear. Those are options on the other end of the tunnel, a Palo Alto Networks 3000 series. The response was to let CMB know what other options I have available to try. Although I appreciate your long response!

    OK, so send the output of "ipsec statusall", as requested.  Then we'll know what we're dealing with.

    @rain:

    I sent an email to the support from the store asking how to use that support ticket, but I have not yet heard back. That was last week Monday when I sent it. I'd love to report back that I got great support on this hardware.

    I've forwarded an internal request to see what happened here.



  • statusall on 2.2.2 and 2.2.4? Or just one?


  • Netgate

    both would be interesting, but 2.2.2 would be OK



    There's more information in there than I'm willing to post publicly, so I've PM'd it to you.

    This was on 2.2.2 working the way it should.



  • There's something to this, I'm working on narrowing it down. I also grabbed your support ticket, will reply back there with an update before I wrap up for the day today.



  • Thanks Chris.

    Nice to be validated.  ;D I'm a newbie on these forums, but I'm not a newbie with networks.

    With regards,



    I too am experiencing issues with IPSec on 2.2.3 and 2.2.4.  My tunnel is fast, stays up for a couple of hours, and then just disconnects. It's not the same issue as what others are reporting, but I have three tunnels connecting to my IPSec 2.2.3 instance (two far ends are 2.2.3 and one is 2.2.4).  The 2.2.4 one does not stay healthy for more than 4 hours. Deleting both instances at both ends and recreating them brings everything back up for another 4 hours, and then the tunnel dies again.



  • What kind of hardware?

    What kind of tunnel?

    CMB has been working on my issue for a couple of weeks now, but I haven't heard anything recently. I got the impression it was an upstream problem.



  • The firmware version of Intel NICs can play a significant role in these types of problems. I have had identical issues on a LAN. Take a look at that if you would.

    afrojoe: I would update your 2.2.3 concentrator to 2.2.4 if you can. I am curious, are these pfSense HW or hand rolled? If hand rolled what are the NICs?

    /M



  • It's one of the official pfsense units sold directly from pfsense which will not have issues like firmware incompatibility.



  • Hi rain!

    The issue with Intel firmware is not necessarily incompatibility. Rather, the Intel developers seem to like to "play around" with different things for various customers in releases, and some of this affects latency on WAN links. I was one of the first WISPs in the 90s and it drove us nuts.

    I have an SG-2440 that is working flawlessly. I have a ticket open regarding the issues of different clients in mobile configurations. CMB is very busy on many fronts. Being he is AWESOME, we should help every way we can to narrow down issues.

    I would also not rule out an ISP related issue. Do you have the same provider on both ends? AND are all of your endpoints pfSense HW? TIA

    /M



    Not an ISP issue. The same hardware behaves the same on two different providers, and also on the same provider. Different hardware on the same two different providers works without issue, and also on the same provider.

    Quality of circuits is outstanding in all my remote locations. I'm using a hub spoke model, with a pair of Palo Alto 3000 series as the hub. Multiple spokes, all pfSense. Any pfsense running 2.2.2 has no issues (AES-256). All running 2.2.4 work fine except the pfsense official hardware firewall from the store.

    I have no issues other than with this one firewall hardware. All other factors I can remove, have been removed.

    CMB has I think all the details he's asked for, but I'm sure if he needs more he'll ask.

    And trust that I have nothing but respect for CMB and the team at ESF. I honestly believe that pfsense is the best platform for perimeter security out there, commercial or not. The only reason I use PAN as my hub is because of executive concerns around an open source platform doing all security between all subnets, local and remote.

    I'm just in an awkward position. I promised the CEO of the company that I would get him the best of the best, rather than what I usually build using spare parts, and I looked like an amateur after 2.2.2. All the technical reasons aside, he sees me handing him a black box that doesn't work as I told him it would. Meanwhile an old grinder under the desk supports 10-20 people on a regular basis and never blips.

    The only reason I bought the pfsense branded hardware was because I read these forums regularly, and I see pfsense experts brag all the time about their bulletproof hardware from the pfsense store. I wanted to be one of those too because, quite frankly, although I have had good success with old hardware, one day I'm sure that might end (given the end user problems on these forums). :)

    I'm grateful, honest!

    Cheers,