AES-NI acceleration of AES-GCM w/IPSec coming in 2.2



  • I live in Austin.  I have a 1Gbps/1Gbps line to my home, which is run by a FW-7551 (2 core Rangeley @ 1.7GHz).
    There is a C2758 (8 core Rangeley @ 2.4GHz) at work.  (Work is connected via 10Gbps.)

    Both systems are running a very recent (today) snapshot of pfSense-2.2.

    I have an IPSec tunnel between the FW-7551 and C2758 running AES-GCM with AES-NI acceleration.

    jims-mini is a mac mini on my desk.
    172.21.0.95 is an Intel i5 NUC on my desk at home.

    So this is, in many respects, a "real world" test, not an artificial "in-the-lab" test.

    I'm seeing between 729Mbps and 891Mbps throughput in the runs below.

    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  8031k      0  0:00:15  0:00:15 --:--:-- 7291k
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  7134k      0  0:00:17  0:00:17 --:--:-- 7348k
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  7302k      0  0:00:17  0:00:17 --:--:-- 8012k
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  8598k      0  0:00:14  0:00:14 --:--:-- 8228k
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  8212k      0  0:00:15  0:00:15 --:--:-- 8919k
    jims-mini:~ jim$

    More formal testing will come, but I'm pretty stoked about the above.



  • Apparently, I can't do simple math.

    It's been pointed out that 123MB in 15 seconds was about 65-70Mbps.

    The current 2.2-RC snapshots are running about 200Mbps on the same hardware.

    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  25.5M      0  0:00:04  0:00:04 --:--:-- 25.8M
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  25.8M      0  0:00:04  0:00:04 --:--:-- 25.8M
    jims-mini:~ jim$ curl http://172.21.0.95/p2r.tgz > /tmp/p2r.tgz
      % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                    Dload  Upload  Total  Spent    Left  Speed
    100  123M  100  123M    0    0  25.9M      0  0:00:04  0:00:04 --:--:-- 26.0M
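
    For anyone who wants to check the arithmetic, here is a minimal Python sketch that converts curl's "Average Speed" column to megabits per second. It assumes curl's `k` and `M` suffixes are 1024-based and that the column is in bytes per second; the helper name `curl_rate_to_mbps` is made up for this illustration.

```python
def curl_rate_to_mbps(rate: str) -> float:
    """Convert a curl speed field such as '8031k' or '25.5M' to megabits/s.

    Assumption: curl reports bytes per second with 1024-based suffixes.
    """
    units = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = rate[-1]
    if suffix in units:
        bytes_per_sec = float(rate[:-1]) * units[suffix]
    else:
        # No suffix: the field is already plain bytes per second.
        bytes_per_sec = float(rate)
    # 8 bits per byte, 1e6 bits per megabit.
    return bytes_per_sec * 8 / 1e6


if __name__ == "__main__":
    # Average speeds taken from the curl runs quoted in this thread.
    for rate in ("8031k", "7134k", "25.5M", "25.9M"):
        print(f"{rate:>6} -> {curl_rate_to_mbps(rate):6.1f} Mbps")
```

    The first set of runs comes out in the tens of Mbps and the 2.2-RC runs a little over 200Mbps, which is in line with the corrected figures above.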



  • Thanks.

    It would be interesting to see what a pair of those 8-core Rangeleys could do between them.  (Maybe in a code box or Tt tags for readability ;))



  • Yeah - I think it would be most fair to put a couple of those on a table and connect them to each other through a VPN in a small network and take the ISPs out of it completely.



  • Yes, that would be interesting, but I've not gotten to it yet.

    I am testing with iperf/iperf3 now, because things have gotten too fast for my "curl" test.

    I'm getting a consistent 310-315Mbps with iperf to work, across my home 1Gbps link.
    (C2758 on one end, FW-7551 (C2358) on the other, both running today's snapshot.)

    Earlier, I set up a pair of 1U boxes, each with an E3-1275 at 3.5GHz. Each has a 10G Intel X520 dual-port card.
    The endpoints for load generation are two Dell R200 servers running stock FreeBSD (10.0 on one, 10.1 on the other), each with a Chelsio card.

    Quick-n-dirty single stream (iperf3) test results:
    AES-GCM, no AES-NI: ~125 Mbps
    AES-GCM with AES-NI: ~1.75 Gbps
    AES-GCM with AES-NI and pf disabled: ~2.2 Gbps
    AES-CBC: ~415-425 Mbps

    Straight-up iperf3 between the same hosts:
    pf Enabled, single stream:  3.28 Gbps
    pf Enabled, 10 stream: 5.15 Gbps
    pf Disabled, single stream: 4.63 Gbps
    pf Disabled, 10 stream: 6.91 Gbps

    That's all for now.
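
    To put the single-stream numbers in perspective, a rough Python sketch of the ratios follows. The figures are the approximate ones quoted above; treating the "straight-up" iperf3 runs as the unencrypted baseline is an assumption, and the function names are invented for this example.

```python
# Approximate single-stream iperf3 results from this post, in Mbps.
AES_GCM_NO_AESNI = 125
AES_GCM_AESNI = 1750
AES_GCM_AESNI_NO_PF = 2200

# Assumption: "straight-up" iperf3 means the same path without the tunnel.
RAW_PF_ENABLED = 3280
RAW_PF_DISABLED = 4630


def speedup(fast_mbps: float, slow_mbps: float) -> float:
    """How many times faster the first figure is than the second."""
    return fast_mbps / slow_mbps


def fraction_of_raw(tunnel_mbps: float, raw_mbps: float) -> float:
    """Share of the non-tunnel throughput that the tunnel reaches."""
    return tunnel_mbps / raw_mbps


if __name__ == "__main__":
    print(f"AES-NI speedup for AES-GCM: {speedup(AES_GCM_AESNI, AES_GCM_NO_AESNI):.1f}x")
    print(f"Tunnel vs raw, pf enabled:  {fraction_of_raw(AES_GCM_AESNI, RAW_PF_ENABLED):.0%}")
    print(f"Tunnel vs raw, pf disabled: {fraction_of_raw(AES_GCM_AESNI_NO_PF, RAW_PF_DISABLED):.0%}")
```

    On these rough numbers, AES-NI gives about a 14x gain for AES-GCM, and the tunnel reaches roughly half of the non-tunnel single-stream rate.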



  • That's certainly fast enough for the vast majority of people.



  • @kejianshi:

    That's certainly fast enough for the vast majority of people.

    but not for me.  :-)



  • I will be using a 10Gb network for the foreseeable future.  If I can max that out, I will be happy.



  • You'll have to wait a bit.  That's not going to be in 2.2.



  • This morning's results.

    Remember, this is a real-world network, not a lab situation.

    (So fun to watch…)

    ![Screen Shot 2014-12-30 at 10.08.28 AM.png](/public/imported_attachments/1/Screen Shot 2014-12-30 at 10.08.28 AM.png)

