Yet another sizing question.
-
After some research I see that AES-NI is not part of QuickAssist, but QAT seems to be a much larger feature set than AES-NI. It's hard to see exactly what the differences are, because the AES-NI docs show 7 assembly language instructions while the QuickAssist docs show dozens of C calls covering a lot of different encryption algorithms, some of which appear to be related to AES but are not part of AES-NI.
Not sure I'd call them unrelated though as they're both focused mostly on encryption.
They are completely unrelated. One is a set of CPU instructions present on many CPUs from multiple vendors and widely supported in standard libraries. The other is an umbrella name for a set of accelerators that sit outside the CPU cores, have different (incompatible) feature sets, and are usable only with a specific set of (not widely used) libraries. As a CPU instruction, AES-NI has essentially zero overhead. As an accelerator outside the cores, QAT requires setting up a block of work, telling the accelerator to work on it, and retrieving the work product. The overhead of those steps is very high relative to the work done if the block is small (OpenVPN uses fairly small blocks).
I'm not aware of any good public benchmarks of QAT with OpenVPN (would love a link if someone has some), but I haven't seen any indications that it's a game changer for that application. Most of the published benchmarks are for web applications or certain IPsec implementations which can transfer very large blocks with minimal CPU involvement.
You're right that QAT can also do zlib compression. That mostly falls into the "so what" department, because most large data is already compressed, and compression combined with SSL is generally considered a security risk these days, so it's often disabled anyway. There are also better algorithms than zlib which are optimized for modern CPUs and are a more attractive choice than buying into QAT just for that.
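To put rough numbers on the small-block overhead argument, here is a minimal sketch of a cost model in which every block pays a fixed submit-and-collect cost before the accelerator's raw speed applies. All figures are hypothetical, picked only to show the shape of the curve. They are not measurements of QAT, AES-NI, or any real hardware, and the function name is made up for illustration.

```python
def effective_throughput(block_bytes, setup_seconds, raw_bytes_per_second):
    """Bytes per second achieved when every block pays a fixed setup cost."""
    time_per_block = setup_seconds + block_bytes / raw_bytes_per_second
    return block_bytes / time_per_block


# Hypothetical numbers, NOT benchmarks of any real device.
OFFLOAD_SETUP = 20e-6  # assumed 20 microseconds to submit a job and fetch the result
OFFLOAD_RAW = 5e9      # assumed 5 GB/s once the accelerator is actually working
INLINE_SETUP = 0.0     # an in-CPU instruction has no job submission step
INLINE_RAW = 1e9       # assumed 1 GB/s on one core using in-CPU instructions

# Packet-sized block, a medium buffer, and a large bulk transfer.
for size in (1500, 16 * 1024, 1024 * 1024):
    offload = effective_throughput(size, OFFLOAD_SETUP, OFFLOAD_RAW)
    inline = effective_throughput(size, INLINE_SETUP, INLINE_RAW)
    print(f"{size:>8} B  offload {offload / 1e6:8.1f} MB/s  inline {inline / 1e6:8.1f} MB/s")
```

With these assumed figures the offload path only pulls ahead once the blocks are large enough to amortize the fixed cost, which is the pattern you'd expect for packet-sized OpenVPN traffic versus bulk transfers.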
People have been talking about QAT on the C2758 for small appliance applications for literally 3+ years now, with basically nothing to show for it. The version of QAT implemented on the C2758 is different from the one in Intel's current flagship products and is unlikely to get major development resources at this point. If you haven't already gotten several years of return from QAT on Rangeley in an OEM application, it's probably not worth putting it on a shopping list now. If someone has a specific (unusual) reason that they need QAT, great. But it's a shame to see people buying into it at this late date with the expectation that it's going to revolutionize their OpenVPN implementation.
-
Just buy new stuff, it's more better.
-
Interesting.
While I haven't set up OpenVPN on this box, I've done benchmarks. They showed good performance, but thinking about it now, they may have been set up as one large job, which is the best case for performance. I've seen several parts of the documentation talking about different implementations of QAT but didn't know they were so significantly different.
I had been under the impression that the C2758 has no extra hardware for QAT, meaning no built-in coprocessor. It supposedly uses the main CPU cores for this. I guess this could mean that the QAT code is implemented as on-chip ROM or microcode, but I'm not sure it matters at this point.
You're certainly right about there having been much talk and little to show for it over the past years.
-
I had been under the impression that the C2758 has no extra hardware for QAT, meaning no built-in coprocessor. It supposedly uses the main CPU cores for this. I guess this could mean that the QAT code is implemented as on-chip ROM or microcode, but I'm not sure it matters at this point.
The C2000 series is an SoC (system on chip) which has a number of components that are on-die but not in the CPU. For example, the SATA controllers, Ethernet controller, memory controller, and PCIe are all integrated into the same chip. But the CPU still needs to communicate over a bus to talk to those components, including the QAT module. None of the on-die components have direct access into the CPU. AES-NI, as a set of CPU instructions, uses the registers and other CPU resources directly. It does not require separate communication over a bus, and it avoids the context switch (saving CPU state between user and kernel mode) that talking to a device can require.
-
IMHO, any N3150-based board is crypto-capable and low-power. The only negative I've experienced is that it's limited by design to 8 GB of RAM.
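If you want to confirm that a given board is crypto-capable in the AES-NI sense, the CPU advertises it as a feature flag. Below is a rough sketch that assumes a Linux system, where `/proc/cpuinfo` has a `flags` line containing `aes` when AES-NI is present. FreeBSD-based systems report CPU features differently, so this would not apply there as written. The helper name and the sample text are made up for illustration.

```python
def cpu_has_flag(cpuinfo_text, flag):
    """Return True if any 'flags' line in cpuinfo-style text lists the given flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole whitespace-separated tokens so "ae" does not match "aes".
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


# Made-up sample in the general shape of /proc/cpuinfo, not taken from a real N3150.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse2 aes pclmulqdq\n"
print(cpu_has_flag(SAMPLE, "aes"))
```

On a real Linux box you would pass in the contents of `/proc/cpuinfo` instead of `SAMPLE`.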