Yet another sizing question.
-
Cool. Does pfSense boot UEFI or do I need to go old-school? And will it try to format my gentoo drives?
-
2.4 supports UEFI
https://redmine.pfsense.org/issues/4044
You can select the wrong drives and screw up your gentoo drives; I don't know if it will mess up your boot manager.
The safest way to do it is either:
- Unplug the drives on the gentoo box and run the installer
- Install to a USB drive on another system, then move the USB drive to your C2758 box, boot from it and reassign the NICs (useful if it isn't practical to unplug the drives on the gentoo box but you have another machine lying around whose drives you can unplug easily, or that you don't care about messing up)
-
If you're interested I might run pfSense on that system again and see how it handles pure IDS/IPS for a reference point. It might be a lot better without VPN taking up a big chunk of one core.
That would be greatly appreciated :)
-
I'm curious now as well, I'll have to try that out.
I was also curious to see how its real world performance compares to Ira's VPN benchmark:
https://forum.pfsense.org/index.php?topic=105238.msg616743#msg616743
openvpn --genkey --secret /tmp/secret
time openvpn --test-crypto --secret /tmp/secret --verb 0 --tun-mtu 20000 --cipher aes-256-cbc
Then to give the execution time in seconds a real-world meaning:
( 3200 / execution_time_seconds ) = Projected Maximum OpenVPN Performance in Mbps
I'll report back with the IDS/IPS performance.
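As a rough sketch of the rule of thumb quoted above, the conversion from execution time to projected throughput can be wrapped in a small shell function. The name projected_mbps is made up for illustration and is not part of OpenVPN; the 3200 constant is taken straight from the formula in the post, and the 8.0 seconds in the example is a placeholder, not a measurement.

```shell
# projected_mbps is a hypothetical helper, not an OpenVPN command.
# It applies the rule of thumb from the post: 3200 / execution_time_seconds.
projected_mbps() {
    awk -v t="$1" 'BEGIN {
        # Force a numeric comparison and reject zero or negative times.
        if ((t + 0) <= 0) {
            print "execution time must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.1f\n", 3200 / t
    }'
}

# Placeholder input: a --test-crypto run that took 8.0 seconds of real time.
projected_mbps 8.0
```

Feed it the "real" time reported by the time command. The result is only a projection of the crypto ceiling, so actual tunnel throughput may well come in lower.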
-
FWIW the c2*58 chips have compression and encryption acceleration in hardware. It's the QuickAssist feature set. For encryption and compression the c2758 box does better than my 1st-generation i7 920. For everything else, of course, it sucks in comparison.
Frankly I thought ids/ips would be less strenuous than encryption would be, but that's what I get for speculation. While I'm curious to know about ids/ips without encryption my use case would be mostly with it.
I can sacrifice one hdd on my gentoo box; all it has right now is iso images for use in VMs. It's a kvm box, but since it doesn't support VT-d I can't isolate NICs for just a firewall. I haven't been able to make a bridge device which has an ip on the guest but not the host, which sort of blows my security model up.
-
FWIW the c2*58 chips have compression and encryption acceleration in hardware.
In theory. In practice, just forget it exists.
I haven't been able to make a bridge device which has an ip on the guest but not the host, which sort of blows my security model up.
You just don't configure an IP on the bridge.
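A minimal sketch of what that could look like on a Linux KVM host with iproute2, assuming the bridge is called br0, the physical NIC is eth1, and eth1 has no address of its own (all assumptions, not details from the thread). The run and make_addressless_bridge helpers are made up for illustration. With DRY_RUN=1 the commands are only printed, so the sketch can be read through before anything is changed.

```shell
# Sketch only: br0 and eth1 are assumed names; run as root on the KVM host.
# With DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

make_addressless_bridge() {
    bridge="$1"; nic="$2"
    run ip link add name "$bridge" type bridge
    # Keep the kernel from auto-assigning an IPv6 link-local address to the bridge.
    run sysctl -w "net.ipv6.conf.$bridge.disable_ipv6=1"
    run ip link set dev "$nic" master "$bridge"
    run ip link set dev "$nic" up
    run ip link set dev "$bridge" up
    # Deliberately no "ip addr add": the host gets no IP on this segment.
}

# Print the commands without touching the system.
DRY_RUN=1 make_addressless_bridge br0 eth1
```

The guest is then attached to br0 and configures its own address inside the VM. The host only forwards frames on that segment and has no IP of its own there.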
-
I'll report back with the IDS/IPS performance.
Well, IDS/IPS is certainly taxing but performance is greatly improved when not saturating one core with VPN.
On my J3355B:
I kept my 150/10 connection maxed out for a few minutes by downloading DOTA 2 on Steam. The max CPU I got off the 1-minute RRDs was 61.63% (this matches the top output pretty well). At that moment the RRD graphs showed 103.58k pps.
This was using the Open ET & Snort Free rules, pared down to eliminate false positives. It's a home network and it was pretty inactive at the time of the test other than background processes.
Also, this was Suricata, not Snort, which is single-threaded only.
So IDS/IPS is definitely more CPU intensive than VPN on a modern AES-NI CPU.
That being said, the J3355 is a very low-end, passively cooled CPU. A J3455 would likely get you in the 350Mbps range on Suricata.
A G4560 will probably handle just about anything a home user can throw at it short of Gigabit WAN with all the packages or an expectation for line speed VPN.
-
FWIW the c2*58 chips have compression and encryption acceleration in hardware.
In theory. In practice, just forget it exists.
Meaning what? Does pfSense not have support for this hardware? In Linux my c2758 outruns my i7 920 for encryption and compression tasks.
I haven't been able to make a bridge device which has an ip on the guest but not the host, which sort of blows my security model up.
You just don't configure an IP on the bridge.
Thanks for the tip. I'll give this a try when I get a chance.
-
In Linux my c2758 outruns my i7 920 for encryption and compression tasks.
The C2758 has AES-NI and the 920 does not. The 920 is also a much older architecture, about 5 years older than the C2758.
I think VAMike was saying you can forget about any HW acceleration QuickAssist may provide in theory, but AES-NI will definitely make a difference.
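One way to check whether a given box actually exposes AES-NI is to look at the CPU feature flags. The sketch below assumes the usual locations: /proc/cpuinfo on Linux, where the flag is listed as "aes", and /var/run/dmesg.boot on FreeBSD and pfSense, where it appears as "AESNI" in the Features2 line. The has_aes_flag helper name is made up for illustration.

```shell
# has_aes_flag is a hypothetical helper: it reads CPU feature text on stdin and
# succeeds if it finds "aes" on a Linux flags line or "AESNI" in a FreeBSD
# Features2 line.
has_aes_flag() {
    grep -qE '^flags.*[[:space:]]aes([[:space:]]|$)|Features2=.*[<,]AESNI[,>]'
}

if [ -r /proc/cpuinfo ]; then
    src=/proc/cpuinfo          # Linux
else
    src=/var/run/dmesg.boot    # FreeBSD / pfSense (assumed location)
fi

if [ -r "$src" ] && has_aes_flag < "$src"; then
    echo "AES-NI: present"
else
    echo "AES-NI: not detected"
fi
```

This only shows that the CPU advertises the instructions. Whether a particular OpenVPN build uses them depends on the crypto library it is linked against.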
-
FWIW the c2*58 chips have compression and encryption acceleration in hardware.
In theory. In practice, just forget it exists.
Meaning what? Does pfSense not have support for this hardware? In Linux my c2758 outruns my i7 920 for encryption and compression tasks.
Not because of quickassist, unless you went out of your way to install 3rd party drivers, and even then openvpn is a lousy application for QAT. (It's much more optimized for embedding into a web server.) I would expect the c2758 to be faster at encryption than the i7 920 because it has AES-NI. The c2750 would be a bit faster because it trades quickassist for a bit more clock.
-
I do have the third party drivers for QAT on my box. I frankly don't see why anyone would get the hardware without taking full advantage of it.
AFAIK aes-ni is a subset of QAT. And that was my point, that the qat feature set is working on my 2758, because otherwise there's no way the i7 920 would lose out to an atom, in spite of the age difference.
So does pfSense make good use of this feature set or no? Frankly the VPN is more important to me than the IDS/IPS feature.
-
I see that aes-ni is not contained in QuickAssist after some research, but it seems that QAT is a much larger feature set than aes-ni. It's difficult to see exactly what the differences are because the AES-NI docs show 7 assembly language instructions where the QuickAssist docs show dozens of calls in C, covering a lot of different encryption algorithms, some of which appear to be related to AES but not contained in AES-NI.
Not sure I'd call them unrelated though as they're both focused mostly on encryption.
-
I see that aes-ni is not contained in QuickAssist after some research, but it seems that QAT is a much larger feature set than aes-ni. It's difficult to see exactly what the differences are because the AES-NI docs show 7 assembly language instructions where the QuickAssist docs show dozens of calls in C, covering a lot of different encryption algorithms, some of which appear to be related to AES but not contained in AES-NI.
Not sure I'd call them unrelated though as they're both focused mostly on encryption.
They are completely unrelated because one of them is a set of CPU instructions present on many CPUs from multiple vendors and widely implemented in standard libraries, while the other is an umbrella name for a set of off-chip accelerators with different (incompatible) feature sets usable only with a specific set of (not widely used) libraries. As a CPU instruction, there is essentially zero overhead for using AES-NI. As an off-chip accelerator, QAT requires setting up a block of work, telling the accelerator to work on it, and retrieving the work product. The overhead of performing those steps is very high if the size of the block is small (OpenVPN uses fairly small blocks.)
I'm not aware of any good public benchmarks of QAT with OpenVPN (would love a link if someone has some), but I haven't seen any indications that it's a game changer for that application. Most of the published benchmarks are for web applications or certain IPsec implementations which can transfer very large blocks with minimal CPU involvement.
You're right that QAT can also do zlib compression. That mostly falls into the "so what" department because most large data is already compressed, and implementing compression on top of SSL is generally considered a security risk these days so it's often disabled anyway. There are also better algorithms than zlib which are optimized for modern CPUs and which are a more attractive choice than buying into QAT just for that.
People have been talking about QAT on C2758 for small appliance applications for literally 3+ years now, with basically nothing to show for it. The version of QAT implemented on the C2758 is different than that in Intel's current flagship products and is unlikely to get major development resources at this point. If you haven't already gotten several years of return from QAT on Rangeley in an OEM application, it's probably not worth putting it on a shopping list at this point. If someone has a specific (unusual) reason that they need QAT, great. But it's a shame to see people buying into it at this late date with the expectation that it's going to revolutionize their OpenVPN implementation.
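The per-block overhead argument above can be illustrated with a toy model: if every request to an accelerator costs a fixed amount of time to submit and retrieve, effective throughput is block size divided by (fixed overhead + block size / raw rate). The sketch below is only that model. The effective_rate name is made up, and the 20 microsecond overhead and 1000 MB/s raw rate are placeholders chosen for illustration, not measurements of QAT or any other hardware.

```shell
# Toy model, not a benchmark: every number passed in is a placeholder.
# Usage: effective_rate SIZE_BYTES OVERHEAD_US RAW_BYTES_PER_US
# Prints effective throughput in bytes per microsecond (numerically MB/s).
effective_rate() {
    awk -v size="$1" -v overhead="$2" -v raw="$3" 'BEGIN {
        # Time per block = fixed submit/retrieve cost + time to process the data.
        printf "%.1f\n", size / (overhead + size / raw)
    }'
}

# Hypothetical offload engine: 20 us fixed cost per request, 1000 MB/s raw rate.
for size in 1500 16384 1048576; do
    printf '%8d-byte blocks: %s MB/s effective\n' \
        "$size" "$(effective_rate "$size" 20 1000)"
done
```

With small, packet-sized blocks the fixed cost dominates and most of the raw rate is lost, while very large blocks approach it. That is the shape of the argument for why web or bulk IPsec workloads can benefit where OpenVPN may not.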
-
Just buy new stuff, it's more better.
-
Interesting.
While I haven't set up openvpn on this box, I've done benchmarks. The benchmarks showed good performance, but thinking about it now they could be set up as a large job, meaning best performance. I've seen several parts of the documentation talking about different implementations of QAT but didn't know they were so significantly different.
I had been under the impression that the c2758 has no extra hardware for QAT, meaning no coprocessor built-in. It supposedly uses the main cpu cores for this. I guess this could mean that the QAT code is implemented as on-chip rom or microcode, but not sure if it matters at this point.
You're certainly right about there having been much talk and little to show for it over the past years.
-
I had been under the impression that the c2758 has no extra hardware for QAT, meaning no coprocessor built-in. It supposedly uses the main cpu cores for this. I guess this could mean that the QAT code is implemented as on-chip rom or microcode, but not sure if it matters at this point.
The C2000 series is a SoC (system on chip) which has a number of components which are on-die but not in the CPU. For example, the SATA controllers, ethernet controller, memory controller, and PCIe are all integrated into the same chip. But the CPU still needs to communicate over a bus to talk to those components, including the QAT module. None of the on-die components have direct access into the CPU, unlike AES-NI which (as a set of CPU instructions) uses the registers and other CPU resources directly rather than requiring separate communication over a bus and potentially requiring a context switch to save CPU state between kernel and user mode.
-
IMHO, any N3150-based board is crypto-capable and low-power. The only downside I've experienced is the 8 GB RAM limit, which is by design.