CPU to Saturate 150mbit up and down simultaneously via VPN?
-
Get something around an i3-7100. It's more than you need for 300Mbps OpenVPN, but that gives you room to grow. It's fairly power efficient if you're not running it at max. (And if you do end up maxing it out, then a low-powered chip wasn't going to work anyway.) Or if you want to wait, embedded solutions based on goldmont might work but they're still thin on the ground and we don't have a lot of real-world results. Silvermont/Airmont based embedded processors aren't going to hit 300Mbps of OpenVPN.
Good info, thanks.
If the 7100 at 3.9GHz is a bit overkill, I wonder if the 7100T would suffice. The 35W TDP makes cooling and power options lower cost.
Heck, maybe a Haswell chip could suffice and be lower cost. Is the fact that these new i3s have HT an important part of the consideration? Maybe a non-HT dual-core Haswell could do the trick and save a bunch of money in the process.
It doesn't hurt that socket 1150 motherboards are still cheaper than 1151 boards, and I already have more DDR3 RAM than I know what to do with, but no spare DDR4 RAM
-
Here is the performance I'm getting on an old i5-2400 for reference. FWIW that CPU has a passmark of 5878, single thread 1740 (single thread will determine your per-instance performance on OpenVPN).
https://forum.pfsense.org/index.php?topic=127667.msg704656#msg704656
Obviously what I'm using is overkill. If I had it to do over again I wouldn't buy what I bought, but I didn't know any better at the time.
Keep in mind this is just for reference, you will get much better performance with what you described. My system is also running suricata, pfBlockerNG & DNSBL.
This is also encrypting at AES-256-CBC, which I also don't recommend. AES-128-GCM is more than enough for your privacy and significantly more efficient.
So I'm guessing that you would see <12% CPU usage on my system just by switching to AES-128-GCM while still running those packages, and less again without them.
https://calomel.org/aesni_ssl_performance.html
If I'm right on those estimates then I think a J3355B or J3455-ITX would meet your needs. The AES-NI is different, and goldmont is newer but slower than modern Core series processors, though probably at least on par with my 6-year-old i5-2400.
Passmark is certainly not the best reference system but gives a ballpark idea of performance comparison:
| Passmark | i5-2400 | J3355B | J3455-ITX |
| --- | --- | --- | --- |
| multi-thread | 5878 | 1324 (22%) | 2135 (36%) |
| single-thread | 1740 | 884 (50%) | 782 (44%) |

As another reference point, the i3-7100 VAMike suggested beats out my i5-2400 in multi- and single-threaded scores by a small margin and is far more modern, which matters. His recommendation will absolutely, undoubtedly work for you without a hitch. He's also a very knowledgeable user and I'd be interested in his input on my suggestion.
If you are looking for a lower power / cheaper solution, read on.
Obviously this is hack math for a lot of reasons, but still should serve as a ballpark.
If you adjust my approximation of 12% for AES-128-GCM @130Mbps to your 300Mbps, you get ~28% on my system.
Adjusting that again for a J3355B on a single core gives ~56%, or 28% of the total two cores.
On a J3455-ITX, it gives ~62% of a single core, or 16% of the total four cores.
Again, these are hack estimates, but they both point in the general direction that you should be able to do this on a passively cooled modern Celeron.
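For anyone who wants to redo the hack math above with their own numbers, here is a minimal Python sketch. It assumes CPU load scales linearly with throughput and inversely with the Passmark single-thread score, and it treats the 12% figure as load on one core, the same way the post does. The function names are made up for illustration, and none of this is a measurement.

```python
# Back-of-the-envelope scaling of the estimate above.
# Inputs are the rough figures from the post; both scaling rules are assumptions.

def scale_cpu_for_throughput(cpu_pct, measured_mbps, target_mbps):
    """Assume CPU usage grows linearly with VPN throughput."""
    return cpu_pct * target_mbps / measured_mbps

def scale_for_slower_core(cpu_pct, single_thread_ratio):
    """Assume load grows inversely with the single-thread score ratio."""
    return cpu_pct / single_thread_ratio

# ~12% at 130Mbps on the i5-2400, scaled to the 300Mbps target
baseline = scale_cpu_for_throughput(12, 130, 300)

# Passmark single-thread scores relative to the i5-2400 (1740)
j3355b = scale_for_slower_core(baseline, 884 / 1740)
j3455 = scale_for_slower_core(baseline, 782 / 1740)

print(f"i5-2400: ~{baseline:.0f}% of one core")
print(f"J3355B:  ~{j3355b:.0f}% of one core, ~{j3355b / 2:.0f}% of two cores")
print(f"J3455:   ~{j3455:.0f}% of one core, ~{j3455 / 4:.0f}% of four cores")
```

Using the unrounded score ratios, the results land within a point or two of the figures quoted in the post, which used the rounded 50% and 44%.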
Thanks, good info. I wonder this as well.
So are the OpenVPN calculations not multithreaded?
-
No, OpenVPN is single threaded. You can create a gateway group to spread the load across multiple cores (this is effective for real-world performance and easy to do), but any single instance of OpenVPN will be limited by single-thread performance.
-
Can you elaborate on how to set up an OpenVPN gateway group please?
-
Here is the performance I'm getting on an old i5-2400 for reference. FWIW that CPU has a passmark of 5878, single thread 1740 (single thread will determine your per-instance performance on OpenVPN).
https://forum.pfsense.org/index.php?topic=127667.msg704656#msg704656
So about 30% CPU for ~150Mbps, and he's looking for 300Mbps… if your estimate below is right that the J3355 is about half the speed of the i5-2400, that means he needs about 120% of the J3355. Maybe a bit less for AES-GCM, but the bottom line will probably be that the J3355's limit is right around the target speed.
This is also encrypting at AES-256-CBC, which I also don't recommend. AES-128-GCM is more than enough for your privacy and significantly more efficient.
So I'm guessing that you would see <12% CPU usage on my system just by switching to AES-128-GCM while still running those packages, and less again without them.
I definitely recommend AES-128-GCM over AES-256-CBC, but it actually doesn't make all that much difference in OpenVPN performance–the bottlenecks lie elsewhere.
Passmark is certainly not the best reference system but gives a ballpark idea of performance comparison:
I'm skeptical that it offers much insight at all. First, because the results I've seen are all over the map for a given CPU, and also because they're not focused on the specific problem of OpenVPN (so it's an apples to oranges comparison). I'd guess that a goldmont CPU might handle the task, but without some real world experience wouldn't present it as more than a guess. Definitely the math isn't reliable across architectures like this.
Again, these are hack estimates but they both point in the general direction that you should be able to do this on a passively cooled modern celeron.
One problem here is that Intel has used "pentium" and "celeron" to cover a heck of a lot of different architectures at this point. Right now you can buy a "pentium" or a "celeron" that is a goldmont mobile CPU or one that's a kaby lake desktop part, and those have very different performance characteristics. A skylake or kaby lake pentium or celeron will behave similarly to the i3-7100, with the performance for this application basically scaling with clock speed. Picking just one is tough, because there's always another one that's a little faster for another couple of bucks or a little slower for a couple of bucks less. Probably any of them will hit the 300Mbps target; the i3-7100 is just a point that will have some headroom but isn't on the really steep part of the price curve, and you could also go with a G3950 for half the cost. But switching to a goldmont "pentium" or "celeron", you move from "probably can do this" to "maybe can do this"; it's a completely different architecture, so instead of just looking at the lower clock speed you're also looking at a lower IPC.
-
If the 7100 at 3.9GHz is a bit overkill, I wonder if the 7100T would suffice. The 35W TDP makes cooling and power options lower cost.
The 7100 has the fan in the box, and I haven't noticed that a few watts makes much of a difference in power options. If you are trying to go fanless that's a whole different story, but that wasn't in the original ask. :) On most of these newer CPUs you'll be idle most of the time and the fan will also be idling. The low TDP chips are important mainly if you want to ensure that they never get too hot because you can't/don't want to put a fan on. They'll draw the same power at idle, but are throttled from getting too hot (at the cost of peak performance). I personally use a fanless router, but I'm not trying to run multiple hundred megabits of OpenVPN on it–that really changes the requirements.
Heck, maybe a Haswell chip could suffice and be lower cost. Is the fact that these new i3s have HT an important part of the consideration? Maybe a non-HT dual-core Haswell could do the trick and save a bunch of money in the process.
The kaby lake celeron option is about $50. If you can get a similar haswell for a bunch less, go for it. I'd guess you'd end up in the same ballpark, and I'd just get the newer architecture, which is probably about as much more efficient as it is more expensive. But if you get a steal on the haswell it should work fine; if the parts box RAM is worth enough to tip the scale, that's that.
-
The 7100 has the fan in the box, and I haven't noticed that a few watts makes much of a difference in power options. If you are trying to go fanless that's a whole different story, but that wasn't in the original ask. :) On most of these newer CPUs you'll be idle most of the time and the fan will also be idling. The low TDP chips are important mainly if you want to ensure that they never get too hot because you can't/don't want to put a fan on. They'll draw the same power at idle, but are throttled from getting too hot (at the cost of peak performance). I personally use a fanless router, but I'm not trying to run multiple hundred megabits of OpenVPN on it–that really changes the requirements.
The fan doesn't bother me in the slightest. This thing is going to reside in my basement next to my noisy ProCurve switch, my KVM server and my patch panel, where noise is not a concern. The plan - however - was to use a PicoPSU-type power supply. The 54W TDP - using those online PSU calculators of questionable accuracy - winds up being right on the hairy edge of what the PicoPSU model I had in mind can provide.
I like the PicoPSUs as they seem to have much less overhead at idle when measured at the wall with my Kill-A-Watt than a similar PC with a traditional ATX power supply does. Not quite sure why that is though.
The kaby lake celeron option is about $50. If you can get a similar haswell for a bunch less, go for it. I'd guess you'd end up in the same ballpark, and I'd just get the newer architecture, which is probably about as much more efficient as it is more expensive. But if you get a steal on the haswell it should work fine; if the parts box RAM is worth enough to tip the scale, that's that.
Yeah, it's less the price of the CPU, and more the price of the other accessories that winds up being better with Haswell. Socket 1150 motherboards tend to be cheaper than Socket 1151 models, and then there's the fact that I won't need to buy RAM at all. I have unused DDR3 sticks up to my arm pits, but I as of yet don't have anything DDR4 (or DDR3L for that matter) in my house.
So about 30% CPU for ~150Mbps, and he's looking for 300Mbps… if your estimate below is right that the J3355 is about half the speed of the i5-2400, that means he needs about 120% of the J3355. Maybe a bit less for AES-GCM, but the bottom line will probably be that the J3355's limit is right around the target speed.
Hmm. So ~30% of an i5-2400 for 150Mbps means I'd need ~60% of that same i5-2400 for 300Mbps.
If these results are an accurate predictor for OpenVPN-type workloads (which I am not convinced they are, due to special instruction sets like AES-NI), Haswell gained about 13% IPC over Sandy Bridge.
The i5-2400 turbos up to 3.4GHz, so dividing by 1.13 and multiplying by 0.6, I ought to need an absolute bare minimum of about 1.8GHz out of the Haswell arch.
Add a safety margin over that, and even a 2.8GHz Haswell Celeron G1840 ought to do the trick. Question is, is that cutting it too close…
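Spelled out as a quick Python sketch, the clock estimate above looks like this. The inputs are only the figures from the post (3.4GHz turbo, ~13% IPC gain, ~60% load), and the assumption that OpenVPN throughput scales with clock times IPC is exactly the part the post itself is unsure about.

```python
# Minimum Haswell clock estimate, using only the numbers quoted in the post.
SANDY_TURBO_GHZ = 3.4    # i5-2400 turbo clock
HASWELL_IPC_GAIN = 1.13  # ~13% IPC gain over Sandy Bridge (poster's figure)
LOAD_FRACTION = 0.60     # ~60% of the i5-2400 estimated for 300Mbps

def min_clock_ghz(base_ghz, ipc_gain, load_fraction):
    """Clock a faster-IPC core needs to match load_fraction of the base core."""
    return base_ghz / ipc_gain * load_fraction

needed = min_clock_ghz(SANDY_TURBO_GHZ, HASWELL_IPC_GAIN, LOAD_FRACTION)
headroom = 2.8 / needed  # Celeron G1840 runs at 2.8GHz

print(f"minimum ~{needed:.2f}GHz, G1840 headroom ~{headroom:.2f}x")
```

That works out to roughly 1.8GHz needed, so the G1840 would have about one and a half times the estimated minimum, if the assumptions hold.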
-
Thanks for your input Mike, it's always good stuff! I happen to have a J3355B that I use with LibreELEC, and I also have a spare HDD and a PRO/1000. I think I'll run pfSense on it tonight and see how it handles VPN. I'll report back!
-
I like the PicoPSUs as they seem to have much less overhead at idle when measured at the wall with my Kill-A-Watt than a similar PC with a traditional ATX power supply does. Not quite sure why that is though.
It's likely due to a combination of the picoPSUs not having a fan, which takes power to run, and the fact that PSUs tend to be inefficient when only drawing a small percentage of their peak power. So a 300W ATX PSU is running at 10% of peak with a 30W pfSense box, while a 60W AC/DC converter (this is what really matters for efficiency with a picoPSU) is running at 50%.
On my LibreELEC J3355B I saw a drop of about 8W switching from an old ATX PSU to a picoPSU. Not enough to warrant buying one if you already have an ATX PSU lying around, but maybe enough if you don't. I did it because I didn't want to hear the fan, though.
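The load-fraction point above is just a division, sketched here in Python with the 30W box and the two supply ratings from the post. It says nothing about the actual efficiency at each load point, which depends on the specific supply.

```python
def load_fraction(draw_watts, rated_watts):
    """Fraction of a supply's rated output that the system is drawing."""
    return draw_watts / rated_watts

# 30W pfSense box on the two supplies mentioned in the post
for name, rated in [("300W ATX PSU", 300), ("60W AC/DC brick", 60)]:
    print(f"{name}: {load_fraction(30, rated):.0%} of rated output")
```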
-
I posted another thread on the J3355B performance.
https://forum.pfsense.org/index.php?topic=127793.msg705046#msg705046
It was pretty impressive IMO.
On the tests I ran it looked like the CPU scaled fairly linearly with VPN throughput. If that's true then this CPU would work for your 300Mbps application @ AES-128-CBC.
$55 for low power SoC.
-
Can you elaborate on how to set up an OpenVPN gateway group please?
Sure, you simply set up 2+ OpenVPN clients. I would recommend pointing these clients at different servers if you can; this helps mitigate the effects of a server going down or slowing down.
Go to Interfaces, assign each client an interface and enable it, then save and apply.
Go to System/Routing/Gateway Groups.
Create a new gateway group. Select all of the clients that you want to work simultaneously as Tier 1; you can optionally select fallback clients as Tier 2+. Fallbacks become active when all gateways in the higher tier are down.
Finally, go to your Firewall rules.
For any rule that passes traffic you want to force over the VPN, edit it, open the advanced settings, and under Gateway select the gateway group you created.
-
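To make the tier behaviour in the gateway group steps above concrete, here is a small Python model of it. This is only an illustration of the rule "fallbacks are active when all gateways in the higher tier are down", not pfSense's actual code; the `Gateway` class and the `VPN_A`/`VPN_B`/`VPN_C` names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gateway:
    name: str  # hypothetical interface/gateway name
    tier: int  # 1 is preferred; higher numbers are fallbacks
    up: bool   # whether the gateway is currently considered online

def active_gateways(group):
    """Return the live gateways in the lowest-numbered tier that has any member up."""
    live = [g for g in group if g.up]
    if not live:
        return []
    best_tier = min(g.tier for g in live)
    return [g for g in live if g.tier == best_tier]

group = [
    Gateway("VPN_A", tier=1, up=True),
    Gateway("VPN_B", tier=1, up=True),
    Gateway("VPN_C", tier=2, up=True),
]
print([g.name for g in active_gateways(group)])
```

With both Tier 1 clients up, traffic is shared between them and the Tier 2 client sits idle; the fallback only takes over once every Tier 1 client is down.
-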
AMD Geode based APU2C4
Just to clarify, the APU2C4 isn't AMD Geode based, it's on much more powerful Jaguar cores. And that said, with four of them I'd expect it to be possible to aggregate multiple OpenVPN connections to equal 150Mbps, as others have suggested. I say possible because it might be, not because I'd advise it. But if I were in OP's situation I'd at least try it.
Just didn't want anyone to get the idea that the APU2C4 is the same as the old APU systems, which were (are) based on very old Geode CPUs.
-
AMD Geode based APU2C4
Just to clarify, the APU2C4 isn't AMD Geode based, it's on much more powerful Jaguar cores. And that said, with four of them I'd expect it to be possible to aggregate multiple OpenVPN connections to equal 150Mbps, as others have suggested. I say possible because it might be, not because I'd advise it. But if I were in OP's situation I'd at least try it.
Note that the requirement was 150Mbps bidirectional; most of the test numbers are single stream–roughly 300Mbps equivalent. Dicey on an APU2, I think, even with multiple streams.
Just didn't want anyone to get the idea that the APU2C4 is the same as the old APU systems, which were (are) based on very old Geode CPUs.
Just to correct the correction, the Geodes were in the older ALIX series; PC Engines' "APU" was a Bobcat core and performance-wise was much closer to the APU2, except that it lacks AES-NI and has half the cores. (Confusing naming, as "APU" is AMD's name for a line covering 8 different cores over 5+ years.)
-
Just to correct the correction, the Geodes were in the older ALIX series; PC Engines' "APU" was a Bobcat core and performance-wise was much closer to the APU2, except that it lacks AES-NI and has half the cores. (Confusing naming, as "APU" is AMD's name for a line covering 8 different cores over 5+ years.)
Ah, yes. I meant ALIX.
-
Get something around an i3-7100. It's more than you need for 300Mbps OpenVPN, but that gives you room to grow. It's fairly power efficient if you're not running it at max. (And if you do end up maxing it out, then a low-powered chip wasn't going to work anyway.) Or if you want to wait, embedded solutions based on goldmont might work but they're still thin on the ground and we don't have a lot of real-world results. Silvermont/Airmont based embedded processors aren't going to hit 300Mbps of OpenVPN.
So, I took your advice and went with an i3-7100.
This is my first socket 1151 chip, so there are a lot of unfamiliar BIOS options.
Anyone know if "Intel Speed Shift Technology" is compatible with the version of BSD pfSense is built on?
edit:
I also have to admit I am VERY impressed with this little chip.
I haven't installed pfSense yet, but I am doing some testing in Ubuntu 16.10.
Using the PicoPSU-80 and 60W power brick kit from Mini-Box.com I'm idling on the desktop pulling only 7.1W from the wall (as measured on my Kill-A-Watt).
That's about the same power as my PcEngines low power Quad Core Jaguar at idle.
When I load up the chip with mprime (linux version of Prime95) it peaks at about 46W at the wall.
And that's at 3.9Ghz 2C/4T.
Even the stock Intel cooler (which just BARELY fit inside the M350 case once the drive brackets were removed) doesn't spin up much during load testing.
Very impressed.
The ASRock H270M-ITX/ac is also a great little Mini-ITX board with dual Intel NIC's to pair with it.
-
AMD Geode based APU2C4
Just to clarify, the APU2C4 isn't AMD Geode based, it's on much more powerful Jaguar cores. And that said, with four of them I'd expect it to be possible to aggregate multiple OpenVPN connections to equal 150Mbps, as others have suggested. I say possible because it might be, not because I'd advise it. But if I were in OP's situation I'd at least try it.
Note that the requirement was 150Mbps bidirectional; most of the test numbers are single stream–roughly 300Mbps equivalent. Dicey on an APU2, I think, even with multiple streams.
Just didn't want anyone to get the idea that the APU2C4 is the same as the old APU systems, which were (are) based on very old Geode CPUs.
Just to correct the correction, the Geodes were in the older ALIX series; PC Engines' "APU" was a Bobcat core and performance-wise was much closer to the APU2, except that it lacks AES-NI and has half the cores. (Confusing naming, as "APU" is AMD's name for a line covering 8 different cores over 5+ years.)
Yeah, my bad, I got my chips confused. Definitely have the PCEngines APU2C4, which is 4 Jaguar cores at 1Ghz I believe (or was it 1.2?)
-
Yeah, my bad, I got my chips confused. Definitely have the PCEngines APU2C4, which is 4 Jaguar cores at 1Ghz I believe (or was it 1.2?)
2nd gen Jaguar core at 1GHz according to the document below and PC Engines. Likely limited to 1GHz due to the design of their cooling solution.
https://www.amd.com/Documents/AMDGSeriesSOCProductBrief.pdf
CPU: AMD Embedded G series GX-412TC, 1 GHz quad Jaguar core with 64 bit and AES-NI support, 32K data + 32K instruction cache per core, shared 2MB L2 cache.
-
I also have to admit I am VERY impressed with this little chip.
I haven't installed pfSense yet, but I am doing some testing in Ubuntu 16.10.
Using the PicoPSU-80 and 60W power brick kit from Mini-Box.com I'm idling on the desktop pulling only 7.1W from the wall (as measured on my Kill-A-Watt).
That's about the same power as my PcEngines low power Quad Core Jaguar at idle.
When I load up the chip with mprime (linux version of Prime95) it peaks at about 46W at the wall.
And that's at 3.9Ghz 2C/4T.
Even the stock Intel cooler (which just BARELY fit inside the M350 case once the drive brackets were removed) doesn't spin up much during load testing.
Very impressed.
The ASRock H270M-ITX/ac is also a great little Mini-ITX board with dual Intel NIC's to pair with it.
Wow. That's really good to know. Yeah, those M350 cases are tiny, but they kind of stand alone in the market, and are perfect for a mini ITX pfSense system provided your NICs are onboard. I have one but it's for a MythTV frontend. Thanks for the info.
-
Any time!
And it gets better. I killed Xorg and the idle wattage measured at the wall went down to 6.2W!
Full specs if anyone else is interested (links to where I bought them, you may find better prices elsewhere):
- Intel Core i3-7100 ($119.96 w. Prime)
- ASRock H270M-ITX/ac Mini-ITX motherboard with dual Intel NICs ($96.98)
- Crucial 8GB (2x4GB) DDR4-2133 kit ($55.49 w. Prime)
- BiWin 60GB M.2 SATA SSD ($40.98)
- M350 Universal Mini-ITX enclosure ($39.95)
- Molex to P4 power adapter ($4.95)
And that's it. Total: $393.31 (less for me, since I already had a few of the parts left over from other projects).
The CPU comes with a cooler. Before you assemble everything, it looks like it won't fit in the M350 enclosure, but it does (just barely), as long as you don't use the 2.5" drive brackets (use an M.2 drive, a USB drive or a SATA DOM instead).
I also pulled out the mini-WLAN card (you loosen two screws on the bottom of the board and it comes right out). I wasn't using it, and I figured I'd rather not have it wasting power. I also disabled everything in the BIOS I wasn't planning on using, and enabled all power saving states except suspend to RAM, as the router needs to be operating 24/7.
I used a fan profile on the board. The CPU puts out so little heat that it seems to stay at the cooler's minimum fan speed most of the time. Granted, it is pretty cold in my basement right now.
(Warmer temps will result in higher fan speeds which will drive up power consumption noticeably. At this low power use the fans use a surprisingly large percentage of the power)
I'm very happy thus far.
Just stay away from the USB3 ports. pfSense doesn't seem to like those at all, and the installers will fail unless booted from one of the USB2 ports.
-
-
So,
After installing pfSense, my power use at idle went up a little bit to about 8W (compared to 6.2W in Ubuntu).
Part of this may be due to my "Hiadaptive" power setting, or maybe FreeBSD 10.3 isn't quite as good at power management as Ubuntu is at this point.
Either way, still good results.
Here are some comparative openSSL numbers,
First the PcEngines APU2C4:
[2.3.1-RELEASE][root@pfSense.localdomain]/root: openssl speed -elapsed -evp aes-128-ecb
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-ecb for 3s on 16 size blocks: 23413097 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 64 size blocks: 18438085 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 256 size blocks: 7473361 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 1024 size blocks: 2115520 aes-128-ecb's in 3.01s
Doing aes-128-ecb for 3s on 8192 size blocks: 279464 aes-128-ecb's in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
built on: date not available
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ecb     124869.85k   393345.81k   637726.81k   720221.92k   763123.03k
Now the i3-7100:
[2.3.3-RELEASE][admin@router.localdomain]/var/log: openssl speed -elapsed -evp aes-128-ecb
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-ecb for 3s on 16 size blocks: 242729953 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 64 size blocks: 207367303 aes-128-ecb's in 3.01s
Doing aes-128-ecb for 3s on 256 size blocks: 69510589 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 1024 size blocks: 17831161 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 8192 size blocks: 2219499 aes-128-ecb's in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ecb    1294559.75k  4412345.31k  5931570.26k  6086369.62k  6060711.94k
Looks like roughly an order of magnitude improvement across the board (about 8x to 11x depending on block size).
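For reference, the per-block-size ratios between the two openssl runs above can be worked out with a few lines of Python. The throughput values are copied from the summary rows of the two outputs (in 1000s of bytes per second); the helper name is made up. Keep in mind this compares raw single-threaded AES throughput only, which is not the same thing as OpenVPN throughput.

```python
BLOCK_SIZES = [16, 64, 256, 1024, 8192]

# aes-128-ecb summary rows from the two runs, in 1000s of bytes per second
apu2 = [124869.85, 393345.81, 637726.81, 720221.92, 763123.03]
i3_7100 = [1294559.75, 4412345.31, 5931570.26, 6086369.62, 6060711.94]

def speedups(fast, slow):
    """Element-wise ratio of two throughput rows."""
    return [f / s for f, s in zip(fast, slow)]

ratios = speedups(i3_7100, apu2)
for size, ratio in zip(BLOCK_SIZES, ratios):
    print(f"{size:>5} bytes: {ratio:.1f}x")
print(f"mean: {sum(ratios) / len(ratios):.1f}x")
```

The gap is widest at small block sizes and narrows to a bit under 8x at 8192-byte blocks.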