AMD Athlon II not enough for traffic shaping?



  • Been running a PoC box for a few months now (currently running 2.2.6), mainly to replace a dd-wrt that wasn't capable of sustaining anywhere near 60mbps over an openvpn tunnel.

    Just threw an old box I had collecting dust at this for now (and no reason to build anything better, until now perhaps).  It's an AMD Athlon II X2 240 processor (dual core).  And the system's got 2GB RAM.

    For OpenVPN this has been perfect… sustains 60mbps and barely breaks a sweat (10-12% CPU @ sustained 60mbps).  I've always got two tunnels connected (sometimes 3) and round robin all LAN traffic thru these tunnels.  No problems.

    Late last week I decided to setup traffic shaping.  Just wanted to prioritize VoIP traffic on the tunnel interfaces (OVPN1, OVPN2, OVPN3) and there is no queue configured for the WAN (since everything is openvpn traffic anyways).

    Got it all setup and just forgot about it for the weekend until this morning when I noticed some intermittent internet drops.  Log on to the box and cpu is 100% on both cores (100% for each openvpn client process).  Finally tracked it down to the fact that I was saturating my uplink (10mbps) transferring files for work and the traffic shaping seems to be the reason.  With the shaping active, any sustained upstream data at or more than ~4mbps causes high CPU usage (70%+) and if that upstream traffic is sustained for any period of time anywhere near 10mbps then the CPU pegs at 100% (along with the problems like intermittent tunnel drops, etc.) until the uploading stops.  Remove the shaping and sustained uploads @ 10mbps is fine (just the 3-4% for openvpn encryption).

    So my question is, do I really need that much more CPU power for traffic shaping or am I doing something wrong?  If I do need that much more CPU, what should I be looking at?  Am I going to need something like an Intel i5 or something to shape this traffic?

    Thanks.



  • Been running a PoC box for a few months now (currently running 2.2.6), mainly to replace a dd-wrt that wasn't capable of sustaining anywhere near 60mbps over an openvpn tunnel.

    OpenWRT is Linux based, and Linux often runs more smoothly on modest hardware.
    pfSense is FreeBSD based, so on that basis alone you may need somewhat more power to get
    a system that runs as flawlessly as a Linux based solution.

    DD-WRT offers many options, features and functions, so users often put it in the same
    class of software as pfSense, but that comparison isn't right.
    pfSense is a software based firewall that needs CPU power for whatever has been
    configured and enabled; some functions use little CPU power and some
    need more.

    Just threw an old box I had collecting dust at this for now (and no reason to build anything better, until now perhaps).  It's an AMD Athlon II X2 240 processor (dual core).  And the system's got 2GB RAM.

    What is the CPU frequency (GHz)?
    Did you activate PowerD (Hiadaptive)?
    What kind of RAM are you using (DDR generation) and what is its speed?

    If I had to guess, you're being limited by your RAM speed more than the CPU or anything else.
    Consider how it works: the packet filter, the IP forwarding parts, and even NAT
    (part of pf, but run at a different phase) all hit the memory system. It's likely not that your
    CPU can't keep up, it's that your memory system is saturated. On top of this, your
    amount of RAM might also be too small!

    2 GB for firewall only
    2 GB - 4 GB for firewall & IDS (Snort) & VPN
    4 GB - 8 GB for firewall & IDS (Snort) & VPN & Squid

    8 GB - 16 GB for firewall & IDS (Snort) & VPN & Squid as caching proxy & HAVP

    • increase the mbuf size
    • increase the swap size (RAM disk)
    • increase the default RAM size of Squid

    For OpenVPN this has been perfect… sustains 60mbps and barely breaks a sweat (10-12% CPU @ sustained 60mbps).  I've always got two tunnels connected (sometimes 3) and round robin all LAN traffic thru these tunnels.  No problems.

    How do you use round robin to spread the traffic? Normally you would use load balancing over three
    WAN interfaces, and there are three main methods to do this:

    • Service based load balancing
    • Session based load balancing
    • policy based routing as load balancing

    What is the line speed of your Internet connection?
    Or do you have more than one Internet connection?
    What services are running over these VPN tunnels?
    What are the endpoints of the VPN tunnels?

    Late last week I decided to setup traffic shaping.  Just wanted to prioritize VoIP traffic on the tunnel interfaces (OVPN1, OVPN2, OVPN3) and there is no queue configured for the WAN (since everything is openvpn traffic anyways).

    You could spread this traffic over the three WAN interfaces by using load balancing
    or QoS, right? Traffic shaping may need more power, and heavy use of QoS
    needs much more CPU power than normal operation.

    Got it all setup and just forgot about it for the weekend until this morning when I noticed some intermittent internet drops.  Log on to the box and cpu is 100% on both cores (100% for each openvpn client process).  Finally tracked it down to the fact that I was saturating my uplink (10mbps) transferring files for work and the traffic shaping seems to be the reason.  With the shaping active, any sustained upstream data at or more than ~4mbps causes high CPU usage (70%+) and if that upstream traffic is sustained for any period of time anywhere near 10mbps then the CPU pegs at 100% (along with the problems like intermittent tunnel drops, etc.) until the uploading stops.  Remove the shaping and sustained uploads @ 10mbps is fine (just the 3-4% for openvpn encryption).

    You will need more CPU power, and it may be better to also go with more CPU cores.
    More and faster RAM should also help to get out of the situation you are in. Surely
    nobody wants to hear that, but it is what I would recommend to solve this.

    So my question is, do I really need that much more CPU power for traffic shaping or am I doing something wrong?

    Both could be true, but I would tend to see a better CPU and more RAM as more important.
    You could also try policy based routing together with load balancing, and perhaps there
    will then be enough power, but to get closer to the point where it really falls short,
    we need to know more about the whole network setup and usage and everything else
    related to this issue. A network schematic would help most; for most people a picture
    says more than 1000 words!
    How many users are involved and must be served?
    Is there any VoIP traffic we are talking about?
    If so, which VoIP phones are in use?
    What are the other VPN endpoints?
    What switches are in use?

    If I do need that much more CPU, what should I be looking at?

    First it would help to have answers to all the setup questions above.
    How much electric power should it use?
    What is your budget?
    How many users are there?
    What services are in use, and how many and what kind of packages will be installed?
    IDS/IPS, QoS, VPN settings, etc.

    Am I going to need something like an Intel i5 or something to shape this traffic?

    PC Engines APU or APU2, pfSense SG-2220 will be the entry level
    Intel Atom C2558 and C2758 are the mid range level
    Intel Xeon E3-12xxv3 will be a business or pro platform
    Intel E5-2600v3 CPUs are the enterprise level for building very powerful UTM devices

    Intel Xeon D-15x8 platforms, with 4 to 16 CPU cores and twice as many threads,
    range from the low end to the very top, so they could be placed
    in any level, but they are pretty expensive.

    At the moment I would suggest an Intel C2558 or C2758 platform,
    but depending on the answers to the questions asked above this could
    change. With a cheap used Intel Xeon E3-12xxv3 together with a refurbished
    mini-ITX board and second hand ECC RAM you could also build a really
    powerful pfSense box as cheaply as possible that fits all your needs.



  • @BlueKobold:

    1. What is the CPU frequency (GHz)?
    2. Did you activate PowerD (Hiadaptive)?
    3. What kind of RAM are you using (DDR generation) and what is its speed?
    1. 2.8 GHz
    2. Unknown
    3. Unknown

    The box is 5 or 6 years old (and sitting in a closet for probably the last 2 years) and I don't really remember what I threw in the box when it was built.  I'll have to head to the basement, connect a monitor and get into the BIOS to get the answers to 2 & 3.

    For OpenVPN this has been perfect… sustains 60mbps and barely breaks a sweat (10-12% CPU @ sustained 60mbps).  I've always got two tunnels connected (sometimes 3) and round robin all LAN traffic thru these tunnels.  No problems.

    How do you use round robin to spread the traffic? Normally you would use load balancing over three
    WAN interfaces, and there are three main methods to do this:

    • Service based load balancing
    • Session based load balancing
    • policy based routing as load balancing

    So I've got a routing group configured with each VPN client interface in the group, all set to Tier 1.  This is how I round robin the traffic.
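
    As a rough illustration of what a gateway group with every member on the same tier does, here is a minimal Python sketch that hands each new connection to the next tunnel in turn. It is a simplified model, not pfSense's actual implementation; the connection names and the `assign_round_robin` helper are made up for the example.

```python
from itertools import cycle


def assign_round_robin(connections, gateways):
    """Assign each new connection to the next gateway in turn."""
    next_gateway = cycle(gateways)
    return {conn: next(next_gateway) for conn in connections}


# The three tunnel interfaces named in the thread.
gateways = ["OVPN1", "OVPN2", "OVPN3"]

# Hypothetical connections, in the order they are opened.
connections = ["sftp-upload", "voip-call", "netflix", "youtube"]

assignment = assign_round_robin(connections, gateways)
for conn, gw in assignment.items():
    print(f"{conn} -> {gw}")
```

    The point of the sketch is that balancing happens per connection, so a single large sftp upload still rides one tunnel and can saturate it on its own.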

    1. What is the line speed of your Internet connection?
    2. Or do you have more than one Internet connection?
    3. What services are running over these VPN tunnels?
    4. What are the endpoints of the VPN tunnels?
    1. 60mbps down/10mbps up
    2. No
    3. During the day, it's all business (home office setup where I work the majority of my days) so it's mainly VoIP traffic, video conferencing and file transfers via sftp with the usual light http(s) traffic.  At night, it's streaming video (Netflix, etc.) and Bittorrent traffic.
    4. Endpoints are various servers from a single commercial VPN provider.

    Late last week I decided to setup traffic shaping.  Just wanted to prioritize VoIP traffic on the tunnel interfaces (OVPN1, OVPN2, OVPN3) and there is no queue configured for the WAN (since everything is openvpn traffic anyways).

    You could spread this traffic over the three WAN interfaces by using load balancing
    or QoS, right? Traffic shaping may need more power, and heavy use of QoS
    needs much more CPU power than normal operation.

    Well on the old dd-wrt router I could only have one tunnel connected at a time and I used QoS to prioritize VoIP traffic.  On this pfsense box, I can now connect multiple client sessions and use policy based routing, which is what I'm doing.  And then tried to add traffic shaping for QoS.

    Got it all setup and just forgot about it for the weekend until this morning when I noticed some intermittent internet drops.  Log on to the box and cpu is 100% on both cores (100% for each openvpn client process).  Finally tracked it down to the fact that I was saturating my uplink (10mbps) transferring files for work and the traffic shaping seems to be the reason.  With the shaping active, any sustained upstream data at or more than ~4mbps causes high CPU usage (70%+) and if that upstream traffic is sustained for any period of time anywhere near 10mbps then the CPU pegs at 100% (along with the problems like intermittent tunnel drops, etc.) until the uploading stops.  Remove the shaping and sustained uploads @ 10mbps is fine (just the 3-4% for openvpn encryption).

    You will need more CPU power, and it may be better to also go with more CPU cores.
    More and faster RAM should also help to get out of the situation you are in. Surely
    nobody wants to hear that, but it is what I would recommend to solve this.

    So my question is, do I really need that much more CPU power for traffic shaping or am I doing something wrong?

    Both could be true, but I would tend to see a better CPU and more RAM as more important.
    You could also try policy based routing together with load balancing, and perhaps there
    will then be enough power, but to get closer to the point where it really falls short,
    we need to know more about the whole network setup and usage and everything else
    related to this issue. A network schematic would help most; for most people a picture
    says more than 1000 words!
    How many users are involved and must be served?
    Is there any VoIP traffic we are talking about?
    If so, which VoIP phones are in use?
    What are the other VPN endpoints?
    What switches are in use?

    The only VoIP traffic comes from me, for work.  I've got an Obihai 1032 phone in the office, plus a softphone app on my Android phone, if needed, but both are never used at the same time.  The reason I want to setup QoS is for those last couple hours at the end of my work day when the kids get home and start going to town on youtube, netflix, whatever.  Just want to make sure my VoIP traffic has the priority when needed.

    Network is pretty simple (see attachment).

    If I do need that much more CPU, what should I be looking at?

    First it would help to have answers to all the setup questions above.

    1. How much electric power should it use?
    2. What is your budget?
    3. How many users are there?
    4. What services are in use, and how many and what kind of packages will be installed?
      IDS/IPS, QoS, VPN settings, etc.
    1. The less the better, but a typical desktop PC on 24x7 is what I had in mind (i.e. I don't want to run some power hungry server grade hardware for this).
    2. $300-$500 max would be ideal (and closer to $300 would be even more ideal), I could throw more at it if it were necessary, but I'm hoping it's not.
    3. 4 users in the house; 12 devices on the network
    4. So it's your typical family setup in the evenings/weekends (web surfing/social media/whatever the kids are doing, youtube, netflix, etc.).  Week days it's me working with mostly VoIP, video conferencing, email, web, file transfers (large) via sftp.  There's that 2-3 hour window each day where the kids come home and start using the internet and I'm wrapping up my day.  This is where I want to ensure VoIP traffic has priority.

    Am I going to need something like an Intel i5 or something to shape this traffic?

    PC Engines APU or APU2, pfSense SG-2220 will be the entry level
    Intel Atom C2558 and C2758 are the mid range level
    Intel Xeon E3-12xxv3 will be a business or pro platform
    Intel E5-2600v3 CPUs are the enterprise level for building very powerful UTM devices

    Intel Xeon D-15x8 platforms, with 4 to 16 CPU cores and twice as many threads,
    range from the low end to the very top, so they could be placed
    in any level, but they are pretty expensive.

    At the moment I would suggest an Intel C2558 or C2758 platform,
    but depending on the answers to the questions asked above this could
    change. With a cheap used Intel Xeon E3-12xxv3 together with a refurbished
    mini-ITX board and second hand ECC RAM you could also build a really
    powerful pfSense box as cheaply as possible that fits all your needs.




  • A household of 4 people with 12 devices is normal for a family today, but in terms of the
    network environment it really comes closer to the needs of a small company. Perhaps some other
    tips or hints would also solve your problem, but I really think newer hardware would do the job better.

    [HOWTO] OPENVPN and traffic shaping GUIDE!
    Throughput troubleshooting

    • enable PowerD (Hiadaptive)
    • change the cryptographic cipher to a lighter one
    • enable TRIM support if an SSD or mSATA drive is in use
    • increase the mbuf size to 1,000,000 (only with more RAM)

    The other thing is power consumption: with an Intel Celeron N2930 you would have
    the same CPU horsepower while using only 7 Watt, and your CPU uses nearly 55 Watt!!

    An Intel C2558 SoC uses 14 Watt but delivers several times the CPU power of your setup.
    So it could really be that you will need to combine configuration changes, a new hardware
    basis and troubleshooting to solve this problem.
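
    To put the wattage figures above into perspective for a box that runs 24x7, here is a small Python sketch of the yearly energy use. The wattages are the nominal figures quoted in this post, not measured draw, and `PRICE_PER_KWH` is a placeholder assumption to be replaced with your own tariff.

```python
HOURS_PER_YEAR = 24 * 365  # box is assumed to run 24x7


def annual_kwh(watts):
    """Energy used in one year at a constant draw of `watts`."""
    return watts * HOURS_PER_YEAR / 1000


def annual_cost(watts, price_per_kwh):
    """Yearly running cost for a constant draw at the given tariff."""
    return annual_kwh(watts) * price_per_kwh


# Placeholder tariff, NOT from the thread; substitute your own rate.
PRICE_PER_KWH = 0.15

# Nominal CPU power figures as quoted in the post above.
cpus = [
    ("Athlon II X2 240", 55),
    ("Celeron N2930", 7),
    ("Atom C2558", 14),
]

for name, watts in cpus:
    kwh = annual_kwh(watts)
    cost = annual_cost(watts, PRICE_PER_KWH)
    print(f"{name}: {kwh:.1f} kWh/year, about {cost:.2f} per year")
```

    This only compares the CPUs; the rest of the system (board, drives, PSU losses) adds to the real number at the wall.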

    Budget:
    Jetway N2930 mini-ITX Board ~$200
    2 x 4 GB RAM DDR3L-1333 1.35 V ~$50
    M350 mini-ITX case ~$50
    External 12V PSU ~$10
    mSATA 120 GB ~$50
    ~$360

    Mid range:
    Supermicro A1SRi-2558F ~$280
    Supermicro SC101i case ~$70 or
    mini-ITX case M350 ~$50
    2 x 4 GB DDR3-1600 ECC RAM ~$60
    PicoPSU internal 160 Watt ~$45
    external PSU 144 Watt ~$40
    120 GB SSD ~$80
    ~$555 - $575 (depending on the case)

    Comparable to the SG-4860 unit from the pfSense store, but this comes with 3 miniPCIe + 1 SIM slot
    for connecting mSATA, modem and WiFi cards directly inside of the unit.

    High end:
    Supermicro A1SRi-2758F ~$350
    Supermicro SC101i case ~$70 or
    mini-ITX case M350 ~$50
    2 x 4 GB DDR3-1600 ECC RAM ~$60
    PicoPSU internal 160 Watt ~$45
    external PSU 144 Watt ~$40
    120 GB SSD ~$80
    ~$625 - $645 (depending on the case)

    Ready to use boxes and sets:

    Comparable to the SG-8860 unit from the pfSense store, but this comes with 3 miniPCIe + 1 SIM slot
    for connecting mSATA, modem and WiFi cards directly inside of the unit.
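
    The three parts lists above can be added up with a short Python sketch. The prices are the approximate ones from the lists; the mid range and high end builds offer two alternative cases, so the sketch counts only one case per build and reports the resulting range. The `BUILDS` layout and `total_range` helper are just for illustration.

```python
# Approximate part prices in USD, taken from the lists above.
# "fixed" = parts every variant needs; "case_options" = pick exactly one.
BUILDS = {
    "Budget": {"fixed": [200, 50, 10, 50], "case_options": [50]},
    "Mid range": {"fixed": [280, 60, 45, 40, 80], "case_options": [70, 50]},
    "High end": {"fixed": [350, 60, 45, 40, 80], "case_options": [70, 50]},
}


def total_range(build):
    """Return (cheapest, dearest) total, counting one case per build."""
    base = sum(build["fixed"])
    totals = [base + case for case in build["case_options"]]
    return min(totals), max(totals)


for name, build in BUILDS.items():
    low, high = total_range(build)
    if low == high:
        print(f"{name}: ~${low}")
    else:
        print(f"{name}: ~${low} to ~${high}")
```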



  • Thanks for the info.  Definitely seems like I need to upgrade.  Can't seem to source the Jetway boards all that easily up north here.  Do you have any thoughts on the Zotac barebones systems?  I'm thinking something like this one may be equivalent to the "budget" option you presented?  Only thing is that it has only 2 NICs, but I'd just continue to use my old dd-wrt as a switch.



  • Thanks for the info.  Definitely seems like I need to upgrade.

    Or doing everything together is usually the better way to get rid of these issues.

    • tuning
    • troubleshooting
    • learning to set up queues
    • a new and stronger pfSense box
    • a common switch like the Cisco SG200, SG300 or the DGS-1510-20 from D-Link.

    Can't seem to source the Jetway boards all that easily up north here.

    Where exactly do you live? Here in Germany, perhaps not so far away from you, everything
    is available for a fair price, depending on the shipping fee to you and the transportation
    fee to Germany beforehand.

    Supermicro Mainboard A1SRi-2758F ~379 € + shipping
    Supermicro Mainboard A1SRi-2558F ~285 € + shipping
    Shipping fee:
    Benelux countries = 25 €
    Scandinavian countries = 49 €

    Jetway JNF9HG-2930 ~235 € + shipping fee
    Jetway JNF9HB-2930 ~255 € + shipping fee
    Shipping fee:
    Bulgaria, Finland, Greece, Iceland, Norway, Romania 25 € - 35 € based on the weight of the parcel.

    M350 case ~67 €
    2 x 4 GB RAM DDR3L-1600 1.35 V ~50 €
    1 x mSATA 60 GB - 120 GB ~50 € - 120 €

    ~520 € all in

    Do you have any thoughts on the Zotac barebones systems?  I'm thinking something like this one may
    be equivalent to the "budget" option you presented?

    I hate the Zotac boxes as a firewall or router and so I would not go with them.

    Only thing is that it has only 2 NICs, but I'd just continue to use my old dd-wrt as a switch.

    VoIP phones mostly use two different kinds of QoS marking: the first is 802.1p (CoS) at layer 2 and the
    other is DiffServ (DSCP) at layer 3. If you get a "normal" common managed switch you will be able to keep
    the QoS handling intact inside your LAN.

    A Cisco SG200 or SG300 would do this job with ease, and they are also available as 10 port versions
    at a low price. Or, failing that, get an 8 port Netgear GS108Tv2 switch for ~80 €, which is better than the
    Linksys "switch" in the router.
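
    For anyone unsure what DSCP marking actually is: it is a 6-bit value carried in the upper bits of the IP header's TOS byte, and VoIP media is commonly marked EF (46). The Python sketch below shows the bit arithmetic and how an application could mark its own UDP socket. It is only an illustration; a hardware phone does this itself in firmware, `open_marked_udp_socket` is a made-up helper name, and whether `IP_TOS` is honoured depends on the operating system.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice media


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets carry the given DSCP.

    Assumes a platform where the IP_TOS socket option is supported.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


# Only the arithmetic is exercised here; no socket is opened.
print(f"DSCP {DSCP_EF} -> TOS byte {dscp_to_tos(DSCP_EF):#04x}")
```

    A switch or shaper that trusts DSCP can then classify on that value instead of guessing from ports.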



  • @SirJohnEh:

    Been running a PoC box for a few months now (currently running 2.2.6), mainly to replace a dd-wrt that wasn't capable of sustaining anywhere near 60mbps over an openvpn tunnel.

    Just threw an old box I had collecting dust at this for now (and no reason to build anything better, until now perhaps).  It's an AMD Athlon II X2 240 processor (dual core).  And the system's got 2GB RAM.

    For OpenVPN this has been perfect… sustains 60mbps and barely breaks a sweat (10-12% CPU @ sustained 60mbps).  I've always got two tunnels connected (sometimes 3) and round robin all LAN traffic thru these tunnels.  No problems.

    Late last week I decided to setup traffic shaping.  Just wanted to prioritize VoIP traffic on the tunnel interfaces (OVPN1, OVPN2, OVPN3) and there is no queue configured for the WAN (since everything is openvpn traffic anyways).

    Got it all setup and just forgot about it for the weekend until this morning when I noticed some intermittent internet drops.  Log on to the box and cpu is 100% on both cores (100% for each openvpn client process).  Finally tracked it down to the fact that I was saturating my uplink (10mbps) transferring files for work and the traffic shaping seems to be the reason.  With the shaping active, any sustained upstream data at or more than ~4mbps causes high CPU usage (70%+) and if that upstream traffic is sustained for any period of time anywhere near 10mbps then the CPU pegs at 100% (along with the problems like intermittent tunnel drops, etc.) until the uploading stops.  Remove the shaping and sustained uploads @ 10mbps is fine (just the 3-4% for openvpn encryption).

    So my question is, do I really need that much more CPU power for traffic shaping or am I doing something wrong?  If I do need that much more CPU, what should I be looking at?  Am I going to need something like an Intel i5 or something to shape this traffic?

    Thanks.

    Seems to me like it's gotta be something with the traffic shaping config.  That's not a world beating system, for sure, but it should easily be able to handle that kind of connection.  I had a similar problem once, where certain uploads would apparently induce heavy buffer bloat.  I'd watch the CPU, RAM, MBUF usage climb and climb.  What fixed it for me was enabling traffic shaping.  What's your shaper config look like?  And what do other system status look like when it's exhibiting the problem behavior?

    Matt



  • The shaping config was pretty basic, just went thru the multi wan wizard and created the (PRIQ) queues for each of the openvpn interfaces.  Each interface was just set to prioritize VoIP traffic and treat all other traffic at the same priority after that.
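
    For reference, the idea behind a PRIQ queue setup like the one described above is strict priority: the scheduler always sends from the highest-priority queue that has packets waiting. A minimal Python sketch of that behaviour follows; it is a conceptual model only, not the ALTQ code pfSense runs, and the packet names and priority numbers are made up for the example.

```python
import heapq
from itertools import count


class PriqScheduler:
    """Strict priority queuing: highest priority first, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, packet, priority):
        # heapq is a min-heap, so the priority is negated to pop the
        # largest priority first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), packet))

    def dequeue(self):
        """Return the next packet to send, or None if nothing is queued."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = PriqScheduler()
sched.enqueue("sftp-1", priority=1)
sched.enqueue("sftp-2", priority=1)
sched.enqueue("voip-1", priority=7)  # arrives later but jumps the queue
sched.enqueue("sftp-3", priority=1)

sent = [sched.dequeue() for _ in range(4)]
print(sent)
```

    The scheduling decision itself is cheap, which is one reason to look at the shaper configuration and where the queues are attached before blaming raw CPU power.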

    When uploading anything more than a couple mbps, the cpu spikes.  Anything about 10mbps sustained and it's 100% CPU.  The top output shows all openvpn client processes being used for uploading at 100%.  The CPU line in top is ~8% user, ~92% system.  I'm pretty sure that 8% is for the vpn encryption and the 92% in the system is the traffic shaper doing its thing.

