Thanks for the reply. I'll test the suggested changes now; I think we'll leave the NIC optimization for last. I'm using 2 x Intel i210 NICs, by the way; I forgot to mention that.
I should also add that I tried applying a traffic shaper directly to the WAN and LAN interfaces with CoDel as the QMA (instead of using limiters, by clicking "By Interface" on the traffic shaper page). With the bandwidth set to 960 Mbit/s I get a nice throughput of 880 Mbit/s (so it may be bottlenecked a bit). Unfortunately, this way all the traffic going out of the LAN interface is also limited and queued by CoDel, so it is not a good idea if you are using VLANs. I really wanted to make this work without doing that. If I apply the CoDel queuing to the WAN interface only, then only the traffic coming from the WAN (downloads) is limited and queued by CoDel. Hopefully this explanation is clear enough.