Intel Server gigabit NIC (em driver) and slow performance on pfSense
-
Hi
Capable Intel gigabit NICs support a feature called InterruptThrottleRate. Its purpose is to dynamically adjust the interrupt rate to moderate CPU usage when lots of packets are coming in.
But in a firewall distro such as pfSense, minimum latency is probably the most important factor, in which case I can simply turn off the InterruptThrottleRate algorithm in Linux (e1000e driver) by setting it to 0.
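For reference, this is roughly how the Linux side looks; a sketch, assuming the dual-port setup described below (one comma-separated value per port):

```shell
# Linux (e1000e): disable interrupt throttling on both ports of a
# dual-port NIC. Requires root; the NIC must not be in use while the
# module is reloaded.
modprobe -r e1000e
modprobe e1000e InterruptThrottleRate=0,0

# To make it persistent across reboots, put an options line in a file
# under /etc/modprobe.d/, e.g.:
#   options e1000e InterruptThrottleRate=0,0
```

Per the e1000e documentation, 0 turns moderation off entirely, while values of 1 and 3 select the dynamic/adaptive modes.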
As far as I understand, the FreeBSD em driver still operates with a static value (it was 8000 Hz, but I think it is 10000 Hz now), like Linux did before Intel developed the dynamic algorithm that lowers latency quite drastically.

Background info: http://www.intel.com/design/network/applnots/ap450.htm

My question is how this interrupt throttling is implemented in the em driver in pfSense. Is it turned off? Is it using the same dynamic algorithm as Linux/Windows, or is it still static?

This can affect throughput and latency quite a lot, so I would really like to know how this is implemented in pfSense, as I can find no way to set this value in pfSense/FreeBSD the way I can in Linux (e.g. "modprobe e1000e InterruptThrottleRate=0,0").

This is too bad, as this parameter is vital to getting the most out of the Intel NICs in different usage scenarios. Anyone have an idea? Since the Intel NICs get recommended all over the place for pfSense, this must have been thought of.
The reason for asking is that I generally see higher and more unstable latencies when running pfSense (0.1-0.21 ms, measured as round-trip to the LAN interface using ping) compared to the same Intel PRO/1000 PT dual-port server NIC in Debian Lenny with the e1000e driver (0.06-0.12 ms). Running a test with iperf, the CPU usage on pfSense is around 75% (with "em0 taskq" on CPU0 around 92% according to top -S) when maxing out the gigabit link in one direction, while Debian on the same computer uses between 20 and 30%.
Based on the CPU usage, pfSense is not using the InterruptThrottle mechanism the way I did in Debian (using the modern dynamic version), but the latency is higher and unstable, indicating that there may be something going on anyway.

pfSense hardware:
Intel Core 2 Duo E8400
4GB DDR2 800mhz RAM
Intel PRO/1000 PT dual port, PCIe x4.
-
Update:
Just to check, I installed Vyatta, and it uses a lot less CPU. It also has ping latencies in the 0.05-0.07 ms range.
Running the following test:
Server: iperf -s -i 3 -w 256k
Client: iperf -c 10.0.0.22 -i 3 -t 300 -w 256k -d (full duplex test)

This gave the following result on pfSense:
880 Mbit/s one way, and about 280 Mbit/s the other way, which is not very impressive.

On Vyatta with the exact same hardware and the exact same test (I just installed Vyatta over pfSense) I get:
914 Mbit/s one way, and about 825 Mbit/s the other way.
Keep in mind that the latencies are also consistently lower on Vyatta AND the CPU usage is lower.

This tells me that the em driver in pfSense is probably nowhere near its potential compared to the e1000e driver in Linux. Either that, or FreeBSD is relatively slow performance-wise compared to modern Linux (Vyatta is also based on Debian Lenny, and runs kernel 2.6.26-1).
Does anyone have any comments here? Since these are the recommended cards for pfSense, I really think this should be improved.
-
If what you're saying is correct - then yes, hopefully someone can do something (we use Intel cards).
Just a random shot in the dark: the issue you have described and tested is different from the "Use Device Polling" and "Disable Hardware Checksum Offload" options under System/Advanced in the WebGUI, right?
-
The FreeBSD em driver has some tunables that may do what you want. See the FreeBSD man page for the em driver: http://www.freebsd.org/cgi/man.cgi?query=em&apropos=0&sektion=0&manpath=FreeBSD+7.1-RELEASE&format=html
The delays would act to moderate the interrupt rate by allowing a single interrupt to service a number of frames.
Any values you set are read at system startup, so they can't be changed without a reboot. You can set the values in the file /boot/loader.conf.
It looks as if you can set values for all em NICs, but not specific values for individual NICs.
-
> If what you're saying is correct - then yes, hopefully someone can do something (we use Intel cards).
> Just a random shot in the dark: the issue you have described and tested is different from the "Use Device Polling" and "Disable Hardware Checksum Offload" options under System/Advanced in the WebGUI, right?
This test is "out-of-the-box" performance with just a basic setup using NAT and a minimal firewall, i.e. the default setup.
In other words, the default settings in pfSense are to not use device polling, and Hardware Checksum Offload is not disabled.

This is "unmodified" performance on both boxes.
-
Hmm, there seem to be lots of reports of very high CPU usage (and possibly lower performance) with the em driver in the FreeBSD 7.x versions.
Just search for +taskq +em on Google and you will find several people with the same problems.
(taskq is the process eating up almost all CPU cycles on one CPU core when loading the NIC.)
Hmm, this is too bad, as I would really like to use pfSense instead of some Linux-based firewall/router. Vyatta still seems a little unfinished, with its lacking PPPoE QoS implementation and a very buggy GUI, if you can call it that (so buggy that I disabled it and only used the CLI).

I don't really have the need for all this performance, but I hate knowing that there are avoidable "inefficiencies" in my box.
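For anyone wanting to reproduce the observation, these are the kinds of commands I mean, run on the pfSense/FreeBSD shell while the iperf load is active (a diagnostic sketch; the exact sysctl nodes exposed vary with the driver version):

```shell
# Per-device interrupt counts and rates — shows whether em is
# generating interrupts at a fixed cap or not.
vmstat -i

# -S shows system processes, -H shows kernel threads, so the
# "em0 taskq" thread and its per-core CPU usage are visible.
top -SH

# Whatever per-device knobs/stats the em driver exposes, if any.
sysctl -a | grep dev.em
```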
-
> The FreeBSD em driver has some tunables that may do what you want. See the FreeBSD man page for the em driver: http://www.freebsd.org/cgi/man.cgi?query=em&apropos=0&sektion=0&manpath=FreeBSD+7.1-RELEASE&format=html
> The delays would act to moderate the interrupt rate by allowing a single interrupt to service a number of frames.
> Any values you set are read at system startup, so they can't be changed without a reboot. You can set the values in the file /boot/loader.conf.
> It looks as if you can set values for all em NICs, but not specific values for individual NICs.
Reading through that manual page confirms my suspicion that the em driver is still locked to static values, like the old Linux behaviour, and does not use the modern dynamic algorithms (I couldn't see them mentioned in the manual). But it wouldn't matter anyway, since interrupt moderation seems to be turned off by default, which is probably the best option for pfSense anyway. As the Linux case shows, though, the dynamic version really gives the best of both worlds.
But if it's not set, which may partially explain some of the excessive CPU usage compared to Linux/Windows 7 RC (which I also tested the card with), then something else must be causing the somewhat increased latency. The low duplex bandwidth is also strange, but it may be caused by the excessive CPU usage.
-
The igb driver appears to support "Adaptive Interrupt Moderation", but in FreeBSD that covers only the 82575 and 82576 chips. If your hardware (82571/82572) supports AIM, as it evidently does since Linux and Windows do it, maybe it would be possible to modify the igb driver to handle the 82571/82572 chips as well.
Being a FreeBSD newbie, I haven't a clue what that entails, but I imagine it's at least a matter of adding PCI IDs to igb and removing them from em …
regards, ... Charlie