pfSense performance with Gb ONT
-
Hello,
I've been using pfSense since 2008 and I'm very happy with it.
I recently moved my Internet connection to a 1 Gbps GPON fiber line, but throughput won't go above roughly 550 Mbps.
At first I assumed the problem was the connection itself (telcos usually oversubscribe, squeezing more customers onto a single port at the edge of the network), so I didn't worry too much (550/100 Mbps is a good connection in any case...).
Then I found this:
https://teklager.se/en/knowledge-base/choosing-router-operating-system-pfsense-vs-opnsense-vs-openwrt/
Is what they're saying about APUs with pfSense true?
"I have APU router and want to have a full gigabit internet speed. If you have a full gigabit internet from your ISP, congratulations! You should install OpenWRT or IPFire, both of these Operating systems will perform at full gigabit on APU2, APU3 and APU4 because they are able to utilize all 4 CPU cores. pfSense and OPNSense use only 1 CPU core for routing, and are able to achieve between 400-600Mbit/s. If you want to run OPNSense of pfSense with full gigabit, you will need to upgrade to one of the TLSense routers."
(I'm quoting the statement here for future reference, in case the page disappears.)
My current pfSense firewall runs on a SuperMicro C2358 (dual-core Intel Atom C2358 @ 1.74 GHz, 8 GiB RAM) with no particular packages installed and no heavy daemons running (it's a simple home installation that sits at 3 to 5% CPU most of the time).
Thank you in advance for your attention.
-
The CPU used in the APU is a lot less powerful than that, and that same website updated its results after a few tweaks to show that it can pass 1Gbps: https://teklager.se/en/knowledge-base/apu2-1-gigabit-throughput-pfsense/
In general though Linux based routers will often pass more throughput than FreeBSD based ones.
As they say in those posts, most newer NICs/drivers provide multiple queues that can be serviced by multiple CPU cores.
There was a time when the pf process itself was single-threaded and was often the limit on system throughput, but that is no longer the case. However, you will still find that pf cannot use all available cores equally.
I would certainly expect that C2358 system to pass Gigabit if it isn't loaded with packages etc.
Steve
-
On a C2358 box I have here, Gigabit is on the edge of what it will do, but it will do it. This is a pretty close to default 2.5 install:
Passing Gigabit line rate:
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-30.03  sec  1.64 GBytes   470 Mbits/sec   42   sender
[  5]   0.00-30.03  sec  0.00 Bytes   0.00 bits/sec         receiver
[  7]   0.00-30.03  sec  1.64 GBytes   470 Mbits/sec    0   sender
[  7]   0.00-30.03  sec  0.00 Bytes   0.00 bits/sec         receiver
[SUM]   0.00-30.03  sec  3.29 GBytes   940 Mbits/sec   42   sender
[SUM]   0.00-30.03  sec  0.00 Bytes   0.00 bits/sec         receiver
CPU usage:
last pid: 88725;  load averages:  5.19,  4.20,  2.42    up 0+13:56:37  12:16:19
162 threads:   18 running, 130 sleeping, 14 waiting
CPU:  0.1% user,  0.0% nice, 99.9% system,  0.0% interrupt,  0.0% idle
Mem: 44M Active, 289M Inact, 557M Wired, 391M Buf, 3016M Free
Swap:

  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
    0 root      -76    -      0   448K -       0   2:25 100.00% [kernel{if_io_tqg_0}]
    0 root      -76    -      0   448K CPU1    1   2:21  99.77% [kernel{if_io_tqg_1}]
    0 root      -92    -      0   448K -       0   2:30   0.36% [kernel{dummynet}]
88725 root       42   20  9804K  5208K RUN     1   0:00   0.11% /usr/local/bin/rrdtool update /var/db/rrd/wan-packets.rrd N
 7560 root       20    0    17M  3312K RUN     1   0:51   0.11% /usr/local/sbin/pcscd{pcscd}
64108 root       20    0    13M  4008K CPU0    0   0:01   0.11% top -aSH
-
Stephen,
do you think the tweaks in the article you linked are worth applying in any case? Can they be applied through the "System Tunables" page, or do I have to edit loader.conf.local over ssh?
I had already increased MBUF in the past, but I hadn't applied all of the others (some of them are specific to the igb driver, i.e. Intel cards; I wonder what the recommendations are for other NICs).
In a quick test I ran today (with the tweaks applied via "System Tunables" and the machine rebooted), the box reached 70% CPU usage during the 550/100 Mbps test (simple speed tests run in parallel). I wonder whether it could sustain 1 Gbps, given that it is already at 70%...
Thank you
-
You need to use top -aSH to see how that load is using the CPU cores. One could be pegged at 100% already.
Steve
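To make the "one core could be pegged" check concrete, here is a minimal Python sketch that scans saved `top -aSH` output for threads at or above a chosen WCPU threshold. The helper name `busy_threads`, the 90% threshold, and the regular expression are illustrative assumptions; the sample lines are copied from the `top` output posted earlier in this thread.

```python
import re

# Sample thread lines in the format printed by `top -aSH` earlier in this thread.
SAMPLE = """\
    0 root  -76  -  0  448K  -     0  2:25 100.00% [kernel{if_io_tqg_0}]
    0 root  -76  -  0  448K  CPU1  1  2:21  99.77% [kernel{if_io_tqg_1}]
    0 root  -92  -  0  448K  -     0  2:30   0.36% [kernel{dummynet}]
"""

# Assumption: the WCPU column is the first whitespace-preceded "NN.NN%" token
# on a thread line, and everything after it is the command.
WCPU_RE = re.compile(r"\s(\d+(?:\.\d+)?)%\s+(.+)$")


def busy_threads(top_output, threshold=90.0):
    """Return (command, wcpu) pairs for threads at or above `threshold` percent."""
    hits = []
    for line in top_output.splitlines():
        match = WCPU_RE.search(line)
        if match and float(match.group(1)) >= threshold:
            hits.append((match.group(2).strip(), float(match.group(1))))
    return hits


for command, wcpu in busy_threads(SAMPLE):
    print(f"{command} is at {wcpu:.2f}% of one core")
```

With the sample above, both if_io_tqg threads are reported, which matches the "on the edge" observation for that box. A low overall CPU percentage can still hide a single saturated thread, which is why the per-thread view matters.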
-
I have repeated the test using top as you suggested. As expected, I can see 2 queues for each NIC (e.g. {irq263: igb2:que 0} and {irq264: igb2:que 1}), since the CPU is dual core. Because of this, I am assuming the "System Tunables" tweaks have been applied and the system is correctly using multiple queues per NIC. As a side note, I have also enabled the various hardware offloads for the NICs, using the appropriate checkboxes in the pfSense webGUI.
During the test the load seemed to be spread almost equally between the two cores, with about 30% idle per core at peak load (550 Mbps downlink, 100 Mbps uplink), so the 70% cumulative maximum CPU usage under load looks confirmed.
I don't know if this is enough to say that the box won't be able to keep up with a hypothetical 1 Gbps throughput, since I don't know whether the additional load would add linearly to the current one or whether the curve is closer to logarithmic...
Do you have any advice about how to perform other, maybe more meaningful, tests or how to read these results?
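As a rough illustration of the "linear" case raised above, the following Python sketch extrapolates the measured figures (550 Mbps down, 100 Mbps up, about 70% CPU) to the throughput at which the CPU would reach 100%. The linear-scaling model and the function name `linear_ceiling_mbps` are assumptions for illustration only; real behavior can differ, especially if a single thread saturates before the overall CPU figure does.

```python
def linear_ceiling_mbps(measured_mbps, cpu_fraction):
    """Throughput at which CPU would hit 100% if load scaled linearly."""
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in (0, 1]")
    return measured_mbps / cpu_fraction


# Figures from the post above: 550 Mbps downlink at roughly 70% CPU.
downlink_only = linear_ceiling_mbps(550, 0.70)
# Same CPU figure, but counting the simultaneous 100 Mbps uplink as well.
combined = linear_ceiling_mbps(550 + 100, 0.70)

print(round(downlink_only), round(combined))  # prints: 786 929
```

Under this simple model both estimates stay below 1000 Mbps, so a purely linear relationship would not leave enough headroom for full Gigabit. It is only an estimate; a direct throughput measurement is more reliable than extrapolating from CPU usage.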
-
I would do a test between an iperf3 server on WAN and a client on LAN to get an idea of the maximum throughput. That's what I was doing above to see 940Mbps on the box I have.
Steve
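For anyone who wants to record such a test, iperf3 can emit its results as JSON (`--json`), for example with `iperf3 -s` on the WAN-side host and `iperf3 -c <server> -P 2 -t 30 --json` on the LAN client, where `<server>` is a placeholder. The Python sketch below reads the summary fields from that output. The function name `summarize_iperf3` is hypothetical, and the sample report is a trimmed, made-up structure containing only the fields the function reads; it is not a real measurement.

```python
import json


def summarize_iperf3(report_json):
    """Return sender/receiver throughput in Mbit/s from `iperf3 --json` output (TCP test)."""
    end = json.loads(report_json)["end"]
    return {
        "sent_mbps": end["sum_sent"]["bits_per_second"] / 1e6,
        "received_mbps": end["sum_received"]["bits_per_second"] / 1e6,
    }


# Illustrative, made-up report with only the fields used above (not measured data).
SAMPLE_REPORT = json.dumps({
    "end": {
        "sum_sent": {"bits_per_second": 940_000_000.0},
        "sum_received": {"bits_per_second": 938_000_000.0},
    }
})

print(summarize_iperf3(SAMPLE_REPORT))
```

Comparing the sender and receiver figures helps spot a test where one side reported nothing, and running the test through the firewall rather than on it keeps the measurement tool from competing with packet forwarding for CPU.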
-
Currently dealing with something somewhat similar. Have 1Gbps fiber that I'm feeding directly to pfsense.
Using speedtest, I'm hovering around 500Mbps consistently.
But using the speedtest-cli on the pfsense CLI, I'm consistently getting around 800Mbps. Curious to see if you are seeing this as well.
I also discovered that my previous lower speeds were being caused by my traffic shaper. I removed the shaper and the 'speedtest' results immediately jumped (though not as far as I was expecting). As far as I know, the queues in the shaper are set up correctly.
-
Hello ck42,
I can't answer your question about the difference between running the throughput test directly on the pfSense machine and running it from a separate machine on one of the pfSense-controlled networks.
Anyhow, I performed several tests in this second scenario during these COVID times (...) and these are my findings.
Based on facts alone, without further extrapolation, the C2358 Atom with pfSense is not able to sustain Gigabit rates, at least over a PPPoE connection.
This conclusion is based on the following evidence: after quickly swapping the C2358 box for the modem supplied by the provider (a Technicolor running embedded Linux), the line's throughput rose instantly from an average of 550 Mbps to an average of 750/800 Mbps, with all other factors unchanged (same cables/connections, source/target servers, power supply, etc.).
After swapping back to the pfSense box during the same test session, throughput went back to ~550 Mbps.
All of this with CPU usage staying at the same ~70% during the test sessions, as reported in previous posts.
The same test session was repeated many times during the same day and on different days, to rule out random fluctuations in line performance or other issues, with similar results each time (at least a 200 Mbps drop with the C2358).
To me these results lead to two conclusions:
- obviously the C2358 cannot take full advantage of an optical (GPON) 1 Gbps PPPoE-backed connection when used with pfSense;
- most probably there is some sort of deep networking inefficiency in the BSD kernel and/or in pfSense as a "distro" (pf? PPPoEd? Unfortunately I don't have the skills to verify this), since the CPU load of the box stayed at 70% instead of saturating.
To verify the second point beyond doubt I would have to install something different on the machine (Windows? Some Linux flavor?) and run the same tests but, looking at the results already obtained, I personally don't have many doubts.
I would like to hear any comments on all of this.
Thank you very much for your attention
-
Yup, it's because it's PPPoE. That is currently single-threaded in pfSense, so you will only ever see one queue on the NIC being used.
You can get a significant improvement there by setting net.isr.dispatch=deferred.
See: https://docs.netgate.com/pfsense/en/latest/hardware/tuning-and-troubleshooting-network-cards.html#pppoe-with-multi-queue-nics
Steve
-
How do I know if my NIC is a multi-queue NIC? (Using the onboard NIC)
-
Most are. It will show in the boot log for most drivers:
Jun 12 14:35:20 kernel igb1: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0xe0a0-0xe0bf mem 0xdfe60000-0xdfe7ffff,0xdff2c000-0xdff2ffff irq 20 at device 20.0 on pci0
Jun 12 14:35:20 kernel igb1: Using 1024 TX descriptors and 1024 RX descriptors
Jun 12 14:35:20 kernel igb1: Using 2 RX queues 2 TX queues
Jun 12 14:35:20 kernel igb1: Using MSI-X interrupts with 3 vectors
Jun 12 14:35:20 kernel igb1: Ethernet address: 00:10:f3:4e:1f:67
Jun 12 14:35:20 kernel igb1: netmap queues/slots: TX 2/1024, RX 2/1024
Steve
-
Here's what I have for mine. Can't tell if it's multi-queued or not...
em1: Using MSIX interrupts with 3 vectors
em1: Ethernet address: 54:be:f7:38:b5:84
em1: netmap queues/slots: TX 1/1024, RX 1/1024
-
ck42, it looks like it is single-queued
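The two boot-log excerpts above can be read mechanically. The Python sketch below pulls the queue counts out of log lines in the formats shown in this thread ("Using N RX queues M TX queues" and "netmap queues/slots: TX n/…, RX n/…"). The function name `queue_counts` and the regular expressions are assumptions built only from those two excerpts; other drivers may log differently, and treating the netmap line as an indicator of the NIC's queue count follows the reading given in this thread rather than any driver documentation.

```python
import re

# Patterns derived from the log excerpts quoted in this thread (assumed formats).
QUEUES_RE = re.compile(r"(\w+): Using (\d+) RX queues (\d+) TX queues")
NETMAP_RE = re.compile(r"(\w+): netmap queues/slots: TX (\d+)/\d+,? RX (\d+)/\d+")


def queue_counts(boot_log):
    """Map each NIC name to (rx_queues, tx_queues) found in the log text."""
    counts = {}
    # Read the netmap line first, then let the explicit "Using N RX queues"
    # line override it when both are present for the same NIC.
    for match in NETMAP_RE.finditer(boot_log):
        nic, tx, rx = match.group(1), int(match.group(2)), int(match.group(3))
        counts[nic] = (rx, tx)
    for match in QUEUES_RE.finditer(boot_log):
        nic, rx, tx = match.group(1), int(match.group(2)), int(match.group(3))
        counts[nic] = (rx, tx)
    return counts


LOG = """\
igb1: Using 2 RX queues 2 TX queues
igb1: netmap queues/slots: TX 2/1024, RX 2/1024
em1: Using MSIX interrupts with 3 vectors
em1: netmap queues/slots: TX 1/1024, RX 1/1024
"""

for nic, (rx, tx) in sorted(queue_counts(LOG).items()):
    label = "multi-queue" if max(rx, tx) > 1 else "single-queue"
    print(nic, rx, tx, label)
```

For the lines quoted in this thread, this reports igb1 with 2 RX and 2 TX queues and em1 with 1 of each, consistent with the single-queue reading above.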