PfSense Dual 10GbE ESXi 6U2 Slow
-
Dear List,
We are using a server with two 10 GbE optical NICs connected to a vSwitch 'EXTERNAL/WAN', where pfSense (using vmxnet3) is the only other connection. So, in theory, pfSense should be able to reach 10-20 Gbps. A similarly connected server (non-VM, non-pfSense) has achieved 800 MB/s without jumbo frames, so we expect to reach the same. With pfSense, however, we only get a maximum of 275 MB/s, and the clients behind pfSense get 250 MB/s.
After unchecking the following under System -> Advanced -> Networking (i.e., enabling the offloads):
- Disable hardware TCP segmentation offload
- Disable hardware large receive offload
pfSense itself was able to reach 600 MB/s; however, throughput for the clients behind it dropped to only 80 KB/s. So not a good idea.
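(For reference, a shell equivalent of those two checkboxes would be roughly the following; vmx0 as the WAN interface name is an assumption, and runtime changes made with ifconfig do not survive a reboot, unlike the GUI setting:)
# enable TSO/LRO on the WAN interface for a quick A/B test
ifconfig vmx0 tso lro
# and to turn them back off
ifconfig vmx0 -tso -lro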
However, we know the hardware can do 600 MB/s or better, so what should we do to get 600+ MB/s both on the firewall -and- on the clients behind it?
TEST: curl http://lg.core-backbone.com/files/10000MB.test > /dev/null
Note: we have currently activated 8 (out of 56) cores, and don't think CPU is an issue.
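(If disk or HTTP overhead on the test server is a concern, a memory-to-memory test such as iperf3 would isolate pure network throughput; this assumes iperf3 is available on both ends, and the host name below is only a placeholder:)
iperf3 -c test-server.example.com -P 4 -t 30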
Thanks for any help in getting the performance we are looking for.
-
This is a related post (not a VM setup but still about speeds and such). You should take a read and try out what they did and see if it helps.
https://forum.pfsense.org/index.php?topic=113011.0
-
Please note that we are talking about megaBYTES per second. Changing the offloading raised it from 300 to 600 MB/s, but then the clients behind it suffered. Setting:
hw.pci.enable_msix=0
hw.pci.enable_msi=0
did not help.
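(For anyone trying the same thing: loader tunables like these normally go into /boot/loader.conf.local on pfSense and only take effect after a reboot:)
hw.pci.enable_msix=0
hw.pci.enable_msi=0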
Interestingly, a 'top -SH' reveals
last pid: 9532; load averages: 0.51, 0.20, 0.11 up 0+04:57:10 19:03:27
159 processes: 11 running, 118 sleeping, 30 waiting
CPU: 0.8% user, 0.0% nice, 5.6% system, 6.8% interrupt, 86.7% idle
Mem: 20M Active, 121M Inact, 225M Wired, 57M Buf, 7557M Free
Swap: 2047M Total, 2047M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU4 4 296:40 100.00% idle{idle: cpu4}
11 root 155 ki31 0K 128K CPU7 7 296:37 100.00% idle{idle: cpu7}
11 root 155 ki31 0K 128K CPU1 1 295:31 100.00% idle{idle: cpu1}
11 root 155 ki31 0K 128K RUN 3 296:35 96.97% idle{idle: cpu3}
11 root 155 ki31 0K 128K CPU5 5 296:32 93.99% idle{idle: cpu5}
11 root 155 ki31 0K 128K RUN 6 296:27 89.99% idle{idle: cpu6}
11 root 155 ki31 0K 128K CPU2 2 296:37 86.96% idle{idle: cpu2}
12 root -92 - 0K 512K CPU0 0 3:55 55.96% intr{irq258: vmx0}
86328 root 52 0 56664K 6828K select 3 0:11 48.97% curl{curl}
11 root 155 ki31 0K 128K RUN 0 292:34 48.00% idle{idle: cpu0}
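What stands out is that the vmx0 interrupt load shown is all on a single IRQ on CPU0. One way to check whether the driver is spreading load over multiple queues (vmstat -i is standard FreeBSD; whether dev.vmx.0 exposes per-queue counters on this driver/ESXi combination is an assumption):
vmstat -i | grep vmx
sysctl dev.vmx.0
-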
Any Ideas?
-
bump
-
As far as I know, the FreeBSD kernel will not be able to achieve such throughput at this time (especially not virtualized).
There is a big difference between sending/receiving that amount of traffic and routing/firewalling that amount of throughput. Read up on netmap-fwd; that will address this:
https://blog.pfsense.org/?p=1866
-
Is this something we can install on pfSense?
We consider our hardware unlimited for such a task. We could go up to 56 cores if needed.
On the netmap-fwd GitHub page, we saw values of only 600 Mbps; we are already at 300 MB/s (2400 Mbps) and are looking for 700+ MB/s in order to get closer to the full 10 GbE.
What do the pfSense experts have to say?
-
600 Mbps on a quad-core Atom.
-
Sure; we have faster HW, but how do we make it work?
-
On the netmap-fwd GitHub page, we saw values of only 600 Mbps; we are already at 300 MB/s (2400 Mbps) and are looking for 700+ MB/s in order to get closer to the full 10 GbE.
You're looking at the completely wrong number. Mbps/Gbps means nothing at all, pps is what matters. That's 600 Mbps at minimum size packets, over 1 Mpps. That'd be upwards of 10 Gbps at the average packet size of typical Internet traffic.
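Rough arithmetic behind that (ignoring Ethernet framing overhead; the ~1000-byte average packet size is an assumption):
600 Mbit/s / (64 B x 8 bit/B) ≈ 1.17 Mpps
1.17 Mpps x 1000 B x 8 bit/B ≈ 9.4 Gbit/s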
Is this something we can install on pfSense?
Not at this time.
You might be able to squeeze a bit more than what you're currently getting through ESX, but I think the best I've seen or heard of at large packet sizes inside ESX is roughly 4 Gbps at 1500 MTU.
-
Hi Chris,
Thanks for your response. I guess Mbps is not always the same. ;)
Could you provide some instructions, or would we have to engage your services to obtain at least the 4 Gbps? This is important to us.
Thanks so kindly,
Alfredo.
-
Instructions please.
-
Check the "go faster" box.
There are no instructions, nor anything I'm aware of, that would change what you're getting.
-
Interestingly, a 'top -SH' reveals
last pid: 9532; load averages: 0.51, 0.20, 0.11 up 0+04:57:10 19:03:27
159 processes: 11 running, 118 sleeping, 30 waiting
CPU: 0.8% user, 0.0% nice, 5.6% system, 6.8% interrupt, 86.7% idle
Mem: 20M Active, 121M Inact, 225M Wired, 57M Buf, 7557M Free
Swap: 2047M Total, 2047M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 128K CPU4 4 296:40 100.00% idle{idle: cpu4}
11 root 155 ki31 0K 128K CPU7 7 296:37 100.00% idle{idle: cpu7}
11 root 155 ki31 0K 128K CPU1 1 295:31 100.00% idle{idle: cpu1}
11 root 155 ki31 0K 128K RUN 3 296:35 96.97% idle{idle: cpu3}
11 root 155 ki31 0K 128K CPU5 5 296:32 93.99% idle{idle: cpu5}
11 root 155 ki31 0K 128K RUN 6 296:27 89.99% idle{idle: cpu6}
11 root 155 ki31 0K 128K CPU2 2 296:37 86.96% idle{idle: cpu2}
12 root -92 - 0K 512K CPU0 0 3:55 55.96% intr{irq258: vmx0}
86328 root 52 0 56664K 6828K select 3 0:11 48.97% curl{curl}
11 root 155 ki31 0K 128K RUN 0 292:34 48.00% idle{idle: cpu0}
Try with only 1 vCPU. Just try.
Does playing with "Disable hardware checksum offload" change anything?
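(For a quick console test, that checkbox corresponds roughly to toggling these flags; vmx0 as the interface is an assumption, and unlike the GUI setting this does not persist across reboots:)
ifconfig vmx0 -rxcsum -txcsum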