slow pfsense IPSec performance
-
@stephenw10 thanks for your very professional support.
Yes, outside the tunnel I was able to reach higher rates (using the pfSense integrated iperf tool to check WAN-to-WAN connectivity):
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 101 MBytes 844 Mbits/sec 38 226 KBytes
[ 5] 1.00-2.00 sec 107 MBytes 898 Mbits/sec 17 174 KBytes
[ 5] 2.00-3.00 sec 104 MBytes 870 Mbits/sec 42 227 KBytes
[ 5] 3.00-4.00 sec 103 MBytes 868 Mbits/sec 31 313 KBytes
[ 5] 4.00-5.00 sec 105 MBytes 879 Mbits/sec 14 203 KBytes
[ 5] 5.00-6.00 sec 102 MBytes 854 Mbits/sec 36 254 KBytes
[ 5] 6.00-7.00 sec 104 MBytes 875 Mbits/sec 15 217 KBytes
[ 5] 7.00-8.00 sec 105 MBytes 879 Mbits/sec 50 143 KBytes
[ 5] 8.00-9.00 sec 102 MBytes 856 Mbits/sec 30 227 KBytes
[ 5] 9.00-10.00 sec 107 MBytes 898 Mbits/sec 19 271 KBytes
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.02 GBytes 872 Mbits/sec 292 sender
[ 5] 0.00-10.09 sec 1.01 GBytes 864 Mbits/sec receiver
iperf Done.
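For reference, the output above is iperf3 format, so the equivalent command line would be roughly the following (the remote WAN address is a placeholder):
iperf3 -s                           # on the remote pfSense (server side)
iperf3 -c <remote WAN IP> -t 10     # on the local pfSense (client side), towards the remote WAN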
This is the latency between the two WAN interfaces of pfsense instances:
PING DEST_IP from SRC_IP: 56 data bytes
64 bytes from DEST_IP: icmp_seq=0 ttl=57 time=1.069 ms
64 bytes from DEST_IP: icmp_seq=1 ttl=57 time=1.194 ms
64 bytes from DEST_IP: icmp_seq=2 ttl=57 time=1.046 ms
--- DEST_IP ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.046/1.103/1.194/0.065 ms
IPsec is a fantastic thing, but I'm struggling to make it work as expected with acceptable performance.
Both pfSense endpoints are connected to the GARR research network with a 1Gb link.
-
@cool_corona Should I set these values on the OPT1 and WAN interfaces, disabling MSS clamping for VPN traffic in Sys > Adv > Firewall & NAT?
-
@mauro-tridici I’m guessing the reason is virtualisation.
Virtualisation carries a hefty bandwidth penalty when it comes to things like interrupt-based packet handling in a guest VM. This is because hypervisor context switching takes place several times between the hypervisor and the guest VM for every packet interrupt that needs to be handled.
If the traffic does not scale properly across CPU cores and is "serialised" on a single core, throughput will be very low, yet there will be no real CPU utilisation, because all the time is spent waiting on context switches for interrupt handling rather than actually processing the traffic payload.
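A quick way to check whether that is what is happening (this assumes shell access to the pfSense VM; these are standard FreeBSD tools, nothing pfSense-specific):
vmstat -i     # per-interrupt counters; watch for one NIC queue taking nearly all interrupts
top -HSP      # per-CPU / per-thread view while iperf runs; one pegged core with the rest idle points to serialised handling
netstat -Q    # netisr dispatch statistics, including queue drops
-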
Setting MSS clamping there will apply it to all traffic that matches a defined P2 subnet. On any interface. That should be sufficient.
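If you want to confirm the clamp is actually being applied you can watch the TCP SYNs on an internal interface while opening a connection across the tunnel; the advertised MSS shows up in the SYN options. Something like this, with the interface name just as an example:
tcpdump -ni igb1 'tcp[tcpflags] & (tcp-syn) != 0'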
1ms between the end points seems unexpectedly low! Is this a test setup with local VMs?
Steve
-
@mauro-tridici Yes. Report back.
-
@keyser Just not true. The performance penalty is minimal compared to bare metal.
-
@cool_corona said in slow pfsense IPSec performance:
@keyser Just not true. The performance penalty is minimal compared to bare metal.
Ehh no, it depends very much on what hardware you have (and its level of virtualisation assist). If it's a reasonably new x86-64 CPU above Atom level, then yes, the cost of virtualisation dwindles. But if we are talking Atom-level Jxxxx series CPUs, then those speeds are very much at the limit of the hardware when doing virtualisation.
-
@keyser Nobody uses Atoms for Virtualization.....
-
@stephenw10 thank you for the clarification.
I will try to describe the scenario:
- we have two different VMware ESXi hypervisors (A and B) in two different sites
- the pfSense endpoint PF_A has been deployed on ESXi_A
- the pfSense endpoint PF_B has been deployed on ESXi_B
- PF_A and PF_B are geographically connected using a 1Gb network link, and the WAN-to-WAN iperf test is good (about 900Mbps)
- HOST_1 is a physical server on a LAN behind PF_A
- HOST_2 is a test virtual machine deployed on ESXi_B on a LAN behind PF_B (so HOST_2 and PF_B are on the same hypervisor and are "virtually" connected; there is no LAN cable between HOST_2 and PF_B since they share the hypervisor)
When I try to run iperf2 between HOST_1 and HOST_2 I obtain only 240/290 Mbps.
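For completeness, a minimal way to reproduce that test (addresses are placeholders; the parallel-stream variant just checks whether a single TCP stream is the limit):
iperf -s                              # on HOST_2
iperf -c <HOST_2 LAN IP> -t 30        # on HOST_1, single stream
iperf -c <HOST_2 LAN IP> -t 30 -P 4   # on HOST_1, 4 parallel streams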
I hope it helps.
-
@cool_corona unfortunately nothing changed, thank you.
-
@mauro-tridici Your ISP is not throttling VPNs??
-
Mmm, 1ms between them is like being in the same data center, or at least geographically very close. What is the route between them?
When you ran the test outside the tunnel, how was that done? Still between the two ESXi hosts?
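A traceroute between the two WANs would show the path, either from Diagnostics > Traceroute in the GUI or from a shell (the target is a placeholder):
traceroute -n <PF_B WAN IP>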
-
@cool_corona said in slow pfsense IPSec performance:
Nobody uses Atoms for Virtualization.....
Ha. Assume nothing!
-
@cool_corona my ISP is not throttling the VPNs; we already use several VPNs (host-to-LAN) without any problem (iperf test bitrate is optimal). We are experiencing this low bitrate only with the IPsec LAN-to-LAN VPN.
-
@stephenw10 we have 2 data centres in the same city; they are interconnected with a dedicated 1Gb link on the GARR network.
The test outside the tunnel is between the WAN interfaces of the two pfSense instances, that is, between PF_A and PF_B.
-
@stephenw10 our hypervisors have "2 x Intel(R) Xeon(R) Gold 5218 CPU - 32 cores @ 2.30GHz"
-
Hmm, that should be plenty fast enough. What happens if you test across the tunnel between the two pfSense instances directly? So set the source IP on the client to be in the P2.
-
@mauro-tridici said in slow pfsense IPSec performance:
@stephenw10 our hypervisors have "2 x Intel(R) Xeon(R) Gold 5218 CPU - 32 cores @ 2.30GHz"
Then it’s definitely not the hardware that is limiting the transfer speed. Those CPUs/platforms have loads of power for this use case.
-
@stephenw10 Sorry, I didn't understand which test I should do.
Should I do an iperf or a ping test between PF_A[opt1] and PF_B[opt1]?
PF_A has
WAN IP: xxxxxxxx
LAN IP (for management only): 192.168.240.11
OPT1 IP: 192.168.202.1
PF_B has
WAN IP: yyyyyyyy
LAN IP (for management only): 192.168.220.123
OPT1 IP: 192.168.201.1
-
Normally you run the iperf3 server on one pfSense box, then run the iperf3 client on the other one and give it the WAN address of the first one to connect to.
But to test over the VPN the traffic has to match the defined P2 policy, so at the client end you need to set the bind address to, say, the LAN IP and then point it at the LAN IP of the server end.
Then you are testing directly across the tunnel without going through any internal interfaces that might be throttling.
So run
iperf3 -s
on PF_A as normal.
Then on PF_B run
iperf3 -B 192.168.220.123 -c 192.168.240.11
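While that runs you can confirm the traffic is actually crossing the tunnel by watching the IPsec interface on either box, for example:
tcpdump -ni enc0 host 192.168.240.11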