• Hi,

    I am having issues with CARP performance in my ESXi/VMware environment.

    I have a cluster of 4 ESXi hosts on v6.7 with a DvSwitch. Each host is connected with 4 x 10Gb/s links (1 Management, 1 vMotion, 2 VM Network). On 2 of them, I run 2 pfSense VMs configured as an HA cluster. Each pfSense has 2 physical interfaces: 1 connected to the WAN network and 1 to a "Trunk" network; both are VMXNET3. I have enabled promiscuous mode, MAC address changes and forged transmits on the DvSwitch ports allocated to them. I have 5 "VLAN interfaces" configured in addition to the WAN interface. I configured CARP on each network interface so that the VMs don't lose their connection. The cluster is working fine: if the master reboots or loses an interface, the slave takes over the VIP and the VMs don't even lose their connection.

    My problem is that throughput is poor on the VIPs, while it is great when going directly to the firewall's interface IP. For example, I put 3 VMs on the same ESXi host so that the physical network is not a factor:

    • A = VM on the WAN network, with IP 172.17.254.247
    • B = VM on the LAN network, with IP 172.19.18.33
    • P = pfSense with IPs 172.17.254.42 and 172.19.18.251, and CARP VIPs 172.17.254.43 and 172.19.18.254 (obviously there are more interfaces, but they are not useful in this example)

    If I run iperf3 on A against P's WAN interface, I get these results:

    Connecting to host 172.17.254.42, port 5250
    [  4] local 172.17.254.247 port 34838 connected to 172.17.254.42 port 5250
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec  2.11 GBytes  18.2 Gbits/sec    0    860 KBytes
    [  4]   1.00-2.00   sec  1.56 GBytes  13.4 Gbits/sec   45   1.05 MBytes
    [  4]   2.00-3.00   sec  2.45 GBytes  21.1 Gbits/sec    0   1.17 MBytes
    [  4]   3.00-4.00   sec  2.23 GBytes  19.2 Gbits/sec    0   1.26 MBytes
    [  4]   4.00-5.00   sec  1.41 GBytes  12.1 Gbits/sec    5    966 KBytes
    [  4]   5.00-6.00   sec   788 MBytes  6.61 Gbits/sec    0   1.01 MBytes
    [  4]   6.00-7.00   sec  2.59 GBytes  22.2 Gbits/sec    0   1.05 MBytes
    [  4]   7.00-8.00   sec  1.98 GBytes  17.0 Gbits/sec    0   1.08 MBytes
    [  4]   8.00-9.00   sec  2.23 GBytes  19.2 Gbits/sec    0   1.15 MBytes
    [  4]   9.00-10.00  sec  2.22 GBytes  19.1 Gbits/sec    0   1.24 MBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec  19.6 GBytes  16.8 Gbits/sec   50             sender
    [  4]   0.00-10.00  sec  19.6 GBytes  16.8 Gbits/sec                  receiver
    
    iperf Done.
    

    If I run iperf3 on A against P's WAN VIP, I get these results:

    Connecting to host 172.17.254.43, port 5250
    [  4] local 172.17.254.247 port 57642 connected to 172.17.254.43 port 5250
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec  1.04 GBytes  8.89 Gbits/sec    0    479 KBytes
    [  4]   1.00-2.00   sec  1.08 GBytes  9.26 Gbits/sec    0   1024 KBytes
    [  4]   2.00-3.00   sec  1.09 GBytes  9.36 Gbits/sec    0   1.06 MBytes
    [  4]   3.00-4.00   sec  1.09 GBytes  9.35 Gbits/sec    0   1.10 MBytes
    [  4]   4.00-5.00   sec  1.09 GBytes  9.35 Gbits/sec    0   1.11 MBytes
    [  4]   5.00-6.00   sec  1.09 GBytes  9.35 Gbits/sec    0   1.12 MBytes
    [  4]   6.00-7.00   sec  1.09 GBytes  9.34 Gbits/sec    0   1.14 MBytes
    [  4]   7.00-8.00   sec  1.09 GBytes  9.35 Gbits/sec    0   1.15 MBytes
    [  4]   8.00-9.00   sec  1.09 GBytes  9.34 Gbits/sec    0   1.15 MBytes
    [  4]   9.00-10.00  sec  1.09 GBytes  9.35 Gbits/sec    0   1.15 MBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec  10.8 GBytes  9.29 Gbits/sec    0             sender
    [  4]   0.00-10.00  sec  10.8 GBytes  9.29 Gbits/sec                  receiver
    
    iperf Done.
    

    If I run iperf3 on B against P's LAN interface, I get these results:

    Connecting to host 172.19.18.251, port 5250
    [  4] local 172.19.18.33 port 34702 connected to 172.19.18.251 port 5250
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec  2.87 GBytes  24.7 Gbits/sec   14   1.31 MBytes       
    [  4]   1.00-2.00   sec   691 MBytes  5.80 Gbits/sec    0   1.31 MBytes       
    [  4]   2.00-3.00   sec  2.73 GBytes  23.5 Gbits/sec  106   1.19 MBytes       
    [  4]   3.00-4.00   sec   638 MBytes  5.35 Gbits/sec    0   1.19 MBytes       
    [  4]   4.00-5.00   sec  2.53 GBytes  21.7 Gbits/sec  178    889 KBytes       
    [  4]   5.00-6.00   sec   508 MBytes  4.26 Gbits/sec    0    947 KBytes       
    [  4]   6.00-7.00   sec  2.57 GBytes  22.1 Gbits/sec   83    933 KBytes       
    [  4]   7.00-8.00   sec   679 MBytes  5.69 Gbits/sec    0   1.01 MBytes       
    [  4]   8.00-9.00   sec  2.53 GBytes  21.7 Gbits/sec   72   1.22 MBytes       
    [  4]   9.00-10.00  sec   515 MBytes  4.32 Gbits/sec    0   1.22 MBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec  453             sender
    [  4]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec                  receiver
    
    iperf Done.
    

    with a weird up/down alternation.
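
    To put a number on that alternation, here is a minimal Python sketch that splits the per-second figures from the LAN interface run above into the "fast" and "slow" seconds. The names `samples` and `mean` are only illustrative; the values are copied by hand from the iperf3 output.

```python
# Per-second bandwidth in Gbits/sec, copied from the B -> 172.19.18.251 run.
samples = [24.7, 5.80, 23.5, 5.35, 21.7, 4.26, 22.1, 5.69, 21.7, 4.32]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The run alternates strictly, so even indexes are the fast seconds
# and odd indexes are the slow ones.
high = samples[0::2]
low = samples[1::2]

print(f"fast seconds: {mean(high):.1f} Gbits/sec")
print(f"slow seconds: {mean(low):.1f} Gbits/sec")
print(f"overall:      {mean(samples):.1f} Gbits/sec")
```

    The fast seconds average about 22.7 Gbits/sec and the slow ones about 5.1 Gbits/sec, and the retransmits only show up in the fast seconds.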

    If I run iperf3 on B against P's LAN VIP, I get these results:

    Connecting to host 172.19.18.254, port 5250
    [  4] local 172.19.18.33 port 43800 connected to 172.19.18.254 port 5250
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec   258 MBytes  2.16 Gbits/sec  515    267 KBytes       
    [  4]   1.00-2.00   sec   313 MBytes  2.63 Gbits/sec  592    296 KBytes       
    [  4]   2.00-3.00   sec   313 MBytes  2.63 Gbits/sec  488    249 KBytes       
    [  4]   3.00-4.00   sec   309 MBytes  2.59 Gbits/sec  542    209 KBytes       
    [  4]   4.00-5.00   sec   319 MBytes  2.68 Gbits/sec  513    293 KBytes       
    [  4]   5.00-6.00   sec   315 MBytes  2.64 Gbits/sec  752    419 KBytes       
    [  4]   6.00-7.00   sec   313 MBytes  2.63 Gbits/sec  555    269 KBytes       
    [  4]   7.00-8.00   sec   318 MBytes  2.67 Gbits/sec  670    296 KBytes       
    [  4]   8.00-9.00   sec   319 MBytes  2.68 Gbits/sec  463    297 KBytes       
    [  4]   9.00-10.00  sec   314 MBytes  2.64 Gbits/sec  527    287 KBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec  3.02 GBytes  2.59 Gbits/sec  5617             sender
    [  4]   0.00-10.00  sec  3.02 GBytes  2.59 Gbits/sec                  receiver
    
    iperf Done.
    

    with no up/down, but slower speed.

    We can see that every test against a CARP VIP is slower, and by a lot on the LAN interface, considering that everything is on the same ESXi host...
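
    As a quick sanity check on "slower", this small Python sketch computes the slowdown factor from the four summary lines above. The dictionary layout and the `slowdown` helper are just illustrative; the numbers are the sender totals from the runs.

```python
# Summary throughput in Gbits/sec, copied from the four iperf3 runs above.
results = {
    "WAN": {"direct": 16.8, "vip": 9.29},
    "LAN": {"direct": 13.9, "vip": 2.59},
}


def slowdown(direct, vip):
    """Return how many times slower the VIP is than the interface IP."""
    return direct / vip


for name, r in results.items():
    factor = slowdown(r["direct"], r["vip"])
    print(f"{name}: VIP is {factor:.1f}x slower than the interface IP")
```

    That gives roughly 1.8x on the WAN side and 5.4x on the LAN side.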

    But the worst is when I test between A and B:

    Connecting to host raoult.test.esante-bfc.fr, port 5201
    [  4] local 172.17.254.247 port 47080 connected to 172.19.18.33 port 5201
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec  26.2 MBytes   220 Mbits/sec  139   24.0 KBytes
    [  4]   1.00-2.00   sec  20.1 MBytes   168 Mbits/sec   53   36.8 KBytes
    [  4]   2.00-3.00   sec  26.5 MBytes   222 Mbits/sec   52   36.8 KBytes
    [  4]   3.00-4.00   sec  18.3 MBytes   154 Mbits/sec   56   15.6 KBytes
    [  4]   4.00-5.00   sec  27.3 MBytes   229 Mbits/sec   76   36.8 KBytes
    [  4]   5.00-6.00   sec  19.6 MBytes   165 Mbits/sec   38   38.2 KBytes
    [  4]   6.00-7.00   sec  9.23 MBytes  77.4 Mbits/sec  146   28.3 KBytes
    [  4]   7.00-8.00   sec  17.2 MBytes   144 Mbits/sec   70   31.1 KBytes
    [  4]   8.00-9.00   sec  3.90 MBytes  32.7 Mbits/sec   56   52.3 KBytes
    [  4]   9.00-10.00  sec  18.1 MBytes   152 Mbits/sec  126   26.9 KBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec   186 MBytes   156 Mbits/sec  812             sender
    [  4]   0.00-10.00  sec   186 MBytes   156 Mbits/sec                  receiver
    
    iperf Done.
    
    Connecting to host 172.17.254.247, port 5201
    [  4] local 172.19.18.33 port 57548 connected to 172.17.254.247 port 5201
    [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
    [  4]   0.00-1.00   sec   182 MBytes  1.53 Gbits/sec  5085    161 KBytes       
    [  4]   1.00-2.00   sec   195 MBytes  1.64 Gbits/sec  3487    134 KBytes       
    [  4]   2.00-3.00   sec   140 MBytes  1.18 Gbits/sec  2529   67.9 KBytes       
    [  4]   3.00-4.00   sec   152 MBytes  1.28 Gbits/sec  2993    204 KBytes       
    [  4]   4.00-5.00   sec   138 MBytes  1.16 Gbits/sec  2839    136 KBytes       
    [  4]   5.00-6.00   sec   164 MBytes  1.38 Gbits/sec  3113   86.3 KBytes       
    [  4]   6.00-7.00   sec   132 MBytes  1.11 Gbits/sec  2538    263 KBytes       
    [  4]   7.00-8.00   sec   178 MBytes  1.49 Gbits/sec  4870    256 KBytes       
    [  4]   8.00-9.00   sec   174 MBytes  1.47 Gbits/sec  4615    151 KBytes       
    [  4]   9.00-10.00  sec   170 MBytes  1.42 Gbits/sec  3471    140 KBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  4]   0.00-10.00  sec  1.59 GBytes  1.36 Gbits/sec  35540             sender
    [  4]   0.00-10.00  sec  1.59 GBytes  1.36 Gbits/sec                  receiver
    
    iperf Done.
    

    I cannot explain this horrible performance :/
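
    For anyone who wants to compare these runs without reading every table, here is a rough Python sketch that pulls the bandwidth and retransmit count out of an iperf3 summary line. The regular expression and the `parse_summary` helper are my own assumptions based on the output format shown in this post, not something provided by iperf3.

```python
import re

# Matches the tail of a summary line, e.g.
# "... 1.36 Gbits/sec  35540             sender"
SUMMARY = re.compile(
    r"([\d.]+)\s+([KMG])bits/sec\s+(\d+)?\s*(sender|receiver)\s*$"
)

# Conversion factors from the printed unit to Gbits/sec.
TO_GBITS = {"K": 1e-6, "M": 1e-3, "G": 1.0}


def parse_summary(line):
    """Return bandwidth (Gbits/sec), retransmits and role, or None."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    value, unit, retr, role = match.groups()
    return {
        "gbps": float(value) * TO_GBITS[unit],
        "retr": int(retr) if retr else None,  # receiver lines have no Retr
        "role": role,
    }


# Sender summary lines of the two A <-> B runs above.
a_to_b = parse_summary(
    "[  4]   0.00-10.00  sec   186 MBytes   156 Mbits/sec  812             sender"
)
b_to_a = parse_summary(
    "[  4]   0.00-10.00  sec  1.59 GBytes  1.36 Gbits/sec  35540             sender"
)
print(a_to_b)
print(b_to_a)
```

    Both directions through the firewall show heavy retransmits (812 and 35540), whereas the run against the WAN VIP had none.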

    I tried a lot of tweaks, mostly with the offload parameters, and the fastest result is with all offloads unchecked in Advanced -> Networking. LRO had the most impact: if I check it, the speed is divided by 2 on the VIP and by 4 on the direct interface! I tried changing the packet routing/load balancing policy on the DvSwitch, but was not able to determine the best one in my case.

    I searched a lot, but most speed problems refer to offloading, and from what I read, CARP should not be affected by pfSense parameters but by the network configuration. I tried looking at and changing settings on the DvSwitch, but so far nothing has helped. If anyone has any idea, I am ready to listen.


  • After more tests, the most balanced performance I can get is finally with LRO offload checked: it decreases my iperf throughput to the firewall interface a lot (2-3Gb/s instead of 15-20Gb/s), but increases the iperf throughput going through the firewall, between A and B (2-3Gb/s instead of less than 500Mb/s).

    I did all these tests on the same ESXi host, so where is my speed???