[2.3.2 - haproxy 1.6.6] Balanced connections initially slow, then ramp up
-
Hi,
last weekend we did the final cutover from TMG to a pfSense CARP setup. We've done this gradually over the last few months; it all went fine and runs superbly. The last step was to migrate our RDGW setup (Remote Desktop Gateway, basically a reverse SSL proxy for Microsoft Remote Desktop servers) behind HAProxy. We have two RDGWs in the DMZ, to which haproxy balances traffic. This all works, but performance is sub-par: RDP sessions respond more slowly, which is obviously even more noticeable when there is more graphical content in the session. We run pfSense on Hyper-V; we plan to move it to dedicated hardware in the future.

I've done some tests and found the following behaviour, which might be related:
- I've set up a third test RDGW and created a NAT rule to it. That works perfectly fine, even faster than when using TMG.
- I also created a haproxy setup for this one test RDGW and found it suffers from the same issue when connecting through haproxy.
- As RDGW is basically an IIS box used as a reverse proxy, I've put a 700 MB ISO in its webserver space and downloaded it through the NAT rule, which just forwards port 443. Using the NAT rule from an external address, the download immediately reaches 180 Mbps, which is the limit of that remote link.
- When I download the same file through haproxy, the download starts at about 10 Mbps and ramps up linearly to 180 Mbps over 20-30 seconds. After that it stays at 180 Mbps.
- Then I added a NAT rule for port 80 as well, to rule out SSL processing, and changed haproxy to listen only on port 80 too. Through the NAT rule I hit 180 Mbps immediately, as expected. Through haproxy it's the same issue: it starts at about 10 Mbps and slowly ramps up to line speed. (A sketch of how I'm measuring this is below.)
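For reference, this is roughly how I'm testing the throughput ramp. The hostnames are just placeholders for the NAT'd address and the haproxy frontend address; I simply watch the current-speed column of curl's progress meter while the 700 MB ISO comes down over each path:

# Fetch the test ISO over both paths and compare curl's current-speed column.
# Hostnames are placeholders; -k skips certificate validation on the test setup.
curl -k -o /dev/null https://nat-test.example.net/test.iso      # straight NAT rule (port 443 forward)
curl -k -o /dev/null https://haproxy-test.example.net/test.iso  # via the haproxy frontend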
Here is my haproxy.conf.
global
maxconn 2048
stats socket /tmp/haproxy.socket level admin
uid 80
gid 80
nbproc 1
chroot /tmp/haproxy_chroot
daemon
tune.ssl.default-dh-param 2048

listen HAProxyLocalStats
bind 127.0.0.1:2200 name localstats
mode http
stats enable
stats admin if TRUE
stats uri /haproxy/haproxy_stats.php?haproxystats=1
timeout client 5000
timeout connect 5000
timeout server 5000

frontend rdgw_DMZ
bind x.x.x.x:443 name x.x.x.x:443 ssl crt /var/etc/haproxy/rdgw_DMZ.pem
mode http
log global
option http-keep-alive
timeout client 10800000
default_backend nldmzrdgw_DMZ_http_ipvANY

frontend TESTTEST
bind y.y.y.y:80 name y.y.y.y:80
mode tcp
log global
timeout client 30000
default_backend testtest_tcp_ipvANY

backend nldmzrdgw_DMZ_http_ipvANY
mode http
log global
stick-table type ip size 50k expire 30m
stick on src
balance roundrobin
timeout connect 10800000
timeout server 10800000
retries 3
option httpchk OPTIONS /
server <rdgw_1> 172.16.4.6:443 ssl check inter 1000 verify none
server <rdgw_2> 172.16.4.7:443 ssl check inter 1000 verify none

backend testtest_tcp_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server 10.15.4.6 10.15.4.6:80

I've masked the external addresses. For now the test setup just points to port 80. The huge timeouts on the rdgw_DMZ setup are there for a reason: the RDP traffic is tunneled inside these SSL connections, and the RDP session of course disconnects whenever haproxy drops the connection, so I've increased the timeouts to 10800000 ms (3 hours) to match the RDP session timeout of our Remote Desktop servers.
CPU load is around 20-25% when haproxy reaches 180 Mbps, and pfSense has 2 vCPUs (on 2950 V3 hardware), so CPU power is plentiful; even a single-threaded process could reach up to 50% load. I've got no clue how to troubleshoot this any further. To me this looks like a haproxy thing, as NATting works perfectly fine.
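The only other data I know how to pull is from the admin stats socket that is already enabled in the global section. Assuming socat (or an nc built with UNIX-socket support) is available on the pfSense box, something like this should show whether haproxy itself reports anything unusual while a download is ramping up:

# Query haproxy's runtime counters over the admin socket from the global section.
# socat is an assumption here; it may need to be installed on pfSense first.
echo "show info" | socat unix-connect:/tmp/haproxy.socket stdio
# Per-frontend/backend session and byte counters in CSV form:
echo "show stat" | socat unix-connect:/tmp/haproxy.socket stdio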
-