HAProxy slow on WAN: jagged throughput
-
I'm trying to debug a congestion problem between my proxy and my ISP. I tested with a client connected to the ISP's network at the same speed, and the throughput was almost the same as without the proxy.
So the problem lies with the ISP and HAProxy. Also, with a client connected over Fast Ethernet the throughput didn't throttle so badly.
Without HAProxy I don't have such problems.
-
I'm still getting the same performance drop when going through HAProxy. Any idea how to tweak the congestion mechanism?
-
You have TCP retransmissions... Where do you see the retransmissions: between pfSense and the backend, or between pfSense and an outside client or an internal client?
Why do you see retransmissions? This is what you need to explain. It isn't normal flow, and I don't think the flow can be tuned, as it is not designed to drop packets.
Why do you mention the ISP? What service are you trying to proxy? Do you see errors in the HAProxy logs or in the HAProxy stats socket output (via socat)?
-
I do not have jagged throughput, but I set up a very simple backend pointing to Apache2 on Ubuntu 20.04 and a single HTTP frontend, and I see abysmal performance. With NAT I can get an external download speed of 40MiB/s, but with HAProxy in between I get 1MiB/s.
I am starting to think there is an issue with the BSD HAProxy package. I am using the 0.60 non-devel version with HAProxy 1.8.25.
-
- Always use the devel version of the HAProxy package in pfSense, since by "devel" they mean the latest stable. I really dislike this naming. Better names would be haproxy and haproxy-previous-stable, as 1.8 is 4 years old and 2.0 is the previous stable! 2.2 is the current stable version.
- I can't really replicate your issue. In my setup I can use the full WAN speed through HAProxy.
-
If I install the devel package, will I need to rewrite everything again? Or is it safe to install the devel package and uninstall the non-devel package, with all my config good to go?
-
@se4n_1 installing the devel package will remove the non-devel package automatically and pick up your existing HAProxy config.
-
This is great news! I will give it a go tonight. Thank you VERY much for these very useful hints.
-
@se4n_1 np, but the performance issue you described is strange. Try installing nginx behind HAProxy and serving a static 10 GB file over HTTP and then over HTTPS. If you had issues only with SSL offloading, I would recommend digging into AES-NI and the vCPU setup, but you are saying that you have issues even with plain HTTP, which is really strange.
-
I have the devel version on pfSense. I can get full Gigabit without issues inside my ISP's network and when routing between my own VLANs.
The problem starts the moment traffic leaves the ISP. I think some throttling could be happening because of COVID-19, but I had this issue before...
I tested with another user's HAProxy installation on Linux and got the same results.
I have to test further; I have Wireshark captures on my side.
EDIT:
I just did a quick test: a burst of 500 Mbit on the client, and then throttling.
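A burst followed by throttling is easier to compare between setups when the download rate is logged per interval rather than as a single average. The following is only a rough sketch of that idea, not something used in this thread: the function names `interval_rates` and `jaggedness` are made up for illustration, and the stream can be any file-like object (for example an HTTP response body).

```python
import time


def interval_rates(stream, interval=1.0, chunk_size=64 * 1024, clock=time.monotonic):
    """Read `stream` to the end and return a list of MiB/s values,
    one for each window of at least `interval` seconds."""
    rates = []
    window_start = clock()
    window_bytes = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window_bytes += len(chunk)
        now = clock()
        elapsed = now - window_start
        if elapsed >= interval:
            # Close the current window and start a new one.
            rates.append(window_bytes / (1024 * 1024) / elapsed)
            window_start = now
            window_bytes = 0
    # Account for a trailing partial window, if it carried any data.
    elapsed = clock() - window_start
    if window_bytes and elapsed > 0:
        rates.append(window_bytes / (1024 * 1024) / elapsed)
    return rates


def jaggedness(rates):
    """Ratio of the fastest to the slowest non-zero window.
    1.0 means a perfectly steady transfer; large values mean bursts."""
    positive = [r for r in rates if r > 0]
    if not positive:
        return 0.0
    return max(positive) / min(positive)
```

Running this once against the direct path and once through the proxy gives two series that can be compared side by side; a high ratio on only one of them points at where the bursts come from.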
-
For sure I see throttling and bad peering. That's why I had a NAT setup, so I could compare downloading the Ubuntu 20.04 image directly from Apache2 with downloading the same image via an HTTP (no SSL) frontend, to keep things as simple as possible. In principle there are only trivial differences between the packets going through the NAT and those going through HAProxy, so I was essentially trying to pin the blame on HAProxy. It seems most people are using the devel package, so I will give that a go. I am bullish on the chances of success given that this seems to be very much a minority issue.
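The NAT-versus-proxy comparison described above can be repeated in a scripted way so that both paths are measured identically. This is a minimal sketch under assumptions: the helper names (`measure_stream`, `mib_per_second`, `measure_url`) are hypothetical, and the two URLs you pass in would be whatever addresses expose the same file via the NAT rule and via the HAProxy frontend in your setup.

```python
import math
import time
import urllib.request


def measure_stream(stream, chunk_size=64 * 1024, max_bytes=None, clock=time.monotonic):
    """Read `stream` to the end (or until roughly `max_bytes` have been read)
    and return (bytes_read, seconds_elapsed)."""
    start = clock()
    total = 0
    while max_bytes is None or total < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total, clock() - start


def mib_per_second(total_bytes, seconds):
    """Convert a byte count and a duration into MiB/s."""
    if seconds <= 0:
        # Too fast to time: report infinity rather than divide by zero.
        return math.inf if total_bytes > 0 else 0.0
    return total_bytes / (1024 * 1024) / seconds


def measure_url(url, max_bytes=256 * 1024 * 1024, timeout=30):
    """Download up to about `max_bytes` from `url` and return the average MiB/s."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        total, seconds = measure_stream(response, max_bytes=max_bytes)
    return mib_per_second(total, seconds)
```

Calling `measure_url` once with the NAT address and once with the frontend address, from the same external client, gives two averages measured the same way. Capping the download with `max_bytes` keeps repeated runs short.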
-
@se4n_1 did you use
option http-keep-alive
?
-
Yes, I tried with both, and with all of the other keepalive and timeout settings. It makes no big difference. I also tried the different closing modes, tunnel mode, and a TCP connection instead of HTTP.
-
Hello all, happy Sunday. I upgraded the package to the -devel version and I still see the same behavior - no change. Just to recap:
- These settings apply to my frontend:
# Automaticaly generated, dont edit manually.
# Generated on: 2020-08-30 19:10
global
    maxconn 10000
    log /var/run/log local0 info
    stats socket /tmp/haproxy.socket level admin expose-fd listeners
    uid 80
    gid 80
    nbproc 1
    nbthread 4
    hard-stop-after 15m
    chroot /tmp/haproxy_chroot
    daemon
    tune.ssl.default-dh-param 2048
    server-state-file /tmp/haproxy_server_state
    ssl-engine cryptodev
    tune.ssl.cachesize 1000000
cache webcache total-max-size 256 max-age 1800s
backend srv-frs_ipvANY
    mode http
    id 126
    log global
    # use mailers
    # level err
    email-alert mailers globalmailers
    email-alert level err
    email-alert from admin@yyy.com
    email-alert to sysadmins@yyy.com
    email-alert myhostname yyy.com
    http-response set-header Strict-Transport-Security max-age=31536000;
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk OPTIONS /
    option tcp-smart-connect
    timeout check 5s
    timeout tunnel 60000s
    timeout connect 20s
    timeout http-keep-alive 300s
    timeout http-request 30s
    timeout queue 20s
    timeout server 50s
    server srv-frs 10.192.3.54:80 id 127 check inter 10000 resolvers globalresolvers
frontend http_test
    bind 94.103.xx.yy 80 name 94.103.xx.yy:80
    mode http
    log global
    option http-keep-alive
    timeout client 30000
    use_backend srv-frs_ipvANY
I can see a download of around 1M/s via the proxy, but if I NAT directly to the firewall in pfsense, I see a download of around 40M/s.
The next step will be to create a virtual server for the proxy and take it off the firewall. That would be a big shame and I hope to avoid it!
-
@se4n_1 what were you trying to achieve with:
cache webcache total-max-size 256 max-age 1800s
As far as I know, this isn't a correct part of an HAProxy config.
On the other hand, I don't see any uncommon configuration that could lead to performance issues with plain HTTP. For better HTTPS performance I recommend changing
ssl-engine cryptodev
to Intel RAND if the CPU supports it. To investigate the issues with plain HTTP, as I already recommended, just as a test install any other non-Apache backend, e.g. nginx, IIS, npm hserver, etc., and host one big file to test the speed.
-
Performance is the same with an nginx backend. I think this may be a hardware issue, although these are official Netgate XG7100s :/
As an additional test I also set up a vanilla Ubuntu 20.04, apt-installed the haproxy package, dropped in the sample config, and NATd a pfSense port to it. I get downloads of 40M, so it does not appear to be the HAProxy config either.
I don't have any rate limiting/QoS/1:1 NAT on this firewall. The only special thing is that it has CARP failover and the monitor CARP interface option in HAProxy. Could this be an issue? Actually, looking at your config in your header, it seems we may be running a similar setup: 2x XG7100, presumably with CARP failover and pfsync?
Addendum 2:
If I go into maintenance mode and use the second XG7100, I also get the same 1M speed. Wow, this is really frustrating! The problem seems very ethereal.
-
@se4n_1 I don't have CARP (my pfSense boxes are not in one place :P), but I'm 100% sure this isn't the cause. I really recommend you write to Netgate support, as the XG7100 should deliver much more speed through HAProxy (more than 1 Gb/s).
-
Hello, so finally an update from me. Netgate and I tried everything we could think of, but eventually suspicion fell on the ISP gateway router. I contacted the ISP and they ran tests for a while, and this weekend they finally replaced the gateway router with a new one. The speed issue has disappeared and I can now easily saturate the connection.
So in my case, this was a strange and not fully explained ISP issue, where traffic terminating on the WAN VIP was handled differently from traffic NATd to the LAN. Thanks for your assistance, and sorry that my answer will likely be of absolutely no use to anyone else.
-
@se4n_1 hi, actually your answer can help other people, as it shows that the ISP can also cause performance issues :)
-
I saw the post and redid a test on my side, and I see the same behaviour: I am not getting the throughput. I have to test again, but I had the same throttling on multiple ISPs with different servers, all with HAProxy...