Netgate Discussion Forum

    HAProxy Slows At 1500+ connections Really Need some help to figure out why

    • dsefcik

      I had HAProxy installed as a front end for a cluster of Squid proxies on a low-end Dell server running pfSense 2.1.5, and it slowed down at 1500+ connections. So I built a new pfSense 2.2.4 machine on a brand new Dell R630 with 64 GB RAM, dual CPUs, bad-ass RAID disks, etc., then loaded and configured HAProxy with several Squid backends and some ICAP backends. Things work great until I hit about 1500 or more connections, and then everything slows to a crawl. Restarting HAProxy helps momentarily, but it slows back down again very quickly. If I offload clients to the point of only 300-400 connections, it becomes responsive again. The HAProxy stats page shows 97% idle or similar, and the output from top shows maybe 5% CPU for haproxy. If I configure the browser client to use one of the Squid backends directly, it works fast, but as soon as I point the browser proxy config back at the HAProxy frontend IP, it slows down.

      I am not really sure how to troubleshoot this and would appreciate any help. I have done the usual searching and tried many of the fixes others have posted, but my problem continues. I can post any info here that would help someone determine where my problems may be; I am just not sure what is useful. Below are a few of my essential configs to start with.

      TIA..!

      /var/etc/haproxy.cfg file contents:
      global
          maxconn 50000
          log /var/run/log local0 info
          stats socket /tmp/haproxy.socket level admin
          uid 80
          gid 80
          nbproc 1
          chroot /tmp/haproxy_chroot
          daemon
          spread-checks 5

      listen HAProxyLocalStats
          bind 127.0.0.1:2200 name localstats
          mode http
          stats enable
          stats admin if TRUE
          stats uri /haproxy_stats.php?haproxystats=1
          timeout client 5000
          timeout connect 5000
          timeout server 5000

      frontend HTPL_PROXY
          bind 10.1.4.105:8181 name 10.1.4.105:8181
          mode http
          log global
          option http-server-close
          option forwardfor
          acl https ssl_fc
          reqadd X-Forwarded-Proto:\ http if !https
          reqadd X-Forwarded-Proto:\ https if https
          timeout client 30000
          default_backend HTPL_WEB_PROXY_http_ipvANY

      frontend HTPL_CONTENT_FILTER
          bind 10.1.4.106:8182 name 10.1.4.106:8182
          mode tcp
          log global
          timeout client 30000
          default_backend HTPL_CONT_FILTER_tcp_ipvANY

      backend HTPL_WEB_PROXY_http_ipvANY
          mode http
          cookie SERVERID insert indirect
          stick-table type ip size 1m expire 5m
          stick on src
          balance roundrobin
          timeout connect 50000
          timeout server 50000
          retries 3
          server HTPL-PROXY-01 10.1.4.103:3128 cookie HTPLPROXY01 check inter 60000 weight 150 fastinter 1000 fall 5
          server HTPL-PROXY-02 10.1.4.104:3128 cookie HTPLPROXY02 check inter 60000 weight 100 fastinter 1000 fall 5
          server HTPL-PROXY-03 10.1.4.107:3128 cookie HTPLPROXY03 check inter 60000 weight 50 fastinter 1000 fall 5
          server HTPL-PROXY-04 10.1.4.108:3128 cookie HTPLPROXY04 check inter 60000 weight 200 fastinter 1000 fall 5
          server HTHPL-PROXY-01 10.1.4.101:3128 cookie HTHPLPROXY1 check inter 60000 weight 150 fastinter 1000 fall 5
          server HTHPL-PROXY-02 10.1.4.102:3128 cookie HTPHLPROXY02 check inter 60000 weight 100 fastinter 1000 fall 5

      backend HTPL_CONT_FILTER_tcp_ipvANY
          mode tcp
          balance roundrobin
          timeout connect 50000
          timeout server 50000
          retries 3
          server HTHPL-PROXY-01 10.1.4.101:1344 check inter 60000 disabled weight 100 fastinter 1000 fall 5
          server HTHPL-PROXY-02 10.1.4.102:1344 check inter 60000 disabled weight 100 fastinter 1000 fall 5
          server HTPL-WEB-01 10.1.4.153:1344 check inter 60000 weight 200 fastinter 1000 fall 5
          server HTPL-WEB-02 10.1.4.154:1344 check inter 60000 weight 200 fastinter 1000 fall 5

      Some sysctl output:
      kern.ostype: FreeBSD
      kern.osrelease: 10.1-RELEASE-p15
      kern.osrevision: 199506
      kern.version: FreeBSD 10.1-RELEASE-p15 #0 c5ab052(releng/10.1)-dirty: Sat Jul 25 20:20:58 CDT 2015
          root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10

      kern.maxvnodes: 200000
      kern.maxproc: 70788
      kern.maxfiles: 204800
      kern.argmax: 262144
      kern.securelevel: -1
      kern.hostname: HTPL-PROXY-03.hth.hightechhigh.org
      kern.hostid: 1053306123
      kern.clockrate: { hz = 1000, tick = 1000, profhz = 8128, stathz = 127 }
      kern.posix1version: 200112
      kern.ngroups: 1023
      kern.job_control: 1
      kern.saved_ids: 0
      kern.boottime: { sec = 1443678149, usec = 901465 } Wed Sep 30 22:42:29 2015
      kern.domainname:
      kern.osreldate: 1001000
      kern.bootfile: /boot/kernel/kernel
      kern.maxfilesperproc: 300000
      kern.maxprocperuid: 63709
      kern.ipc.maxsockbuf: 4262144
      kern.ipc.sockbuf_waste_factor: 8
      kern.ipc.max_linkhdr: 16
      kern.ipc.max_protohdr: 60
      kern.ipc.max_hdr: 76
      kern.ipc.max_datalen: 76
      kern.ipc.maxmbufmem: 217774080
      kern.ipc.nmbclusters: 262144
      kern.ipc.nmbjumbop: 13291
      kern.ipc.nmbjumbo9: 11814
      kern.ipc.nmbjumbo16: 8860
      kern.ipc.nmbufs: 1048590
      kern.ipc.maxpipekva: 1071579136
      kern.ipc.pipekva: 163840
      kern.ipc.pipefragretry: 0
      kern.ipc.pipeallocfail: 0
      kern.ipc.piperesizefail: 0
      kern.ipc.piperesizeallowed: 1
      kern.ipc.msgmax: 16384
      kern.ipc.msgmni: 40
      kern.ipc.msgmnb: 8192
      kern.ipc.msgtql: 2048
      kern.ipc.msgssz: 32
      kern.ipc.msgseg: 512
      kern.ipc.semmni: 50
      kern.ipc.semmns: 340
      kern.ipc.semmnu: 150
      kern.ipc.semmsl: 340
      kern.ipc.semopm: 100
      kern.ipc.semume: 50
      kern.ipc.semusz: 632
      kern.ipc.semvmx: 32767
      kern.ipc.semaem: 16384
      kern.ipc.shmmax: 536870912
      kern.ipc.shmmin: 1
      kern.ipc.shmmni: 192
      kern.ipc.shmseg: 128
      kern.ipc.shmall: 131072
      kern.ipc.shm_use_phys: 0
      kern.ipc.shm_allow_removed: 0
      kern.ipc.soacceptqueue: 4096
      kern.ipc.numopensockets: 3448
      kern.ipc.maxsockets: 2092935
      kern.ipc.sendfile.readahead: 1
      kern.dummy: 0
      kern.ps_strings: 140737488351200
      kern.usrstack: 140737488351232
      kern.logsigexit: 1
      kern.iov_max: 1024
      kern.hostuuid: 1d9f393c-6870-11e5-9ebd-000e1e9c38d0
      kern.cam.sort_io_queues: 1
      kern.cam.boot_delay: 0
      kern.cam.num_doneqs: 6
      kern.cam.dflags: 0
      kern.cam.debug_delay: 0
      kern.cam.pmp.retry_count: 1
      kern.cam.pmp.default_timeout: 30
      kern.cam.pmp.hide_special: 1
      kern.cam.cam_srch_hi: 0
      kern.cam.scsi_delay: 5000
      kern.cam.cd.poll_period: 3
      kern.cam.cd.retry_count: 4
      kern.cam.cd.timeout: 30000
      kern.cam.ada.legacy_aliases: 1
      kern.cam.ada.retry_count: 4
      kern.cam.ada.default_timeout: 30
      kern.cam.ada.send_ordered: 1
      kern.cam.ada.spindown_shutdown: 1
      kern.cam.ada.spindown_suspend: 1
      kern.cam.ada.read_ahead: 1
      kern.cam.ada.write_cache: 1
      kern.cam.da.poll_period: 3
      kern.cam.da.retry_count: 4
      kern.cam.da.default_timeout: 60
      kern.cam.da.send_ordered: 1
      kern.cam.enc.emulate_array_devices: 1
      kern.tty_pty_warningcnt: 1
      kern.random.adaptors: yarrow,dummy
      kern.random.active_adaptor: yarrow
      kern.random.live_entropy_sources: Hardware, Intel Secure Key RNG
      kern.random.yarrow.gengateinterval: 10
      kern.random.yarrow.bins: 10
      kern.random.yarrow.fastthresh: 96
      kern.random.yarrow.slowthresh: 128
      kern.random.yarrow.slowoverthresh: 2
      kern.random.sys.seeded: 1
      kern.random.sys.harvest.ethernet: 0
      kern.random.sys.harvest.point_to_point: 0
      kern.random.sys.harvest.interrupt: 0
      kern.random.sys.harvest.swi: 1
      kern.rndtest.retest: 120
      kern.rndtest.verbose: 1
      kern.vt.enable_altgr: 1
      kern.vt.debug: 0
      kern.vt.deadtimer: 15
      kern.vt.suspendswitch: 1
      kern.vt.kbd_halt: 1
      kern.vt.kbd_poweroff: 1
      kern.vt.kbd_reboot: 1
      kern.vt.kbd_debug: 1
      kern.vt.kbd_panic: 0
      kern.disks: mfisyspd9 mfisyspd8 mfisyspd7 mfisyspd6 mfisyspd5 mfisyspd4 mfisyspd3 mfisyspd2 mfisyspd1 mfisyspd0
      kern.geom.eli.version: 7
      kern.geom.eli.debug: 0
      kern.geom.eli.tries: 3
      kern.geom.eli.visible_passphrase: 0
      kern.geom.eli.overwrites: 5
      kern.geom.eli.threads: 0
      kern.geom.eli.batch: 0
      kern.geom.eli.boot_passcache: 1
      kern.geom.eli.key_cache_limit: 8192
      kern.geom.eli.key_cache_hits: 0
      kern.geom.eli.key_cache_misses: 0
      kern.geom.dev.delete_max_sectors: 262144
      kern.geom.disk.mfisyspd0.led:
      kern.geom.disk.mfisyspd1.led:
      kern.geom.disk.mfisyspd2.led:
      kern.geom.disk.mfisyspd3.led:
      kern.geom.disk.mfisyspd4.led:
      kern.geom.disk.mfisyspd5.led:
      kern.geom.disk.mfisyspd6.led:
      kern.geom.disk.mfisyspd7.led:
      kern.geom.disk.mfisyspd8.led:
      kern.geom.disk.mfisyspd9.led:
      kern.geom.transient_maps: 33202
      kern.geom.transient_map_retries: 10
      kern.geom.transient_map_hard_failures: 0
      kern.geom.transient_map_soft_failures: 0
      kern.geom.inflight_transient_maps: 0
      kern.geom.confxml:

      • PiBa

        Was this fixed by the sysctls for your bge network interfaces?
        http://marc.info/?l=haproxy&m=144399351725189&w=2

        hw.bge.tso_enable=0
        hw.pci.enable_msix=0

        I changed them just now and easily reached these numbers without any trouble:
        pid = 50054 (process #1, nbproc = 1)
        uptime = 0d 0h03m25s
        system limits: memmax = unlimited; ulimit-n = 100047
        maxsock = 100047; maxconn = 50000; maxpipes = 0
        current conns = 5562; current pipes = 0/0; conn rate = 64/sec
        Running tasks: 1/5587; idle = 97 %
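
For anyone who wants to watch these numbers from a script instead of the stats page, here is a minimal Python sketch. It assumes the admin socket from the config above (`/tmp/haproxy.socket`) and HAProxy's `show info` socket command, which replies with `Key: value` lines. The helper names (`query_haproxy`, `parse_info`, `connection_usage`) are made up for this example, and the exact set of fields can vary between HAProxy versions, so treat it as a starting point rather than a finished tool.

```python
import socket


def query_haproxy(sock_path, command="show info"):
    """Send one command to the HAProxy stats socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall((command + "\n").encode("ascii"))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:  # HAProxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def parse_info(text):
    """Turn 'Key: value' lines into a dict of strings, skipping other lines."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def connection_usage(info):
    """Fraction of the global maxconn currently in use."""
    return int(info["CurrConns"]) / int(info["Maxconn"])


# Example use on the firewall itself (socket path taken from the config above):
#   info = parse_info(query_haproxy("/tmp/haproxy.socket"))
#   print(info.get("CurrConns"), info.get("Idle_pct"), connection_usage(info))
```

Logging these values every few seconds while ramping up clients makes it easier to see whether the slowdown lines up with a connection count or with something outside HAProxy, such as the NIC driver.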

        • dsefcik

          PiBa, yes, I totally boneheaded it and put bce instead of bge. I have several servers, some with bce and some with bge, and I just confused them. After making the change and rebooting, it seems to be working better. I am slowly ramping up the users, but so far so good at 2500+. The stats I posted below were from Apache Bench, so I need real-world clients to really test it out.

          Thanks for reminding me to post back to the group.
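
A bce/bge mix-up like the one above is easy to catch with a quick check after the reboot. The Python sketch below is only an illustration: it assumes you feed it text in the `name: value` format that `sysctl` prints (as in the dump in the first post), and the function names and the `WANTED` table are invented for this example. The two tunables are the ones PiBa listed.

```python
# Tunables from PiBa's post; values are compared as strings.
WANTED = {
    "hw.bge.tso_enable": "0",
    "hw.pci.enable_msix": "0",
}


def parse_sysctl(text):
    """Parse sysctl-style 'name: value' lines into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            values[name.strip()] = value.strip()
    return values


def check_tunables(actual, wanted):
    """Return {name: (wanted, actual_or_None)} for every tunable that differs."""
    problems = {}
    for name, want in wanted.items():
        got = actual.get(name)
        if got != want:
            problems[name] = (want, got)
    return problems


# Example: a dump where the bce tunable was set by mistake and MSI-X is still on.
dump = "hw.bce.tso_enable: 0\nhw.pci.enable_msix: 1\n"
print(check_tunables(parse_sysctl(dump), WANTED))
```

A tunable that shows up with `None` as its actual value was never set under that name, which is exactly what happens when the wrong driver prefix is used.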

          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.