HAProxy timeout



  • Hi Team,

    I have had HAProxy installed for a while and have just realised that every so often it times out.

    This is the error I am getting in syslog:

    "Message: Server testbox is DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup servers left.
    1 sessions active, 0 requeued, 0 remaining in queue."

    "Message: backend testbox has no server available!"

    I can't make head or tail of why. The testbox is a VM, and HAProxy is on a physical box running the latest version of pfSense and the available HAProxy package.

    Here is the HAProxy configuration file:

    # Automaticaly generated, dont edit manually.

    # Generated on: 2018-03-31 23:26

    global
    maxconn 1000
    log 192.168.0.1 syslog debug
    stats socket /tmp/haproxy.socket level admin
    gid 80
    nbproc 1
    chroot /tmp/haproxy_chroot
    daemon
    log-send-hostname pfsense
    server-state-file /tmp/haproxy_server_state

    listen HAProxyLocalStats
    bind 127.0.0.1:2200 name localstats
    mode http
    stats enable
    stats refresh 10
    stats admin if TRUE
    stats uri /haproxy/haproxy_stats.php?haproxystats=1
    timeout client 5000
    timeout connect 5000
    timeout server 5000

    frontend testbox-merged
    bind my.ext.ip.addr:80 name my.ext.ip.addr:80 
    mode http
    log global
    option http-keep-alive
    option forwardfor
    acl https ssl_fc
    http-request set-header X-Forwarded-Proto http if !https
    http-request set-header X-Forwarded-Proto https if https
    timeout client 30000
    acl httpsredirect hdr(host) -i zam.mydomain.com                               
    acl httpsredirect hdr(host) -i snail.mydomain.co.uk
    acl mydomain1 hdr(host) -i mydomain1.co.uk
    acl mydomain1 hdr(host) -i www.mydomain1.co.uk
    acl mydomain2 hdr(host) -i mydomain2.com
    acl mydomain2 hdr(host) -i www.mydomain2.com
    acl mydomainuk hdr(host) -i www.mydomain.co.uk
    acl mydomainuk hdr(host) -i mydomain.co.uk
    acl mydomaincom hdr(host) -i www.mydomain.com
    acl mydomaincom hdr(host) -i mydomain.com
    acl mydomaincom hdr_beg(host) -i mydomain.com/wp-admin/index.php
    acl prodomainseed hdr_beg(host) -i prodomainseed.mydomain.com
    acl mydomain3 hdr(host) -i mydomain3.com
    acl mydomain3 hdr_beg(host) -i www.mydomain3.com
    acl mydomain3 hdr_beg(host) -i mydomain3.com/wp-admin/index.php
    acl mydomain4 hdr(host) -i mydomain4.co.uk
    acl mydomain4 hdr(host) -i www.mydomain4.co.uk
    acl mydomain5 hdr(host) -i mydomain5.com
    acl mydomain5 hdr_beg(host) -i www.mydomain5.com
    http-request redirect scheme https  if  httpsredirect
    use_backend testbox_http_ipv4  if  mydomain1
    use_backend testbox_http_ipv4  if  mydomain2
    use_backend testbox_http_ipv4  if  mydomainuk
    use_backend testbox_http_ipv4  if  mydomaincom
    use_backend prodomainseed_http_ipvANY  if  prodomainseed
    use_backend testbox_http_ipv4  if  mydomain3
    use_backend testbox_http_ipv4  if  mydomain4
    use_backend testbox_http_ipv4  if  mydomain5

    frontend HTTPS-merged
    bind my.ext.ip.addr:443 name my.ext.ip.addr:443 
    mode tcp
    log global
    option log-separate-errors
    option tcplog
    timeout client 30000
    tcp-request inspect-delay 5s
    acl snail req.ssl_sni -i mail.mydomain.com
    acl testboxm req.ssl_sni -i testbox.mydomain.com
    acl zam req.ssl_sni -i zam.mydomain.com
    acl wawa req.ssl_sni -m beg -i snail.mydomain.co.uk
    acl wawaI req.ssl_sni -m beg -i mail.mydomain.co.uk
    acl wawaI req.ssl_sni -m beg -i isnail.mydomain.co.uk
    acl man req.ssl_sni -i man.mydomain3.com
    tcp-request content accept if { req.ssl_hello_type 1 }

    tcp-request content accept if { req.ssl_hello_type 1 }

    use_backend wawa_https_ipvANY  if  snail
    use_backend testbox_Management_https_ipv4  if  testboxm
    use_backend zam_https_ipv4  if  zam
    use_backend wawa_red_https_ipvANY  if  wawa
    use_backend wawa_I_https_ipvANY  if  wawaI
    use_backend isnail_https_ipvANY  if  wawaI
    use_backend man_https_ipvANY  if  man
    default_backend wawa_https_ipvANY

    backend testbox_http_ipv4
    mode http
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    source ipv4@ usesrc clientip
    option httpchk OPTIONS /
    server testbox 192.168.39.70:80 check inter 1000

    backend prodomainseed_http_ipvANY
    mode http
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk OPTIONS /
    server prodomainseed 192.168.0.51:80 check inter 1000

    backend wawa_https_ipvANY
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server mail 192.168.3.22:443 check inter 1000

    backend testbox_Management_https_ipv4
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    source ipv4@ usesrc clientip
    server testbox 192.168.39.70:8080 check inter 1000

    backend zam_https_ipv4
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    source ipv4@ usesrc clientip
    server zam 192.168.39.40:443 check inter 1000

    backend wawa_red_https_ipvANY
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server snail 192.168.40.50:443 check inter 1000

    backend wawa_I_https_ipvANY
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server wawaI 192.168.40.20:443 check inter 1000

    backend isnail_https_ipvANY
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server isnail 192.168.40.2:443 check inter 1000

    backend man_https_ipvANY
    mode tcp
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server man 192.168.39.70:443 check inter 1000

    After I started writing this message, I changed a setting on the main HTTP backend: under Health checking, the Check frequency field was blank, so I set it to 2000. Since then I have not seen the issue.

    I am not sure whether it will come back, but maybe this will help someone else who has the same issue.

    Rajbps



  • check duration: 1001ms

    Well, your testbox took longer than 1 second to respond, and did so 3 times in a row. So either the URL being checked does a lot of processing and is regularly close to that 1 second already, or perhaps the web application was restarted and had to load its configuration from disk again, or some memory garbage collection was happening. Just guessing here.
    Anyhow, there was a period during which the web server was unable to respond properly. Letting the health checks take longer than the default 1 second avoids the 'down' syslog message, but it doesn't fix the underlying problem.
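
    For reference, here is a rough sketch of how the testbox backend might look in raw haproxy.cfg terms with a more forgiving check. The `inter 2000` is the value set in the GUI; `fall 3` and `rise 2` only spell out what I understand to be HAProxy's defaults (3 failed checks to mark a server down, 2 good ones to bring it back). The `timeout check 3000` line is an assumed, illustrative value and is not in the original config. As far as I know, when `timeout check` is not set, the whole check (connect plus response) is limited by `inter`, which is why a 1000 ms interval also acts as a 1 second check timeout.

    ```
    backend testbox_http_ipv4
        mode http
        log global
        timeout connect 30000
        timeout server 30000
        # Assumed value: extra time a health check may wait for the response
        # once the connection to the server is established.
        timeout check 3000
        retries 3
        option httpchk OPTIONS /
        # inter: milliseconds between checks
        # fall:  consecutive failed checks before the server is marked DOWN
        # rise:  consecutive good checks before it is marked UP again
        server testbox 192.168.39.70:80 check inter 2000 fall 3 rise 2
    ```

    This only makes the check more tolerant of a slow reply. If the OPTIONS request regularly takes close to a second, the web server itself still needs looking at.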

    Perhaps enable "Log checks" on the backend. It will then log a bit more whenever the health state changes, and might tell you how often one or two checks are failing.
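
    In plain HAProxy terms, I believe the "Log checks" checkbox corresponds to `option log-health-checks`. A minimal sketch of the backend with it enabled, assuming everything else stays as in the posted config:

    ```
    backend testbox_http_ipv4
        mode http
        log global
        # Log every change of health check status, not only the final
        # UP/DOWN transitions, so single failed checks become visible.
        option log-health-checks
        option httpchk OPTIONS /
        server testbox 192.168.39.70:80 check inter 1000
    ```

    With that in place, syslog should show the individual checks that fail or time out before the server is actually marked down, which helps tell an occasional slow response from a longer outage.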