Haproxy timeout

rajbps

Hi Team,

I have had HAProxy installed for a while, and I just realised that every so often it times out.

In syslog, this is the error I am getting:

"Message: Server testbox is DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup servers left.
1 sessions active, 0 requeued, 0 remaining in queue."

"Message: backend testbox has no server available!"

I can't make heads or tails of why. The testbox is a VM, and HAProxy is on a physical box running the latest version of pfSense and the available package.

Here is the haproxy configuration file:

# Automaticaly generated, dont edit manually.
# Generated on: 2018-03-31 23:26

global
maxconn 1000
log 192.168.0.1 syslog debug
stats socket /tmp/haproxy.socket level admin
gid 80
nbproc 1
chroot /tmp/haproxy_chroot
daemon
log-send-hostname pfsense
server-state-file /tmp/haproxy_server_state

listen HAProxyLocalStats
bind 127.0.0.1:2200 name localstats
mode http
stats enable
stats refresh 10
stats admin if TRUE
stats uri /haproxy/haproxy_stats.php?haproxystats=1
timeout client 5000
timeout connect 5000
timeout server 5000

frontend testbox-merged
bind my.ext.ip.addr:80 name my.ext.ip.addr:80
mode http
log global
option http-keep-alive
option forwardfor
acl https ssl_fc
http-request set-header X-Forwarded-Proto http if !https
http-request set-header X-Forwarded-Proto https if https
timeout client 30000
acl httpsredirect hdr(host) -i zam.mydomain.com
acl httpsredirect hdr(host) -i snail.mydomain.co.uk
acl mydomain1 hdr(host) -i mydomain1.co.uk
acl mydomain1 hdr(host) -i www.mydomain1.co.uk
acl mydomain2 hdr(host) -i mydomain2.com
acl mydomain2 hdr(host) -i www.mydomain2.com
acl mydomainuk hdr(host) -i www.mydomain.co.uk
acl mydomainuk hdr(host) -i mydomain.co.uk
acl mydomaincom hdr(host) -i www.mydomain.com
acl mydomaincom hdr(host) -i mydomain.com
acl mydomaincom hdr_beg(host) -i mydomain.com/wp-admin/index.php
acl prodomainseed hdr_beg(host) -i prodomainseed.mydomain.com
acl mydomain3 hdr(host) -i mydomain3.com
acl mydomain3 hdr_beg(host) -i www.mydomain3.com
acl mydomain3 hdr_beg(host) -i mydomain3.com/wp-admin/index.php
acl mydomain4 hdr(host) -i mydomain4.co.uk
acl mydomain4 hdr(host) -i www.mydomain4.co.uk
acl mydomain5 hdr(host) -i mydomain5.com
acl mydomain5 hdr_beg(host) -i www.mydomain5.com
http-request redirect scheme https if httpsredirect
use_backend testbox_http_ipv4 if mydomain1
use_backend testbox_http_ipv4 if mydomain2
use_backend testbox_http_ipv4 if mydomainuk
use_backend testbox_http_ipv4 if mydomaincom
use_backend prodomainseed_http_ipvANY if prodomainseed
use_backend testbox_http_ipv4 if mydomain3
use_backend testbox_http_ipv4 if mydomain4
use_backend testbox_http_ipv4 if mydomain5

frontend HTTPS-merged
bind my.ext.ip.addr:443 name my.ext.ip.addr:443
mode tcp
log global
option log-separate-errors
option tcplog
timeout client 30000
tcp-request inspect-delay 5s
acl snail req.ssl_sni -i mail.mydomain.com
acl testboxm req.ssl_sni -i testbox.mydomain.com
acl zam req.ssl_sni -i zam.mydomain.com
acl wawa req.ssl_sni -m beg -i snail.mydomain.co.uk
acl wawaI req.ssl_sni -m beg -i mail.mydomain.co.uk
acl wawaI req.ssl_sni -m beg -i isnail.mydomain.co.uk
acl man req.ssl_sni -i man.mydomain3.com
tcp-request content accept if { req.ssl_hello_type 1 }

tcp-request content accept if { req.ssl_hello_type 1 }

use_backend wawa_https_ipvANY if snail
use_backend testbox_Management_https_ipv4 if testboxm
use_backend zam_https_ipv4 if zam
use_backend wawa_red_https_ipvANY if wawa
use_backend wawa_I_https_ipvANY if wawaI
use_backend isnail_https_ipvANY if wawaI
use_backend man_https_ipvANY if man
default_backend wawa_https_ipvANY

backend testbox_http_ipv4
mode http
log global
timeout connect 30000
timeout server 30000
retries 3
source ipv4@ usesrc clientip
option httpchk OPTIONS /
server testbox 192.168.39.70:80 check inter 1000

backend prodomainseed_http_ipvANY
mode http
log global
timeout connect 30000
timeout server 30000
retries 3
option httpchk OPTIONS /
server prodomainseed 192.168.0.51:80 check inter 1000

backend wawa_https_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server mail 192.168.3.22:443 check inter 1000

backend testbox_Management_https_ipv4
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
source ipv4@ usesrc clientip
server testbox 192.168.39.70:8080 check inter 1000

backend zam_https_ipv4
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
source ipv4@ usesrc clientip
server zam 192.168.39.40:443 check inter 1000

backend wawa_red_https_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server snail 192.168.40.50:443 check inter 1000

backend wawa_I_https_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server wawaI 192.168.40.20:443 check inter 1000

backend isnail_https_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server isnail 192.168.40.2:443 check inter 1000

backend man_https_ipvANY
mode tcp
log global
timeout connect 30000
timeout server 30000
retries 3
server man 192.168.39.70:443 check inter 1000

After starting to write this message, I changed the "Check frequency" setting under Health checking on the main http backend. It was blank; I changed it to 2000, and since then I have not seen the issue.

I am unsure whether it will come back, but maybe this will help someone else having the same issue.

Rajbps

PiBa

check duration: 1001ms

Well, your testbox took longer than 1 second to respond, and did so 3 times in a row. So either the URL it's checking does a lot of processing and is regularly already close to that 1 second, or perhaps the web application got restarted and needed to load its configuration from disk again, or some memory garbage collection was happening. Just guessing here.
Anyhow, there was a period of time during which the webserver was unable to respond properly. Making the health checks allow more time than the default 1 second works to avoid the 'down' syslog message, but it doesn't fix the underlying problem.
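
For reference, here is a rough sketch of what the affected backend might look like with the longer check interval. It assumes the "Check frequency" value from the GUI ends up as the `inter` parameter on the server line, as the generated config above suggests. `fall 3` and `rise 2` are HAProxy's defaults, written out here only to make the "3 times in a row" behaviour visible. On pfSense the file is regenerated, so the change would be made in the GUI rather than by hand.

```haproxy
backend testbox_http_ipv4
    mode http
    log global
    option httpchk OPTIONS /
    # Assumption: 2000 ms is the "Check frequency" value entered in the GUI.
    # With no "timeout check" set, inter also limits how long one check may take,
    # so a check that needs 1001 ms no longer fails.
    # fall 3: mark the server DOWN after 3 consecutive failed checks (default).
    # rise 2: mark it UP again after 2 consecutive successful checks (default).
    server testbox 192.168.39.70:80 check inter 2000 fall 3 rise 2
```

This only gives the backend more time to answer; a check that regularly needs close to a second still points at something slow on the testbox itself.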

Perhaps enable "Log checks" on the backend; it will then log a bit more around the times when the health state is changing, and might tell you how often one or two checks are failing.
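
As a sketch of what that could look like in the generated config: this assumes the "Log checks" checkbox corresponds to HAProxy's `option log-health-checks` directive, which I have not verified against the package source.

```haproxy
backend testbox_http_ipv4
    mode http
    log global
    # Assumption: this is the directive the "Log checks" GUI option writes.
    # It logs health check status updates, including individual failed checks
    # that do not yet take the server DOWN.
    option log-health-checks
    option httpchk OPTIONS /
    server testbox 192.168.39.70:80 check inter 1000
```

With this enabled, syslog should show each failing check and its duration, which helps tell a rare slow response from a server that is regularly close to the limit.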