HTTP load pool marking servers down



  • I've got a setup with a CARP public IP address and a load-balanced web site using 3 servers.  Things are running great, but I noticed in the system logs that the TCP poll checks keep marking all of the servers down (not at the same time) regularly for a few seconds every couple of minutes.  See the log below.

    I'm wondering if anyone has any suggestions or ideas on how to get this cleaned up so the load pool doesn't keep marking the servers as down.  Any help would be appreciated.

    Oct 6 10:04:26 slbd[410]: TCP poll failed to start to 10.2.102.27:80 in default (Operation now in progress)
    Oct 6 10:04:26 slbd[410]: TCP poll failed for 10.2.102.27:80, marking service DOWN
    Oct 6 10:04:26 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:04:31 slbd[410]: TCP poll succeeded for 10.2.102.27:80, marking service UP
    Oct 6 10:04:31 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:07:36 slbd[410]: TCP poll failed to start to 10.2.102.29:80 in default (Operation now in progress)
    Oct 6 10:07:36 slbd[410]: TCP poll failed for 10.2.102.29:80, marking service DOWN
    Oct 6 10:07:36 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:07:41 slbd[410]: TCP poll succeeded for 10.2.102.29:80, marking service UP
    Oct 6 10:07:41 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:10:21 slbd[410]: TCP poll failed to start to 10.2.102.27:80 in default (Operation now in progress)
    Oct 6 10:10:21 slbd[410]: TCP poll failed for 10.2.102.27:80, marking service DOWN
    Oct 6 10:10:21 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:10:26 slbd[410]: TCP poll succeeded for 10.2.102.27:80, marking service UP
    Oct 6 10:10:26 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:12:26 slbd[410]: TCP poll failed to start to 10.2.102.28:80 in default (Operation now in progress)
    Oct 6 10:12:26 slbd[410]: TCP poll failed for 10.2.102.28:80, marking service DOWN
    Oct 6 10:12:26 slbd[410]: Service Apache virtual load pool server changed status, reloading filter policy
    Oct 6 10:12:31 slbd[410]: TCP poll succeeded for 10.2.102.28:80, marking service UP
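    For what it's worth, the "(Operation now in progress)" in those lines is errno EINPROGRESS from a non-blocking connect() — the poll gives up when the connect doesn't complete within its timeout window. A minimal Python sketch of that style of TCP poll (the 2-second timeout is my assumption, not slbd's actual setting):

```python
import errno
import select
import socket

def tcp_poll(host, port, timeout=2.0):
    """Non-blocking TCP health check, similar in spirit to slbd's poll.
    The timeout value here is an assumption, not slbd's real setting."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        rc = s.connect_ex((host, port))
        # A non-blocking connect() normally returns EINPROGRESS --
        # the "Operation now in progress" seen in the log.
        if rc not in (0, errno.EINPROGRESS):
            return False
        # Wait for the socket to become writable (connect completed).
        _, writable, _ = select.select([], [s], [], timeout)
        if not writable:
            return False  # poll timed out -> service would be marked DOWN
        # Writable only means the connect finished; check its result.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    finally:
        s.close()

# Demo against a local listener so the example is self-contained.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
print(tcp_poll("127.0.0.1", srv.getsockname()[1]))  # True: port is open
srv.close()
```

    So a "failed to start" entry means that window expired (or the connect errored) — the backend may be fine but slow to complete the handshake while under load.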



  • I had the same problem once.
    Try increasing the Firewall Maximum States at the System -> Advanced tab.
    I used 20000 and it worked perfectly…
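    For reference, that GUI setting maps to pf's state-table limit; in raw pf.conf terms it would look like the line below (pfSense manages this through the GUI, so this is just illustrative):

```
# pf.conf equivalent of the Firewall Maximum States GUI setting (illustrative)
set limit states 20000
```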

    Good luck...



  • I'm increasing the number of states; I'll post here how it goes.

    Thanks.



  • :( Increasing the number of states did not seem to do anything.  Looking at the states table, it's possible the number of states was exceeding 10,000 at certain times of day.  But now that I've set it to 20,000, I don't think that's the problem.

    I've also tried tweaking the Apache KeepAlive settings, including turning KeepAlive off entirely.  No luck there either.
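    For anyone following along, the KeepAlive knobs in question live in httpd.conf; the values below are just illustrative, not what I'm actually running:

```
# httpd.conf -- KeepAlive settings (values illustrative)
KeepAlive Off

# or, with keep-alives left on but shortened:
# KeepAlive On
# KeepAliveTimeout 5
# MaxKeepAliveRequests 100
```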

    I guess if no solution presents itself, I'll have to switch load balancing to LVS or something.

    The two pfSense boxes are Dell 1850s: dual Xeon 3.4 GHz with 4 GB of memory and a dual-port Intel PRO/1000 network card for WAN and LAN.

