Load balance/failover problem with FTP service

JoshW

I have two pfSense firewalls set up with CARP/Virtual IP failover. I also have a virtual FTP server configured to fail over between two FTP servers in the DMZ. The two servers use separate PASV port ranges, which are also forwarded. Both servers run vsftpd.

The setup works; however, performance is not very good. I have noticed the following in the pfSense logs:

Sep 29 11:33:46 slbd[32492]: TCP poll succeeded for 192.168.3.21:21, marking service UP
Sep 29 11:33:46 slbd[32492]: Switching to sitedown for VIP 69.198.121.120:21
Sep 29 11:33:41 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Operation now in progress)
Sep 29 11:33:41 slbd[32492]: Switching to sitedown for VIP 69.198.121.120:21
Sep 29 11:33:36 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Bad file descriptor)
Sep 29 11:33:36 slbd[32492]: Switching to sitedown for VIP 69.198.121.120:21
Sep 29 11:33:31 slbd[32492]: Service ftp changed status, reloading filter policy
Sep 29 11:33:31 slbd[32492]: TCP poll failed for 192.168.3.21:21, marking service DOWN
Sep 29 11:33:31 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Operation now in progress)
Sep 29 11:33:26 slbd[32492]: Service ftp changed status, reloading filter policy
Sep 29 11:33:26 slbd[32492]: TCP poll succeeded for 192.168.3.21:21, marking service UP
Sep 29 11:33:26 slbd[32492]: Switching to sitedown for VIP 69.198.121.120:21
Sep 29 11:33:21 slbd[32492]: Service ftp changed status, reloading filter policy
Sep 29 11:33:21 slbd[32492]: TCP poll failed for 192.168.3.21:21, marking service DOWN
Sep 29 11:33:21 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Operation now in progress)
Sep 29 11:32:21 slbd[32492]: Service ftp changed status, reloading filter policy
Sep 29 11:32:21 slbd[32492]: TCP poll succeeded for 192.168.3.21:21, marking service UP
Sep 29 11:32:21 slbd[32492]: Switching to sitedown for VIP 69.198.121.120:21
Sep 29 11:32:16 slbd[32492]: Service ftp changed status, reloading filter policy
Sep 29 11:32:16 slbd[32492]: TCP poll failed for 192.168.3.21:21, marking service DOWN
Sep 29 11:32:16 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Operation now in progress)

This happens when I download a file. The download completes, but the transfer rate is about 1/5 of what it should be.
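To put a number on how often the service flaps, the "marking service UP/DOWN" lines can be tallied with a short script. This is only a sketch: the regular expression is written against the slbd lines quoted above, and the function name `count_transitions` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Matches only the "marking service UP/DOWN" lines from the excerpt above.
# "TCP poll failed to start to ..." lines are deliberately not matched.
MARK_RE = re.compile(
    r"slbd\[\d+\]: TCP poll (?:succeeded|failed) for "
    r"(?P<backend>\d+\.\d+\.\d+\.\d+:\d+), marking service (?P<state>UP|DOWN)"
)

def count_transitions(lines):
    """Count UP and DOWN markings per backend (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = MARK_RE.search(line)
        if match:
            counts[(match.group("backend"), match.group("state"))] += 1
    return counts

# A few lines copied from the log above (newest first, as pfSense shows them).
SAMPLE = [
    "Sep 29 11:33:31 slbd[32492]: TCP poll failed for 192.168.3.21:21, marking service DOWN",
    "Sep 29 11:33:31 slbd[32492]: TCP poll failed to start to 192.168.3.21:21 in default (Operation now in progress)",
    "Sep 29 11:33:26 slbd[32492]: TCP poll succeeded for 192.168.3.21:21, marking service UP",
    "Sep 29 11:33:21 slbd[32492]: TCP poll failed for 192.168.3.21:21, marking service DOWN",
    "Sep 29 11:32:21 slbd[32492]: TCP poll succeeded for 192.168.3.21:21, marking service UP",
]

if __name__ == "__main__":
    for (backend, state), n in sorted(count_transitions(SAMPLE).items()):
        print(backend, state, n)
```

In the full excerpt above, 192.168.3.21:21 is marked DOWN three times and UP three times between 11:32:16 and 11:33:46.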

I don't see this behavior for the HTTP/HTTPS failover configured on the same firewalls.

Are there any obvious solutions to this problem? Is there a way to change the poll time for the virtual servers?

Follow-up: The performance problem is apparently not caused by the load balancer's UP/DOWN flapping. I removed the load-balancing virtual server configuration and switched to a plain port forward. FTP throughput is still about 1/5 of what I get when downloading over HTTP from the same server. FTP to this server from LAN to DMZ (not passing through pfSense NAT) is fast. I don't see any messages in the logs that indicate a problem. My only conclusion is that pfSense NAT is introducing a large performance penalty.
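For anyone who wants to reproduce the FTP versus HTTP comparison with the same file, a rough timing sketch follows. It uses only the Python standard library (`ftplib`, `urllib.request`). The host name, path, and URL in the usage note are placeholders, and the helper names `mbit_per_s`, `time_ftp`, and `time_http` are made up here.

```python
import time
from ftplib import FTP
from urllib.request import urlopen

def mbit_per_s(nbytes, seconds):
    """Convert a byte count and a duration in seconds to megabits per second."""
    return nbytes * 8 / seconds / 1_000_000

def time_ftp(host, path, user="anonymous", password=""):
    """Download path over passive-mode FTP and return the rate in Mbit/s."""
    received = 0

    def on_chunk(chunk):
        nonlocal received
        received += len(chunk)

    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.set_pasv(True)          # passive mode, as in the setup described above
        start = time.monotonic()    # time the transfer only, not the login
        ftp.retrbinary(f"RETR {path}", on_chunk)
        elapsed = time.monotonic() - start
    return mbit_per_s(received, elapsed)

def time_http(url):
    """Download url over HTTP and return the rate in Mbit/s."""
    received = 0
    start = time.monotonic()
    with urlopen(url) as resp:
        while chunk := resp.read(65536):
            received += len(chunk)
    return mbit_per_s(received, time.monotonic() - start)
```

Usage would look like `time_ftp("ftp.example.com", "test.bin")` and `time_http("http://ftp.example.com/test.bin")`, run once through the pfSense NAT and once from the LAN, so the four numbers can be compared directly.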

Oh, and this is with pfSense 1.2.2. Both pfSense and the FTP server are running as virtual machines on ESXi 4.

markl

I am seeing similar problems with 1.2.3 (I also tried a 1.2.3 RC snapshot)…

I am trying to do inbound load balancing in front of 7 servers running different services, 2 of which are completely idle.

I keep seeing: "slbd[327]: TCP poll failed to start to 10.1.1.106:143 in default (Operation now in progress)" and the service gets marked as bad. 10.1.1.106 is one of the idle servers.
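For context, "Operation now in progress" is the standard message text for EINPROGRESS, which a non-blocking connect() returns while the TCP handshake is still pending. On its own it does not mean the server refused the connection. How slbd handles that return value internally is not shown in this thread, so the following is only a sketch of how a non-blocking TCP health check usually treats it, not slbd's actual code. The function name `tcp_poll` and the 2-second timeout are assumptions.

```python
import errno
import select
import socket

def tcp_poll(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex((host, port))
        if rc == 0:
            return True  # connected immediately
        if rc not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            return False  # a real error, e.g. connection refused
        # EINPROGRESS: the handshake is still running. Wait until the
        # socket becomes writable, then read the final result from SO_ERROR.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return False  # timed out
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    finally:
        sock.close()
```

Running something like `tcp_poll("10.1.1.106", 143)` from another machine on the same segment would show whether the idle server itself is slow to accept connections or whether the failures only appear in the slbd log.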

So I am not sure whether the OP's problem is specific to FTP or a general inbound load-balancing problem.