pfSense on qemu-kvm - slave firewall intermittently flipping to master and back
-
Hi,
I've got a problem with our slave firewall intermittently deciding that the master firewall is down and promoting itself to master. After some investigation, I believe this is caused by very slow inbound network traffic on pfSense (outbound network traffic seems OK).
Our setup is two pfSense instances in HA via CARP and virtual IPs: one master (dmz002) and one slave (dmz001). There are three VLANs: one dedicated to CARP traffic between the firewalls, one called BACKEND and one called FRONTEND. The pfSense firewalls are virtualized with qemu-kvm and use the virtio network driver. There are two VMs (bes001 and bes002) on the BACKEND VLAN and two VMs on the FRONTEND VLAN. There are two physical hosts running CentOS 6.5. TSO, GSO, etc. are turned off for every physical or virtual NIC and bridge (ethtool -K $i tx off sg off tso off gso off gro off).
dmz001 and bes001 are on one host, dmz002 and bes002 are on the other host.
The load on the hosts is around 0.4 with 4 CPU cores, so it's not CPU bound. There is also heaps of RAM, and the CentOS hosts are connected via a 1 Gb/s link.
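The offload settings described above could be applied with a small loop like the following. This is only a sketch: the function name disable_offloads and the DRY_RUN switch are made up for illustration, and the interface names in the usage comment are placeholders, not the ones from this setup.

```shell
#!/bin/sh
# Sketch: turn off the offloads named in the post on every interface given
# as an argument. With DRY_RUN=1 the commands are printed instead of run.
disable_offloads() {
    for i in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ethtool -K $i tx off sg off tso off gso off gro off"
        else
            ethtool -K "$i" tx off sg off tso off gso off gro off
        fi
    done
}

# Example use on a CentOS host (placeholder interface names, not run here):
#   disable_offloads eth0 br0 vnet0
#   ethtool -k eth0        # lowercase -k shows the resulting offload settings
```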
The real problem is that the slave firewall intermittently thinks that it is master. I believe that is because slow inbound traffic to the slave firewall delays the CARP traffic.
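One way to check whether CARP advertisements really arrive late on the slave is to timestamp them with tcpdump and look for gaps. A minimal sketch, assuming CARP's IP protocol number 112, a hypothetical interface name vtnet2 for the CARP VLAN, and a 3 second threshold chosen only for illustration (the actual failover timeout depends on the advbase and advskew configured on the VIPs). The helper name carp_gaps is made up here.

```shell
#!/bin/sh
# Reads `tcpdump -tt` lines on stdin (first field = epoch seconds) and
# prints every gap between consecutive packets longer than the threshold.
# $1 = threshold in seconds (defaults to 3, an illustrative value).
carp_gaps() {
    awk -v thr="${1:-3}" '
        NR > 1 && ($1 - prev) > thr {
            printf "gap of %.2f s before %s\n", $1 - prev, $1
        }
        { prev = $1 }
    '
}

# Example use on the slave firewall (vtnet2 is a placeholder, not run here):
#   tcpdump -l -tt -ni vtnet2 ip proto 112 | carp_gaps 3
```

If gaps show up at the same moments the slave promotes itself, that would support the theory that delayed CARP packets are the trigger.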
Now, we did some benchmarks running iperf for 3 minutes (iperf client on the left, iperf server on the right):
dmz001 -> dmz002 296 Mb/s
dmz001 -> bes001 602 Mb/s
dmz002 -> bes001 731 Mb/s
dmz002 -> bes002 630 Mb/s
bes001 -> dmz002 351 Mb/s
bes001 -> dmz001 375 Mb/s
bes001 -> bes002 850 Mb/s
So, we get 850 Mb/s between VMs (that's expected), 730 Mb/s from pfSense to a VM on a different host (that is OK), only 630 Mb/s from pfSense to a VM on the same host (this is weird), only 350 Mb/s from a VM to pfSense on a different host (this is really slow) and only 300 Mb/s between the pfSense firewalls.
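For anyone wanting to reproduce the numbers, each row corresponds roughly to a pair of commands like the ones printed below. This is a sketch assuming iperf 2 with default TCP settings; the helper name iperf_cmds is hypothetical, and the exact options used in the original runs are not known beyond the 3 minute duration.

```shell
#!/bin/sh
# Hypothetical helper: print the two commands for one "client -> server" row.
# $1 = client host, $2 = server host, $3 = duration in seconds (default 180).
iperf_cmds() {
    client=$1
    server=$2
    secs=${3:-180}
    echo "on $server: iperf -s"
    echo "on $client: iperf -c $server -t $secs -f m"
}

# Example: the "bes001 -> dmz002" row
#   iperf_cmds bes001 dmz002
```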
Has anybody experienced anything like this?
Thanks,
Tomas
-
I think our issue might be related to this bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=165059
Another thing we noticed: if we enable TSO on the pfSense vtnet interfaces, all traffic to the VMs on the same host as the master pfSense firewall gets stuck. Traffic going via the master firewall to a VM on a different host is fine.
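To compare behaviour with TSO on and off, the current state of a vtnet interface can be read from the options line of ifconfig and toggled with ifconfig's tso / -tso flags. A sketch, assuming the interface is called vtnet0 (a placeholder) and that enabled TSO shows up as TSO4 or TSO6 in the options line; the helper name tso_enabled is made up here.

```shell
#!/bin/sh
# Reads `ifconfig <iface>` output on stdin and succeeds (exit 0) if the
# options line lists a TSO capability as enabled.
tso_enabled() {
    grep 'options=' | grep -q 'TSO'
}

# Example use in the pfSense shell (vtnet0 is a placeholder, not run here):
#   ifconfig vtnet0 | tso_enabled && echo "TSO is on"
#   ifconfig vtnet0 -tso     # turn TSO off on this interface
#   ifconfig vtnet0 tso      # turn it back on
```

Settings changed with ifconfig this way last only until the interface is reconfigured or the VM reboots, so they are mainly useful for a quick before/after test.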