Testing High Availability
I have installed a High Availability cluster of pfSense firewalls, as shown in the attached diagram. The installed version is 2.4.1.
I have followed the document https://portal.pfsense.org/docs/guides/highavailability/ha-on-sg-4860.html and it
seems to work fine. XMLRPC Sync and CARP are working.
I have also set up manual outbound NAT and set the Translation Address to the WAN VIP address on the primary node,
and the rules are reflected on the secondary node.
But if I start downloading a file, for example with wget, from a computer on the LAN and then enter Persistent CARP
Maintenance Mode on the primary node, the download stops.
I don't know why this happens. Is it a problem with the pfSense configuration, or could it be related to the WAN switch?
I have the same problem in one installation.
Remember that pfsync must also be working in order to transfer TCP states.
The GUI is misleading: on the master you should enable XMLRPC sync AND pfsync.
On the slave you never enable XMLRPC sync, but you must enable pfsync!
I have State Synchronization Settings (pfsync) enabled on both firewalls, primary and secondary,
and XMLRPC Sync only on the primary.
But the connections are not maintained when I enter Persistent CARP
Maintenance Mode on the primary node.
Look at Diagnostics > States. See what is actually happening. Post them from both nodes.
I have the same problem and I have done the same things. In Diagnostics > States there are a great many states, and they differ between the two firewalls.
Filter them on what you are interested in.
In my system, the Diagnostics > States entries are the same on the primary and secondary firewalls (with very few differences).
I have checked the states too, and on the slave I see the same states as on the master.
I can rule out other problems because of this test:
- start a TCP connection through the master and disable the master: the TCP connection stops working; re-enable the master and the connection starts exchanging packets again;
- swap the master and slave roles and repeat the test: I get the same result.
So master and slave have the same behaviour and the same configuration.
I have other pfSense installations with HA, and I have this problem only in this one.
Post the states. Detail which address is which (interface, CARP, etc.).
On master I have:
VDSL200 udp yyy.183.73.74:53634 (192.168.0.4:5060) -> xxx.97.59.76:5060 MULTIPLE:MULTIPLE 13.985 K / 32.303 K 8.46 MiB / 9.34 MiB
vdsl200 udp yyy.183.73.74:53634 (192.168.0.4:5060) -> xxx.97.59.76:5060 MULTIPLE:MULTIPLE 34 / 50 23KiB/18KiB
In this case I used a VoIP call, which is UDP (so it should not have states). The VoIP call "stays" on the master.
I forgot to mention an important thing: ICMP "works". I mean, if I ping from inside to 22.214.171.124 and take the master down, the ping packets continue to flow.
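For anyone who wants to compare the state tables of the two nodes mechanically rather than by eye, here is a minimal sketch in Python. It assumes the states were copied as text, one per line, in the same shape as the two lines pasted above (interface, protocol, source, optional NAT-internal address in parentheses, direction, destination, state, counters). The names `state_key` and `diff_states` are made up for this example and are not part of pfSense.

```python
import re

# Matches lines shaped like the pasted ones:
#   IFACE PROTO SRC (INNER) -> DST STATE <counters...>
# The "(INNER)" part is optional; the counters after STATE are ignored.
STATE_RE = re.compile(
    r"^(?P<iface>\S+)\s+(?P<proto>\S+)\s+"
    r"(?P<src>\S+)\s+(?:\((?P<inner>[^)]+)\)\s+)?"
    r"(?P<dir>->|<-)\s+(?P<dst>\S+)\s+(?P<state>\S+)"
)

def state_key(line):
    """Return a comparable tuple for one state line, or None if it does not parse."""
    m = STATE_RE.match(line.strip())
    if m is None:
        return None
    # Interface names are lowercased because the two nodes may show
    # them with different capitalisation (VDSL200 vs vdsl200).
    return (m["iface"].lower(), m["proto"], m["src"],
            m["inner"] or "", m["dir"], m["dst"])

def diff_states(master_lines, backup_lines):
    """Return (only_on_master, only_on_backup) as sorted lists of keys."""
    master = {k for k in map(state_key, master_lines) if k}
    backup = {k for k in map(state_key, backup_lines) if k}
    return sorted(master - backup), sorted(backup - master)
```

Feeding it the two lines above gives two empty lists, i.e. the same flow is known to both nodes. A connection that shows up only in the first list is a state that was not synced to the backup.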
Is yyy.183.73.74 the CARP VIP?
Is 192.168.0.4 set to use the CARP VIP on the firewall on that interface as its default gateway?
Yes, yyy.183.73.74 is the public CARP VIP. NAT is on that IP.
The private subnet CARP IP is 192.168.0.254. Yes, DHCP gives 192.168.0.254 as the gateway to the computers.
I will explain again the tests I have done (rafel, please run these tests too):
- ping from an internal PC (e.g. 192.168.0.55) to 126.96.36.199. Ping works. Fence the master. The slave becomes master. Ping continues to work! That means NAT/DHCP/CARP/… are all OK, right?
- telnet from 192.168.0.55 to Internet server xx.yy.aa.bb. Telnet works. Fence the master. The slave becomes master. Telnet stops working!!!
After test 2, someone might tell me: obviously in your setup the master configuration differs from the slave's, perhaps in some firewall settings.
OK! So I swap master and slave and run test 2 again. I get the same result!
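Test 2 can be made repeatable with a small script instead of an interactive telnet session. The following is only a sketch under assumptions: it needs a server on the Internet side that echoes back whatever it receives (a hypothetical test server, not something from this thread), and `probe_tcp` is a name invented here. It holds ONE TCP connection open and checks it once per interval, so you can see exactly when the existing connection stalls after failover and whether it recovers.

```python
import socket
import time

def probe_tcp(host, port, rounds=10, interval=1.0, timeout=3.0):
    """Keep a single TCP connection open and echo-test it every `interval` seconds.

    Returns a list of (round_number, ok) tuples. Assumes the peer echoes
    back the bytes it receives. `timeout` applies to each socket call.
    """
    results = []
    buf = bytearray()  # bytes received but not yet matched to a probe
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for n in range(rounds):
            payload = f"probe {n}\n".encode()
            try:
                sock.sendall(payload)
                # Read until this round's echo shows up; late echoes from
                # earlier rounds are skipped over rather than miscounted.
                while payload not in buf:
                    chunk = sock.recv(4096)
                    if not chunk:
                        raise ConnectionError("peer closed the connection")
                    buf += chunk
                del buf[:buf.find(payload) + len(payload)]
                ok = True
            except OSError:  # includes timeouts and connection resets
                ok = False
            results.append((n, ok))
            print(time.strftime("%H:%M:%S"), "round", n, "ok" if ok else "FAILED")
            time.sleep(interval)
    return results
```

Run it from a LAN host such as 192.168.0.55, put the primary into Persistent CARP Maintenance Mode partway through, and note the round at which it starts reporting FAILED. Comparing that timestamp with the state table on the new master shows whether the state for this connection existed there at the moment of failover.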
How can I debug it?
Does anyone have HA working on 2.4.1?
I have tried to duplicate several of these reports and the only case I can find where there might be a problem is described here: