Questions about CARP setup
-
Hi all,
We are currently setting up a test lab with two (soon to be three) hypervisors, and a public subnet delivered in a VLAN.
pfSense has a /28 network on WAN and uses Virtual IPs with 1:1 NAT to provide access to the DMZ; this is working as expected.
Then we have 4 local networks: LAN, DMZ, DATA and ADM.
We will be using the ADM network for CARP SYNC as this network has little traffic.
Now I have two main questions, regarding WAN and DHCP:
DHCP: the pfSense is serving as DHCP server on LAN. How should the other pfSense routers be configured? DHCP enabled, disabled, relay?
Should I create a static mapping for the secondary pfSense routers or just give them a static IPv4 config?
WAN: On the /28, one of the IPs in this network is my gateway to the outside, which theoretically is always online. Should the pfSense routers have different WAN addresses, or can one pick up the same WAN address when the other goes down?
Thanks
-
You need to read the docs first:
https://docs.netgate.com/pfsense/en/latest/book/highavailability/index.html
Some comments:
I would create a separate local network for the SYNC traffic.
Make sure you configure your hypervisor correctly for CARP traffic.
The secondary box should be statically addressed. I've never tried having two secondary nodes (tertiary?) and I'm not even sure this is supported. I don't see why you'd need it.
You need a public IP for each node AND shared CARP IPs.
-
Hi @dotdash , thank you a lot for your quick reply.
I have read the documentation :) but thanks for pointing it out. I still had these questions afterwards...
Regarding your comments, I already suspected the public IPs were necessary and just wanted to confirm. About some other things you mentioned, I would like to better understand some points:
I would create a separate local network for the SYNC traffic.
The ADM network has little to no traffic, and small payloads. Why wouldn't it be suitable for use as the SYNC network? What conditions may interfere with its traffic?
Make sure you configure your hypervisor correctly for CARP traffic.
CARP traffic isn't routed through the hypervisor. All hypervisors have an interface (without IP configuration) picking up a VLAN, and that interface corresponds to pfSense's WAN interface. No hypervisors are visible on this network. From my understanding, the hypervisor does not require any config for CARP to work in such a scenario, but what configuration should I consider?
I've never tried having two secondary nodes (tertiary?) and I'm not even sure this is supported. I don't see why you'd need it.
When I mentioned nodes I meant hypervisor nodes, but that does lead to a tertiary pfSense node. We were thinking of setting up three pfSense routers (one per hypervisor), and of testing failover by disabling two hypervisors at a time. We assumed this would be easy to set up and scale, as the documentation states:
Though often erroneously called a “CARP Cluster”, two or more redundant pfSense firewalls are more aptly titled a “High Availability Cluster” (...)
The most common High Availability cluster configuration includes only two nodes. It is possible to have more nodes in a cluster, but they do not provide a significant advantage.
From this we assumed adding more pfsense nodes and different topologies would be attainable. Are you suggesting this is not supported/hard to achieve or not tested?
Thank you!
-
You can use an existing network for sync traffic, but it's easy to create an isolated vlan for it.
ESXi, for example, requires some tweaking to the virtual switch for use with CARP. That's why I mentioned it. Not sure about others.
I have never personally tried running a three node cluster. If the docs say it's possible, then you should be fine. I would bet that it doesn't get a lot of testing though.
-
Right. I see that a dedicated VLAN for sync traffic is a solution for privacy and security, but considering that the network we will use is already a secured network, with few clients that are all accounted for in their functions, it does seem adequate.
However, we may consider different configurations in the future.
What kind of tweaking, and of which settings? It may come in useful to know :)
About the three node cluster: as I mentioned, our approach to this setup is testing different scenarios. Although the docs say it is possible, I'm curious about how some features would work with a three or four node setup, for example the DHCP server, since there's only space for one failover IP. Unless they can be comma or space separated? Or would we configure pfsense1 with the failover IP of pfsense2, and pfsense2 with that of pfsense3?
-
@maverickws said in Questions about CARP setup:
What kind of tweaking of which settings?
See the 'Hypervisor Users' section here:
https://docs.netgate.com/pfsense/en/latest/highavailability/troubleshooting-high-availability-clusters.html
-
re: 3 pfSense in HA...that got asked a few weeks ago (https://forum.netgate.com/topic/155682/ha-for-three-or-more-devices) with no responses. I see it in the docs, as you pointed out, but I don't know how it would be accomplished since the HA settings only allow specifying one IP address to sync to. Maybe you could have the second node sync its config to the third, but I don't know how all three would sync states.
-
HA with more than 2 devices isn't really a good idea. The only implementation that works at all is a daisy-chain-like setup, in which you configure the secondary node like the primary node, but pointing at the third one. For example, you would not set up pfsync with a peer IP, and you would set up XMLRPC sync from primary to secondary AND from secondary to third node. Also, those few HA services that support running in some sort of active-active('ish) mode, such as DHCP, aren't made for running on more than two nodes.
That's why it isn't such a great idea to set up, as you would literally daisy-chain a configuration from node 1->2->3.
As there are no really big things besides 1-2 smaller services like DHCP, DNS or NTP that could actually run on every node without being stopped on a non-master node, and the FreeBSD pf/CARP implementation doesn't really have active/active setups in mind, adding a third node isn't appealing from an availability or security standpoint. I'd say you'd be better off using a potential third node as a cold spare, with pfSense installed and some sort of console/mgmt interface set up so you can fire it up quickly and restore a config backup on it, rather than installing it as a tertiary node in a daisy-chain ring.
-
Hi @JeGr thank you a lot for your input. You made some very interesting observations, and in fact as @dotdash and @teamits mentioned a configuration with over 2 nodes isn't friendly.
I did make the setup with two nodes, but something is failing.
I read about the issue @dotdash mentioned, but it seems very specific to VMware's ESX virtualisation. I searched for the same issue relating to XenServer, pfSense and CARP, and didn't find that solution applied to Xen.
So right now I have the following config:
Public subnet: 1.2.3.100/28
pfSense CARP WAN VIP: 1.2.3.100/28 pfSense1 WAN: 1.2.3.101/28 pfSense2 WAN: 1.2.3.102/28
pfSense CARP LAN VIP: 172.16.1.254/24 pfSense1 LAN: 172.16.1.1/24 pfSense2 LAN: 172.16.1.2/24
pfSense CARP SYNC VIP: 172.16.254.254/24 pfSense1 SYNC: 172.16.254.1/24 pfSense2 SYNC: 172.16.254.2/24
pfSense CARP IP's for 1:1 NAT: 1.2.3.105 1.2.3.106 1.2.3.107 1.2.3.108 (etc)
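A quick sanity check of the address plan above can be sketched in Python. This is only an illustration: it uses the anonymised addresses exactly as posted, and the names `PLAN`, `NAT_VIPS`, `shared_network` and `check_plan` are made up for this example. It checks that each CARP VIP sits in the same network as both node addresses, and that the 1:1 NAT VIPs are usable hosts in the WAN network.

```python
import ipaddress

# Address plan as posted (the poster's anonymised values).
# First entry per segment is the CARP VIP, then pfSense1, then pfSense2.
PLAN = {
    "WAN": ["1.2.3.100/28", "1.2.3.101/28", "1.2.3.102/28"],
    "LAN": ["172.16.1.254/24", "172.16.1.1/24", "172.16.1.2/24"],
    "SYNC": ["172.16.254.254/24", "172.16.254.1/24", "172.16.254.2/24"],
}
NAT_VIPS = ["1.2.3.105", "1.2.3.106", "1.2.3.107", "1.2.3.108"]


def shared_network(cidrs):
    """Return the single network all addresses sit in, or raise ValueError."""
    networks = {ipaddress.ip_interface(cidr).network for cidr in cidrs}
    if len(networks) != 1:
        raise ValueError(f"addresses span several networks: {networks}")
    return networks.pop()


def check_plan(plan, nat_vips):
    """Map each segment to its network and verify the NAT VIPs fit in WAN."""
    result = {name: shared_network(cidrs) for name, cidrs in plan.items()}
    wan_hosts = set(result["WAN"].hosts())
    for vip in nat_vips:
        if ipaddress.ip_address(vip) not in wan_hosts:
            raise ValueError(f"{vip} is not a usable host in {result['WAN']}")
    return result


if __name__ == "__main__":
    for name, network in check_plan(PLAN, NAT_VIPS).items():
        print(name, network)
```

Note that a /28 containing .100 spans .96 to .111, so the network address of the posted WAN subnet is 1.2.3.96, with .97 to .110 usable.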
- I have enabled High Availability Sync: both pfsync and XMLRPC sync.
  Sync appears to be working perfectly, except for one detail: it does sync the authentication servers, but on the second pfSense the selected authentication server was still the local database and I had to change this manually.
- We configured every interface accordingly, and set the DHCP server to use the CARP LAN IP for DNS and gateway, and set the failover IP;
- Changed to manual outbound NAT and changed the rules to use the WAN CARP VIP instead of the interface address;
- Added extra NAT rules to overcome the issue mentioned in the documentation: https://docs.netgate.com/pfsense/en/latest/highavailability/troubleshooting-vpn-connectivity-to-a-high-availability-secondary-node.html
- We're using site-to-site IPsec VPN and changed the tunnel configuration to use the WAN CARP IP as interface:
  VPN site-to-site is working: I dial 1.2.3.100 and the connection is established, with no issues on VPN traffic;
- NAT 1:1 OK - I can access all servers using the Virtual IP configured for CARP.
- Each CARP IP has its own ID: the /28 subnet addresses have VHIDs matching the last octet, and the private addresses have VHIDs matching the third octet. There are no overlapping VHIDs.
- All CARP IPs appear as Master on the primary and Backup on the secondary.
- Rules for SYNC interface are allow SYNC Net to any.
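The VHID numbering scheme described in the list above (last octet for the public /28 addresses, third octet for the private ones) can be checked for collisions with a short Python sketch. This is an illustration only: the VIP list contains just the addresses named in the post (the "(etc)" entries are unknown and left out), and `vhid_for` and `assign_vhids` are hypothetical helper names. It assumes VHIDs should be unique across all VIPs, as the post states, and within the valid CARP range of 1 to 255.

```python
import ipaddress

# Only the CARP VIPs explicitly listed in the post.
VIPS = [
    "1.2.3.100", "1.2.3.105", "1.2.3.106", "1.2.3.107", "1.2.3.108",
    "172.16.1.254", "172.16.254.254",
]


def vhid_for(vip):
    """Apply the poster's scheme: third octet if private, else last octet."""
    address = ipaddress.ip_address(vip)
    octets = address.packed  # four bytes for an IPv4 address
    return octets[2] if address.is_private else octets[3]


def assign_vhids(vips):
    """Return {vip: vhid}; raise ValueError on a duplicate or invalid VHID."""
    seen = {}
    for vip in vips:
        vhid = vhid_for(vip)
        if not 1 <= vhid <= 255:
            raise ValueError(f"{vip}: VHID {vhid} is outside 1-255")
        if vhid in seen:
            raise ValueError(f"{vip} and {seen[vhid]} both map to VHID {vhid}")
        seen[vhid] = vip
    return {vip: vhid for vhid, vip in seen.items()}


if __name__ == "__main__":
    for vip, vhid in assign_vhids(VIPS).items():
        print(f"{vip:>16}  VHID {vhid}")
```

With the listed addresses the scheme produces no collisions, but it would if a public VIP ended in .1 or .254, since those values are already taken by the LAN and SYNC VIPs.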
Following an article I found I also performed the additional configurations (after HA failed in the first tests):
- System > Advanced > Firewall & NAT
- Enable NAT Reflection for 1:1 NAT: checked
- Enable automatic outbound NAT for Reflection
Despite all sync seeming correct, when I halt the first system the secondary quickly changes from BACKUP to MASTER, but the VPN stays down and traffic doesn't reach the servers.
For example, this was a PING I was running to the public IP of a web server. When the primary is master I can access the site without issues, and ping it. But when the secondary becomes master, nothing works.
64 bytes from 1.2.3.105: icmp_seq=39 ttl=47 time=52.099 ms
64 bytes from 1.2.3.105: icmp_seq=40 ttl=47 time=51.661 ms
Request timeout for icmp_seq 41
Request timeout for icmp_seq 42
Request timeout for icmp_seq 43
Request timeout for icmp_seq 44
Request timeout for icmp_seq 45
Request timeout for icmp_seq 46
Request timeout for icmp_seq 47
Request timeout for icmp_seq 48
Request timeout for icmp_seq 49
Request timeout for icmp_seq 50
Request timeout for icmp_seq 51
Request timeout for icmp_seq 52
Request timeout for icmp_seq 53
Request timeout for icmp_seq 54
Request timeout for icmp_seq 55
Request timeout for icmp_seq 56
Request timeout for icmp_seq 57
Request timeout for icmp_seq 58
Request timeout for icmp_seq 59
Request timeout for icmp_seq 60
Request timeout for icmp_seq 61
Request timeout for icmp_seq 62
Request timeout for icmp_seq 63
Request timeout for icmp_seq 64
Request timeout for icmp_seq 65
Request timeout for icmp_seq 66
Request timeout for icmp_seq 67
Request timeout for icmp_seq 68
Request timeout for icmp_seq 69
Request timeout for icmp_seq 70
Request timeout for icmp_seq 71
Request timeout for icmp_seq 72
Request timeout for icmp_seq 73
Request timeout for icmp_seq 74
Request timeout for icmp_seq 75
Request timeout for icmp_seq 76
Request timeout for icmp_seq 77
Request timeout for icmp_seq 78
Request timeout for icmp_seq 79
Request timeout for icmp_seq 80
Request timeout for icmp_seq 81
Request timeout for icmp_seq 82
Request timeout for icmp_seq 83
Request timeout for icmp_seq 84
Request timeout for icmp_seq 85
Request timeout for icmp_seq 86
Request timeout for icmp_seq 87
Request timeout for icmp_seq 88
Request timeout for icmp_seq 89
Request timeout for icmp_seq 90
Request timeout for icmp_seq 91
Request timeout for icmp_seq 92
Request timeout for icmp_seq 93
Request timeout for icmp_seq 94
Request timeout for icmp_seq 95
Request timeout for icmp_seq 96
Request timeout for icmp_seq 97
Request timeout for icmp_seq 98
Request timeout for icmp_seq 99
Request timeout for icmp_seq 100
Request timeout for icmp_seq 101
Request timeout for icmp_seq 102
Request timeout for icmp_seq 103
Request timeout for icmp_seq 104
Request timeout for icmp_seq 105
Request timeout for icmp_seq 106
Request timeout for icmp_seq 107
Request timeout for icmp_seq 108
Request timeout for icmp_seq 109
Request timeout for icmp_seq 110
Request timeout for icmp_seq 111
Request timeout for icmp_seq 112
Request timeout for icmp_seq 113
Request timeout for icmp_seq 114
Request timeout for icmp_seq 115
Request timeout for icmp_seq 116
Request timeout for icmp_seq 117
Request timeout for icmp_seq 118
Request timeout for icmp_seq 119
Request timeout for icmp_seq 120
Request timeout for icmp_seq 121
Request timeout for icmp_seq 122
Request timeout for icmp_seq 123
Request timeout for icmp_seq 124
Request timeout for icmp_seq 125
Request timeout for icmp_seq 126
Request timeout for icmp_seq 127
Request timeout for icmp_seq 128
Request timeout for icmp_seq 129
Request timeout for icmp_seq 130
Request timeout for icmp_seq 131
Request timeout for icmp_seq 132
Request timeout for icmp_seq 133
Request timeout for icmp_seq 134
Request timeout for icmp_seq 135
Request timeout for icmp_seq 136
Request timeout for icmp_seq 137
Request timeout for icmp_seq 138
Request timeout for icmp_seq 139
64 bytes from 1.2.3.105: icmp_seq=140 ttl=47 time=52.780 ms
64 bytes from 1.2.3.105: icmp_seq=141 ttl=47 time=54.253 ms
64 bytes from 1.2.3.105: icmp_seq=142 ttl=47 time=51.698 ms
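To put a number on the outage in the ping output above, the longest run of consecutive timeouts can be extracted with a small Python sketch. The function name `longest_outage` and the `sample_output` helper are invented for this illustration; the sample simply regenerates the sequence numbers seen in the post. Converting the count to seconds assumes ping was sending one probe per second, which is the usual default but is not stated in the post.

```python
import re

# Matches the timeout lines printed by ping, e.g. "Request timeout for icmp_seq 41".
TIMEOUT = re.compile(r"Request timeout for\s+icmp_seq (\d+)")


def longest_outage(ping_output):
    """Return (first_seq, last_seq, count) for the longest timeout run, or None."""
    seqs = sorted({int(m.group(1)) for m in TIMEOUT.finditer(ping_output)})
    best = None
    start = prev = None
    for seq in seqs:
        if prev is None or seq != prev + 1:
            start = seq  # a new run of consecutive timeouts begins here
        prev = seq
        length = seq - start + 1
        if best is None or length > best[2]:
            best = (start, seq, length)
    return best


def sample_output():
    """Rebuild the shape of the posted output: replies, timeouts 41-139, replies."""
    lines = ["64 bytes from 1.2.3.105: icmp_seq=40 ttl=47 time=51.661 ms"]
    lines += [f"Request timeout for icmp_seq {n}" for n in range(41, 140)]
    lines.append("64 bytes from 1.2.3.105: icmp_seq=140 ttl=47 time=52.780 ms")
    return "\n".join(lines)


if __name__ == "__main__":
    print(longest_outage(sample_output()))  # (41, 139, 99)
```

That is 99 lost probes between icmp_seq 41 and 139, so roughly a minute and a half without replies if the one-second interval assumption holds.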
-
-
You don't need a VIP for the sync interface. Each router just needs the other one's IP set in the HA pfsync settings.
-
Hi @teamits hehe well, actually I do, because the SYNC network also has a few other clients behind it that require the VIP, just as on LAN.
Basically as I mentioned on the above posts, I chose an already existing network for SYNC that has two other clients beside the pfSense machines. This is a secured network and these are administration machines with restricted access and little traffic.
The documentation recommends a separate network, as I see it, for two reasons:
- network availability and load
- privacy and security (as passwords aren't really encrypted)
Since the chosen network meets these requirements (it is a very restricted network with very low traffic), it was used for sync, and hence the interface used for sync has a CARP VIP.
Anyway, all configurations: HA, Interfaces and DHCP server etc have the peer IP directly where it belongs, not the CARP VIP.
I expect this interface to work like the other interfaces (LAN/DMZ/DATA etc.).