• Hi all,

    We are currently setting up a test lab with two (soon to be three) hypervisors, and a public subnet delivered in a VLAN.

    pfSense has a /28 network on WAN, and we're using Virtual IPs and 1:1 NAT to provide access to the DMZ; this is working as expected.
    Then we have four local networks: LAN, DMZ, DATA and ADM.
    We will be using the ADM network for CARP SYNC, as this network has little traffic.

    Now I have two main questions, regarding WAN and DHCP:

    DHCP: pfSense is serving as the DHCP server on LAN. How should the other pfSense routers be configured: DHCP enabled, disabled, or relay?
    Should I create static mappings for the secondary pfSense routers, or just give them static IPv4 configurations?

    WAN: On the /28, one of the IPs in this network is my gateway to the outside, which theoretically is always online. Should the pfSense routers have different WAN addresses, or can the other pick up the same WAN address when one goes down?

    Thanks


  • You need to read the docs first:
    https://docs.netgate.com/pfsense/en/latest/book/highavailability/index.html
    Some comments:
    I would create a separate local network for the SYNC traffic.
    Make sure you configure your hypervisor correctly for CARP traffic.
    The secondary box should be statically addressed. I've never tried having two secondary nodes (tertiary?) and I'm not even sure this is supported. I don't see why you'd need it.
    You need a public IP for each node AND shared CARP IPs.
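
    To illustrate the shared-address part: with CARP, each node keeps its own unique address and the shared VIP floats between them. A minimal sketch in FreeBSD ifconfig terms (pfSense configures this through Firewall > Virtual IPs instead; the interface name, addresses, vhid and password here are placeholders):

```shell
# Node 1 (intended master): its own WAN address, plus the shared CARP VIP.
ifconfig em0 inet 203.0.113.101/28
ifconfig em0 vhid 10 advskew 0 pass examplepw alias 203.0.113.100/28

# Node 2 (backup): a different unique address, the SAME vhid and password,
# and a higher advskew so it only takes over when node 1 stops advertising.
ifconfig em0 inet 203.0.113.102/28
ifconfig em0 vhid 10 advskew 100 pass examplepw alias 203.0.113.100/28
```

    Whichever node wins the CARP election answers for the VIP; the per-node addresses never move.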


  • Hi @dotdash, thank you a lot for your quick reply.

    I have read the documentation :) but thanks for pointing it out; I still had these questions afterwards...
    Regarding your comments, I already suspected the need for public IPs and just wanted to confirm. About some other things you mentioned, I would like to better understand a few points:

    I would create a separate local network for the SYNC traffic.

    The ADM network has little to no traffic, and only small payloads. Why wouldn't it be suitable for use as the SYNC network? What conditions might interfere with its traffic?

    Make sure you configure your hypervisor correctly for CARP traffic.

    CARP traffic isn't routed through the hypervisor. All hypervisors have an interface (without IP configuration) picking up a VLAN, and that interface corresponds to pfSense's WAN interface. No hypervisors are visible on this network. From my understanding, the hypervisor does not require any configuration for CARP to work in such a scenario, but what configuration should I consider?

    I've never tried having two secondary nodes (tertiary?) and I'm not even sure this is supported. I don't see why you'd need it.

    When I mentioned nodes I meant hypervisor nodes, but that does lead to a tertiary pfSense node. We were thinking of setting up three pfSense routers (one per hypervisor) and testing failover by disabling two hypervisors at a time. We assumed this would be easy to set up and scale, as the documentation states:

    Though often erroneously called a “CARP Cluster”, two or more redundant pfSense firewalls are more aptly titled a “High Availability Cluster” (...)

    The most common High Availability cluster configuration includes only two nodes. It is possible to have more nodes in a cluster, but they do not provide a significant advantage.

    From this we assumed that adding more pfSense nodes and different topologies would be attainable. Are you suggesting this is not supported, hard to achieve, or simply not tested?

    Thank you!


  • You can use an existing network for sync traffic, but it's easy to create an isolated VLAN for it.
    ESXi, for example, requires some tweaking to the virtual switch for use with CARP; that's why I mentioned it. Not sure about others.
    I have never personally tried running a three-node cluster. If the docs say it's possible, then you should be fine. I would bet that it doesn't get a lot of testing, though.
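
    For the record, the ESXi tweak usually cited for CARP is the virtual switch security policy: promiscuous mode and forged transmits need to be accepted, or the backup node never sees the master's advertisements. A hedged sketch of the CLI form (the vSwitch name is a placeholder; verify the option names against your ESXi version, and the same toggles exist in the vSphere UI):

```shell
# Illustrative esxcli invocation for a standard vSwitch.
esxcli network vswitch standard policy security set \
    --vswitch-name=vSwitch0 \
    --allow-promiscuous=true \
    --allow-forged-transmits=true \
    --allow-mac-change=true
```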


  • Right. I see that a dedicated VLAN for sync traffic is the solution with regard to privacy and security, but considering the network that will be used is already secured, with few clients, all accounted for in their functions, it does seem adequate.
    However, we may consider different configurations in the future.

    What kind of tweaking, of which settings? It may be useful to know :) About the three-node cluster: as I mentioned, our approach to this setup is to test different scenarios. Although the docs say it's possible, I'm curious how it would work with a three- or four-node setup for features like the DHCP server, since there is only space for one failover IP. Unless they can be comma- or space-separated? Or would we configure pfSense1 with the failover IP of pfSense2, and pfSense2 with that of pfSense3?
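
    On the DHCP question specifically: pfSense's DHCP failover uses ISC dhcpd's failover protocol, and that protocol is defined between exactly two peers, which is why the GUI only has room for a single failover IP. An illustrative dhcpd.conf-style excerpt (the peer name, addresses and timers below are made up):

```text
# dhcpd.conf failover declaration (illustrative): note there is exactly one
# "peer address"; the protocol has no notion of a third partner.
failover peer "dhcp_lan" {
    primary;                    # the other node declares "secondary"
    address 172.16.1.1;         # this node's LAN IP
    peer address 172.16.1.2;    # its single failover partner
    mclt 600;                   # required on the primary
    split 128;                  # address-pool split between the two peers
}
```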



  • Re: 3 pfSense in HA... that got asked a few weeks ago (https://forum.netgate.com/topic/155682/ha-for-three-or-more-devices) with no responses. I see it in the docs, as you pointed out, but I don't know how it would be accomplished, since the HA settings only allow specifying one IP address to sync to. Maybe you could have the second node sync its config to the third, but I don't know how all three would sync states.

  • LAYER 8 Moderator

    HA with more than 2 devices isn't really that great or a good idea. The only implementation that works at all is a daisy-chain-like setup, in which you configure the secondary node like the primary node, but pointing at the third one: you would set up pfsync with a peer IP and XMLRPC sync from the primary to the secondary AND from the secondary to the third node. Also, the few HA services that support running in some sort of active-active('ish) mode aren't made for running on more than two nodes (DHCP).

    That's why it isn't such an amazing idea to set up: you would literally daisy-chain a configuration from node 1 -> 2 -> 3.
    As there is nothing really big besides 1-2 smaller services like DHCP, DNS or NTP that could actually run on every node without being stopped on a non-master node, and the FreeBSD pf/CARP implementation doesn't really have active/active setups in mind, adding a third node doesn't appeal from an availability or security standpoint. I'd say you'd be better off using a potential third node as a cold spare with pfSense installed and some sort of console/mgmt interface set up, so you can fire it up quickly and restore a config backup onto it, rather than installing it as a tertiary node in a daisy-chain ring.


  • Hi @JeGr, thank you a lot for your input. You made some very interesting observations, and in fact, as @dotdash and @teamits mentioned, a configuration with more than 2 nodes isn't friendly.

    I did make the setup with two nodes, but something is failing.
    I read about the issue @dotdash mentioned, but it seems very specific to VMware's ESXi virtualisation. I searched for the same issue relating XenServer, pfSense and CARP, and didn't find that solution applied to Xen.

    So right now I have the following config:

    Public subnet: 1.2.3.100/28
    
    pfSense CARP WAN VIP:     1.2.3.100/28
    pfSense1 WAN:             1.2.3.101/28
    pfSense2 WAN:             1.2.3.102/28
    
    pfSense CARP LAN VIP:     172.16.1.254/24
    pfSense1 LAN:             172.16.1.1/24
    pfSense2 LAN:             172.16.1.2/24
    
    pfSense CARP SYNC VIP:     172.16.254.254/24
    pfSense1 SYNC:             172.16.254.1/24
    pfSense2 SYNC:             172.16.254.2/24
    
    pfSense CARP IP's for 1:1 NAT:
    1.2.3.105
    1.2.3.106
    1.2.3.107
    1.2.3.108 (etc)
    
    • I have enabled High Availability Sync: both pfsync and XMLRPC sync.
      Sync appears to be working perfectly, except for one detail I noticed: it does sync the authentication servers, but on the second pfSense the selected authentication server was still the local database, and I had to change this manually.

    • We configured every interface accordingly, set the DHCP server to use the CARP LAN IP for DNS and gateway, and set the failover IP;

    • Changed to manual outbound NAT, and changed the rules to use the WAN CARP VIP instead of the interface address;

    • Added extra NAT rules to overcome the issue mentioned in the documentation: https://docs.netgate.com/pfsense/en/latest/highavailability/troubleshooting-vpn-connectivity-to-a-high-availability-secondary-node.html

    • We're using a site-to-site IPsec VPN, and changed the tunnel configuration to use the WAN CARP IP as its interface:
      The site-to-site VPN is working: I dial 1.2.3.100 and the connection is established, without encountering any issues on VPN traffic;

    • NAT 1:1 OK - I can access all servers using the Virtual IP configured for CARP.

    • Each CARP IP has its own VHID: on the /28 subnet the VHID matches the last octet, and on the private addresses the VHID matches the third octet. There are no overlapping VHIDs.

    • All CARP IPs appear as MASTER on the primary and BACKUP on the secondary.

    • Rules on the SYNC interface allow SYNC net to any.
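
    With a checklist like the one above in place, it can also help to verify things from the shell on both nodes. A hedged sketch (interface names are placeholders; em1 stands in for the SYNC interface):

```shell
# Expect a "carp: MASTER ..." line per vhid on the primary
# and "carp: BACKUP ..." on the secondary.
ifconfig | grep carp

# Confirm state sync is actually flowing: pfsync is IP protocol 240.
tcpdump -ni em1 ip proto 240 -c 5
```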

    Following an article I found, I also performed these additional configurations (after HA failed in the first tests):

    • System > Advanced > Firewall & NAT
      • Enable NAT Reflection for 1:1 NAT: checked
      • Enable automatic outbound NAT for Reflection

    Despite all sync seeming correct, when I halt the first system the secondary quickly changes from BACKUP to MASTER. However, the VPN stays down and traffic doesn't reach the servers.

    For example, this is a ping I was running to the public IP of a web server. When the primary is master I can access the site without issues and ping it, but when the secondary becomes master, nothing works:

    64 bytes from 1.2.3.105: icmp_seq=39 ttl=47 time=52.099 ms
    64 bytes from 1.2.3.105: icmp_seq=40 ttl=47 time=51.661 ms
    Request timeout for icmp_seq 41
    Request timeout for icmp_seq 42
    (... identical request timeouts continue for icmp_seq 43 through 138 ...)
    Request timeout for icmp_seq 139
    64 bytes from 1.2.3.105: icmp_seq=140 ttl=47 time=52.780 ms
    64 bytes from 1.2.3.105: icmp_seq=141 ttl=47 time=54.253 ms
    64 bytes from 1.2.3.105: icmp_seq=142 ttl=47 time=51.698 ms
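
    A symptom pattern like this (the VIP answering while the primary is master, dead while the secondary is master) is often narrowed down with a capture on the secondary's WAN while it shows MASTER: if the echo requests for the VIP never arrive, the problem sits in the virtual switch or the upstream router's ARP rather than in pfSense's rules. A hedged sketch (the interface name is a placeholder; xn0 stands in for a Xen WAN interface):

```shell
# Run on the secondary while it is MASTER.
# 1) Do inbound packets for the NAT VIP reach the WAN interface at all?
tcpdump -ni xn0 host 1.2.3.105

# 2) Is the secondary actually sending CARP advertisements (IP protocol 112)?
tcpdump -ni xn0 ip proto 112
```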
    

  • You don't need a VIP for the sync interface. Each router just needs the other one's IP set in the HA pfsync settings.


    Hi @teamits, hehe, well actually I do, because the SYNC network also has a few other clients behind it that require the VIP, just like LAN does.
    Basically, as I mentioned in the posts above, I chose an already existing network for SYNC that has two other clients besides the pfSense machines. This is a secured network, and these are administration machines with restricted access and little traffic.
    The documentation recommends a separate network, as I see it, for two reasons:

    • network availability and load
    • privacy and security (as passwords aren't really encrypted)

    Since the chosen network complies with these requirements (it is a very restricted network with very low traffic), it was used, and hence the interface used for sync has a CARP VIP.
    Anyway, all configurations (HA, interfaces, DHCP server, etc.) have the peer IP directly where it belongs, not the CARP VIP.
    I expect this interface to work like the other interfaces (LAN/DMZ/DATA, etc.).