DHCP load balancing vs failover - can one router function standalone?

akom · Dec 23, 2009, 8:39 PM

First, the questions (they may be really just one question):

Is it expected that if I configure two pfSense routers with DHCP failover, one cannot serve new clients while the other is down/dead?
Is it possible to implement pure failover for DHCP, rather than load balancing?

Now, detail: (I am hoping someone can point out what I am misunderstanding, as this just seems… wrong)

It seems that if I configure "Failover peer IP" for DHCP, both of my pfSense routers actively service DHCP clients, using a split algorithm to decide which one handles which client. They do keep their states in sync, which is great.
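To make the split idea concrete, here is a minimal sketch in Python of how a two-server split might assign clients. It assumes each client identifier is hashed to a bucket from 0 to 255 and that the primary answers buckets below a `split` threshold. The hash used here (first byte of SHA-256) is a stand-in chosen for illustration, not the hash the failover protocol specifies, and `bucket` and `responsible_server` are hypothetical names.

```python
import hashlib


def bucket(client_id: bytes) -> int:
    """Map a client identifier to a bucket in 0..255 (stand-in hash, an assumption)."""
    return hashlib.sha256(client_id).digest()[0]


def responsible_server(client_id: bytes, split: int = 128) -> str:
    """Return which peer answers this client while both peers are up.

    Buckets below `split` go to the primary, the rest to the secondary.
    split=128 is an even split; split=256 sends every client to the primary.
    """
    if not 0 <= split <= 256:
        raise ValueError("split must be between 0 and 256")
    return "primary" if bucket(client_id) < split else "secondary"


if __name__ == "__main__":
    for mac in (b"00:11:22:33:44:55", b"00:11:22:33:44:56", b"00:11:22:33:44:57"):
        print(mac.decode(), "->", responsible_server(mac))
```

In this model, a threshold that sends every bucket to one peer is the closest thing to "pure failover" while both peers are up. It says nothing about what the survivor does once its peer is gone.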

However, if A goes down, B eventually winds up in the RECOVER-WAIT and then the RECOVER-DONE state (see http://tools.ietf.org/html/draft-ietf-dhc-failover-12#page-97), in which it will not provide new leases (but will refresh existing ones). It will not exit this state until A comes back in one state or another. In other words, a single router cannot fully function in this "failover" configuration.

I'm fairly new here, so I'm asking for a sanity check: do I really need both routers up and running to serve new DHCP clients? That's not what I'd call failover.
I can clearly see a variety of ways things can go wrong if DHCP servers fall out of communication with each other and begin allocating IPs in isolation - is this what we're preventing here?
I do understand that this is the way the DHCP protocol seems to be designed, but am I the only one this seems strange to? Is there no other way?

On a personal note, my needs are very small (50 machines in a location, if that), so throughput is not a factor, meaning I don't care about load balancing. Being able to survive a week when one of the routers dies, without intervention - that is significant, especially since I'm largely off-site. Has anyone implemented a workaround?

Thanks for any info.

JoshW · Feb 2, 2010, 6:51 PM

I am by no means an expert on DHCP failover; however, the ISC dhcpd.conf manual page states the following:

The failover protocol allows two DHCP servers (and no more than two) to share a common address pool. Each server will have about half of the available IP addresses in the pool at any given time for allocation. If one server fails, the other server will continue to renew leases out of the pool, and will allocate new addresses out of the roughly half of available addresses that it had when communications with the other server were lost.

Thus if one server fails, the second should still issue new leases from half of the address space. Whether or not this is working correctly in pfSense, I cannot say.
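As a rough worked example of the arithmetic in that passage, the Python sketch below estimates how many new leases a lone surviving server could hand out. The pool size and lease count are made-up numbers, `new_leases_available_alone` is a hypothetical helper, and the even split is the manual's "about half" taken literally.

```python
def new_leases_available_alone(pool_size: int, active_leases: int) -> int:
    """Estimate the free addresses one server holds when it loses its peer.

    Assumes the free addresses were split evenly between the two servers
    (the manual only says "about half"), so the result is an approximation.
    """
    if not 0 <= active_leases <= pool_size:
        raise ValueError("active_leases must be between 0 and pool_size")
    free = pool_size - active_leases
    return free // 2


if __name__ == "__main__":
    # Hypothetical site: 100-address pool, 50 machines currently holding leases.
    print(new_leases_available_alone(100, 50))  # prints 25
```

Under those assumptions the survivor could still add about 25 new clients while continuing to renew the existing 50.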

I gave up on using DHCP failover with pfSense as it will not work with dnsmasq/DNS Forwarder.