CARP VIPs and DHCP Failover advskew (skew) primary determination



  • I just figured out (after quite a while) why the DHCP servers on a pair of failover firewalls would always end up not serving IP addresses. Both would also show "recovering" with peer state "unknown". Both were being configured as secondary, meaning they both used the same port, 520, and each expected the other to use 519. Further, the split setting was not present in either machine's dhcpd.conf.

    As I found out, this is the result of the services.inc startup code for dhcpd, which determines which machine to make the primary and creates the config file (dhcpd.conf) for a primary or a secondary accordingly. Once I figured this out and got to the code, it was apparent that the skew setting of the corresponding CARP interface is the determining factor in which machine is primary and which is secondary.

    This is where my problem was and also where my question is:

    While I realize there are suggestions and hints to use 0 (or 1 or something similar in examples) as the master skew, there are also hints that whichever of the two machines has the lower sum of base and advskew is simply going to be CARP master. Just out of habit from HSRP and other setups, and not being a BSD guy, I chose 128 for the CARP master and 228 for the CARP backup, with the idea that I had room on either side for any fancy stuff in the future.

    Of course, the 128 and 228 settings kill dhcpd when it is configured for failover, because both machines end up configured as secondary due to the skew being higher than "20" (if not doing failover, I assume dhcpd will work). They do not, however, cause a problem with CARP, and while they may not be mainstream accepted values, they are definitely valid values.

    So… my problems are all fixed now, but I'm left with the question:

    Why is there no note in the DHCP failover config about the importance of the CARP skew setting when using DHCP failover?

    There is a note that DHCP failover requires a CARP interface/VIP to operate on (and an error message, I think, that I saw in the code), but there's nothing to tell anyone that the arbitrary skew value of 20 is the cutoff for having a primary DHCP server.

    No, I don't have a patch... I realize that should be included with all my whining here. I'll see if I can get one posted to this thread on the weekend.

    Thanks for the hard work by everyone involved in making pfSense what it is :)
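
For anyone hitting the same symptoms, the role selection described in the post can be sketched roughly as below. This is only an illustration in Python, not the actual services.inc code: the function names are hypothetical, the cutoff of 20 and the 519/520 ports are taken from the post, and how a skew of exactly 20 is treated is an assumption.

```python
# Hypothetical sketch of the behaviour described in the post.
# Not the real pfSense services.inc logic; names are made up for illustration.

SKEW_CUTOFF = 20  # cutoff value reported in the post


def failover_role(advskew: int, cutoff: int = SKEW_CUTOFF) -> dict:
    """Return the DHCP failover settings implied by a CARP advskew value.

    Assumption: a skew higher than the cutoff means secondary, so a skew
    equal to the cutoff is treated as primary here.
    """
    if advskew > cutoff:
        # Secondary: listens on 520, expects the peer on 519, no split line.
        return {"role": "secondary", "port": 520, "peer_port": 519, "split": False}
    # Primary: listens on 519, expects the peer on 520, writes the split line.
    return {"role": "primary", "port": 519, "peer_port": 520, "split": True}


def pair_is_usable(skew_a: int, skew_b: int) -> bool:
    """A failover pair needs exactly one primary and one secondary."""
    roles = sorted(failover_role(s)["role"] for s in (skew_a, skew_b))
    return roles == ["primary", "secondary"]


if __name__ == "__main__":
    for pair in [(0, 100), (128, 228)]:
        roles = [failover_role(s)["role"] for s in pair]
        print(pair, roles, "ok" if pair_is_usable(*pair) else "broken")
```

Under these assumptions, skews of 128 and 228 both land above the cutoff, so both machines come out as secondary on port 520 and neither config gets a split line, which matches the symptoms described. Dropping the master's skew below the cutoff gives one primary and one secondary.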

