Kea DHCP Server TLS Transport settings not saved after service restart
-
We noticed an odd issue at our main site, where we run two Netgate 8200 firewalls on v24.11 as a high availability pair. Overall, everything has been working very well for some time. We migrated to Kea for DHCP a few months ago, and everything seemed to be running correctly, with sync between the nodes working and the heartbeat established.
After a recent patch release, I rebooted one of the nodes and noticed that the secondary node was having communication issues with the primary: the heartbeat was failing to establish. Upon further investigation, I saw a configuration mismatch in the TLS Transport settings. The settings I had previously applied to both nodes were missing on one.
Now, I fully understand that the TLS Transport settings for Kea DHCP must be set directly on each node; they do not sync. With that in mind, I reconfigured TLS transport on that node. The heartbeat was re-established shortly thereafter and stayed connected.
On a hunch, I decided to test just restarting the DHCP Server service on one node. After first verifying that the node had TLS transport settings defined with the correct certificate selected, I then restarted the DHCP Server service on that node.
Sure enough, the TLS Transport settings I had just verified were gone after the service restart!
My guess is that the Kea DHCP Server service is not saving or committing the TLS Transport settings to the configuration correctly. Unfortunately, I have no way to verify this, as we only have the one HA pair here. I can't say for certain whether something particular to our config is at play, though it doesn't seem that way.
As a workaround, I have disabled the TLS Transport settings on both nodes, which at least allows the HA pair to function correctly if a node restarts.
Otherwise, this HA pair just handles normal firewall/VPN duties. I'm wondering whether anyone else has run across this in their environment, or whether there's something unique here that I'm not considering.
In any case, I can reproduce it pretty much at will: enable TLS transport, restart the DHCP service, and, at least here, the settings you just applied are no longer present.
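For anyone who wants to check the same thing on their own pair, here is a minimal sketch of how the generated Kea configuration could be compared before and after the restart. It rests on a few assumptions that I have not confirmed on pfSense: that you copy the generated Kea config file to two snapshot files yourself (the file names are placeholders), that the file is plain JSON without comments, and that the TLS settings show up as the standard Kea HA hook parameters (trust-anchor, cert-file, key-file, require-client-certs). The function names are my own.

```python
import json
import sys

# Standard Kea HA hook TLS parameter names; whether pfSense writes all of
# them into the generated config is an assumption.
TLS_KEYS = ("trust-anchor", "cert-file", "key-file", "require-client-certs")


def ha_tls_settings(config):
    """Collect TLS-related parameters from every HA hook entry in a parsed Kea config."""
    found = {}
    for daemon in ("Dhcp4", "Dhcp6"):
        hooks = config.get(daemon, {}).get("hooks-libraries", [])
        for hook in hooks:
            relationships = hook.get("parameters", {}).get("high-availability", [])
            for i, ha in enumerate(relationships):
                # TLS parameters may be set for the whole HA relationship...
                for key in TLS_KEYS:
                    if key in ha:
                        found[f"{daemon}.ha[{i}].{key}"] = ha[key]
                # ...or individually per peer.
                for peer in ha.get("peers", []):
                    name = peer.get("name")
                    for key in TLS_KEYS:
                        if key in peer:
                            found[f"{daemon}.ha[{i}].peer[{name}].{key}"] = peer[key]
    return found


def diff_tls(before, after):
    """Return {setting: (value_before, value_after)} for every TLS setting that changed."""
    b, a = ha_tls_settings(before), ha_tls_settings(after)
    return {k: (b.get(k), a.get(k)) for k in sorted(set(b) | set(a)) if b.get(k) != a.get(k)}


def main(argv):
    # Usage (hypothetical file names): script.py before.json after.json
    with open(argv[1]) as f_before, open(argv[2]) as f_after:
        changes = diff_tls(json.load(f_before), json.load(f_after))
    for setting, (old, new) in changes.items():
        print(f"{setting}: {old!r} -> {new!r}")
    if not changes:
        print("No TLS-related differences found.")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

If the diff shows the TLS parameters disappearing from the generated file, that would point at how the configuration is written out on restart rather than at Kea itself dropping them at runtime. An empty diff would suggest the settings are being lost somewhere else, for example in the saved GUI configuration.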
Open to any suggestions or ideas. Happy to clarify anything, and thanks for any advice.