kea dhcp server in HA mode drops 50% of dhcp requests
Hi,
we are seeing dropped DHCP packets when running Kea DHCP in a high availability setup.
Many Wi-Fi devices have trouble obtaining IP addresses via DHCP.
Our setup:
- pfSense 24.11 on netgate appliance with Super Micro 1537, Intel(R) Xeon(R) CPU D-1537 @ 1.70GHz, 32 GB RAM, 450 GB SSD
- LAN: pfSense cluster and clients: 10 Gbps fiber
- HA sync LAN: 1 Gbps copper (officially supported NIC, not the onboard NIC)
- pfSense cluster running captive portal, dhcp, dns, firewall services
We tested the DHCP server's performance with perfdhcp from an Ubuntu client:
perfdhcp -l <local LAN IP on client> -d 5 -r 10 -R 10 -n 500 <pfsense active cluster member>
(10 DHCP requests per second, 500 requests per test run).
Result:
Phase Discover-Offer:
- 500 discover packets sent
- 257 offers received by client
(packet captures on the pfSense LAN interface and on the client LAN interface confirm these figures)
- min delay: 93 ms
- avg delay: 300 ms
- max delay: 646 ms
Phase Request-Ack:
- 257 request packets sent
- 98 ACK packets received
- min delay: 266 ms
- avg delay: 544 ms
- max delay: 900 ms
Checks during the tests did not reveal any bottleneck (at least none that I could see):
- ping between client and pfsense active cluster member: 0.2 - 0.3 ms
- cpu load: < 5%
- ram: > 90% available
- HA interface load: < 200 kbps
- ssd io: 1000 kw/s
When I disable HA in the Kea settings, the perfdhcp test results change as follows:
Phase Discover-Offer:
- 500 discover packets sent
- 499 offers received by client
(packet captures on the pfSense LAN interface and on the client LAN interface confirm these figures)
- min delay: 2 ms
- avg delay: 9 ms
- max delay: 42 ms
Phase Request-Ack:
- 499 request packets sent
- 499 ACK packets received
- min delay: 6 ms
- avg delay: 25 ms
- max delay: 73 ms
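To put the two runs side by side, the packet counts above translate into the loss rates computed below. This is only a small Python sketch: the names `loss_pct`, `summarize` and `runs` are my own, and the only real data are the sent/received counts copied from the perfdhcp output.

```python
def loss_pct(sent, received):
    """Percentage of sent packets that got no reply, rounded to one decimal."""
    return round(100.0 * (sent - received) / sent, 1)


# Packet counts copied from the two perfdhcp runs (illustrative layout).
runs = {
    "HA enabled": {"discover": 500, "offer": 257, "request": 257, "ack": 98},
    "HA disabled": {"discover": 500, "offer": 499, "request": 499, "ack": 499},
}


def summarize(run):
    """Loss per phase, plus the share of Discovers that never reach an ACK."""
    return {
        "offer_loss": loss_pct(run["discover"], run["offer"]),
        "ack_loss": loss_pct(run["request"], run["ack"]),
        "end_to_end_loss": loss_pct(run["discover"], run["ack"]),
    }


if __name__ == "__main__":
    for name, run in runs.items():
        print(name, summarize(run))
```

With HA enabled, 48.6% of the Discovers get no Offer and 61.9% of the Requests get no ACK, so only 98 of 500 exchanges (19.6%) complete. With HA disabled, a single Discover out of 500 goes unanswered.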
Does anybody know how I can track this down and identify the cause?
Does anybody have recommendations regarding HA with kea-dhcp that are not already in the HA documentation of the official pfsense docs?