Kea DHCP in HA mode random crashes.
-
A week ago, I switched our Kea DHCP backend on our production firewall cluster to Kea (after a lot of test in a virtualized environment). It worked fine until yesterday, when suddenly the clients stopped receiving leases.
After some troubleshooting, I found that the Kea server must have crashed on the primary node, and the secondary node didn’t seem to fail over properly. Both firewalls were running, but no leases were being handed out to clients.
The error log I found was the following:
port 67, reason: Address already in use - is another DHCP server running? Oct 21 14:26:57 ITL-FWL-001 kea-dhcp4[65433]: WARN [kea-dhcp4.dhcpsrv.0x7f3c2012000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface lagg1.999, reason: failed to bind fallback socket to address 10.0.63.2, port 67, reason: Address already in use - is another DHCP server running? Oct 21 14:26:57 ITL-FWL-001 kea-dhcp4[65433]: WARN [kea-dhcp4.dhcpsrv.0x7f3c2012000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface lagg1.999, reason: failed to bind fallback socket to address 10.0.63.1, port 67, reason: Address already in use - is another DHCP server running? Oct 21 14:26:57 ITL-FWL-001 kea-dhcp4[65433]: WARN [kea-dhcp4.dhcpsrv.0x7f3c2012000] DHCPSRV_NO_SOCKETS_OPEN no interface configured to listen to DHCP traffic Oct 21 14:26:57 ITL-FWL-001 kea-dhcp4[65433]: ERROR [kea-dhcp4.dhcp4.0x7f3c2012000] DHCP4_CONFIG_LOAD_FAIL configuration error using file: /usr/local/etc/kea/kea-dhcp4.conf, reason: Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 14:26:57 ITL-FWL-001 kea-dhcp4[65433]: ERROR [kea-dhcp4.dhcp4.0x7f3c2012000] DHCP4_INIT_FAIL failed to initialize Kea server: configuration error using file '/usr/local/etc/kea/kea-dhcp4.conf': Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 14:27:01 ITL-FWL-001 kea-dhcp4[17711]: WARN [kea-dhcp4.dhcpsrv.0x106d79612000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 21 14:27:01 ITL-FWL-001 kea-dhcp4[17711]: WARN [kea-dhcp4.dhcp4.0x106d79612000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 21 14:27:01 ITL-FWL-001 kea-dhcp4[17711]: WARN [kea-dhcp4.dhcpsrv.0x106d79612000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 21 14:27:01 ITL-FWL-001 kea-dhcp4[17711]: WARN [kea-dhcp4.dhcpsrv.0x106d79612000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface lagg0.10, reason: failed to bind fallback socket to address 10.0.9.2, port 67, reason: Address already in use - is another DHCP server running? Oct 21 14:27:01 ITL-FWL-001 kea-dhcp4[17711]: WARN [kea-dhcp4.dhcpsrv.0x106d79612000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface lagg0.10, reason: failed to bind fallback socket to address 10.0.9.1, port 67, reason: Address already in use - is another DHCP server running?
Oct 22 10:22:33 kea-dhcp4 78645 WARN [kea-dhcp4.dhcpsrv.0xc0595a12000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 22 10:22:33 kea-dhcp4 78645 WARN [kea-dhcp4.dhcp4.0xc0595a12000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 22 10:22:33 kea-dhcp4 78645 WARN [kea-dhcp4.dhcpsrv.0xc0595a12000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 22 10:22:33 kea-dhcp4 78645 WARN [kea-dhcp4.dhcp4.0xc0595a12000] DHCP4_MULTI_THREADING_INFO enabled: yes, number of threads: 48, queue size: 64 Oct 22 10:22:33 kea-dhcp4 78645 ERROR [kea-dhcp4.packets.0xc0595a12000] DHCP4_BUFFER_RECEIVE_FAIL error on attempt to receive packet: Truncated DHCPv4 packet (len=172) received, at least 236 is expected. Oct 22 10:22:33 kea-dhcp4 67709 WARN [kea-dhcp4.dhcpsrv.0x3a7d32612000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 22 10:22:33 kea-dhcp4 67709 WARN [kea-dhcp4.dhcp4.0x3a7d32612000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 22 10:22:33 kea-dhcp4 67709 WARN [kea-dhcp4.dhcpsrv.0x3a7d32612000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 22 10:22:33 kea-dhcp4 67709 WARN [kea-dhcp4.dhcp4.0x3a7d32612000] DHCP4_MULTI_THREADING_INFO enabled: yes, number of threads: 48, queue size: 64
It seems like another instance of Kea tried to start, even though there was probably already one running. I restarted the firewall, and it seemed to recover.
However, I can no longer trust it in production. I’ve looked into it further, and later the same day something similar occurred again. I also noticed since yesterday that, from time to time, the DHCP status on the lease page goes red for one node for a few seconds, then recovers automatically.
ct 21 18:20:31 kea-dhcp4 34212 WARN [kea-dhcp4.dhcpsrv.0x3bc9d6212000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 21 18:20:31 kea-dhcp4 34212 WARN [kea-dhcp4.dhcp4.0x3bc9d6212000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 21 18:20:31 kea-dhcp4 34212 WARN [kea-dhcp4.dhcpsrv.0x3bc9d6212000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 21 18:20:31 kea-dhcp4 34212 ERROR [kea-dhcp4.dhcp4.0x3bc9d6212000] DHCP4_CONFIG_LOAD_FAIL configuration error using file: /usr/local/etc/kea/kea-dhcp4.conf, reason: Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 18:20:31 kea-dhcp4 34212 ERROR [kea-dhcp4.dhcp4.0x3bc9d6212000] DHCP4_INIT_FAIL failed to initialize Kea server: configuration error using file '/usr/local/etc/kea/kea-dhcp4.conf': Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 18:21:15 kea-dhcp4 71088 WARN [kea-dhcp4.dhcpsrv.0x2b7e08012000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 21 18:21:15 kea-dhcp4 71088 WARN [kea-dhcp4.dhcp4.0x2b7e08012000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 21 18:21:15 kea-dhcp4 71088 WARN [kea-dhcp4.dhcpsrv.0x2b7e08012000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 21 18:21:15 kea-dhcp4 71088 ERROR [kea-dhcp4.dhcp4.0x2b7e08012000] DHCP4_CONFIG_LOAD_FAIL configuration error using file: /usr/local/etc/kea/kea-dhcp4.conf, reason: Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 18:21:15 kea-dhcp4 71088 ERROR [kea-dhcp4.dhcp4.0x2b7e08012000] DHCP4_INIT_FAIL failed to initialize Kea server: configuration error using file '/usr/local/etc/kea/kea-dhcp4.conf': Error initializing hooks: CmdHttpListener::run failed: unable to setup TCP acceptor for listening to the incoming HTTP requests: bind: Address already in use [system:48 at /usr/local/include/boost/asio/detail/reactive_socket_service.hpp:161:33 in function 'bind'] Oct 21 18:28:52 kea-dhcp4 16688 WARN [kea-dhcp4.dhcpsrv.0x10a766c12000] DHCPSRV_MT_DISABLED_QUEUE_CONTROL disabling dhcp queue control when multi-threading is enabled. Oct 21 18:28:52 kea-dhcp4 16688 WARN [kea-dhcp4.dhcp4.0x10a766c12000] DHCP4_RESERVATIONS_LOOKUP_FIRST_ENABLED Multi-threading is enabled and host reservations lookup is always performed first. Oct 21 18:28:52 kea-dhcp4 16688 WARN [kea-dhcp4.dhcpsrv.0x10a766c12000] DHCPSRV_MULTIPLE_RAW_SOCKETS_PER_IFACE current configuration will result in opening multiple broadcast capable sockets on some interfaces and some DHCP messages may be duplicated Oct 21 18:28:52 kea-dhcp4 16688 WARN [kea-dhcp4.dhcp4.0x10a766c12000] DHCP4_MULTI_THREADING_INFO enabled: yes, number of threads: 48, queue size: 64
I did some research and found that maybe the 48 threads could be an issue? This is a dual-CPU server with 48 threads in total.
I am also using the DNS Registration and early DNS registration options to register the dns names of static mappings and to also register the dns entries of the clients that bring the hostname when making the dhcp requestDoes anyone else have a clue how to investigate this issue further?
Thanks in Advance