Netgate Discussion Forum

    KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC

    DHCP and DNS
    kea dhcp error
    10 Posts 8 Posters 713 Views 8 Watching
    • PabloAbonia

      The error message below is logged continuously (many times per second, it seems), and the watchdog ends up restarting Kea every hour or so.

      Any thoughts? The error message isn't pointing me to a specific problem I can identify.

      Thanks,
      Pablo

      Dec 5 19:19:24 kea-dhcp4 22525 ERROR [kea-dhcp4.commands.0x2f054a812000] COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket -1: Bad file descriptor
      (the identical line appears 8 times with the same timestamp)

      • PabloAbonia

        Looking at this in more detail, the error message also left my CPU stuck at around 50% and was associated with intermittent network loss. Switching back to ISC resolved both issues.

        • Benjamin 3 @PabloAbonia

          @PabloAbonia I am seeing the same issue on my Netgate 4200 running 24.11

          • Gertjan @PabloAbonia

            @PabloAbonia said in KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC:

            COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket

            Can't be that bad.

            See here : https://kea.readthedocs.io/en/kea-2.0.1/kea-messages.html

            I propose: stop (kill) all kea processes.
            From the console, or better over SSH, choose option 8 and then run:

            ps ax | grep 'kea'
            

            and kill them all.

            Then check whether the file /var/run/kea4-ctrl-socket still exists (it's actually a socket).
            If it does, rm it.

            Now start all the kea stuff with the GUI.

            edit :
            When kea starts, and runs fine, you can actually use this socket to talk to the process.

            Run this on the command line, and you'll see it answers with loads of information:

            echo '{"command":"lease4-get-all"}' | nc -U /var/run/kea4-ctrl-socket | jq
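
A slightly more defensive variant of that one-liner might look like the sketch below. It assumes the same `nc -U` client and socket path used above; the helper names `kea_request` and `kea_query` are made up for illustration. Note that `-S` only confirms that the path is a socket file, so a stale socket left behind by a dead process would still pass this check.

```shell
#!/bin/sh
# Build a minimal control-channel request for a command without arguments.
# (Hypothetical helper, not part of Kea or pfSense.)
kea_request() {
    printf '{"command":"%s"}\n' "$1"
}

# Send one command to a control socket, but only if the path exists and is
# a socket file. Assumes an nc that supports -U, as in the post above.
kea_query() {
    sock="$1"
    cmd="$2"
    if [ ! -S "$sock" ]; then
        echo "$sock is not a UNIX socket; is kea running?" >&2
        return 1
    fi
    kea_request "$cmd" | nc -U "$sock"
}
```

Something like `kea_query /var/run/kea4-ctrl-socket list-commands | jq` should then show which commands the running server accepts, including whether `lease4-get-all` is available.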
            

            No "help me" PM's please. Use the forum, the community will thank you.
            Edit : and where are the logs ??

            • PVuchetich2

              I just had this happen on my Netgate 1541. It generated a dhcpd.log file with 14,923,504 rows in under 7 minutes. The log was then rotated due to its size, and the automatic bzip compression ended up taking 100% of all CPUs trying to compress all the DHCP log files.

              24.11-RELEASE (amd64)
              built on Wed Nov 27 12:22:00 CST 2024
              FreeBSD 15.0-CURRENT

              Everyone on the LAN lost reliable internet access: packets were getting through only intermittently.

              Interestingly, IPv6 DHCP worked, so I could access the firewall GUI from the LAN. It was slow because of the bzip processes (judging by top, run from the shell).

              The repeated error message I saw is nearly identical to the OP:
              Feb 2 13:38:20 pfsense kea-dhcp4[91496]: ERROR [kea-dhcp4.commands.0xc6b64e12000] COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket -1: Bad file descriptor

              Checking the log file to see how quickly the errors were generated: 36,217 errors per second!
              $ grep 'Feb 2 13:38:20' dhcpd.log.6 | wc -l
              36217
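
To see whether the flood is confined to one second or sustained, the same count can be tallied for every timestamp at once. This is a minimal sketch, assuming syslog-style lines that start with month, day and time as in the excerpt above; the function name `count_accept_failures` is hypothetical.

```shell
#!/bin/sh
# Count COMMAND_SOCKET_ACCEPT_FAIL lines per one-second timestamp,
# busiest second first. Assumes each line starts with "Mon D HH:MM:SS".
count_accept_failures() {
    # $1: path to an uncompressed log file
    awk '/COMMAND_SOCKET_ACCEPT_FAIL/ { c[$1 " " $2 " " $3]++ }
         END { for (k in c) print c[k], k }' "$1" | sort -rn
}
```

For example, `count_accept_failures dhcpd.log.6 | head` would list the ten busiest seconds in that rotated file, one `count timestamp` pair per line.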

              I tried rebooting the Netgate 1541, but it immediately happened again.
              I tried disabling DHCP and restarting DHCP from within the GUI, but neither option stopped the error messages.

              I switched back to ISC DHCP for now, but that isn't a long term solution since it is deprecated.

              Because it recurred after a reboot, I am not sure what else to do to get kea DHCP to work reliably. Are there any other settings that could be causing this?

              • Gertjan @PVuchetich2

                @PVuchetich2 said in KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC:

                COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket -1: Bad file descriptor

                Here : https://kea.readthedocs.io/en/kea-2.1.7/kea-messages.html

                A socket is just a special kind of file. It's created when kea starts, no big deal. Processes like nginx (web GUI) and unbound (resolver) create these too.

                COMMAND_SOCKET_ACCEPT_FAIL

                Failed to accept incoming connection on command socket %1: %2

                This error indicates that the server detected incoming connection and executed accept system call on said socket, but this call returned an error. Additional information may be provided by the system as second parameter.
                

                Try this :
                In the GUI, stop all kea server services.
                Then use the console or (better) SSH, menu option 8.

                cd /var/run
                

                then

                ls -al kea*
                

                Normally, there should no longer be any files whose names start with "kea".
                If there are, remove them all.

                Now, start the kea server(s) again.
                Check the contents of the directory again: there should be a new kea4-ctrl-socket file (and a lock file).
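
The cleanup steps above could be wrapped in a small script along these lines. This is only a sketch under stated assumptions: the function name `remove_stale_kea_files` is hypothetical, the daemon names checked are `kea-dhcp4` and `kea-dhcp6`, and if `pgrep` is not available the running-process check is silently skipped, so stop the services in the GUI first either way.

```shell
#!/bin/sh
# Remove leftover kea* files from a run directory, but refuse to do so
# while a Kea DHCP daemon still appears to be running.
remove_stale_kea_files() {
    dir="${1:-/var/run}"   # directory to clean; defaults to /var/run
    for daemon in kea-dhcp4 kea-dhcp6; do
        # Skip the check when pgrep is missing (assumption: it usually exists).
        if command -v pgrep >/dev/null 2>&1 && pgrep -x "$daemon" >/dev/null 2>&1; then
            echo "$daemon is still running; stop it first" >&2
            return 1
        fi
    done
    for f in "$dir"/kea*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$f" ] || continue
        echo "removing $f"
        rm -f -- "$f"
    done
}
```

Calling `remove_stale_kea_files` with no argument targets /var/run; passing a different directory makes it easy to try out on a scratch copy first.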

                Other checks :
                System processes like unbound, the web GUI and kea get restarted on an interface down/up event, which happens when an interface goes down for a moment. Are you seeing any of those?


                • jacotec

                  I've had the same issue; it started around 3:00 last night. My monitoring sent me a couple of messages about CPU utilization >90%, which turned out to be a couple of bzip processes compressing dhcpd logs (because they were flooded with the same message as above).

                  Killed all the processes and removed the files as described here and it's back to OK now ... but I'm still curious why this happened.

                  • JimNH @jacotec

                    @jacotec I was just about to consider upgrading to KEA, but now I'll hold off...
                    I was considering it because my DNS has been sluggish, with various computers (small home office) timing out and saying they can't reach the pfSense box; they eventually do, but it's frustrating...

                    • lohphat

                      Welp, the KEA bug on arm32 platforms has been closed as "won't fix" because the core of the problem is in BSD's 32-bit code.

                      https://redmine.pfsense.org/issues/15973#change-77507

                      The only resolution is to stay with ISC DHCP.

                      Which now brings us back to the supportability of 32bit Netgate platforms going forward.

                      I'll ask again: What's the EOL schedule and support intentions of arm32 units?

                      SG-3100 24.11-RELEASE (arm) | Avahi (2.2_6) | ntopng (5.6.0_1) | openvpn-client-export (1.9.5) | pfBlockerNG-devel (3.2.1_20) | System_Patches (2.2.20_5)

                      • SteveITS Rebel Alliance @lohphat

                        @lohphat Not sure what you’re looking for…? Here’s a thread discussing the announcement, two years ago. Can’t find a blog post in a quick search but I recall the email.

                        https://forum.netgate.com/topic/183472/3100-will-reach-end-of-life-in-5-days

                        @JimNH Ensure you have DHCP lease registration off, or unbound will restart at each renewal. Or maybe grant long leases.

                        Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                        When upgrading, allow 10-15 minutes to reboot, or more depending on packages, and device or disk speed.
                        Upvote 👍 helpful posts!

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.