Netgate Discussion Forum

    KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC

    DHCP and DNS
    kea dhcp error
    • Benjamin 3 @PabloAbonia

      @PabloAbonia I am seeing the same issue on my Netgate 4200 running 24.11

      • Gertjan @PabloAbonia

        @PabloAbonia said in KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC:

        COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket

        Can't be that bad.

        See here : https://kea.readthedocs.io/en/kea-2.0.1/kea-messages.html

        I propose: stop (kill) all kea processes.
        From the console or, better, over SSH, choose menu option 8 and then run:

        ps ax | grep 'kea'
        

        and kill them all.

        Then check whether the file /var/run/kea4-ctrl-socket (actually a socket) still exists.
        If it does, rm it.

        Now start the kea services again from the GUI.
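The process check above can be sketched as a small filter. This is only an illustration: the function name `kea_pids` is made up here, and the sketch assumes `ps ax` prints the PID in the first column. Writing the pattern as `[k]ea` keeps grep from matching its own command line, which a plain `grep 'kea'` would do.

```shell
#!/bin/sh
# Hypothetical helper: read `ps ax` output on stdin and print the PIDs of
# lines that mention kea. The bracket in '[k]ea' stops grep from matching
# its own entry in the process list.
kea_pids() {
    grep '[k]ea' | awk '{ print $1 }'
}

# On the firewall (review the PIDs before killing anything):
#   ps ax | kea_pids
# Once no kea process is left, check for a leftover socket
# (path taken from the post above):
#   [ -S /var/run/kea4-ctrl-socket ] && echo "stale socket present"
```

Printing the PIDs first, instead of piping them straight into kill, leaves room to spot an unrelated process whose command line happens to contain "kea".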

        Edit:
        When kea starts and runs fine, you can actually use this socket to talk to the process.

        Run this on the command line, and you'll see it answer with loads of information:

        echo '{"command":"lease4-get-all"}' | nc -U /var/run/kea4-ctrl-socket | jq
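The same socket accepts other requests in the same JSON shape. A minimal sketch, assuming the socket path from the post above and that `nc` and `jq` are available as in that command; the helper name `kea_request` is invented for illustration, while `list-commands` and `version-get` are standard Kea control-channel commands.

```shell
#!/bin/sh
# Hypothetical helper: build a Kea control-channel request for a command
# that takes no arguments.
kea_request() {
    printf '{"command":"%s"}\n' "$1"
}

# Socket path taken from the post above; adjust if yours differs.
KEA4_SOCKET="/var/run/kea4-ctrl-socket"

# Example calls, to be run on the firewall while kea-dhcp4 is running:
#   kea_request list-commands | nc -U "$KEA4_SOCKET" | jq
#   kea_request version-get   | nc -U "$KEA4_SOCKET" | jq
```

If `list-commands` gets an answer, the command socket is healthy, and its output shows which other commands this particular server accepts.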
        

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • PVuchetich2

          I just had this happen on my Netgate 1541. It generated a dhcpd.log file with 14,923,504 rows in under 7 minutes. The log was then rotated because of its size, and the automatic bzip compression took 100% of all CPUs trying to compress the rotated dhcpd log files.

          24.11-RELEASE (amd64)
          built on Wed Nov 27 12:22:00 CST 2024
          FreeBSD 15.0-CURRENT

          Everyone on the LAN lost reliable internet access: packets sometimes flowed, but not consistently.

          Interestingly, IPv6 DHCP worked, so I could access the firewall GUI from the LAN. It was slow because of the bzip processes (based on running top from the shell).

          The repeated error message I saw is nearly identical to the OP:
          Feb 2 13:38:20 pfsense kea-dhcp4[91496]: ERROR [kea-dhcp4.commands.0xc6b64e12000] COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket -1: Bad file descriptor

          Checking the log file shows how quickly the errors were generated: 36,217 errors per second!
          $ grep 'Feb 2 13:38:20' dhcpd.log.6 | wc -l
          36217
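The single-second count above can be extended to every second in the file. This is a sketch under assumptions: the function name `accept_fail_per_second` is made up, and the first three fields of each line are taken to be month, day and HH:MM:SS, as in the log line quoted above. The last line checks the average implied by 14,923,504 rows in 7 minutes (420 seconds).

```shell
#!/bin/sh
# Hypothetical helper: count COMMAND_SOCKET_ACCEPT_FAIL lines per second
# in a syslog-style file, busiest second first.
accept_fail_per_second() {
    awk '/COMMAND_SOCKET_ACCEPT_FAIL/ { n[$1 " " $2 " " $3]++ }
         END { for (t in n) print n[t], t }' "$1" | sort -rn
}

# Example on the firewall:
#   accept_fail_per_second dhcpd.log.6 | head -n 5

# Average rate implied by 14923504 rows in 420 seconds (integer division):
echo $((14923504 / 420))   # prints 35532
```

Because the file filled in under 7 minutes, about 35,500 lines per second is a lower bound on the average, which agrees with the 36,217 counted for that one second.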

          I tried rebooting the Netgate 1541, but it immediately happened again.
          I tried disabling DHCP, restarting DHCP from within the GUI, but neither option resolved the error messages.

          I switched back to ISC DHCP for now, but that isn't a long term solution since it is deprecated.

          Because it recurred after a reboot, I am not sure what else to do to get kea DHCP to work reliably. Are there any other settings that could be causing this?

          • Gertjan @PVuchetich2

            @PVuchetich2 said in KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC:

            COMMAND_SOCKET_ACCEPT_FAIL Failed to accept incoming connection on command socket -1: Bad file descriptor

            Here : https://kea.readthedocs.io/en/kea-2.1.7/kea-messages.html

            A socket is just a special kind of file. It's created when kea starts, no big deal. Many processes, such as nginx (the web GUI) and unbound (the resolver), create these.

            COMMAND_SOCKET_ACCEPT_FAIL

            Failed to accept incoming connection on command socket %1: %2

            This error indicates that the server detected incoming connection and executed accept system call on said socket, but this call returned an error. Additional information may be provided by the system as second parameter.
            

            Try this:
            In the GUI, stop all kea services.
            Then use the console or (better) SSH, and choose menu option 8.

            cd /var/run
            

            then

            ls -al kea*
            

            Normally, there should no longer be any files whose names start with "kea".
            If there are, remove them all.

            Now, start the kea server(s) again.
            Check the content of the directory again: there should be a new kea-ctrl-socket file (and a lock file).
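The leftover check above can be sketched as a small function. The name `kea_leftovers` is invented for illustration, and /var/run is the directory used in the post above. It only lists entries; removal is left as a deliberate manual step.

```shell
#!/bin/sh
# Hypothetical helper: list entries whose names start with "kea" in a run
# directory (default: /var/run).
kea_leftovers() {
    dir="${1:-/var/run}"
    for f in "$dir"/kea*; do
        # An unmatched glob stays literal, so confirm the entry really exists.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s\n' "$f"
    done
}

# With the kea services stopped, anything printed here is a leftover:
#   kea_leftovers /var/run
# Remove an entry only after confirming that no kea process is running:
#   rm <path printed above>
```

Running it once before and once after restarting the services shows whether the socket and lock file were actually recreated.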

            Other checks :
            System processes like unbound, the web GUI and kea get restarted on an interface up/down event, which happens when an interface goes down for a moment. Are you seeing any of those?

            • jacotec

              I've had the same issue; it started around 3:00 last night. My monitoring sent me a couple of messages about CPU utilization above 90%, which turned out to be a couple of bzip processes compressing dhcpd logs (because they were flooded with the same message as above).

              Killed all the processes and removed the files as described here and it's back to OK now ... but I'm still curious why this happened.

              • JimNH @jacotec

                @jacotec I was just about to consider upgrading to KEA, but now I'll hold off...
                I was considering it because my DNS has been sluggish, with timeouts from various computers (small home office) saying they can't reach the pfSense box; they eventually do, but it's frustrating...

                • lohphat

                  Welp, the KEA bug on arm32 platforms has been closed as "won't fix" because the core of the problem is in BSD's 32-bit code.

                  https://redmine.pfsense.org/issues/15973#change-77507

                  The only resolution is to stay with ISC DHCP.

                  Which now brings us back to the supportability of 32-bit Netgate platforms going forward.

                  I'll ask again: What's the EOL schedule and support intentions of arm32 units?

                  SG-3100 25.07-RELEASE (arm) | Avahi (2.2_7) | ntopng (6.2.0) | openvpn-client-export (1.9.5) | pfBlockerNG-devel (3.2.7) | System_Patches (2.2.22)

                  • SteveITS Rebel Alliance @lohphat

                    @lohphat Not sure what you’re looking for…? Here’s a thread discussing the announcement, two years ago. Can’t find a blog post in a quick search but I recall the email.

                    https://forum.netgate.com/topic/183472/3100-will-reach-end-of-life-in-5-days

                    @JimNH Ensure you have DHCP lease registration off, or unbound will restart at each renewal. Or maybe grant long leases.

                    Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                    When upgrading, allow 10-15 minutes to reboot, or more depending on packages, and device or disk speed.
                    Upvote 👍 helpful posts!

                    • lohphat @SteveITS

                      @SteveITS

                      I know the hardware is past support -- the question is when are firmware releases no longer going to support it?

                      It's been documented that ISC DHCP will be deprecated and KEA is preferred, but the bug says KEA will never support arm32 because of underlying BSD issues.

                      That's a fair call.

                      What I'm looking for is an expectation of how much longer ISC DHCP will be around AND at what point firmware releases will no longer be offered for my 3100 unit.

                      • Gertjan @lohphat

                        @lohphat said in KEA DHCP continuously rebooting with error message after 24.11 upgrade and switch from ISC:

                        What I'm looking for is an expectation of how much longer ISC DHCP will be around ...

                        ISC DHCP won't receive any more updates or upgrades from ISC.
                        ISC DHCP isn't a project you fork, adapt, and deploy yourself ... as it is quite big.
                        So, if something 'simple' needs to be changed to keep it supported, and it still compiles and builds under FreeBSD 15, the version that pfSense uses, they'll include it (I hope).

                        However: the very day an "ISC DHCP" security issue is found, I expect it will be removed right away.
                        After all, for Netgate, pfSense is 'security first', and the rest has a lower priority.

                        Now you'll ask: when will this security issue be found? I presume no one can answer that one.

                        Take note: I'm just another pfSense user. I'm only summarizing the answer to this question that I found everywhere on the net. ISC DHCP is used a lot (every small, low-budget, arm-based (ISP) router out there?), so the question and its answer are known ... (and unknown).

                        I hope that you can use your arm32 for many months, maybe even years, to come.

                        • JimNH @SteveITS

                          @SteveITS Thanks; I believe I found the root cause... the MMC in my 4200 was dying and is now dead, replaced by an NVMe.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.