Netgate Discussion Forum

    SLBD using entire CPU

    Routing and Multi WAN
    35 Posts 8 Posters 14.5k Views
      cmb

      It's not a fix; it's a workaround until we can properly test and implement an alternative to slbd. We know what the problem is; unfortunately, it's pretty much impossible to solve. The solution is ditching slbd for hoststated, which will be done in a future version.

        cmb

        Also, this workaround does seem to work for the vast majority of people.

        wjs: how much load are you pushing to cause it to break down so easily?

          wjs

          cmb,
          Thanks very much for pushing that change, I saw it on the cvs track.

          Right now it's only my web browsing that's going into the WAN pool. The primary WAN port, which is not part of the pool at the moment, has a good bit of traffic. Last night we had about 1MB/s continuous, sometimes going up to about 10MB/s when someone would pull down something big.

          The CPU load hovers under 15% or 20%, but I think most of that is because I've got the whole dashboard open.
          I am only getting one process at a time maxing out before the script kicks in, so the system never goes to full load (dual-CPU system).

          I'm not sure this answered your question…

          If there is anything I can do to help get "hoststated" working for the next version let me know.

            sullrich

            @wjs:

            If there is anything I can do to help get "hoststated" working for the next version let me know.

            Basically, translate the .conf file from slbd -> hoststated. I really want to get this in here and it's on my gigantic whiteboard now, but if you want to do the work, please do so, as my gigantic whiteboard has many entries now :)

              wjs

              I'm not an expert, but I'll take a shot at it.

                Superman

                Are there any steps we can take to install and test hoststated, maybe having it alongside slbd just in case?

                The change in the script is working, but in between runs several processes get started and begin to chew up 100% CPU again. It would be nice to try out the newer service, but have the other to fall back on just in case…

                  Juve

                  Sorry to pull this topic up again, but I am suffering from the same problem with a clean and fresh 1.2 install. The pool is a failover pool with two WANs. One of the lines is currently down, so one gateway can't be pinged.

                  Am I unlucky, or is it something people still encounter?

                    wjs

                    I was able to 'mitigate' the problem by switching to a new machine and doing a fresh install. It seems to me like a fairly random problem.

                    Does slbd go to full load as soon as the pool is activated, or after some time?

                      Juve

                      It happens after some time or just after saving changes; the behaviour is really random.

                      I'll monitor it to identify when it forks and goes to 99%. I have other machines where it runs very well.

                        Juve

                        21 hours without any problem and then 100% CPU usage on one CPU.
                        Where can we adjust how often the killall script is executed?

                        Thanks

                          cmb

                          It's a known issue; using the uniprocessor kernel resolves it.

                            wjs

                            @cmb:

                            It's a known issue, using the uniprocessor kernel resolves it.

                            That would explain why the problem resolved itself on the new machine: it's single core.

                            Any reason why that is? I had been planning to build a new dual core router.

                              Juve

                              Thanks cmb,

                              When we do the killall -9 on slbd, what effect does it have on connections? Does it kill open connections?

                                eri--

                                1.2.1 should not suffer from it.

                                  Juve

                                  OK, fine.
                                  But what about connections when killing slbd?

                                  Thanks

                                    cmb

                                    slbd has no effect on your connections; its only function is to update the ruleset.

                                      Juve

                                      OK, thanks!
                                      This means I can use my own crontab script to kill/restart it more often.
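
For anyone wanting to try the same thing, here is a minimal sketch of such a watchdog. It is not the script pfSense ships. It assumes Python 3.7+ is available on the box (nothing in this thread says it is), the 90% threshold and the names `runaway_pids` and `CPU_THRESHOLD` are made up for illustration, and it only kills slbd processes that are pegging a CPU instead of doing a blanket killall. Restarting slbd afterwards is left out because the thread does not give the command for that.

```python
#!/usr/bin/env python3
"""Hypothetical watchdog: kill slbd processes that are pegging a CPU."""
import os
import signal
import subprocess
import sys

CPU_THRESHOLD = 90.0  # percent; an assumed cutoff, not a value from the thread


def runaway_pids(ps_output, name="slbd", threshold=CPU_THRESHOLD):
    """Return PIDs of processes called `name` whose %CPU is >= threshold.

    Expects one "PID %CPU COMMAND" record per line, with no header row.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, pcpu, comm = fields
        if os.path.basename(comm.strip()) != name:
            continue
        try:
            if float(pcpu) >= threshold:
                pids.append(int(pid))
        except ValueError:
            continue  # non-numeric field, ignore the line
    return pids


def main():
    # Each "-o keyword=" suppresses that column's header.
    out = subprocess.run(
        ["ps", "-ax", "-o", "pid=", "-o", "pcpu=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in runaway_pids(out):
        try:
            os.kill(pid, signal.SIGKILL)  # same signal as kill -9
        except ProcessLookupError:
            pass  # process exited between ps and kill


# Safety switch: nothing is killed unless --kill is passed explicitly.
if __name__ == "__main__" and "--kill" in sys.argv:
    main()
```

Saved under a path of your choosing and run from cron with `--kill` (for example once a minute), this would catch a runaway slbd sooner than a less frequent job. Since slbd only updates the ruleset, killing it should not drop existing connections.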

                                        eri--

                                        Try 1.2.1; it does not have the slbd problems with multi-WAN and should behave better.

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.