Netgate Discussion Forum

    Did I find a bug? Load Balancer Issue - Can't round robin 3 hosts [SOLVED]

    Routing and Multi WAN
    13 Posts 3 Posters 3.8k Views
    • jimp
      Rebel Alliance Developer Netgate

      In 2.0 it's done using relayd, so the commands would be found in documentation for relayd, not pfctl.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • ang

        Would anyone be willing/able to try and replicate this problem before I submit a bug report?

        I'll wait a day or two and if no one is interested, I'll go ahead with the bug report.

        • cmb

          Check /var/etc/relayd.conf. If you can find a specific problem there, please open a bug report. If you can't, post here for further help. Bug tickets are only for confirmed, specific issues, and this could be any number of things unless you can find a relayd config issue.

          • ang

            The configs look normal:

            [2.0-RC3-IPv6][admin@vm-pfs-2.0-rc3.localdomain]/root(55): relayctl show summary
            Id      Type            Name                            Avlblty Status
            1       redirect        VIP1                                    active
            1       table           vip1-realservers:80                     active (3 hosts)
            1       host            192.168.0.10                    100.00% up
            2       host            192.168.0.20                    100.00% up
            3       host            192.168.0.30                    100.00% up
            
            [2.0-RC3-IPv6][admin@vm-pfs-2.0-rc3.localdomain]/root(57): cat /var/etc/relayd.conf
            log updates 
            timeout 1000 
            table <vip1-realservers> { 192.168.0.10, 192.168.0.20, 192.168.0.30 }
            redirect "VIP1" {
              listen on 10.10.1.101 port 80
              forward to <vip1-realservers> port 80 check http '/'  code 200 
            }
            

            Is there anything else I can check?

            • jimp
              Rebel Alliance Developer Netgate

              Compare that output to what you see in the same places when only two servers are in the pool.

              • ang

                I removed web2 from the LB pool using the GUI (Services -> Load Balancers).

                The output is pretty much what you'd expect to see:

                [2.0-RC3-IPv6][admin@vm-pfs-2.0-rc3.localdomain]/root(58): relayctl show summary
                Id      Type            Name                            Avlblty Status
                1       redirect        VIP1                                    active
                1       table           vip1-realservers:80                     active (2 hosts)
                1       host            192.168.0.10                    100.00% up
                2       host            192.168.0.30                    100.00% up
                [2.0-RC3-IPv6][admin@vm-pfs-2.0-rc3.localdomain]/root(59): cat /var/etc/relayd.conf
                log updates 
                timeout 1000 
                table <vip1-realservers> { 192.168.0.10, 192.168.0.30 }
                redirect "VIP1" {
                  listen on 10.10.1.101 port 80
                  forward to <vip1-realservers> port 80 check http '/'  code 200 
                }
                [2.0-RC3-IPv6][admin@vm-pfs-2.0-rc3.localdomain]/root(60):
                

                If you're wondering if I've goofed up the testing somehow, my testing method is pretty simple:

                I load the VIP 10.10.1.101 in my browsers (IE and Firefox, caching disabled) and mash on the F5 button.  Each web server serves a page with its hostname at the top.

                • jimp
                  Rebel Alliance Developer Netgate

                  No, I was just wondering whether some other keyword showed up in the relayd.conf file in the three-server case, since it seemed to behave differently.

                  That said, try it with curl or wget at a command prompt. Even with the cache disabled, a browser will still hold TCP connections open unless you close the browser window completely between tests.
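
One way to run that kind of test is a small shell loop, since each curl or wget invocation is a separate process and therefore opens a fresh TCP connection. The sketch below is only an illustration. It assumes a POSIX shell on the client, that the VIP from this thread (10.10.1.101) is reachable, and that the first line of each page carries the serving host's name. `tally_backends`, `VIP_URL`, and `FETCH` are hypothetical names chosen for this example, not part of pfSense or relayd.

```shell
#!/bin/sh
# Sketch: count which backend answers when every request uses a new connection.
# VIP_URL defaults to the VIP used in this thread; override it for other setups.
VIP_URL="${VIP_URL:-http://10.10.1.101/}"

# FETCH is the command used to retrieve the page (hypothetical override hook).
# With wget instead of curl: FETCH="wget -qO-"
FETCH="${FETCH:-curl -s}"

# tally_backends N: fetch the page N times and count each distinct first line.
tally_backends() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        # A new process per request means no TCP connection is reused.
        $FETCH "$VIP_URL" | head -n 1
        i=$((i + 1))
    done | sort | uniq -c
}
```

After sourcing the file, `tally_backends 9` prints one line per distinct first line seen, prefixed with how often it appeared. If round robin is working across three hosts, the counts should come out roughly even.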

                  • ang

                    Well I guess I goofed the testing after all.

                    Using wget, I could see the round-robin works correctly.

                    Then confirmed by closing my browser after loading each page.

                    I assumed that I could just test by refreshing since it worked properly with any 2 hosts in the load balancing pool.  (That's odd, right?  I'm still wondering how that happened.)

                    • jimp
                      Rebel Alliance Developer Netgate

                      Hard to say on that one. Short of low-level debugging like packet captures of the connections and watching the states on the client and server ends, it's hard to even speculate.

                      • ang

                        Thanks for all your help jimp!

                        • cmb

                          Testing a load balancer with a web browser is really hit and miss, with persistent TCP connections and caching. Always use wget or similar for load balancer testing.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.