Netgate Discussion Forum

    Load Balancer stopped Balancing

    2.0-RC Snapshot Feedback and Problems - RETIRED
    31 Posts 13 Posters 16.8k Views

      townsenk

      ifconfig output

      rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
              options=8<VLAN_MTU>
              ether 00:11:95:1d:60:22
              inet 192.168.22.1 netmask 0xffffff00 broadcast 192.168.22.255
              inet6 fe80::211:95ff:fe1d:6022%rl0 prefixlen 64 scopeid 0x1
              nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
              media: Ethernet autoselect (100baseTX <full-duplex>)
              status: active
      dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
              options=80008<VLAN_MTU,LINKSTATE>
              ether 00:08:a1:83:08:74
              inet6 fe80::208:a1ff:fe83:874%dc0 prefixlen 64 scopeid 0x2
              inet 68.1.124.153 netmask 0xfffffe00 broadcast 68.1.125.255
              nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
              media: Ethernet autoselect (100baseTX <full-duplex>)
              status: active
      dc1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
              options=80008<VLAN_MTU,LINKSTATE>
              ether 00:1a:70:0f:cc:fd
              inet 24.249.193.155 netmask 0xffffffe0 broadcast 24.249.193.159
              inet6 fe80::21a:70ff:fe0f:ccfd%dc1 prefixlen 64 scopeid 0x3
              nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
              media: Ethernet autoselect (100baseTX <full-duplex>)
              status: active
      plip0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> metric 0 mtu 1500
      pflog0: flags=100<PROMISC> metric 0 mtu 33200
      enc0: flags=0<> metric 0 mtu 1536
      lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
              options=3<RXCSUM,TXCSUM>
              inet 127.0.0.1 netmask 0xff000000
              inet6 ::1 prefixlen 128
              inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
              nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
      pfsync0: flags=0<> metric 0 mtu 1460
              syncpeer: 224.0.0.240 maxupd: 128
      ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
              options=80000<LINKSTATE>
              inet6 fe80::211:95ff:fe1d:6022%ovpns1 prefixlen 64 scopeid 0x9
              inet 10.0.8.1 --> 10.0.8.2 netmask 0xffffffff
              nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
              Opened by PID 16567
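
For anyone reading the hex netmasks in the ifconfig output above, here is a rough Python sketch (not part of pfSense) that converts them to prefix notation and checks that each monitor IP from the apinger.conf in this post sits on the same subnet as its WAN address. The helper name `interface_network` is made up for illustration.

```python
import ipaddress


def interface_network(addr: str, hex_mask: str) -> ipaddress.IPv4Network:
    """Build the on-link network from an address and an ifconfig-style hex netmask."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.IPv4Network(f"{addr}/{mask}", strict=False)


# interface -> (address, hex netmask, monitor target), copied from this post.
interfaces = {
    "dc0": ("68.1.124.153", "0xfffffe00", "68.1.124.1"),
    "dc1": ("24.249.193.155", "0xffffffe0", "24.249.193.129"),
}

for name, (addr, mask, gateway) in interfaces.items():
    net = interface_network(addr, mask)
    on_link = ipaddress.IPv4Address(gateway) in net
    print(f"{name}: {net} broadcast {net.broadcast_address} "
          f"gateway {gateway} on-link={on_link}")
```

Both gateways come out on-link, and the computed broadcast addresses match what ifconfig reports, so the addressing itself looks consistent.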

      apinger.conf

      # pfSense apinger configuration file. Automatically Generated!

      # User and group the pinger should run as
      user "root"
      group "wheel"

      # Mailer to use (default: "/usr/lib/sendmail -t")
      #mailer "/var/qmail/bin/qmail-inject"

      # Location of the pid-file (default: "/var/run/apinger.pid")
      pid_file "/var/run/apinger.pid"

      # Format of timestamp (%s macro) (default: "%b %d %H:%M:%S")
      #timestamp_format "%Y%m%d%H%M%S"

      status {
      # File where the status information whould be written to
      file "/tmp/apinger.status"
      # Interval between file updates
      # when 0 or not set, file is written only when SIGUSR1 is received
      interval 10s
      }

      ########################################
      # RRDTool status gathering configuration
      # Interval between RRD updates
      rrd interval 60s;

      # These parameters can be overriden in a specific alarm configuration
      alarm default {
      command on "/usr/local/sbin/pfSctl -c 'filter reload'"
      command off "/usr/local/sbin/pfSctl -c 'filter reload'"
      combine 10s
      }

      # "Down" alarm definition.
      # This alarm will be fired when target doesn't respond for 30 seconds.
      alarm down "down" {
      time 10s
      }

      # "Delay" alarm definition.
      # This alarm will be fired when responses are delayed more than 200ms
      # it will be canceled, when the delay drops below 100ms
      alarm delay "delay" {
      delay_low 200ms
      delay_high 500ms
      }

      # "Loss" alarm definition.
      # This alarm will be fired when packet loss goes over 20%
      # it will be canceled, when the loss drops below 10%
      alarm loss "loss" {
      percent_low 10
      percent_high 20
      }

      target default {
      # How often the probe should be sent
      interval 1s

      # How many replies should be used to compute average delay
      # for controlling "delay" alarms
      avg_delay_samples 10

      # How many probes should be used to compute average loss
      avg_loss_samples 50

      # The delay (in samples) after which loss is computed
      # without this delays larger than interval would be treated as loss
      avg_loss_delay_samples 20

      # Names of the alarms that may be generated for the target
      alarms "down","delay","loss"

      # Location of the RRD
      #rrd file "/var/db/rrd/apinger-%t.rrd"
      }
      target "24.249.193.129" {
      description "GW_WAN2"
      srcip "24.249.193.155"
      alarms override "loss","delay","down";
      rrd file "/var/db/rrd/GW_WAN2-quality.rrd"
      }

      target "24.249.193.129" {
      description "GW_WAN2"
      srcip "24.249.193.155"
      alarms override "loss","delay","down";
      rrd file "/var/db/rrd/GW_WAN2-quality.rrd"
      }

      target "68.1.124.1" {
      description "GW_WAN1"
      srcip "68.1.124.153"
      alarms override "loss","delay","down";
      rrd file "/var/db/rrd/GW_WAN1-quality.rrd"
      }
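
The generated config above defines the target "24.249.193.129" twice. Whether that duplication has anything to do with the problem is not established in this thread, but it is easy to check a config for it. A minimal Python sketch, assuming targets always appear as `target "<address>" {` at the start of a line; `duplicate_targets` is a hypothetical helper, not something shipped with pfSense or apinger.

```python
import re
from collections import Counter

# Matches lines such as: target "24.249.193.129" {
# The unquoted "target default {" block is deliberately not matched.
TARGET_RE = re.compile(r'^\s*target\s+"([^"]+)"\s*\{', re.MULTILINE)


def duplicate_targets(conf_text: str) -> dict:
    """Return {target: count} for every target defined more than once."""
    counts = Counter(TARGET_RE.findall(conf_text))
    return {target: n for target, n in counts.items() if n > 1}


# Trimmed-down copy of the target section from this post.
sample = '''
target default {
interval 1s
}
target "24.249.193.129" {
description "GW_WAN2"
}

target "24.249.193.129" {
description "GW_WAN2"
}

target "68.1.124.1" {
description "GW_WAN1"
}
'''

print(duplicate_targets(sample))
```

On a live box you would read the real file instead of `sample`; its location is not given in this thread, so the path is left out here.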

      apinger.status

      24.249.193.129|24.249.193.155|GW_WAN2|19074|19070|1284569183|9.139ms|0.0%|none
      68.1.124.1|68.1.124.153|GW_WAN1|19074|19073|1284569183|12.777ms|0.0%|none
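
The status file is one pipe-separated record per target, so it can be inspected with a few lines of Python. This is only a sketch: the first three fields and the last three (delay, loss, alarm) are evident from the values, but reading fields four to six as probes sent, replies received, and a Unix timestamp of the last reply is my assumption, and `GatewayStatus` and `parse_status` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class GatewayStatus:
    target: str
    srcip: str
    name: str
    sent: int        # assumed: probes sent
    received: int    # assumed: replies received
    last_reply: int  # assumed: Unix timestamp of the last reply
    delay: str
    loss: str
    alarm: str


def parse_status(text: str) -> list:
    """Parse apinger.status-style lines, skipping anything without 9 fields."""
    rows = []
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 9:
            continue
        target, srcip, name, sent, received, last_reply, delay, loss, alarm = fields
        rows.append(GatewayStatus(target, srcip, name, int(sent), int(received),
                                  int(last_reply), delay, loss, alarm))
    return rows


sample = """\
24.249.193.129|24.249.193.155|GW_WAN2|19074|19070|1284569183|9.139ms|0.0%|none
68.1.124.1|68.1.124.153|GW_WAN1|19074|19073|1284569183|12.777ms|0.0%|none
"""

for gw in parse_status(sample):
    print(f"{gw.name}: delay={gw.delay} loss={gw.loss} alarm={gw.alarm}")
```

Both gateways report no alarm here, so apinger itself appears to see them as up even while the GUI cannot determine their status.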

        townsenk

        There is also another error concerning a gateway file, from /etc/inc/gwlb.inc, that is displayed on the console while booting.
        If you can tell me where this log file is located, I will provide the exact verbiage from that log as well.

        Thank you

          eri--

          All should be fixed on snapshots later than this post.

            Guest

            I just wanted to chime in and say thanks to the OP and everyone else who contributed. I've been trying to resolve my load balancing issues for the last few days and only just now got around to checking the forums. While I could have saved a good amount of time by checking sooner, finding that the problem has already been brought up and addressed is really pleasing. Anyway, thanks to everyone; I look forward to having functional load balancing again. Cheers.

            ~infinityv~

              jimp Rebel Alliance Developer Netgate

              That snapshot was not built after Ermal's post. A lot of work was done yesterday afternoon, and there hasn't been a new snap since then (the builder hasn't produced a usable snapshot run).

              You could try to gitsync and then try again, but iirc you will also need new apinger and check_reload_status/pfSctl binaries so it might break things.

              Just wait and try on the next new one.

              Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

              Need help fast? Netgate Global Support!

              Do not Chat/PM for help!

                jimp Rebel Alliance Developer Netgate

                The most recent snapshot (and the one you posted the timestamp from) was from yesterday morning, not today.

                  townsenk

                  With the new snapshot some outbound load balancing seems to be happening. However, Status/Gateways/Gateway Groups still shows the gateways as "unknown".

                  This still appears in the system log:

                  Sep 16 20:15:01 php: : Gateways status could not be determined, considering all as up/active.

                  Looks like a partial fix…

                    jimp Rebel Alliance Developer Netgate

                    There are even more fixes that didn't make it into that snapshot, but the next one, which should be building now, has them. It should be safe to gitsync from today's snap up to current code, though.

                      townsenk

                      Thanks, I'll give it a try.

                        townsenk

                        I gitsynced. Instead of displaying "unknown" it now says "Gathering Data".

                        I'll see what tomorrow's snapshot does…

                          trunglam

                          I used snapshot 2.0-BETA4 (i386), built on Sat Sep 18 22:12:31 EDT 2010, but outbound load balancing still fails with:

                          Sep 21 08:45:13 php: : Gateways status could not be determined, considering all as up/active.

                          The gateway information is:
                          Tier 1
                          WANGW, Gathering data
                          OPT1GW, Gathering data

                          Waiting for a good snapshot.

                            muffin

                            Also having this issue.
                            Currently running:
                            2.0-BETA4  (i386)
                            built on Sat Sep 18 23:15:00 EDT 2010

                              n1ko

                              Issues here too on the snapshot from the 18th. Sometimes it doesn't gather data; other times it gathers data but load balancing still doesn't work, and failover doesn't seem to work either. Occasionally it balances OK, but failover never seems to work…

                                roi

                                In my case, failover is working fine.
                                Both my gateways are set as Tier 1, and when one of them fails (which happens at least once a day  >:( ) it keeps going by sending all traffic out the other interface.

                                Version 2.0-BETA4 (i386)
                                AMD Athlon™ XP 2000+

                                  muffin

                                  Any news on this? Just an update would be good.  ;)

                                    roi

                                    It stopped working, but now it seems to work fine again.
                                    I am now going to update to Tue Sep 21 23:29:56 EDT 2010 and will see after this.

                                      stramato

                                      @roi:

                                      It stopped working, but now it seems to work fine again.
                                      I am now going to update to Tue Sep 21 23:29:56 EDT 2010 and will see after this.

                                      Please keep us updated on whether load balancing works on the Sep 21 release :)

                                      I reverted back to the Aug 27 release. Anything after the Aug 27 release seems to be a lot more buggy.

                                        townsenk

                                        Logs still fill up with…

                                        Sep 22 23:15:01 php: : Gateways status could not be determined, considering all as up/active.
                                        Sep 22 23:00:00 php: : Gateways status could not be determined, considering all as up/active.

                                        Gateway Group Status still displays "gathering data"...

                                          roi

                                          Over here it seems to work.
                                          It's not even 7am, so as the day goes on there will be more traffic and I will have better feedback.

                                          townsenk - are your interfaces configured using DHCP or static?

                                            townsenk

                                            One static and one DHCP. Load balancing seems to work. It's just that the group status isn't reported, and my system log fills up with the message I posted previously.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.