    SG-3100 Loadbalance and failover

    Official Netgate® Hardware
    • K
      kramer9 last edited by

      Hello all, I am having problems setting up load balancing and failover for a dual-WAN setup. I see the gateway groups and alarms being generated, but it never fails over:

      May 23 16:31:14	rc.gateway_alarm	82750	>>> Gateway alarm: WAN_STARLINK_DHCP (Addr:8.8.8.8 Alarm:0 RTT:39.250ms RTTsd:10.998ms Loss:5%)
      May 23 16:31:14	check_reload_status	374	updating dyndns WAN_STARLINK_DHCP
      May 23 16:31:14	check_reload_status	374	Restarting ipsec tunnels
      May 23 16:31:14	check_reload_status	374	Restarting OpenVPN tunnels/interfaces
      May 23 16:31:14	check_reload_status	374	Reloading filter
      May 23 16:31:15	php-fpm	360	/rc.openvpn: MONITOR: WAN_STARLINK_DHCP is available now, adding to routing group LoadBalanced
      May 23 16:31:15	php-fpm	360	8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|39.065ms|10.932ms|2%|online|none
      May 23 16:32:08	rc.gateway_alarm	67343	>>> Gateway alarm: WAN_STARLINK_DHCP (Addr:8.8.8.8 Alarm:1 RTT:37.056ms RTTsd:8.504ms Loss:21%)
      May 23 16:32:08	check_reload_status	374	updating dyndns WAN_STARLINK_DHCP
      May 23 16:32:08	check_reload_status	374	Restarting ipsec tunnels
      May 23 16:32:08	check_reload_status	374	Restarting OpenVPN tunnels/interfaces
      May 23 16:32:08	check_reload_status	374	Reloading filter
      May 23 16:32:09	php-fpm	360	/rc.openvpn: MONITOR: WAN_STARLINK_DHCP has packet loss, omitting from routing group LoadBalanced
      May 23 16:32:09	php-fpm	360	8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|37.204ms|8.526ms|23%|down|highloss
      May 23 16:33:39	rc.gateway_alarm	86689	>>> Gateway alarm: WAN_STARLINK_DHCP (Addr:8.8.8.8 Alarm:0 RTT:116.274ms RTTsd:281.754ms Loss:5%)
      May 23 16:33:39	check_reload_status	374	updating dyndns WAN_STARLINK_DHCP
      May 23 16:33:39	check_reload_status	374	Restarting ipsec tunnels
      May 23 16:33:39	check_reload_status	374	Restarting OpenVPN tunnels/interfaces
      May 23 16:33:39	check_reload_status	374	Reloading filter
      May 23 16:33:41	php-fpm	23581	/rc.openvpn: MONITOR: WAN_STARLINK_DHCP is available now, adding to routing group LoadBalanced
      May 23 16:33:41	php-fpm	23581	8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|114.721ms|278.25ms|3%|online|none
      

      Does anyone have a good troubleshooting guide to help determine what is set up wrong here? I have the gateway groups, the firewall LAN rules, and DNS set up in the general settings tab.
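To make the flapping in the log above easier to see, here is a minimal Python sketch that parses the pipe-delimited status records. The field names are my own reading of the line layout (monitor address, a local address, gateway name, RTT, RTT standard deviation, loss, status, substatus) and are assumptions, not names taken from pfSense itself.

```python
from dataclasses import dataclass


@dataclass
class GatewayStatus:
    # Field names are assumed from the layout of the log lines, not from pfSense.
    monitor_ip: str
    local_ip: str
    name: str
    rtt_ms: float
    rtt_sd_ms: float
    loss_pct: float
    status: str
    substatus: str


def strip_unit(value: str, unit: str) -> float:
    """Drop a trailing unit such as 'ms' or '%' and convert to float."""
    if value.endswith(unit):
        value = value[: -len(unit)]
    return float(value)


def parse_status_line(line: str) -> GatewayStatus:
    """Parse one pipe-delimited status record like those in the log."""
    fields = line.strip().split("|")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    monitor, local, name, rtt, rtt_sd, loss, status, substatus = fields
    return GatewayStatus(
        monitor_ip=monitor,
        local_ip=local,
        name=name,
        rtt_ms=strip_unit(rtt, "ms"),
        rtt_sd_ms=strip_unit(rtt_sd, "ms"),
        loss_pct=strip_unit(loss, "%"),
        status=status,
        substatus=substatus,
    )


# The three status records copied from the log above.
lines = [
    "8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|39.065ms|10.932ms|2%|online|none",
    "8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|37.204ms|8.526ms|23%|down|highloss",
    "8.8.8.8|192.168.1.232|WAN_STARLINK_DHCP|114.721ms|278.25ms|3%|online|none",
]
records = [parse_status_line(line) for line in lines]
for r in records:
    print(f"{r.name}: loss={r.loss_pct}% rtt={r.rtt_ms}ms -> {r.status}/{r.substatus}")
```

Read this way, the gateway goes online, down (highloss), then online again within about two and a half minutes, which suggests it is being dropped from and re-added to the group on packet loss rather than on a real outage.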

      • DaddyGo
        DaddyGo @kramer9 last edited by

        @kramer9 said in SG-3100 Loadbalance and failover:

        I see the gateway groups and alarms being generated, but it never fails over

        Hi,

        What is your exact GW Group setting, can you show us?

        -tier(s)
        and
        Trigger Level:

        -member down
        -high latency
        -packet loss
        -loss + high latency

        BTW:

        Well-known DNS servers are not a good choice of monitor IP here: they do not reflect the state of your ISP link exactly, because the ping result also depends on the load on the DNS server.

        Can you remove your double-NAT configuration (put the upstream device in bridge mode)?

        Cats bury it so they can't see it!
        (You know what I mean if you have a cat)

        • K
          kramer9 @DaddyGo last edited by

          @daddygo - Trying to figure out the best way to show you what you are asking:

          [image: 61633d41-9263-45fe-bcda-343c80512297-image.png]

          [image: 7dd6f435-9b26-4ab1-8c83-d448e577cedd-image.png]

          • DaddyGo
            DaddyGo @kramer9 last edited by DaddyGo

            @kramer9 said in SG-3100 Loadbalance and failover:

            best way to show

            😉

            What I'd do if I couldn't eliminate the double NAT: either keep the DNS server(s) as the monitor IP, or run "tracert" to find a nearby upstream gateway at the ISP that responds to ping.
            (The latter gives a better measurement and a more stable value for "dpinger".)
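As a rough illustration of the "tracert" suggestion, the sketch below walks a list of traceroute hops and returns the first one outside the RFC 1918 private ranges and the 100.64.0.0/10 shared address space used for CGNAT. The hop list is made-up sample data and the function name is hypothetical; you would still need to confirm that the chosen hop actually answers pings before using it as a monitor IP.

```python
import ipaddress

# RFC 1918 private ranges plus the RFC 6598 shared address space (CGNAT).
NON_PUBLIC_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")
]


def first_upstream_hop(hops):
    """Return the first hop that is not in a private or CGNAT range, or None."""
    for hop in hops:
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            # Unanswered hops are often shown as '*'; skip anything unparsable.
            continue
        if not any(addr in net for net in NON_PUBLIC_NETS):
            return hop
    return None


# Hypothetical traceroute output, nearest hop first (sample data only).
sample_hops = ["192.168.1.1", "*", "100.64.0.1", "198.51.100.1", "8.8.8.8"]
candidate = first_upstream_hop(sample_hops)
print("monitor IP candidate:", candidate)
```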

            I would not configure it for packet loss, but I would choose member down.
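To show why the trigger level matters for a link like this one, here is a simplified model of the four trigger levels listed earlier in the thread. It is not pfSense's actual logic: the function name, the 20% loss limit and the 500 ms latency limit are assumptions chosen for illustration, and "loss + high latency" is modelled as "either condition". The sample values are the loss and RTT figures from the three gateway alarm lines in the log.

```python
def excluded_from_group(trigger, reachable, loss_pct, rtt_ms,
                        loss_limit=20.0, rtt_limit=500.0):
    """Simplified model: is a gateway dropped from its group at this trigger level?

    loss_limit and rtt_limit are assumed thresholds, not values read from
    any real configuration.
    """
    if not reachable:
        return True  # a dead gateway is excluded at every trigger level
    high_loss = loss_pct >= loss_limit
    high_latency = rtt_ms >= rtt_limit
    if trigger == "member down":
        return False
    if trigger == "packet loss":
        return high_loss
    if trigger == "high latency":
        return high_latency
    if trigger == "loss + high latency":
        return high_loss or high_latency
    raise ValueError(f"unknown trigger level: {trigger}")


# (loss %, RTT ms) taken from the three gateway alarm lines in the log.
samples = [(5.0, 39.250), (21.0, 37.056), (5.0, 116.274)]

for trigger in ("member down", "packet loss"):
    decisions = [excluded_from_group(trigger, True, loss, rtt) for loss, rtt in samples]
    print(trigger, decisions)
```

Under these assumptions, "packet loss" drops the gateway for the middle sample and re-adds it afterwards, while "member down" leaves it in the group throughout, since the link never actually went dead.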

            Is there a big speed difference between the two ISP links?
            If not, plain load balancing will cover everything you need and no separate failover setup is required.

            This will help (it's a rough guide; I couldn't find a better one for you on short notice): 😉

            https://www.cyberciti.biz/faq/howto-configure-dual-wan-load-balance-failover-pfsense-router/

            • K
              kramer9 last edited by

              Many thanks! I made a couple of minor tweaks based on the URL and your notes, and it now works like a champ for both load balancing and failover. I used speedtest.net from the endpoints to confirm: the only thing they notice is that the speed drops drastically. There is a HUGE speed difference: Starlink gives me about 140M down and USCellular about 15M down. Since Starlink still has lags and gaps, cellular is the backup for both load balancing and failover.

              So where are you seeing the Dual-NAT? I won't switch the house over to this setup until I get the mesh hardware, which is on backorder.

              • DaddyGo
                DaddyGo @kramer9 last edited by

                @kramer9 said in SG-3100 Loadbalance and failover:

                There is a HUGE speed difference Starlink gives me about 140M/down and USCellular 15M/down.

                Okay 😉
                so I understand your dual approach loadbalance / failover

                @kramer9 "So where are you seeing the Dual-Nat?"

                I think I saw an RFC1918 IP address on the WAN_STARLINK_DHCP gateway, correct me if I'm wrong and this is just a test.... 😉

                • S
                  SteveITS @DaddyGo last edited by

                  @daddygo said in SG-3100 Loadbalance and failover:

                  RFC1918 IP address on the WAN_STARLINK_DHCP gateway

                  I've seen comments elsewhere that Starlink uses CGNAT.

                  Steve

                  Only install packages for your version, or risk breaking it. If yours is older, select it in System/Update/Update Settings.
                  When upgrading, let it finish. Allow 10 minutes, or more depending on packages and device speed.

                  • DaddyGo
                    DaddyGo @SteveITS last edited by

                    @steveits said in SG-3100 Loadbalance and failover:

                    I've seen comments elsewhere that Starlink uses CGNAT.

                    Well then I saw it right 😉

                    Aha, that's not the best situation: you can only hope the CGNAT is there just because the provider is short on IPv4 address space, and that there are no nonsense filtering rules on the NAT.

                    It's like needing two hands for a job while one is tied behind your back.

                    It's also strange that they use 192.168.0.0/16 rather than 10.0.0.0/8; they can't be that short on addresses then, hmmm?
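For anyone wanting to check an address themselves, this small sketch uses Python's standard `ipaddress` module to sort an address into the RFC 1918 private ranges, the 100.64.0.0/10 shared address space that RFC 6598 reserves for carrier-grade NAT, or neither. The function name and labels are my own; 192.168.1.232 is the address from the log, and the other two are just sample inputs.

```python
import ipaddress

# Range labels are descriptive only; the prefixes come from RFC 1918 and RFC 6598.
RANGES = {
    "RFC 1918 private (10.0.0.0/8)": ipaddress.ip_network("10.0.0.0/8"),
    "RFC 1918 private (172.16.0.0/12)": ipaddress.ip_network("172.16.0.0/12"),
    "RFC 1918 private (192.168.0.0/16)": ipaddress.ip_network("192.168.0.0/16"),
    "RFC 6598 shared address space (CGNAT)": ipaddress.ip_network("100.64.0.0/10"),
}


def classify(address: str) -> str:
    """Return the label of the first matching range, or a fallback label."""
    addr = ipaddress.ip_address(address)
    for label, network in RANGES.items():
        if addr in network:
            return label
    return "not in a private or CGNAT range"


for sample in ("192.168.1.232", "100.64.0.1", "8.8.8.8"):
    print(sample, "->", classify(sample))
```

Seeing an RFC 1918 address on the WAN side says only that some NAT device sits upstream; it could be the provider's own router handing out 192.168.x.x on its LAN side, so on its own it neither confirms nor rules out CGNAT further upstream.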
