Netgate Discussion Forum

    Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN

    Routing and Multi WAN
    • K
      knoppolis
      last edited by

      @jimeez Random question: for the NICs that your two WAN connections come in on, is it a single card with dual or quad ports, or two physically separate NICs? I had a single quad-port Intel NIC, with two of the ports used for my WAN connections and one for my LAN.

      As soon as I get a chance I am going to try putting another NIC in the box and see if having two isolated NICs makes any difference.

      • J
        jimeez @knoppolis
        last edited by

        @knoppolis

        Three Single NIC cards. All Intel. One of the first things I did when this started happening was replace the NICs.

        Tonight I'm going to experiment with the EdgeRouter a bit more and put CenturyLink as the main connection with StarLink as the failover. See if I get the same result.

        • J
          jimeez
          last edited by

          So there we have it. The problem HAS to be on StarLink's end. At least this is my unprofessional conclusion.

          I feel silly for not trying this before now, but tonight I re-inserted the EdgeRouter into my network but this time I made CenturyLink the primary and StarLink the secondary fail-over. Guess what? It's been working fine for the last several hours.

          Whatever is happening every 15 minutes when StarLink is the primary WAN is beyond me. But it is currently working fine in a dual-WAN fail-over environment on an EdgeRouter Lite-3. I suppose the real test will be to see if I get the same result on the pfSense box.

          • J
            jimeez @preston
            last edited by jimeez

            @preston

            One more item to add to the list of discoveries: I tried making CenturyLink the primary interface this morning with StarLink as the backup fail-over. No issues. The connection remained solid for several hours with both interfaces active. As soon as I switched back to StarLink as primary and CenturyLink as fail-over, the drops started all over again. Every 15 minutes, on the exact 15-minute mark.

            I don't know how this is NOT a StarLink issue.
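
For anyone wanting to confirm the timing from logs: a drop "every 15 minutes" means consecutive drops should be about 900 seconds apart. A minimal Python sketch of that check follows. The timestamps are made-up sample values (not taken from this thread), and `intervals_s` and `looks_periodic` are hypothetical helper names.

```python
from datetime import datetime

def intervals_s(events):
    """Seconds between each pair of consecutive events."""
    return [int((b - a).total_seconds()) for a, b in zip(events, events[1:])]

def looks_periodic(events, period_s=900, tolerance_s=10):
    """True if every gap between events is within tolerance_s of period_s."""
    gaps = intervals_s(events)
    return bool(gaps) and all(abs(g - period_s) <= tolerance_s for g in gaps)

# Hypothetical gateway-down timestamps, for illustration only.
sample_drops = [
    datetime(2024, 9, 1, 10, 15, 2),
    datetime(2024, 9, 1, 10, 30, 1),
    datetime(2024, 9, 1, 10, 45, 3),
    datetime(2024, 9, 1, 11, 0, 2),
]

gaps = intervals_s(sample_drops)
periodic = looks_periodic(sample_drops)
print(gaps, periodic)
```

Gaps that cluster tightly around 900 seconds point at something timer-driven rather than random signal loss, although that alone does not say which side owns the timer.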

            • P
              preston @jimeez
              last edited by preston

              @jimeez

              I can confirm the same on my end. When CL is primary, it stays up.

              Question about your setup: What are you using for DNS servers? Are you using something like 1.1.1.1 or 8.8.8.8? I've tried different combinations and nothing seems to matter.

              • J
                jimeez @preston
                last edited by

                @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                @jimeez

                Question about your setup: What are you using for DNS servers? Are you using something like 1.1.1.1 or 8.8.8.8? I've tried different combinations and nothing seems to matter.

                Yeah, I use four DNS servers, in this order:

                8.8.8.8
                8.8.4.4
                (and then two other DNS servers from a geographically local ISP)
                
                • P
                  preston
                  last edited by preston

                  I still haven't found a solution...

                  • Starlink pushed an update yesterday. I applied it, no change.

                  • I finally fiddled with the Advanced DHCP configuration, no change.

                  • I switched back to ISC DHCP again (the Kea service keeps crashing/stopping when adding Centurylink), no change.

                  • J
                    jimeez @preston
                    last edited by jimeez

                    @preston

                    Same here. I have shared this information with several folks who are much more knowledgeable than I am, and we have not figured out a solution.

                    One of these people suggested using tcpdump to capture activity on both the LAN and WAN side when the drop occurs on the 15-minute mark. I haven't done that yet. Might mess around with it today.
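
A rough sketch of how such a capture could be set up, written as a small Python helper that only builds the tcpdump command line (it does not run it). Narrowing the capture to DHCP traffic (UDP ports 67 and 68) is an assumption made here to keep the file small; the suggestion in the thread was simply to capture around the drop. The interface name `igb1` and the output path are placeholders, and `dhcp_capture_cmd` is a hypothetical name.

```python
import shlex

def dhcp_capture_cmd(interface, outfile):
    """Build a tcpdump argument list that records DHCP traffic to a pcap file."""
    return [
        "tcpdump",
        "-n",             # do not resolve addresses to names
        "-i", interface,  # interface to capture on
        "-w", outfile,    # write raw packets to this file
        "udp port 67 or udp port 68",  # capture filter: DHCP server/client ports
    ]

# Placeholder interface name and path; substitute the real WAN interface.
cmd = dhcp_capture_cmd("igb1", "/tmp/wan2-dhcp.pcap")
print(shlex.join(cmd))
```

Starting the capture a minute or two before the expected drop and stopping it shortly after should bracket whatever happens at the 15-minute mark.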

                    I really think you're on to something with the "dhclient 86826 bound to 76.0.28.79 – renewal in 900 seconds" log line. But I don't know what to do with it. I have not dug around in the CenturyLink modem settings yet. But maybe there is something that changed on CenturyLink's end? Would be curious to see the date of the most recent firmware and if it corresponds to the onset of our problem.

                    Assume you are using the CenturyLink modem in Transparent Bridge mode?
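
The arithmetic behind that hunch is simple: 900 seconds / 60 = 15 minutes, the same interval as the drops. A small Python sketch that pulls the address and renewal time out of the log line quoted above; the regular expression is an illustration written for this one line, not a general dhclient log parser.

```python
import re

# Log line as quoted in this thread.
line = "dhclient 86826 bound to 76.0.28.79 – renewal in 900 seconds"

match = re.search(r"bound to (\S+).*?renewal in (\d+) seconds", line)
address = match.group(1)
renewal_s = int(match.group(2))
renewal_min = renewal_s / 60

print(f"{address}: lease renewal every {renewal_s} s = {renewal_min:g} min")
```

A matching interval is a correlation, not proof, but it does make the DHCP renewal on that WAN the first thing worth watching when the drop happens.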

                    • chpalmerC
                      chpalmer
                      last edited by

                      What happens if you put your Centurylink as tier 3?

                      Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                      Triggering snowflakes one by one..
                      Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz on an M400 WG box.

                      • P
                        preston @jimeez
                        last edited by

                        @jimeez said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                        @preston

                        Assume you are using the CenturyLink modem in Transparent Bridge mode?

                        Yep, I am running the Centurylink modem (Zyxel C110Z running Firmware CZW007-4.16.012.15) in Transparent bridging mode. The firmware was one version out of date when all this started. I updated it to the latest firmware I could find early on in our troubleshooting.

                        • P
                          preston @chpalmer
                          last edited by preston

                          @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                          What happens if you put your Centurylink as tier 3?

                          Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                          I haven't tried putting the CL modem back in modem mode (for fear of a double NAT), but that is a good idea. I will give that a try when I can and report back.

                          As far as Tier 3, I only have two WANs.

                          Like @jimeez, when I make Centurylink Tier 1 and Starlink Tier 2, Starlink seems to stay online (as far as I can tell). Although, I'm starting to get lost in everything I've tried so far...
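
For readers less familiar with tiers, here is a simplified Python model of tier-based fail-over: traffic uses the online gateway(s) in the lowest-numbered tier, and higher tiers are used only when every lower tier is down. This is a sketch of the concept only, not pfSense's actual code; `Gateway` and `active_gateways` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Gateway:
    name: str
    tier: int
    online: bool

def active_gateways(gateways):
    """Names of the online gateways in the lowest tier that has any online member."""
    online = [g for g in gateways if g.online]
    if not online:
        return []
    best_tier = min(g.tier for g in online)
    return [g.name for g in online if g.tier == best_tier]

# The arrangement that misbehaves in this thread: Starlink Tier 1, CenturyLink Tier 2.
both_up = active_gateways([Gateway("Starlink", 1, True), Gateway("CenturyLink", 2, True)])
primary_down = active_gateways([Gateway("Starlink", 1, False), Gateway("CenturyLink", 2, True)])
print(both_up, primary_down)
```

In this model the Tier 2 link carries no traffic while Tier 1 is up, which is why a Tier 1 drop caused merely by enabling the Tier 2 interface is so surprising.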

                          • P
                            preston @chpalmer
                            last edited by preston

                            @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                            Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                            You may be on to something there!

                            I took the Centurylink modem out of Transparent Bridging mode and connected a LAN port on the Centurylink modem to my pfSense box (WAN2), and I made it past the 15-minute mark without losing Starlink (WAN1). Also, no errors in the DHCP logs. I am also using ISC DHCP instead of Kea DHCP.

                            Edit: It's been about 30 minutes and things are still working (Starlink WAN1 staying online). I'm out of time tonight to do any more troubleshooting, but will over the next few days. Will report back.

                            • J
                              jimeez @preston
                              last edited by

                              @preston

                              Everything I have ever read or watched about connecting a DSL modem to pfSense instructs that the modem be placed in transparent bridge mode. Curious to know what settings you applied in pfSense to get this to work.

                              • J
                                jimeez @chpalmer
                                last edited by

                                @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                What happens if you put your Centurylink as tier 3?

                                As @preston already stated, both of us have only two WANs.

                                Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                                Sounds like this may have worked for @preston. Assuming it did/does, I have a couple of questions:

                                • With the DSL modem no longer in transparent bridge mode I assume that it will assign that WAN interface a local IP address of 192.168.1.xxx. If this assumption is correct, is this connection now sitting behind a double NAT?
                                • If that's the case, I guess we can no longer use pfSense to resolve our Dynamic DNS clients as that interface will no longer have an outside IP address.
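
Both questions hinge on whether the WAN interface ends up with a private (RFC 1918) address. A quick way to test any address is sketched below in Python. The address `192.168.1.50` is a hypothetical example of the 192.168.1.xxx case described above, `76.0.28.79` is the public address from the dhclient log line quoted earlier in the thread, and `is_rfc1918` is a hypothetical helper name.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(addr):
    """True if addr falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)

print(is_rfc1918("192.168.1.50"))  # hypothetical modem-assigned WAN address
print(is_rfc1918("76.0.28.79"))    # public address from the earlier log line
```

If the WAN address is private, the modem is translating between it and the public address, so traffic from the LAN is translated twice, and a Dynamic DNS client that simply publishes the interface address would publish one that is unreachable from outside.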

                                You'll have to pardon my ignorance here. I only know enough to be slightly dangerous.

                                • P
                                  preston @jimeez
                                  last edited by

                                  @jimeez

                                  Things are still working here with the CenturyLink modem out of transparent bridging mode.

                                  Here is what I did (I connected my laptop via Ethernet to the CenturyLink modem for the setup):

                                  • Factory reset the CenturyLink modem (again), disabled the CL modem WiFi, reset the admin password and so on.

                                  • My CenturyLink modem's default GUI address is 192.168.0.1 - I left that as is.

                                  • I connected LAN 1 on CL modem to my WAN2 pfSense port.

                                  • Under DHCP reservations in the CL modem, I assigned my pfSense box an IP of 192.168.0.2.

                                  • I disabled the DHCP server on the CL modem.

                                  • Rebooted the CL modem

                                  • Unplugged my laptop from the CL modem.

                                  • I reconnected to my pfSense network and set up the CL interface and gateway.

                                  • My DNS servers and monitor IPs are 1.1.1.1 for Starlink and 8.8.8.8 for CenturyLink respectively.

                                  • My pfSense LAN is in the 192.168.1.xxx range

                                  • The pfSense dashboard shows the CL WAN IP as 192.168.0.2, but when I check sites like infosniper.net I can see the CL IP address.

                                  • As an added bonus I can now access the CL modem GUI (192.168.0.1) via the pfSense network without having to fiddle with additional pfSense settings.

                                  • I'm not sure about Dynamic DNS, but I have been using Tailscale with Starlink and it has worked great.

                                  • @chpalmer may just be our hero!
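
One detail in the list above that is easy to miss: the modem side (192.168.0.x) and the pfSense LAN (192.168.1.x) are different subnets. A small Python check of that addressing plan is sketched below, assuming /24 masks on both sides (the masks are not stated in the post).

```python
import ipaddress

# Addresses from the steps above; the /24 prefix lengths are an assumption.
modem_side = ipaddress.ip_network("192.168.0.0/24")
pfsense_lan = ipaddress.ip_network("192.168.1.0/24")
modem_ip = ipaddress.ip_address("192.168.0.1")
wan2_ip = ipaddress.ip_address("192.168.0.2")

# The WAN2 subnet must not overlap the LAN, or pfSense cannot route between them.
overlap = modem_side.overlaps(pfsense_lan)

# pfSense's WAN2 address must sit in the modem's subnet and differ from the modem.
wan2_ok = wan2_ip in modem_side and modem_ip in modem_side and wan2_ip != modem_ip

print(overlap, wan2_ok)
```

If a modem defaulted to 192.168.1.1 instead, it would collide with a 192.168.1.x pfSense LAN, and one of the two ranges would need to change before this setup could work.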

                                  As far as Double NAT while using the CL WAN, I really don't know (or understand it completely), but here is my Traceroute from the CL WAN to www.google.com:

                                   1  192.168.0.1  0.544 ms  0.425 ms  0.402 ms
                                   2  184.102.159.254  28.701 ms  28.475 ms  28.817 ms
                                   3  71.33.4.9  28.078 ms  28.604 ms  28.296 ms
                                   4  4.68.144.169  59.685 ms  46.815 ms  42.017 ms
                                   5  4.68.127.114  44.359 ms  55.718 ms  63.480 ms
                                   6  * * *
                                   7  142.251.60.10  42.169 ms
                                      216.239.51.116  43.287 ms
                                      209.85.255.172  42.109 ms
                                   8  209.85.247.117  42.379 ms
                                      192.178.249.234  43.372 ms
                                      209.85.247.117  42.327 ms
                                   9  142.251.233.230  43.373 ms  44.324 ms
                                      142.250.190.4  41.627 ms
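
One way to read that traceroute for the double NAT question is to count how many hops at the start of the path have private addresses. The Python sketch below uses the first responding address at each hop from the output above, with `None` standing in for the `* * *` hop; `leading_private_hops` is a hypothetical helper name.

```python
import ipaddress

# First responding address per hop, copied from the traceroute above.
hops = [
    "192.168.0.1",
    "184.102.159.254",
    "71.33.4.9",
    "4.68.144.169",
    "4.68.127.114",
    None,  # hop 6 did not answer
    "142.251.60.10",
    "209.85.247.117",
    "142.251.233.230",
]

def leading_private_hops(path):
    """Count consecutive private-address hops at the start of the path."""
    count = 0
    for hop in path:
        if hop is None or not ipaddress.ip_address(hop).is_private:
            break
        count += 1
    return count

print(leading_private_hops(hops))
```

A single private hop (the modem at 192.168.0.1) before the first public address is consistent with the modem doing its own routing and NAT, so LAN traffic leaving through the CL WAN is most likely translated twice: once by pfSense and once by the modem.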
                                  
                                  • J
                                    jimeez @preston
                                    last edited by

                                    @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                    I disabled the DHCP server on the CL modem.

                                    Ahhhhh....there it is. This makes sense now. Perfect! I'll give this a try tonight and see if I get the same result.

                                    • chpalmerC
                                      chpalmer @preston
                                      last edited by chpalmer

                                      @preston @jimeez

                                      My guess is that pfsense is re-authenticating with C-Link every 15 minutes and something occurs to cause the issue at that time.

                                      Though I am unsure why this hasn't come up before with other users trying to utilize similar setups. I use Astound and Verizon here and have no issues. Neither of my ISPs uses any kind of PPP.


                                      • P
                                        preston @chpalmer
                                        last edited by

                                        @chpalmer

                                        Generally speaking, CenturyLink (now called Brightspeed) has been the worst ISP I have ever had. Until Starlink, they were the only option in my area.

                                        That being said, it worked fine in Transparent Bridging for a long time. Not sure what changed, but it sure broke things. So far, so good. Things seem to be back to normal. Hope it works for you @jimeez.

                                        Thank you again!

                                        • J
                                          jimeez @preston
                                          last edited by

                                          @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                          @chpalmer

                                          Generally speaking, CenturyLink (now called Brightspeed) has been the worst ISP I have ever had. Until Starlink, they were the only option in my area.

                                          That being said, it worked fine in Transparent Bridging for a long time. Not sure what changed, but it sure broke things.

                                          This was my exact experience as well. Only option available until StarLink (I mean I choose to live in the middle of nowhere). Worked just fine in transparent bridge mode forever....and still does when it's the only active interface. But something changed on or about August 22, 2024.

                                          @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                          Hope it works for you @jimeez.

                                          I'm not quite there yet. Although it does seem promising. I spent more hours on this last night than I will admit and am still struggling with the CL modem settings. I have an old Protectli 4-port device with which I decided to start fresh. Got StarLink up and running no problem on the main WAN interface. Adding the CL interface is another story for some reason. I must be doing something wrong.

                                          (I disconnected the StarLink interface while setting up the CL interface.)

                                          • I initially connected a laptop to the factory-reset CL modem via the WAN port (laptop to WAN port).
                                          • After initial config of the CL modem (turn off WiFi etc.) I connected the pfSense device OPT1 interface to Port 1 on the CL modem and reserved an IP for it. In this case I was not able to use 192.168.0.2 because the laptop already took it, so I gave it 192.168.0.5.
                                          • When the DHCP service is active both the laptop and pfSense see the modem and have internet.
                                          • As soon as I disable the DHCP server on the CL modem I can no longer resolve DNS addresses. The laptop and pfSense devices both now show that they no longer have internet.
                                          • I can ping actual IP addresses on both devices (like 8.8.8.8), but can't resolve addresses (say google.com).

                                          Basically I'm stuck here. Grateful for any input on the likely obvious thing I'm doing wrong. ;-)
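
The symptom described above (IP addresses reachable, names not resolving) can be separated into two independent checks. The Python sketch below does that; `can_reach`, `can_resolve`, and `classify` are hypothetical helper names. One possible explanation, which the thread does not confirm, is that the devices were learning their DNS servers from the modem's DHCP server and lost them when it was switched off.

```python
import socket

def can_reach(ip, port=53, timeout=2.0):
    """True if a TCP connection to ip:port succeeds (tests routing, not DNS)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def can_resolve(hostname):
    """True if the system resolver can turn hostname into an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False

def classify(ip_ok, dns_ok):
    """Turn the two probe results into a short diagnosis."""
    if ip_ok and dns_ok:
        return "routing and DNS both work"
    if ip_ok:
        return "routing works, DNS fails: check which DNS servers are configured"
    if dns_ok:
        return "DNS answers (possibly cached) but the test address is unreachable"
    return "no connectivity: check address, gateway, and cabling"
```

Running `classify(can_reach("8.8.8.8"), can_resolve("google.com"))` on the affected machine should return the "routing works, DNS fails" message in the situation described, which narrows the problem to DNS settings rather than the link itself.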

                                          • J
                                            jimeez
                                            last edited by jimeez

                                            @chpalmer @preston

                                            Short response right now is that this works for me too. Thank you so much!!

                                            Will post back later with more detail.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.