Netgate Discussion Forum

    Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN

    Routing and Multi WAN
    87 Posts 5 Posters 7.7k Views
    • J
      jimeez @preston
      last edited by jimeez

      @preston

      One more item to add to the list of discoveries: this morning I tried making CenturyLink the primary interface with StarLink as the backup fail-over. No issues. The connection remained solid for several hours with both interfaces active. As soon as I switched back to StarLink as primary and CenturyLink as fail-over, the drops started all over again. Every 15 minutes, on the exact 15-minute mark.

      I don't know how this is NOT a StarLink issue.

      • P
        preston @jimeez
        last edited by preston

        @jimeez

        I can confirm the same on my end. When CL is primary, it stays up.

        Question about your setup: What are you using for DNS servers? Are you using something like 1.1.1.1 or 8.8.8.8? I've tried different combinations and nothing seems to matter.

        • J
          jimeez @preston
          last edited by

          @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

          @jimeez

          Question about your setup: What are you using for DNS servers? Are you using something like 1.1.1.1 or 8.8.8.8? I've tried different combinations and nothing seems to matter.

          Yeah, I use four DNS servers, in this order:

          8.8.8.8
          8.8.4.4
          (and then two other DNS servers from a geographically local ISP)
          
          • P
            preston
            last edited by preston

            I still haven't found a solution...

            • Starlink pushed an update yesterday. I applied it, no change.

            • I finally fiddled with the Advanced DHCP configuration, no change.

            • I switched back to ISC DHCP again (the Kea service keeps crashing/stopping when adding Centurylink), no change.

            • J
              jimeez @preston
              last edited by jimeez

              @preston

              Same here. I have shared this information with several folks who are much more knowledgeable than I am, and none of us has figured out a solution.

              One of these people suggested using tcpdump to capture activity on both the LAN and WAN side when the drop occurs at the 15-minute mark. I haven't done that yet. Might mess around with it today.

              I really think you're on to something with the "dhclient 86826 bound to 76.0.28.79 – renewal in 900 seconds". But I don't know what to do with it. I have not dug around in the CenturyLink modem settings yet. But maybe something changed on CenturyLink's end? I would be curious to see the date of the most recent firmware and whether it corresponds to the onset of our problem.
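
For reference, the 900-second renewal in that log line works out to exactly the 15-minute cadence of the drops. A minimal Python sketch of the arithmetic follows. The regular expression and the function name `parse_renewal` are illustrative assumptions based on the single log line quoted above, not a documented dhclient format.

```python
import re

# Log line as quoted in this thread. The pattern below is an assumption
# derived from this one sample, not an official dhclient log format.
LOG_LINE = "dhclient 86826 bound to 76.0.28.79 – renewal in 900 seconds"

RENEWAL_RE = re.compile(r"bound to (\S+) .*renewal in (\d+) seconds")

def parse_renewal(line):
    """Return (ip_address, renewal_seconds), or None if the line does not match."""
    match = RENEWAL_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

ip, seconds = parse_renewal(LOG_LINE)
print(f"{ip}: renews every {seconds / 60:g} minutes")  # 900 s -> 15 minutes
```

This only shows that the timing lines up. It does not show why a renewal on the CenturyLink interface would take StarLink down.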

              Assume you are using the CenturyLink modem in Transparent Bridge mode?

              • chpalmerC
                chpalmer
                last edited by

                What happens if you put your Centurylink as tier 3?

                Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                Triggering snowflakes one by one..
                Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz on an M400 WG box.

                • P
                  preston @jimeez
                  last edited by

                  @jimeez said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                  @preston

                  Assume you are using the CenturyLink modem in Transparent Bridge mode?

                  Yep, I am running the Centurylink modem (Zyxel C110Z running Firmware CZW007-4.16.012.15) in Transparent bridging mode. The firmware was one version out of date when all this started. I updated it to the latest firmware I could find early on in our troubleshooting.

                  • P
                    preston @chpalmer
                    last edited by preston

                    @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                    What happens if you put your Centurylink as tier 3?

                    Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                    I haven't tried putting the CL modem back in modem mode (for fear of a double NAT), but that is a good idea. I will give that a try when I can and report back.

                    As far as Tier 3, I only have two WANs.

                    Like @jimeez, when I make Centurylink Tier 1 and Starlink Tier 2, Starlink seems to stay online (as far as I can tell). Although, I'm starting to get lost in everything I've tried so far...

                    • P
                      preston @chpalmer
                      last edited by preston

                      @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                      Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                      You may be on to something there!

                      I took the Centurylink modem out of Transparent Bridging mode and connected a LAN port on the Centurylink modem to my pfSense box (WAN2), and I made it past the 15-minute mark without losing Starlink (WAN1). Also, no errors in the DHCP logs. I am also using ISC DHCP instead of Kea DHCP.

                      Edit: It's been about 30 minutes and things are still working (Starlink WAN1 staying online). I'm out of time tonight to do any more troubleshooting, but will over the next few days. Will report back.

                      • J
                        jimeez @preston
                        last edited by

                        @preston

                        Everything I have ever read or watched about connecting a DSL modem to pfSense instructs that the modem be placed in transparent bridge mode. Curious to know what settings you applied in pfSense to get this to work.

                        • J
                          jimeez @chpalmer
                          last edited by

                          @chpalmer said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                          What happens if you put your Centurylink as tier 3?

                          As @preston already stated, both of us have only two WANs.

                          Have any of you put your Centurylink modem back to modem mode and tried to let it handle the PPP?

                          Sounds like this may have worked for @preston . Assuming it did/does, I have a couple questions:

                          • With the DSL modem no longer in transparent bridge mode I assume that it will assign that WAN interface a local IP address of 192.168.1.xxx. If this assumption is correct is this connection now sitting behind a double NAT?
                          • If that's the case, I guess we can no longer use pfSense to resolve our Dynamic DNS clients as that interface will no longer have an outside IP address.
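
One quick way to check the first assumption is to test whether the address pfSense receives on that WAN is private. A small sketch using Python's standard `ipaddress` module follows. The function name `is_behind_nat` is made up for illustration, and `192.168.1.2` merely stands in for the hypothetical "192.168.1.xxx" address.

```python
import ipaddress

def is_behind_nat(wan_address):
    """True if the WAN address is private, i.e. not directly reachable from the internet."""
    return ipaddress.ip_address(wan_address).is_private

# 192.168.1.2 is a placeholder for the "192.168.1.xxx" case above;
# 76.0.28.79 is the public address from the dhclient log line quoted earlier.
for addr in ("192.168.1.2", "76.0.28.79"):
    status = "private, sits behind the modem's NAT" if is_behind_nat(addr) else "public"
    print(f"{addr}: {status}")
```

A private address on the pfSense WAN means the modem translates the traffic a second time, after pfSense has already done so for the LAN.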

                          You'll have to pardon my ignorance here. I only know enough to be slightly dangerous.

                          • P
                            preston @jimeez
                            last edited by

                            @jimeez

                            Things are still working here with the CenturyLink modem out of transparent bridging mode.

                            Here is what I did (I connected my laptop via ethernet to the CenturyLink modem for the setup):

                            • Factory reset the CenturyLink modem (again), disabled the CL modem WiFi, reset the admin password and so on.

                            • My CenturyLink modem's default GUI address is 192.168.0.1 - I left that as is.

                            • I connected LAN 1 on CL modem to my WAN2 pfSense port.

                            • Under DHCP reservations in the CL modem, I assigned my pfSense box an IP of 192.168.0.2.

                            • I disabled the DHCP server on the CL modem.

                            • Rebooted the CL modem

                            • Unplugged my laptop from the CL modem.

                            • I reconnected to my pfSense network and set up the CL interface and gateway.

                            • My DNS servers and monitor IPs are 1.1.1.1 for Starlink and 8.8.8.8 for CenturyLink respectively.

                            • My pfSense LAN is in the 192.168.1.xxx range

                            • The pfSense dashboard shows the CL WAN IP as 192.168.0.2, but when I check sites like infosniper.net I can see the CL IP address.

                            • As an added bonus I can now access the CL modem GUI (192.168.0.1) via the pfSense network without having to fiddle with additional pfSense settings.

                            • I'm not sure about Dynamic DNS, but I have been using Tailscale with Starlink and it has worked great.

                            • @chpalmer may just be our hero!

                            As far as Double NAT while using the CL WAN, I really don't know (or understand it completely), but here is my Traceroute from the CL WAN to www.google.com:

                             1  192.168.0.1  0.544 ms  0.425 ms  0.402 ms
                             2  184.102.159.254  28.701 ms  28.475 ms  28.817 ms
                             3  71.33.4.9  28.078 ms  28.604 ms  28.296 ms
                             4  4.68.144.169  59.685 ms  46.815 ms  42.017 ms
                             5  4.68.127.114  44.359 ms  55.718 ms  63.480 ms
                             6  * * *
                             7  142.251.60.10  42.169 ms
                                216.239.51.116  43.287 ms
                                209.85.255.172  42.109 ms
                             8  209.85.247.117  42.379 ms
                                192.178.249.234  43.372 ms
                                209.85.247.117  42.327 ms
                             9  142.251.233.230  43.373 ms  44.324 ms
                                142.250.190.4  41.627 ms
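
On the double NAT question, the traceroute above can be read mechanically: count the private hops before the first public one. The sketch below does that for the first three hops. The regular expression and the function name `leading_private_hops` are illustrative assumptions, not part of any traceroute tooling.

```python
import ipaddress
import re

# First hops of the traceroute above (abridged).
TRACEROUTE = """\
 1  192.168.0.1  0.544 ms  0.425 ms  0.402 ms
 2  184.102.159.254  28.701 ms  28.475 ms  28.817 ms
 3  71.33.4.9  28.078 ms  28.604 ms  28.296 ms
"""

# Matches "<hop number>  <IPv4 address>" at the start of a line.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)", re.MULTILINE)

def leading_private_hops(output):
    """Return the private addresses seen before the first public hop."""
    private = []
    for _hop, addr in HOP_RE.findall(output):
        if not ipaddress.ip_address(addr).is_private:
            break
        private.append(addr)
    return private

print(leading_private_hops(TRACEROUTE))  # ['192.168.0.1']
```

Since this trace was run from the CL WAN on pfSense, the single private hop is the modem itself. LAN clients are therefore translated once by pfSense and once more by the modem, which is what is usually meant by double NAT.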
                            
                            • J
                              jimeez @preston
                              last edited by

                              @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                              I disabled the DHCP server on the CL modem.

                              Ahhhhh....there it is. This makes sense now. Perfect! I'll give this a try tonight and see if I get the same result.

                              • chpalmerC
                                chpalmer @preston
                                last edited by chpalmer

                                @preston @jimeez

                                My guess is that pfsense is re-authenticating with C-Link every 15 minutes and something occurs to cause the issue at that time.

                                Though I am unsure why this hasn't come up before with other users trying to utilize similar setups.. I use Astound and Verizon here and have no issues. Neither of my ISPs use any kind of PPP.

                                • P
                                  preston @chpalmer
                                  last edited by

                                  @chpalmer

                                  Generally speaking, CenturyLink (now called Brightspeed) has been the worst ISP I have ever had. Until Starlink, they were the only option in my area.

                                  That being said, it worked fine in Transparent Bridging for a long time. Not sure what changed, but it sure broke things. So far, so good. Things seem to be back to normal. Hope it works for you @jimeez.

                                  Thank you again!

                                  • J
                                    jimeez @preston
                                    last edited by

                                    @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                    @chpalmer

                                    Generally speaking, CenturyLink (now called Brightspeed) has been the worst ISP I have ever had. Until Starlink, they were the only option in my area.

                                    That being said, it worked fine in Transparent Bridging for a long time. Not sure what changed, but it sure broke things.

                                    This was my exact experience as well. Only option available until StarLink (I mean I choose to live in the middle of nowhere). Worked just fine in transparent bridge mode forever....and still does when it's the only active interface. But something changed on or about August 22, 2024.

                                    @preston said in Dual WAN Fail-over Issue - Tier 1 WAN frequently failing upon activation of the second Tier 2 WAN:

                                    Hope it works for you @jimeez.

                                    I'm not quite there yet. Although it does seem promising. I spent more hours on this last night than I will admit and am still struggling with the CL modem settings. I have an old Protectli 4-port device with which I decided to start fresh. Got StarLink up and running no problem on the main WAN interface. Adding the CL interface is another story for some reason. I must be doing something wrong.

                                    (I disconnected the StarLink interface while setting up the CL interface.)

                                    • I initially connected a laptop to the factory-reset CL modem via the WAN port (laptop to WAN port).
                                    • After initial config of the CL modem (turn off WiFi etc.) I connected the pfSense device OPT1 interface to Port 1 on the CL modem and reserved an IP for it. In this case I was not able to use 192.168.0.2 because the laptop already took it, so I gave it 192.168.0.5.
                                    • When the DHCP service is active both the laptop and pfSense see the modem and have internet.
                                    • As soon as I disable the DHCP server on the CL modem I can no longer resolve DNS addresses. The laptop and pfSense devices both now show that they no longer have internet.
                                    • I can ping actual IP addresses on both devices (like 8.8.8.8), but can't resolve addresses (say google.com).
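
The symptom in the last two bullets (IP addresses reachable, names not resolving) points at DNS rather than routing. A rough Python sketch for separating the two cases follows. The function names are hypothetical, and the IP probe uses a TCP connection to port 53 rather than an ICMP ping, so it is only an approximation of the ping test described above.

```python
import socket

def can_reach_ip(ip="8.8.8.8", port=53, timeout=3.0):
    """Try a TCP connection to an IP literal, so no DNS lookup is involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def can_resolve(name="google.com"):
    """Ask the system resolver for the name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False

def diagnose(ip_ok, dns_ok):
    """Turn the two probe results into a short verdict."""
    if ip_ok and dns_ok:
        return "routing and DNS both work"
    if ip_ok:
        return "routing works, DNS does not: check which DNS server the client is using"
    if dns_ok:
        return "names resolve (possibly from cache) but the test IP is unreachable"
    return "no connectivity at all"

if __name__ == "__main__":
    print(diagnose(can_reach_ip(), can_resolve()))
```

If the verdict is the second one, it is worth checking which DNS server each device is actually configured with once the modem's DHCP server is off.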

                                    Basically I'm stuck here. Grateful for any input on the likely obvious thing I'm doing wrong. ;-)

                                    • J
                                      jimeez
                                      last edited by jimeez

                                      @chpalmer @preston

                                      Short response right now is that this works for me too. Thank you so much!!

                                      Will post back later with more detail.

                                      • chpalmerC
                                        chpalmer @jimeez
                                        last edited by

                                        @stephenw10 any comment from the balcony seats? 😁

                                        This seems to be reproducible but the particulars need to be understood a bit more I think.

                                        • P
                                          preston @jimeez
                                          last edited by preston

                                          @jimeez

                                          A bit of bad news here. After about 24 hours, I lost the Centurylink connection. pfSense shows the Centurylink WAN as "pending" and will not reconnect. Restarted dpinger, rebooted pfSense, and it is still offline. I also have lost the ability to connect to the CenturyLink modem interface.

                                          Perhaps disabling the DHCP server on the CL modem caused the lease to time out even though I assigned an IP address to the pfSense connection. I had to factory reset to get back to the CL interface.

                                          I'm going to try it with the CenturyLink DHCP server enabled to see what happens. Back online now.

                                          I'm going to play with lease times and see what happens.

                                          EDIT: The lease expire time seemed to be the culprit. The default lease expire was 24 hours. I left the CenturyLink DHCP server enabled and changed the lease expire time to 5 minutes. It made it past the 5 minute mark.
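
To put numbers on the lease timing: RFC 2131 sets the default renewal time (T1) at 50% of the lease and the rebinding time (T2) at 87.5%. The sketch below applies those defaults to the two lease lengths mentioned here. Whether the modem and the pfSense DHCP client actually use the default fractions is an assumption, and `dhcp_timers` is a made-up helper name.

```python
from datetime import timedelta

def dhcp_timers(lease):
    """Return (T1 renew, T2 rebind, expiry) using the RFC 2131 default fractions."""
    return lease * 0.5, lease * 0.875, lease

# The two lease lengths discussed in this post.
for lease in (timedelta(hours=24), timedelta(minutes=5)):
    t1, t2, expiry = dhcp_timers(lease)
    print(f"lease {lease}: renew at {t1}, rebind at {t2}, expires at {expiry}")
```

Under those defaults a 24-hour lease is first renewed at 12 hours and is lost at 24 hours if no server answers. That would be consistent with the connection dropping after about a day with the modem's DHCP server disabled.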

                                          More testing to come.

                                          • chpalmerC
                                            chpalmer @preston
                                            last edited by

                                            @preston Yes.. you have to set your pfsense CL WAN to static and use something like 192.168.0.5 as its address and 192.168.0.1 as its gateway.
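
A static assignment like this can be sanity-checked before it goes into pfSense. The sketch below uses the addresses from this thread. The /24 prefix length is an assumption (the posts only give "192.168.0.x" and "192.168.1.xxx"), and `check_static_wan` is a hypothetical helper, not a pfSense function.

```python
import ipaddress

# Values from this thread; the /24 prefix length is an assumption.
modem_lan = ipaddress.ip_network("192.168.0.0/24")
pfsense_lan = ipaddress.ip_network("192.168.1.0/24")
wan2_address = ipaddress.ip_address("192.168.0.5")
wan2_gateway = ipaddress.ip_address("192.168.0.1")

def check_static_wan(address, gateway, wan_net, lan_net):
    """Return a list of problems with a static WAN assignment (empty if none)."""
    problems = []
    if address not in wan_net:
        problems.append(f"{address} is outside {wan_net}")
    if gateway not in wan_net:
        problems.append(f"gateway {gateway} is outside {wan_net}")
    if address == gateway:
        problems.append("address and gateway must differ")
    if wan_net.overlaps(lan_net):
        problems.append(f"{wan_net} overlaps the LAN {lan_net}")
    return problems

result = check_static_wan(wan2_address, wan2_gateway, modem_lan, pfsense_lan)
print(result or "looks consistent")  # looks consistent
```

The overlap check matters if the modem hands out 192.168.1.x addresses while the pfSense LAN is also 192.168.1.x. In that case one side has to be moved to a different subnet.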

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.