Netgate Discussion Forum

    [SOLVED] XG-7100 1U WAN Gateway goes offline after 10-20 min, 100% Packet loss

    Official Netgate® Hardware
    • runDMG

      Hi all, thank you in advance for any help. Hopefully I have this post in the correct place on the forums.

      I am replacing a Unifi USG-Pro with a Netgate XG-7100 1U. My WAN gateway in pfsense keeps dropping and reporting 100% packet loss. Below is a rundown of what is going on and what I've tried so far. I'm at a loss as to what to try next.

      The current setup is:
      CenturyLink CPE Adtran router -> Switch -> pfsense -> LAN

      I know that the Adtran isn't going down because I have a Sonicwall and the USG-Pro plugged into the same switch as the pfsense box, and neither of them ever loses internet. Other devices can still ping the Adtran while this happens.

      Network setup (IPs changed for obvious reasons):

      • Network: 71.23.12.80/29
      • Adtran: 71.23.12.81
      • USG-Pro: 71.23.12.82
      • Sonicwall: 71.23.12.83
      • pfsense: 71.23.12.84
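
As a quick sanity check on the addressing above, here is a minimal sketch using Python's standard `ipaddress` module. It only uses the (anonymized) addresses from the post and shows that all four devices sit in the same /29, which leaves six usable host addresses in total.

```python
import ipaddress

# Addresses exactly as listed in the post (already anonymized by the author).
network = ipaddress.ip_network("71.23.12.80/29")
devices = {
    "Adtran": "71.23.12.81",
    "USG-Pro": "71.23.12.82",
    "Sonicwall": "71.23.12.83",
    "pfsense": "71.23.12.84",
}

# hosts() excludes the network and broadcast addresses.
usable = [str(host) for host in network.hosts()]
print("netmask:", network.netmask)
print("usable hosts:", usable[0], "to", usable[-1], f"({len(usable)} total)")

for name, addr in devices.items():
    status = "in subnet" if addr in usable else "OUTSIDE subnet"
    print(f"{name:10s} {addr:13s} {status}")
```

Because every device shares one layer-2 segment and one /29, any device that answers ARP for addresses it does not own can disrupt the others.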

      If I reboot pfsense, the WAN stays online for 10-20 minutes before dropping. If I let it run for a while, it occasionally comes back online for a few minutes before dropping again.

      Things I've tried so far:

      • Looked into others' issues on the forums, including this one
      • Put the pfsense WAN behind the USG on its own network to confirm the pfsense setup and port were OK.
        It never showed any issues and was online 100% of the time
      • Disabled Block private networks and loopback addresses and Block bogon networks
      • Disabled Gateway Monitoring (internet still dropped)
      • Changed Monitor IP to a Ubuntu Digital Ocean Droplet
      • Disabled Gateway Monitoring Action
      • Changed Data Payload to 1
      • Changed WAN from Port1 to Port3, same symptoms as the original issue
      • Manually set the Switch port to 1000baseT full-duplex
      • Replaced patch cable from pfsense to switch

      Other bits of info:
      When pfsense reports the gateway is offline, I get 100% packet loss on anything I try to ping (not shockingly). So far I've tried to ping:

      • Adtran: 71.23.12.81
      • google.com

      I am able to ping the USG-Pro (71.23.12.82) from the pfsense box's WAN port (I temporarily enabled ICMP on WAN Local of the USG to troubleshoot).

      Gateway Logs
      All of the gateway log messages look similar to this when the issue arises:

      Apr 9 13:48:10	dpinger		WANGW 71.23.12.81: Alarm latency 0us stddev 0us loss 100%
      Apr 9 13:48:08	dpinger		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 71.23.12.81 bind_addr 71.23.12.84 identifier "WANGW "
      Apr 9 13:29:09	dpinger		WANGW 71.23.12.81: Alarm latency 0us stddev 0us loss 100%
      Apr 9 13:29:07	dpinger		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 71.23.12.81 bind_addr 71.23.12.84 identifier "WANGW "
      Apr 9 13:24:56	dpinger		WANGW 71.23.12.81: Alarm latency 555us stddev 134us loss 22%
      Apr 9 13:05:06	dpinger		WANGW 71.23.12.81: Clear latency 42911us stddev 222477us loss 14%
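
To make log lines like these easier to scan, here is a minimal sketch that pulls the state, latency, standard deviation, and loss out of each dpinger alarm line. The regular expression is inferred only from the lines quoted above, so treat it as an assumption about the format rather than a specification; the function name `parse_dpinger` is made up for this example.

```python
import re

# Pattern inferred from the log lines quoted in the post (an assumption,
# not an official description of dpinger's output format).
ALARM_RE = re.compile(
    r"(?P<gateway>\S+) (?P<addr>\d+\.\d+\.\d+\.\d+): "
    r"(?P<state>Alarm|Clear) latency (?P<latency_us>\d+)us "
    r"stddev (?P<stddev_us>\d+)us loss (?P<loss_pct>\d+)%"
)

def parse_dpinger(line):
    """Return a dict for an Alarm/Clear line, or None for any other line."""
    match = ALARM_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("latency_us", "stddev_us", "loss_pct"):
        fields[key] = int(fields[key])
    return fields

lines = [
    "Apr 9 13:48:10\tdpinger\t\tWANGW 71.23.12.81: Alarm latency 0us stddev 0us loss 100%",
    "Apr 9 13:24:56\tdpinger\t\tWANGW 71.23.12.81: Alarm latency 555us stddev 134us loss 22%",
    "Apr 9 13:05:06\tdpinger\t\tWANGW 71.23.12.81: Clear latency 42911us stddev 222477us loss 14%",
]
events = [parse_dpinger(line) for line in lines]
for event in events:
    print(event["state"], event["addr"], f'loss={event["loss_pct"]}%')
```

Note that the configuration lines in the log show `loss_alarm 20%`, which is why the 22% loss reading at 13:24:56 is already reported as an alarm.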
      
      • runDMG

        After more digging: it comes back online for ~10 min and then drops again. If I delete the ARP record for the gateway in the ARP Table, pings start succeeding again and internet is restored until the problem comes back a few minutes later. So it seems that the internet comes back for a few minutes whenever that record expires.

        I have a call out to CenturyLink techs for Monday.

        • runDMG

          I spent most of the day on the phone with CenturyLink techs and a tech who supports the SonicWall that is on-site. The CenturyLink tech noticed that, in the Adtran's ARP table, the Sonicwall's MAC address would start showing up for all of the IPs in our static block (except .82 for some reason), and that is when the internet drops on the XG-7100. I followed up with the Sonicwall team, and they read me the WAN config; it all seems fine.
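
For anyone who wants to spot this condition in an ARP table dump, here is a minimal sketch that groups entries by MAC address and reports any MAC that claims more than one IP. The input format is assumed to look like `arp -an` output (the Adtran's actual output format is not shown in this thread), the MAC addresses are made-up values from the documentation range, and `macs_claiming_multiple_ips` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumes "(ip) at mac" entries as printed by `arp -an`; adjust the pattern
# for whatever format your router actually prints.
ARP_RE = re.compile(
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\) at (?P<mac>[0-9a-fA-F:]{17})"
)

def macs_claiming_multiple_ips(arp_output):
    """Map each MAC that appears for more than one IP to its sorted IP list."""
    ips_by_mac = defaultdict(set)
    for match in ARP_RE.finditer(arp_output):
        ips_by_mac[match["mac"].lower()].add(match["ip"])
    return {
        mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1
    }

# Hypothetical table: one MAC answers for every address except .82.
SAMPLE = """\
? (71.23.12.82) at 00:00:5e:00:53:02 on eth0
? (71.23.12.83) at 00:00:5e:00:53:03 on eth0
? (71.23.12.84) at 00:00:5e:00:53:03 on eth0
? (71.23.12.85) at 00:00:5e:00:53:03 on eth0
? (71.23.12.86) at 00:00:5e:00:53:03 on eth0
"""

suspects = macs_claiming_multiple_ips(SAMPLE)
for mac, ips in suspects.items():
    print(f"{mac} is answering for {len(ips)} addresses: {', '.join(ips)}")
```

A single MAC appearing for several addresses in the static block is consistent with one device answering ARP on behalf of the others, which is the behavior the CenturyLink tech described.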

          Has anyone else heard of anything like this? I am going to follow up with them again tomorrow to verify they don't have any IP aliases set up in there for some reason. If they can't help me, I will probably just have to 1:1 NAT their box behind the pfsense box, but I wanted to see if anyone out there has run into this before.

          I have had the SonicWall unplugged for some time now and the issue hasn't happened, so I feel confident it is a configuration issue somewhere in their box.

          • Cool_Corona

            What subnet is the Sonicwall using on WAN?

            • runDMG

              /29. I too suspected that was the issue, but unfortunately they confirmed to me that it was correct.

              • runDMG

                I was able to get the login for the SonicWall. Nothing in the configuration or logs is jumping out at me. The only thing I noticed was that the MTU on the WAN port was set to 1404 instead of the usual 1500. They are having another tech take a look to see if they spot anything.

                • Cool_Corona

                  Some of the equipment is trying to be the main FW for the /29 subnet.

                  Try using a /32 subnet for the Sonicwall and everything connected to WAN.

                  • runDMG @Cool_Corona

                    @Cool_Corona This definitely seems to be the case. We did some digging, and after searching for ARP Proxy and SonicWall, it looks like a lot of other people have this issue (https://www.reddit.com/r/networking/comments/4ijdl7/why_sonicwall_took_over_the_arp_for_the_whole_wan/)

                    Unfortunately, it doesn't look like SonicWalls support /32 WAN subnets, so I followed some suggestions from that Reddit post and will report back. If this fails and SonicWall support can't help me, then we plan on setting up a static ARP table in the Adtran, which isn't ideal but is a workaround for the time being.

                    • runDMG

                      pfsense hasn't dropped the internet once since they made a few changes in the Sonicwall. I asked the Sonicwall tech what had to be changed. If I hear back, I will post the solution here in case anyone else runs into something similar. Thank you @Cool_Corona for your input.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.