Netgate Discussion Forum

    pfSense on netgate 6100 stops passing traffic multiple times per day

    General pfSense Questions
    16 Posts 5 Posters 1.8k Views
    • dragonfly

      Hello, I switched from Meraki to Netgate about 1 year ago. I have not had a single issue during that time and have been very happy with the increased throughput.
      Suddenly, about a week ago, my network started going down multiple times per day. I can still get into the Web GUI, and there are no alerts or alarms present. The WAN interface reports as being up and I can still ping outside websites from the firewall itself using the diagnostic tools. However, no traffic is being passed from the internal LAN out to the Internet, so essentially we go down. I have tried restarting each of the services one by one on the Netgate, disabling/re-enabling the WAN interface, and resetting the states table. Nothing brings it back; my only option is to reboot the firewall, which takes about 5 minutes.
      This has already happened twice today and twice yesterday, three times the day before. All at different times of the day, no event seems to trigger it. Everything seems to be functioning normally and there are absolutely no signs of trouble anywhere. Nothing has changed internally on my network. Can anyone advise what I might check the next time this happens or how I can go about troubleshooting?

      • keyser Rebel Alliance @dragonfly

        @dragonfly 5 minutes for a restart?

        Could it perhaps be a dying/dead eMMC storage (boot) device?
        If it stops responding or becomes extremely slow to write log IO, I dunno how pfSense reacts, but perhaps something like your experience?
        Anyhow - 5 minutes to boot where it should take perhaps 1 minute seems to suggest an issue somewhere.

        Love the no fuss of using the official appliances :-)

        • stephenw10 Netgate Administrator

          If you can reach the webgui from LAN and the firewall itself can reach external sites (by IP and FQDN?) that implies either no NAT or no routing somehow.
          Check Diag > Routes.

          Or it could be something else on the LAN becoming the default gateway, a rogue DHCP device.

          Steve

          • dragonfly @keyser

            @keyser It's always done that. It's only about a year old, but whenever I've done a firmware update, it's always taken about 5 minutes to reboot. Is there any way that I can check if the disk is bad?

            • dragonfly @stephenw10

              @stephenw10 Hi Steve, the only other thing on the network is a WiFi router which is set to AP mode; it gets its DHCP instructions from the Netgate. I have checked the IP information on all the devices whenever this happens and they all still have their correct IPs and gateway set to the Netgate. I will check Diag > Routes next time (should be soon).

              • stephenw10 Netgate Administrator

                How are you actually testing? How does it fail?

                The description sure sounds like some rogue device could be the cause. If it was an IP conflict pfSense would be complaining loudly in the logs.

                Make sure when you test that traffic from LAN side clients is actually arriving at pfSense. Check the state tables or add logging to the LAN pass rules so it appears in the firewall logs.

                Steve

                • dragonfly @stephenw10

                  @stephenw10 Hi Steve, first thanks for responding, I appreciate it.

                  You're asking how I test. When the Internet suddenly goes down out of nowhere, the first test is: can I still get to devices locally? I can; I can still get to any local device. The next step is: can I get into the firewall? I can; the Web GUI works just fine. There are no alerts, alarms or logs about anything. The WAN interface on the firewall reports that it is still up.

                  I go to the diagnostics page in the Web GUI and ping google.com; it works. I try to ping google.com from my computer; it fails. What's between me and google.com? The firewall. It works from the firewall, but not from my computer. So I try to ping 8.8.8.8 to see if it's a DNS issue, but that fails as well (but works from the firewall diagnostics tool). I check ipconfig to make sure my IP address and gateway are correct; they are.

                  I reboot the firewall and the Internet comes back (for anywhere between 2 and 12 hours, then it goes down again). I also did all the steps I outlined in the original post (basically trying to see if it was a particular service, or if disabling/re-enabling the WAN interface or clearing the states table would fix the issue). The only thing that brings the Internet connection back is rebooting the firewall. Reboots take about 5 minutes. Every time I reboot it, the Internet connection comes back. I'm discounting the rogue DHCP server because nothing on this network has changed in at least 5 years, probably more (no new devices added or taken away).
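
The manual checks in this post can be written down as a small decision table. The Python below is only a sketch of that reasoning, not a pfSense tool: the `Checks` fields and the fault labels are hypothetical names chosen for illustration, and the results are entered by hand rather than measured by the script.

```python
from dataclasses import dataclass


@dataclass
class Checks:
    """Results of the manual checks from the post (field names are hypothetical)."""
    local_device_reachable: bool  # client can reach another LAN device
    firewall_gui_reachable: bool  # client can open the pfSense Web GUI
    firewall_ping_name: bool      # ping google.com from the firewall diagnostics page
    client_ping_name: bool        # ping google.com from the client
    client_ping_ip: bool          # ping 8.8.8.8 from the client


def classify(c: Checks) -> str:
    """Map the observed results to a rough fault area (labels are illustrative)."""
    if not (c.local_device_reachable and c.firewall_gui_reachable):
        return "LAN-side problem"
    if not c.firewall_ping_name:
        return "WAN or upstream problem"
    if c.client_ping_ip and c.client_ping_name:
        return "no fault observed"
    if c.client_ping_ip:
        # IP works but names do not: points at DNS rather than forwarding
        return "DNS problem for LAN clients"
    return "firewall is up but not forwarding LAN traffic"


# The situation reported in the thread: everything works except client pings.
reported = Checks(True, True, True, False, False)
print(classify(reported))
```

With the results reported here, the sketch lands in the last branch, which matches the conclusion that the firewall itself is healthy but LAN traffic is not getting out.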

                  I will add logging to the LAN pass rules as you suggest.

                  • SteveITS Galactic Empire @dragonfly

                    @dragonfly There are some disk troubleshooting docs here https://docs.netgate.com/pfsense/en/latest/troubleshooting/index.html#hardware

                    5 minutes for a normal restart seems very long. 5 minutes for a version update seems fine. I haven’t managed a 6100 yet but a 2100/3100 takes 10-15 minutes or more normally. eMMC vs SSD makes a difference too.

                    Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
                    When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
                    Upvote 👍 helpful posts!

                    • dragonfly @SteveITS

                      @steveits This one is an eMMC. Maybe that's normal for this?

                      • SteveITS Galactic Empire @dragonfly

                        @dragonfly I’d expect maybe a minute? I’d watch the console during boot…. And/or check the system and boot logs after. I’m not sure what could make a normal restart take that long.

                        • dragonfly @SteveITS

                          @steveits Ok, I just reviewed the boot logs during the two reboots which took place today. Nothing really stands out, I mean there's a lot of lines of things happening, but nothing looks odd or out of place. It is taking almost exactly 5 minutes from issuing the reboot command to the device being fully reloaded and functional.

                          • stephenw10 Netgate Administrator

                            Ok, that's good troubleshooting. Always difficult to judge quite what level people are operating from on the forum. 😉

                            So I would still want to be sure traffic from the clients is actually arriving at pfSense when it's in the failed situation. I would just run a pcap on LAN whilst pinging. But using the state table or logging passed traffic will also show that.
                            If it is arriving and it is being passed then I'd have to guess it's some NAT failure or maybe a package. Snort or Suricata could present like this in blocking mode.
                            Since pfSense itself can ping google.com it must still have a valid default route. But it may lose the default gateway (whilst keeping the route) somehow.

                            Go to System > Routing > Gateways. Set the WAN gateway as the default rather than 'automatic'. That may prevent the issue.
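
The reasoning in this post can be summarised as a short decision function. This is a minimal Python sketch under stated assumptions: the function name and both inputs are hypothetical, and the values come from what you observe in a LAN packet capture, the state table, or the firewall log. The middle branch (traffic arrives but is not passed) is not covered in the post and is filled in here as an assumption.

```python
def likely_cause(arrives_on_lan: bool, passed_by_rules: bool) -> str:
    """Suggest where to look next, given what was observed during a failure."""
    if not arrives_on_lan:
        # Nothing shows up in a LAN capture: clients are sending elsewhere.
        return "not arriving: look for another gateway or rogue DHCP device on the LAN"
    if not passed_by_rules:
        # Assumption, not from the post: arriving traffic is being dropped.
        return "arriving but not passed: check the firewall log and LAN rules"
    # Arriving and passed, yet clients still have no Internet access.
    return ("arriving and passed: suspect NAT, a package such as Snort or "
            "Suricata in blocking mode, or a lost default gateway")


# Example: the capture shows client pings arriving and the rules pass them.
print(likely_cause(arrives_on_lan=True, passed_by_rules=True))
```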

                            Steve

                            • dragonfly @stephenw10

                              @stephenw10 Hi Steve, I have added logging on the LAN pass rules as you suggested, and I also just made the change you suggested on the System > Routing > Gateways screen. Let's see how things go today. I appreciate your help. If it goes down today, I'll see if I can post the logging from the LAN pass rules.

                              • NOCling

                                Go to System > Package Manager.
                                Install "Netgate_Firmware_Upgrade" and "System_Patches" if not installed.

                                Then go to "Netgate Firmware Upgrade" under System and install the newest version.

                                With the old firmware, the boot takes minutes to check and select the boot device. After the update and the necessary power cycle, a reboot takes 90-120 seconds.

                                Netgate 6100 & Netgate 2100

                                • dragonfly @NOCling

                                  @nocling Thank you, this was helpful. I thought I was on the latest firmware, but I wasn't; both of the screens had updates that I wasn't aware of (I am new to pfSense).

                                  Also, as an overall update: since I changed the gateway setting above (from @stephenw10), it hasn't gone down. One other thing: there was an external IP address that was mercilessly hitting the firewall, and I'm wondering if that was taking my connection down. I created a rule to completely block the IP at the same time I made that change on the gateway screen, and since then I haven't gone down. If I get through another couple of days then I'm willing to say this is solved.

                                  • stephenw10 Netgate Administrator @dragonfly

                                    @dragonfly said in pfSense on netgate 6100 stops passing traffic multiple times per day:

                                    there was an external IP address that was mercilessly hitting the firewall

                                    If it was hitting the firewall I assume it was being blocked? If so, adding a different rule to block it wouldn't change anything. Unless the new rule is non-logging and the hit rate was so high that the number of block logs was creating a significant load.

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.