Netgate Discussion Forum

    Netgate 6100 LAN crashes

    Official Netgate® Hardware
    10 Posts 4 Posters 313 Views
    • N
      Nightwolf
      last edited by

      Dear all,

      I replaced a perfectly working HP blade with a Netgate 6100 for two reasons:
      Reduce power consumption.
      Use an SFP port, which was essential for the ISP; the blade was too old.
      The router is no longer under warranty.

      My infrastructure is nothing special, everything was fine with the blade, but the 6100 has the LAN that randomly goes unresponsive about 2-3 times per day.

      The router is 100% up to date (OS, patches and all packages).

      I tried disabling packages one by one without success.
      I also tried disabling all packages, without success.
      I also tried disabling WAN failover, without success.
      I tried uninstalling all packages except HAPROXY for production reasons, still without success.
      I tried resetting to factory defaults 3 times.

      I can't find in the logs what's causing the crash, especially since the Wi-Fi is responding correctly and is configured like the LAN, except for the transparent proxy. The client retrieves an IP address from the Netgate's DHCP and can access the internet; however, LAN resources are inaccessible.
      The GUI interface is accessible via the WAN (public IP restrictions allowed) but is very slow.
      The GUI interface is no longer accessible via the LAN when it crashes, and no traffic works between the LAN and the router.
      Only restarting the Netgate resolves the problem; sometimes you have to restart it twice in a row.
      It works for a few hours, then crashes again.

      I've been in this situation since March 2025, and I can't take it anymore.
      I thank in advance anyone who can help me.

      Schema.jpg
      Services.png
      Interfaces.png
      GW.png GW WAN.png
      Health.png

      • GertjanG
        Gertjan @Nightwolf
        last edited by Gertjan

        @Nightwolf

        First, export your pfSense config. This allows you to go 'back in time' and restore a known situation with one click.

        Next: remove 'useless' stuff that can eat loads of resources while offering no real gain.
        I'll rephrase: go "clean machine" and make it look like the day you received it. The '6100' out-of-the-box experience is a guaranteed 24/7 service, and 'interruptions' are not on the feature list.
        So, remove ntopng, squid and suricata, as these are known to be quite capable of destroying your eMMC drive because they can produce huge quantities of data (I presume you don't have the MAX version). You can keep pfBlockerNG, just 'stop' it.

        You have your WAN marked as active, but with no speed? Do you need it? Isn't the WAN_4G enough?
        Remember: as the admin, your mission is to keep the pfSense interfaces up, no matter what. Every time an interface goes down or up, loads of processes get restarted, and this takes time. Keep your pfSense happy and humming. If needed, give it a UPS, and put all the attached devices (switches, upstream routers) on a UPS as well. So no more flapping interfaces, no more process restarts.

        Restart.

        From now on, forget about the dashboard, as that's the page you look at when all is fine. So you actually never look at it, because when all is fine, it becomes boring really fast.
        The page that admins are always looking at is this one: Status > System Logs > System > General, as the logs tell you when things are OK, and also when things go wrong.
        Tell us what you see and what puzzles you, and we'll tell you what is OK, suspect, not OK, etc.

        Btw : I also use the captive portal for my hotel. No issues.

        edit: it shows an issue. Did you see the gap in the stats? Last Friday, 50 km away, somebody dug a hole a bit too deep and destroyed a fiber cable; half the department (50k+ people) was cut off for 40+ hours. No more Internet, no more cash dispensers, no more credit card payments ... things went downhill fast. Only "cash" worked.

        @Nightwolf said in Netgate 6100 LAN crashes:

        LAN resources are inaccessible.

        Devices on your LAN can use the other LAN devices even with pfSense powered down.

        @Nightwolf said in Netgate 6100 LAN crashes:

        I've been in this situation since March 2025,

        That alone could be the answer to the question : what changed back then ?
        I'm using the better 24.11 : the latest 25.03 beta, one came out just this morning : it's soooooo good. Rock solid.

        edit: I noticed you might be using a phone data carrier as your WAN. I can imagine why you chose that one.
        Phone companies sell "voice call time". That's how their quality is measured.
        The added "xx Mbits/sec" data connection is an accessory. Very useful to update that phone's GPS map, send a mail, or use it with Telegram/WhatsApp. These are all 'small consumers' and everybody is happy.
        But now people also want to stream 4K or, worse, they hang entire companies on a 3G/4G data connection. Some of my friends had to use such a connection as they live in the middle of nowhere. I brought along a pfSense router so I could monitor all this ....
        Yeah, I get it, they didn't have a choice. It was a little bit better than the classic satellite Internet (not Starlink, the older system). Latency jumped up and down. In the evening the connection became nearly useless, you know why.

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • stephenw10S
          stephenw10 Netgate Administrator
          last edited by

          Define exactly how unresponsive the LAN becomes.

          Can it still be pinged?

          Does it stop handing out DHCP leases?

          Can you ping out from pfSense to other devices on the LAN?

          Is the interface still shown as linked and active?

          Is anything logged when this happens?

          • N
            Nightwolf @Gertjan
            last edited by Nightwolf

            @Gertjan Thanks for the feedback.
            I have already disabled everything and even done 3 factory resets.
            Fiber (2500/1500) is the main connection and doesn't cause me any slowness issues; the 4G connection is only a backup.

            If I don't find a solution, I'm seriously considering this: https://www.reddit.com/r/Netgate/comments/1in8uwm/successful_emmc_replacement_in_netgate_6100/

            • N
              Nightwolf @stephenw10
              last edited by

              @stephenw10 Thank you for your reply.

              Can it still be pinged?
              From a client station in the LAN the router is inaccessible and no longer responds to ping.

              Does it stop handing out DHCP leases?
              DHCP leases are distributed by the domain controller not the router.

              Can you ping out from pfSense to other devices on the LAN?
              No.

              Is the interface still shown as linked and active?
              Yes.

              Is anything logged when this happens?
              The system logs are not telling me anything. The log is full of lines from syslogd that are not responding.
              I have disabled syslog but I can't find the source of the problem.

              When this happens, the Wi-Fi interface, whose DHCP leases are distributed by the router, works.
              Wi-Fi clients can access the internet, but access to the LAN is blocked.
              Locally, I have to use the console to reboot the router.
              Remotely, the WAN interface responds, and I can reboot via the GUI.

              • stephenw10S
                stephenw10 Netgate Administrator
                last edited by

                @Nightwolf said in Netgate 6100 LAN crashes:

                Can you ping out from pfSense to other devices on the LAN?
                No.

                Like it fails or you're unable to test? Does the serial console still respond when this happens?
                I assume it does if WiFi still works.

                @Nightwolf said in Netgate 6100 LAN crashes:

                The log is full of lines from syslogd that are not responding.

                You have an example of that?

                Do you see errors shown on the LAN interface? It seems like the LAN itself just stops passing traffic rather than the firewall hanging.
                I would try reassigning LAN to a different NIC if you can.

                • dennypageD
                  dennypage @Nightwolf
                  last edited by

                  @Nightwolf said in Netgate 6100 LAN crashes:

                  Is anything logged when this happens?
                  The system logs are not telling me anything. The log is full of lines from syslogd that are not responding.
                  I have disabled syslog but I can't find the source of the problem.

                  Do you have remote syslog enabled? If so, is the syslog server on the same interface that stops working?
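
The reason for asking: if the remote syslog server lives on the interface that stops passing traffic, syslogd itself will start complaining once that interface dies. A minimal sketch of that check, assuming you know the syslog server's IP and the LAN interface address in CIDR form (both values below are hypothetical placeholders, not taken from this thread):

```python
import ipaddress

def syslog_target_on_interface(server_ip: str, interface_cidr: str) -> bool:
    """Return True if the syslog server sits inside the interface's subnet."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(server_ip) in network

# Hypothetical example values; substitute your own LAN address and syslog server.
LAN_INTERFACE = "192.168.1.1/24"
print(syslog_target_on_interface("192.168.1.50", LAN_INTERFACE))  # server on the LAN
print(syslog_target_on_interface("172.16.0.5", LAN_INTERFACE))    # server elsewhere
```

If the first case applies, syslogd errors in the log would be a symptom of the LAN outage rather than its cause.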

                  • N
                    Nightwolf @dennypage
                    last edited by

                    I had a crash yesterday that I was able to catch in time to retrieve the logs.
                    The local IP address and local domain are truncated for security reasons.
                    Everything else is unchanged; I intentionally left all packages running except ntopng.
                    Disabling/uninstalling the packages doesn't change anything anyway.
                    Hope this helps.

                    Netgate_Logs.zip

                    • GertjanG
                      Gertjan @Nightwolf
                      last edited by

                      @Nightwolf said in Netgate 6100 LAN crashes:

                      I had a crash yesterday that I was able to catch in time to retrieve the logs.
                      The local IP address and local domain are truncated for security reasons.
                      Everything else is unchanged; I intentionally left all packages running except ntopng.
                      Disabling/uninstalling the packages doesn't change anything anyway.
                      Hope this helps.

                      Netgate_Logs.zip

                      Wow ... that's, sorry for the word, a bit messy.

                      I'll start with this: I don't use HAProxy and have no use for it, as I lost all interest in hosting things myself decades ago. I got myself a small VPS server in some data centre, and that one took care of my web sites, mail stuff, powering, disk maintenance, cleaning the fans, paying the power bills, and all that stuff. @home and @work my pfSense is there to 'regulate' my ISP connection.

                      Because 99+% of traffic is TLS these days, plain-text data transfers for mail, web sites, etc. don't exist anymore, so stuff like Suricata (ClamAV) or, more generally, IDS/IPS has become pure rocket science.
                      The latest example was shown last evening: see Not Nominal from Scott Manley. Rocket science is hard, hurts, and things will blow up "all the time".
                      Those who master "IDS/IPS/Proxy" are not the ones** posting here on this forum. They are the TLS gods ....
                      The good old days of 'traffic scanning' are gone now.

                      ** well, not true, there is one person here in the forum : bmeeks.


                      First things first: it's OK that your WAN disconnects ....
                      But your log starts at the bottom with a LAN disconnect! That's not good!!
                      Your mission, if you want to have an easy admin life: stop that from happening.
                      The 6100 LAN plugs, or actually any plug: don't disconnect them. The devices connected to these (switches, ISP boxes) are small power consumers, so share the pfSense UPS with them.

                      That said, it's 'OK' for interfaces to go down. In theory, this shouldn't break anything.
                      But there is a but ....
                      When an interface goes down, processes like nginx, the DHCP server (or client), the resolver, the gateway monitor 'dpinger', etc. will restart. And here comes the issue: all these processes restart at nearly the same moment, which opens the door to the most dreaded situation: race conditions. Your mission, as an admin, is to never ever create situations where race conditions can bite you.
                      The final goal is: keep the logs dull, with no error or warning messages. You'll be granted a very stable router, an admin's dream.

                      Next issue: you use "servicewatchdog". That's another tool for admins to inflict pain on themselves.
                      Don't use it. Like never.
                      Locate the "error: bind: address already in use" phrase in your zipped log. That's "servicewatchdog" doing things it should not do. "servicewatchdog" wasn't needed in this case, and it made things worse.
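
If the unzipped log is large, a few lines of Python can count how often that phrase (and the HAProxy and LDAP messages I quote further down) shows up. This is only a sketch: the sample lines are made up for illustration, their timestamp and hostname format is an assumption, and only the three search phrases come from the posted log.

```python
# Phrases quoted from the posted log; everything else here is illustrative.
PHRASES = (
    "error: bind: address already in use",
    "haproxy: startup error output!",
    "ldap_get_groups() could not bind",
)

def count_known_errors(lines):
    """Count how many lines contain each known error phrase."""
    counts = {phrase: 0 for phrase in PHRASES}
    for line in lines:
        for phrase in PHRASES:
            if phrase in line:
                counts[phrase] += 1
    return counts

# Hypothetical sample lines (invented format), standing in for the real log file.
sample = [
    "Jan  1 10:00:01 fw unbound[100]: error: bind: address already in use",
    "Jan  1 10:00:02 fw php-fpm[200]: haproxy: startup error output!",
    "Jan  1 10:00:03 fw unbound[101]: error: bind: address already in use",
    "Jan  1 10:00:04 fw kernel: something unrelated",
]

for phrase, hits in count_known_errors(sample).items():
    print(f"{hits:3d}  {phrase}")
```

To run it against a real file, pass the open file object instead of `sample`, since iterating over a text file yields its lines.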

                      I use a 6100 (4100) myself, with a UPS, and my 'unbound' (for example) never gets restarted because interfaces went down.
                      This means you and I have the same hardware and the same software. If you use the default Netgate settings for the core settings, you have nearly the same settings as me. The result, as it is meant to be: it works "forever". And don't take my word for it, have a look yourself.

                      Also repair this :
                      "haproxy: startup error output!" (several)
                      and
                      "/status_logs.php: ERROR! ldap_get_groups() could not bind"

                      Don't be ashamed if you cannot make the errors go away.
                      You are allowed, and I even advise you, to apply the KISS rule: don't use stuff that produces errors in the logs.
                      And if you do, accept them along with the consequences ^^

                      I'm not saying you can't / shouldn't use "pfSense package X" and "pfSense package Y" **. They are all easy to install, but setting them up can go way beyond what is needed to operate a vanilla pfSense.

                      ** exception : "servicewatchdog" which should be banned from the package list.
                      Be assured : my goal is that you have a 6100 that 'never fails' on you, so you can go back to do other stuff like spending time in the garden ^^


                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        So you have replaced the LAN IP address with 10.10.10.10254? Are you using a public subnet on LAN?
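
For reference, that string does not parse as an IP address at all, which suggests it is purely a redaction artifact. A quick way to check any candidate address, sketched with Python's standard `ipaddress` module (the two comparison addresses are just illustrative examples):

```python
import ipaddress

def describe_address(text: str) -> str:
    """Classify a string as invalid, private, or not private."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return "not a valid IP address"
    return "private" if addr.is_private else "not private"

# The first value is the string from the posted logs; the others are examples.
for candidate in ("10.10.10.10254", "10.10.10.254", "8.8.8.8"):
    print(candidate, "->", describe_address(candidate))
```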

                        Was there a significant gap in the logs before that? The first thing logged there looks like a reaction to the LAN coming back up.

                        Mostly what is concerning there is that igc0 is flapping repeatedly. What is it actually connected to? Did you try reassigning LAN to one of the other igc NICs?
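
To put a number on the flapping, something like the sketch below can tally link events per NIC. It assumes the usual FreeBSD kernel message form `<nic>: link state changed to UP|DOWN`; the sample lines are invented for illustration and are not copied from the posted logs.

```python
import re
from collections import Counter

# Assumed message form: "igc0: link state changed to DOWN" (optionally "igc0.10" for a VLAN).
LINK_RE = re.compile(
    r"\b(?P<nic>[a-z]+\d+(?:\.\d+)?): link state changed to (?P<state>UP|DOWN)\b"
)

def count_link_changes(lines):
    """Return a Counter keyed by (nic, state) for every link state change found."""
    changes = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            changes[(match["nic"], match["state"])] += 1
    return changes

# Hypothetical sample lines standing in for the real system log.
sample = [
    "kernel: igc0: link state changed to DOWN",
    "kernel: igc0: link state changed to UP",
    "kernel: igc0: link state changed to DOWN",
    "kernel: ix0: link state changed to UP",
    "php-fpm[123]: /rc.linkup: unrelated message",
]

for (nic, state), hits in sorted(count_link_changes(sample).items()):
    print(f"{nic:8s} {state:4s} {hits}")
```

A healthy LAN port should show few or no DOWN events between reboots, so a high count for one NIC points at that port, its cable, or the device on the other end.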

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.