Netgate Discussion Forum

6100 - Stopped passing traffic / Web GUI not accessible

Official Netgate® Hardware
13 Posts 3 Posters 1.1k Views
  • B
    BNetworker
    last edited by BNetworker Apr 18, 2023, 4:41 AM Apr 18, 2023, 4:40 AM

    Hey all, my Netgate 6100 (23.01-RELEASE) just froze/hung/locked a bit ago. I could still ping the gateway, but no traffic allowed in or out. The web GUI was not responding. I finally rebooted it by pulling power. It came back up and is working again.

    I can't find anything strange in the logs, just the bootup. No files were created in /var/crash. No spikes in CPU or Memory in the graphs, etc.

    Any ideas how to determine the cause?

    • G
      Gertjan @BNetworker
      last edited by Gertjan Apr 18, 2023, 6:40 AM Apr 18, 2023, 6:39 AM

      @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

      Any ideas how to determine the cause?

      Well, you saw the patient, and he wanted to know what's up with him.
      You gave him the lethal injection and said : "next".

      So, maybe some post mortem investigation might shed some light.

      For the next time :
      You have the console : use it, and use option 8 (Shell).
      Remember this path : /var/log/
      as that is the place where you can find the log files.
      These record what happened and when, and are plain text files with timestamps.

      @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

      by pulling power

      pfSense uses a disk with a very resilient file system, but still, don't do that.
      It's not a light bulb.
      You can use the console, and then option 5 (Reboot system) or option 6 (Halt system).

      No "help me" PM's please. Use the forum, the community will thank you.
      Edit : and where are the logs ??

      • S
        stephenw10 Netgate Administrator
        last edited by Apr 18, 2023, 1:13 PM

        Do you see any gaps in the logs or monitoring graphs? A disk issue can present like that.

        • B
          BNetworker @stephenw10
          last edited by BNetworker Apr 18, 2023, 2:54 PM Apr 18, 2023, 2:49 PM

          @stephenw10 said in 6100 - Stopped passing traffic / Web GUI not accessible:

          Do you see any gaps in the logs or monitoring graphs? A disk issue can present like that.

          Hey @stephenw10 - Yes, the logs do show a long period (it looks like the issues started about 12:33, then it finally fully locked up about 20:50, and I rebooted shortly after) where they show 0.00 for all counters:

          b31616d0-a7ec-4327-a271-12f36122ad27-image.png

          This is the MAX with the M.2 (P80) 3TE6 (SMART overall-health self-assessment test result: PASSED)

          The logs show a similar gap:

          Apr 17 09:08:00 sshguard 99451 Exiting on signal.
          Apr 17 09:08:00 sshguard 9417 Now monitoring attacks.
          Apr 17 12:30:00 sshguard 9417 Exiting on signal.
          Apr 17 12:30:00 sshguard 51269 Now monitoring attacks.
          Apr 17 20:55:55 syslogd kernel boot file is /boot/kernel/kernel
          Apr 17 20:55:55 kernel ---<<BOOT>>---
          Apr 17 20:55:55 kernel Copyright (c) 1992-2022 The FreeBSD Project
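
A gap like the one above can also be found programmatically instead of by eye. The following is a minimal sketch, not a pfSense tool: the function name `find_gaps` and the 4-hour threshold are assumptions chosen for illustration, and the year has to be supplied because this syslog format does not record it. The sample lines are the ones quoted above.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year, threshold):
    """Return (previous, current) timestamp pairs that are further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            # Lines start with e.g. "Apr 17 12:30:00"; the year is not in the log.
            current = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # not a timestamped line, skip it
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


sample = [
    "Apr 17 09:08:00 sshguard 99451 Exiting on signal.",
    "Apr 17 09:08:00 sshguard 9417 Now monitoring attacks.",
    "Apr 17 12:30:00 sshguard 9417 Exiting on signal.",
    "Apr 17 12:30:00 sshguard 51269 Now monitoring attacks.",
    "Apr 17 20:55:55 syslogd kernel boot file is /boot/kernel/kernel",
    "Apr 17 20:55:55 kernel ---<<BOOT>>---",
    "Apr 17 20:55:55 kernel Copyright (c) 1992-2022 The FreeBSD Project",
]

# Threshold of 4 hours is an arbitrary choice for this sample.
gaps = find_gaps(sample, year=2023, threshold=timedelta(hours=4))
for start, end in gaps:
    print(f"No log entries between {start} and {end}")
```

With a 4-hour threshold, only the silence between 12:30:00 and the reboot at 20:55:55 is reported. A quiet firewall may legitimately log nothing for hours, so the threshold needs tuning per system.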

          • G
            Gertjan @BNetworker
            last edited by Gertjan Apr 18, 2023, 3:10 PM Apr 18, 2023, 3:01 PM

            @bnetworker

            "Disk full" issues with a 6100 MAX ... only suricata (and snort ? ntopng ?) users could manage to do that, as some of them didn't imagine that these packages can create huge log files.
            ( and only for these users the auto log rotate mechanism is 'broken' )

            @bnetworker : disk space : all is ok ?
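
Disk usage can be read from the dashboard, or checked from a shell. As a minimal sketch (the helper name `percent_used` is made up for this example, and it assumes a Python 3 interpreter is available on the box), the standard library can report how full a filesystem is:

```python
import shutil


def percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


print(f"/ is {percent_used('/'):.1f}% full")
```

The figure can differ slightly from what `df` or the GUI shows, since those tools may account for reserved space differently.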

            What packages are you using ?

            If I compare your graph with mine, it's identical - it hovers around 310 processes.
            pfSense Plus 23.01, on a 4100 MAX, with pfBlockerng doing 'something' - nothing more.

            4bc61b31-b5df-4d69-bf87-be1b433c730c-image.png

            Look also at memory used.

            a8114b0c-87f3-47c8-8e52-572f13569984-image.png

            That's me trying to load and use very huge pfBlockerng DNSBL feeds. It was not a success story. The system even started to use swap, and that's bad on a firewall.

            628b4f0f-e39d-4a61-899a-6125dcea42f6-image.png

            2 % disk space used.
            I wonder why I bought a MAX 😊

            • B
              BNetworker @Gertjan
              last edited by Apr 18, 2023, 3:05 PM

              @gertjan - Very little in use. This is the disk usage now.

              c9c31e5a-4eb5-4be3-99eb-db413bcaa651-image.png

              Very few packages as well:

              42aa6d9b-a32a-4be8-b345-bb58da1c37f9-image.png

              I did just notice there is a new 03.00.00.03t-uc-18 firmware out there.

              • S
                stephenw10 Netgate Administrator
                last edited by Apr 18, 2023, 3:17 PM

                The new firmware would not affect the disk.

                You might check the SMART data.

                Unfortunately in that situation often the only place that shows the error would be at the console. If it happens again check the console before resetting it. You would see drive errors there when trying to do anything that tries to read or write from it.

                Steve

                • G
                  Gertjan @BNetworker
                  last edited by Gertjan Apr 18, 2023, 3:21 PM Apr 18, 2023, 3:20 PM

                  @bnetworker
                  All that looks fine.

                  One exception though : do yourself a favor, and remove the Service_Watchdog package.
                  See it as a nasty dog : yes, it will bite every burglar, if it finds one. If none, it will bite you, the wife, the kids and even worse : the parents-in-law. "Service_Watchdog" is a software development tool (during the 'things don't work well yet' phase).

                  "Service_Watchdog" is bad ... I'm not sure it will lock up a 6100. It might be capable of doing so.

                  I've never seen pfSense services like unbound, the captive portal etc dying on me for the last 10+ years. Running on own hardware, VM and now a 4100.

                  The others you've listed : they run or do something at startup, and then they do nothing anymore.
                  openvpn-client-export : only used when you use the openvpn-client-export GUI page to export an ovpn file.

                  edit : yes, look at the "dmesg" log file ... !

                  • B
                    BNetworker @stephenw10
                    last edited by Apr 18, 2023, 3:22 PM

                    @stephenw10 - SMART data shows clean. I'll check the console if it happens again. If the NVME is failing, is it possible to replace with another and reload?

                    @Gertjan - Service watchdog was to restart OpenVPN, as sometimes I've seen the service stop after a config change, then I'm down remotely till I get home. It's resolved that issue. If it's that horrible and buggy, should we not raise an issue, or is it beyond repair?

                    • S
                      stephenw10 Netgate Administrator @BNetworker
                      last edited by Apr 18, 2023, 3:31 PM

                      @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

                      If the NVME is failing, is it possible to replace with another and reload?

                      Yes, that is possible. It's not recommended to open the case normally though as it's easy to damage it doing so. Care is required! If it's in warranty we would replace that for you if needs be.

                      Steve

                      • B
                        BNetworker @stephenw10
                        last edited by Apr 18, 2023, 3:32 PM

                        @stephenw10 said in 6100 - Stopped passing traffic / Web GUI not accessible:

                        @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

                        If the NVME is failing, is it possible to replace with another and reload?

                        Yes, that is possible. It's not recommended to open the case normally though as it's easy to damage it doing so. Care is required! If it's in warranty we would replace that for you if needs be.

                        Steve

                        Understood. I'll keep an eye on it. I'll report back if there are further issues.

                        • S stephenw10 moved this topic from General pfSense Questions on Apr 18, 2023, 3:36 PM
                        • G
                          Gertjan @BNetworker
                          last edited by Apr 19, 2023, 5:42 AM

                          @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

                          should we not raise an issue, or is it beyond repair?

                          It's dumb.

                          It loops around with a time delay, like */1 as it's a cron task, and checks if the pid of the OpenVPN server exists.
                          If the process was commanded to shut down, then the pid will be removed also.
                          But who gave the shut down command then ?

                          If the process just 'dies', or goes zombie in memory, the pid (file !) still exists. The watchdog still sees the file, and does nothing. IMHO : not a really good indication.

                          If the process had a bug, and dies 'clean' it will get restarted : to eventually hit the same bug, and die .... etc etc
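
The difference between "the PID file exists" and "the process is actually running" can be shown with a short sketch. This is not the Service_Watchdog code; it only illustrates the two kinds of check described above, on a POSIX system. The function names are made up for this example.

```python
import os
import subprocess
import sys
import tempfile


def pid_file_exists(pid_file):
    """Weak check: only looks at whether the PID file is present."""
    return os.path.exists(pid_file)


def process_alive(pid_file):
    """Stronger check: read the PID and ask the kernel whether it exists."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks the PID
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Demo: a child process that exits at once but leaves its PID file behind.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as fh:
    fh.write(str(child.pid))
    stale_pid_file = fh.name

file_check = pid_file_exists(stale_pid_file)
liveness_check = process_alive(stale_pid_file)
print(f"PID file present: {file_check}, process alive: {liveness_check}")
os.unlink(stale_pid_file)
```

Note that even the stronger check still reports a zombie as alive, because a zombie keeps its PID until the parent reaps it.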

                          @bnetworker said in 6100 - Stopped passing traffic / Web GUI not accessible:

                          beyond repair

                          If a process dies, have it repaired. Permanently electrocuting it never made anything 'better'. ;)

                          Btw : Ok, I understand your usage.
                          I've always an OpenVPN instance running for remote 'admin' access. I never found it 'stopped' because it had 'failed'.

                          Where things go downhill fast, is when it gets used with unbound, and the user also has pfBlockerng installed (of course with many and big dnsbl lists).
                          unbound can have long startup times, so if it dies and is revived by the watchdog, and it needs more than one minute to start, it will get restarted while it is still starting (and hasn't written out its pid file yet).
                          Best situation : the system's DNS is down as unbound never reaches a 'working' state. Worst : the entire system goes downhill fast.
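
The restart loop described above can be modelled with a small simulation. This is a toy model under stated assumptions, not the behaviour of any real package: the watchdog checks once every 60 seconds, treats a service that has not finished starting as dead, and restarts it from scratch. The function `simulate` and its parameters are hypothetical.

```python
def simulate(startup_seconds, check_interval=60, duration=600):
    """Return (ready_time, restarts); ready_time is None if never ready."""
    started_at = 0  # the service is (re)started at t=0
    restarts = 0
    for now in range(duration + 1):
        if now - started_at >= startup_seconds:
            return now, restarts  # startup finished before the next check
        if now > 0 and now % check_interval == 0:
            # Watchdog tick: service not ready yet, so it is restarted
            # and its startup clock begins again.
            started_at = now
            restarts += 1
    return None, restarts


print(simulate(45))  # fast service
print(simulate(90))  # service that needs longer than one check interval
```

In this model a service that starts in 45 seconds comes up without interference, while one that needs 90 seconds is restarted at every tick and never becomes ready within the 10 simulated minutes.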

                          • S
                            stephenw10 Netgate Administrator
                            last edited by Apr 19, 2023, 11:44 AM

                            The service watchdog is a troubleshooting tool and should be seen as such.

                            There are some situations where you might want to enable it on a service in a more permanent way, but even then it's usually to address some underlying bug.
                            For example, if you have an OpenVPN client and want it to be always up, you might use the watchdog. But only because, if the server side rejects the connection as unauthorized, the client will exit and not retry. Most clients would never see that, or if they did, retrying would not help.

                            Steve

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.