Netgate Discussion Forum

    21.02 Sudden lockup

    Official Netgate® Hardware
    164 Posts 30 Posters 49.9k Views
    • S
      solarizde
      last edited by solarizde

      Same here. I upgraded to 21.02 (SG-3100) 2 hours ago, and since then the device has become totally unresponsive three times. The console disconnects, and the only thing that helps is a power cycle. I will now revert to the previous version.

      The only things I found are a lot of strange failure messages in the system log:

      Feb 17 23:24:01	nginx		2021/02/17 23:24:01 [error] 37560#100148: send() failed (54: Connection reset by peer)
      Feb 17 23:23:59	root	71184	/etc/rc.d/hostid: WARNING: hostid: unable to figure out a UUID from DMI data, generating a new one
      Feb 17 23:23:58	php	375	rc.bootup: The command '/usr/sbin/powerd -b 'adp' -a 'adp' -n 'adp'' returned exit code '69', the output was 'powerd: no cpufreq(4) support -- aborting: No such file or directory'
      Feb 17 23:23:55	kernel		matchaddr failed
      Feb 17 23:23:54	kernel		matchaddr failed
      Feb 17 23:23:53	kernel		matchaddr failed
      Feb 17 23:23:54	php	375	rc.bootup: The command '/usr/local/sbin/strongswanrc stop' returned exit code '1', the output was 'strongswan not running? (check /var/run/daemon-charon.pid).'
      Feb 17 23:23:52	kernel		..
      Feb 17 23:23:52	kernel		.
      Feb 17 23:23:50	kernel		.
      Feb 17 23:23:33	kernel		route: writing to routing socket: Network is unreachable
      
      • K
        kerry diehl @jimp
        last edited by

        Same occurred here - with the 3100 also.

        Once the dashboard finally came up, the CPU was running at 100%. Disabling running services did not reduce CPU usage. Removing Snort finally brought CPU usage down, and I was able to stabilize and reboot. It now seems to be behaving normally.
        But when I reloaded Snort, it was running but not accessible from the menus.

        ...and when I then did it remotely on a second 3100, I got the same behavior.

        • B
          behemyth
          last edited by behemyth

          Yeah, I just had the same thing happen. I reported this back in the 2.5 beta; it seems to occur only on the 3100 series. I still have IPv4/IPv6 addresses on all of my interfaces, but I get total connectivity failure. I had about 6 hours of uptime before this happened. It's completely random.

          I tried to get logs of it, but since it's a total loss of network communication, my log servers never receive anything, and the local logs never showed anything.

          I have no packages running, by the way, just OpenVPN export.

          A power cycle will fix it, but I used a console connection to reboot manually. Everything came right back up.

          I will leave a console session open as jimp has suggested.

          • S
            solarizde
            last edited by

            It's indeed really weird. Because it was late yesterday (EU), I planned to do the rollback/reinstall this morning, but it didn't crash during the night. Maybe because there was not much load, or maybe it's random?

            The only thing I changed yesterday was to disable all packages, of which there weren't many:

            pfblocker, avahi, Service Watchdog, lldp

            But after the crash it was the same picture for me: besides the logs posted above, there was no other clue.

            Let's see how it runs during the day without packages enabled.

            • K
              Kuser @solarizde
              last edited by

              Same issue here. The install went without issues. The device was working for about 30-45 minutes before it froze/locked up the first time. Now I need to power cycle it every 10-60 minutes. I tried removing all unnecessary packages, but without success.

              When it freezes I can't even ping it via LAN.
              This has now happened 5 times.

              I'm opening a support ticket in order to get access to the image, so I can test if reinstall solves the issues...

              Netgate: SG-3100.

              • K
                Kuser @Kuser
                last edited by

                @kuser
                I opened a ticket and got hold of the 21.02 image, reflashed the device, and reimported my backup.
                Same issue after about 65 minutes.
                The device doesn't actually freeze, but something happens with the internal switch/interface.
                It stops responding on WAN/LAN; however, the USB console is still available.

                I've requested the 2.4.5p1 image from Netgate.

                • jimpJ
                  jimp Rebel Alliance Developer Netgate
                  last edited by

                  Has anyone monitored the console yet when this happens? The system log wouldn't necessarily contain the same information that is printed to the console.

                  And that also would let you check easily if it's actually locked up vs still being responsive at the console but losing connectivity.
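For anyone leaving a console session open, a minimal sketch of how the captured output could be timestamped so the last messages before a lockup are easy to find. It assumes the serial port is already configured by a terminal program and that console bytes are piped in from somewhere; the function name `log_console` and the device path in the comment are illustrative, not anything from Netgate.

```python
from datetime import datetime, timezone
from typing import BinaryIO, Callable, TextIO


def utc_now() -> datetime:
    """Default clock: current time in UTC."""
    return datetime.now(timezone.utc)


def log_console(source: BinaryIO, sink: TextIO,
                now: Callable[[], datetime] = utc_now) -> int:
    """Copy console lines from source to sink, prefixing each with a timestamp.

    Returns the number of lines copied. Undecodable bytes are replaced
    rather than dropped, since console output near a crash can be garbled.
    """
    count = 0
    for raw in source:
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        sink.write(f"{now().isoformat(timespec='seconds')} {text}\n")
        sink.flush()  # flush per line so nothing is lost if the logger is killed
        count += 1
    return count


# Hypothetical usage (device path is an assumption and differs per system):
#   with open("/dev/ttyUSB0", "rb") as port, open("console.log", "a") as log:
#       log_console(port, log)
```

Flushing after every line is deliberate: the interesting output is whatever arrives just before the box stops responding.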

                  • K
                    Kuser @jimp
                    last edited by Kuser

                    @jimp

                    I can confirm that the console was available the last time I lost LAN/WAN. I didn't find anything interesting in the logs (dmesg), but I do suspect it might be related to the internal switch. I'm not really sure what I should be looking for, though. I am currently connected to the console and can provide some debug information if it locks up again. Is there anything in particular I should check?

                    I tried service netif restart but that seemed to hang.

                    • B
                      behemyth
                      last edited by

                      @jimp

                      Every time this has happened to me, the console has been accessible. Both interfaces also keep their IPv6/IPv4 addresses. It "feels" like routes are randomly disappearing, but if that were the issue I should still be able to ping hosts on the locally connected network, and I can't even do that. Traffic pretty much stops.
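The reasoning in the last few posts (console still responsive, addresses still assigned, but even directly connected hosts unreachable) can be written down as a small triage helper. This is only a sketch of that logic; the category names and the `classify` function are made up for illustration, and the three inputs are observations you would collect by hand at the console.

```python
from enum import Enum


class Failure(Enum):
    NONE = "no failure observed"
    HARD_LOCKUP = "system locked up (console unresponsive)"
    DATA_PATH = "interface/switch level failure (local hosts unreachable)"
    ROUTING = "routing problem (local hosts reachable, remote hosts not)"


def classify(console_ok: bool, local_ping_ok: bool, remote_ping_ok: bool) -> Failure:
    """Map three manual observations onto the failure modes discussed in the thread."""
    if not console_ok:
        return Failure.HARD_LOCKUP
    if not local_ping_ok:
        # Missing routes alone would not stop traffic on a directly
        # connected network, so this points below the routing table.
        return Failure.DATA_PATH
    if not remote_ping_ok:
        return Failure.ROUTING
    return Failure.NONE


# The symptoms described above: console works, nothing on the LAN answers.
print(classify(console_ok=True, local_ping_ok=False, remote_ping_ok=False).value)
```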

                      • M
                        mcury @behemyth
                        last edited by

                        Try disabling pfBlockerNG. I'm getting similar behavior, and it's working with it disabled.

                        dead on arrival, nowhere to be found.

                        • K
                          kphillips Administrator Netgate @behemyth
                          last edited by

                          @behemyth If the console is accessible like you said, can you please provide the output?

                          • T
                            TommyG
                            last edited by

                            I am also experiencing the same loss of LAN/WAN on my 3100 after upgrading to 21.02 last night. I am not running any special packages, just DHCP, DNS, NTP and UPnP.

                            • R
                              rloeb
                              last edited by

                              Been running for 18+ hours. However, just noticed that Snort is NOT running and aborted just after midnight:

                              Feb 18 00:30:17 kernel pid 76998 (php), jid 0, uid 0: exited on signal 11 (core dumped)

                              Feb 18 00:30:14 php 76998 [Snort] Building new sid-msg.map file for WAN...

                              Something is very wrong with this release!

                              • B
                                behemyth
                                last edited by

                                @kphillips

                                I am waiting for it to happen again - I've had a console open and logging since last night. Once it does I will post the output.

                                • M
                                  mcury @rloeb
                                  last edited by mcury

                                  @rloeb said in 21.02 Sudden lockup:

                                  Been running for 18+ hours. However, just noticed that Snort is NOT running and aborted just after midnight:

                                  Feb 18 00:30:17 kernel pid 76998 (php), jid 0, uid 0: exited on signal 11 (core dumped)

                                  Feb 18 00:30:14 php 76998 [Snort] Building new sid-msg.map file for WAN...

                                  Something is very wrong with this release!

                                  I'm getting similar errors, but with pfBlockerNG, during boot.

                                  https://forum.netgate.com/post/964587

                                  Feb 18 02:05:29	kernel		pid 49475 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 02:09:02	kernel		pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 02:16:21	kernel		pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 02:39:03	kernel		pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 02:44:59	kernel		pid 377 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 02:52:02	kernel		pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
                                  Feb 18 03:07:38	kernel		pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
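To see at a glance which processes are crashing and how often, log lines like the ones above can be tallied with a short script. This is a rough sketch that assumes the kernel message format shown in this thread ("pid N (name), jid N, uid N: exited on signal N"); the function name `count_crashes` is illustrative.

```python
import re
from collections import Counter
from typing import Iterable

# Matches kernel crash notices in the format seen in the logs in this thread.
CRASH_RE = re.compile(
    r"pid (\d+) \(([^)]+)\), jid \d+, uid \d+: exited on signal (\d+)"
)


def count_crashes(lines: Iterable[str]) -> Counter:
    """Count crash notices per (process name, signal number)."""
    counts: Counter = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts


sample = [
    "Feb 18 02:05:29\tkernel\t\tpid 49475 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)",
    "Feb 18 02:09:02\tkernel\t\tpid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)",
    "Feb 18 02:16:21\tkernel\t\tpid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)",
]
for (name, signal), n in count_crashes(sample).items():
    print(f"{name}: {n} crash(es) on signal {signal}")
```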
                                  

                                  • K
                                    kphillips Administrator Netgate @mcury
                                    last edited by

                                    @mcury Can someone provide serial console output? We've asked for this a few times and until someone gives us diagnostics information we can't move forward.

                                    • M
                                      mcury @kphillips
                                      last edited by mcury

                                      @kphillips said in 21.02 Sudden lockup:

                                      @mcury Can someone provide serial console output? We've asked for this a few times and until someone gives us diagnostics information we can't move forward.

                                      Sure. The only problem during boot is the message "Configuring Firewall.Segmentation fault (core dumped)". This only happens after installing pfBlocker and then rebooting.

                                      Let me install pfBlockerNG-devel again and reboot so I can provide you with the logs.
                                      One moment please.

                                      • K
                                        kphillips Administrator Netgate @mcury
                                        last edited by

                                        @mcury Thank you. So, to confirm, this issue is only present when you are running pfBlockerNG and you don't experience the issue when you are not running pfBlockerNG?

                                        • Y
                                          yammering
                                          last edited by

                                          I’m getting the same problem on my SG-3100, and I’m not using pfBlocker or Snort. I only have HAProxy (nearly idle) and OpenVPN packages installed.

                                          • K
                                            kphillips Administrator Netgate @yammering
                                            last edited by

                                            @yammering Can you please provide serial console output for your appliance when one of these lockups occurs?

                                            https://docs.netgate.com/pfsense/en/latest/solutions/sg-3100/connect-to-console.html

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.