21.02 Sudden lockup
-
Same occurred here - with the 3100 also.
Once the dashboard finally came up, CPU was running at 100%. I disabled the running services without any drop in CPU usage. Removing Snort finally brought the CPU usage down, and I was able to stabilize and reboot. It now seems to be behaving normally.
But when I reloaded Snort, it's running but not accessible from the menus, and when I then did the same remotely on a second 3100, I got the same behavior.
-
Yeah, I just had the same thing happen. I reported this back in the 2.5 beta; it seems to occur only on the 3100 series. I still have IPv4/IPv6 addresses on all of my interfaces, but I get total connectivity failure. I had about 6 hours of uptime before this happened. It's completely random.
I tried to get logs of it, but since it's a total loss of network communication, my log servers never get anything, and the local logs never showed anything.
I have no packages running btw, just openvpn export.
A power cycle will fix it, but I used a console connection to manually reboot. Everything came right back up.
I will leave a console session open as jimp has suggested.
-
It's indeed really weird. Because it was late yesterday (EU), I wanted to do the rollback / reinstall this morning, but it didn't crash during the night. Maybe because there wasn't much load, or maybe it's random?
The only thing I changed yesterday was to disable all packages, of which there actually weren't many:
pfblocker, avahi, service watchdog, lldp
After the crash I see the same picture, though: besides the logs posted above, there is no other clue.
Let's see how it runs during the day without packages enabled.
-
Same issue here, install went without issues. Device was working for about 30-45 minutes before it froze/locked up the first time. Now I need to power cycle it every 10-60 minutes. Tried removing all unnecessary packages, but without success.
When it freezes I can't even ping it via LAN.
This has now happened 5 times. I'm opening a support ticket in order to get access to the image, so I can test whether a reinstall solves the issue...
Netgate: SG-3100.
-
@kuser
Opened a ticket and got hold of the 21.02 image, reflashed the device and reimported the backup.
Same issue after about 65 minutes.
The device doesn't actually freeze, but something happens with the internal switch/interface.
It stops responding on WAN/LAN, however the USB console is available.
I've requested the 2.4.5p1 image from Netgate.
-
Has anyone monitored the console yet when this happens? The system log wouldn't necessarily contain the same information that is printed to the console.
That would also let you easily check whether it's actually locked up or still responsive at the console but losing connectivity.
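To line up console output with the moment connectivity is lost, a LAN client could keep a timestamped record of whether the firewall still answers pings. The following is only a rough sketch: the address 192.168.1.1, the log file name, and the helper names are placeholders, not anything from this thread, and it assumes a Unix-like client where `ping -c 1` sends a single probe.

```python
import subprocess
import time
from datetime import datetime


def probe(host, run=subprocess.run):
    """Return True if a single ping to host succeeds within 5 seconds."""
    try:
        result = run(
            ["ping", "-c", "1", host],  # -c 1: one probe (Unix-like ping)
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def format_entry(when, host, ok):
    """Build one log line: ISO timestamp, host, and 'up' or 'DOWN'."""
    state = "up" if ok else "DOWN"
    return f"{when.isoformat(timespec='seconds')} {host} {state}"


def watch(host, logfile, interval=10, rounds=None, run=subprocess.run):
    """Append one entry per interval; rounds=None runs until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        entry = format_entry(datetime.now(), host, probe(host, run))
        with open(logfile, "a") as fh:
            fh.write(entry + "\n")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)


# Hypothetical usage (placeholder LAN address and file name):
# watch("192.168.1.1", "firewall-ping.log")
```

The first DOWN entry gives an approximate time to compare against whatever the serial console printed, which helps tell a real lockup from a loss of LAN/WAN connectivity.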
-
I can confirm that the console was available the last time I lost LAN/WAN. I didn't find anything interesting in the logs (dmesg), but I do suspect it might be related to the internal switch. I'm not really sure what I should be looking for, though. I am currently connected to the console and can provide some debug information if it locks up again. Anything particular I should check?
I tried service netif restart but that seemed to hang.
-
Every time this has happened to me the console is accessible. Both interfaces also keep their IPv6/IPv4 addresses. It "feels" like routes are randomly disappearing, but if that were the issue I should still be able to ping hosts on the locally connected network, and I can't even do that. Traffic pretty much stops.
-
Try disabling pfBlockerNG. I'm getting similar behavior, and it works with it disabled.
-
@behemyth If the console is accessible like you said, can you please provide the output?
-
I am also experiencing these same issues with loss of LAN/WAN on my 3100 after upgrading to 21.02 last night. I am not running any special packages, just DHCP, DNS, NTP and UPnP.
-
Been running for 18+ hours. However, just noticed that Snort is NOT running and aborted just after midnight:
Feb 18 00:30:17 kernel pid 76998 (php), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 00:30:14 php 76998 [Snort] Building new sid-msg.map file for WAN...
Something is very wrong with this release!
-
I am waiting for it to happen again - I've had a console open and logging since last night. Once it does I will post the output.
-
@rloeb
Getting similar errors but with pfblockerng, during boot.
https://forum.netgate.com/post/964587
Feb 18 02:05:29 kernel pid 49475 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 02:09:02 kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 02:16:21 kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 02:39:03 kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 02:44:59 kernel pid 377 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 02:52:02 kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 18 03:07:38 kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
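For anyone comparing crash logs across boxes, a small tally of which processes are dying on which signal can make the pattern easier to see. This is a minimal sketch that assumes the kernel lines keep the format pasted above; the function name and the regular expression are illustrative, not part of pfSense.

```python
import re
from collections import Counter

# Matches kernel lines of the form seen in this thread, e.g.
#   kernel pid 375 (php-cgi), jid 0, uid 0: exited on signal 11 (core dumped)
CRASH_RE = re.compile(
    r"kernel pid (?P<pid>\d+) \((?P<proc>[^)]+)\), jid \d+, uid \d+: "
    r"exited on signal (?P<sig>\d+)"
)


def count_crashes(log_text):
    """Return a Counter keyed by (process name, signal number)."""
    return Counter(
        (m.group("proc"), int(m.group("sig")))
        for m in CRASH_RE.finditer(log_text)
    )


# Three of the lines posted above, used as sample input.
sample = (
    "Feb 18 02:05:29 kernel pid 49475 (php-fpm), jid 0, uid 0: "
    "exited on signal 11 (core dumped)\n"
    "Feb 18 02:09:02 kernel pid 375 (php-cgi), jid 0, uid 0: "
    "exited on signal 11 (core dumped)\n"
    "Feb 18 02:44:59 kernel pid 377 (php-cgi), jid 0, uid 0: "
    "exited on signal 11 (core dumped)\n"
)

for (proc, sig), n in sorted(count_crashes(sample).items()):
    print(f"{proc}: {n} exit(s) on signal {sig}")
```

Because it uses `finditer`, it works whether the entries are one per line or run together on a single line.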
-
@mcury Can someone provide serial console output? We've asked for this a few times, and until someone gives us diagnostic information we can't move forward.
-
@kphillips
Sure, the only problem during boot is the Configuring Firewall.Segmentation fault (core dumped). This only happens after the pfblocker installation, and after a reboot.
Let me install pfBlockerNG-devel again and reboot to provide you with the logs.
One moment please.
-
@mcury Thank you. So, to confirm, this issue is only present when you are running pfBlockerNG and you don't experience the issue when you are not running pfBlockerNG?
-
I’m getting the same problem on my SG-3100, and I’m not using pfBlocker or Snort. I only have HAProxy (nearly idle) and OpenVPN packages installed.
-
@yammering Can you please provide serial console output for your appliance when one of these lockups occurs?
https://docs.netgate.com/pfsense/en/latest/solutions/sg-3100/connect-to-console.html
-
@kphillips Exactly, the problem is when pfblockerng is enabled.
If it's installed, but disabled, the problem doesn't happen.