21.02 Sudden lockup
-
@ffuentes I've been watching this thread all day. It seems the focus is turning to pfBlockerNG, yet I have uninstalled all my packages. Removing pfBlockerNG-devel did lengthen the time between lockups, but it did not eliminate the problem.
edit: I am rebooting via USB cable->putty->console
-
@bulfinch Thanks for this update.
I suspect not a pfblocker issue, but something a bit deeper.
When we get lockups like this, sometimes it hints at a kernel issue.
Maybe a wild process, but even then you still get a ping response when the OS is running low on resources.
-
@ffuentes @Bulfinch I tried to upgrade from 2.4.5p1, got a lot of errors, so decided to perform a clean install and configure everything from scratch.
I suggest you guys do the same. I lost a few hours yesterday reinstalling everything, but I'm not getting these 'lockups'.
The only thing I restored from my 2.4.5p1 backup was the aliases. I saved a screenshot from my update from 2.4.5p1 to 21.02 (this screen doesn't show all the errors, there were plenty more).
-
@mcury For me, the console has never frozen. And I can still access the WAN from the console (pings, etc.) -- it's the LAN and the DNS Resolver that go down.
I have fully removed pfBlockerNG and will give this one more shot before throwing in the towel. But if the root cause really is pfBlockerNG somehow, I hope that gets resolved very quickly -- it's very important to have that.
How can I downgrade to 2.4.5p1, though, if I can't reach the internet because my pfSense is down? I did back up the configuration before I upgraded to 21.02, but that backup doesn't include the full pfSense software, does it?
-
@mcury said in 21.02 Sudden lockup:
, so decided to perform a clean install and configure
I may need to move in this direction. The family is rising up like the French resistance. I fear for my head.
-
@mcury do you know how many hours I have to invest in a reconfigure?
I am not sure I want to do that... It's going to be painful. Very painful.
-
@bulfinch said in 21.02 Sudden lockup:
@mcury said in 21.02 Sudden lockup:
, so decided to perform a clean install and configure
I may need to move in this direction. The family is rising up like the French resistance. I fear for my head.
kkkkkk, be careful, nowadays a family without internet can be really dangerous :)
I'm in the same boat, this SG-3100 is a home device, and my family went into berserk mode. You should do it while drinking a beer.
@ffuentes said in 21.02 Sudden lockup:
@mcury do you know how many hours I have to invest in a reconfigure?
I am not sure I want to do that...
It's going to be painful. Very painful.
I took a few minutes thinking about how bad it was going to be, but now I'm feeling better that I did it.
-
@mcury I am curious, how do you do a fresh install with an SG-3100?
-
@ffuentes Open a TAC ticket with support. go.netgate.com is the link to do it: click on Support and create a ticket asking for the 21.02 version, and they will provide a link to download it.
You will need a pendrive. If you are on Windows, just download 7zip to extract the file, and Balena Etcher to burn that image to the pendrive.
Reinstall instructions:
https://docs.netgate.com/pfsense/en/latest/solutions/sg-3100/reinstall-pfsense.html
-
@mcury u know... after some serious thought, I think it's time for me to move on and find a new solution. Thanks for your help.
-
I'm having this same issue here too. It took a few hard resets to get any response from my SG-3100.
My device seems to be using a lot of CPU on "/usr/local/sbin/check_reload_status", which after a bit of googling seems to be a symptom of something else not working properly.
-
@ffuentes said in 21.02 Sudden lockup:
@mcury u know... after some serious thought, I think it's time for me to move on and find a new solution. Thanks for your help.
You are welcome
-
Same issue here with my SG-3100: the system stops responding to the GUI, I cannot ping the router, and I have had to unplug the router 5 times today after upgrading to v21.
-
When your device "locks up", can someone go into the serial console, choose option 8 for the shell, and run the following?
php /usr/local/www/status.php
then run
cp /tmp/status_output.tgz /root/
After this, reboot your firewall, go into the webConfigurator --> Diagnostics --> Command Prompt, paste /root/status_output.tgz into the "Download File" prompt, and download the file from the firewall.
If you can then open a ticket with our support providing this file, it will hopefully help us figure out what's going on here.
Thank you everyone for your patience.
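For anyone who would rather not type the two console commands by hand mid-lockup, they can be wrapped in a small shell function. This is only a rough sketch, not an official Netgate script: the function name collect_status and the variables STATUS_SCRIPT, STATUS_OUT and DEST are made up for illustration, and their defaults are simply the paths quoted in this post.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands from this post.
# The variable names are assumptions; the default paths come from the post.
STATUS_SCRIPT="${STATUS_SCRIPT:-/usr/local/www/status.php}"
STATUS_OUT="${STATUS_OUT:-/tmp/status_output.tgz}"
DEST="${DEST:-/root}"

collect_status() {
    # Step 1: generate the status archive.
    php "$STATUS_SCRIPT" || return 1

    # Stop early if the archive was not produced.
    if [ ! -f "$STATUS_OUT" ]; then
        echo "no archive found at $STATUS_OUT" >&2
        return 1
    fi

    # Step 2: copy the archive into the destination directory.
    cp "$STATUS_OUT" "$DEST/" || return 1

    # Print the path to paste into the "Download File" prompt later.
    echo "$DEST/$(basename "$STATUS_OUT")"
}
```

After defining it in the shell (option 8 on the console), running collect_status should print the copied file's path if both steps succeed.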
-
@kphillips @mcury I wanted to come back after a few hours... I was able to finally warm-up, take a nice warm shower, and have a hot meal. Things down in the south (Houston, TX) are not good with this winter storm.
I have not had a hard lockup after uninstalling pfblockerng (Several hours now)
-
@ffuentes Same here.
-
I disabled pfblockerng and it lasts longer but eventually still locks up. Of course, I have many devices connected all the time. I have the serial console connected and will snag the status info once it locks again.
-
@softcoder I had the same issue, I had to uninstall it. Disabling did nothing for me.
Try uninstalling it.
-
@kphillips I made an attempt...
Since my last post, I followed the manual and reinstalled 21.02 via USB stick. I then restored my previously saved configuration.
After restoring the config and logging in I had 2 errors:
Feb 18 17:08:43 php-fpm 97269 /rc.update_urltables: : ERROR: could not update pfB_PRI1_v4 content from https://127.0.0.1:443/pfblockerng/pfblockerng.php?pfb=pfB_PRI1_v4 <br />[ Abuse_Feodo_C2_v4, Abuse_IPBL_v4, Abuse_SSLBL_v4, CINS_army_v4, ET_Block_v4, ET_Comp_v4, ISC_1000_30_v4, ISC_Block_v4, Spamhaus_Drop_v4, Spamhaus_eDrop_v4, Talos_BL_v4 ]
Feb 18 17:08:43 php-fpm 97269 /rc.update_urltables: Download file failed with status code 404. URL: https://127.0.0.1:443/pfblockerng/pfblockerng.php?pfb=pfB_PRI1_v4 <br />[ Abuse_Feodo_C2_v4, Abuse_IPBL_v4, Abuse_SSLBL_v4, CINS_army_v4, ET_Block_v4, ET_Comp_v4, ISC_1000_30_v4, ISC_Block_v4, Spamhaus_Drop_v4, Spamhaus_eDrop_v4, Talos_BL_v4 ]
I found under Firewall->Aliases a leftover pfblockerng reference.
I deleted the alias entry and upon saving, pfsense 'crashed'. I then rebooted via serial console.
Back online, I tried to access the firewall logs (Status -> System Logs -> Firewall) and the GUI just hangs. After 90 seconds or so I can access the dashboard again and have
"Netgate pfSense Plus has detected a crash report or programming bug. Click here for more information."
[18-Feb-2021 18:44:58 America/Los_Angeles] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20415890 bytes) in /usr/local/www/csrf/csrf-magic.php on line 161
This is 100% consistent. I cannot access the firewall logs at all (although the widget on the dashboard is working).
This appears separate from the locking-up crash this thread deals with.
Pfsense ran for about 45 minutes then crashed. I attempted via serial console to execute your commands, however the first command, php /usr/local/www/status.php, returned "Gathering Status Data..." and hung.
I let it sit for several minutes with no change and no response from the putty console. I closed the connection and reconnected but now could not even access the console menu. It was completely locked. I had to reboot via the reset button on the back of the 3100.
Once I rebooted, I was able to run the commands via the console. I have the output, but I don't know if it is what you are looking for.
I'm on the verge of rolling back to 2.4.5
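For scale, the byte counts in the PHP fatal error quoted above are easier to read in MiB. A quick sketch for converting them in a shell (to_mib is a made-up helper name, not a pfSense command):

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to MiB with one decimal place.
to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

to_mib 134217728   # the "Allowed memory size" limit, prints 128.0
to_mib 20415890    # the allocation that failed, prints 19.5
```

So the page hit a 128 MiB limit while asking for roughly another 19.5 MiB in one request.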
-
I'm fairly certain I've tracked this one down, after spending a few hours trying various attempts to reproduce on my own SG-3100. I have shared my test results with our developer team. I will update when I have more information or a resolution for you all.