Issue with SG-3100 and 22.01? [Solved]
-
I have the same problem on my SG-3100 running 22.01. Roughly every 4-5 weeks the device freezes. I can ping the device/gateway address, but no traffic goes through the WAN interface. The DNS resolver, web GUI etc. do not work or are unreachable.
I have a Raspberry Pi connected to it via a USB-serial console, recording all the output to a text file using GNU screen. Nothing is recorded when this happens. Nothing in particular is shown in the boot log either. My knowledge of all the log files is limited, however, so I might be missing something.
I ran S.M.A.R.T. tests on my M.2 SATA drive and no errors are shown. I see a similar memory graph to the previous poster's.
Any ideas or suggestions on what to check next would be greatly appreciated.
Should I perhaps perform a re-install and restore the config from backup?
Or open a support ticket with Netgate?
-
Does the console still respond when this happens? Can pfSense connect out on any interface?
Do you see free RAM drop to <10%?
Steve
-
This post is deleted!
-
Hmm, and you manually power-cycled it at that point?
-
After it happened again this morning, I finally replaced the device with the second SG-3100 (which I had not done in the past weeks).
We will see if it happens with this device too.
In the meantime I will start rebuilding the replaced device from scratch.
Regards
Edit: Help needed: after exchanging the SG-3100 I am flooded with mails saying "can't create socket: No buffer space available", and I can't find any reason for that.
I exchanged the devices again, but the other device (the one that stopped working this morning) also sends thousands(!) of these mails???
What has happened now?
There are some threads about "No buffer space available", but they all relate to ping, traceroute or other networking commands.
But I did not find any post with "can't create socket". Any ideas?
Edit 2: no idea what happened, but the mails have now stopped!??? Currently the device which stopped this morning is on duty again. I checked routes and NICs, updated some packages, and the mails stopped. What a mystery!?
-
@stephenw10 Correct. Can't do much else with no response from the device or WAN connection.
-
@fsc830 said in Issue with SG-3100 and 22.01?:
Edit 2: no idea what happened, but the mails have now stopped!??? Currently the device which stopped this morning is on duty again. I checked routes and NICs, updated some packages, and the mails stopped. What a mystery!?
Maybe I was mistaken and there was a delay in mail delivery, so that mails from device #1 arrived at the mail client after device #2 was at work again. So I misread the mails as coming from device #2.
Anyhow: any idea what is causing this error?
Regards
-
@fsc830 said in Issue with SG-3100 and 22.01?:
Anyhow: any idea what is causing this error?
That looks like what you see if the WAN is changed and it's still trying to monitor the old gateway.
-
Yes, my first thought was that an interface is not working properly, but how can that be? The device was powered on, so in my understanding there can't be an old gateway?
I will boot up the appliance in an isolated network later and check the settings.
In the dashboard the WAN interface was reported as good (no packet loss to the monitoring gateway).
Regards
-
This morning I updated the device to 22.05.
We will see if this "frozen" state appears again in the next weeks...
Hopefully not.
Regards
-
I also updated my SG-3100 to 22.05 today. Will find out if the issue is resolved in about 3-4 weeks.
-
@pwyde You've got an M.2 SATA disk; mine is still running on eMMC.
I have the M.2 here, but have not found the time to install it.
But if yours with M.2 and mine with eMMC show the same freezes, my guess is that it is not due to an unresponsive disk as suspected. We will see; since 22.01 the issue has usually occurred within 3, max. 4 weeks.
Regards
-
If the storage goes away during runtime you will see a bunch of errors on the console if you're logging that.
-
Logging was enabled all the time, but there is still ... nothing.
When rebooting the device every line is recorded again, until the next freeze.
I am always logged in and viewing the dashboard in the GUI, so no login is recorded either.
When the device freezes, the GUI gives a 404 error(?), and the console still shows no error/issue/event.
Regards
-
Is it actually a 404? That is a response, which would indicate the web server is still running.
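To tell those two cases apart next time, something like the following could be run from another machine on the LAN. It is only a minimal sketch: `classify_gui` is a made-up helper name, certificate checks are turned off on the assumption that the GUI uses a self-signed certificate, and proxies are bypassed on the assumption that the firewall is reached directly.

```python
import ssl
import urllib.error
import urllib.request


def classify_gui(url, timeout=5.0):
    """Return 'http <code>' if a web server answered, else 'unreachable'."""
    # Assumption: self-signed GUI certificate, so verification is disabled.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Empty ProxyHandler: talk to the firewall directly, ignoring proxy settings.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),
        urllib.request.HTTPSHandler(context=ctx),
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # A 4xx/5xx status (such as 404) still means the web server replied.
        return f"http {exc.code}"
    except (urllib.error.URLError, OSError):
        # Refused connection, timeout, DNS failure: nothing answered.
        return "unreachable"
```

A result like `http 404` from `classify_gui("https://192.168.1.1/")` (the address is only an example) would mean the web server is alive, while `unreachable` points at the network path or the service itself.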
-
Actually I can't remember if it was really a 404...
But now I have noticed another strange "issue?"...
I just looked at the memory use and noticed a big difference in usage between 22.01 and 22.05.
Can someone confirm this behaviour?
Is memory usage significantly higher in 22.05?
22.01: about 80% free; 22.05: about 30%?
And "inactive" the other way round, from 7.5% to 60%?
Regards
-
For me it was not a 404; the website could not be reached at all.
I have the exact same memory usage as you, comparing 22.01 and 22.05.
-
Has it been rebooted since the update? The update process itself uses a lot of RAM. That would normally be freed over time or if required. Rebooting it should show similar usage to 22.01.
Steve
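For comparing "free" and "inactive" before and after a reboot, the raw page counters can be turned into percentages. A rough sketch, assuming a shell on the firewall with Python 3 available (which is not guaranteed); the `vm.stats.vm.*` names are standard FreeBSD sysctl counters, while `read_counts` and `percentages` are illustrative helper names, and the dashboard may calculate its figures differently.

```python
import subprocess

# FreeBSD page counters (values are numbers of memory pages).
OIDS = {
    "total": "vm.stats.vm.v_page_count",
    "free": "vm.stats.vm.v_free_count",
    "inactive": "vm.stats.vm.v_inactive_count",
}


def read_counts():
    """Read the page counters via `sysctl -n` (FreeBSD only)."""
    counts = {}
    for name, oid in OIDS.items():
        raw = subprocess.check_output(["sysctl", "-n", oid])
        counts[name] = int(raw.decode().strip())
    return counts


def percentages(counts):
    """Express each counter as a percentage of the total page count."""
    total = counts["total"]
    return {
        name: round(100.0 * pages / total, 1)
        for name, pages in counts.items()
        if name != "total"
    }
```

On the box itself, `percentages(read_counts())` would give the two figures to compare. Inactive pages can be reclaimed when needed, so a high "inactive" share on its own is not a sign of memory running out.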
-
An EBKAC error... (Error Between Keyboard And Chair)
Just rebooted:
Regards
-
I am fairly sure I found the cause of the issue.
Today it was (nearly) the same: no internet access, no name resolution, but the pfSense GUI was still available. That was slightly different. And a running internet stream was still active!?
Anyhow, the appliance was rebooted on July 2nd after installing v22.05.
Since then I have used the dashboard widget for disk usage.
Some months ago (could be shortly after installing 22.01) I enabled the option "use RAM disks" for /tmp and /var.
Since the last reboot I noticed that /var was getting more and more full. I was curious whether any process frees up the filesystem at a certain level.
No, there wasn't. This morning /var was 100% full (maybe 99.99%, and that's why it was slightly different this time).
Anyhow, administration via the GUI was possible and I disabled the RAM disks.
Now I am curious what will happen next; in the disk widget only / (root) and /var/run are visible.
With 22.05 I also installed an M.2 SSD with 128 GB; my guess is that this should not run out of space, hopefully.
Regards
Edit: nothing was seen at the console; the last entry was a login notification from July 12th, and the next is the reboot from today. I would at least have expected some warning about the 100% file system usage.
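Since no warning showed up anywhere, a small check along these lines could be run periodically to flag a filling /var or /tmp before it reaches 100%. This is only a sketch under assumptions: it needs Python 3 on the firewall (not guaranteed), `usage_percent` and `full_filesystems` are made-up names, the 90% threshold is arbitrary, and the percentage may differ slightly from what `df` or the disk widget reports.

```python
import shutil


def usage_percent(path):
    """Used space of the filesystem holding `path`, as a percentage."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total if total else 0.0


def full_filesystems(paths, threshold=90.0):
    """Return (path, percent) pairs for paths at or above the threshold."""
    report = []
    for path in paths:
        try:
            pct = usage_percent(path)
        except OSError:
            continue  # path does not exist on this system
        if pct >= threshold:
            report.append((path, round(pct, 1)))
    return report
```

For example, `full_filesystems(["/var", "/tmp"])` returns an empty list while both are below the threshold, and the offending mount points otherwise.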