Issue with SG-3100 and 22.01? [Solved]
-
The monitoring graphs log that over time, so they should still show anything like that if it happened.
-
Oops, I've just seen that it stores the data over time. The gap from this morning, when the services stopped, is clearly visible:
But usage is constant every time!?
Regards
Edit: And here is the first drop in March:
-
So no big CPU usage before it stopped logging. You can check other resources by changing the graph settings. I would definitely check memory.
Steve
-
Yes, it seems that free memory is slowly but steadily decreasing over the days!
But if I read it correctly, there was still more than 65% free this morning!?
From today:
And in total since running on 22.01:
Regards
-
Hmm, still mostly free though. That wouldn't stop it responding.
-
Is this the unit where you updated WireGuard and ran into the wrong-branch issue?
If so, take a backup and do a clean reinstall.
-
Hi, I have the exact same problem with a 2100. I did a clean install of 2.6 about a month ago, and this problem has happened twice since then. I had no issues with the device before.
The GUI becomes inaccessible, SSH unreachable, cron jobs stop running etc., but routing still works to some extent (I can access my VPN server behind pfSense remotely).
I haven't been able to find anything relevant in the logs, so any pointers as to where to look would be appreciated.
I use pfSense at an SMB, so this is an absolute deal breaker for me, unfortunately.
-
There is no 2.6 image for the 2100, which is aarch64. I assume you mean 22.01?
Some services remaining up whilst others fail is typical of something like RAM exhaustion, though, so check the same things. It can also be caused by a failing drive, which then prevents any errors from being logged.
Steve
-
@stephenw10: Yes, I meant 22.01, sorry. I don't see anything unusual with RAM usage. Is there anything in S.M.A.R.T. data that would indicate that the M.2 SSD is failing? Also, is there an M.2 SATA SSD (2242) that you'd "officially" recommend?
-
Not other than the one that was already fitted if it has one.
I would expect some errors in the SMART data if it is failing.
I might expect to see other errors logged also.
-
@stephenw10: It didn't come with an M.2 SSD, I installed one. Do you know what errors/indicators I should be looking for, specifically?
-
Not anything specific. Any errors are bad!
Try running the short test and check the results.
Logging the console output is usually the best way to diagnose a drive failure, if you can, since the system will often dump error output there that cannot be written to the system log.
Steve
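
For anyone wanting to go through the SMART data after a short test, here is a minimal Python sketch, not an official tool. It assumes the attribute table was saved as text from `smartctl -A`, and it only flags a few attributes that are commonly watched for failing drives. The `WATCHED` list and the function name are illustrative assumptions; drives differ in which attributes they report and what they call them.

```python
# Attributes commonly watched as signs of a failing drive. This list is an
# assumption for illustration; vendors name and report attributes differently.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Reported_Uncorrect",
    "UDMA_CRC_Error_Count",
}


def suspicious_attributes(smartctl_output):
    """Return (name, raw_value) pairs for watched attributes with a nonzero raw value.

    Expects the attribute table printed by `smartctl -A`, where each row is:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    findings = []
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        # Only plain integer raw values are compared; anything else is skipped.
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            findings.append((name, int(raw)))
    return findings
```

To use it, save the table to a file (for example with `smartctl -A /dev/ada0 > smart.txt`, where the device name is an assumption and depends on the system) and call `suspicious_attributes(open("smart.txt").read())`. An empty result does not prove the drive is healthy; it only means none of the watched counters were nonzero.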
-
@stephenw10: I've run a short test, but it doesn't show any errors. I'll reinstall to eMMC. I really hope it's the SSD and not a software issue.
-
My device has no SSD, only eMMC. Hopefully it's not the hardware.
In the next few days I will set up my standby device from scratch and replace this unit the next time the issue occurs.
Regards
-
I would also consider re-installing 22.01 clean and restoring your config to rule out any issues during the upgrade.
Steve
-
My 3100 booted once (if I recall correctly) after the 22.01 update, back in Feb. Since then, I have been forced to set the unit aside. The update completed fine, but afterwards I could not access the unit via serial or the web page.
I've tried a hard reset via the reset button, with no change. Yet the boot process seems to complete, since the LED panel indicators appear to proceed in a valid sequence.
I cannot access by any means.
Is there a factory service?
I've opened a TAC-Lite ticket and am just surfing around the forum until a reply is received.
-
You should be able to see the boot output from U-Boot at the serial console even if there is no OS installed, so I would concentrate your efforts on getting that working. Once you can see the console, it will probably be obvious what is preventing you from connecting to the webGUI.
https://docs.netgate.com/pfsense/en/latest/solutions/sg-3100/connect-to-console.html
Steve
-
@stephenw10 Appreciate the reply:
Closer inspection revealed the serial connection to be only on the laptop side.
The 3100 has been successfully upgraded to the latest 22.01, and I left it unattended while packages were to be loaded, even though the WAN port reported 'no carrier' / 'DHCP down'. This is clearly isolated to the 3100, as moving the WAN cabling to another outside-facing device works perfectly.
Thinking that the DNS Resolver took some time to come up, I believed that time would cure all.
This is not the case. After three hours, the Resolver is up but there is still no WAN link, and the Gateway Monitoring service will not start and remains stopped.
Thoughts? Help!
-
You have already opened a TAC ticket.
Ask for the recovery image, usually you will get a download link within 1 hour (during normal office hours).
Then you can install the SG from scratch.
Regards
-
Do you see link LEDs on the WAN port?
What does Status > Interfaces show?
It should link and show UP if it's connected to anything unless it's disabled.
Steve
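
If only a shell is available, the same link check can be made from `ifconfig` output. Below is a small Python sketch that assumes FreeBSD-style `ifconfig` text, where each interface block may contain a `status:` line such as `status: active` or `status: no carrier`. The function names are made up for illustration.

```python
def interface_status(ifconfig_output):
    """Map interface name -> link status string from FreeBSD-style `ifconfig` output."""
    statuses = {}
    current = None
    for line in ifconfig_output.splitlines():
        if line and not line[0].isspace():
            # Unindented header line starts a new interface block, e.g. "name: flags=..."
            current = line.split(":", 1)[0]
        elif current and line.strip().startswith("status:"):
            statuses[current] = line.split("status:", 1)[1].strip()
    return statuses


def links_down(ifconfig_output):
    """Return the sorted names of interfaces whose reported status is not 'active'."""
    return sorted(
        name for name, status in interface_status(ifconfig_output).items()
        if status != "active"
    )
```

Interfaces that print no `status:` line at all (loopback, for example) are simply left out of the result. A WAN port listed by `links_down` matches the 'no carrier' symptom described above and points at the physical link (cable, port, or the device on the other end) rather than at DHCP or the Resolver.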