SG-2220 Recovery
-
Hi folks,
this morning my SG-2220 was dead and hanging. When I power-cycled it, the box had all LEDs lit indefinitely. This was already a replacement for an older 2220 which had the Intel C2000 bug and hung dead sometime around 2018. This seems quite similar...
As I had been warned, I bought a used one (briefly tested and OK'd) years ago, and today I plugged it in (V2.4.5p1). Because my Internet was down due to router problems I did not update it again and installed an older backup from early 2021... You guessed it: the backup was much too new for that version and ruined the config. Normally this would be good training for a USB recovery, BUT the console is not working (it is only recognized as an unknown device, because enumeration fails and no vendor/device info is exchanged to choose the device type and matching driver). This is a hardware defect, which is why the box was "quite cheap", and normally you do not need the console for much, so I bought it anyway (it did start up, so network access is sufficient). It does boot up to some point but appears dead from both network sides.
I opened it up, took a closer look at the SBC and looked for bad solder joints around the SIL chip, but it's all OK, so I guess the chip or its programming got damaged at some point before I had it.
My question: is there another way to reset the config (e.g. "30 sec RESET, then do ...") or to get a flash automated (TFTP?). I see the SMB signals at a jumper post; same goes for J12. While the "old" one is dead, the replacement might be recovered quite simply if I get some kind of console working. In earlier years I'd have searched for the JTAG pins, but I've been out of that for many years.
Anything I could try?
Cheers
Michael
-
The first thing I would try here is to upload a known good config using the External Configuration Locator:
https://docs.netgate.com/pfsense/en/latest/backup/restore-during-install.html#restore-using-the-external-configuration-locator-ecl
If it is still booting at least partly that should work.
If it's booting that far you should also be able to reset the config to default using the reset button:
https://docs.netgate.com/pfsense/en/latest/solutions/sg-2220/index.html#how-to-guides
If the NIC LEDs are lit continually though that implies the NIC driver is not loading which would indicate it's not booting far enough for those.
Is the status LED changing from orange to green?
Steve
-
@stephenw10
Hi Steve,
thanks for the fast reply.
I tried the ECL with two FAT32 reformatted (= empty) sticks (16GB + 32GB), created a config folder and copied the renamed config.xml (V19.1, and I even went back to the oldest one) onto it, plugged it in and started the system. I see the lights change (yellow status to green after about 10 sec., then the eMMC LED blinks a lot for about 90 sec. and then every 2 min. or so), but that's it. I do not see a reboot because of a config change or anything. The NIC LEDs are initially lit, then while (I guess) booting they go dark and then show the normal blinking (act) and fixed green (connect) state. This goes for WAN and LAN. So the driver seems to be starting, but the ECL doesn't seem to get into action. I repeated it about 15 times until I quit.
I checked the USB voltage, which was 4.9V and seems OK. While checking the access to the stick I switched to some old LED-equipped sticks, and one contained an old PartedMagic install, which booted, now uses the WAN port to access my network, offers SSH and proves that USB is working.
So far I'm more successful this way (the ECL looks so promising, and it escapes me why it does not do its intended work), but I lack a dd image. The reinstall tgz image only offers an xz copy install, while PartedMagic only offers Linux filesystems so far (esp. this old version of it), so no UFS(2) to simply mount and exchange the wrong config file.
Do you have access to an SG-2220, and could you create an unconfigured dd image of any version for me to download? That would be of great help.
Cheers
Michael
P.S.: The reset button is somehow of no use, as it did not trigger a reset in any tested situation (even in PartedMagic it did nothing at all). Is that normal, or is it just because the GPIO pins are not configured at that point?
-
@stephenw10 : Here's the boot process (with USB Stick, FAT32, non Boot + config file):
I plug in the PSU, Orange status LED is on for 3sec. then changes to green; USB is powered and my stick signals readiness, after 10sec. the eMMC LED flickers a lot, the USB Access LED (inside stick) flashes 1-2 times which could be a simple mount, then the eMMC again active a lot, NICs are initiated (LAN first afair, then WAN), USB blanks and comes up about 2 sec. later, no further access and eMMC blinks for another 50sec. Then just as described earlier. Now I tried three sticks, about 30 startups. Still the PM startup was the best "backdoor" I have up to now. Because the first box (2220) is dead (all LEDs are lit all the time; nothing changes until I switch power off) I'm somehow stuck between a rock and a hard place for quite funny reasons.
Because the chips are so small (I used to be an electronics developer) and my movements are not as precise as they once were, I'm unable to tap into the UART side to see what's going on. I wonder if there's another I/O connector (J12?) for a normal RS232. While I have a PCIe x1 VGA card, I don't have a miniPCIe adapter anywhere within reach to gain some insight into the box.
...
-
Yeah, the reset button is just connected to GPIOs and does nothing until it's read in pfSense, something like 60s after powering up. So nothing in any other OS.
You can see in the video that's slightly after the ECL looks for any attached devices with config.
The ECL looks both in the root of the drive and in /config. I always just use the root.
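For anyone preparing a stick by hand, here is a minimal Python sketch of that layout, not an official tool. It first checks that the backup at least parses as XML, then copies it to both places the ECL is described as checking. The function name `stage_config_for_ecl` is made up for illustration, and the check for a `pfsense` root element is an assumption about what a usable backup looks like; parsing cleanly does not prove the config suits the installed version.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def stage_config_for_ecl(config_file, stick_mount):
    """Sanity-check a config backup, then copy it onto a mounted USB stick.

    Hypothetical helper: stick_mount is wherever the FAT32 stick is mounted.
    """
    config_file = Path(config_file)
    stick_mount = Path(stick_mount)

    # ET.parse raises ParseError on a truncated or otherwise malformed file.
    root = ET.parse(config_file).getroot()
    # Assumption: a pfSense backup has <pfsense> as its root element.
    if root.tag != "pfsense":
        raise ValueError(f"unexpected root element <{root.tag}>")

    # Write the file to the root of the stick and to config/ on the stick.
    targets = [
        stick_mount / "config.xml",
        stick_mount / "config" / "config.xml",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(config_file, target)
    return targets
```

A backup that fails the parse step is worth fixing before spending more boot cycles on the stick.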
Are you sure the config is good though?
I don't have an SG-2220 but any ADI image would likely be OK as long as it only has two interfaces assigned.
In fact any x86 image with igb NICs should work since you do not have console access anyway.
Steve
-
@stephenw10 : Hi Steve
great video and I think a good way to do a reset. Unfortunately it doesn't work on my box, so it seems the boot process gets stuck earlier for strange reasons (which would also explain the config.xml failure, as it never got to that point).
I tested at least 4 old (V19.x or 17.x) config files on 3 different sticks over about 30 tries. The reset is not working either (I tried pressing it for 3-4 min and no reboot or anything was initiated; it looks more like a loop to me). I already made a dd image so we could go into detail, but as the ADI versions are EOL and... well, old (and buggy thanks to Intel), I'm not sure if that helps.
The ADI image (download...pfsense...CE...ADI...Images...) would be great. All I know of are installer images (e.g. "pfSense-CE-memstick-ADI-2.6.0-RELEASE-amd64.img.gz"), which do not work here. Are there installed "finished" images somewhere?
Cheers
Michael
-
There isn't an image of an installed system because those devices were never installed like that. However you should be able to use an image of any pfSense install with igb NICs because the only difference the ADI specific images have is the serial console is configured for com2. And you don't need the console to work.
So you could install to any device with igb NICs and create an image from that. You could probably use a VM and the raw disk image.
Steve
-
@stephenw10 Hi Steve,
I have no older ADI version in service (this 2220 is my private box, and I have no access to the ones at work, which are newer and typically ARM64 based).
Anyway, thanks to the x86 I created a raw image, transferred it to my VMware Workstation, converted it, loaded it and saw:
On a reboot I saw that the ECL function (plus the reset function) was instantly overtaken by the config error in less than a second (so no screenshot was fast enough to capture it). The stick was "inserted", but it seems it was not mounted at all.
While everything was dead and presented the PHP error from the picture, I was able to start option 8 and had a shell open. I mounted the stick manually (shouldn't the ECL mount it by default to access it?), copied the config over and shut down (the key functions all ended up dead with the XML error, but the ACPI shutdown of the VM worked). I converted the image back to RAW and uploaded it via PartedMagic, and it boots!!!!
Everything is fine again; I updated it to 22.05 and imported the real config.
I learned a lot about the boot process (which might help me in further usage). I wonder whether the ECL not mounting and reading the stick, and the reset not being acknowledged (I'm aware that the GPIO is not emulated in the VM, but it should have worked like in the video in the real world), are fixed in the newer releases.
Anyway I salvaged one box and have already replaced my VM Router with the "real stuff".
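In case someone wants to repeat the last step (writing the repaired raw image back from a Linux live system), here is a rough Python sketch of a dd-style copy with a checksum comparison afterwards. It is only an illustration and not the exact commands I ran; the function names are made up, and the device path in the comment is a placeholder assumption that has to be checked on the real box before writing anything.

```python
import hashlib
import os


def write_raw_image(image_path, target_path, chunk_size=4 * 1024 * 1024):
    """Copy a raw image onto a target (block device or file) in chunks."""
    copied = 0
    with open(image_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
        # Push buffered data to the device before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return copied


def sha256_prefix(path, length, chunk_size=4 * 1024 * 1024):
    """Hash the first `length` bytes of a file or device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


# Example (placeholder paths, assumptions only):
# n = write_raw_image("repaired.img", "/dev/mmcblk0")
# assert sha256_prefix("/dev/mmcblk0", n) == sha256_prefix("repaired.img", n)
```

Comparing only the first `n` bytes matters because the target device is usually larger than the image.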
Thanks
Michael
-
Nice recovery!
The ECL should work in a VM if it sees a mounted USB drive (daX). We have seen issues on faster machines, or slow USB drives, where it fails to mount before the ECL runs. I could imagine you might be hitting that in a VM.
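To illustrate the kind of timing problem described here (this is not how the ECL is implemented, just a generic sketch), a check that runs once can miss a device that shows up a moment later, whereas polling with a timeout tolerates a slow drive. The function name `wait_for_device` and the `/dev/da0` path in the comment are illustrative assumptions.

```python
import os
import time


def wait_for_device(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have passed.

    Returns True if the path appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (assumed device name): wait up to 30 s for the first USB disk.
# if wait_for_device("/dev/da0"):
#     print("drive is present, safe to try mounting it")
```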
Steve