pfSense Plus and SG-3100
-
Agree. After letting it sit for an hour, and connecting a console (no output), I realized it was safe to pull the plug. It ended up resolving the issue.
I also submitted my logs to Netgate support so they could maybe try to figure out why some devices were locking up. Maybe you could open a support ticket and submit your logs to them?
-
@flsnowbird
Jesus Christ, I can't even imagine going more than 10 minutes without a firewall; there are several clinics running around the clock. Let me know whether powering it off and back on resolved it.
I'm afraid of this update.
-
@luketa I can't ask them to unplug it because the cables are not labeled and there's a lot more critical equipment plugged in that I don't want them to unplug by mistake. They'll just have to wait.
@Amarand I did open a ticket and they asked me to hook up a USB console cable to see what's going on. Can't really do that until I get home.
-
No one here in the thread has had a failure during the installation other than the flashing-lights lock-up. Probably shouldn't upgrade a firewall remotely without smart-hands in place. And you probably shouldn't upgrade a firewall during core business hours - that's just common sense.
Myself, I did it during the day because sometimes you just forget the rules above, and you cowboy a fix.
But if it gets to that point where all three lights are flashing, you're probably just a power cycle away from being up, which is only a minute or two. I waited an hour to see if it was in some sort of upgrade state, like "do not interrupt" or something.
I only power cycled after verifying no output on the console.
-
My guess is, from personal experience, you'll hook it up, and see nothing. You'll type characters/return/enter and it will echo your keys, but won't actually do anything other than echoing.
You'll pull the power, and plug it back in, and the console will spring to life, showing you the latest version at the serial console menu.
That seems to be the consensus.
As an aside, it's super easy to see/feel which is the power cable, because it's connected with a screw-on protective ring, whereas everything else on the SG-3100 isn't.
So if you tell them to look for a silver ring on the back right (looking down from the top, from the front), it should be the ONLY thing with a silver screw-ring.
I think you can also hit the Reset button with a toothpick or paperclip, and that acts as a quick restart without clearing your settings, but I've never tried that, so I don't know what the reset button's rules are.
Intuitively, a quick press would reboot, and a long-press (more than a few seconds) would reset to factory defaults or put you into a safe mode.
Anyway, good luck! It's good to read these replies, because it makes me feel like maybe I'm not alone with my lock-up experience, and it also gives me more confidence when I have issues in the future.
-
Also, I know I've mentioned this a few times before in this thread alone but, how cool is it that Netgate includes a serial USB console cable with the unit? Heck, even my iPhone isn't including a charging brick anymore and this was only half the cost. How many $399 units are all metal, too? It's a really nice unit.
Since I bought mine last May, I've had exactly one problem with it (this one), and I could have 100% avoided this problem by ignoring this upgrade for a week or two. But I like upgrading my stuff as soon as new versions come out, so that's on me.
-
@amarand thanks for the tips and recommendation, but like you mentioned, I have no "smart-hands" available at home at the moment. No big deal, their streaming/ps5 playing will have to wait a few hours.
-
@amarand Agreed.
-
Just upgraded from 21.02 to the new 21.02-RELEASE-p1. Had no trouble!
I am wondering if the people that have the 3 blinking light problem are upgrading from 2.4.5?
-
+1
I'd like to know too -- I need this fix, but watching this thread has convinced me that I should request the updated USB image, and block out an hour during "off-peak" usage to attempt the upgrade.
If folks upgrading from 21.02 aren't having any issues, I might roll the dice.
-
I upgraded to 21.02 with no issues.
I then experienced the hanging, so I added the CPU limiter workaround.
The -p1 hotfix released, and I applied it from the web-site.
The upgrade log showed, in the web-browser, "success" at the end, no errors.
After five minutes of "retrying, please wait" and no SSH response, I went downstairs to see the blinking lights.
I forgot about the console/cable, and opened a ticket with Netgate because I wasn't sure if I should unplug a system that was potentially in the process of upgrading. That's a good way to actually brick any appliance/embedded system.
After figuring out that the console was the safest way forward, got that hooked up, saw no response on the console, and decided a power cycle was warranted. That was almost an hour from upgrade.
-p1 came back up just fine.
-
@sdd said in pfSense Plus and SG-3100:
+1
I'd like to know too -- I need this fix, but watching this thread has convinced me that I should request the updated USB image, and block out an hour during "off-peak" usage to attempt the upgrade.
Installing "off-peak" is always a good idea anyway.
If folks upgrading from 21.02 aren't having any issues, I might roll the dice.
If you're physically there by the firewall, go ahead and apply it, as long as you can afford a short outage. But I would have a serial console hooked up and watching BEFORE you apply. This way, you can see where it's hanging, if it hangs.
As an aside, for me, it applied -p1 from 21.02, hung, I power cycled, it came up on -p1 hotfix. It shouldn't have locked up post-install, but it did successfully update.
-
Oh, and, of course, if you have folks at home who are willing to trace that power cable from the back of the unit, to where it plugs into the wall, that might be easier to unplug, wait a few seconds, then plug it back in. That's slightly easier than unscrewing the screw.
I already have my kid trained to detect cable modem issues based on LED statuses. He knows how to unplug the cable modem and the main house networking switch, watch for things to come back up, and knows how to report that back to me.
Might be good opportunity to train your family!
-
In my line of work, back in the '90s, I "bricked" a supercomputer I was installing, and had to wait for someone to bring me a flash image on USB. That was MY stupidity.
I think this is just some weird post-install glitch Netgate hasn't figured out yet. If someone who's on 21.02 is willing to hook up a serial console pre-upgrade to -p1, AND they get a crash, they might get some actionable data to send back to Netgate, especially if they were recording (for example in PuTTY). My logs didn't show Netgate anything...but serial console output is usually more verbose.
I don't think anyone has actually "bricked" their SG-3100 at this point, during this upgrade, at least no one contributing to this thread.
-
Not worried about bricking. Just saying I'd love to try it out and report back, but I can only afford a short outage right now. Sounds like there's some risk it might be ~10min to get things going again, and maybe more if for some reason I end up doing a clean install. So I'm taking precautions and will try it out tonight. Doesn't sound like a clean install will be necessary though, and I appreciate your insight.
-
For the record, firewall has been rock-solid since the upgrade to the -p1 hotfix.
Had I not been overly cautious, I could have power cycled it in less than a minute after seeing the blinking lights.
-
@sdd Just make backups before you do anything. You can always import it again.
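For anyone who prefers the shell over the GUI backup page, here is a rough sketch of a timestamped copy taken before the upgrade. `backup_config` is a made-up helper name, and `/conf/config.xml` as the live config path and `/root/config-backups` as the destination are my assumptions, so check both on your own unit first.

```shell
#!/bin/sh
# Sketch only: copy a config file to a timestamped backup before upgrading.
# backup_config is a hypothetical helper, not a pfSense command.
backup_config() {
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_dir/config-$stamp.xml"
    cp -p "$src" "$dest" || return 1
    # Compare the copy byte-for-byte with the source before trusting it.
    cmp -s "$src" "$dest" || { echo "backup mismatch: $dest" >&2; return 1; }
    # Print the backup path so the caller can note it down.
    echo "$dest"
}

# Example (assumed paths, run on the firewall):
#   backup_config /conf/config.xml /root/config-backups
```

Keeping a copy off the box as well is the safer habit, since a clean install from the USB image would wipe anything stored locally.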
-
I tried the upgrade tonight, and after more than an hour of working through issues, I went back to 2.4.5-p1.
Here's my experience:
- I performed the normal OTA upgrade from the web console.
- It rebooted to do the upgrade, and hung here:
mountroot: waiting for device /dev/diskid/DISK-10F88FE8s2a...
This is the "3 lights flashing" issue reported by others in this thread. Since the filesystem hadn't been mounted, I power cycled the device. It mounted on the next boot, and it finished up the update.
- After the update, the device wouldn't finish booting. The firewall crashed coming up, and then dropped to a regular login prompt:
Configuring firewall.Segmentation fault (core dumped)
Starting CRON... done.
>>> Removing unnecessary packages... done.
>>> Cleanup pkg cache... done.
Netgate pfSense Plus 21.02-RELEASE (Patch 1) arm Mon Feb 22 09:38:52 EST 2021
Bootup complete
FreeBSD/arm (pfSense.xxx.com) (ttyu0)
login:
I saw this previously with 21.02, but it was intermittent and only happened once or twice. With 21.02-p1 it happens consistently (3/3 tries). I could still get in with ssh, and it gave me the normal text menu.
- I did a fresh install from the USB image, and it came up just fine. So then I restored my config.
After doing this, I got one successful boot, and then the firewall started crashing again on every subsequent boot.
When I logged in after the one successful boot, there were about 7 warning notifications telling me about issues with some of my firewall rules. They were rules carried over from 2.4.5-p1, and were related to traffic shaping.
Unfortunately, I don't have time to re-build my firewall and traffic shaping from scratch, so I reverted to 2.4.5-p1 and moved on with my life. I'll check back in a few months once it's stabilized a bit more and try again.
I don't believe the firewall crash is related to the original issue presented here. On a different day I might be able to debug further, collect some logs, and file a separate issue. I work in tech on embedded devices and occasionally on IPv6 routing, but at the end of a long week it's not something I have the energy to do on a Friday night.
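If a serial console session was saved to a file (PuTTY logging, for example), a quick way to pull out the two failure signatures quoted in this post might look like the sketch below. `scan_console_log` is a hypothetical helper, and the two patterns are simply the strings from the console output above, not an official list of error messages.

```shell
#!/bin/sh
# Sketch only: list lines in a saved console capture that match the
# mountroot hang or the firewall segfault reported in this thread.
# scan_console_log is a hypothetical helper name.
scan_console_log() {
    log="$1"
    [ -r "$log" ] || { echo "cannot read: $log" >&2; return 2; }
    # -n prefixes each match with its line number; -E enables alternation.
    # Exit status is 0 when something matched and 1 when nothing did.
    grep -nE 'mountroot: waiting for device|Segmentation fault' "$log"
}

# Example (the file name is an assumption):
#   scan_console_log sg3100-console.log
```

The line numbers make it easier to quote the surrounding context when attaching the capture to a support ticket.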
On a minor note, I also wanted to call out that upgrading to 21.02-p1 automatically removes the hw.ncpu=1 workaround in /boot/loader.rc.local. I noticed it when I went to remove it. The upgrade overwrites the file with the following:
cat /boot/loader.rc.local
ubenv import
ubenv import boardpn
ubenv import boardrev
ubenv import boardsn
ubenv import eth1addr
ubenv import eth2addr
ubenv import ethaddr
Best regards
-
Hmm, thanks for documenting that.
The hw.ncpu=1 value should be in /boot/loader.conf.local. That is the file that is specifically carried across an upgrade.
Steve
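For anyone who still needs the workaround, a minimal sketch of adding the line to that file without duplicating it on repeat runs could look like this. `ensure_loader_line` is a hypothetical helper, not something shipped with pfSense, and the tunable is written exactly as it appears in this thread.

```shell
#!/bin/sh
# Sketch only: append a line to a loader config file if that exact line
# is not already there. ensure_loader_line is a hypothetical helper name.
ensure_loader_line() {
    file="$1"
    line="$2"
    touch "$file" || return 1
    # -x matches whole lines, -F treats the text literally, -q stays quiet.
    if ! grep -qxF -- "$line" "$file"; then
        # Add a newline first if the file does not already end with one.
        if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
            printf '\n' >> "$file"
        fi
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run on the firewall):
#   ensure_loader_line /boot/loader.conf.local 'hw.ncpu=1'
```

Running it twice leaves a single copy of the line, so it is safe to run again after an upgrade to confirm the setting is still present.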