3hrs +? "Packages are currently being reinstalled in the background."



  • Two i386 pfsense boxes, pfsynced, doing great on the last stable release.

    Did the upgrade to the latest stable release.  Same everything.  Autoupgrade, download went fine, rebooted, and it appears basic connectivity is restored since I'm able to post here.

    However, on the gui, 3+ hours later, on both machines it still says:

    "Packages are currently being reinstalled in the background."  Don't make any changes, etc.

    So, you know, how long is too long?

    I'm noticing the postfix forwarder is the last item mentioned in the log as being reinstalled, and also that service doesn't appear on the services list on the 'master' pf box, while it is installed on the backup one.  The backup box is getting messages every minute about sql and data flushes from the main box.  So, I'm thinking the postfix forwarder reinstall just didn't exit on the main box?  Or?  The logs show postfix processing incoming email on gate1, so maybe that's ok.

    Diag hints?  Ideas?



  • Make that 4hrs… still reloading those packages in the background, no gui config changes possible.

    Any hints?  No obvious clues.  I hesitate to just reboot and manually reinstall each package; I'm not even sure it will let me into that part of the system.  ssh does work.


  • Rebel Alliance Developer Netgate

    Check the console (VGA, or serial) and see what it's displaying.

    If a package reinstall did not go correctly it can end up in that state. You can clear the message by going to Diagnostics > Backup/Restore and hitting the button there to reset the package lock.



  • Good tip, the vga console.  There is this result:

    Both systems were extracting postfix, which succeeded.  Then the extraction proceeded to cyrus-sasl, which succeeded.

    The last thing printed on each was

    libspf2  (extracting)

    and nothing more.  Should I ^C on the console?  Or follow the above advice?  Reboot? Reinstall the packages?


  • Rebel Alliance Developer Netgate

    I would try ctrl-c first, then if that doesn't work, depending on what else shows up, it should be safe enough to just use the package lock clear button.

    You'd definitely need to reinstall whatever package was using postfix though (mailscanner?) - I think you're the second person to mention an issue reinstalling it, so the package maintainer may need to have a look at the install process.



  • So, ^C asked me which shell I wanted, and I hit enter.  Nothing happened.  I hit enter again, and it went on to boot. As it started into postfix, a whole bunch of HTML appeared on the console, apparently to the effect of reinstalling postfix. It would stall from time to time until I hit enter, then spew some more.

    Upon reboot, all configs lost with the complaint that system.inc was corrupt.  Dumping it, it appears to have been truncated, it ends in the middle of a statement.

    Complete loss of all functions, no interfaces configured.  Attempting to 'restart web configurator' yields 'unexpected $end in system.inc, line 1029', so no webconfigurator either.  Trying to find a source for system.inc and ssh it in if I can.

    What a hairball.   Is there a better way without losing everything?

    P.S. And, of course, github is down, so I'm on the hunt for a system.inc for 2.02


  • Rebel Alliance Developer Netgate

    If github were up right now, and you have external connectivity, you could fetch a raw copy of system.inc right from there directly.

    As it is, the best way to get back from that would be to reapply the firmware update.

    No idea how the package reinstall would have trashed system.inc though. Any chance your drive is failing?



  • No chance of reapplying anything, since all the scripts depend on system.inc.  No interfaces configured, no off-box connectivity of any sort.  But I found a system.inc on redmine, releng 2_, and it seemed to differ from the one on the box only by the missing bit and one line at the top.  Wrote it onto a flash drive, walked it over there, mounted it, copied it, and … voilà:
      up and seemingly all good.

    And after having copied system.inc and rebooting, something truncated system.inc to exactly 32 KB again!
    SmartD short test reports no errors on the drive, drive log reports no errors.
    fsck didn't run at boot, and there have been no problems these many months.  fsck at single user boot reports clean.
    copied system.inc again, but didn't try to reinstall postfix.  So far, so good.

    What can truncate system.inc to 32K?
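
A quick way to tell whether the file has been cut off again is to look at its byte count. The sketch below assumes "exactly 32Kb" means 32768 bytes; `check_size` is a hypothetical helper name, not something that ships with pfSense.

```shell
#!/bin/sh
# Sketch: report a file's size and flag the 32 KiB truncation described above.
# check_size is a hypothetical helper; 32768 is an assumed reading of "32Kb".
check_size() {
    file="$1"
    # wc -c counts bytes; tr strips the padding some wc versions add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -eq 32768 ]; then
        echo "$file: $size bytes (exactly 32 KiB, suspicious)"
        return 1
    fi
    echo "$file: $size bytes"
    return 0
}
```

It could be run as `check_size /etc/inc/system.inc` before and after a reboot. If the php command-line binary is available on the box, `php -l /etc/inc/system.inc` should also catch a file that ends mid-statement, since it only checks syntax.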


  • Rebel Alliance Developer Netgate

    Nothing I know of, except a drive that isn't actually syncing to disk (like a CF that has worn out and will no longer write). The 32k value is pretty sketchy. I'd say that it's being cached somewhere initially and then when the cache is flushed to disk by some subsequent i/o it starts showing you what is really on the disk.

    Try this:

    dd if=/dev/zero of=/etc/inc/system.inc bs=1M count=1
    rm /etc/inc/system.inc
    cp /path/to/good/copy/system.inc /etc/inc/system.inc
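
To confirm the restored copy actually matches the good one, something along these lines might help. It is only a sketch: `restore_and_verify` is a hypothetical helper name, and a comparison made right after the copy may still be answered from cache, so the check that matters is the one repeated after a reboot.

```shell
#!/bin/sh
# Sketch: copy a known-good file into place, flush buffers, and compare
# byte-for-byte. restore_and_verify is a hypothetical helper name.
restore_and_verify() {
    good="$1"
    target="$2"
    cp "$good" "$target" || return 1
    # Ask the kernel to flush pending writes to disk.
    sync
    # cmp -s is silent and returns 0 only if the files are identical.
    if cmp -s "$good" "$target"; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

For example: `restore_and_verify /path/to/good/copy/system.inc /etc/inc/system.inc`, then run `cmp /path/to/good/copy/system.inc /etc/inc/system.inc` again once the box has rebooted.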
    
    


  • Update:  Seems the right thing to have done during the initial update during the hang at the libspf2 extraction was not to ^C on the console when the postfix reinstall happened, but to hit the return key.  It then just marches on apparently correctly.

    No clue if it all really works at the detail level, but when I reapplied the update and hit the return at the hang instead of the ^C all seems to proceed correctly.

    That means: don't try to upgrade with the postfix package installed unless you have physical access to the pfsense box's console and keyboard.

    No idea what clobbered system.inc



  • I think you're the second person to mention an issue reinstalling it…

    The first would have been me yesterday.

    After going through the clean install and restore I was running very late and just dropped a post here (http://forum.pfsense.org/index.php/topic,57024.msg304361.html#msg304361).

    As you can see, it looks like we had the same problem with libspf2.  I also had a problem with pfblocker but didn't get a screenshot of that and don't remember the context now.  I use postfix without any related packages.

    Of course, what I should have done was an ESXi snapshot before I started the update.



  • Followup:  Went to the second of the two identical PF boxes, it was still hung at the libspf2 extraction.  I hooked up a keyboard and a monitor, hit the return key, and off everything went, apparently normally.  Checked the system.inc file, it appears normal.

    Something that happens related to the postfix forwarder install just sits waiting for stdin, apparently.
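
The behaviour described here, a step that blocks until someone presses return, can be reproduced with a small stand-in. This is a sketch only: `prompting_step` is a made-up function, not the actual postfix or libspf2 install code, and nothing in this thread confirms which command was really waiting.

```shell
#!/bin/sh
# Stand-in for an install step that prompts; NOT the real pfSense package code.
prompting_step() {
    printf 'Continue? '
    # read blocks until a newline arrives on stdin (or stdin hits end-of-file).
    read -r answer || true
    echo "continuing"
}

# An empty line on stdin plays the role of the return key on the console.
echo "" | prompting_step

# With stdin at /dev/null, read sees end-of-file at once instead of hanging.
prompting_step < /dev/null
```

Run on its own terminal without either redirection, the function would sit at the prompt until return is pressed, which matches what was seen on the console.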



  • @hcoin:

    Good tip, the vga console.  There is this result:

    Both systems were extracting postfix, which succeeded.  Then the extraction proceeded to cyrus-sasl, which succeeded.

    The last thing printed on each was

    libspf2  (extracting)

    and nothing more.  Should I ^C on the console?  Or follow the above advice?  Reboot? Reinstall the packages?

    I had this happen on a system as well, just had to press <enter> on the keyboard attached to the machine… weird.

    Jim? Maybe a bug in the scripts?


  • Rebel Alliance Developer Netgate

    No bug in the scripts that I know of.

    If one of the packages is prompting for an answer on install, it may be waiting for input. They don't normally do that, but it's possible one of the packages being pulled in during this install is doing that. It's not unheard of for FreeBSD packages to do that (I know some of the Java packages used to). Not sure why it would only happen during a reinstall/upgrade though.



  • That's good to know; I will keep that in mind in the future.

