Netgate Discussion Forum

    Upgrade to 2.1.2: Stuck on 2.1

    Problems Installing or Upgrading pfSense Software
    81 Posts 29 Posters 31.2k Views
    • Darkk

      @mjohnson:

      I'm in the same boat, with 40-plus devices scattered across 4 provinces. I haven't managed to get a single one to upgrade, except the 3 in our local offices that I performed a clean install on. The cards are 2G and 4G, and all remotes are currently on 2.0.3 or 2.1; I've attempted both auto upgrade and manual firmware uploads. I'm not sure what to do, since I don't want to risk chopping them up and losing remote connectivity. I don't own a helicopter to get around that quickly  ;D

      All ALIX boards. Errors in the upgrade log:

      fdisk: invalid fdisk partition table found
      bsdlabel: /dev/ad0s3: no valid label found
      bsdlabel: /dev/ad0s3: no valid label found
      bsdlabel: /dev/ad0s3: no valid label found
      tar: Failed to set default locale
      tar: Failed to set default locale
      shutdown: [pid 24752]

      I hear ya about all those remote devices and no time to travel there if there is a problem.  For me, if there is a risk of breaking it at a remote site, I would send them another firewall box and tell them to swap it in when time permits.  That way the new firewall is already working at the corporate office with the remote site's configuration file, and it should work fine when it reaches the remote site.  Then have them send the old one back to be redone for another site.

      I know it's a PITA.  Might be good for critical sites that can't go down for any period of time.  Hopefully soon we can get these upgrade issues sorted out.

      • jimp Rebel Alliance Developer Netgate

        I have a faulty image in my hands now, so hopefully I can track down a solution soon.

        The current theory is that it was already corrupt before the latest update and only started to show it now; once I have more time to experiment with the broken CF image, I'll know for sure.

        I'm certain we can come up with a fix, but it might be something scary, like using dd to write a good partition table to the start of the disk. That's not something I'd generally recommend. However, in theory all NanoBSD images of the same size should have the same partition layout, so it may be safe.

        In the meantime I'd like to find out if everyone involved here had 4GB NanoBSD images, or both 2GB and 4GB, or even more.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • me too

          Both 4GB CF for me.

          • mjohnson

            2 and 4GB for me.

            • RCS-Michael

              Jim,

              I have both 2G and 4G nanobsd images displaying this problem.

              Michael

              • jimp Rebel Alliance Developer Netgate

                For those of you that have an issue, show me the output of:

                fdisk -p /dev/ad0
                

                And note whether it's 2GB or 4GB.

                If you have a working system of the same size to compare against, show the output from it also.

                • mkomar

                  Not working: 2GB

                  /dev/ad0

                  g c3875 h16 s63
                  p 1 0xa5 63 1902033
                  a 1
                  p 2 0xa5 1902159 1902033
                  p 3 0xa5 3804192 102816

                  Working: 4GB

                  /dev/ad0

                  g c7751 h16 s63
                  p 1 0xa5 63 3844449
                  p 2 0xa5 3844575 3844449
                  a 2
                  p 3 0xa5 7689024 102816

                  Not Working: 2GB

                  /dev/ad0

                  g c3875 h16 s63
                  p 1 0xa5 63 1902033
                  p 2 0xa5 1902159 1902033
                  a 2
                  p 3 0xa5 3804192 102816

                  Working: 2GB

                  /dev/ad0

                  g c3897 h16 s63
                  p 1 0xa5 63 1902033
                  a 1
                  p 2 0xa5 1902159 1902033
                  p 3 0xa5 3804192 102816
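
One arithmetic check that can be run against dumps like these is to compare the sector count implied by the `g` line (cylinders × heads × sectors) with the end of the furthest slice (start + size). The sketch below is only an illustration, not a pfSense tool: `slice_fit` is a made-up name, it assumes the `fdisk -p` output was saved to a file, and it assumes the `g` line reflects the size the system sees for the card.

```shell
# POSIX sh sketch. slice_fit FILE: FILE holds saved "fdisk -p" output.
# Prints: <sectors implied by g line> <end of furthest slice> <fits|exceeds>
slice_fit() {
    awk '
        # Geometry line, e.g. "g c3875 h16 s63"
        $1 == "g" {
            c = substr($2, 2); h = substr($3, 2); s = substr($4, 2)
            total = c * h * s
        }
        # Slice line, e.g. "p 1 0xa5 63 1902033" (number, type, start, size)
        $1 == "p" {
            end = $4 + $5
            if (end > last) last = end
        }
        END {
            printf "%d %d %s\n", total, last, (last <= total ? "fits" : "exceeds")
        }
    ' "$1"
}
```

Fed the dumps posted above, this prints `3906000 3907008 exceeds` for the two 2GB cards reporting c3875, `3928176 3907008 fits` for the working 2GB card reporting c3897, and `7813008 7791840 fits` for the working 4GB card. That is an observation about these numbers only; it does not by itself establish the cause of the failed upgrades.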

                  • jimp Rebel Alliance Developer Netgate

                    Those last two are interesting in that they're nearly identical and one works and the other doesn't. I expect some variation as we have, over time, slightly shrunk the NanoBSD slice sizes, but that is a bit curious.

                    The .img file I read from the CF with the "corrupt" table appears to be OK, despite the CF showing a damaged table. So I'm left to wonder if there may be some other CF-related factor at play.

                    The following commands could be dangerous, so if you choose to attempt them, proceed with extreme caution. I tested these on my own ALIX with a good MBR and it survived, but there are no guarantees. Try one method at a time, and move on to the next one only if the previous one doesn't help.

                    Method #1: Rewrite the MBR+Partition table with dd

                    sysctl kern.geom.debugflags=16
                    dd if=/dev/ad0 of=/tmp/mbr_part_bkup.img bs=512 count=1
                    dd of=/dev/ad0 if=/tmp/mbr_part_bkup.img bs=512 count=1
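
Before writing the saved sector back, it may be worth confirming that the backup is what it should be: exactly one 512-byte sector ending in the 0x55 0xAA boot signature. A minimal sketch of that check follows; `mbr_backup_ok` is a hypothetical helper, not something shipped with pfSense, and a valid signature says nothing about whether the table inside is sane.

```shell
# POSIX sh sketch. mbr_backup_ok FILE: succeed only if FILE is exactly
# 512 bytes long and its last two bytes are the MBR signature 0x55 0xAA.
mbr_backup_ok() {
    [ -f "$1" ] || return 1
    # wc pads its output with spaces on some systems, so strip them
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -eq 512 ] || return 1
    # Read the 2 bytes at offset 510 as hex, without the address column
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d '[:space:]')
    [ "$sig" = "55aa" ]
}

# Example, after the backup dd has run:
#   mbr_backup_ok /tmp/mbr_part_bkup.img && echo "backup looks like an MBR"
```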
                    
                    

                    Method #2: Have fdisk reset the partition table:

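                    # Allow writes to the disk while it is in use
                    sysctl kern.geom.debugflags=16
                    # Dump the current slice table in fdisk's config-file format
                    fdisk -p /dev/ad0 > /tmp/fdisk_bkup.txt
                    # Re-initialize the table from that dump
                    fdisk -if /tmp/fdisk_bkup.txt /dev/ad0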
                    
                    

                    Method #3: Take a "working" fdisk output and rewrite using it. I can't stress enough that you must make sure the partition boundaries line up. Don't grab the fdisk output from a differently sized card:

                    
                    # Get the "fdisk -p" output from a similar but working CF, save it in /tmp/fdisk_bkup.txt
                    sysctl kern.geom.debugflags=16
                    fdisk -if /tmp/fdisk_bkup.txt /dev/ad0
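
One way to check that the boundaries line up before using a donor table is to compare the two saved `fdisk -p` dumps line by line. The sketch below compares the `p` lines (slice number, type, start, size) and the `g` line separately, and ignores the `a` line, since the dumps in this thread show the active slice differing between otherwise identical cards. `compare_dumps` and both file arguments are hypothetical names used for illustration only.

```shell
# POSIX sh sketch. compare_dumps DONOR TARGET: both files hold saved
# "fdisk -p" output. Reports whether slice lines and geometry match.
compare_dumps() {
    donor_p=$(awk '$1 == "p" { print $2, $3, $4, $5 }' "$1")
    target_p=$(awk '$1 == "p" { print $2, $3, $4, $5 }' "$2")
    donor_g=$(awk '$1 == "g" { print $2, $3, $4 }' "$1")
    target_g=$(awk '$1 == "g" { print $2, $3, $4 }' "$2")

    # Refuse to call two empty dumps "the same"
    [ -n "$donor_p" ] || { echo "no slice lines found in $1" >&2; return 2; }

    [ "$donor_p" = "$target_p" ] && slices=same || slices=differ
    [ "$donor_g" = "$target_g" ] && geometry=same || geometry=differ
    echo "slices=$slices geometry=$geometry"
}
```

For the two 2GB dumps posted earlier in the thread (c3875 and c3897), this prints `slices=same geometry=differ`. A result of `slices=differ` would mean the donor dump should not be used on that card.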
                    
                    

                    After any of those, chances are that no commands will work to reboot the unit, so either pull the power or run the following to force a panic+reboot:

                    sysctl debug.debugger_on_panic=0
                    sysctl debug.kdb.panic=1
                    

                    After that has completed, try the upgrade once again.

                    Obviously that isn't something you'd want to try on a remote unit.

                    • mkomar

                      All of mine are in remote locations and in production. Can't risk taking them down.

                      I'll be swapping them out with upgraded (Software) replacements in the next week or so.

                      Is there value in trying these fixes after that?

                      • jimp Rebel Alliance Developer Netgate

                        @mkomar:

                        All of mine are in remote locations and in production. Can't risk taking them down.

                        I'll be swapping them out with upgraded (Software) replacements in the next week or so.

                        Is there value in trying these fixes after that?

                        It would still help to know if any of the above methods would correct the faulty partition table, so that others can benefit from the knowledge.

                        • mkomar

                          I should have a couple of those units on hand next week. I'll give it a shot and report back once I've done so.

                          • mkomar

                            JimP - I should have my hands on one or two of the malfunctioning units in the next few days. I'd be happy to try the various fixes you have proposed. If it would be of more value, I could instead get you serial access to one or both of them, or send you the CF cards or dd image dumps.

                            Would any of the options work better than others as far as getting a 'known good' fix out there?

                            • jimp Rebel Alliance Developer Netgate

                              From my post a few entries up ( https://forum.pfsense.org/index.php?topic=75069.msg413219#msg413219 ) – I listed them in order of preference (and likely destructive potential!)

                              So try them in that order, method #1, then #2, then #3 only if both 1 and 2 fail.

                              • mkomar

                                [2.1-RELEASE][root@pfsense]/root(1): sysctl kern.geom.debugflags=16
                                kern.geom.debugflags: 0 -> 16
                                [2.1-RELEASE][root@pfsense]/root(2): d if=/dev/ad0 of=/tmp/mbr_part_bkup.img bs=512 count=1
                                dd of=/dev/ad0 if=/tmp/mbr_part_bkup.img bs=512 count=1dd if=/dev/ad0 of=/tmp/mbr_part_bkup.img bs=512 count=1
                                1+0 records in
                                1+0 records out
                                512 bytes transferred in 0.000771 secs (664033 bytes/sec)
                                [2.1-RELEASE][root@pfsense]/root(3): dd of=/dev/ad0 if=/tmp/mbr_part_bkup.img bs=512 count=1
                                1+0 records in
                                GEOM_PART: integrity check failed (ad0, MBR)

                                512 bytes transferred in 0.024720 secs (20712 bytes/sec)
                                GEOM: ad0s1: media size does not match label.
                                GEOM: ad0s2: media size does not match label.
                                [2.1-RELEASE][root@pfsense]/root(4):

                                • pmiller

                                  I also have the problem in this post and have been following it closely.  I am particularly interested in an in-place fix since I don't have an extra CF card or reader to re-flash.

                                  mkomar - in the third line of your post shouldn't there be 2 d's instead of one for the 'dd' command?  Maybe a copy/paste error?

                                  • MattMeyer

                                    Adding my experience to the thread.  I am also having the same issue.  I am using a SanDisk Extreme 4 GB CF.  This is the output for fdisk -p /dev/ad0:

                                    /dev/ad0

                                    g c7751 h16 s63
                                    p 1 0xa5 63 3861585
                                    a 1
                                    p 2 0xa5 3861711 3861585
                                    p 3 0xa5 7723296 102816

                                    I have tried recreating the MBR using both the dd method and the fdisk method.  Neither led to a successful upgrade.  I do not have a working device to copy the MBR from, so option #3 is out.

                                    Additionally, trying to change the boot slice does not "stick".

                                    • mkomar

                                      @pmiller:

                                      mkomar - in the third line of your post shouldn't there be 2 d's instead of one for the 'dd' command?  Maybe a copy/paste error?

                                      Must have been dropped somehow. If it wasn't 'dd' then we would see an error instead of the command output.

                                      I've still got a 'broken' device standing by if someone is interested in finding a reliable on-line fix. In the meantime, I've just done a config backup/restore and replaced the production units.

                                      • gazoo

                                        Since 2.1.3 came out, has this solved any of these problems for anyone? I was going to go for it, but I don't know if that's a good idea - wanted to see what others have seen for 2.1.3

                                        I'm on NanoBSD, 4G, on a Netgate ALIX.

                                        • trunix

                                          jimp, I've got the same problem on a 4gb CF. Output of fdisk -p /dev/ad0:

                                          /dev/ad0

                                          g c7745 h16 s63
                                          p 1 0xa5 63 3854529
                                          p 2 0xa5 3854655 3854529
                                          a 2
                                          p 3 0xa5 7709184 102816

                                          I've tried method #1 and #2, but neither worked. The output of fdisk -if /tmp/fdisk_bkup.txt /dev/ad0 from method #2 is below in case it's notable. I didn't get any errors from method #1, the system just booted back into 2.1 on the same slice. The same thing happened after method #2. I'm also not able to switch the bootup slice for whatever reason.

                                          fdisk: WARNING line 2: number of cylinders (7745) may be out-of-range
                                              (must be within 1-1024 for normal BIOS operation, unless the entire disk
                                              is dedicated to FreeBSD)
                                          ******* Working on device /dev/ad0 *******

                                          This system and CF card have been in stable operation for a while now, and I've successfully installed all the updates from 2.0.1 to 2.1. I never got a chance to install 2.1.1; I've had similar problems attempting to install 2.1.2.

                                          • cmb

                                            @gazoo:

                                            Since 2.1.3 came out, has this solved any of these problems for anyone? I was going to go for it, but I don't know if that's a good idea - wanted to see what others have seen for 2.1.3

                                            Nothing related to this in particular has changed. The worst that'll happen is it'll reboot back on the same version so there isn't any harm in trying. The vast majority can upgrade just fine, so there's no reason to not try, as it'll more than likely work for you.

                                            For those who have an ALIX (or anything else with 256 MB RAM and nano): there is a failure mode unrelated to this thread, where the upgrade fails because the box runs out of RAM while upgrading. If you're running more than you reasonably should be on a box with 256 MB RAM, disabling some services before upgrading (especially OpenVPN, if you're running multiple instances) will allow the upgrade to succeed.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.