Netgate Discussion Forum

    After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq

    9 Posts 2 Posters 490 Views
    • DaddyGo

      Re: pfSense as VM > Stopped at kdb_enter+0x32

      This is not a virtual machine but a bare-metal installation. We received a "Moderate Fault" log from the Cisco CIMC of one of our pfSense machines, indicating a RAID controller cache failure.

      We had a replacement RAID card on the shelf, with a 1GB cache module and exactly the same firmware.

      After a quick swap, pfSense starts with the error below and cannot be used. If I put the faulty RAID card back, it works without any problems, apart from the original cache fault, of course.

      Has anyone come across something similar?
      Is it possible to replace a RAID card without a full pfSense reinstall?

      30d816af-c954-4d2f-8e27-e6fd6a992d8c-image.png

      d73707a9-5e5d-450b-ab65-66ca8d8214ba-image.png

      Cats bury it so they can't see it!
      (You know what I mean if you have a cat)

      • stephenw10 (Netgate Administrator)

        This seems unrelated to the linked thread about Proxmox?

        MCA errors like that are almost exclusively a hardware issue.

        Steve

        • DaddyGo @stephenw10

          @stephenw10 said in After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq:

          This seems unrelated to the linked thread about Proxmox?

          Yes. As I wrote, this is not a virtual environment: "It is not a virtual machine it is a bare metal installation".

          What is identical in both cases (and why a Google search turned up that thread) is: kdb_enter+0x32: movq

          We clearly have a hardware error: the RAID card cache is degraded ("Moderate Fault").

          So we replaced it with a completely identical RAID controller, but pfSense won't start with it. I think pfSense should "survive" the replacement of a hardware component.

          I could replace the RAID controller and reinstall pfSense from scratch, but I wanted to avoid that because of the longer downtime.

          I was wondering if anyone had any ideas...

          PS:
          This pfSense install is currently running with the original (faulty) RAID controller, but something needs to be done, as it is already predicting other possible failures.

          • stephenw10 (Netgate Administrator)

            Right, but you're seeing that MCA error with the replacement card, as I understand it?

            Very occasionally you will see that when a software update exposes some hardware conflict that was always there but never hit.

            Here, though, if the hardware is identical, you know that shouldn't happen, so it pretty much has to be bad hardware.

            • DaddyGo @stephenw10

              @stephenw10 said in After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq:

              Right but you're seeing that MCA error with the replacement card as I understand it?

              I only get this error when I boot with the new (i.e. replacement) card.

              When I put the partially faulty card back (it is only a moderate fault, so it still works, just without cache), there is no MCA error and pfSense boots fine.

              +++edit:
              With the replacement card, the Cisco hardware check (in CIMC) runs and does not find the new RAID card faulty, but pfSense does not start.
              I should also mention that we have 3 sets of these spare cards (Cisco UCS 12G SAS, since we have several Cisco UCS-C2xxM4 servers) and all of them behave like this.

              • stephenw10 (Netgate Administrator)

                I would look for hardware revision differences or firmware differences then.

                There is nothing you can do in pfSense to correct that other than disabling the driver entirely so it never tries to access it.

                • DaddyGo @stephenw10

                  @stephenw10

                  Yes, I was afraid of that...

                  By the way, I compared the cards: all of them were made on the same day and their serial numbers are within a thousand of each other. I chose the one whose serial number differs from the original (faulty) card's by only about two hundred, so they might have been on the production line at the same time :)

                  My suspicion is this: the installation is configured so that the Cisco controller does RAID1 across two physical SAS disks (which is not good), and pfSense only sees the Cisco VD boot drive, on which it is set up with plain ZFS.

                  This is where something goes wrong: when a different RAID controller is inserted, pfSense can't handle it...
                  (Unfortunately, I inherited this setup from a colleague who no longer works with us.)

                  Since I have to reinstall pfSense anyway (this is now clear to me), I think I will skip the Cisco RAID1 and install the new pfSense with ZFS RAID on the two SAS disks.

                  One more question: this Cisco is running the CE version and we plan to switch to the paid version.
                  I haven't followed the updates here for a long time, so my question is: is a 2.7.2 config.xml compatible with 24.03? I would have to make this version switch now.

                  • stephenw10 (Netgate Administrator)

                    Yes, you can import a 2.7.2 config into 24.03.
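Before importing a backup into a fresh install, it can be reassuring to confirm that the file really is a pfSense config and to note which config revision it carries. A minimal sketch in Python, assuming the backup is the usual config.xml with a `<pfsense>` root element and a `<version>` child; the function name `read_config_version` and the sample values are hypothetical placeholders, not taken from a real system:

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_config_version(xml_text: str) -> Optional[str]:
    """Return the text of <version> from a pfSense config.xml, or None if absent."""
    root = ET.fromstring(xml_text)
    # Assumption: a pfSense backup has <pfsense> as its root element.
    if root.tag != "pfsense":
        raise ValueError("root element is %r, expected 'pfsense'" % root.tag)
    node = root.find("version")
    if node is None or node.text is None:
        return None
    return node.text.strip()


# Placeholder document for illustration only; the values are made up.
SAMPLE = """<pfsense>
  <version>0.0</version>
  <system>
    <hostname>fw-example</hostname>
  </system>
</pfsense>"""

if __name__ == "__main__":
    print(read_config_version(SAMPLE))
```

To check a real backup, read the file contents and pass them in place of `SAMPLE`. This only inspects the file; the import itself still happens through the installer or the GUI restore.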

                    • DaddyGo @stephenw10

                      @stephenw10

                      Thanks for the help and info

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.