Netgate Discussion Forum

After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq

Hardware
  • D
    DaddyGo
    last edited by Jul 2, 2024, 10:42 AM

    Re: pfSense as VM > Stopped at kdb_enter+0x32

    This is not a virtual machine; it is a bare metal installation. We received a "Moderate Fault" log from the Cisco CIMC of one of our pfSense machines, indicating a RAID controller cache failure.

    We had on the shelf a replacement RAID card with a 1GB cache module with exactly the same FW.

    After a quick replacement, pfSense starts with this error (below) and cannot be used. If I put the faulty RAID card back in, it works without any problems, though of course with the original fault.

    Has anyone come across something similar?
    Is it possible to replace a RAID card without a full pfSense reinstall?

    [Two attachments, not viewable without login]

    Cats bury it so they can't see it!
    (You know what I mean if you have a cat)

    • S
      stephenw10 Netgate Administrator
      last edited by Jul 2, 2024, 11:31 PM

      This seems unrelated to the linked thread about Proxmox?

      MCA errors like that are almost exclusively a hardware issue.

      Steve

      • D
        DaddyGo @stephenw10
        posted Jul 3, 2024, 1:24 PM, last edited by DaddyGo Jul 3, 2024, 1:26 PM

        @stephenw10 said in After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq:

        This seems unrelated to the linked thread about Proxmox?

        Yes, as I wrote, it is not a virtual environment: "It is not a virtual machine it is a bare metal installation"

        What is identical in both cases (and why a Google search turned up that thread) is: kdb_enter+0x32: movq

        We clearly have a HW error: the RAID card cache is degraded ("Moderate Fault").

        So we replaced it with a completely identical RAID controller, but pfSense won't start with it. I think pfSense should "survive" the replacement of a HW component.

        I could replace the RAID controller and reinstall the whole pfSense from scratch, but I wanted to avoid that, because of the longer downtime.

        I was wondering if anyone had any ideas...

        PS:
        Currently this pfSense install is running with the original (faulty) RAID controller, but something needs to be done, as it is already predicting other possible failures.

        • S
          stephenw10 Netgate Administrator
          last edited by Jul 3, 2024, 4:12 PM

          Right but you're seeing that MCA error with the replacement card as I understand it?

          Very occasionally you will see that when a software update exposes some hardware conflict that was always there but never hit.

          Here though if the hardware is identical you know that shouldn't happen so it pretty much has to be bad hardware.

          • D
            DaddyGo @stephenw10
            posted Jul 3, 2024, 4:19 PM, last edited by DaddyGo Jul 3, 2024, 4:26 PM

            @stephenw10 said in After replacing a failed (cache only) Cisco 12G SAS controller, the following: kdb_enter+0x32: movq:

            Right but you're seeing that MCA error with the replacement card as I understand it?

            I only get this error when I start with the new (i.e. replacement) card.

            When I put the partially faulty card back (it is only a moderate fault, so it still works, just without cache), there is no MCA error and pfSense boots fine.

            +++edit:
            With the replacement card, the Cisco HW check (in CIMC) runs and does not find the new RAID card faulty, but pfSense does not start.
            I should also mention that we have 3 sets of these spare cards (Cisco UCS 12G SAS, since we have several Cisco UCS-C2xxM4 servers) and all of them behave like this.

            • S
              stephenw10 Netgate Administrator
              last edited by Jul 3, 2024, 4:28 PM

              I would look for hardware revision differences or firmware differences then.

              There is nothing you can do in pfSense to correct that other than disabling the driver entirely so it never tries to access it.
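One way to make the revision and firmware comparison systematic is to copy the fields shown for each card in CIMC into two dictionaries and diff them. This is only a minimal sketch: the field names and values below are hypothetical placeholders, not actual CIMC output, and `diff_inventory` is an illustrative helper rather than part of any Cisco or pfSense tooling.

```python
# Sketch: diff two controller inventories copied by hand from CIMC.
# All field names and values here are hypothetical placeholders.

def diff_inventory(original: dict, replacement: dict) -> dict:
    """Return {field: (original_value, replacement_value)} for differing fields."""
    fields = sorted(set(original) | set(replacement))
    return {
        field: (original.get(field), replacement.get(field))
        for field in fields
        if original.get(field) != replacement.get(field)
    }

original_card = {
    "firmware_package": "FW-A",   # placeholder value
    "hardware_revision": "rev-1",  # placeholder value
    "boot_rom": "rom-1",           # placeholder value
}
replacement_card = {
    "firmware_package": "FW-A",
    "hardware_revision": "rev-2",
    "boot_rom": "rom-1",
}

for field, (old, new) in diff_inventory(original_card, replacement_card).items():
    print(f"{field}: original={old!r} replacement={new!r}")
```

A field that exists on only one card is reported with `None` on the other side, so an attribute missing from one inventory is not silently skipped.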

              • D
                DaddyGo @stephenw10
                last edited by Jul 3, 2024, 4:52 PM

                @stephenw10

                yes, I was afraid of that...

                By the way, I compared the cards: all of them were made on the same day and their serial numbers are within a thousand of each other. I chose the one whose serial number differs from the original (faulty) card's by only about two hundred, so they might have been on the production line at the same time :)

                Well, I suspect the cause is how this installation is configured: the Cisco controller does RAID1 across the two physical SAS disks (which is not good), and pfSense only sees the Cisco VD as its boot drive, configured for plain ZFS.

                This is where something goes wrong: when another RAID controller is inserted, pfSense can't handle it...
                (Unfortunately I inherited this setup from a colleague who no longer works with us.)

                Since I have to reinstall pfSense anyway (this is now clear to me), I think I will skip the Cisco RAID1 and install the new pfSense with ZFS RAID on the two SAS disks.

                One more question: this Cisco is running the CE version and we plan to switch to the paid version.
                I haven't followed the updates here for a long time, so my question is: is a 2.7.2 config.xml compatible with 24.03? I would have to make this version switch now.

                • S
                  stephenw10 Netgate Administrator
                  last edited by Jul 3, 2024, 5:13 PM

                  Yes you can import a 2.7.2 config into 24.03.
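Before importing, it can be useful to confirm which backup file you are holding. The sketch below reads the configuration revision from a config.xml backup, assuming the file has a root `<pfsense>` element with a `<version>` child. The helper name `config_revision` and the sample document are made up for illustration, and the revision stored there is a config schema number, which is not necessarily the same as the pfSense release number.

```python
import xml.etree.ElementTree as ET
from typing import Optional

def config_revision(xml_text: str) -> Optional[str]:
    """Return the text of <version> under the root <pfsense> element, if present."""
    root = ET.fromstring(xml_text)
    if root.tag != "pfsense":
        # Assumption: a pfSense backup uses <pfsense> as its root element.
        raise ValueError(f"unexpected root element: {root.tag!r}")
    node = root.find("version")
    if node is None or node.text is None:
        return None
    return node.text.strip()

# Hypothetical, heavily trimmed sample; "0.0" is a placeholder, not a real revision.
sample = """<?xml version="1.0"?>
<pfsense>
  <version>0.0</version>
  <system><hostname>fw</hostname></system>
</pfsense>"""

print(config_revision(sample))
```

For a real backup, read the file contents and pass them in, for example `config_revision(open("config.xml").read())`, where the path is whatever you named the downloaded backup.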

                  • D
                    DaddyGo @stephenw10
                    last edited by Jul 3, 2024, 5:15 PM

                    @stephenw10

                    Thanks for the help and info

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.