Netgate Discussion Forum

    BOOT problem ZFS / NVME SSD - nvme0: System interrupt issues?

    • stephenw10S Offline
      stephenw10 Netgate Administrator
      last edited by

      @ramup said in BOOT problem ZFS / NVME SSD - Root mount waiting for CAM:

      nvme0: <Generic NVMe Device> mem 0xa1600000-0xa1603fff at device 0.0 on pci2

      Do you still see that message in the boot log when it fails to mount root after rebooting?

      How cold is the cold boot? Actually full power cycle? Or shutdown from the gui and boot with the power button?

      • M Offline
        mer
        last edited by

        In my experience, when a warm reboot and a cold (power-cycle) reboot behave differently, the cause is usually a difference in device initialization. A warm reboot typically triggers a hardware reset signal that should be routed to every physical device (controller) to make it reset. Sometimes vendors don't do that, and you get problems.
        A cold reboot takes power away, and the software then reinitializes everything it needs to.

        The check that @stephenw10 asks for can tell you what isn't happening.
        This can be worked around if the software always reinitializes everything, but you don't always want that.

        • R Offline
          ramup @stephenw10
          last edited by ramup

          @stephenw10

          Thanks for your input.

          I just rebooted once more as a test, and yes, the messages are almost the same (and in the same order). However, I have now noticed (I wasn't aware of it before) that after these lines (relating to the boot disk):

          nvme0: <Generic NVMe Device> mem 0xa1600000-0xa1603fff at device 0.0 on pci2
          ....
          ZFS filesystem version: 5
          ZFS storage pool version: features support (5000)
          
          

          instead of this (as in a successful boot):

          Trying to mount root from zfs:pfSense/ROOT/default []...
          Root mount waiting for: CAM
          Root mount waiting for: CAM
          Root mount waiting for: CAM
          ...
          

          it now prints these lines, in this order:

          nvme0: System interrupt issues?
          Root mount waiting for: CAM
          Root mount waiting for: CAM
          Root mount waiting for: CAM
          .... (endless repeat)
          

          By cold boot I mean, for example: when the warm boot fails, I shut down the PC with a long press of the power button. When I turn the PC on again 10 seconds later, it boots fine (as in the example given in my first post).

          • R Offline
            ramup @mer
            last edited by ramup

            @mer

            Thanks for your input. In your experience, do messages like the ones I posted above appear when vendors do not implement the reset on warm reboots the way they should?

            • stephenw10S Offline
              stephenw10 Netgate Administrator
              last edited by stephenw10

              Oh it actually shows: nvme0: System interrupt issues? at the warm boot? That seems like a huge clue if so.
              https://man.freebsd.org/cgi/man.cgi?query=nvme&sektion=4&manpath=FreeBSD+15.0-CURRENT#DIAGNOSTICS

              Try booting verbose on the reboot to see if that gives you more. Interrupt the boot at the loader menu, then at the OK> prompt enter: boot -v

              Sometimes the extra delay caused by booting verbose can actually get you past whatever error is present. Which is inconvenient for troubleshooting!

              • M Offline
                mer @ramup
                last edited by

                @ramup Possibly: a reset line not physically routed to the right place. But again, sometimes this can be worked around by having software reinitialize the registers.

                • stephenw10S Offline
                  stephenw10 Netgate Administrator
                  last edited by

                  Mmm I could believe something missing in the adapter could present like this.

                  I'd try forcing legacy interrupts for nvme.

                  • R Offline
                    ramup @stephenw10
                    last edited by

                    @stephenw10
                    Thank you for pointing me to the FreeBSD man page. I did not even find it when searching Google for "system interrupt issues?".

                    Can you tell me what you technically mean by "try forcing legacy interrupts for nvme"?

                    @mer
                    Thank you also for your input, but I also do not understand technically how to do "software reinitializing registers"?

                    • R Offline
                      ramup @stephenw10
                      last edited by ramup

                      @stephenw10
                      Do you mean:

                      https://man.freebsd.org/cgi/man.cgi?nvme(4)

                      setting:

                      hw.nvme.force_intx=1

                      in /boot/loader.conf.local?

                      • stephenw10S Offline
                        stephenw10 Netgate Administrator
                        last edited by

                        Create the file /boot/loader.conf.local

                        Add the following line to it:
                        hw.nvme.force_intx=1

                        Reboot.

                        I've never had to try that so you may find you have to disable that again at the loader prompt if it fails to boot.
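
The steps above can be sketched as a small sh helper. This is only an illustration: the function name add_tunable is made up here, and it assumes a POSIX sh (start `sh` first if your login shell is tcsh) and root privileges on the firewall. It appends the line only if the file does not already contain it, so running it twice does no harm.

```shell
# Hypothetical helper (not from the thread): append a loader tunable to a
# file unless that exact line is already present.
add_tunable() {
    file="$1"
    line="$2"
    # Create the file if it does not exist yet.
    touch "$file" || return 1
    # -x: match whole lines, -F: treat the pattern as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage on the firewall, as root:
#   add_tunable /boot/loader.conf.local 'hw.nvme.force_intx=1'
#
# If the system then fails to boot, the value can be overridden for a single
# boot at the loader prompt (these are loader commands, not shell):
#   set hw.nvme.force_intx=0
#   boot
```

The override at the loader prompt is not persistent; the line in /boot/loader.conf.local still has to be removed or changed once the system is up.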

                        • M Offline
                          mer @ramup
                          last edited by

                          @ramup said in BOOT problem ZFS / NVME SSD - Root mount waiting for CAM:

                          Thank you also for your input, but I also do not understand technically how to do "software reinitializing registers"?

                          That would be at the OS/driver level. As the OS goes through the boot sequence, it detects a piece of hardware (RAM, USB, PCI, whatever), knows (because of magic) whether this is a warm or a cold boot, and decides whether or not to set registers. Values in sysctls can affect this behavior.
                          So this is not a "pfSense problem" but possibly a FreeBSD issue (pfSense is built on top of FreeBSD). Still, reporting it here, where you ran across it, lets the pfSense team figure out what to do.

                          • R Offline
                            ramup @stephenw10
                            last edited by

                            @stephenw10 + @mer

                            Thank you for your help and guidance. I will start by trying the extra boot command, and if that does not work I will look into ways to reinitialize registers via sysctl.

                            I will report back!

                            • R Offline
                              ramup @ramup
                              last edited by ramup

                              Feedback:

                              hw.nvme.force_intx=1

                              in

                              /boot/loader.conf.local

                              solved the issue!

                              Many thanks for your help!

                              I renamed the thread in case someone searches for it.
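
For anyone who wants to confirm the result after a warm reboot, a small sketch along these lines may help. The function name check_nvme_log is hypothetical. It only looks for the nvme(4) diagnostic quoted earlier in this thread in whatever log text is piped into it, and it assumes a POSIX sh.

```shell
# Hypothetical helper (not from the thread): read boot messages on stdin and
# report whether the nvme "System interrupt issues?" diagnostic is present.
check_nvme_log() {
    # In a basic regular expression the trailing "?" is a literal character.
    if grep -q 'nvme[0-9]*: System interrupt issues?'; then
        echo "interrupt-warning"
    else
        echo "clean"
    fi
}

# Usage on the running system:
#   dmesg | check_nvme_log
#   kenv hw.nvme.force_intx    # shows the value the loader passed to the kernel
```

A "clean" result only says the message is absent from the text that was piped in; it does not prove which interrupt mode the controller is using.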

                              • stephenw10S Offline
                                stephenw10 Netgate Administrator
                                last edited by

                                Nice, good result! 👍

                                • M Offline
                                  mer @stephenw10
                                  last edited by

                                  @stephenw10 I agree. @ramup, thanks for keeping everyone in the loop.

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.