Netgate Discussion Forum

    Need driver for hardware watchdog

    2.0-RC Snapshot Feedback and Problems - RETIRED
    • MrKoen

      Thanks. I'm glad to be able to help others with this great product. And do you know what the funny thing about all of this is? Since I got the watchdog up and running, my pfSense installation hasn't crashed or frozen for a whole day already! It used to crash or freeze at least 5 times a day before that. I can't explain it, but I'm very happy with it ;)

      • Cino

        Really nice write-up!! I may have to try the 64-bit version one of these days :-) I just need one more driver and I'll be good to go.

        Good to hear that your box is stable now

        • MrKoen

          I want to stress once more that following the steps to enable the watchdog really has made my pfSense system rock solid. It has been running for a week already without a single crash or hang. I did a reinstall on a new disk last weekend and the crashes and hangs started right away again. After applying the watchdog driver again, the system is solid and stable again. Strange, but true. So if you're using the SuperMicro X7SPA-HF-D525 or an equivalent board and experience an unstable pfSense installation, apply these easy steps.

          • stephenw10, Netgate Administrator

            Weird!
            It sounds like perhaps the ICH is not being set up correctly by either the BIOS or the standard ICH driver.
            Loading the ichwd driver sets the registers to known values, preventing some unstable condition.
            Speculation.  ;)

            Steve

            • dzeanah

              Nice write-up.

              Does anyone know if this same procedure will work with a Netgate hamakua?

              • stephenw10, Netgate Administrator

                It should work on anything that has an Intel ICH (I/O Controller Hub), unless the manufacturer has disabled it.
                See the ichwd(4) man page.

                Steve

                • _igor_

                  stephenw10: thanks for your tutorial, but it seems it doesn't work when your hard disks are configured as AHCI: ichwd.ko does not load or work as expected. So I searched and found a reasonable way to solve that problem:

                  First of all I installed freeipmi:

                  pkg_add -r freeipmi
                  

                  That installs a bunch of nice tools for monitoring your hardware: sensors, voltages and so on. Only one program is of interest here: bmc-watchdog.

                  Next I created a shell script to start the bmc-watchdog; I called it "watchdog.sh":

                  #!/bin/sh
                  # /usr/local/etc/rc.d/watchdog.sh
                  # First stop the watchdog timer started by the BIOS:
                  /usr/local/sbin/bmc-watchdog -y
                  # Next start the watchdog with new settings:
                  # -d     run bmc-watchdog as a daemon
                  # -i xx  initial countdown in seconds; the board is reset when it expires
                  # -e xx  interval in seconds at which the daemon resets the countdown
                  /usr/local/sbin/bmc-watchdog -d -i 16 -e 10
                  
                  # alternative which should run too,
                  # but not in conjunction with bmc-watchdog
                  # not tested in my case :)
                  # /etc/rc.d/watchdogd forcestart
                  
                  

                  After creating watchdog.sh, make it executable:

                  chmod a+x /usr/local/etc/rc.d/watchdog.sh
                  

                  Now we configure freeipmi. freeipmi.conf is located in /usr/local/etc/freeipmi/.
                  The file is rather big but easy to understand. All options are commented out, so we have to enable a few things.

                  The only relevant part is this one:

                  #####################################################################################################
                  #
                  # BMC-WATCHDOG OPTIONS
                  #
                  # The following options are specific to bmc-watchdog(8).  They will be
                  # ignored by other tools.
                  #
                  # bmc-watchdog-workaround-flags workaround1,workaround2,workaround3
                  
                  bmc-watchdog-logfile /var/log/watchdog.log
                  #
                  # bmc-watchdog-no-logging DISABLE
                  #
                  #####################################################################################################
                  

                  You can test your script by running it:

                  /usr/local/etc/rc.d/watchdog.sh
                  

                  Now get the info via "bmc-watchdog -g". It will output something like this:

                  Timer Use:                  SMS/OS
                  Timer:                      Stopped
                  Logging:                    Enabled
                  Timeout Action:              None
                  Pre-Timeout Interrupt:      None
                  Pre-Timeout Interval:        0 seconds
                  Timer Use BIOS FRB2 Flag:    Clear
                  Timer Use BIOS POST Flag:    Clear
                  Timer Use BIOS OS Load Flag: Clear
                  Timer Use BIOS SMS/OS Flag:  Set
                  Timer Use BIOS OEM Flag:    Clear
                  Initial Countdown:          16 seconds
                  Current Countdown:          16 seconds

                  Next, reboot your pfSense box and enter the BIOS setup.

                  Go to "Advanced", "IPMI Configuration" and enable the watchdog there, but ONLY there!
                  Set the action to power cycle or hard reset, and the timeout to 5 min.

                  That setting gives pfSense enough time to get up and running after a restart. Even when a file system check runs, it is enough time!
                  (Tested with a geom_mirror after a hard reset; the disks are 250 GB.)

                  After rebooting you can check the result:

                  SSH into your pfSense box and see what "bmc-watchdog -g" outputs:

                  Timer Use:                  BIOS POST
                  Timer:                      Running
                  Logging:                    Enabled
                  Timeout Action:              Power Cycle
                  Pre-Timeout Interrupt:      None
                  Pre-Timeout Interval:        0 seconds
                  Timer Use BIOS FRB2 Flag:    Clear
                  Timer Use BIOS POST Flag:    Clear
                  Timer Use BIOS OS Load Flag: Clear
                  Timer Use BIOS SMS/OS Flag:  Set
                  Timer Use BIOS OEM Flag:    Clear
                  Initial Countdown:          300 seconds
                  Current Countdown:          93 seconds

                  When pfSense has started completely (after the little startup sound), check again:

                  bmc-watchdog -g
                  Timer Use:                  BIOS POST
                  Timer:                      Running
                  Logging:                    Enabled
                  Timeout Action:              Power Cycle
                  Pre-Timeout Interrupt:      None
                  Pre-Timeout Interval:        0 seconds
                  Timer Use BIOS FRB2 Flag:    Clear
                  Timer Use BIOS POST Flag:    Clear
                  Timer Use BIOS OS Load Flag: Clear
                  Timer Use BIOS SMS/OS Flag:  Set
                  Timer Use BIOS OEM Flag:    Clear
                  Initial Countdown:          16 seconds
                  Current Countdown:          15 seconds

                  Et voilà, everything is running as expected!
                  Test it with "killall -9 bmc-watchdog".
                  The countdown runs down to 0 and your pfSense box reboots.
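For anyone who wants to watch the countdown from a script rather than by eye, here is a minimal sketch. The function name watchdog_status is made up for illustration, and it assumes the "Label: value" layout of the "bmc-watchdog -g" output shown above; other FreeIPMI versions may print something different.

```shell
#!/bin/sh
# Hypothetical helper (not part of freeipmi): reads `bmc-watchdog -g`
# output on stdin and prints "<timer state> <current countdown in seconds>".
# Assumes the "Label:   value" layout shown in this thread.
watchdog_status() {
    awk -F': *' '
        # Trim leading and trailing whitespace so the labels match exactly.
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }
        $1 == "Timer"             { state = $2 }
        $1 == "Current Countdown" { split($2, a, " "); secs = a[1] }
        END { print state, secs }
    '
}
```

It would be used as "bmc-watchdog -g | watchdog_status". After killing the daemon, repeated calls should show the second number falling toward 0.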

                  Looking at your watchdog.log you see the timer-resets:

                  [May 11 19:47:25]: BMC-Watchdog Timer Reset
                  [May 11 19:47:35]: BMC-Watchdog Timer Reset
                  [May 11 19:47:45]: BMC-Watchdog Timer Reset
                  [May 11 19:47:55]: BMC-Watchdog Timer Reset
                  [May 11 19:48:05]: BMC-Watchdog Timer Reset
                  [May 11 19:48:15]: BMC-Watchdog Timer Reset
                  [May 11 19:48:25]: BMC-Watchdog Timer Reset
                  [May 11 19:48:35]: BMC-Watchdog Timer Reset

                  So loading "ichwd.ko" is no longer necessary!

                  Edit (I forgot to mention this): please create a new file in /boot:

                  /boot/loader.conf.local with the following setting in it:

                  ipmi_load="YES"

                  If you put that setting in loader.conf instead, it will be lost with the next firmware update.
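If you script this step, it helps to add the line only when it is not already present, so that repeated runs don't pile up duplicates. A minimal sketch follows; the function name ensure_line is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already in it. Creates the file if it does not exist.
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For the setting above the call would be: ensure_line /boot/loader.conf.local 'ipmi_load="YES"'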

                  • stephenw10, Netgate Administrator

                    That's good stuff. What other treasures does IPMI hold?  ;D

                    Thanks should go to Koen Zomers for taking the time to write up the procedure. Your write-up seems equally comprehensive. Two options for using a watchdog are better than one.

                    Steve

                    • _igor_

                      One other fine thing is this, on a Supermicro X7SPA-HF-D525:

                      ipmi-sensors
                      4: System Temp (Temperature): 37.00 C (-7.00/77.00): [OK]
                      71: CPU Temp (Temperature): 33.00 C (-8.00/90.00): [OK]
                      138: CPU FAN (Fan): 4840.00 RPM (585.00/29815.00): [OK]
                      205: SYS FAN (Fan): 5025.00 RPM (585.00/29815.00): [OK]
                      272: CPU Vcore (Voltage): 1.14 V (0.66/1.41): [OK]
                      339: Vichcore (Voltage): 1.04 V (0.82/1.18): [OK]
                      406: +3.3VCC (Voltage): 3.26 V (2.88/3.65): [OK]
                      473: VDIMM (Voltage): 1.53 V (1.33/1.66): [OK]
                      540: +5 V (Voltage): 4.96 V (4.32/5.60): [OK]
                      607: +12 V (Voltage): 12.30 V (10.60/13.20): [OK]
                      674: +3.3VSB (Voltage): 3.26 V (2.88/3.65): [OK]
                      741: VBAT (Voltage): 3.17 V (2.62/3.39): [OK]
                      808: Chassis Intru (Platform Chassis Intrusion): [OK]
                      875: PS Status (Power Supply): [Presence detected][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State]
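To pull a single reading out of that listing, for a cron check or a widget for example, something along these lines might do. The function name sensor_value is invented here, and the parsing assumes the colon-separated "ID: Name (Type): value unit (min/max): [status]" layout shown above; other FreeIPMI versions may format the output differently.

```shell
#!/bin/sh
# Hypothetical helper: reads ipmi-sensors output on stdin and prints the
# numeric reading of the sensor whose name is given as the first argument.
# Assumes lines like "71: CPU Temp (Temperature): 33.00 C (-8.00/90.00): [OK]".
sensor_value() {
    awk -v name="$1" -F': ' '
        # $2 is "Name (Type)"; match on the name followed by " (".
        index($2, name " (") == 1 { split($3, v, " "); print v[1] }
    '
}
```

Usage would be: ipmi-sensors | sensor_value "CPU Temp"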

                      • Cino

                        igor, great find!! It took me a while to get this to work: I had added ipmi.ko to my loader.conf.local a while back. I removed it and all is working now on my X7SPA-D510-HF. ipmi-sensors is a great add-on!!

                        
                        ipmi-sensors
                        4: System Temp (Temperature): 54.00 C (-7.00/77.00): [OK]
                        71: CPU Temp (Temperature): 61.00 C (-8.00/90.00): [OK]
                        138: CPU FAN (Fan): NA (NA/NA): [Unknown]
                        205: SYS FAN (Fan): 4840.00 RPM (585.00/29815.00): [OK]
                        272: CPU Vcore (Voltage): 1.16 V (0.66/1.41): [OK]
                        339: Vichcore (Voltage): 1.04 V (0.82/1.18): [OK]
                        406: +3.3VCC (Voltage): 3.26 V (2.88/3.65): [OK]
                        473: VDIMM (Voltage): 1.83 V (1.48/1.99): [OK]
                        540: +5 V (Voltage): 4.96 V (4.32/5.60): [OK]
                        607: +12 V (Voltage): 12.29 V (10.50/13.06): [OK]
                        674: +3.3VSB (Voltage): 3.26 V (2.88/3.65): [OK]
                        741: VBAT (Voltage): 3.15 V (2.62/3.39): [OK]
                        808: Chassis Intru (Platform Chassis Intrusion): [OK]
                        875: PS Status (Power Supply): [Presence detected][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State][Unrecognized State]
                        
                        

                        By the way, I have both watchdogd and bmc-watchdog running on my box. They both work, but I need to play with the options for bmc-watchdog. Watchdogd does a hard reset, while bmc-watchdog does a power cycle, which I think is better for the hard drive, though the filesystem still needs to be cleaned when it boots up.

                        Thanks again everyone!

                        • MrKoen

                          Thanks igor for sharing your findings! Looking good. I'm going to give it a try in the next few days. Who can create a nice frontpage widget to parse and show the ipmi-sensors information? ;)

                          • _igor_

                            cino: Thanks for the hint about ipmi_load="YES" in loader.conf. I forgot to mention it in my post. I have added that setting there, so it should be complete now.

                            Best practice is to set this in /boot/loader.conf.local, so that it survives pfSense upgrades.

                            • Cino

                              @_igor_:

                              cino: Thanks for the hint about ipmi_load="YES" in loader.conf. I forgot to mention it in my post. I have added that setting there, so it should be complete now.

                              Best practice is to set this in /boot/loader.conf.local, so that it survives pfSense upgrades.

                              I'm sorry, I had to remove that line to get bmc-watchdog to work. When ipmi.ko is loaded, the watchdog doesn't work for me, and ipmi-sensors also returns an error. I unloaded the driver and both programs started to work with no errors. So I removed it from my loader.conf.local, rebooted the box, and everything works. Sorry for the confusion.

                              • Cino

                                @Koen:

                                Thanks igor for sharing your findings! Looking good. I'm going to give it a try in the next few days. Who can create a nice frontpage widget to parse and show the ipmi-sensors information? ;)

                                I copied page diag_system_activity.php and changed some of the code so it calls ipmi-sensors instead of top… It works :-)  Upload it to /usr/local/www and you can access it via http://pfsense/diag_ipmi-sensors.php.

                                Scott created the page, and if I'm reading his copyright notice correctly, we can make changes to it as long as we leave the notice there.

                                diag_ipmi-sensors.php.txt

                                • _igor_

                                  Hey cino, that sounds strange; I have to load ipmi to get that info. Reading the freeipmi docs, I saw that freeipmi can use its own IPMI drivers. Maybe there is something wrong with your system's ipmi module, I don't know. It's included in pfSense; I didn't load it from another installation.
                                  My entry in /boot/loader.conf.local is ipmi_load="YES"

                                  Much thanks for that changed/new ipmi-sensors.php! I'll try that out.

                                  • Cino

                                    I had added ipmi_load="YES" to my /boot/loader.conf.local when this thread first started, to test the IPMI watchdog. I never took it out since it wasn't hurting anything. I updated my BIOS and IPMI firmware to the latest version less than 2 months ago. Maybe that's why it doesn't work when I load the ipmi.ko driver. My IPMI is version 2.0; I can't remember exactly, but I think it was either 1.1 or 1.5 when my board first came out.

                                    • MrKoen

                                      I just applied the steps to my pfSense box and it works superb! We have two great ways of using the IPMI functionality now.

                                      igor: I hope you don't mind, but I created a tutorial on my website for this method, based on yours. You already wrote a nice tutorial in this topic, but for my own reference, since I did some things slightly differently, and as an addition to the first method, I created a page covering this as well. Of course all credit for this method goes to you, and you are mentioned multiple times in the tutorial.

                                      The tutorial can be found here:

                                      http://www.zomers.eu/knowledge/pfSense/Pages/Configure-pfSense-2.0-RC1-to-use-Watchdog-functionality-Method2.aspx

                                      • stephenw10, Netgate Administrator

                                        Another great write up!  ;D

                                        Steve

                                        • _igor_

                                          yeah! Great work! :-)

                                          • Cino

                                            I have a suggestion, as this has saved my a$$ here and there when doing remote updates and getting a panic for no reason.

                                            Instead of putting 10_watchdog.sh in /usr/local/etc/rc.d, where it autostarts only AFTER pfSense finishes its startup processes, copy the script to, say, the /root folder. In my case I put it in /root/custom/scripts, since I have to run scripts after every update to restore the files for my build. Then manually add it to the earlyshellcmd section of your config.xml. Alternatively, install the 'shellcmd' package and add it via the GUI.

                                            This way the watchdogd will start before core processes and interfaces are started.

                                            Last night I was messing around with Snort and was rebooting the box a lot… On one reboot, I got a panic right after it started OpenNTPD for some reason... 16 seconds later, the box rebooted... If the watchdog had only started along with the normal autostart scripts, the box would never have rebooted.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.