| This is RedHat/Fedora specific, but can be used with other distros with minor adjustments. |
| |
| Instructions for how to set up the watchdog daemon to work with IPMI's hardware watchdog |
| ---------------------------------------------------------------------------------------- |
| |
| First, verify that the ipmitool utility is present on the system to allow |
| the watchdog timer to be turned off via the command line (which ipmitool). |
| This will allow the hardware watchdog timer to be turned off gracefully |
| should it ever become necessary. If ipmitool is not present, install |
| it or download the latest version from http://ipmitool.sourceforge.net and |
| build and install it on your system. |
| |
| Next, prior to starting up the watchdog daemon, the BMC BIOS should be set |
| to enable the IPMI/BMC hardware watchdog timer, the OpenIPMI watchdog driver |
| module should be inserted with the desired configuration/startup settings, |
| and the watchdog daemon's configuration file should be modified to use /dev/watchdog: |
| |
| 1. To setup the IPMI/BMC BIOS to enable the hardware watchdog |
| timer, see BMC documentation. The main settings in the BMC BIOS |
| requiring modification to turn on the IPMI watchdog timer are: |
| |
| - Set the BMC POST Watchdog to "ENABLED". |
| - Set the BMC POST Watchdog Timeout to "5 Minutes". |
| |
| 2. To insert the OpenIPMI watchdog driver module with the |
| desired configuration settings, two steps are necessary: |
| |
| i.) Configure the OpenIPMI watchdog driver by editing the |
| /etc/sysconfig/ipmi configuration file: |
| |
| - Set "IPMI_WATCHDOG=yes". |
| - Set desired options via the IPMI_WATCHDOG_OPTIONS |
| config entry. |
| |
| EXAMPLE: 'IPMI_WATCHDOG_OPTIONS="timeout=60 start_now=1 \ |
| preop=preop_give_data action=power_cycle pretimeout=1" ' |
| |
| Execute "modinfo ipmi_watchdog" for more detailed information |
| on the available ipmi watchdog timer options. |
| |
| - Execute "service ipmi start" (the watchdog driver starts |
| automatically along with the other ipmi drivers). |
| |
| IMPORTANT: If "start_now=1" has been set as one of the |
| configuration options, be sure to start up the watchdog |
| daemon before the BMC timer expires! |
| |
| ii.) Set the OpenIPMI daemon and watchdog to start during bootup: |
| |
| - chkconfig ipmi on |
| - chkconfig watchdog on |
| |
| |
| 3. Configure the watchdog daemon by editing the |
| /etc/watchdog.conf configuration file: |
| |
| - Uncomment the "watchdog-device = /dev/watchdog" line. |
| - Ensure that "realtime = yes" and "priority = 1" are set and not |
| commented-out. |
| - Uncomment the "interval" line, and set the interval to be less |
| than what you set the timeout option to be in the /etc/sysconfig/ipmi |
| file (ex "timeout=60" so you might set interval to 50). |
| |
| So in the example described herein, the BMC BIOS setting is in |
| minutes (5), and the "interval" and ipmi_watchdog "timeout" settings |
| are both in seconds (50 and 60 respectively). Therefore, the BMC |
| hardware watchdog timer is set to expire and trigger a system power |
| cycle unless reset by the watchdog daemon within 5 minutes, and the |
| watchdog daemon will reset the timer every 60 seconds. |
| |
| |
| 4. Start the Watchdog daemon: |
| |
| - execute "service watchdog start" |
| |
| |
| IMPORTANT: To gracefully stop/kill the watchdog daemon, be sure |
| to use "service watchdog stop" (which executes "kill -s SIGTERM <pid>") |
| and do *not* use "kill -9 <pid>". Using "kill -9 <pid>" will cause the |
| daemon to be shut off without stopping the BMC's watchdog timer, thus |
| a system reboot will be triggered when the BMC's watchdog timer expires. |
| |
| Alternately, or in case the watchdog daemon is killed "ungracefully", |
| you can stop the BMC timer by executing the following ipmitool utility |
| command before the watchdog timer expires: |
| |
| # ipmitool -v raw 0x06 0x24 0x04 0x01 0x00 0x10 0x00 0x0a |
| |
| ---------------------------------------------------------------------- |
| |
| To test the watchdog after system configuration and setup: |
| |
| . Use kill -9 on the watchdog daemon so it doesn't shut down the watchdog daemon |
| gracefully. Verify that the system gets reset after the BMC timer expires. |
| |
| . Use "service watchdog stop" and verify that the watchdog daemon shuts off |
| the BMC watchdog timer gracefully (the system doesn't get reset). |
| |
| . Set the timer on the watchdog daemon to be greater than the time set in |
| the BMC BIOS for system reset and verify that the system is reset. |
| |
| . Set the timer on the daemon to be less than the time set in the |
| BMC timer and verify that the BMC watchdog is poked regularly and the |
| system is not reset. |
| |
| . Test some of the other actions the BMC can take when the watchdog timer |
| goes off (see modinfo ipmi_watchdog for some other settings to try). |
| |