Linux WDT (watchdog) Driver

Source: Internet
Author: User

Part 1: WDT driver principles
A WDT is usually implemented as a misc driver in the kernel.
WDT Introduction
A Watchdog Timer (WDT) is a hardware circuit that can reset the computer system when a software error occurs.
Usually, a user space daemon notifies the kernel's watchdog driver at regular intervals, through the /dev/watchdog special device file, that user space is still alive. When such a notification arrives, the driver usually tells the hardware watchdog that everything is in order and that it should wait a little longer before resetting the system. If user space fails (RAM error, kernel bug, etc.), the notifications stop, and the hardware watchdog resets the system after the timeout expires.
The watchdog API in Linux is a rather ad hoc construction. Different drivers implement different parts of it, and some of those parts are incompatible with each other. This document attempts to describe the existing usage so that authors of later drivers can use it as a reference.
The simplest API:
All drivers support the basic mode of operation: the watchdog is activated as soon as /dev/watchdog is opened, and the system will reboot unless the dog is fed within a certain time. This time is called the timeout or margin. The simplest way to feed the dog is to write some data to the device. A very simple watchdog daemon looks like this file:
Documentation/watchdog/src/watchdog-simple.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    int fd = open("/dev/watchdog", O_WRONLY);
    int ret = 0;
    if (fd == -1) {
        perror("watchdog");
        exit(EXIT_FAILURE);
    }
    while (1) {
        ret = write(fd, "\0", 1);
        if (ret != 1) {
            ret = -1;
            break;
        }
        ret = fsync(fd);
        if (ret)
            break;
        sleep(10);
    }
    close(fd);
    return ret;
}

A more advanced daemon could do other checks before feeding the dog, for example, verifying that an HTTP server is still responding before making the write call.
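The idea of checking a service before feeding the dog can be sketched as follows. This is only a minimal sketch and not part of the kernel sample: the function name `feed_if_healthy` and its `healthy` parameter are hypothetical, and how the caller decides that the service is healthy (for example, by probing an HTTP server) is left out on purpose.

```c
#include <stdbool.h>
#include <unistd.h>

/*
 * Feed the watchdog on fd only when the caller reports the monitored
 * service as healthy. (Hypothetical helper, for illustration only.)
 *
 * Returns  0 when the watchdog was fed,
 *          1 when feeding was withheld because healthy is false,
 *         -1 when the write to the device failed.
 */
int feed_if_healthy(int fd, bool healthy)
{
    if (!healthy)
        return 1;   /* do not feed: the timeout keeps counting down */

    if (write(fd, "\0", 1) != 1)
        return -1;

    return 0;
}
```

A daemon would call this in place of the plain write in its main loop. If the check keeps failing for longer than the timeout, the watchdog resets the system even though the daemon itself is still running.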
When the device is closed, the watchdog is disabled, unless the "Magic Close" feature is supported. This is not always a good idea: if the watchdog daemon has a bug and crashes, the system will not reboot. For this reason, some drivers support the "Disable watchdog shutdown on close" configuration option, CONFIG_WATCHDOG_NOWAYOUT. If this option is set to Y when the kernel is compiled, there is no way to stop the watchdog once it has been started, so when the watchdog daemon crashes, the system reboots after the timeout. Watchdog drivers usually also support a nowayout module parameter, so that this option can be controlled at runtime.
Magic Close features:
If a driver supports "Magic Close", the driver will not disable the watchdog unless the magic character 'V' is sent to /dev/watchdog just before the file is closed. If the user space daemon closes the file without sending this character, the driver assumes that the daemon (and user space in general) has died, and it stops feeding the dog without disabling the watchdog first.
In this case, a reboot will occur if the watchdog is not re-opened in time.
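An orderly shutdown under Magic Close can be sketched as follows. This is a minimal sketch: the helper name `watchdog_close_cleanly` is hypothetical, and whether the watchdog really stops still depends on the driver (with nowayout enabled it keeps running regardless).

```c
#include <unistd.h>

/*
 * Send the magic character 'V' and then close the device, so that a
 * driver supporting Magic Close treats this as an intentional shutdown.
 * (Hypothetical helper, for illustration only.)
 *
 * Returns 0 on success, -1 if the write or the close failed.
 */
int watchdog_close_cleanly(int fd)
{
    int ret = 0;

    if (write(fd, "V", 1) != 1)
        ret = -1;   /* the driver will not see the magic character */

    if (close(fd) != 0)
        ret = -1;

    return ret;
}
```

A daemon would call this only on its normal exit path. On a crash the 'V' is never written, which is exactly the case in which the driver leaves the watchdog armed.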
The ioctl API:
All conforming drivers also support an ioctl API.
Use an ioctl to feed the dog:
All drivers that have an ioctl interface support at least one ioctl command, KEEPALIVE. This ioctl does exactly the same thing as a write to the watchdog device. Therefore, the main loop of the above program can be replaced with:
while (1) {
    ioctl(fd, WDIOC_KEEPALIVE, 0);
    sleep(10);
}

The argument to this ioctl is ignored.
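Since the KEEPALIVE ioctl and a write have the same effect, a daemon can try one and fall back to the other. The following is a minimal sketch under that assumption: the helper name `watchdog_ping` and the fallback policy are illustrative choices, not something the drivers require. The ioctl examples in this document need `<sys/ioctl.h>` and `<linux/watchdog.h>`.

```c
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/watchdog.h>

/*
 * Feed the watchdog on fd. Try the WDIOC_KEEPALIVE ioctl first and,
 * if that fails, fall back to writing one byte to the device.
 * (Hypothetical helper, for illustration only.)
 *
 * Returns 0 if either method succeeded, -1 if both failed.
 */
int watchdog_ping(int fd)
{
    if (ioctl(fd, WDIOC_KEEPALIVE, 0) == 0)
        return 0;

    return write(fd, "\0", 1) == 1 ? 0 : -1;
}
```

Checking the return value matters here: a daemon that ignores a failed feed will only find out when the system resets.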
Setting and getting the timeout:
For some drivers, it is possible to change the watchdog timeout on the fly with the SETTIMEOUT ioctl command. Those drivers have the WDIOF_SETTIMEOUT flag set in their options field. The argument is an integer representing the timeout in seconds. The driver returns the timeout actually used in the same variable. Because of hardware limitations, this value may differ from the requested timeout.
int timeout = 45;
ioctl(fd, WDIOC_SETTIMEOUT, &timeout);
printf("The timeout was set to %d seconds\n", timeout);
If the device's timeout can only be set in whole minutes, this example may actually print "The timeout was set to 60 seconds".
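A more defensive version checks the WDIOF_SETTIMEOUT flag before asking for a new timeout. The sketch below assumes the flag is read with the WDIOC_GETSUPPORT ioctl and `struct watchdog_info` from `<linux/watchdog.h>`, which the text above does not show. The helper name `watchdog_set_timeout` is hypothetical.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Try to set the timeout of the watchdog open on fd.
 * (Hypothetical helper, for illustration only.)
 *
 * Returns the timeout in seconds that the driver reports back, which
 * may differ from the requested value, or -1 if the driver does not
 * advertise WDIOF_SETTIMEOUT or one of the ioctl calls fails.
 */
int watchdog_set_timeout(int fd, int requested)
{
    struct watchdog_info info;
    int timeout = requested;

    if (ioctl(fd, WDIOC_GETSUPPORT, &info) != 0)
        return -1;

    if (!(info.options & WDIOF_SETTIMEOUT))
        return -1;   /* driver cannot change its timeout */

    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout) != 0)
        return -1;

    return timeout;  /* value actually applied by the driver */
}
```

The caller should base its feeding interval on the returned value rather than on the requested one.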
Starting with the Linux 2.4.18 kernel, it is also possible to query the current timeout with the GETTIMEOUT ioctl command:
ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
printf("The timeout is %d seconds\n", timeout);
Pretimeouts:
Some watchdog timers can be set to fire a trigger some time before they actually reset the system. This may be done through an NMI, an interrupt, or another mechanism. It allows Linux to record useful information (such as panic information and kernel core dumps) before the system is reset.
int pretimeout = 10;
ioctl(fd, WDIOC_SETPRETIMEOUT, &pretimeout);
Note that the pretimeout is the number of seconds before the moment the timeout expires. It is not the number of seconds until the pretimeout fires.
For example, if you set the timeout to 60 seconds and the pretimeout to 10 seconds, the pretimeout fires after 50 seconds. Setting the pretimeout to 0 disables it. There is also a get function for the pretimeout:
ioctl(fd, WDIOC_GETPRETIMEOUT, &timeout);
printf("The pretimeout is %d seconds\n", timeout);
Not all watchdog drivers support a pre-timeout.
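The relationship between the two values can be written as a small helper. This is only a sketch of the arithmetic described above: the function name is hypothetical, and treating a pretimeout that is not smaller than the timeout as invalid is an assumption made for this example.

```c
/*
 * Number of seconds after the last feed at which the pretimeout fires.
 * (Hypothetical helper, for illustration only.)
 *
 * Returns timeout - pretimeout, or -1 when the pretimeout is disabled
 * (0) or, by assumption, invalid (negative or not below the timeout).
 */
int pretimeout_fires_after(int timeout, int pretimeout)
{
    if (pretimeout <= 0 || pretimeout >= timeout)
        return -1;

    return timeout - pretimeout;
}
```

With a timeout of 60 and a pretimeout of 10, the helper returns 50, which matches the example in the text.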
Getting the number of seconds before the reboot:
Some watchdog drivers can report the time remaining before the system reboots. WDIOC_GETTIMELEFT is the ioctl command that returns the number of seconds before the reboot.
int timeleft;
ioctl(fd, WDIOC_GETTIMELEFT, &timeleft);
printf("The time left before the reboot is %d seconds\n", timeleft);
Environmental monitoring:
All watchdog drivers are required to return
