I needed a watchdog with variable timeouts during the lifecycle of the supervised process - longer timeouts during startup/initialisation, shorter during interactive operation. Since I could not find one, I wrote my own:
- no library dependencies (except libc)
- no config files, single binary
- ideal for embedded systems
- support for kernel watchdogs
- permission check via UID and program name
- millisecond precision
I find it so easy to use that I also run it on my web servers to monitor my daemons.
A process that wants to be monitored creates a file with a unique filename (e.g. command name + pid) in /run/watchdog. The file contains three lines (command name, pid and timeout), where:
- command name is the name of the running process as in /proc/<PID>/comm,
- pid is the PID of the process, and
- timeout is the timeout in milliseconds.
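The file format described above could be produced by a small helper like the one below. This is only a sketch: wd_write_entry() is a hypothetical name and not part of watchdogd.h, and both the line order (command name, pid, timeout) and the "name.pid" filename are assumptions based on the description.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper, NOT part of watchdogd.h.
 * Writes "<dir>/<name>.<pid>" containing three lines: command name,
 * PID and timeout in milliseconds. The line order and the "." in the
 * filename are assumptions. Returns 0 on success, -1 on error. */
int wd_write_entry(const char *dir, const char *name, pid_t pid,
                   unsigned long timeout_ms)
{
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s.%ld", dir, name, (long)pid);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;                      /* path would be truncated */

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%s\n%ld\n%lu\n", name, (long)pid, timeout_ms) > 0;
    if (fclose(f) != 0)                 /* fclose flushes, so check it too */
        ok = 0;
    return ok ? 0 : -1;
}
```

A process would call something like wd_write_entry("/run/watchdog", "myapp", getpid(), 5000) again before each timeout expires. The file is owned by the user that creates it, which matters for the UID check.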
The permission check on a file passes only if:
- the program name given in the file in /run/watchdog matches the name in /proc/<PID>/comm, and
- the EUID of the process (as in /proc/<PID>/status) is the same as the UID of the file in /run/watchdog
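The two conditions above can be checked roughly as follows on Linux. This is an illustrative sketch, not the code watchdogd actually uses; wd_entry_allowed() and its helpers are hypothetical names.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Reads /proc/<pid>/comm into buf, stripping the trailing newline. */
static int read_comm(pid_t pid, char *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/comm", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *ok = fgets(buf, (int)len, f);
    fclose(f);
    if (!ok)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Reads the effective UID from the "Uid:" line of /proc/<pid>/status
 * (its fields are: real, effective, saved, filesystem UID). */
static int read_euid(pid_t pid, uid_t *euid)
{
    char path[64], line[256];
    unsigned long ruid, e;
    int found = 0;
    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "Uid: %lu %lu", &ruid, &e) == 2) {
            *euid = (uid_t)e;
            found = 1;
            break;
        }
    }
    fclose(f);
    return found ? 0 : -1;
}

/* Hypothetical check: returns 1 if `name` matches the process's comm
 * and the file at entry_path is owned by the process's EUID, else 0. */
int wd_entry_allowed(const char *entry_path, const char *name, pid_t pid)
{
    char comm[64];
    uid_t euid;
    struct stat st;
    if (read_comm(pid, comm, sizeof comm) != 0 || strcmp(comm, name) != 0)
        return 0;
    if (read_euid(pid, &euid) != 0 || stat(entry_path, &st) != 0)
        return 0;
    return st.st_uid == euid;
}
```

With this check an unprivileged user cannot register a file that targets another user's process, because the file's owner would not match that process's EUID.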
All watchdogs in /dev/watchdogX are pinged every WATCHDOGTIMEOUT ms and will receive the "Magic Close" upon exit of watchdogd.
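For readers unfamiliar with the Linux watchdog device interface, pinging and the "Magic Close" look roughly like this. It is a sketch with hypothetical function names (kwd_ping, kwd_magic_close) and is not taken from the watchdogd source. Note that opening a real /dev/watchdogX arms the timer, so the functions take an already opened file descriptor.

```c
#include <unistd.h>

/* Pings a kernel watchdog: any write to the device resets its timer.
 * Returns 0 on success, -1 on error. */
int kwd_ping(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* "Magic Close": writing the character 'V' immediately before close()
 * asks drivers that support it to disarm the watchdog instead of
 * letting it fire. The descriptor is closed in every case. */
int kwd_magic_close(int fd)
{
    int ok = write(fd, "V", 1) == 1;
    return (close(fd) == 0 && ok) ? 0 : -1;
}
```

If the kernel was built with the "nowayout" option, the watchdog cannot be disarmed this way and will still fire after the daemon exits.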
Simply run 'watchdogd' at startup. If /var/run/watchdog does not exist, it will be created.
To stop watchdogd, send it SIGINT or SIGTERM. Both signals are caught: watchdogd writes the magic close byte to all system watchdogs and exits gracefully. Any files in /var/run/watchdog are kept in place, so after a restart watchdogd resumes operation immediately.
LOGGING
All starts, shutdowns, process killings and errors are logged via syslog and stdout/stderr.
To use watchdogd in your program, simply include watchdogd.h and call watchdog_update() before the current timeout runs out. To turn off the watchdog, call watchdog_disable().
Make sure that the command name you register (e.g. "myapp") is the EXACT command name of your application as it appears in /proc/<PID>/comm.
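One way to avoid a mismatch is to read the name from /proc/self/comm at runtime instead of hard-coding it. The helper below is a hypothetical sketch (wd_own_comm is not part of watchdogd.h) and only works on Linux.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the calling process's command name, as
 * the kernel reports it in /proc/self/comm, into buf without the
 * trailing newline. Returns 0 on success, -1 on error. */
int wd_own_comm(char *buf, size_t len)
{
    if (len < 2)
        return -1;
    FILE *f = fopen("/proc/self/comm", "r");
    if (!f)
        return -1;
    char *ok = fgets(buf, (int)len, f);
    fclose(f);
    if (!ok)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Keep in mind that the kernel truncates this name to 15 characters and that it can differ from argv[0], for example when the program is started through a symlink or renames itself with prctl(PR_SET_NAME).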
- /var/run should be a tmpfs so that you do not wear out your flash
- if you cache your PID, make sure to reset it in the child process after fork()