NAME
SYNOPSIS
DESCRIPTION
OPTIONS
CONFIG FILE
Client
- Client commands
Server
- Server commands
INSTALL
- Requirements
- Installation
PRODUCTION
SIGNALS
FILES
SEE ALSO
LICENSE
AUTHOR

NAME

  dmon - a distributed monitor

SYNOPSIS

  Usage: dmon [OPTIONS] [COMMANDS]
  OPTIONS  : [-s] [-q] [-v] [-d] [-T] [-h] [-i] [-t] [-c file]
  COMMANDS : stop|start|reload|state
  option s : be silent
  option q : be quiet ; only errors
  option v : be verbose ; show all actions
  option d : show debug info ; internals
  option T : trace ; show all debug info
  option h : show help ; exit
  option t : test config and work ; exit
  option i : interactive ; don't fork the daemon
  option c : use config <file> ; default [dmon.conf,/etc/dmon/conf]

DESCRIPTION

Program dmon provides system monitoring on a collection of hosts. Each host runs dmon as a daemon.

On each host, dmon is a client ; on one host dmon is also a server ; and one host is also the (web) pagemaker.
On each host, dmon monitors items ; things like cpu-load, disk-usage, the status of a couple of daemons etc.
Every minute, dmon computes a fresh value for each item, and stores these values in a local (sqlite) database.
Values of items have a fitness-level ; typically :
```
  fine soso sick critical dead
```
The fitness of an item is determined by a configurable (per [host-]item) fitness-function.
Every 5 minutes dmon sends a report to the server ; this report contains current information about the items.
If/when the fitness of a host-item changes, dmon sends an event to the server.

On receiving events, the server may send mails to selected users, depending on the event (host, item, current level, previous level).

In a few web-pages, dmon shows you :

all problems : host-items that aren't fine
recent events ; host-item fitness changes
the history of each host-item ; in a graph ; per hour, day, week, month or year
various overviews ; items by host, hosts by item ; for various intervals

What the dmon-system does, is determined by a work-config file. It contains (among other things) :

a list of hosts and host-groups
for each host and/or host-group, the items it must monitor
the designated pagemaker
fitness-levels and (per host and/or item) fitness-functions
a list of users and user-groups
for all fitness-changes, the users that must receive an alert

Program mk-dmon-work reads the work-config file and generates a work-file for each client, the server and the pagemaker.

The system is distributed in the sense that :

each client-dmon runs on its own ; there is no central (work)flow-control
each client-dmon keeps its own history in a local (sqlite) database
each client-dmon can talk to the server ; the server can talk to each client ; all connections are short-lived.

OPTIONS

-s: be silent
-q: be quiet ; only errors
-v: be verbose ; show all actions
-d: show debug info ; internals
-T: trace ; show all debug info
-h: show help ; exit
-t: test config and work ; exit
-i: interactive ; don't fork the daemon
-c: use config file ; default : dmon.conf, /etc/dmon/conf

CONFIG FILE

location

The default locations of the config file are :

./dmon.conf
/etc/dmon/conf

syntax

A config file looks like this :

  +--------------------------------------------------
  |# lines that start with '#' are comment
  |# blank lines are ignored too
  |# tabs are replaced by a space
  |
  |# the config entries are 'key' and 'value' pairs
  |# a 'key' begins in column 1
  |# the 'value' is the rest of the line
  |somekey  part1 part2 part3 ...
  |otherkey part1 part2 part3 ...
  |
  |# indented lines are glued
  |# the next three lines mean 'somekey part1 part2 part3'
  |somekey part1
  |  part2
  |  part3
  +--------------------------------------------------
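The rules in the box above can be sketched as a small parser. This is only an illustration of the syntax, not dmon's own (perl) code ; the function name parse_config is made up, and treating an indented '#' line as comment is an assumption.

```python
def parse_config(text):
    """Parse dmon-style config text into a list of (key, value) pairs."""
    entries = []
    for raw in text.splitlines():
        # tabs are replaced by a space
        line = raw.replace("\t", " ").rstrip()
        # blank lines and comment lines are ignored
        # (assumption: a '#' after leading blanks also starts a comment)
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        if line[0] == " ":
            # an indented line is glued to the previous entry
            if not entries:
                raise ValueError("continuation line without a preceding key")
            key, value = entries[-1]
            entries[-1] = (key, (value + " " + line.strip()).strip())
        else:
            # the key begins in column 1 ; the value is the rest of the line
            key, _, value = line.partition(" ")
            entries.append((key, value.strip()))
    return entries
```

A list of pairs is returned (instead of a dict) so that repeated keys stay visible ; how dmon itself treats a repeated key is not specified here.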

config file : required entries

server host: Specify the (fully qualified) hostname of the server.

config file : optional entries

hostname host

domain domain

Specify a fully qualified hostname for the client. The default is :

  hostname `hostname`

If `hostname` is not fully qualified (does not contain a dot), dmon attempts to append a domain-string. If defined, dmon uses config-option domain ; otherwise file /etc/resolv.conf is searched for a search list and the first name found is used.

If $hostname resolves to a CNAME, the pointed-to name is used.

Using dmon -t -v tells you which (canonical) $hostname dmon uses.
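The qualification rule described above might look roughly like this. It is a sketch only ; the names search_domain and qualify are hypothetical, the resolv.conf contents are passed in as text, and the CNAME step is left out because it needs a DNS lookup.

```python
def search_domain(resolv_conf_text):
    """Return the first name of the first 'search' line, or None."""
    for raw in resolv_conf_text.splitlines():
        fields = raw.split()
        if len(fields) >= 2 and fields[0] == "search":
            return fields[1]
    return None


def qualify(hostname, domain=None, resolv_conf_text=""):
    """Append a domain to hostname if it does not contain a dot."""
    if "." in hostname:
        return hostname  # already fully qualified
    # config-option domain wins ; otherwise use the resolv.conf search list
    suffix = domain or search_domain(resolv_conf_text)
    if not suffix:
        return hostname  # nothing to append
    return hostname + "." + suffix.strip(".")
```

For example, qualify("down", resolv_conf_text="search science.uu.nl uu.nl") yields down.science.uu.nl.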

PORT port

Specify a number ; the default is :

  PORT 22007

The server listens for connections on port $PORT. The client listens for connections on port $PORT+1.

ival_make_state interval-spec

Specify the interval after which the client will compute the next state. The default is (one minute) :

  ival_make_state 1m

An interval-spec can be given in seconds (as in 22 or 22s), minutes [m], hours [h], days [d] and/or weeks [w].

The interval-specs can be combined in any order :

  dw      # a day and a week
  7d+24h  # same thing
  w-0.5h  # a week minus half an hour
  hm6     # 3666 seconds
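One way to read the examples above : an interval-spec is a sequence of terms, each an optional sign, an optional number and an optional unit ; a missing number counts as 1, a missing unit as seconds. The sketch below implements that reading ; the grammar is inferred from the examples and is not taken from dmon's source, and parse_interval is a made-up name.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

# one term : optional sign, optional number, optional unit
TERM = re.compile(r"([+-]?)(\d+(?:\.\d+)?)?([smhdw])?")


def parse_interval(spec):
    """Return the number of seconds described by an interval-spec."""
    spec = spec.replace(" ", "")
    if not spec:
        raise ValueError("empty interval-spec")
    pos, total = 0, 0.0
    while pos < len(spec):
        match = TERM.match(spec, pos)
        sign, number, unit = match.groups()
        if number is None and unit is None:
            raise ValueError("bad interval-spec at: " + spec[pos:])
        value = float(number) if number is not None else 1.0
        value *= UNIT_SECONDS[unit or "s"]
        total += -value if sign == "-" else value
        pos = match.end()
    return total
```

With this reading, parse_interval("hm6") gives 3666, and "dw" and "7d+24h" both give 691200.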

ival_send_report interval-spec

Specify the interval after which the client will send the next report to the server. The default is :

  ival_send_report 5m

ival_check_work interval-spec

Specify the interval after which the client will request a fresh work-file from the server. The default is :

  ival_check_work 10m

ival_keep_events interval-spec

Specify how long the server must store events. The default is four weeks :

  ival_keep_events 4w

The server cleans up the event-history hourly.

loglvl level

Specify a log-level ; the default is :

  loglvl Terse

plot_url url

Script gen-dmon-page generates references to plotter plotter.php as $url ; the default is :

  plot_url /plotter.php

A leading slash is interpreted as the DOCUMENTROOT of the vhost running gen-dmon-page.

rotate num interval-spec

Specify parameters for log rotation ; the default is

  rotate 8 1d

On start-up, and each time interval-spec has elapsed, dmon rotates its logfile (/var/log/dmon/dmon.log), keeping num files.

bindir dir

Specify the directory where the programs live. The default is :

  bindir /usr/sbin

Option $bindir is used by the UPGRADE-facility. The server sends (and the client installs in) $bindir/dmon.

Option $bindir is ignored by root.

logdir dir

Specify the directory where the logfiles live. The default is :

  logdir /var/log/dmon

vardir dir

Specify the directory where dmon keeps its files. The default is :

  vardir /var/dmon

rundir dir

Specify a run-dir. The default is :

  rundir /var/run/dmon

This is where the daemon keeps its run-time files (lock-file, pid-file, stop-secret).

lckdir dir

Specify a lock-dir. The default is :

  lckdir /var/lock/subsys

httpdgid group-name | number

There is no default. Specify the group (name or number) under which the httpd is running on the pagemaker.

Gid $httpdgid is used by program users-dmon to make the user-database writable for the http-daemon.

page_sec url

Specify a (https) url ; there is no default.

$page_sec is used on the pagemaker by cgi-script gen-dmon-page to refer to the secure dmon-web-site, where users can login.

Client

Client commands

By default, the client listens on port 22008 for connections. The client accepts connections from localhost and the server. It expects a command from the list below.

To send a command to the client, use netcat (nc) :

  echo <command> | nc localhost 22008
  echo <command> | nc <client-hostname> 22008

PING

The client responds :

  COMMAND PING
  PONG from Client <hostname> dmon-0.05-p179
  COMMAND DONE

STATE

The client responds with something like :

  COMMAND STATE
  -- version  dmon-0.05-p184
  -- logfile  /var/log/dmon/dmon.log
  -- loglevel Trace
  -- listening on port 22008 as a Client
  -- Client is processing a command-session << 127.0.0.1 port 22008
  -- hostname science-bs32.science.uu.nl
  -- server   down.science.uu.nl
  -- work     Thu Jan 28 14:21:27 2016
  -- Client state : { ... }
  COMMAND DONE

STOP secret
Command STOP stops the client. On startup, the daemon generates a secret and stores it (by default) in file /var/run/dmon/dmon.stp, mode 0600. This secret must be supplied in the STOP command.

Since the secret is only available to the owner of the daemon, only the owner of the daemon can use the STOP command.
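The stop-secret mechanism can be sketched as follows. This is not dmon's code ; the function names are hypothetical, and the length and format of the secret are assumptions. The point is that the file is created with mode 0600, so only the owner of the daemon can read the secret back.

```python
import os
import secrets


def write_stop_secret(path):
    """Generate a fresh secret and store it in path, mode 0600."""
    secret = secrets.token_hex(16)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.fchmod(fd, 0o600)  # also tighten a pre-existing file
        os.write(fd, (secret + "\n").encode("ascii"))
    finally:
        os.close(fd)
    return secret


def stop_allowed(path, supplied):
    """Check the secret supplied with a STOP command."""
    with open(path) as fh:
        expected = fh.read().strip()
    # constant-time comparison of the two secrets
    return secrets.compare_digest(expected.encode(), supplied.encode())
```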
HIST [ ival [ pnts [ host [ name ... ] ] ] ]
where
- ival represents an interval : Hour, Day, Week, Month or Year ; only the first letter is used ; default H
- pnts is ignored ; default 100
- host is a hostname ; default : the client's canonical hostname
- name is an item-name ; default : all item-names for host
The client queries its history-database, grouping rows and averaging over groups.
```
  +----------+------------+------+
  | Interval | Group by   | rows |
  |----------+------------+------|
  | Hour     |  1 minute  |   60 |
  | Day      | 10 minutes |  145 |
  | Week     |  1 hour    |  168 |
  | Month    |  4 hours   |  186 |
  | Year     |  2 days    |  184 |
  +----------+------------+------+
```
Note : instead of these constants, we should be able to specify the approximate number of rows we want ($pnts). There is a relation with ZAP.
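The grouping and averaging in the table can be illustrated with a few lines of code. The real client does this in its sqlite database ; the sketch below works on an in-memory list of (time, value) rows, and the name average_groups is made up.

```python
from collections import defaultdict

# group width in seconds, keyed by the first letter of the interval
GROUP_SECONDS = {
    "H": 60,            # Hour  : 1 minute
    "D": 10 * 60,       # Day   : 10 minutes
    "W": 3600,          # Week  : 1 hour
    "M": 4 * 3600,      # Month : 4 hours
    "Y": 2 * 86400,     # Year  : 2 days
}


def average_groups(rows, ival="H"):
    """Group (time, value) rows by interval and average each group."""
    width = GROUP_SECONDS[ival[:1].upper()]  # only the first letter is used
    buckets = defaultdict(list)
    for ts, value in rows:
        buckets[ts - ts % width].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```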

The client responds with a json-encoded summary of the host-item history ; something like
```
  COMMAND HIST
  { "resp" : "ok rows 145"
  , "data" : { "cols" : [ "TIME", item1, ... ]
             , "rows" : [ [ time, value, ... ], ... ]
             }
  }
  COMMAND DONE
```
where cols is a list of (requested) item-names (ordered alphabetically), and rows is a list of [time,values] tuples.

The HIST command is used by the plotter to retrieve historical data from a client, so it can generate a plot.
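A consumer of the HIST response (such as a plotter) could turn it into one series per item, roughly like this. The sketch assumes the layout shown above, with "TIME" as the first column ; the name hist_series is hypothetical.

```python
import json


def hist_series(response_text):
    """Turn a HIST response into { item-name : [ (time, value), ... ] }."""
    # drop the 'COMMAND HIST' / 'COMMAND DONE' framing lines
    body = "\n".join(
        line for line in response_text.splitlines()
        if line.strip() and not line.strip().startswith("COMMAND ")
    )
    reply = json.loads(body)
    if not str(reply.get("resp", "")).startswith("ok"):
        raise ValueError("HIST failed: %r" % reply.get("resp"))
    cols = reply["data"]["cols"]
    series = {name: [] for name in cols[1:]}
    for row in reply["data"]["rows"]:
        when = row[0]  # assumption : the first column is TIME
        for name, value in zip(cols[1:], row[1:]):
            series[name].append((when, value))
    return series
```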
SEND
Command SEND instructs the client to send a report to the server.

The client re-schedules sending its next report to a randomised time in the near future ; this avoids congestion on the server if/when all clients receive a SEND command as a result of a server's ALLSEND command.

The client responds with :
```
  COMMAND SEND
  next_send down.science.uu.nl Sat Feb 13 18:45:39 2016
  COMMAND DONE
```
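The randomised re-scheduling could be as simple as the sketch below. The size of the window (max_delay) is an assumption ; this document does not say how far into the future dmon spreads the reports.

```python
import random


def next_send_time(now, max_delay=60.0, rng=random):
    """Pick a send time between now and now + max_delay seconds."""
    # a uniform spread keeps the clients from reporting all at once
    return now + rng.uniform(0.0, max_delay)
```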
UPGRADE { "version" : version-string ∥ null }
The client rsyncs $server::$upgr_mod into $vardir/upgrade/ ; if successful, it runs cd $vardir/upgrade/ ; make upgrade. The make tests the new program and, if ok, installs the program, sends a response, and schedules a re-exec.

The client responds with something like :
```
  COMMAND UPGRADE
  upgrade science-vs14.science.uu.nl ... ok dmon-0.05-p183 → dmon-0.05-p184
  ...
  COMMAND DONE
```
Note: by default (or when running as root), the server sends (and the client installs in) /local/sbin/dmon.

Note: a client accepts an UPGRADE command only from the server.
SERVER cmd
The client connects to the server and issues command cmd. The client responds with the server's response.

This facility is used for testing.
REPORT
The client responds with a json-dump of its current state.

This facility is used for testing.
ZAP
The ZAP command instructs the client to zap its history database : the client reduces the number of rows in its history by replacing groups of rows by a single row containing the average values of the group.

This facility is used for testing ; clients zap their history just after startup, and subsequently every hour.

Server

Server commands

By default, the server listens on port 22007 for connections. The server accepts connections from localhost and the clients. It expects a command from the list below.

To send a command to the server, use netcat (nc) :

  echo <command> | nc localhost 22007
  echo <command> | nc <server-hostname> 22007
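The same exchange can be done from a script. The sketch below sends one command and reads until the closing COMMAND DONE line ; it assumes the framing shown in the examples of this section (a COMMAND <name> line, payload lines, a COMMAND DONE line). The function names are made up.

```python
import socket


def read_response(sock):
    """Read from sock until 'COMMAND DONE' is seen or the peer closes."""
    buf = b""
    while b"COMMAND DONE" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
    return buf.decode("utf-8", errors="replace")


def parse_response(text):
    """Return (command-name, payload-lines) for a complete response."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith("COMMAND ") or lines[-1] != "COMMAND DONE":
        raise ValueError("incomplete response")
    return lines[0].split(None, 1)[1], lines[1:-1]


def send_command(host, port, command, timeout=5.0):
    """Send one command to a dmon daemon and return the parsed response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("utf-8") + b"\n")
        return parse_response(read_response(sock))
```

For instance, send_command("localhost", 22007, "PING") would return the pair ("PING", [ the PONG line ]) when a server is running.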

PING

The server responds :

  COMMAND PING
  PONG from Server <hostname> <dmon-version>
  COMMAND DONE

STATE

The server responds with something like :

  COMMAND STATE
  -- version  dmon-0.05-p179
  -- logfile  /var/log/dmon/dmon.log
  -- loglevel Trace
  -- listening on port 22008 as a Client
  -- listening on port 22007 as a Server
  -- Server is processing a command-session << 127.0.0.1 port 22007
  -- Server state : keeping state for 74 clients
  COMMAND DONE

WORK prog hostname
where prog == dmon-server | dmon-client | dmon-pmaker

The server responds with something like :
```
  COMMAND WORK
  { "resp" : "ok work"
  , "data" : contents of the work-file as a string
  , "lm"   : work-file's last-modified timestamp
  }
  COMMAND DONE
```
where (by default) the work-file is :
```
  /var/dmon/works/<prog>/<hostname>.txt
```
On the server, dmon retrieves work-files from the file-system.

A client or the pagemaker retrieves its work-file from the server, except when its (config) server is equal to its (canonical) hostname ; in that case the work-file is retrieved from the file-system.
WORK_LM prog hostname
Same as command WORK except that data is always null.

After startup, periodically (default every 10 minutes), a client issues a WORK_LM command to the server, which responds with the current last-modified timestamp of the client's work-file.

If the timestamp has changed, the client reloads itself.
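The check a client makes on a WORK_LM reply can be sketched like this. The reply layout follows the WORK example above ; the format of the lm value (here a number) and the name work_changed are assumptions.

```python
import json


def work_changed(reply_text, known_lm):
    """Return True if the server's work-file timestamp differs from ours."""
    reply = json.loads(reply_text)
    if not str(reply.get("resp", "")).startswith("ok"):
        raise ValueError("WORK_LM failed: %r" % reply.get("resp"))
    # any difference counts as a change ; the client then reloads itself
    return reply.get("lm") != known_lm
```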
REPORT json-hash
The server expects a client report : a one-line json-hash describing the current state of the hosts the client monitors (a client can monitor other (dumb) hosts, like UPSes, using SNMP).

The json-hash contains the state of one or more hosts :
```
  { host1 :
      { item1 : ...
      , ...
      }
  , host2 : ...
  }
```
where per item it has a value, fitness, probe-errors etc.
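Building such a one-line report might look like the sketch below. The per-item keys used in the example ("value", "fitness") are illustrative only ; the real attribute names are not given in this document.

```python
import json


def build_report(state):
    """Encode { host : { item : {...} } } as a one-line REPORT command."""
    # without 'indent', json.dumps never emits a newline
    line = json.dumps(state, separators=(",", ":"), sort_keys=True)
    return "REPORT " + line
```

For example, build_report({"myhost": {"cpu_load": {"value": 0.4, "fitness": "fine"}}}) gives a single line that can be sent to the server as is.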

The server just timestamps and stores the hash ; it responds with something like :
```
  COMMAND REPORT
  { "resp": "ok report from <client> [<ip>] for <host1, ...>"
  , "work": <timestamp work-file>
  }
  COMMAND DONE
```
The work-element is the timestamp of the workfile of the requesting host. The client caches this info, to avoid a WORK_LM request.

The pagemaker retrieves the combined state of all the hosts with server-command CLIENTS ; see below.
EVENTS json-array
The server expects a list of events. Each event describes a change of the fitness of an item on a host ; attributes : hostname, item-name, old fitness-level, new fitness-level, old value, new value.

The server stores the events in a database.
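How a client could derive events from two successive states is sketched below. The event attributes are the ones listed above ; the key names and the function name fitness_events are hypothetical.

```python
def fitness_events(host, previous, current):
    """Return one event per item whose fitness-level changed."""
    events = []
    for item in sorted(current):
        new = current[item]
        old = previous.get(item)
        # no event for a new item or an unchanged level
        if old is None or old["fitness"] == new["fitness"]:
            continue
        events.append({
            "host": host,
            "item": item,
            "old_level": old["fitness"],
            "new_level": new["fitness"],
            "old_value": old["value"],
            "new_value": new["value"],
        })
    return events
```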

The pagemaker may retrieve events from the server using the server command CLIENTS ; see below.
CLIENTS [HOST] [ITEM] [interval-spec ∥ num]
The server responds with something like :
```
  COMMAND CLIENTS
  { "resp"   : "ok clients"
  , "cdmp"   : { host : state of host, ... }
  , "events" : { recent events ; optionally selected by time or count }
  }
  COMMAND DONE
```
If HOST is specified, cdmp only contains the state of HOST.

If HOST and/or ITEM is specified, events only contains events pertaining to HOST and/or ITEM.

The default for the last argument is 0 (meaning all events) ; if the argument looks like a number, only the last arg events are sent ; otherwise only the events in the specified interval are sent.

The CLIENTS command is issued by the pagemaker when it generates a html report-page.
CLIENT hostname cmd
The server connects with client hostname and issues command cmd. The server responds with the client's response.

This facility enables a client-host to talk with another client-host. It is used by the pagemaker to retrieve history information from clients, unless the pagemaker runs on the same host as the server (in which case it can talk directly to any client).
ALLPING
The server issues (in parallel) a PING command to each client.

The server's response is the concatenation of the client-responses.
ALLSEND
The server issues (in parallel) a SEND command to each client.

The server's response is the concatenation of the client-responses ; something like :
```
  COMMAND SEND
  next_send down.science.uu.nl Sat Feb 13 18:09:11 2016
  ...
  COMMAND DONE
```
On the server-host, the client sends an ALLSEND to its server-part, on start-up and after a reload ; so the server is quickly up-to-date.
UPGRADE [-f]
UPGRADE [-f] hostname ...
The server issues an UPGRADE command to each client (or only clients hostname ...). Sent with the command is a version-string, or undef if -f is specified.

The client compares its version with the version-string (if defined) ; if versions are equal, the client ignores the UPGRADE-command. Otherwise, the client rsyncs new software from the server, and runs a make upgrade ; this tests and installs the new stuff and, if ok, the client schedules a re-exec.

The server's response is the concatenation of the client-responses ; something like :
```
  COMMAND UPGRADE
  upgrade science-vs14.science.uu.nl ... ok dmon-0.05-p183 → dmon-0.05-p184
  ...
  COMMAND DONE
```
Note: by default (or when running as root), the server sends (and the client installs in) /local/sbin/dmon.

Note: the server accepts an UPGRADE command only from localhost ; clients accept an UPGRADE command only from the server.

INSTALL

Requirements

Daemon dmon must run as root because many system diagnostic tools can only be used as root.
The server must be able to connect to all the clients on port 22008. The server listens on port 22007 ; it allows connections from localhost and the clients, rejecting others.
A client must be able to connect to the server on port 22007. A client listens on port 22008 ; it allows connections from localhost and the server, rejecting others.
Program dmon requires a bunch of perl modules, most of which are CORE (come with perl) ; the others are widely available from platform-repo's.
A dmon system works best if the server runs on real hardware that is not dependent on other hosts you want to monitor. This gives you the best chance that monitoring is available when some calamity occurs.

Whatever your initial setup may be, it is easy to (later) move the dmon-server to another host ; idem for the required web-service.
To use dmon, one of the monitored hosts must be a web-server ; it is efficient (but not necessary) to have the server and the web-service on the same host.

Installation

Installation - server

Fetch the software
Create some directory ; for instance /local/dmon/ ; then get dmon :
```
  % rsync -avz archive.science.uu.nl::dmon-dev/ /local/dmon/
```
Install perl modules :
```
  % cd /local/dmon/
  % perl dmon -v -t
```
Perl will probably complain with something like :
```
  Can't locate xxx.pm ...
```
... where xxx is a missing module.

Install the module with your favorite tool (yum, apt-get), or use program cpanm.

Installing modules with cpan(1) is usually horrible ; use cpanm instead ; see the INSTALL file for hints ; also CPAN's How to install CPAN modules.

If cpanm fails, view the cpanm-build-log ; some modules require gcc(1).
Install perl modules (repeat)
Install missing perl modules until dmon complains about a missing config file.
create a dmon-config file in /etc/dmon/conf
Supply the server's hostname or (preferably) a CNAME for your server-host, so you can switch later.
```
  server ...
  loglvl Verbose
```

create a dmon-work-config file in /etc/dmon/work

  # define a host
  host hostname-of-your-server myhost
  host hostname-of-your-webserver www
  pmaker www
  # set fitness levels
  fit_level fine
  fit_level soso
  fit_level sick
  fit_level crit critical
  fit_level dead
  # on myhost, monitor some items
  get myhost cpu_load root_avail root_usage uptime

... where hostname-of-your-server is a fully-qualified hostname, and myhost is just a unique tag, used to refer to the host.

create dmon work-files in /var/dmon/works/
Run program mk-dmon-work :
```
  % mk-dmon-work -v -f
```
try dmon
Start the daemon ; for now, don't fork (use -i) :
```
  % dmon start -i
  # stop it with ^C
```
If all is well, you will see dmon start up, first as server, then as client.

After a while,
- every minute : the daemon (as client) computes fresh item values ;
- every five minutes :
  - the daemon (as client) sends a report to the server ;
  - the daemon (as server) accepts the report, responds to the client ;
  - the client receives the response from the server ; closes the connection.
install dmon as a service
- copy init.d :
```
  % cp init.d /etc/init.d/dmon
```
- Make sure dmon is started after a reboot ; use chkconfig :
```
  % chkconfig --add dmon
```
  or (on Ubuntu) :
```
  % update-rc.d dmon defaults 50 50
```

Installation - client

Fetch the software
Same as server-install.
Install missing perl modules
Same as server-install.
create a dmon-config file in /etc/dmon/conf
Supply the server's hostname or (preferably) a CNAME for your server-host, so you can switch later.
```
  server ...
```
Configure hostname if necessary ; see config option hostname.
install dmon as a service
Same as server-install.

Installation - pagemaker

The pagemaker must be a dmon-host and a web-server.

install gen-dmon-page on the webserver
- Make sure the webserver runs gen-dmon-page as a cgi-script.
- Script gen-dmon-page generates a login-reference ; preferably to a secure site ; configure something like :
```
  page_sec https://dmon.your.org/cgi-bin/gen-dmon-page
```
  Just use http instead of https if you don't have a secure site (yet).
install plotter.php on the webserver
- Make sure the webserver runs plotter.php as a php-script.
- Script gen-dmon-page generates references to the plotter as /plotter.php. Configure option plot_url to change that.
- The plotter uses package jpgraph ; install the package and make sure the plotter can do :
```
  require_once ( 'jpgraph/jpgraph.php' ) ;
  require_once ( 'jpgraph/jpgraph_canvas.php' ) ;
  require_once ( 'jpgraph/jpgraph_line.php' ) ;
  require_once ( 'jpgraph/jpgraph_date.php' ) ;
```
  That is, if your plotter lives in :
```
  /path/to/plotter.php
```
  install jpgraph in :
```
  /path/to/jpgraph-<VERSION>/
```
  and make a symlink :
```
  % ln -s /path/to/jpgraph-<VERSION>/src/ /path/to/jpgraph/
```

PRODUCTION

Here are some tips for using dmon :

In general, monitoring works best if the server runs on stand-alone hardware, that doesn't use any other resources. Dmon is very light-weight, so some old box will probably work.
Dress up the server as a web-server and run the pagemaker (gen-dmon-page and plotter.php) on the server ; this is a little more efficient.
To make installing dmon on clients easier, dress up the server as a rsync-server ; then :
- Create a module [dmon] containing the downloaded dmon distro.
- In module [dmon] copy the Makefile that comes with dmon, to file makefile :
```
  cp Makefile makefile
```
  Tweak the makefile to meet your local needs.
  
  Note : for make, the makefile takes precedence over Makefile.
On a client, rsync dmon from the rsync server :
```
  % mkdir -p /local/dmon/
  % rsync -avz dmon.your.org::dmon/ /local/dmon/
```
Then, on the client you can run a make :
```
  % cd /local/dmon/
  # repeat until all perl-modules are installed :
  % ./dmon -t -v 
  % make install
  # start dmon
  % /etc/init.d/dmon start
  # fix iptables ; allow tcp connections from dmon.your.org to port 22008
```

SIGNALS

  HUP  : dmon reloads  ; dmon re-reads the config ; re-inits client and server
  USR1 : dmon re-execs ; dmon stops with END { exec $PROGRAM_NAME, @ARGV }

FILES

all hosts

  /etc/dmon/conf         dmon-configuration
  /etc/init.d/dmon       dmon start/stop script
  /var/dmon/data.lite    database for item-history data
  /var/dmon/probes       dmon probes ; installed by dmon on startup
  /var/lock/subsys/dmon  touched on dmon-startup ; removed on stop
  /var/log/dmon          logfiles
  /var/run/dmon/dmon.lck lock-file
  /var/run/dmon/dmon.pid pid-file
  /var/run/dmon/dmon.stp dmon stop-secret ; for client STOP-command

server only

  /etc/dmon/work         work-configuration
  /var/dmon/works        per-client work-files
  /var/dmon/works.tmp    staging directory for /var/dmon/works

pagemaker only

  /var/dmon/cgi-data     database for gen-dmon-page ; login/out-log
  /var/dmon/cgi-secret   secret for cookie-verification ; should be 0600

LICENSE

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl 5.10.0 README file.