MRTG Thresholds

This page should be of some help to users of MRTG who want to use the Threshold options. As of December 1999, little has been published about how to get them working; most of the available information has come from the MRTG mailing list.

Another decent page is at http://www.wtcs.org/snmp4tpc/threshol.htm, which focuses mostly on the Windows platform.

I'll try to give a brief, simple, straightforward example of how to set up Thresholds. The same basic format can be used for any configuration.

You must define a directory for MRTG to place temporary threshold files in.

These files are used to keep track of which thresholds have been broken. MRTG cannot do this in memory, since it is restarted every 5 minutes (by default). It has to have a way to keep track of this information between runs.

To specify this directory, you must use the global parameter "ThreshDir". This must be done in each MRTG configuration file, and normally comes after the "WorkDir" global parameter. The format of this option is:

ThreshDir: /path/to/temp/directory

You must specify the actual threshold values for MRTG to check for.

The options that set the actual threshold values are specific to each target. There are four thresholds which can be set for each target: maximum input, minimum input, maximum output, and minimum output. The "Target" option determines which values are checked against the thresholds. (For targets that are not network interfaces, the first value is considered the input, and the second value is the output.) Be aware that the thresholds are compared against the values MRTG records, which are not necessarily the values it graphs (for example, "bits").

To specify the thresholds, you use the four parameters "ThreshMaxI", "ThreshMinI", "ThreshMaxO", and "ThreshMinO". You do not need to define all the thresholds -- if they aren't defined, they won't be used. I tend to put these options after all the other MRTG options for a target. The format is the same for all four options:

    ThreshMaxI[target.name]: 1234
    ThreshMinI[target.name]: 99
    ThreshMaxO[target.name]: 8259
    ThreshMinO[target.name]: 1
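To make the point about recorded versus graphed values concrete, here is a small sketch for a network interface. The target name, interface number, community string, and limits are all made up for illustration. It assumes that MRTG records interface traffic in bytes per second, even when the "bits" option makes the graph show bits per second, so a limit you think of in bits has to be divided by 8 first.

```
# Hypothetical interface target; the name, interface number, and limits
# are assumptions for illustration only.
Target[wanrouter.serial0]: 2:public@wanrouter
MaxBytes[wanrouter.serial0]: 193000
Options[wanrouter.serial0]: growright, bits

# Desired alert level: 1,000,000 bits per second in either direction.
# Assuming the recorded values are bytes per second:
#   1,000,000 bits/s / 8 = 125,000 bytes/s
ThreshMaxI[wanrouter.serial0]: 125000
ThreshMaxO[wanrouter.serial0]: 125000
```

If the thresholds never seem to fire, or fire far too early, a units mismatch like this is a good first thing to check.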

You must specify the external program to run when thresholds are broken.

MRTG only detects threshold breaches. It can then run an external program of whatever form you choose. MRTG itself doesn't send email, doesn't send pages, and doesn't do anything else funky. All it does is execute an external program which you must provide (a very simple one is shown later). MRTG will execute the program with 3 parameters: the target name, the threshold value that was broken, and the current value of that target.

To specify the programs to execute, you use two parameters, "ThreshProgI" and "ThreshProgO". If either "ThreshMaxI" or "ThreshMinI" is broken, then the program specified in "ThreshProgI" will be executed with the 3 parameters described above. "ThreshProgO" works the same way, using "ThreshMaxO" and "ThreshMinO". The program names should not have any quotation marks, and cannot be given any parameters (no spaces). The format for these options is:

ThreshProgI[target.name]: /path/to/some/scripts/threshprogramname
ThreshProgO[target.name]: /path/to/some/scripts/anotherprogramname

You can specify external programs to run when thresholds are OK again.

Since MRTG keeps track of which thresholds were broken the last time it was run (by using the temporary files in the directory specified with "ThreshDir"), it can detect when a value that was in breach is back within its threshold. If you want to execute an external program when that happens, you can use two options to specify the external programs to run. MRTG will execute the programs with 2 parameters: the target name, and the current value of that target.

To specify the programs to execute, you use two parameters, "ThreshProgOKI" and "ThreshProgOKO". If either the input (first) or output (second) value was breaching a threshold, but is now within the threshold, the appropriate program will be executed. The format for these options is:

ThreshProgOKI[target.name]: /path/to/some/scripts/yetanotherthreshprogramname
ThreshProgOKO[target.name]: /path/to/some/scripts/andanotherprogramname

You'll have to create an external program to take some action(s).

I personally prefer to use Perl scripts for my external programs. I'll provide a fairly simple example of one here, which you can feel free to take, modify, distribute, etc. It will log the threshold breach, and also send an email message. It will likely need some minor modifications to work on your system. (It was written and tested on Solaris 2.x.)

    #!/usr/local/bin/perl
    #
    # Called when MRTG detects a threshold problem for a variable.
    #   ARGV[0] = Parameter name, such as 'wanrouter.cpu'.
    #   ARGV[1] = Threshold value which was breached, such as "99".
    #   ARGV[2] = Actual current value of the parameter, such as "100".
    #
    # Command line looks like:
    #   thisprogram wanrouter.cpu 99 100
    #
    use strict;

    my ($timestr, $param, $thresh, $value, $message, $abovebelow);
    my ($logfile, $emailprog, $emailuser);

    $timestr   = localtime(time);
    $param     = $ARGV[0];
    $thresh    = $ARGV[1];
    $value     = $ARGV[2];
    $logfile   = "/tmp/mrtgthresh.log";
    $emailprog = "/usr/ucb/mail -s 'MRTG Thresh'";
    # Single quotes, so Perl does not try to interpolate "@mycompany" as an array.
    $emailuser = 'adminuser@mycompany.com';

    #
    # Do something meaningful with the information.
    # Send an email message, log to a file, execute some script...
    #
    if ($thresh > $value) {
        $abovebelow = "below";
    } else {
        $abovebelow = "above";
    }
    $message = "$param ($value) is $abovebelow threshold ($thresh)";

    # Log it (skipped quietly if the log file cannot be opened).
    if (open(LOG, ">>$logfile")) {
        print LOG "$timestr $message\n";
        close(LOG);
    }

    # Email it.
    system("echo '$message' | $emailprog $emailuser");

    exit(0);

Normally on NT systems, you'd want to specify programs as "c:\path\to\perl c:\path\to\some\script". However, since the "ThreshProg*" options cannot contain parameters (or any spaces), you must instead point them at a batch script, which in turn calls the Perl script. The batch script would be something very simple, such as:

    c:\path\to\perl c:\path\to\some\script\programname %1 %2 %3
    exit

So, a final MRTG config file might look like this (Cisco router CPU):

    WorkDir: /usr/mrtg/data/wanrouter
    ThreshDir: /usr/mrtg/tmp-thresh

    Target[wanrouter.cpu]: 1.3.6.1.4.1.9.2.1.56.0&1.3.6.1.4.1.9.2.1.56.0:public@wanrouter
    MaxBytes[wanrouter.cpu]: 100
    Title[wanrouter.cpu]: wanrouter - CPU Usage (%)
    PageTop[wanrouter.cpu]: <H2>wanrouter - CPU Usage (%)</H2>
    YLegend[wanrouter.cpu]: CPU Usage (%)
    ShortLegend[wanrouter.cpu]: %
    Legend2[wanrouter.cpu]: CPU Usage
    LegendO[wanrouter.cpu]: Usage:
    Options[wanrouter.cpu]: growright, nopercent, gauge
    ThreshMaxI[wanrouter.cpu]: 95
    ThreshProgI[wanrouter.cpu]: /usr/mrtg/script/threshprogi
    ThreshProgOKI[wanrouter.cpu]: /usr/mrtg/script/threshprogoki

It would check whether the input (first) parameter (which happens to be the same as the output parameter) is above 95, and if so, run the script. For instance, if the CPU is running at 98%, it will run the program "/usr/mrtg/script/threshprogi" with the 3 parameters: "wanrouter.cpu", "95", and "98". From the command line, it would look like:

    /usr/mrtg/script/threshprogi wanrouter.cpu 95 98

If the CPU usage then drops back down below 95 (let's use 55%), it will run:

    /usr/mrtg/script/threshprogoki wanrouter.cpu 55

Well, I hope that helps you out. If there are any typos or other problems with this page, send an email to "tom" at (@) "cloudnet.com".


Thomas J. Muggli