Wednesday, October 31, 2012

Nagios SNMP check plugin

I was in need of a basic but cool SNMP Nagios plug-in, so I wrote one.

All I wanted to do was send an snmpget and return a numeric value. If the value was greater than $warn, throw a warning; if it was greater than $critical, throw a critical. Pretty simple, right?
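The threshold rule described above can be sketched on its own, without any SNMP involved. This is only an illustration under my own assumptions: the helper name classify and the sample readings are hypothetical and are not part of the plugin further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the plugin): map a numeric reading
# to a Nagios status using strict "greater than" thresholds.
sub classify {
    my ($value, $warn, $crit) = @_;

    # Anything that is not a plain number cannot be compared safely.
    return "UNKNOWN"
        unless defined $value && $value =~ /^-?\d+(?:\.\d+)?$/;

    return "CRITICAL" if $value > $crit;   # test the higher threshold first
    return "WARNING"  if $value > $warn;
    return "OK";
}

# Made-up sample readings checked against warn=20, crit=30.
for my $reading (8, 25, 35, "n/a") {
    printf "%-4s => %s\n", $reading, classify($reading, 20, 30);
}
```

Note that a reading exactly equal to a threshold does not trip it, because the comparison is strictly greater than.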

Well, this one does it and can be extended easily because you can just keep adding more OIDs.

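First of all, make sure the Perl Net::SNMP module is installed. The plugin also uses the Switch module, which may need to be installed separately on newer versions of Perl.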
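If you are not sure whether the modules are already present, a small check like the one below can tell you before Nagios does. It is a rough sketch: module_available is a hypothetical helper name, and the list of modules is simply the two that the plugin loads.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Net::SNMP -> Net/SNMP.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# The two modules the plugin loads with "use".
for my $module ("Net::SNMP", "Switch") {
    printf "%-10s %s\n", $module,
        module_available($module) ? "found" : "MISSING (install it from CPAN)";
}
```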

Then, you can use it like this:
command[check_mysnmp.db]=/opt/nrpe/libexec/check_mysnmp.pl localhost db 20 30

This tells the plug-in to connect to localhost, use the db option, and set the warn threshold to >20 and the critical threshold to >30. In my environment, we have a maximum of 40 DB connections in the DB pool. We normally see only 5 to 10 DB connections, so I set the warning to 20 and the critical to 30. Once we hit 30, we definitely know there is a connection leak.

You can add as many OIDs as you want. Just keep adding more to the case statement in the plugin and change the OID to yours. Then, set the threshold to what you need.

For example, let's say you have an OID .1.2.3.4.1.12 that corresponds to the number of errors that have occurred. Let's say that you can tolerate up to 10 errors at any given time, but if you start to see 50 to 100 errors, you want to know about it.

Add the following inside the switch, right after the case(db) block:


 case(apperr)  { $apperr_oid=".1.2.3.4.1.12";
              ($apperr_result,$apperr_exit)=&check_snmp($apperr_oid);
              &time2exit($apperr_exit,$apperr_result); }


Then, in your nrpe config, you would add the following (warning above 10 errors, critical above 50):
command[check_mysnmp.apperr]=/opt/nrpe/libexec/check_mysnmp.pl localhost apperr 10 50
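As the list of checks grows, one alternative to adding a case per OID is to keep the check names and OIDs in a lookup table. The sketch below is not how the plugin is written; it is a possible variation, and oid_for_type is a hypothetical helper name. The OIDs are the same placeholder values used in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Check name => OID (placeholder OIDs from this post).
my %oid_for = (
    db     => ".1.2.3.4.1.11",   # used DB connections
    apperr => ".1.2.3.4.1.12",   # application error count
);

# Hypothetical helper: return the OID for a check name, or undef when the
# name is not in the table (the caller can then exit UNKNOWN).
sub oid_for_type {
    my ($type) = @_;
    return undef unless defined $type && exists $oid_for{$type};
    return $oid_for{$type};
}

for my $type ("db", "apperr", "bogus") {
    my $oid = oid_for_type($type);
    print "$type => ", (defined $oid ? $oid : "no such check"), "\n";
}
```

With this layout, adding a new check is a one-line change to the hash plus a matching command line in the nrpe config.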

And here is the wonderful plugin:


#!/usr/bin/perl
#AUTHOR: GOU YANG
#PURPOSE: This is a nagios plugin to check snmp
#pass the host, the check type, the warn threshold and the critical threshold
#
#                      if we can't make an snmp connection  =UNKNOWN  (who knows what happened)
#                      if > warn threshold                  =WARNING  (throw a warning)
#                      if > critical threshold              =CRITICAL (throw a critical)
#                      if less than warn & critical         =OK       (must be ok)
#
# $type    = one of the case labels below (selects the OID to query)
# $thewarn = value above which a warning is thrown
# $thecrit = value above which a critical is thrown

use Switch;
use Net::SNMP;

$thehost=shift;
$type=shift;
$thewarn=shift;
$thecrit=shift;

switch($type) {

#the number of used db connection; for example
  case(db)  { $db_used_oid=".1.2.3.4.1.11";
              ($db_result,$db_exit)=&check_snmp($db_used_oid);
              &time2exit($db_exit,$db_result); }

  default   { &check_snmp(); }
}

sub check_snmp{

$theoid=shift;

# thresholds are tested with defined() so that a threshold of 0 is accepted
if ( !$thehost || !$theoid || !defined $thewarn || !defined $thecrit ) { &time2exit("UNKNOWN","Make sure to specify host, type, warning and critical values"); }

 else {

  ($session,$error) = Net::SNMP->session(
   -hostname => "$thehost",
   -community => "community_string",   # replace with your own SNMP community string
   -timeout   => 10);

  if (!$session){ &time2exit("UNKNOWN","$error"); }
   else {

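  $result = $session->get_request($theoid);

  # get_request returns undef on failure; report the session's error text
  if (!defined $result){
    $error = $session->error;
    $session->close;
    &time2exit("UNKNOWN","$error");
  }

  $session->close;

  # the reply is a hash reference keyed by OID; we asked for exactly one OID
  ($snmp_result) = values %$result;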

  }
 }

if ($snmp_result > $thewarn){$exitstat="WARNING";  $msg=$snmp_result;}
if ($snmp_result > $thecrit){$exitstat="CRITICAL"; $msg=$snmp_result;}
# anything that is not a plain number (e.g. an SNMP error string) is UNKNOWN
if ($snmp_result !~ /^-?\d+(?:\.\d+)?$/){$exitstat="UNKNOWN";  $msg=$snmp_result;}

return ($snmp_result,$exitstat);

}#sub

sub time2exit{

$exitstat=shift;
$msg=shift;

 switch($exitstat) {

   case(UNKNOWN) { print "UNKNOWN - $msg\n";exit 3; }
   case(WARNING) { print "WARNING - $msg\n";exit 1; }
   case(CRITICAL){ print "CRITICAL - $msg\n";exit 2; }

   default       { print "OK - snmp stat is $snmp_result\n";exit 0; }

 }
}
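The exit codes in time2exit follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). If you would rather not depend on the Switch module, the same mapping can be sketched with a plain hash. This is only an illustration: plugin_output is a hypothetical helper, and unlike the plugin above it treats an unrecognised status as UNKNOWN rather than OK, so the caller has to pass "OK" explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Nagios plugin exit codes.
my %exit_code = (OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3);

# Hypothetical helper: build the status line and exit code for a result.
# A missing or unrecognised status is reported as UNKNOWN.
sub plugin_output {
    my ($status, $msg) = @_;
    $status = "UNKNOWN" unless defined $status && exists $exit_code{$status};
    return ("$status - $msg", $exit_code{$status});
}

# A real plugin would print the line and then call "exit $code";
# here the code is only printed so the sketch can be run safely.
my ($line, $code) = plugin_output("WARNING", 25);
print "$line (exit code $code)\n";
```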
