Nagiosr Consolidated Report of Hosts and Services with Their Duration – Nagios Reporting

Nagios is an enterprise monitoring tool that can monitor almost anything that supports SNMP, NSClient, or NRPE. Monitoring is the easy part; reporting is harder, because presenting the data in consolidated form usually takes most of the time. In this article we use the nagiosr script to generate a consolidated report of host and service outages and their duration.

Nagios Package source

How to Generate a Nagiosr Consolidated Report

The consolidated report looks like the screenshot below. Each line shows the start time, end time, server name, service name, and duration.

nagiosr sample report


To produce a report like the one above, follow the steps below.

Step 1: Download the nagiosr plugin from the site given above

It uses hook functions to generate text-based reports from the command line.

Step 2: Install the required packages for the nagiosr reporting tool

Install the packages listed below using yum:

[root@ark-nagios ~]# yum install perl-JSON perl-libwww-perl perl-XML-Parser perl-Net-SSLeay perl-Excel-Writer-XLSX --skip-broken
[root@ark-nagios ~]# yum install perl-TermReadKey
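
To confirm the packages landed, one option is to ask Perl to load each module. This is a minimal sketch: the module names are the ones these packages normally provide, and the helper name check_perl_module is only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if Perl can load the named module.
check_perl_module() {
  perl -M"$1" -e 1 >/dev/null 2>&1
}

# Modules normally provided by the packages installed above.
for mod in JSON LWP::UserAgent XML::Parser Net::SSLeay \
           Excel::Writer::XLSX Term::ReadKey; do
  if check_perl_module "$mod"; then
    echo "ok       $mod"
  else
    echo "MISSING  $mod"
  fi
done
```

Any line marked MISSING points to a package that yum skipped, which can happen silently when --skip-broken is used.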

 

Step 3: Configuring Nagiosr Script

Before running ./nagiosr, a little configuration is required. Create the file /usr/local/nagios/etc/nagiosr.cf and copy the content below into it.

# Example nagios.cf
# blank lines and '#' lines are ignored.
path = /usr/local/nagios/var
# define a custom header, if you wish.
# Your headings should stay in the same order. They are comma-delimited.
svc_header = Date & Time Down,Date & Time Up,Host / Service,Plugin Output,Duration
host_header = Date & Time Down,Date & Time Up,Hostname,Plugin output,Duration
# This is an abbreviated form of Perl's format function.
# use '<' for left-justify, '|' for center, '>' for right
# Then, put the number of spaces you want to display (the fn adds 1).
# This is followed by a comma-delimited list of the columns you wish to display.
# Note: "Hostname" or "Host / Service" plus "Plugin Output" are
# intelligently merged and thus guaranteed to be the width you specify here.
# Note also: you can not (currently) change the order of columns.
# Options: $timedown,$timeup,$descr_and_plugout,$duration
format = @<17 @<17 @|48 @>9,$timedown,$timeup,$descr_and_plugout,$duration
#############################################################################
# The *_hook definitions
# - define your regular expressions to shorten your output
# - all REs should assume they are using $_
# - you can embed comments within the text (line must begin w/ #)
# - end the *_hook declaration with a single '.'
#############################################################################
# name_hook shrinks the host and/or service names.
name_hook =
#quasi-generic
s/Win Services/WinS/; s/MySQL.*/MySQL/; s/System Time/Systime/;
s/ Disk Space.*//; s/Drive (\w)/drive$1/; s/CPU Load.*/CPU/;
s/.*(Memory).*/mem/; s/[Ss]yslogs/logs/; s/Bind DNS/DNS/;
s/Open//;
# Dell-specific
s/ (AM|SM)//; s/Log Disks/logdisk/;
# site-specific:
.
#############################################################################
# svc_plugin_hook shrinks the service plugin output.
svc_plugin_hook =
# check_pingwithperl...
s/.* (\d+)% packet loss, \d+.\d+ ms ave rtt/$1% pkt loss/;
# check_tcp...
s/.* (\d+\.\d+) second[s]? response time.*/$1 sec response/;
s/.* (\d+) second[s]? response time.*/$1 sec response/;
s/.* (\d+\.\d+) sec[s]? response time.*/$1 sec response/;
# check_ssh...
s/.* (.*?OpenSSH.*?) .*/$1/;
# check_netsnmp_disk & check_netsnmp_bigdisk...
# also works for check_dcache_usage...
#while ( $_ =~ /(\d+\.\d+) TB/ ) {
# my $tb = $1;
# WARNING: will fail for > 9999 GB...
# my $gb = sprintf("%4.4s",int($tb *1024));
# $_ =~ s/$tb TB/$gb GB/;
#}
#while ( $_ =~ /(\d+\.\d+) GB/ ) {
# my $gb = $1;
# WARNING: will fail for > 9999 GB...
# my $gb_new = sprintf("%4.4s",int($gb));
# $_ =~ s/$gb GB/$gb_new GB/;
#}
s/.*?(\d+ GB total,).*?,(\s*\d+ GB avail)/$1$2/;
# check_netsnmp_loadave...
s/.* load average: (\d+\.\d+).*/$1 loadave/;
# check_ntp...
s/.* Offset ([-]*\d+\.\d+) secs.*/$1 sec offset/;
s/.* stratum (\d+), offset ([-]*\d+\.\d+).*/stratum $1, $2 sec offset/;
s/.* Received \d+ DHCPOFFER.*max lease time = (\d+) sec.*/$1 sec lease time/;
# check_dhcp...
s/.* \d+ in use, (\d+) free/$1 free leases/;
if ( s/DHCP problem: (.*)\.?/$1/ ) { $_ = lc($_); }
# check_afs_*...
s/.* (\d+ blocked) connections/$1/;
s/(.*?) AFS (\/.*)/$1 $2/;
s/(\d+) processes running normally/$1 ok processes/;
s/one process running normally/one ok process/;
s/(\d+) volumes under quota/$1 ok volumes/;
s/db version (\d+.\d+)/db $1/;
# check_condor_client...
s/.* vm1 = .*?\/(\S+), vm2 = .*?\/(\S+),.*/$1\/$2/;
s/.* vm1 = .*?\/(\S+), vm2 = .*?\/(\S+).*/$1\/$2/;
s/.* cpu = (\S+)/$1/;
s/CondorQueue.*?(\d+ job[s]?, \d+ running).*/$1/;
s/.*?No condor status.*/no condor status/;
# check_condor_pool...
s/.*?(\d+) nodes.*/$1 nodes/;
# check_nsr...
s/.*?(\d+\.\d+ GB), (\d+ saves) since.*/$1, $2/;
s/(\d+ GB avail)able, \d+ GB total/$1/;
# check_LPRng_queue...
s/(\d+) active job[s]?/$1 active/;
s/(\d+) stalled job[s]?/$1 stalled/;
s/(\d+) spooled job[s]?/$1 spooled/;
s/(\d+) incoming job[s]?/$1 incoming/;
# check_dcache*...
s/.*no status available.*/no dcache status/; s/service is (.*)/$1/;
s/.*(\d+) ms ave ping time/$1 ms ping time/; s/.*(\d+) ms ping time/$1 ms ping time/;
s/\d+\.\d+ bps/000.00 Mbps/g;
# check_netsnmp_raid...
s/.*degraded.*/degraded/; s/.*degraded/degraded/; s/.*built.*/building/;
s/.*built/building/; s/.*optimal.*/optimal/;
# plugin generic...
s/.*no response.*/connection timed out/i;
s/.*no route to host.*/no route to host/i;
s/Socket timeout.*/socket timed out/;
# nagios generic...
s/\(Service Check Timed Out\)/check timed out/;
s/\(No output returned from plugin\)/no output from plugin/;
s/Service check scheduled for.*/none/;
s/No data yet.*/no data yet/;
# generic generic...
s/.*?OK - //i; s/.*?WARNING - //i; s/.*?CRITICAL - //i;
# Additions
s/.*[Nn]o data received.*/no data received/;
s/SNMP ([Pp]roblem|OK)( - )?//;
s/.*?([Pp]lugin timed out).*/plugin timeout/;
s/.*[Cc]an't [Cc]onnect.*/connection failure/;
s/NRPE.*[Cc]ommand.*(does not exist|not defined).*/bad NRPE command/;
s/.*total.*([Uu]sed.*?\(\d+%\)).*/$1/;
s/.*(CPU Load \d+%).*/$1/;
s/.*?HTTP WARNING.*No [Hh]ost matches.*/no page found/;
s/.*System (time|date) is (way )?off.*/system time off/;
#site-specific
s/.*(\.log).*not updated.*(errors)?/log not updated/;
s/.*? (\w+) - (\d+ crits).*/$2/;
s/.*has not been rebooted.*/not rebooted/;
s/.*rebooted today.*/rebooted today/;
s/.*[Ss]topped/stopped services/;
s/.*possible error.*log.*/possible log error/;
.
############################################################################
host_plugin_hook =
s/\(Host assumed to be up\)/assumed up/; s/\(Host check timed out\)/timed out/;
s/\(Not enough data to determine host status yet\)/none/;
s/Ping .*? - (\d+)% packet loss.*/$1% pkt loss/;
# generic generic...
s/.*?OK - //i; s/.*?WARNING - //i; s/.*?CRITICAL - //i;
s/.*?([Pp]lugin timed out).*/plugin timeout/;
# site-specific
.

Now you are almost ready to generate a text-based report. If needed, you can change the script to suit your own requirements.
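
Before adding a rule to one of the hooks, it can help to try it on a sample line of plugin output. The sketch below is a sed translation of two rules from svc_plugin_hook (sed has no \d, so [0-9] is used instead). The function name and the sample input are made up for illustration, and nagiosr itself applies the rules in Perl, so treat this only as a quick preview.

```shell
#!/bin/sh
# Hypothetical helper: apply sed versions of two svc_plugin_hook rules
# to one line of plugin output passed as the first argument.
shorten_plugin_output() {
  printf '%s\n' "$1" | sed -E \
    -e 's/.* load average: ([0-9]+\.[0-9]+).*/\1 loadave/' \
    -e 's/\(Service Check Timed Out\)/check timed out/'
}

# Sample input invented for this example.
shorten_plugin_output 'OK - load average: 0.42, 0.30, 0.25'
```

With the sample line above, the first rule keeps only the first load figure and prints 0.42 loadave.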

Extract the downloaded package using the command below:

unzip nagios-1010.zip

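Make the nagiosr script executable. Granting execute permission to the owner is enough:

chmod u+x nagiosr

chmod 777 nagiosr also works, but it lets every user on the system modify the script, so it is best avoided.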

Step 4: Run the nagiosr script to generate a CLI report

[root@ark-nagios ~]# perl nagiosr
[root@ark-nagios ~]# ./nagiosr

Either command above runs the script, which reports the services that went down in the last 24 hours. That output alone does not tell you which problems reached a HARD state and which stayed SOFT.

The nagiosr script has many options. Let's go through them.

Generate a host-down report

[root@ark-nagios ~]# ./nagiosr -H -s

This shows the hosts that went down in the last 24 hours. If a host has not yet come back up, it is shown as Still CRITICAL.

Point to a different configuration file and run the report with the new .cf file:

./nagiosr --conf=new.cf -H

To run the report against an archived Nagios log file, use the command below:

./nagiosr --file=/usr/local/nagios/var/archive1.log
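
If several archived logs need reporting, a small loop saves typing. This is a sketch, assuming nagiosr sits in the current directory; the function name report_each_log and the NAGIOSR variable are illustrative and not part of nagiosr.

```shell
#!/bin/sh
# Hypothetical helper: run nagiosr once per log file given as an argument.
# NAGIOSR can be set to point at the script; it defaults to ./nagiosr.
report_each_log() {
  for log in "$@"; do
    # Skip files that are missing or unreadable instead of failing midway.
    [ -r "$log" ] || { echo "skipping unreadable file: $log" >&2; continue; }
    "${NAGIOSR:-./nagiosr}" --file="$log"
  done
}
```

Call it with whichever archive files exist on your system, for example report_each_log /usr/local/nagios/var/archive1.log.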

Sort the report by column to make it easier to read. Append one of the column names listed below to --sort=:

./nagiosr --sort=
  • timedown
  • timeup
  • host
  • plugin
  • duration

To get a summary of SOFT and HARD alerts, use the command below:

./nagiosr -s
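
As a rough cross-check of that summary, the raw Nagios log can be counted directly, since each SERVICE ALERT line records the state type (HARD or SOFT) as its fourth semicolon-separated field. The helper name below is illustrative, and the log path in the usage note assumes the standard nagios.log under the path set in nagiosr.cf.

```shell
#!/bin/sh
# Hypothetical helper: count SERVICE ALERT lines of one state type.
# $1 = state type (HARD or SOFT), $2 = Nagios log file
count_service_alerts() {
  grep -c "SERVICE ALERT: .*;$1;" "$2"
}
```

For example, count_service_alerts HARD /usr/local/nagios/var/nagios.log prints the number of HARD service alerts in the current log. The numbers will not necessarily match nagiosr, which pairs down and up events rather than counting lines.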

You can also limit the report to a given state. Append one of the values listed below to --state=:

./nagiosr --state=
  • down
  • warning
  • critical
  • both

Today's report from the command line:

./nagiosr -t --state=critical

Yesterday's report:

./nagiosr -y --state=critical
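
To keep a history of these reports, the output can be redirected to a dated file. This is a minimal sketch under stated assumptions: the script location /root/nagiosr and the report directory are hypothetical defaults, the function names are illustrative, and date -d yesterday relies on GNU date as shipped with RHEL and CentOS.

```shell
#!/bin/sh
# Hypothetical locations; adjust to where nagiosr was unpacked
# and where the reports should be stored.
NAGIOSR="${NAGIOSR:-/root/nagiosr}"
REPORT_DIR="${REPORT_DIR:-/var/tmp/nagiosr-reports}"

# Build the output file name for a given date (YYYY-MM-DD).
report_path() {
  printf '%s/critical-%s.txt\n' "$REPORT_DIR" "$1"
}

# Save yesterday's critical-state report to a dated file.
save_yesterday_report() {
  day=$(date -d yesterday +%F)   # GNU date syntax
  mkdir -p "$REPORT_DIR" || return 1
  "$NAGIOSR" -y --state=critical > "$(report_path "$day")"
}
```

Calling save_yesterday_report once a day, for instance from cron, leaves one text file per day in the report directory.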

To check the version of the nagiosr script:

./nagiosr -V
nagiosr v. 1.0.10


Thanks for your wonderful Support and Encouragement

ARK

My Name is ARK. Expert in grasping any new technology, Interested in Sharing the knowledge. Learn more & Earn More
