Netapp Cluster Mode Monitoring using Nagios Nagios-Cdot-Netapp Plugin

Monitoring storage usage, performance, and space utilization is important: if something goes wrong on the SAN or NAS, the applications that depend on it fail too. To avoid that, the storage devices should be monitored continuously, and doing this by hand is not practical. This is a step-by-step guide to monitoring NetApp Cluster Mode with Nagios and the Nagios-Cdot-Netapp plugin.

Netapp Cluster Mode Monitoring using Nagios 

Nagios Core, the completely free open source edition, is enough for this setup.

Download the plugins and keep them ready.

Also download the NetApp SDK file; this requires a login to the NetApp MySupport site.

Copy both files to the Linux server and extract them using unzip <File Name>

[root@Ark-Nagios netapp-cdot-nagios-master]# ./check_cdot_aggr.pl
Can't locate NaElement.pm in @INC (@INC contains: /usr/lib/netapp-manageability-sdk/lib/perl/NetApp /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/local/lib64/perl5/NaServer.pm line 34.
BEGIN failed--compilation aborted at /usr/local/lib64/perl5/NaServer.pm line 34.
Compilation failed in require at ./check_cdot_aggr.pl line 17.
BEGIN failed--compilation aborted at ./check_cdot_aggr.pl line 17.

You will see the error above until the NetApp SDK is placed under /usr/lib/. After extracting the SDK file, move the whole directory to /usr/lib/, as in the example below.

[root@Ark-Nagios ~]# mv netapp-manageability-sdk-5.6 /usr/lib/netapp-manageability-sdk
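To confirm the move worked before re-running the plugin, a quick check along these lines may help. It is only a sketch: the directory is the one listed in the @INC output above, and the helper name sdk_perl_ok is made up for this example. It assumes the SDK keeps NaServer.pm and NaElement.pm together in that directory.

```shell
#!/bin/sh
# sdk_perl_ok DIR (hypothetical helper): succeed only if DIR holds the two
# Perl modules named in the error output, NaServer.pm and NaElement.pm.
sdk_perl_ok() {
    [ -f "$1/NaServer.pm" ] && [ -f "$1/NaElement.pm" ]
}

# Path taken from the @INC list in the error message above.
SDK_PERL_DIR=/usr/lib/netapp-manageability-sdk/lib/perl/NetApp

if sdk_perl_ok "$SDK_PERL_DIR"; then
    echo "SDK Perl modules found in $SDK_PERL_DIR"
else
    echo "SDK Perl modules missing from $SDK_PERL_DIR"
fi
```

If the second message appears, the plugin will most likely fail with the same "Can't locate NaElement.pm" error.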

Execute commands manually from Nagios Server and check status

Run the commands manually first to make sure they work.

[root@Ark-Nagios netapp-cdot-nagios-master]# ./check_cdot_aggr.pl --hostname 192.168.2.20 --username Username --password Password --warning 80 --critical 90
CRITICAL: aggr1_SAS_ARK_NA01 (93%), aggr1_SAS_ARK_NA03 (90%), aggr1_SAS_ARK_NA04 (66%), aggr1_SATA_ARK_NA02 (63%), aggr2_SAS_ARK_NA01 (59%), aggr2_SAS_ARK_NA03 (88%), aggr3_SAS_ARK_NA01 (33%), aggr4_SAS_ARK_NA01 (7%), root_ARK_NA01 (96%), root_ARK_NA02 (95%), root_ARK_NA03 (95%), root_ARK_NA04 (95%)
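The plugin prints every aggregate on one line, so it can be hard to see which ones caused the alert. The sketch below is one way to read such a result. It assumes the "name (NN%)" output format shown above and the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper names state_name and over_threshold are made up for this example.

```shell
#!/bin/sh
# state_name CODE: translate a Nagios plugin exit status into its state name.
# 3 (and, in this sketch, any other value) is reported as UNKNOWN.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# over_threshold LIMIT: read plugin output on stdin and print "name percent"
# for every entry whose usage is at or above LIMIT.
over_threshold() {
    grep -oE '[^ ,]+ \([0-9]+%\)' | tr -d '()%' |
        awk -v limit="$1" '$2 + 0 >= limit + 0 { print $1, $2 }'
}

# Shortened copy of the output above, used here as sample input.
sample='CRITICAL: aggr1_SAS_ARK_NA01 (93%), aggr1_SAS_ARK_NA04 (66%), root_ARK_NA01 (96%)'
printf '%s\n' "$sample" | over_threshold 90
```

When running the real plugin, `state_name $?` right after the command shows the state that Nagios would record for that check.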

As the output above shows, the commands work as expected, so let's move them to the plugin directory and configure the commands.

[root@Ark-Nagios netapp-cdot-nagios-master]# cp * /usr/local/nagios/libexec/

Configure Commands in commands.cfg for the Nagios-Cdot-Netapp Plugins

[root@Ark-Nagios ~]# vi /usr/local/nagios/etc/objects/commands.cfg
define command {
 command_name checkaggr
 command_line /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname $HOSTADDRESS$ --username UserName --password Password --warning 90 --critical 96
}

#Netapp Cluster Mode Monitoring Interface Status
define command {
 command_name checkint
 command_line /usr/local/nagios/libexec/check_cdot_interfaces.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}

#Netapp Cluster Mode Monitoring snapmirror snapshot age by default 90Days
define command {
 command_name checksnaps
 command_line /usr/local/nagios/libexec/check_cdot_snapmirror_snapshots.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}

#Disk Failed Status Netapp Cluster Mode Monitoring
define command {
 command_name checkdisk
 command_line /usr/local/nagios/libexec/check_cdot_disk.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}

#Check Snapmirror Idle time Netapp Cluster Mode Monitoring Using Nagios
define command {
 command_name checksnapm
 command_line /usr/local/nagios/libexec/check_cdot_snapmirror.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}

#Volume Space Usage Netapp Cluster Mode Monitoring
define command {
 command_name cdotvol
 command_line /usr/local/nagios/libexec/check_cdot_volume.pl --hostname $HOSTADDRESS$ --username UserName --password Password --size-warning 95 --size-critical 97 --inode-warning 95 --inode-critical 97 --volume $ARG1$ --perf
}

#Aggregate Status Netapp Cluster Mode Monitoring
define command {
 command_name cdotaggr
 command_line /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname $HOSTADDRESS$ --username UserName --password Password --warning 95 --critical 97 --perf --aggr $ARG1$
}

Generate / Define your Host config File for Monitoring

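Using the command names above, define your host config file with the monitoring checks.

The host definition and the PING and SSH service checks come first, followed by the storage checks. The services that call checkontap, and the generic-service-storage template, are not defined in this guide; they are assumed to exist already in your Nagios configuration.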

######################################
#### Host Definition: Netapp-Cdot ###
######################################

define host{
 use linux-server
 host_name Netapp-Cdot
 alias Netapp Filer Cluster Mode
 address 192.168.2.20
 action_url /pnp4nagios/graph?host=$HOSTNAME$
 }


define service{
 use generic-service
 host_name Netapp-Cdot
 service_description PING
 check_command check_ping!200.0,40%!500.0,60%
 notifications_enabled 1
 process_perf_data 1
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 }

define service{
 use local-service ; Name of service template to use
 host_name Netapp-Cdot
 service_description SSH
 check_command check_ssh
 notifications_enabled 1
 process_perf_data 1
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Interface Health Status
 check_command checkontap!interface_health
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Port Status
 check_command checkontap!port_health
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }


define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Cluster Health Status
 check_command checkontap!cluster_health
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Filer Hardware Health Status
 check_command checkontap!filer_hardware_health
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Volume Health Status
 check_command checkontap!volume_health
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description cDOT Interfaces Status
 check_command checkint
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description cDOT Snapmirror Snapshot Status
 check_command checksnaps
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description cDOT Disk Health Status
 check_command checkdisk
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }


define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description cDOT Snapmirror Status
 check_command checksnapm
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description cDOT Aggregate Status
 check_command checkaggr
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

Likewise, list all your volumes and add one service definition for each.

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Volume Volume1
 check_command cdotvol!Volume1
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }

Define all aggregates the same way: copy the definition once per aggregate and replace the aggregate name.

define service{
 use generic-service-storage ; Name of service template to use
 host_name Netapp-Cdot
 service_description Aggregate Usage aggr1
 check_command cdotaggr!aggr1
 action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
 process_perf_data 1
 }
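Copying and pasting one definition per volume or aggregate gets tedious on a large cluster. As a rough sketch, a small shell function can print the same definitions for a list of names. The helper name gen_service and the names Volume2 and aggr2 are hypothetical. The action_url here uses the standard Nagios macros $HOSTNAME$ and $SERVICEDESC$.

```shell
#!/bin/sh
# gen_service COMMAND LABEL NAME (hypothetical helper)
#   COMMAND  command_name from commands.cfg (cdotvol or cdotaggr)
#   LABEL    prefix for service_description
#   NAME     volume or aggregate name, passed to the command as $ARG1$
gen_service() {
    printf 'define service{\n'
    printf ' use generic-service-storage\n'
    printf ' host_name Netapp-Cdot\n'
    printf ' service_description %s %s\n' "$2" "$3"
    printf ' check_command %s!%s\n' "$1" "$3"
    # Single quotes keep the shell from expanding the Nagios macros.
    printf ' action_url %s\n' '/pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$'
    printf ' process_perf_data 1\n'
    printf ' }\n\n'
}

# Example names only; replace them with the real lists from your cluster.
for vol in Volume1 Volume2; do
    gen_service cdotvol Volume "$vol"
done
for aggr in aggr1 aggr2; do
    gen_service cdotaggr "Aggregate Usage" "$aggr"
done
```

Redirect the output to a file, review it, and then append it to your host config file.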

That's it: your Nagios Core server now monitors the NetApp Cluster Mode filers continuously and sends you alerts when usage reaches WARNING or CRITICAL.
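Before reloading Nagios, it may be worth confirming that every check_command used by a service has a matching command_name. The sketch below is one way to do that with awk and grep. The helper name missing_commands is made up for this example, and it assumes each check_command and command_name directive sits on its own line, as in the definitions above.

```shell
#!/bin/sh
# missing_commands COMMANDS_CFG HOST_CFG (hypothetical helper)
# Print every command referenced by check_command in HOST_CFG that has no
# command_name entry in COMMANDS_CFG.
missing_commands() {
    defined=$(awk '$1 == "command_name" { print $2 }' "$1" | sort -u)
    # Keep only the command name, dropping any !arguments after it.
    awk '$1 == "check_command" { split($2, parts, "!"); print parts[1] }' "$2" |
        sort -u |
        while read -r cmd; do
            printf '%s\n' "$defined" | grep -qxF -- "$cmd" || printf '%s\n' "$cmd"
        done
}

# Example call; the host file name is an assumption, use your own:
#   missing_commands /usr/local/nagios/etc/objects/commands.cfg \
#                    /usr/local/nagios/etc/objects/netapp-cdot.cfg
```

An empty result means every referenced command is defined in that commands file. Any name that is printed needs a define command block before its services can run.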

Conclusion:

Don't be afraid when Perl reports that it can't locate NaElement.pm: download the NetApp SDK and move it under /usr/lib/ as shown above. Nagios Core is completely free and open source, so it can save a lot of money while still handling enterprise monitoring. NetApp Cluster Mode monitoring with Nagios is now enabled. Enjoy.
