NetApp Cluster Mode Monitoring Using Nagios and the Nagios-Cdot-Netapp Plugin
Monitoring storage usage, performance and space utilization is important: if something goes wrong on the SAN / NAS, the applications that depend on it will fail too. To avoid that, we should monitor the storage devices continuously. Monitoring them manually is not a good idea and is not humanly possible. This is a step-by-step guide to NetApp Cluster Mode monitoring using Nagios and the Nagios-Cdot-Netapp plugin.
Netapp Cluster Mode Monitoring using Nagios
This also works if you use Nagios Core, the completely free, open source edition.
Download the plugins and keep them ready.
Also download the NetApp Manageability SDK file; this requires a login to the NetApp MySupport site.
Copy both files to the Linux server and extract them using unzip <File Name>
[root@Ark-Nagios netapp-cdot-nagios-master]# ./check_cdot_aggr.pl
Can't locate NaElement.pm in @INC (@INC contains: /usr/lib/netapp-manageability-sdk/lib/perl/NetApp /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/local/lib64/perl5/NaServer.pm line 34.
BEGIN failed--compilation aborted at /usr/local/lib64/perl5/NaServer.pm line 34.
Compilation failed in require at ./check_cdot_aggr.pl line 17.
BEGIN failed--compilation aborted at ./check_cdot_aggr.pl line 17.
You will see the error above until the NetApp SDK files are placed under /usr/lib/. After extracting the SDK file, move the whole directory to /usr/lib/ as in the example below.
[root@Ark-Nagios ~]# mv netapp-manageability-sdk-5.6 /usr/lib/netapp-manageability-sdk
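After moving the SDK, a quick check that the Perl modules are where the plugin looks for them can save a round of trial and error. The sketch below is only an illustration: the function name sdk_modules_present is made up for this example, and the default directory is taken from the @INC entry in the error output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin or the SDK): report whether the
# two Perl modules named in the error above exist in a directory.
# The default path is the SDK entry from the @INC list in the error output.
sdk_modules_present() {
    dir="${1:-/usr/lib/netapp-manageability-sdk/lib/perl/NetApp}"
    for module in NaServer.pm NaElement.pm; do
        if [ ! -f "$dir/$module" ]; then
            echo "missing: $dir/$module"
            return 1
        fi
    done
    echo "SDK Perl modules found in $dir"
}

# Example calls:
#   sdk_modules_present
#   sdk_modules_present /some/other/sdk/lib/perl/NetApp
```

If the function reports a missing module, the SDK directory was probably moved to a different name or extracted one level deeper than expected.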
Execute commands manually from Nagios Server and check status
Run the commands manually to make sure they work.
[root@Ark-Nagios netapp-cdot-nagios-master]# ./check_cdot_aggr.pl --hostname 192.168.2.20 --username Username --password Password --warning 80 --critical 90
CRITICAL: aggr1_SAS_ARK_NA01 (93%), aggr1_SAS_ARK_NA03 (90%), aggr1_SAS_ARK_NA04 (66%), aggr1_SATA_ARK_NA02 (63%), aggr2_SAS_ARK_NA01 (59%), aggr2_SAS_ARK_NA03 (88%), aggr3_SAS_ARK_NA01 (33%), aggr4_SAS_ARK_NA01 (7%), root_ARK_NA01 (96%), root_ARK_NA02 (95%), root_ARK_NA03 (95%), root_ARK_NA04 (95%)
As the output above shows, the command works as expected. Let's move the plugins to the plugin directory and configure the commands.
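Nagios decides the state of a check from the plugin's exit code, not from the text it prints: by the standard plugin convention 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The small sketch below turns an exit code into its state name so a manual run can be checked both ways. The function name nagios_state_name is made up for this example, and it is assumed (not verified here) that the cDOT plugins follow the standard convention.

```shell
#!/bin/sh
# Hypothetical helper: map a Nagios plugin exit code to its state name,
# following the standard Nagios plugin convention (0, 1, 2, anything else).
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Usage right after a manual plugin run ($? holds the plugin's exit code):
#   ./check_cdot_aggr.pl --hostname 192.168.2.20 --username Username \
#       --password Password --warning 80 --critical 90
#   nagios_state_name $?
```

If the state name printed here does not match the first word of the plugin output, the check will not behave in Nagios the way the manual run suggests.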
[root@Ark-Nagios netapp-cdot-nagios-master]# cp * /usr/local/nagios/libexec/
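The cp * above copies everything in the directory, including files that are not plugins. As an alternative sketch, the function below copies only the check_cdot_*.pl scripts and makes them executable so the Nagios user can run them. The function name install_cdot_plugins is made up for this example, and the default destination is the libexec path used in this guide.

```shell
#!/bin/sh
# Hypothetical helper: copy only the check_cdot_*.pl scripts from a source
# directory into the Nagios plugin directory and mark them executable.
#   $1 = directory with the extracted plugins
#   $2 = plugin directory (defaults to the path used in this guide)
install_cdot_plugins() {
    src="$1"
    dest="${2:-/usr/local/nagios/libexec}"
    for plugin in "$src"/check_cdot_*.pl; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$plugin" ] || continue
        cp "$plugin" "$dest"/ || return 1
        chmod 755 "$dest/$(basename "$plugin")" || return 1
    done
}

# Example call from the extracted plugin directory:
#   install_cdot_plugins . /usr/local/nagios/libexec
```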
Configure the Nagios-Cdot-Netapp Plugin Commands in the commands.cfg File
[root@Ark-Nagios ~]# vi /usr/local/nagios/etc/objects/commands.cfg
define command {
command_name checkaggr
command_line /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname $HOSTADDRESS$ --username UserName --password Password --warning 90 --critical 96
}
#Netapp Cluster Mode Monitoring Interface Status
define command {
command_name checkint
command_line /usr/local/nagios/libexec/check_cdot_interfaces.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}
#Netapp Cluster Mode Monitoring snapmirror snapshot age by default 90Days
define command {
command_name checksnaps
command_line /usr/local/nagios/libexec/check_cdot_snapmirror_snapshots.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}
#Disk Failed Status Netapp Cluster Mode Monitoring
define command {
command_name checkdisk
command_line /usr/local/nagios/libexec/check_cdot_disk.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}
#Check Snapmirror Idle time Netapp Cluster Mode Monitoring Using Nagios
define command {
command_name checksnapm
command_line /usr/local/nagios/libexec/check_cdot_snapmirror.pl --hostname $HOSTADDRESS$ --username UserName --password Password
}
#Volume Space Usage Netapp Cluster Mode Monitoring
define command {
command_name cdotvol
command_line /usr/local/nagios/libexec/check_cdot_volume.pl --hostname $HOSTADDRESS$ --username UserName --password Password --size-warning 95 --size-critical 97 --inode-warning 95 --inode-critical 97 --volume $ARG1$ --perf
}
#Aggregate Status Netapp Cluster Mode Monitoring
define command {
command_name cdotaggr
command_line /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname $HOSTADDRESS$ --username UserName --password Password --warning 95 --critical 97 --perf --aggr $ARG1$
}
Generate / Define your Host config File for Monitoring
Using the command names above, define a host config file that contains the monitoring checks.
The host definition and the PING and SSH service monitoring config are below.
######################################
#### Host Definition: Netapp-Cdot ####
######################################
define host{
    use                     linux-server
    host_name               Netapp-Cdot
    alias                   Netapp Filer Cluster Mode
    address                 192.168.2.20
    action_url              /pnp4nagios/graph?host=$HOSTNAME$
}

define service{
    use                     generic-service
    host_name               Netapp-Cdot
    service_description     PING
    check_command           check_ping!200.0,40%!500.0,60%
    notifications_enabled   1
    process_perf_data       1
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}

define service{
    use                     local-service ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   1
    process_perf_data       1
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Interface Health Status
    check_command           checkontap!interface_health
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Port Status
    check_command           checkontap!port_health
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Cluster Health Status
    check_command           checkontap!cluster_health
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Filer Hardware Health Status
    check_command           checkontap!filer_hardware_health
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Volume Health Status
    check_command           checkontap!volume_health
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     cDOT Interfaces Status
    check_command           checkint
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     cDOT Snapmirror Snapshot Status
    check_command           checksnaps
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     cDOT Disk Health Status
    check_command           checkdisk
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     cDOT Snapmirror Status
    check_command           checksnapm
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     cDOT Aggregate Status
    check_command           checkaggr
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

Likewise, take the list of all volumes and add a service definition for each one.

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Volume Volume1
    check_command           cdotvol!Volume1
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}

Define all aggregates the same way: copy the definition once per aggregate and replace the aggregate name.

define service{
    use                     generic-service-storage ; Name of service template to use
    host_name               Netapp-Cdot
    service_description     Aggregate Usage aggr1
    check_command           cdotaggr!aggr1
    action_url              /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data       1
}
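Copying and pasting one definition per volume and per aggregate gets tedious on a large cluster, so the definitions can also be generated. The sketch below prints one service definition per name, using the cdotvol and cdotaggr command names from commands.cfg. The function name gen_cdot_services is made up for this example, the host name Netapp-Cdot is hard-coded as in this guide, and the action_url uses the standard Nagios $HOSTNAME$ macro, which is an assumption about how the pnp4nagios link should be built.

```shell
#!/bin/sh
# Hypothetical generator: print one Nagios service definition per name.
#   $1 = command name (cdotvol or cdotaggr, as defined in commands.cfg)
#   $2 = prefix for service_description (for example "Volume")
#   remaining arguments = volume or aggregate names
gen_cdot_services() {
    cmd="$1"
    prefix="$2"
    shift 2
    for name in "$@"; do
        printf 'define service{\n'
        printf '    use                  generic-service-storage\n'
        printf '    host_name            Netapp-Cdot\n'
        printf '    service_description  %s %s\n' "$prefix" "$name"
        printf '    check_command        %s!%s\n' "$cmd" "$name"
        # Single quotes keep the Nagios macros from being expanded by the shell.
        printf '    action_url           /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$\n'
        printf '    process_perf_data    1\n'
        printf '}\n\n'
    done
}

# Example calls (review the output, then append it to your host config file):
#   gen_cdot_services cdotvol "Volume" Volume1 Volume2
#   gen_cdot_services cdotaggr "Aggregate Usage" aggr1 aggr2
```

Printing to the terminal first and only then redirecting into the config file keeps a typo in a volume name from ending up in the live configuration.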
That's it. Your Nagios Core server now monitors the NetApp Cluster Mode filers continuously and sends you an alert when usage reaches the WARNING or CRITICAL threshold.
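New object definitions only take effect once Nagios has re-read its configuration, so it is worth validating the files first. The sketch below wraps the Nagios verify mode (the -v option). The function name verify_nagios_config is made up for this example, and both default paths are assumptions based on the /usr/local/nagios layout used in this guide.

```shell
#!/bin/sh
# Hypothetical wrapper around the Nagios configuration check (nagios -v).
#   $1 = path to the nagios binary (assumed default for a source install)
#   $2 = path to the main config file (assumed default for a source install)
verify_nagios_config() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    nagios_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found at $nagios_bin"
        return 3
    fi
    # Nagios prints warnings and errors and exits non-zero on a broken config.
    "$nagios_bin" -v "$nagios_cfg"
}

# Example call:
#   verify_nagios_config && echo "config looks good, restart Nagios now"
```

Restart or reload Nagios with whatever service manager your installation uses only after the check passes.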
Conclusion:
Don't be afraid when NaServer.pm reports that NaElement.pm cannot be located: download the NetApp SDK and move it to /usr/lib/netapp-manageability-sdk. Nagios Core is completely free and open source, so it saves a lot of money, and it is a capable monitoring tool for the enterprise. NetApp Cluster Mode monitoring using Nagios is now enabled. Enjoy.