Inventory - How Check_MK finds services to check
January 17, 2011
Introduction

Configuring which check should run on which host is tedious work in Nagios. Keeping that configuration up to date is a further problem: your colleagues introduce new filesystems, new network interfaces and new database instances without always informing you. How can you be sure that every important item is really being monitored?

Check_MK helps you not only to scan new hosts for items to check but also to keep track of your existing hosts. It can do so because of the special nature of its agents: they always send all interesting data about the host, regardless of which items are checked with Nagios.

All of Check_MK's check plugins support automatic detection of services, i.e. inventory. A few of them need a bit of configuration (for example checks for processes and services), but in most cases everything happens automatically. If you are curious which checks are shipped with Check_MK, use the option -L (the list is abbreviated here):

root@linux# cmk -L
Available check types:
                        plugin  perf-  in-
Name                    type    data   vent.  service description
-------------------------------------------------------------------------
3ware_disks             tcp     no     yes    RAID 3ware disk %s
3ware_info              tcp     no     yes    RAID 3ware controller %s
3ware_units             tcp     no     yes    RAID 3ware unit %s
ad_replication          tcp     no     yes    AD Replication %s
aironet_clients         snmp    yes    yes    Average client signal %s
aironet_errors          snmp    yes    yes    MAC CRC errors radio %s
apc_symmetra            snmp    yes    yes    APC Symmetra status
apc_symmetra_ext_temp   snmp    yes    yes    APC External Temp %s
apc_symmetra_power      snmp    yes    yes    Power phase %s
apc_symmetra_temp       snmp    yes    yes    %s
blade_bays              snmp    no     yes    BAY %s
blade_blowers           snmp    yes    yes    Blower %s
blade_health            snmp    no     yes    Summary health state
blade_mediatray         snmp    no     yes    Media tray
blade_misc              snmp    yes    yes    SENSOR %s
blade_powerfan          snmp    yes    yes    Power Module Cooling Device %s
blade_powermod          snmp    no     yes    Power Module %s
bluecoat_diskcpu        snmp    yes    yes    %s
bluecoat_sensors        snmp    yes    yes    %s
cisco_fan               snmp    no     yes    FAN %s

Performing an inventory

Inventory is not done automatically (for good reasons). You perform it by calling cmk with the option -I and the list of hosts to inventorize (i.e. to scan for new checks):

root@linux# cmk -I somehost otherhost

You may also leave out the host names. Check_MK will then inventorize all hosts (you will probably do this only in small installations):

root@linux# cmk -I

To restrict the inventory to one or several check types, put the option --checks= before the option -I and separate multiple check types with commas. The following call inventorizes the checks snmp_info and df_netapp:

root@linux# cmk --checks=snmp_info,df_netapp -I filer01 filer02

More flexible host specification

As of version 1.1.13i2 you can also specify one or more host tags by prefixing them with an @:

root@linux# cmk -I @linux @windows

The call above inventorizes all linux hosts and all windows hosts. When you need a combination of host tags to make the inventory more specific, join the tags with commas. The following example inventorizes all hosts that have both the tags prod and linux:

root@linux# cmk -I @linux,prod

As long as none of your hosts happens to have the same name as a tag, you may also leave out the @:

root@linux# cmk -I linux,prod

When you have defined clusters (configuration variable clusters), please note that inventory is always done on the physical nodes. As of version 1.1.13i2, however, you can specify the cluster when doing inventory. Check_MK will automatically replace it with the list of the cluster's nodes.

Cache files

When you do not specify hosts to -I, Check_MK scans all hosts for new services. To speed up that procedure, Check_MK does not retrieve the data from hosts that have already been checked at least once. Each time a check runs, a cache file is kept in /var/lib/check_mk/cache. Inventory information is drawn from there if available.

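The caching rule can be summarized in a few lines of Python. This is only a sketch of the decision described here, not Check_MK's actual code: the function name should_use_cache, its parameters, and the assumption that each cache file is named after its host are all hypothetical.

```python
import os

# Default cache directory as named in the text.
CACHE_DIR = "/var/lib/check_mk/cache"


def should_use_cache(hostname, hosts_given_explicitly, cache_dir=CACHE_DIR):
    """Return True if inventory may read cached agent data for hostname.

    Hypothetical helper: cached data is used only when -I was called
    without host names and a cache file for the host already exists.
    Assumes (for illustration) one cache file per host, named after it.
    """
    if hosts_given_explicitly:
        # Hosts named on the command line are always queried directly.
        return False
    return os.path.exists(os.path.join(cache_dir, hostname))
```

The cache_dir parameter exists only so that the sketch can be tried against a scratch directory instead of the real cache.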
You can force Check_MK to retrieve fresh data with the option --no-cache:

root@linux# cmk --no-cache -I

This should not be necessary in normal situations. A change on a host can take up to a minute (the normal Nagios check interval) to be reflected in the inventory; if the change happened more than one check interval ago, it is already in your cached data. The cache is not used when you specify one or more hosts. In that case the inventory always retrieves fresh data.

SNMP checks

SNMP based checks can also be inventorized, as the --checks example above has shown. There is not much difference from the checks based on the Check_MK agent. The good news: Check_MK does not have to retrieve the complete SNMP data in order to find interesting OIDs. Each SNMP check provides a specific scan function that retrieves just one or two single OIDs in order to find out whether the check makes sense on that particular device. Since most checks use the same OIDs for scanning, only a few OIDs need to be fetched to know which of the more than 100 shipped SNMP checks need to be inventorized. The net result: running cmk -I on an SNMP device automatically finds all services that Check_MK supports. Please note that SNMP hosts need to be tagged with snmp. See the SNMP page for more details.

What happens with the items found?

All new items Check_MK finds are saved in configuration files similar to, but not quite compatible with, main.mk. They are created in a separate directory which defaults to /var/lib/check_mk/autochecks. During setup.sh you were asked for a "working directory of check_mk"; autochecks is created as a subdirectory of that. Each time you call check_mk it reads all files in that directory and appends the entries to your checks variable. Let's look at such a file:

/var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
# === zwin17 ===
("zwin17", "df", 'C:/', filesystem_default_levels), # 36
# === zsrv01 ===
("zsrv01", "df", '/', filesystem_default_levels), # 24
("zsrv01", "df", '/home', filesystem_default_levels), # 17
]
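
How such files end up in the checks variable can be pictured as follows. This is a simplified sketch, not Check_MK's real loader: the function load_autochecks, the use of eval, and the placeholder value for filesystem_default_levels are assumptions made for illustration.

```python
import glob
import os

# Placeholder value; in a real setup this name is defined by Check_MK's
# configuration. The numbers here are only an assumption for the sketch.
filesystem_default_levels = (80, 90)


def load_autochecks(directory, checks=None):
    """Append the entries of all *.mk files in directory to checks."""
    checks = [] if checks is None else checks
    namespace = {"filesystem_default_levels": filesystem_default_levels}
    for path in sorted(glob.glob(os.path.join(directory, "*.mk"))):
        with open(path) as f:
            # Each file holds one Python list of
            # (host, checktype, item, params) tuples.
            entries = eval(f.read(), namespace)
        checks.extend(entries)
    return checks
```

Because the files are evaluated as Python, names such as filesystem_default_levels must be known at load time. That is one way to read the remark that these files are similar to main.mk but not quite compatible with it.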

Changing and removing inventorized checks

Check_MK's inventory usually does not remove checks but only adds new ones. Why? If, for example, a previously found filesystem is now missing, that is either a critical problem or the filesystem has been removed by the host's administrator. Check_MK cannot safely know which of the two is the case and therefore keeps the check. There are two ways to remove checks found by previous inventories:

1. Edit or delete autochecks files

Check_MK never overwrites files in autochecks. It is completely safe to edit them and remove checks that are no longer needed. You can either delete whole files or open them with an editor and delete single entries:

/var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
# === zwin17 ===
("zwin17", "df", 'C:/', filesystem_default_levels), # 36
# === zsrv01 ===
("zsrv01", "df", '/', (98, 99) ), # DELETE THIS LINE
("zsrv01", "df", '/home', filesystem_default_levels), # 17
]

2. Reinventorize with -II

As of version 1.1.7i1 Check_MK supports the option -II. It does exactly the same as -I but removes all existing checks before doing the inventory. Only those checks that are being inventorized are affected.

Example 1:

root@linux# cmk -II df xyzsrv01

This first removes all checks of type df on host xyzsrv01 and then does the inventory.

Example 2:

root@linux# cmk -II xyzsrv01

This removes all agent based checks of host xyzsrv01 before doing the inventory. You can even run check_mk -II and thus reinventorize all agent based checks on all hosts, removing all checks that are currently not found on the target hosts.

Cleaning up autochecks

The fact that Check_MK creates new files for each inventory is handy if you want to revert or modify the results of recent inventories. Over time, however, quite a lot of files accumulate in the autochecks directory. As of version 1.1.7i1, Check_MK offers the option -u or --cleanup-autochecks, which reads all files in /var/lib/check_mk/autochecks, creates one new file per host and then removes the superfluous files. That greatly reduces the number of files in the directory and also makes it easy to remove all data of a host. This option can be used either stand-alone...

root@linux# cmk -u

... or as a modifier to -I:

root@linux# cmk -uI host123

If called that way, the cleanup is done right after the inventory. If you like that feature, you can make Check_MK always clean up immediately after each inventory by setting the following in your main.mk:

main.mk
always_cleanup_autochecks = True

Updating your Nagios configuration

Please do not forget to update your monitoring configuration and restart the monitoring core after every inventory that found something new and after every manual change in the autochecks:

root@linux# cmk -R

That will not only update your Nagios configuration files but also recompile all host checks.

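To recap the difference between -I and -II described above, here is a toy model of how the two modes treat the existing check list. It is a sketch under simplifying assumptions, not Check_MK's implementation: the function inventorize is hypothetical, and checks are reduced to (host, checktype, item) tuples.

```python
def inventorize(existing, found, hosts, checktypes, remove_first=False):
    """Merge newly found checks into the existing ones.

    existing, found: lists of (host, checktype, item) tuples.
    hosts, checktypes: what is being inventorized.
    remove_first=False models -I (only add new checks);
    remove_first=True models -II (drop the affected checks first,
    then add whatever the inventory found).
    """
    def affected(check):
        host, checktype, _item = check
        return host in hosts and checktype in checktypes

    if remove_first:
        # Only checks that are being inventorized are removed.
        existing = [c for c in existing if not affected(c)]
    result = list(existing)
    for check in found:
        if affected(check) and check not in result:
            result.append(check)
    return result
```

With remove_first=False a check for a vanished filesystem stays in the list. With remove_first=True it disappears, while checks of other types on the same host are left alone.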
Inventorized versus manual checks

Even when checks can be found via inventory, you may still configure them manually. There can be various reasons for that. One is that you want to define levels other than those the inventory sets. Whenever a check is defined manually in main.mk, the inventory will never find that item again.

Excluding items from the inventory

Sometimes the inventory finds things that you do not want to check. Removing those items from the files in autochecks is not ideal: at the next inventory they will reappear. It is better to exclude them explicitly. Check_MK provides two configuration variables for doing that:

In ignored_checktypes you can switch off inventory for certain check types completely and globally. Let's assume that you do not want to monitor network interface throughput and link settings at all. Simply list the corresponding check types (see check_mk -L) in this list:

main.mk
ignored_checktypes = [ "netctr.combined", "netif.params" ]

If you want to control the inventory more specifically, you need ignored_services. This is a configuration list with the following values in each entry:

1. Optionally, a list of host tags
2. A list of host names (or ALL_HOSTS)
3. A list of service description patterns

The following example excludes the Eventlog Security from the two hosts win01 and win02:

main.mk
ignored_services = [ ( [ "win01", "win02" ], [ "LOG Security" ] ) ]

Note that the entries in the list of services are interpreted as regular expressions matching the beginning of the service description as displayed in Nagios. The following example excludes not just one but all logfiles, i.e. all services beginning with LOG, as well as the drive with the letter C:

main.mk
ignored_services = [ ( [ "win01", "win02" ], [ "LOG", "fs_C:" ] ) ]

If you are unsure about the correct spelling of a service, you can call check_mk -D to dump all services. If you have tagged all your Windows hosts with win, the following configuration snippet does the same for all Windows hosts:

main.mk
ignored_services = [ ( [ "win" ], ALL_HOSTS, [ "LOG", "fs_C:" ] ) ]

NEW in 1.1.9i1: Using the option ignored_checks you can exclude specific check types for certain hosts. This option behaves like ignored_checktypes, with the advantage that you can configure different settings for different hosts. To disable all hr_* checks for all your linux hosts you can use the following configuration:

ignored_checks = [ ( [ "hr_cpu", "hr_mem", "hr_fs" ], [ "linux" ], ALL_HOSTS) ]

This is useful when, for some reason, you monitor your Windows servers using the Check_MK agent and SNMP at the same time. That setup could result in duplicate services, e.g. for the filesystem, memory and CPU checks. With the above line you prevent these duplicate service names by disabling these checks via SNMP. You can also use this option very selectively. This line disables the df check on the host win01:

ignored_checks = [ ( "df", [ "win01" ]) ]

Please note that the ignored_... variables only affect future inventories. They have no effect on the checking or on previously inventorized services.
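
The prefix matching used by ignored_services can be illustrated with a short Python sketch. It is not Check_MK's own matching code: the function service_ignored, the explicit host_tags argument, and the stand-in value for ALL_HOSTS are assumptions made for this example.

```python
import re

# Stand-in marker for this sketch only; Check_MK defines its own ALL_HOSTS.
ALL_HOSTS = ["@all"]


def service_ignored(hostname, host_tags, description, ignored_services):
    """Return True if a service description is excluded for a host.

    Entries are either (hosts, patterns) or (tags, hosts, patterns),
    mirroring the two forms shown in the examples above.
    """
    for entry in ignored_services:
        if len(entry) == 3:
            tags, hosts, patterns = entry
        else:
            tags = []
            hosts, patterns = entry
        # All listed tags must be present on the host.
        if not all(tag in host_tags for tag in tags):
            continue
        if hosts != ALL_HOSTS and hostname not in hosts:
            continue
        # re.match only matches at the beginning of the description.
        if any(re.match(pattern, description) for pattern in patterns):
            return True
    return False
```

With the entry ( [ "win01", "win02" ], [ "LOG", "fs_C:" ] ), a service named "LOG Security" on win01 is excluded, while one named "Event LOG" is not, because the pattern has to match from the start of the description.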