# Agent-based Checks <a id="agent-based-checks-addon"></a>

If the remote services are not directly accessible through the network, a
local agent installation that exposes the results to check queries can
be used instead.

Prior to installing and configuring an agent service, evaluate the possible
options based on these requirements:

* Security (authentication, TLS certificates, secure connection handling, etc.)
* Connection direction
    * Master/satellite can execute commands directly or
    * Agent sends back passive/external check results
* Availability on specific OS types and versions
* Configuration and initial setup
* Updates and maintenance, compatibility

Available agent types:

* [Icinga Agent](07-agent-based-monitoring.md#agent-based-checks-icinga) on Linux/Unix and Windows
* [SSH](07-agent-based-monitoring.md#agent-based-checks-ssh) on Linux/Unix
* [SNMP](07-agent-based-monitoring.md#agent-based-checks-snmp) on Linux/Unix and hardware
* [SNMP Traps](07-agent-based-monitoring.md#agent-based-checks-snmp-traps) as passive check results
* [REST API](07-agent-based-monitoring.md#agent-based-checks-rest-api) for passive external check results
* [NSClient++](07-agent-based-monitoring.md#agent-based-checks-nsclient) and [WMI](07-agent-based-monitoring.md#agent-based-checks-wmi) on Windows

## Icinga Agent <a id="agent-based-checks-icinga"></a>

For the most common setups on Linux/Unix and Windows, we recommend
setting up the Icinga agent in a [distributed environment](06-distributed-monitoring.md#distributed-monitoring).

![Icinga 2 Distributed Master with Agents](images/distributed-monitoring/icinga2_distributed_monitoring_scenarios_master_with_agents.png)

Advantages:

* Directly integrated into the distributed monitoring stack of Icinga
* Works on Linux/Unix and Windows
* Secure communication with TLS
* Connection can be established from both sides. Once connected, command execution and check results are exchanged.
    * Master/satellite connects to agent
    * Agent connects to parent satellite/master
* Same configuration language and binaries
* Troubleshooting docs and community best practices

Follow the setup and configuration instructions [here](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite).

On Windows hosts, the Icinga agent can query a local NSClient++ service
for additional checks in case there are no plugins available. The NSCP
installer is bundled with Icinga and can be installed with the setup wizard.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_01.png)

## SSH <a id="agent-based-checks-ssh"></a>

> This is the recommended way for systems where the Icinga agent is not available,
> be it because of specific hardware architectures, old systems, or a policy that forbids installing additional software.

This method uses the SSH service on the remote host to execute
an arbitrary plugin command line. The output and exit code are
returned and used by the core.

The `check_by_ssh` plugin takes care of this. It is available in the
[Monitoring Plugins package](02-installation.md#setting-up-check-plugins).
For your convenience, the Icinga template library provides the [by_ssh](10-icinga-template-library.md#plugin-check-command-by-ssh)
CheckCommand.

### SSH: Preparations <a id="agent-based-checks-ssh-preparations"></a>

Create an SSH key pair for the Icinga daemon user. In case the user has no shell, temporarily enable one.
When asked for a passphrase, **do not set one** and press enter.

```bash
ssh-keygen -b 4096 -t rsa -C "icinga@$(hostname) user for check_by_ssh" -f $HOME/.ssh/id_rsa
```

On the remote agent, create the icinga user and generate a temporary password.

Copy the public key from the Icinga server to the remote agent, e.g. with `ssh-copy-id`
or manually into `/home/icinga/.ssh/authorized_keys`.
This will ask for the password once.

```bash
ssh-copy-id -i $HOME/.ssh/id_rsa icinga@ssh-agent1.localdomain
```

After the SSH key is copied, test the connection **at least once** and
accept the host key verification. If you forget about this step, checks will
become UNKNOWN later.

```bash
ssh -i $HOME/.ssh/id_rsa icinga@ssh-agent1.localdomain
```
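
Because checks run without a terminal, it can help to confirm that the key login also works non-interactively. The following is a minimal sketch and not part of the official setup: the function name `verify_key_login` and the `SSH_BIN` override are hypothetical, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options. With `BatchMode=yes`, `ssh` fails instead of prompting for a password or a host key confirmation.

```bash
#!/bin/sh
# Hypothetical helper: checks that a key-based login works without any prompt.
# SSH_BIN is an assumption of this sketch; it defaults to the OpenSSH client.
SSH_BIN="${SSH_BIN:-ssh}"

verify_key_login() {
  key="$1"      # private key file, e.g. $HOME/.ssh/id_rsa
  target="$2"   # user@host, e.g. icinga@ssh-agent1.localdomain

  # BatchMode=yes disables password and host key prompts, so a missing key
  # or an unaccepted host key makes ssh fail instead of waiting for input.
  if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=10 -i "$key" "$target" true; then
    echo "OK: non-interactive login to $target works"
  else
    echo "FAILED: non-interactive login to $target does not work"
    return 1
  fi
}
```

Run as the Icinga daemon user, a call could look like `verify_key_login "$HOME/.ssh/id_rsa" icinga@ssh-agent1.localdomain`.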

After the SSH key login works, disable the previously enabled logins:

* The remote agent user's password, with `passwd -l icinga`
* The local icinga user's terminal

Also, ensure that the permissions are correct for the `.ssh` directory
and the key files, as logins will fail otherwise.

* `.ssh` directory: 700
* `.ssh/id_rsa.pub` public key file: 644
* `.ssh/id_rsa` private key file: 600

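
The permissions listed above can be applied in one step. This is a minimal sketch; the function name `fix_ssh_permissions` is hypothetical, and the directory to pass in depends on the home directory of your Icinga daemon user.

```bash
#!/bin/sh
# Hypothetical helper: applies the permissions listed above to an .ssh directory.
fix_ssh_permissions() {
  ssh_dir="$1"   # e.g. "$HOME/.ssh" of the Icinga daemon user

  chmod 700 "$ssh_dir" || return 1

  # Only touch the key files that actually exist.
  if [ -f "$ssh_dir/id_rsa" ]; then
    chmod 600 "$ssh_dir/id_rsa" || return 1
  fi
  if [ -f "$ssh_dir/id_rsa.pub" ]; then
    chmod 644 "$ssh_dir/id_rsa.pub" || return 1
  fi
}
```

For example, `fix_ssh_permissions "$HOME/.ssh"` run as the Icinga daemon user resets the directory and both key files.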
### SSH: Configuration <a id="agent-based-checks-ssh-config"></a>

First, create a host object which has SSH configured and enabled.
Mark it, e.g. with the custom variable `agent_type`, so that service
apply rules can match on it later. Best practice is to
store that in a specific template, either in the static configuration
or inside the Director.

```
template Host "ssh-agent" {
  check_command = "hostalive"

  vars.agent_type = "ssh"
  vars.os_type = "linux"
}

object Host "ssh-agent1.localdomain" {
  import "ssh-agent"

  address = "192.168.56.115"
}
```

Example for monitoring the remote users:

```
apply Service "users" {
  check_command = "by_ssh"

  vars.by_ssh_command = [ "/usr/lib/nagios/plugins/check_users" ]

  // Follows the same principle as with command arguments, e.g. for ordering
  vars.by_ssh_arguments = {
    "-w" = {
      value = "$users_wgreater$" // Can reference an existing custom variable defined on the host or service, evaluated at runtime
    }
    "-c" = {
      value = "$users_cgreater$"
    }
  }

  vars.users_wgreater = 3
  vars.users_cgreater = 5

  assign where host.vars.os_type == "linux" && host.vars.agent_type == "ssh"
}
```

A more advanced example with better arguments is shown in [this blogpost](https://www.netways.de/blog/2016/03/21/check_by_ssh-mit-icinga-2/).

## SNMP <a id="agent-based-checks-snmp"></a>

The SNMP daemon runs on the remote system and answers SNMP queries sent by plugin scripts.
The [Monitoring Plugins package](02-installation.md#setting-up-check-plugins) provides
the `check_snmp` plugin binary, but there are plenty of [existing plugins](05-service-monitoring.md#service-monitoring-plugins)
for specific use cases already around, for example for monitoring Cisco routers.

The following example uses the [SNMP ITL](10-icinga-template-library.md#plugin-check-command-snmp)
CheckCommand and sets the `snmp_oid` custom variable. A service is created for all hosts which
have the `snmp_community` custom variable.

```
template Host "snmp-agent" {
  check_command = "hostalive"

  vars.agent_type = "snmp"

  vars.snmp_community = "public-icinga"
}

object Host "snmp-agent1.localdomain" {
  import "snmp-agent"
}
```

```
apply Service "uptime" {
  import "generic-service"

  check_command = "snmp"
  vars.snmp_oid = "1.3.6.1.2.1.1.3.0"
  vars.snmp_miblist = "DISMAN-EVENT-MIB"

  assign where host.vars.agent_type == "snmp" && host.vars.snmp_community != ""
}
```

If no `snmp_miblist` is specified, the plugin will default to `ALL`. As the number of available MIB files
on the system increases, so will the load generated by this plugin if no `MIB` is specified.
As such, it is recommended to always specify at least one `MIB`.
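
To test the same query by hand, the plugin can be called directly. The sketch below only prints the command line so that it can be reviewed before running it. The function name `snmp_cmdline` and the `PLUGIN_DIR` variable are hypothetical; the plugin path is taken from the SSH example above and may differ on your distribution. `-H`, `-C`, `-o` and `-m` are `check_snmp` options for host, community, OID and MIB list. The sketch assumes that none of the values contain whitespace.

```bash
#!/bin/sh
# Assumed plugin location; adjust it to your distribution.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/lib/nagios/plugins}"

# Hypothetical helper: prints the check_snmp command line for a manual test.
snmp_cmdline() {
  host="$1"
  community="$2"
  oid="$3"
  miblist="${4:-}"   # optional, e.g. DISMAN-EVENT-MIB

  cmd="$PLUGIN_DIR/check_snmp -H $host -C $community -o $oid"

  # Passing -m restricts the MIB files to load; without it the plugin default applies.
  if [ -n "$miblist" ]; then
    cmd="$cmd -m $miblist"
  fi
  echo "$cmd"
}
```

For the host above, `snmp_cmdline snmp-agent1.localdomain public-icinga 1.3.6.1.2.1.1.3.0 DISMAN-EVENT-MIB` prints the call that corresponds to the `uptime` service.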

Additional SNMP plugins are available using the [Manubulon SNMP Plugins](10-icinga-template-library.md#snmp-manubulon-plugin-check-commands).

For network monitoring, community members advise using [nwc_health](05-service-monitoring.md#service-monitoring-network).

## SNMP Traps and Passive Check Results <a id="agent-based-checks-snmp-traps"></a>

SNMP Traps can be received and filtered by using [SNMPTT](http://snmptt.sourceforge.net/)
and specific trap handlers passing the check results to Icinga 2.

Following the SNMPTT [Format](http://snmptt.sourceforge.net/docs/snmptt.shtml#SNMPTT.CONF-FORMAT)
documentation and the Icinga external command syntax found [here](24-appendix.md#external-commands-list-detail),
we can create generic services that can accommodate any number of hosts for a given scenario.

### Simple SNMP Traps <a id="simple-traps"></a>

A simple example might be monitoring host reboots indicated by an SNMP agent reset.
It is important to build the event so that it resets automatically after a notification
has been dispatched. Set up the manual check parameters so that the event can also be
reset from an initial unhandled state or after a missed reset event.

Add a directive in `snmptt.conf`:

```
EVENT coldStart .1.3.6.1.6.3.1.1.5.1 "Status Events" Normal
FORMAT Device reinitialized (coldStart)
EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;Coldstart;2;The snmp agent has reinitialized." >> /var/run/icinga2/cmd/icinga2.cmd
SDESC
A coldStart trap signifies that the SNMPv2 entity, acting
in an agent role, is reinitializing itself and that its
configuration may have been altered.
EDESC
```
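
The line written by the `EXEC` statement follows the external command syntax `[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;state;output`. The following sketch spells out the fields; the function name `format_passive_result` is hypothetical and only serves to illustrate what SNMPTT writes once it has substituted `$@` (trap time in seconds since the epoch) and `$A` (trap agent host name).

```bash
#!/bin/sh
# Hypothetical helper: formats a passive service check result in the
# external command syntax used by the EXEC statement.
format_passive_result() {
  timestamp="$1"   # seconds since the epoch (SNMPTT substitutes $@)
  host="$2"        # host name as known to Icinga (SNMPTT substitutes $A)
  service="$3"     # service name, e.g. Coldstart
  state="$4"       # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
  output="$5"      # plugin output text

  printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
    "$timestamp" "$host" "$service" "$state" "$output"
}
```

Calling `format_passive_result "$(date +%s)" router1 Coldstart 2 "The snmp agent has reinitialized."` yields a line in the same shape as the one SNMPTT appends to the command pipe; `router1` is a placeholder host name.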

1. Define the `EVENT` as per your need.
2. Construct the `EXEC` statement with the service name matching your template
applied to your _n_ hosts. The host address inferred by SNMPTT will be the
correlating factor. You can have SNMPTT provide host names or IP addresses to
match your Icinga convention.

> Replace the deprecated command pipe EXEC statement with a curl call
> to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).

Add an `EventCommand` configuration object for the passive service auto reset event.

```
object EventCommand "coldstart-reset-event" {
  command = [ ConfigDir + "/conf.d/custom/scripts/coldstart_reset_event.sh" ]

  arguments = {
    "-i" = "$service.state_id$"
    "-n" = "$host.name$"
    "-s" = "$service.name$"
  }
}
```

Create the `coldstart_reset_event.sh` shell script to pass the expanded variable
data in. The `$service.state_id$` is important in order to prevent an endless loop
of event firing after the service has been reset.

```bash
#!/bin/bash
show_help() {
cat <<EOF
Usage: ${0##*/} [-h] -i SERVICE_STATE_ID -n HOST_NAME -s SERVICE_NAME
Writes a coldstart reset event to the Icinga command pipe.

  -h                  Display this help and exit.
  -i SERVICE_STATE_ID The associated service state id.
  -n HOST_NAME        The associated host name.
  -s SERVICE_NAME     The associated service name.
EOF
}

while getopts "hi:n:s:" opt; do
  case "$opt" in
    h) show_help; exit 0 ;;
    i) SERVICE_STATE_ID=$OPTARG ;;
    n) HOST_NAME=$OPTARG ;;
    s) SERVICE_NAME=$OPTARG ;;
    *) show_help >&2; exit 1 ;;
  esac
done

# All three options are mandatory.
if [ -z "$SERVICE_STATE_ID" ]; then show_help; printf "\n  Error: -i required.\n"; exit 1; fi
if [ -z "$HOST_NAME" ]; then show_help; printf "\n  Error: -n required.\n"; exit 1; fi
if [ -z "$SERVICE_NAME" ]; then show_help; printf "\n  Error: -s required.\n"; exit 1; fi

# Only reset when the service is not OK; this prevents an endless event loop.
if [ "$SERVICE_STATE_ID" -gt 0 ]; then
  echo "[$(date +%s)] PROCESS_SERVICE_CHECK_RESULT;$HOST_NAME;$SERVICE_NAME;0;Auto-reset ($(date +"%m-%d-%Y %T"))." >> /var/run/icinga2/cmd/icinga2.cmd
fi
```

> Replace the deprecated command pipe EXEC statement with a curl call
> to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).

Finally create the `Service` and assign it:

```
apply Service "Coldstart" {
  import "generic-service-custom"

  check_command = "dummy"
  event_command = "coldstart-reset-event"

  enable_notifications = 1
  enable_active_checks = 0
  enable_passive_checks = 1

  vars.dummy_text = "Manual reset."

  assign where (host.vars.os == "Linux" || host.vars.os == "Windows")
}
```

### Complex SNMP Traps <a id="complex-traps"></a>

A more complex example might be passing dynamic data from a trap's varbind list
for a backup scenario where the backup software dispatches status updates. By
utilizing active and passive checks, the older freshness concept can be leveraged.

By defining the active check as a hard failed state, a missed backup can be reported.
As long as the most recent passive update has occurred, the active check is bypassed.

Add a directive in `snmptt.conf`:

```
EVENT enterpriseSpecific <YOUR OID> "Status Events" Normal
FORMAT Enterprise specific trap
EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;$1;$2;$3" >> /var/run/icinga2/cmd/icinga2.cmd
SDESC
An enterprise specific trap.
The varbinds in order denote the Icinga service name, state and text.
EDESC
```

1. Define the `EVENT` as per your need using your actual OID.
2. The service name, state and text are extracted from the first three varbinds.
This has the advantage of accommodating an unlimited set of use cases.

> Replace the deprecated command pipe EXEC statement with a curl call
> to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).

Create a `Service` for the specific use case associated with the host. If the host
matches and the first varbind value is `Backup`, SNMPTT will submit the corresponding
passive update with the state and text from the second and third varbind:

```
object Service "Backup" {
  import "generic-service-custom"

  host_name = "host.domain.com"
  check_command = "dummy"

  enable_notifications = 1
  enable_active_checks = 1
  enable_passive_checks = 1

  max_check_attempts = 1
  check_interval = 87000 // seconds, slightly more than 24 hours

  vars.dummy_text = "No passive check result received."
}
```
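
The `check_interval` of 87000 seconds appears to correspond to one day (86400 seconds) plus ten minutes of tolerance, so the active check would only fire when a daily passive result is overdue. That reading is an assumption, not something the example states. Under it, a comparable value for other schedules can be derived as sketched below; the function name `freshness_interval` is hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: check interval in seconds for a job that reports every
# PERIOD_HOURS hours, plus GRACE_MINUTES minutes of tolerance.
freshness_interval() {
  period_hours="$1"
  grace_minutes="$2"
  echo $(( period_hours * 3600 + grace_minutes * 60 ))
}
```

For a daily backup with ten minutes of tolerance, `freshness_interval 24 10` prints `87000`.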

## Agents sending Check Results via REST API <a id="agent-based-checks-rest-api"></a>

Whenever the remote host cannot run the Icinga agent, or a backup script
should just send its current state after finishing, you can use the [REST API](12-icinga2-api.md#icinga2-api)
as secure transport and send [passive external check results](08-advanced-topics.md#external-check-results).

Use the [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result) API action to send the external passive check result.
You can either use `curl` or implement the HTTP requests in your preferred programming
language. Examples for API clients are available in [this chapter](12-icinga2-api.md#icinga2-api-clients).

Feeding check results from remote hosts requires the host/service
objects to be configured on the master/satellite instance.
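
A `curl` based submission could look like the following minimal sketch. The function names, the `ICINGA_*` variables and their default values are hypothetical placeholders, including the CA certificate path; an API user with permission for the `process-check-result` action is assumed to exist. The JSON body is built with `printf`, which only works as long as the values contain no double quotes or backslashes.

```bash
#!/bin/sh
# Assumed connection settings; all defaults are placeholders for illustration.
ICINGA_API_URL="${ICINGA_API_URL:-https://localhost:5665}"
ICINGA_API_USER="${ICINGA_API_USER:-apiuser}"
ICINGA_API_PASSWORD="${ICINGA_API_PASSWORD:-secret}"
ICINGA_CA_CERT="${ICINGA_CA_CERT:-/var/lib/icinga2/certs/ca.crt}"

# Builds the JSON body for the process-check-result action.
# Arguments: host name, service name, exit status (0-3), plugin output.
# Assumption: none of the values contain double quotes or backslashes.
build_check_result() {
  printf '{"type":"Service","filter":"host.name==\\"%s\\" && service.name==\\"%s\\"","exit_status":%d,"plugin_output":"%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# Sends the passive check result to the REST API; takes the same arguments.
send_check_result() {
  curl -s -S --cacert "$ICINGA_CA_CERT" \
    -u "$ICINGA_API_USER:$ICINGA_API_PASSWORD" \
    -H 'Accept: application/json' -X POST \
    "$ICINGA_API_URL/v1/actions/process-check-result" \
    -d "$(build_check_result "$@")"
}
```

A backup script could then end with `send_check_result host.domain.com Backup 0 "Backup finished."`, provided the matching host and service objects exist.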

## NSClient++ on Windows <a id="agent-based-checks-nsclient"></a>

[NSClient++](https://nsclient.org/) works on both Windows and Linux platforms and is well
known for its magnificent Windows support. There are alternatives like the WMI interface,
but using `NSClient++` allows you to run local scripts, similar to check plugins, which fetch
the required output and performance counters.

> Best practice is to use the Icinga agent as secure execution
> bridge (`check_nt` and `check_nrpe` are considered insecure)
> and query the NSClient++ service [locally](06-distributed-monitoring.md#distributed-monitoring-windows-nscp).

You can use the `check_nt` plugin from the Monitoring Plugins project to query NSClient++.
Icinga 2 provides the [nscp check command](10-icinga-template-library.md#plugin-check-command-nscp) for this:

```
object Service "disk" {
  import "generic-service"

  host_name = "remote-windows-host"

  check_command = "nscp"

  vars.nscp_variable = "USEDDISKSPACE"
  vars.nscp_params = "c"
}
```

For details on the `NSClient++` configuration please refer to the [official documentation](https://docs.nsclient.org/).

## WMI on Windows <a id="agent-based-checks-wmi"></a>

The most popular plugin is [check_wmi_plus](http://edcint.co.nz/checkwmiplus/).

> Check WMI Plus uses the Windows Management Interface (WMI) to check for common services (cpu, disk, services, eventlog…) on Windows machines. It requires the open source wmi client for Linux.

* [Icinga 2 check_wmi_plus example by 18pct](http://18pct.com/icinga2-check_wmi_plus-example/)
* [Agent-less monitoring with WMI](https://www.devlink.de/linux/icinga2-nagios-agentless-monitoring-von-windows/)