# <a id="monitoring-basics"></a> Monitoring Basics

This part of the Icinga 2 documentation provides an overview of all the basic
monitoring concepts you need to know to run Icinga 2.
Keep in mind that these examples assume a Linux server. If you are
using Windows, you will need to change the services accordingly. See the [ITL reference](7-icinga-template-library.md#windows-plugins)
for further information.

## <a id="hosts-services"></a> Hosts and Services

Icinga 2 can be used to monitor the availability of hosts and services. Hosts
and services can be virtually anything which can be checked in some way:

* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Other local or network-accessible services

Host objects provide a mechanism to group services that are running
on the same physical device.

Here is an example of a host object which defines two child services:

    object Host "my-server1" {
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http"
    }

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

The `address` attribute is used by check commands to determine which network
address is associated with the host object.

Details on troubleshooting check problems can be found in the [troubleshooting](17-troubleshooting.md#troubleshooting) chapter.

### <a id="host-states"></a> Host States

Hosts can be in any of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.

### <a id="service-states"></a> Service States

Services can be in any of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

### <a id="hard-soft-states"></a> Hard and Soft States

When detecting a problem with a host/service, Icinga re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed.
  SOFT        | The host/service has recently changed state and is being re-checked.
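
As a minimal sketch of how these settings fit together, the following service is re-checked more quickly while a problem is still in a `SOFT` state. The service name `http-retry-example` and the interval values are assumptions chosen for illustration, not recommended defaults:

```
object Service "http-retry-example" {
  host_name = "my-server1"
  check_command = "http"

  check_interval = 5m       // regular check interval
  retry_interval = 30s      // interval used for re-checks while the state is SOFT
  max_check_attempts = 3    // number of check attempts before the state becomes HARD
}
```

With values like these, a failing check would be repeated at the shorter retry interval, and notifications would only be sent once the attempts are exhausted and the state is `HARD`.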

### <a id="host-service-checks"></a> Host and Service Checks

Hosts and services determine their state by running checks at a regular interval.

    object Host "router" {
      check_command = "hostalive"
    }

The `hostalive` command is one of several built-in check commands. It sends ICMP
echo requests to the IP address specified in the `address` attribute to determine
whether a host is online.

A number of other [built-in check commands](7-icinga-template-library.md#plugin-check-commands) are also
available. In addition to these commands the next few chapters will explain in
detail how to set up your own check commands.


## <a id="object-inheritance-using-templates"></a> Templates

Templates may be used to apply a set of identical attributes to more than one
object:

    template Service "generic-service" {
      max_check_attempts = 3
      enable_perfdata = true
    }

    apply Service "ping4" {
      import "generic-service"

      check_command = "ping4"

      assign where host.address
    }

    apply Service "ping6" {
      import "generic-service"

      check_command = "ping6"

      assign where host.address6
    }

In this example the `ping4` and `ping6` services inherit properties from the
template `generic-service`.

Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object if necessary.

You can also import existing non-template objects. Note that templates
and objects share the same namespace, i.e. you can't define a template
that has the same name as an object.
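
Overriding an inherited attribute could look like the following sketch. The template name `example-service-template` and the service name `ping4-patient` are hypothetical and only serve as illustration:

```
template Service "example-service-template" {
  max_check_attempts = 3
  check_interval = 5m
}

object Service "ping4-patient" {
  import "example-service-template"

  host_name = "my-server1"
  check_command = "ping4"

  max_check_attempts = 5   // overrides the value 3 inherited from the template
}
```

The service keeps the inherited `check_interval` and only replaces `max_check_attempts`.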


## <a id="custom-attributes"></a> Custom Attributes

In addition to built-in attributes you can define your own attributes:

    object Host "localhost" {
    }

Valid values for custom attributes include:

* [Strings](20-language-reference.md#string-literals), [numbers](20-language-reference.md#numeric-literals) and [booleans](20-language-reference.md#boolean-literals)
* [Arrays](20-language-reference.md#array) and [dictionaries](20-language-reference.md#dictionary)
* [Functions](3-monitoring-basics.md#custom-attributes-functions)
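
The value types listed above could be used as in the following sketch. The host name `example-host` and all attribute names and values below `vars` are assumptions made up for this illustration:

```
object Host "example-host" {
  check_command = "hostalive"

  vars.os = "Linux"                        // string
  vars.ssh_port = 2222                     // number
  vars.production = true                   // boolean
  vars.disks = [ "/", "/var", "/home" ]    // array
  vars.contact = {                         // dictionary
    team = "noc"
    room = 42
  }
}
```

Functions as values are covered separately in the next section.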

### <a id="custom-attributes-functions"></a> Functions as Custom Attributes

Icinga 2 lets you specify [functions](20-language-reference.md#functions) for custom attributes.
The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
the function and uses whatever value the function returns:

    object CheckCommand "random-value" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_dummy", "0", "$text$" ]

      vars.text = {{ Math.random() * 100 }}
    }

This example uses the [abbreviated lambda syntax](20-language-reference.md#nullary-lambdas).

These functions have access to a number of variables:

  Variable     | Description
  -------------|---------------
  user         | The User object (for notifications).
  service      | The Service object (for service checks/notifications/event handlers).
  host         | The Host object.
  command      | The command object (e.g. a CheckCommand object for checks).

Here's an example:

    vars.text = {{ host.check_interval }}

In addition to these variables the `macro` function can be used to retrieve the
value of arbitrary macro expressions:

    if (macro("$address$") == "127.0.0.1") {
      log("Running a check for localhost!")
    }

The `resolve_arguments` function can be used to resolve a command and its arguments in much
the same fashion as Icinga does for the `command` and `arguments` attributes of
commands. The `by_ssh` command uses this functionality to let users specify a
command and arguments that should be executed via SSH:

    var command = macro("$by_ssh_command$")
    var arguments = macro("$by_ssh_arguments$")

    if (typeof(command) == String && !arguments) {
      return command
    }

    var escaped_args = []
    for (arg in resolve_arguments(command, arguments)) {
      escaped_args.add(escape_shell_arg(arg))
    }
    return escaped_args.join(" ")

Accessing object attributes at runtime inside these functions is described in the
[advanced topics](5-advanced-topics.md#access-object-attributes-at-runtime) chapter.

## <a id="runtime-macros"></a> Runtime Macros

Macros can be used to access other objects' attributes at runtime. For example they
are used in command definitions to figure out which IP address a check should be
run against:

    object CheckCommand "my-ping" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]

      arguments = {
        "-w" = "$ping_wrta$,$ping_wpl$%"
        "-c" = "$ping_crta$,$ping_cpl$%"
        "-p" = "$ping_packets$"
      }

      vars.ping_address = "$address$"

      vars.ping_packets = 5
    }

    object Host "router" {
      check_command = "my-ping"
    }

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
automatically tries to find the closest match for the attribute you specified. The
exact rules for this are explained in the next section.


### <a id="macro-evaluation-order"></a> Evaluation Order

When executing commands Icinga 2 checks the following objects in this order to look
up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom attributes in the `Vars` constant

This evaluation order allows you to define default values for custom attributes
in your command objects.

Here's how you can override the custom attribute `ping_packets` from the previous
example:

    object Service "ping" {
      host_name = "localhost"
      check_command = "my-ping"

      vars.ping_packets = 10 // Overrides the default value of 5 given in the command
    }

If a custom attribute isn't defined anywhere, an empty value is used and a warning is
written to the Icinga 2 log.

You can also directly refer to a specific attribute - thereby ignoring these evaluation
rules - by specifying the full attribute name:

    $service.vars.ping_wrta$

This retrieves the value of the `ping_wrta` custom attribute for the service. It
returns an empty value if the service does not have such a custom attribute, no matter
whether another object such as the host has this attribute.
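
To illustrate the difference, here is a small sketch that reuses the `my-ping` command from above. The host name `example-router`, the service name `ping-explicit-example` and the value `200` are assumptions for this illustration; the comments describe the expected lookup results based on the rules in this section:

```
object Host "example-router" {
  check_command = "hostalive"

  vars.ping_wrta = 200   // set on the host only
}

object Service "ping-explicit-example" {
  host_name = "example-router"
  check_command = "my-ping"

  // For checks of this service:
  //   $ping_wrta$               is looked up on the service first, then found on the host (200)
  //   $host.vars.ping_wrta$     refers to the host's custom attribute directly (200)
  //   $service.vars.ping_wrta$  is empty, because the service itself does not set it
}
```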


### <a id="host-runtime-macros"></a> Host Runtime Macros

The following host macros are available in all commands that are executed for
hosts or services:

  Name                         | Description
  -----------------------------|--------------
  host.name                    | The name of the host object.
  host.display_name            | The value of the `display_name` attribute.
  host.state                   | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
  host.state_id                | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
  host.state_type              | The host's current state type. Can be one of `SOFT` and `HARD`.
  host.check_attempt           | The current check attempt number.
  host.max_check_attempts      | The maximum number of checks which are executed before changing to a hard state.
  host.last_state              | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
  host.last_state_id           | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
  host.last_state_type         | The host's previous state type. Can be one of `SOFT` and `HARD`.
  host.last_state_change       | The last state change's timestamp.
  host.downtime_depth          | The number of active downtimes.
  host.duration_sec            | The time since the last state change.
  host.latency                 | The host's check latency.
  host.execution_time          | The host's check execution time.
  host.output                  | The last check's output.
  host.perfdata                | The last check's performance data.
  host.last_check              | The timestamp when the last check was executed.
  host.check_source            | The monitoring instance that performed the last check.
  host.num_services            | Number of services associated with the host.
  host.num_services_ok         | Number of services associated with the host which are in an `OK` state.
  host.num_services_warning    | Number of services associated with the host which are in a `WARNING` state.
  host.num_services_unknown    | Number of services associated with the host which are in an `UNKNOWN` state.
  host.num_services_critical   | Number of services associated with the host which are in a `CRITICAL` state.

### <a id="service-runtime-macros"></a> Service Runtime Macros

The following service macros are available in all commands that are executed for
services:

  Name                       | Description
  ---------------------------|--------------
  service.name               | The short name of the service object.
  service.display_name       | The value of the `display_name` attribute.
  service.check_command      | The short name of the command along with any arguments to be used for the check.
  service.state              | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
  service.state_id           | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
  service.state_type         | The service's current state type. Can be one of `SOFT` and `HARD`.
  service.check_attempt      | The current check attempt number.
  service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
  service.last_state         | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
  service.last_state_id      | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
  service.last_state_type    | The service's previous state type. Can be one of `SOFT` and `HARD`.
  service.last_state_change  | The last state change's timestamp.
  service.downtime_depth     | The number of active downtimes.
  service.duration_sec       | The time since the last state change.
  service.latency            | The service's check latency.
  service.execution_time     | The service's check execution time.
  service.output             | The last check's output.
  service.perfdata           | The last check's performance data.
  service.last_check         | The timestamp when the last check was executed.
  service.check_source       | The monitoring instance that performed the last check.

### <a id="command-runtime-macros"></a> Command Runtime Macros

The following macros are available in all commands:

  Name                   | Description
  -----------------------|--------------
  command.name           | The name of the command object.

### <a id="user-runtime-macros"></a> User Runtime Macros

The following macros are available in all commands that are executed for
users:

  Name                   | Description
  -----------------------|--------------
  user.name              | The name of the user object.
  user.display_name      | The value of the `display_name` attribute.

### <a id="notification-runtime-macros"></a> Notification Runtime Macros

  Name                   | Description
  -----------------------|--------------
  notification.type      | The type of the notification.
  notification.author    | The author of the notification comment, if existing.
  notification.comment   | The comment of the notification, if existing.

### <a id="global-runtime-macros"></a> Global Runtime Macros

The following macros are available in all executed commands:

  Name                   | Description
  -----------------------|--------------
  icinga.timet           | Current UNIX timestamp.
  icinga.long_date_time  | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
  icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
  icinga.date            | Current date. Example: `2014-01-03`
  icinga.time            | Current time including timezone information. Example: `11:23:08 +0000`
  icinga.uptime          | Current uptime of the Icinga 2 process.

The following macros provide global statistics:

  Name                              | Description
  ----------------------------------|--------------
  icinga.num_services_ok            | Current number of services in state 'OK'.
  icinga.num_services_warning       | Current number of services in state 'Warning'.
  icinga.num_services_critical      | Current number of services in state 'Critical'.
  icinga.num_services_unknown       | Current number of services in state 'Unknown'.
  icinga.num_services_pending       | Current number of pending services.
  icinga.num_services_unreachable   | Current number of unreachable services.
  icinga.num_services_flapping      | Current number of flapping services.
  icinga.num_services_in_downtime   | Current number of services in downtime.
  icinga.num_services_acknowledged  | Current number of acknowledged service problems.
  icinga.num_hosts_up               | Current number of hosts in state 'Up'.
  icinga.num_hosts_down             | Current number of hosts in state 'Down'.
  icinga.num_hosts_unreachable      | Current number of unreachable hosts.
  icinga.num_hosts_flapping         | Current number of flapping hosts.
  icinga.num_hosts_in_downtime      | Current number of hosts in downtime.
  icinga.num_hosts_acknowledged     | Current number of acknowledged host problems.
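
As a sketch of how the runtime macros from the tables above might be consumed, the following notification command passes a few of them to a script as environment variables. The command name `example-service-notification`, the script path and the environment variable names are assumptions for this illustration; only the macro names are taken from the tables above:

```
object NotificationCommand "example-service-notification" {
  import "plugin-notification-command"

  // hypothetical script location
  command = [ SysconfDir + "/icinga2/scripts/example-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    SERVICENAME = "$service.name$"
    SERVICESTATE = "$service.state$"
    SERVICEOUTPUT = "$service.output$"
    USERNAME = "$user.name$"
    LONGDATETIME = "$icinga.long_date_time$"
  }
}
```

The macros are resolved each time the command is executed, so the script would see the values that are current for that notification.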


## <a id="using-apply"></a> Apply Rules

Instead of assigning each object ([Service](6-object-types.md#objecttype-service),
[Notification](6-object-types.md#objecttype-notification), [Dependency](6-object-types.md#objecttype-dependency),
[ScheduledDowntime](6-object-types.md#objecttype-scheduleddowntime))
based on attribute identifiers, for example `host_name`, objects can be [applied](20-language-reference.md#apply).

Before you start using the apply rules keep the following in mind:

* Define the best match.
    * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes) for these hosts/services?
    * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
    * A generic pattern [match](20-language-reference.md#function-calls) on the host/service name?
    * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](20-language-reference.md#expression-operators)
* All expressions must return a boolean value (for example, an empty string is equal to `false`)

> You can set/override object attributes in apply rules using the respectively available
> objects in that scope (host and/or service objects).

[Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
not only for matching on their existence or values in apply expressions, but also to assign
("inherit") their values to the objects generated by apply rules.

* [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
* [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
* [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
* [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)

A more advanced example is using [apply with for loops on arrays or
dictionaries](3-monitoring-basics.md#using-apply-for), for example provided by
[custom attributes](3-monitoring-basics.md#custom-attributes) or groups.

> Building configuration in that dynamic way requires detailed information
> of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
> after successful [configuration validation](8-cli-commands.md#config-validation).


### <a id="using-apply-expressions"></a> Apply Rules Expressions

You can use simple or advanced combinations of apply rule expressions. Each
expression must evaluate to the boolean `true` value for the rule to match. An empty string,
for instance, is interpreted as `false`. In a similar fashion, undefined
attributes evaluate to `false`:

    assign where host.vars.attribute_does_not_exist

Multiple `assign where` condition rows are evaluated as `OR` condition.
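
A minimal sketch of the `OR` behavior, assuming a hypothetical service name `ssh-or-example` and hypothetical `os` values:

```
apply Service "ssh-or-example" {
  check_command = "ssh"

  assign where host.vars.os == "Linux"      // first condition row
  assign where host.vars.os == "FreeBSD"    // second condition row, combined with OR
}
```

A host would receive the service if either row matches, which should be equivalent to writing both comparisons in one row joined by `||`.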

You can combine multiple expressions for matching only a subset of objects. In some cases,
you want to be able to add more than one assign/ignore where expression which matches
a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.

The following example matches all hosts whose name matches the `*mysql*` pattern and (`&&`) whose
custom attribute `prod_mysql_db` matches the `db-*` pattern. Hosts with the custom attribute
`test_server` set to `true`, or whose name matches the `*internal` pattern, are ignored.

    object HostGroup "mysql-server" {
      display_name = "MySQL Server"

      assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
      ignore where host.vars.test_server == true
      ignore where match("*internal", host.name)
    }

A similar example for advanced notification apply rule filters: The notification is applied if the
service attribute `notes` contains the `has gold support 24x7` string `AND` one of the
following two conditions is met: either the `customer` host custom attribute is set to `customer-xy`
`OR` the host custom attribute `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `*internal`
`OR` whose `priority` custom attribute is [less than](20-language-reference.md#expression-operators) `2`
`AND` whose host has the `is_clustered` custom attribute set to `true`.

    template Notification "cust-xy-notification" {
      users = [ "noc-xy", "mgmt-xy" ]
      command = "mail-service-notification"
    }

    apply Notification "notify-cust-xy-mysql" to Service {
      import "cust-xy-notification"

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }


### <a id="using-apply-services"></a> Apply Services to Hosts

The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.

The example for `ssh` applies a service object to all hosts that have the `address`
attribute defined and the custom attribute `os` set to the string `Linux` in `vars`.

    apply Service "ssh" {
      import "generic-service"

      check_command = "ssh"

      assign where host.address && host.vars.os == "Linux"
    }

Other detailed scenario examples are used in their respective chapters, for example
[apply services with custom command arguments](3-monitoring-basics.md#command-passing-parameters).

### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services

Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

    apply Notification "mail-noc" to Service {
      import "mail-service-notification"

      user_groups = [ "noc" ]

      assign where host.vars.notification.mail
    }

In this example the `mail-noc` notification will be created as object for all services whose host has the
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.
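
A host that this rule would match could look like the following sketch. The host name `example-web1` and the `groups` key inside the dictionary are assumptions for this illustration; the rule above only requires that `vars.notification.mail` is set to a non-empty value:

```
object Host "example-web1" {
  check_command = "hostalive"

  // nested custom attribute: vars.notification is a dictionary with the key "mail"
  vars.notification["mail"] = {
    groups = [ "noc" ]
  }
}
```

Hosts without this custom attribute would not get the `mail-noc` notification for their services.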

It is also possible to generally apply a notification template and dynamically overwrite values from
the template by checking for custom attributes. This can be achieved by using [conditional statements](20-language-reference.md#conditional-statements):

    apply Notification "host-mail-noc" to Host {
      import "mail-host-notification"

      // replace interval inherited from `mail-host-notification` template with new notification interval set by a host custom attribute
      if (host.vars.notification_interval) {
        interval = host.vars.notification_interval
      }

      // same with notification period
      if (host.vars.notification_period) {
        period = host.vars.notification_period
      }

      // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
      if (host.vars.notification_type == "sms") {
        command = "sms-host-notification"
      } else {
        command = "mail-host-notification"
      }

      user_groups = [ "noc" ]

      assign where host.address
    }

In the example above, the notification template `mail-host-notification`, which contains all relevant
notification settings, is applied to all host objects where `host.address` is defined.
Each host object is then checked for custom attributes (`host.vars.notification_interval`,
`host.vars.notification_period` and `host.vars.notification_type`). Depending on whether the custom
attribute is set and which value it has, the value from the notification template is dynamically
overwritten.

The corresponding Host object could look like this:

    object Host "host1" {
      import "host-linux-prod"
      display_name = "host1"
      address = "192.168.1.50"
      vars.notification_interval = 1h
      vars.notification_period = "24x7"
      vars.notification_type = "sms"
    }

### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services

Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.

### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services

The sample configuration includes an example in [downtimes.conf](4-configuring-icinga-2.md#downtimes-conf).

Detailed examples can be found in the [recurring downtimes](5-advanced-topics.md#recurring-downtimes) chapter.


### <a id="using-apply-for"></a> Using Apply For Rules

In addition to the standard way of using [apply rules](3-monitoring-basics.md#using-apply),
you may need to generate objects from apply rules based on a set (array or
dictionary).

The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.

Take the following example: A host provides the SNMP OIDs for different service check
types. This could look like this:

    object Host "router-v6" {
      check_command = "hostalive"

      vars.oids["if01"] = "1.1.1.1.1"
      vars.oids["temp"] = "1.1.1.1.2"
      vars.oids["bgp"] = "1.1.1.1.5"
    }

Now we want to create service checks for `if01` and `temp` but not `bgp`.
Furthermore we want to pass the SNMP OID stored as dictionary value to the
custom attribute called `vars.snmp_oid` - this is the command argument required
by the [snmp](7-icinga-template-library.md#plugin-check-command-snmp) check command.
The service's `display_name` should be set to the identifier inside the dictionary.

    apply Service for (identifier => oid in host.vars.oids) {
      check_command = "snmp"
      display_name = identifier
      vars.snmp_oid = oid

      ignore where identifier == "bgp" //don't generate service for bgp checks
    }

Icinga 2 evaluates the `apply for` rule for all objects with the custom attribute
`oids` set. It then iterates over all dictionary items inside the `for` loop and evaluates the
`assign/ignore where` expressions. You can access the loop variable
in these expressions, e.g. for ignoring certain values.
In this example we'd ignore the `bgp` identifier and avoid generating an unwanted service.
We could extend the configuration by also matching the `oid` value on certain regex/wildcard
patterns for example.

> You don't need an `assign where` expression only checking for existence
> of the custom attribute.

That way you avoid duplicated apply rules by combining them into one
generic `apply for` rule generating the object name with or without a prefix.
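
Matching on the `oid` value instead of the identifier could be sketched as follows. This is a variant of the rule above rather than an addition to it, and the wildcard pattern is a hypothetical choice for this illustration:

```
apply Service for (identifier => oid in host.vars.oids) {
  check_command = "snmp"
  display_name = identifier
  vars.snmp_oid = oid

  // skip every entry whose OID value matches the hypothetical wildcard pattern
  ignore where match("1.1.1.1.5*", oid)
}
```

With the `router-v6` host from above, this would again skip the `bgp` entry, because its OID value matches the pattern.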


#### <a id="using-apply-for-custom-attribute-override"></a> Apply For and Custom Attribute Override

Imagine a different, more advanced example: You are monitoring your network device (host)
with many interfaces (services). The following requirements/problems apply:

* Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc.)
* Each interface has its own VLAN tag
* Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
dynamically generated

Tip: Define the SNMP community as a global constant in your [constants.conf](4-configuring-icinga-2.md#constants-conf) file.

    const IftrafficSnmpCommunity = "public"

By defining the `interfaces` dictionary with three example interfaces on the `cisco-catalyst-6509-34`
host object, you provide the [custom attribute](3-monitoring-basics.md#custom-attributes)
data required by the for loop in the service apply rule.

    object Host "cisco-catalyst-6509-34" {
      import "generic-host"
      display_name = "Catalyst 6509 #34 VIE21"
      address = "127.0.1.4"

      /* "GigabitEthernet0/2" is the interface name,
       * and key name in service apply for later on
       */
      vars.interfaces["GigabitEthernet0/2"] = {
        /* define all custom attributes with the
         * same name required for command parameters/arguments
         * in service apply (look into your CheckCommand definition)
         */
        iftraffic_units = "g"
        iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }
      vars.interfaces["GigabitEthernet0/4"] = {
        iftraffic_units = "g"
        //iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }
      vars.interfaces["MgmtInterface1"] = {
        iftraffic_community = IftrafficSnmpCommunity
        interface_address = "127.99.0.100" #special management ip
      }
    }

You can also omit the `"if-"` string. In that case all generated service names are taken
directly from the `interface_name` variable value.

The `interface_config` dictionary contains all key-value pairs for the specific interface in one
loop cycle, like `iftraffic_units`, `vlan`, and `qos`.

You can map the custom attributes from the `interface_config` dictionary individually to
local custom attributes in `vars`. Alternatively, if the names already match the required command
argument parameters (for example `iftraffic_units`), you can add the whole
`interface_config` dictionary to the `vars` dictionary using the `+=` operator.

After `vars` is fully populated, all object attributes can be calculated from the
provided host attributes. For strings, you can use string concatenation with the `+` operator.

You can also specify the `display_name`, check command, interval, `notes`, `notes_url`, `action_url`, etc.
attributes that way. Attribute strings can be [concatenated](20-language-reference.md#expression-operators),
for example for adding a more detailed service `display_name`.

This example also uses [if conditions](20-language-reference.md#conditional-statements)
to add a local default value if specific values are not set.
Conversely, you can override specific custom attributes inherited from a service template
if they are set.
769 /* loop over the host.vars.interfaces dictionary
770 * for (key => value in dict) means `interface_name` as key
771 * and `interface_config` as value. Access config attributes
772 * with the indexer (`.`) character.
774 apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
775 import "generic-service"
776 check_command = "iftraffic"
777 display_name = "IF-" + interface_name
779 /* use the key as command argument (no duplication of values in host.vars.interfaces) */
780 vars.iftraffic_interface = interface_name
782 /* map the custom attributes as command arguments */
783 vars.iftraffic_units = interface_config.iftraffic_units
784 vars.iftraffic_community = interface_config.iftraffic_community
786 /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
787 * are the _exact_ same as required as command parameter by the check command
790 vars += interface_config
792 /* set a default value for units and bandwidth */
793 if (interface_config.iftraffic_units == "") {
794 vars.iftraffic_units = "m"
796 if (interface_config.iftraffic_bandwidth == "") {
797 vars.iftraffic_bandwidth = 1
799 if (interface_config.vlan == "") {
800 vars.vlan = "not set"
802 if (interface_config.qos == "") {
      /* set the global constant if the community is not explicitly
       * provided by the `interfaces` dictionary on the host
809 if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
810 vars.iftraffic_community = IftrafficSnmpCommunity
813 /* Calculate some additional object attributes after populating the `vars` dictionary */
      notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with QoS '" + vars.qos + "'"
815 notes_url = "http://foreman.company.com/hosts/" + host.name
816 action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
821 This example makes use of the [check_iftraffic](https://exchange.icinga.org/exchange/iftraffic) plugin.
822 The `CheckCommand` definition can be found in the
823 [contributed plugin check commands](7-icinga-template-library.md#plugins-contrib-command-iftraffic)
824 - make sure to include them in your [icinga2 configuration file](4-configuring-icinga-2.md#icinga2-conf).
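As a minimal sketch of that include, assuming a packaged installation where the contributed plugin check commands are shipped as the `plugins-contrib` library, the relevant lines could look as follows. The apply rule above also falls back to the `IftrafficSnmpCommunity` constant, so it has to be defined somewhere; placing it in `constants.conf` with the value `"public"` is an assumption made for illustration.

```icinga2
/* icinga2.conf: load the ITL, the plugin check commands and the contributed ones */
include <itl>
include <plugins>
include <plugins-contrib>

/* constants.conf: fallback SNMP community used by the apply rule (example value) */
const IftrafficSnmpCommunity = "public"
```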
829 > Building configuration in that dynamic way requires detailed information
830 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
831 > after successful [configuration validation](8-cli-commands.md#config-validation).
833 Verify that the apply-for-rule successfully created the service objects with the
834 inherited custom attributes:
837 # icinga2 object list --type Service --name *catalyst*
839 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
842 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
843 * iftraffic_bandwidth = 1
844 * iftraffic_community = "public"
845 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
846 * iftraffic_interface = "GigabitEthernet0/2"
847 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
848 * iftraffic_units = "g"
849 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
854 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
857 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
858 * iftraffic_bandwidth = 1
859 * iftraffic_community = "public"
860 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
861 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
862 * iftraffic_interface = "GigabitEthernet0/4"
863 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
864 * iftraffic_units = "g"
865 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
869 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
872 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
873 * iftraffic_bandwidth = 1
874 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
875 * iftraffic_community = "public"
876 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
877 * iftraffic_interface = "MgmtInterface1"
878 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
879 * iftraffic_units = "m"
880 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
881 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
882 * interface_address = "127.99.0.100"
884 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
888 ### <a id="using-apply-object-attributes"></a> Use Object Attributes in Apply Rules
890 Since apply rules are evaluated after the generic objects, you
891 can reference existing host and/or service object attributes as
892 values for any object attribute specified in that apply rule.
894 object Host "opennebula-host" {
895 import "generic-host"
898 vars.hosting["xyz"] = {
900 customer_name = "Customer xyz"
902 support_contract = "gold"
904 vars.hosting["abc"] = {
        customer_name = "Customer abc"
908 support_contract = "silver"
912 apply Service for (customer => config in host.vars.hosting) {
913 import "generic-service"
914 check_command = "ping4"
916 vars.qos = "disabled"
920 vars.http_uri = "/" + vars.customer + "/" + config.http_uri
922 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
924 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
926 notes_url = "http://foreman.company.com/hosts/" + host.name
927 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
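Host attributes can be referenced in the same way in a plain apply rule without a `for` loop. The following is only a sketch: the service name `ssh-info` is made up, and it assumes that the `ssh` check command from the ITL is available.

```icinga2
apply Service "ssh-info" {
  import "generic-service"
  check_command = "ssh"

  /* host attributes are available while the rule is evaluated for each host */
  display_name = "SSH on " + host.name
  notes = "Checked against address " + host.address

  assign where host.address
}
```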
930 ## <a id="groups"></a> Groups
932 A group is a collection of similar objects. Groups are primarily used as a
933 visualization aid in web interfaces.
Group membership is defined at the respective object itself. If
you have a host group named `windows`, for example, and want to assign
specific hosts to this group so that you can later view the group on your
alert dashboard, first create a `HostGroup` object:
940 object HostGroup "windows" {
941 display_name = "Windows Servers"
944 Then add your hosts to this group:
946 template Host "windows-server" {
947 groups += [ "windows" ]
950 object Host "mssql-srv1" {
951 import "windows-server"
953 vars.mssql_port = 1433
956 object Host "mssql-srv2" {
957 import "windows-server"
959 vars.mssql_port = 1433
962 This can be done for service and user groups the same way:
964 object UserGroup "windows-mssql-admins" {
965 display_name = "Windows MSSQL Admins"
968 template User "generic-windows-mssql-users" {
969 groups += [ "windows-mssql-admins" ]
972 object User "win-mssql-noc" {
973 import "generic-windows-mssql-users"
975 email = "noc@example.com"
978 object User "win-mssql-ops" {
979 import "generic-windows-mssql-users"
981 email = "ops@example.com"
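For service groups the pattern is identical. As a sketch with made-up names (`mssql-services`, `mssql-service` and `mssql-port` are assumptions for illustration, and the `tcp` check command is taken from the ITL), define the `ServiceGroup` object and add the membership through a template:

```icinga2
object ServiceGroup "mssql-services" {
  display_name = "MSSQL Services"
}

template Service "mssql-service" {
  groups += [ "mssql-services" ]
}

apply Service "mssql-port" {
  import "generic-service"
  import "mssql-service"

  check_command = "tcp"
  /* reuse the port defined on the host objects above */
  vars.tcp_port = host.vars.mssql_port

  assign where host.vars.mssql_port
}
```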
984 ### <a id="group-assign-intro"></a> Group Membership Assign
986 Instead of manually assigning each object to a group you can also assign objects
987 to a group based on their attributes:
989 object HostGroup "prod-mssql" {
990 display_name = "Production MSSQL Servers"
992 assign where host.vars.mssql_port && host.vars.prod_mysql_db
993 ignore where host.vars.test_server == true
994 ignore where match("*internal", host.name)
In this example, all hosts with the `vars` attributes `mssql_port` and `prod_mysql_db`
set will be added as members to the host group `prod-mssql`. However, hosts whose name
matches `*internal` or which have the `test_server` attribute set to `true` are not added to this
group.
1002 Details on the `assign where` syntax can be found in the
1003 [Language Reference](20-language-reference.md#apply)
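The same rules work for service groups, where both the `host` and the `service` object can be used in the expressions. A short sketch, assuming a made-up group name and the `test_server` custom attribute from the example above:

```icinga2
object ServiceGroup "http-checks" {
  display_name = "HTTP Checks"

  /* match on the service name, exclude services on test servers */
  assign where match("http*", service.name)
  ignore where host.vars.test_server == true
}
```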
1005 ## <a id="notifications"></a> Notifications
1007 Notifications for service and host problems are an integral part of your
No notifications are sent while a host or service is in a downtime, while its problem
has been acknowledged, or when the dependency logic has determined that the
host/service is unreachable. You can configure additional type and state filters
to refine which notifications are actually sent.
1015 There are many ways of sending notifications, e.g. by e-mail, XMPP,
1016 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
1017 Instead it relies on external mechanisms such as shell scripts to notify users.
1018 More notification methods are listed in the [addons and plugins](14-addons-plugins.md#notification-scripts-interfaces)
1021 A notification specification requires one or more users (and/or user groups)
1022 who will be notified in case of problems. These users must have all custom
1023 attributes defined which will be used in the `NotificationCommand` on execution.
The user `icingaadmin` in the example below will get notified only on `Warning` and
`Critical` problems. In addition to that, `Recovery` notifications are sent (they require the `OK` state).
1028 object User "icingaadmin" {
1029 display_name = "Icinga 2 Admin"
1030 enable_notifications = true
1031 states = [ OK, Warning, Critical ]
1032 types = [ Problem, Recovery ]
1033 email = "icinga@localhost"
1036 If you don't set the `states` and `types` configuration attributes for the `User`
1037 object, notifications for all states and types will be sent.
1039 Details on troubleshooting notification problems can be found [here](17-troubleshooting.md#troubleshooting).
1043 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1044 > in order to execute notification commands.
You should choose which information you (and your notified users) are interested in
in case of an emergency, and also which information does not provide any value to you and
1050 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
You can add all shared attributes to a `Notification` template which is inherited
by the defined notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.
1056 template Notification "generic-notification" {
1059 command = "mail-service-notification"
1061 states = [ Warning, Critical, Unknown ]
1062 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1063 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1068 The time period `24x7` is included as example configuration with Icinga 2.
1070 Use the `apply` keyword to create `Notification` objects for your services:
1072 apply Notification "notify-cust-xy-mysql" to Service {
1073 import "generic-notification"
1075 users = [ "noc-xy", "mgmt-xy" ]
      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
1078 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
1082 Instead of assigning users to notifications, you can also add the `user_groups`
1083 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1084 send notifications to all group members.
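As a sketch, the following rule notifies the `windows-mssql-admins` user group from the groups example instead of individual users. The notification name and the `assign` condition are assumptions for illustration.

```icinga2
apply Notification "notify-mssql-admins" to Service {
  import "generic-notification"

  /* every member of the listed user groups is notified */
  user_groups = [ "windows-mssql-admins" ]

  assign where host.vars.mssql_port
}
```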
1088 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1089 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1091 ### <a id="notification-escalations"></a> Notification Escalations
1093 When a problem notification is sent and a problem still exists at the time of re-notification
1094 you may want to escalate the problem to the next support level. A different approach
1095 is to configure the default notification by email, and escalate the problem via SMS
1096 if not already solved.
1098 You can define notification start and end times as additional configuration
1099 attributes making the `Notification` object a so-called `notification escalation`.
1100 Using templates you can share the basic notification attributes such as users or the
1101 `interval` (and override them for the escalation then).
1103 Using the example from above, you can define additional users being escalated for SMS
1104 notifications between start and end time.
1106 object User "icinga-oncall-2nd-level" {
1107 display_name = "Icinga 2nd Level"
1109 vars.mobile = "+1 555 424642"
1112 object User "icinga-oncall-1st-level" {
1113 display_name = "Icinga 1st Level"
1115 vars.mobile = "+1 555 424642"
1118 Define an additional [NotificationCommand](3-monitoring-basics.md#notification-commands) for SMS notifications.
1122 > The example is not complete as there are many different SMS providers.
1123 > Please note that sending SMS notifications will require an SMS provider
1124 > or local hardware with a SIM card active.
1126 object NotificationCommand "sms-notification" {
1128 PluginDir + "/send_sms_notification",
1133 The two new notification escalations are added onto the local host
1134 and its service `ping4` using the `generic-notification` template.
1135 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1136 command) after `30m` until `1h`.
1140 > The `interval` was set to 15m in the `generic-notification`
1141 > template example. Lower that value in your escalations by using a secondary
1142 > template or by overriding the attribute directly in the `notifications` array
1143 > position for `escalation-sms-2nd-level`.
If the problem is neither resolved nor acknowledged (which would prevent further notifications),
the `icinga-oncall-1st-level` user will be notified through the `escalation-sms-1st-level`
escalation `1h` after the initial problem notification, but only for one hour (`2h` as
the `end` key in the `times` dictionary).
1149 apply Notification "mail" to Service {
1150 import "generic-notification"
1152 command = "mail-notification"
1153 users = [ "icingaadmin" ]
1155 assign where service.name == "ping4"
1158 apply Notification "escalation-sms-2nd-level" to Service {
1159 import "generic-notification"
1161 command = "sms-notification"
1162 users = [ "icinga-oncall-2nd-level" ]
1169 assign where service.name == "ping4"
1172 apply Notification "escalation-sms-1st-level" to Service {
1173 import "generic-notification"
1175 command = "sms-notification"
1176 users = [ "icinga-oncall-1st-level" ]
1183 assign where service.name == "ping4"
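As a self-contained sketch of an escalation, the rule below combines a `times` window with a lower `interval`, as suggested in the note above. The notification name `escalation-sms-example` and the `5m` interval are illustrative assumptions; the window mirrors the `30m` to `1h` range described for the second level.

```icinga2
apply Notification "escalation-sms-example" to Service {
  import "generic-notification"

  command = "sms-notification"
  users = [ "icinga-oncall-2nd-level" ]

  /* re-notify more often than the template default while the window is open */
  interval = 5m

  /* notification window: from 30 minutes until 1 hour */
  times = {
    begin = 30m
    end = 1h
  }

  assign where service.name == "ping4"
}
```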
1186 ### <a id="notification-delay"></a> Notification Delay
1188 Sometimes the problem in question should not be notified when the notification is due
1189 (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
1190 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
1191 postpone the notification window for 15 minutes. Leave out the `end` key - if not set,
1192 Icinga 2 will not check against any end time for this notification. Make sure to
1193 specify a relatively low notification `interval` to get notified soon enough again.
1195 apply Notification "mail" to Service {
1196 import "generic-notification"
1198 command = "mail-notification"
1199 users = [ "icingaadmin" ]
1203 times.begin = 15m // delay notification window
1205 assign where service.name == "ping4"
1208 ### <a id="disable-renotification"></a> Disable Re-notifications
1210 If you prefer to be notified only once, you can disable re-notifications by setting the
1211 `interval` attribute to `0`.
1213 apply Notification "notify-once" to Service {
1214 import "generic-notification"
1216 command = "mail-notification"
1217 users = [ "icingaadmin" ]
1219 interval = 0 // disable re-notification
1221 assign where service.name == "ping4"
1224 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
1226 If there are no notification state and type filter attributes defined at the `Notification`
1227 or `User` object Icinga 2 assumes that all states and types are being notified.
1229 Available state and type filters for notifications are:
1231 template Notification "generic-notification" {
1233 states = [ Warning, Critical, Unknown ]
1234 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1235 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state to allow more fine-grained filtering, for example on downtimes and flapping.
1240 You can filter for acknowledgements and custom notifications too.
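The same filters can be set on a `User` object. As a sketch with a made-up user name and e-mail address, the following user only receives problem, acknowledgement and custom notifications for `Critical` states:

```icinga2
object User "oncall-critical-only" {
  display_name = "On-call (critical only)"
  enable_notifications = true

  /* only Critical states, and no Recovery notifications for this user */
  states = [ Critical ]
  types = [ Problem, Acknowledgement, Custom ]

  email = "oncall@example.com"
}
```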
1243 ## <a id="commands"></a> Commands
1245 Icinga 2 uses three different command object types to specify how
1246 checks should be performed, notifications should be sent, and
1247 events should be handled.
1249 ### <a id="check-commands"></a> Check Commands
1251 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects define the command line how
1254 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects are referenced by
1255 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1256 using the `check_command` attribute.
1260 > Make sure that the [checker](8-cli-commands.md#features) feature is enabled in order to
1263 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1265 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects require the [ITL template](7-icinga-template-library.md#itl-plugin-check-command)
1266 `plugin-check-command` to support native plugin based check methods.
1268 Unless you have done so already, download your check plugin and put it
1269 into the [PluginDir](4-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1270 `check_mysql` plugin contained in the Monitoring Plugins package.
The plugin path and all command arguments are specified as a list of
double-quoted strings for proper shell escaping.

Call the `check_mysql` plugin with the `--help` parameter to see
all available options. The usage summary lists the supported parameters,
for example the host (`-H`), the user (`-u`) and the password (`-p`),
which the `CheckCommand` definition should cover.
1280 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1283 This program tests connections to a MySQL server
1286 check_mysql [-d database] [-H host] [-P port] [-s socket]
1287 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1288 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
The next step is to understand how [command parameters](3-monitoring-basics.md#command-passing-parameters)
are passed from a host or service object, and to add a [CheckCommand](6-object-types.md#objecttype-checkcommand)
definition based on these required parameters and/or default values.
1294 Please continue reading in the [plugins section](14-addons-plugins.md#plugins) for additional integration examples.
1296 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1298 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1299 by the executed check command.
1301 The check command parameters for ITL provided plugin check command definitions are documented
1302 [here](7-icinga-template-library.md#plugin-check-commands), for example
1303 [disk](7-icinga-template-library.md#plugin-check-command-disk).
1305 In order to practice passing command parameters you should [integrate your own plugin](3-monitoring-basics.md#command-plugin-integration).
1307 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](2-getting-started.md#setting-up-check-plugins).
Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
(freely definable naming schema), and optionally their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
on the command line.
1316 > Use a common command type as prefix for your command arguments to increase
1317 > readability. `mysql_user` helps understanding the context better than just
1318 > `user` as argument.
1320 The default custom attributes can be overridden by the custom attributes
1321 defined in the host or service using the check command `my-mysql`. The custom attributes
1322 can also be inherited from a parent template using additive inheritance (`+=`).
1324 # vim /etc/icinga2/conf.d/commands.conf
1326 object CheckCommand "my-mysql" {
1327 import "plugin-check-command"
1329 command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir
1332 "-H" = "$mysql_host$"
1335 value = "$mysql_user$"
1337 "-p" = "$mysql_password$"
1338 "-P" = "$mysql_port$"
1339 "-s" = "$mysql_socket$"
1340 "-a" = "$mysql_cert$"
1341 "-d" = "$mysql_database$"
1342 "-k" = "$mysql_key$"
1343 "-C" = "$mysql_ca_cert$"
1344 "-D" = "$mysql_ca_dir$"
1345 "-L" = "$mysql_ciphers$"
1346 "-f" = "$mysql_optfile$"
1347 "-g" = "$mysql_group$"
1349 set_if = "$mysql_check_slave$"
1350 description = "Check if the slave thread is running properly."
1353 set_if = "$mysql_ssl$"
1354 description = "Use ssl encryption"
1358 vars.mysql_check_slave = false
1359 vars.mysql_ssl = false
1360 vars.mysql_host = "$address$"
The check command definition also sets `mysql_host` to the `$address$` default value. You can override
this command parameter if, for example, your MySQL server is not reachable at the host's own IP address.

Make sure to pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
1367 `MysqlUsername` and `MysqlPassword` are specified as [global constants](4-configuring-icinga-2.md#constants-conf)
1370 # vim /etc/icinga2/conf.d/services.conf
1372 apply Service "mysql-icinga-db-health" {
1373 import "generic-service"
1375 check_command = "my-mysql"
1377 vars.mysql_user = MysqlUsername
1378 vars.mysql_password = MysqlPassword
1380 vars.mysql_database = "icinga"
1381 vars.mysql_host = "192.168.33.11"
1383 assign where match("icinga2*", host.name)
1384 ignore where host.vars.no_health_check == true
1388 Take a different example: The example host configuration in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
1389 also applies an `ssh` service check. Your host's ssh port is not the default `22`, but set to `2022`.
1390 You can pass the command parameter as custom attribute `ssh_port` directly inside the service apply rule
1391 inside [services.conf](4-configuring-icinga-2.md#services-conf):
1393 apply Service "ssh" {
1394 import "generic-service"
1396 check_command = "ssh"
1397 vars.ssh_port = 2022 //custom command parameter
1399 assign where (host.address || host.address6) && host.vars.os == "Linux"
1402 If you prefer this being configured at the host instead of the service, modify the host configuration
1403 object instead. The runtime macro resolving order is described [here](3-monitoring-basics.md#macro-evaluation-order).
1405 object Host NodeName {
1407 vars.ssh_port = 2022
1410 #### <a id="command-passing-parameters-apply-for"></a> Passing Check Command Parameters Using Apply For
The host `my-server` with the generated services from the `basic-partitions` dictionary (see
1413 [apply for](3-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
1414 with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
The custom attribute `disk_partitions` can either hold a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.
1420 object Host "my-server" {
1421 import "generic-host"
1422 address = "127.0.0.1"
1425 vars.local_disks["basic-partitions"] = {
1426 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1430 apply Service for (disk => config in host.vars.local_disks) {
1431 import "generic-service"
1432 check_command = "my-disk"
1436 vars.disk_wfree = "10%"
1437 vars.disk_cfree = "5%"
1441 More details on using arrays in custom attributes can be found in
1442 [this chapter](3-monitoring-basics.md#custom-attributes).
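The `my-disk` check command referenced above is not shown in this section. The following is a minimal sketch of how such a command could map the custom attributes to `check_disk` parameters. It assumes the `-w`, `-c` and `-p` parameters of `check_disk` from the Monitoring Plugins and the `repeat_key` argument option; the default thresholds are made-up values.

```icinga2
object CheckCommand "my-disk" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_disk" ]

  arguments = {
    "-w" = "$disk_wfree$"
    "-c" = "$disk_cfree$"
    "-p" = {
      value = "$disk_partitions$"
      /* for an array value, repeat "-p" for every element */
      repeat_key = true
    }
  }

  /* defaults, overridden by custom attributes set on the service */
  vars.disk_wfree = "20%"
  vars.disk_cfree = "10%"
}
```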
1445 #### <a id="command-arguments"></a> Command Arguments
By defining a check command line using the `command` attribute, Icinga 2
will resolve all macros in the static string or array. Sometimes it is
required to extend the arguments list based on a condition evaluated
at command execution, or to make arguments optional so that they are only
set if the macro value can be resolved by Icinga 2.
1453 object CheckCommand "check_http" {
1454 import "plugin-check-command"
1456 command = [ PluginDir + "/check_http" ]
1459 "-H" = "$http_vhost$"
1460 "-I" = "$http_address$"
1462 "-p" = "$http_port$"
1464 set_if = "$http_ssl$"
1467 set_if = "$http_sni$"
1470 value = "$http_auth_pair$"
1471 description = "Username:password on sites with basic authentication"
1474 set_if = "$http_ignore_body$"
1476 "-r" = "$http_expect_body_regex$"
1477 "-w" = "$http_warn_time$"
1478 "-c" = "$http_critical_time$"
1479 "-e" = "$http_expect$"
1482 vars.http_address = "$address$"
1483 vars.http_ssl = false
1484 vars.http_sni = false
1487 The example shows the `check_http` check command defining the most common
1488 arguments. Each of them is optional by default and will be omitted if
1489 the value is not set. For example if the service calling the check command
1490 does not have `vars.http_port` set, it won't get added to the command
1493 If the `vars.http_ssl` custom attribute is set in the service, host or command
1494 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
1495 numeric value to the command line. String values are not supported.
1497 If the macro value cannot be resolved, Icinga 2 will not add the defined argument
1498 to the final command argument array. Empty strings for macro values won't omit
That way you can use the `check_http` command definition for checks both with
and without SSL enabled, saving you duplicated command definitions.
1504 Details on all available options can be found in the
1505 [CheckCommand object definition](6-object-types.md#objecttype-checkcommand).
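To illustrate how optional arguments behave, here is a sketch of a service using the `check_http` command defined above. The service name and the virtual host are made up, and the host `my-server1` is assumed to exist as in the introductory example.

```icinga2
object Service "https-shop" {
  import "generic-service"
  host_name = "my-server1"
  check_command = "check_http"

  vars.http_vhost = "shop.example.com"

  /* boolean value evaluated by set_if, adds the -S flag */
  vars.http_ssl = true

  /* vars.http_port is not set, so "-p" is not added */
}
```

With these values, `-H`, `-I` (from the `$address$` default) and `-S` should end up on the command line, while the arguments whose macros are not set are left out.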
1508 #### <a id="command-environment-variables"></a> Environment Variables
The `env` command object attribute specifies a dictionary of environment variables whose values are
calculated from either runtime macros or custom attributes. They are exported as environment variables
prior to executing the command.

This is useful, for example, for keeping sensitive information out of the command line
when passing credentials to database checks:
1517 object CheckCommand "mysql-health" {
1518 import "plugin-check-command"
1521 PluginDir + "/check_mysql"
1525 "-H" = "$mysql_address$"
1526 "-d" = "$mysql_database$"
1529 vars.mysql_address = "$address$"
1530 vars.mysql_database = "icinga"
1531 vars.mysql_user = "icinga_check"
1532 vars.mysql_pass = "password"
1534 env.MYSQLUSER = "$mysql_user$"
1535 env.MYSQLPASS = "$mysql_pass$"
1540 ### <a id="notification-commands"></a> Notification Commands
1542 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
1543 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
1545 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
1546 [Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
1548 `NotificationCommand` objects require the [ITL template](7-icinga-template-library.md#itl-plugin-notification-command)
1549 `plugin-notification-command` to support native plugin-based notifications.
1553 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1554 > in order to execute notification commands.
1556 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1557 the current check output) sending an email to the user(s) associated with the
1558 notification itself (`$user.email$`).
1560 If you want to specify default values for some of the custom attribute definitions,
1561 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1563 object NotificationCommand "mail-service-notification" {
1564 import "plugin-notification-command"
1566 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1569 NOTIFICATIONTYPE = "$notification.type$"
1570 SERVICEDESC = "$service.name$"
1571 HOSTALIAS = "$host.display_name$"
1572 HOSTADDRESS = "$address$"
1573 SERVICESTATE = "$service.state$"
1574 LONGDATETIME = "$icinga.long_date_time$"
1575 SERVICEOUTPUT = "$service.output$"
1576 NOTIFICATIONAUTHORNAME = "$notification.author$"
1577 NOTIFICATIONCOMMENT = "$notification.comment$"
1578 HOSTDISPLAYNAME = "$host.display_name$"
1579 SERVICEDISPLAYNAME = "$service.display_name$"
1580 USEREMAIL = "$user.email$"
The `command` attribute in the `mail-service-notification` command refers to the following
shell script. The macros specified in the `env` dictionary are exported
as environment variables and can be used in the notification script:
1589 template=$(cat <<TEMPLATE
1592 Notification Type: $NOTIFICATIONTYPE
1594 Service: $SERVICEDESC
1596 Address: $HOSTADDRESS
1597 State: $SERVICESTATE
1599 Date/Time: $LONGDATETIME
1601 Additional Info: $SERVICEOUTPUT
1603 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
    /usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" "$USEREMAIL"
1611 > This example is for `exim` only. Requires changes for `sendmail` and
1614 While it's possible to specify the entire notification command right
1615 in the NotificationCommand object it is generally advisable to create a
1616 shell script in the `/etc/icinga2/scripts` directory and have the
1617 NotificationCommand object refer to that.
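Default values for custom attributes, as mentioned above for the `vars` dictionary, could be sketched like this. The command name, the `MAILFROM` variable and the `notification_from` custom attribute are hypothetical; the notification script would have to evaluate `$MAILFROM` itself.

```icinga2
object NotificationCommand "mail-service-notification-from" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    SERVICEDISPLAYNAME = "$service.display_name$"
    USEREMAIL = "$user.email$"
    /* hypothetical variable, not used by the script shown above */
    MAILFROM = "$notification_from$"
  }

  /* default value, used unless a custom attribute with the same name
   * is set elsewhere, for example on the service */
  vars.notification_from = "icinga@localhost"
}
```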
1619 ### <a id="event-commands"></a> Event Commands
Unlike notifications, event commands for hosts/services are called on every
check execution if one of these conditions matches:
1624 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1625 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1626 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1628 [EventCommand](6-object-types.md#objecttype-eventcommand) objects are referenced by
1629 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1630 using the `event_command` attribute.
Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
available through runtime macros. Runtime macros such as `$service.state_type$`
and `$service.state$` are processed by Icinga 2 and help you trigger
fine-grained events.
1638 Common use case scenarios are a failing HTTP check requiring an immediate
1639 restart via event command, or if an application is locked and requires
1640 a restart upon detection.
1642 `EventCommand` objects require the ITL template `plugin-event-command`
1643 to support native plugin based checks.
1645 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
The following example will trigger a restart of the `httpd` daemon
1648 via ssh when the `http` service check fails. If the service state is
1649 `OK`, it will not trigger any event action.
1654 * icinga user with public key authentication
1655 * icinga user with sudo permissions for restarting the httpd daemon.
1659 # ls /home/icinga/.ssh/
1663 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1666 Define a generic [EventCommand](6-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1667 which can be used for all event commands triggered using ssh:
1669 /* pass event commands through ssh */
1670 object EventCommand "event_by_ssh" {
1671 import "plugin-event-command"
1673 command = [ PluginDir + "/check_by_ssh" ]
1676 "-H" = "$event_by_ssh_address$"
1677 "-p" = "$event_by_ssh_port$"
1678 "-C" = "$event_by_ssh_command$"
1679 "-l" = "$event_by_ssh_logname$"
1680 "-i" = "$event_by_ssh_identity$"
1682 set_if = "$event_by_ssh_quiet$"
1684 "-w" = "$event_by_ssh_warn$"
1685 "-c" = "$event_by_ssh_crit$"
1686 "-t" = "$event_by_ssh_timeout$"
1689 vars.event_by_ssh_address = "$address$"
1690 vars.event_by_ssh_quiet = false
1693 The actual event command only passes the `event_by_ssh_command` attribute.
1694 The `event_by_ssh_service` custom attribute takes care of passing the correct
1695 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
1696 is only restarted when the service is not in an `OK` state.
1699 object EventCommand "event_by_ssh_restart_service" {
1700 import "event_by_ssh"
1702 //only restart the daemon if state > 0 (not-ok)
1703 //requires sudo permissions for the icinga user
1704 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1708 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1709 which service should be restarted using the `event_by_ssh_service` attribute.
1711 object Service "http" {
1712 import "generic-service"
1713 host_name = "remote-http-host"
1714 check_command = "http"
1716 event_command = "event_by_ssh_restart_service"
1717 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1719 //vars.event_by_ssh_logname = "icinga"
  //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"
1724 Each host with this service then must define the `httpd_name` custom attribute
(for example, generated from your CMDB):
1727 object Host "remote-http-host" {
1728 import "generic-host"
1729 address = "192.168.1.100"
1731 vars.httpd_name = "apache2"
You can test-drive this example by manually stopping the `httpd` daemon
on your `remote-http-host`. Enable the `debuglog` feature and tail the
`/var/log/icinga2/debug.log` file.
1738 Remote Host Terminal:
1740 # date; service apache2 status
1741 Mon Sep 15 18:57:39 CEST 2014
1742 Apache2 is running (pid 23651).
1743 # date; service apache2 stop
1744 Mon Sep 15 18:57:47 CEST 2014
1745 [ ok ] Stopping web server: apache2 ... waiting .
1747 Icinga 2 Host Terminal:
1749 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
1750 [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
1751 [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
1752 [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
1753 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
1754 [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
1756 Remote Host Terminal:
1758 # date; service apache2 status
1759 Mon Sep 15 18:58:44 CEST 2014
1760 Apache2 is running (pid 24908).
1763 ## <a id="dependencies"></a> Dependencies
1765 Icinga 2 uses host and service [Dependency](6-object-types.md#objecttype-dependency) objects
to determine their network reachability.
A service can depend on a host, and vice versa. A service has an implicit
dependency on its host (its parent). A host-to-host dependency acts implicitly
as a host parent relation.
When dependencies are calculated, not only the immediate parent is taken into
account but also all of that parent's own parents, transitively.
1774 The `parent_host_name` and `parent_service_name` attributes are mandatory for
1775 service dependencies, `parent_host_name` is required for host dependencies.
1776 [Apply rules](3-monitoring-basics.md#using-apply) will allow you to
1777 [determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
1778 dynamic fashion if required.
1780 parent_host_name = "core-router"
1781 parent_service_name = "uplink-port"
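
For context, here is a minimal sketch of a complete dependency object built around these two attributes. The object name `http-needs-uplink` is made up for illustration, and the child names reuse the `my-server1` host and its `http` service from the first example in this chapter; adjust them to your own objects.

```icinga2
/* Sketch only: the object name is hypothetical. */
object Dependency "http-needs-uplink" {
  /* parent: the service the child depends on */
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  /* child: the service that becomes unreachable when the parent fails */
  child_host_name = "my-server1"
  child_service_name = "http"
}
```

Omitting `parent_service_name` makes the parent a host, and omitting `child_service_name` makes the child a host.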
1783 Notifications are suppressed by default if a host or service becomes unreachable.
1784 You can control that option by defining the `disable_notifications` attribute.
1786 disable_notifications = false
1788 If the dependency should be triggered in the parent object's soft state, you
1789 need to set `ignore_soft_states` to `false`.
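
As a minimal sketch (the dependency name `uplink-soft-states` is made up for illustration, and `core-router` is the parent name from the snippet above), a host dependency that already takes effect while the parent is in a soft state could look like this:

```icinga2
/* Sketch only: the dependency name is hypothetical. */
apply Dependency "uplink-soft-states" to Host {
  parent_host_name = "core-router"

  /* also evaluate the parent while it is still in a soft state */
  ignore_soft_states = false

  /* the router must not depend on itself */
  assign where host.name != "core-router"
}
```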
1791 The dependency state filter must be defined based on the parent object being
1792 either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
1794 The following example will make the dependency fail and trigger it if the parent
1795 object is **not** in one of these states:
1797 states = [ OK, Critical, Unknown ]
1799 Rephrased: If the parent service object changes into the `Warning` state, this
1800 dependency will fail and render all child objects (hosts or services) unreachable.
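
For comparison, here is a sketch of a filter that tolerates `Warning` on the parent service but fails on `Critical` or `Unknown`. The dependency name is hypothetical; `core-router` and `uplink-port` are the parent names used earlier in this section.

```icinga2
/* Sketch only: the dependency name is hypothetical. */
apply Dependency "uplink-ok-or-warning" to Service {
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  /* the parent is a service, so service states are used here */
  states = [ OK, Warning ]

  assign where host.name != "core-router"
}
```

If the parent were a host instead, the filter would have to use host states, for example `states = [ Up ]`.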
You can determine the child's reachability by querying the `is_reachable` attribute,
for example in [DB IDO](23-appendix.md#schema-db-ido-extensions).
1805 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
1807 Icinga 2 automatically adds an implicit dependency for services on their host. That way
1808 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
1810 `states = [ Up ]` for all service objects.
1812 Service checks are still executed. If you want to prevent them from happening, you can
1813 apply the following dependency to all services setting their host as `parent_host_name`
1814 and disabling the checks. `assign where true` matches on all `Service` objects.
1816 apply Dependency "disable-host-service-checks" to Service {
1817 disable_checks = true
1821 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
1823 A common scenario is the Icinga 2 server behind a router. Checking internet
1824 access by pinging the Google DNS server `google-dns` is a common method, but
1825 will fail in case the `dsl-router` host is down. Therefore the example below
defines a host dependency which also acts implicitly as a parent relation.
Furthermore, the host may be reachable while ping probes are dropped by the
router's firewall. If the `dsl-router`'s `ping4` service check fails, all
further checks for the `ping4` service on host `google-dns` should
be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1833 object Host "dsl-router" {
1834 import "generic-host"
1835 address = "192.168.1.1"
1838 object Host "google-dns" {
1839 import "generic-host"
1843 apply Service "ping4" {
1844 import "generic-service"
1846 check_command = "ping4"
1848 assign where host.address
1851 apply Dependency "internet" to Host {
1852 parent_host_name = "dsl-router"
1853 disable_checks = true
1854 disable_notifications = true
1856 assign where host.name != "dsl-router"
1859 apply Dependency "internet" to Service {
1860 parent_host_name = "dsl-router"
1861 parent_service_name = "ping4"
1862 disable_checks = true
1864 assign where host.name != "dsl-router"
1867 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
1869 You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
1870 child attributes e.g. `parent_host_name` to other object's
A common example is virtual machines hosted on a master. The object
name of that master is auto-generated from your CMDB or VMware inventory
1875 into the host's custom attributes (or a generic template for your
1878 Define your master host object:
1881 object Host "master.example.com" {
1882 import "generic-host"
1885 Add a generic template defining all common host attributes:
1887 /* generic template for your virtual machines */
1888 template Host "generic-vm" {
1889 import "generic-host"
Add a template for all hosts on your example.com cloud, setting the
custom attribute `vm_parent` to `master.example.com`:
1895 template Host "generic-vm-example.com" {
1897 vars.vm_parent = "master.example.com"
1900 Define your guest hosts:
1902 object Host "www.example1.com" {
  import "generic-vm-example.com"
1906 object Host "www.example2.com" {
  import "generic-vm-example.com"
1910 Apply the host dependency to all child hosts importing the
1911 `generic-vm` template and set the `parent_host_name`
1912 to the previously defined custom attribute `host.vars.vm_parent`.
1914 apply Dependency "vm-host-to-parent-master" to Host {
1915 parent_host_name = host.vars.vm_parent
1916 assign where "generic-vm" in host.templates
1919 You can extend this example, and make your services depend on the
1920 `master.example.com` host too. Their local scope allows you to use
1921 `host.vars.vm_parent` similar to the example above.
1923 apply Dependency "vm-service-to-parent-master" to Service {
1924 parent_host_name = host.vars.vm_parent
1925 assign where "generic-vm" in host.templates
That way you don't need to wait for your guest hosts to become
unreachable when the master host goes down. Instead, the services
detect their reachability immediately when executing checks.
1934 > This method with setting locally scoped variables only works in
1935 > apply rules, but not in object definitions.
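
To illustrate the note, here is a sketch of the same relation written as a plain object definition. No `host` variable is available there, so both names have to be spelled out literally. The object name `www1-to-master` is made up for illustration; the host names are the ones defined above.

```icinga2
/* Sketch only: the object name is hypothetical. */
object Dependency "www1-to-master" {
  /* literal names instead of host.vars.vm_parent */
  parent_host_name = "master.example.com"
  child_host_name = "www.example1.com"
}
```

One such object would be needed per guest host, which is why the apply rule is usually the more convenient choice.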
1938 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
Another classic example is agent-based checks. You would define a health check
for the agent daemon responding to your requests, and make all other services
querying that daemon depend on that health check.
The following configuration defines two NRPE-based service checks, `nrpe-load`
and `nrpe-disk`, applied to the `nrpe-server` host. The health check is defined as
the `nrpe-health` service.
1948 apply Service "nrpe-health" {
1949 import "generic-service"
1950 check_command = "nrpe"
1951 assign where match("nrpe-*", host.name)
1954 apply Service "nrpe-load" {
1955 import "generic-service"
1956 check_command = "nrpe"
1957 vars.nrpe_command = "check_load"
1958 assign where match("nrpe-*", host.name)
1961 apply Service "nrpe-disk" {
1962 import "generic-service"
1963 check_command = "nrpe"
1964 vars.nrpe_command = "check_disk"
1965 assign where match("nrpe-*", host.name)
1968 object Host "nrpe-server" {
1969 import "generic-host"
1970 address = "192.168.1.5"
1973 apply Dependency "disable-nrpe-checks" to Service {
1974 parent_service_name = "nrpe-health"
1977 disable_checks = true
1978 disable_notifications = true
1979 assign where service.check_command == "nrpe"
1980 ignore where service.name == "nrpe-health"
The `disable-nrpe-checks` dependency is applied to all services
on the `nrpe-server` host that use the `nrpe` check command,
but not to the `nrpe-health` service itself.
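
Because the service apply rules match on the host name pattern `nrpe-*`, adding another agent host only requires a new host object. The following sketch assumes a second, hypothetical host; its name and address are made up for illustration.

```icinga2
/* Sketch only: host name and address are hypothetical. */
object Host "nrpe-db" {
  import "generic-host"
  address = "192.168.1.6"
}
```

With the rules above, this host would get the `nrpe-health`, `nrpe-load` and `nrpe-disk` services, and the `disable-nrpe-checks` dependency would be applied to `nrpe-load` and `nrpe-disk` on it as well.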