1 # <a id="monitoring-basics"></a> Monitoring Basics
3 This part of the Icinga 2 documentation provides an overview of all the basic
4 monitoring concepts you need to know to run Icinga 2.
5 Keep in mind these examples are made with a linux server in mind, if you are
6 using Windows you will need to change the services accordingly. See the [ITL reference](7-icinga-template-library.md#windows-plugins)
7 for further information.
9 ## <a id="hosts-services"></a> Hosts and Services
11 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
12 and services can be virtually anything which can be checked in some way:
14 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
18 * Other local or network-accessible services
20 Host objects provide a mechanism to group services that are running
21 on the same physical device.
23 Here is an example of a host object which defines two child services:
25 object Host "my-server1" {
27 check_command = "hostalive"
30 object Service "ping4" {
31 host_name = "my-server1"
32 check_command = "ping4"
35 object Service "http" {
36 host_name = "my-server1"
37 check_command = "http"
40 The example creates two services `ping4` and `http` which belong to the
43 It also specifies that the host should perform its own check using the `hostalive`
46 The `address` attribute is used by check commands to determine which network
47 address is associated with the host object.
49 Details on troubleshooting check problems can be found [here](17-troubleshooting.md#troubleshooting).
51 ### <a id="host-states"></a> Host States
53 Hosts can be in any of the following states:
56 ------------|--------------
57 UP | The host is available.
58 DOWN | The host is unavailable.
60 ### <a id="service-states"></a> Service States
62 Services can be in any of the following states:
65 ------------|--------------
66 OK | The service is working properly.
67 WARNING | The service is experiencing some problems but is still considered to be in working condition.
68 CRITICAL | The service is in a critical state.
69 UNKNOWN | The check could not determine the service's state.
71 ### <a id="hard-soft-states"></a> Hard and Soft States
73 When detecting a problem with a host/service Icinga re-checks the object a number of
74 times (based on the `max_check_attempts` and `retry_interval` settings) before sending
75 notifications. This ensures that no unnecessary notifications are sent for
76 transient failures. During this time the object is in a `SOFT` state.
78 After all re-checks have been executed and the object is still in a non-OK
79 state the host/service switches to a `HARD` state and notifications are sent.
82 ------------|--------------
83 HARD | The host/service's state hasn't recently changed.
84 SOFT | The host/service has recently changed state and is being re-checked.
86 ### <a id="host-service-checks"></a> Host and Service Checks
88 Hosts and services determine their state by running checks in a regular interval.
90 object Host "router" {
91 check_command = "hostalive"
95 The `hostalive` command is one of several built-in check commands. It sends ICMP
96 echo requests to the IP address specified in the `address` attribute to determine
97 whether a host is online.
99 A number of other [built-in check commands](7-icinga-template-library.md#plugin-check-commands) are also
100 available. In addition to these commands the next few chapters will explain in
101 detail how to set up your own check commands.
104 ## <a id="object-inheritance-using-templates"></a> Templates
106 Templates may be used to apply a set of identical attributes to more than one
109 template Service "generic-service" {
110 max_check_attempts = 3
113 enable_perfdata = true
116 apply Service "ping4" {
117 import "generic-service"
119 check_command = "ping4"
121 assign where host.address
124 apply Service "ping6" {
125 import "generic-service"
127 check_command = "ping6"
129 assign where host.address6
133 In this example the `ping4` and `ping6` services inherit properties from the
134 template `generic-service`.
136 Objects as well as templates themselves can import an arbitrary number of
137 other templates. Attributes inherited from a template can be overridden in the
140 You can also import existing non-template objects. Note that templates
141 and objects share the same namespace, i.e. you can't define a template
142 that has the same name like an object.
145 ## <a id="custom-attributes"></a> Custom Attributes
147 In addition to built-in attributes you can define your own attributes:
149 object Host "localhost" {
153 Valid values for custom attributes include:
155 * [Strings](20-language-reference.md#string-literals), [numbers](20-language-reference.md#numeric-literals) and [booleans](20-language-reference.md#boolean-literals)
156 * [Arrays](20-language-reference.md#array) and [dictionaries](20-language-reference.md#dictionary)
157 * [Functions](3-monitoring-basics.md#custom-attributes-functions)
159 ### <a id="custom-attributes-functions"></a> Functions as Custom Attributes
161 Icinga 2 lets you specify [functions](20-language-reference.md#functions) for custom attributes.
162 The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
163 the function and uses whatever value the function returns:
165 object CheckCommand "random-value" {
166 import "plugin-check-command"
168 command = [ PluginDir + "/check_dummy", "0", "$text$" ]
170 vars.text = {{ Math.random() * 100 }}
173 This example uses the [abbreviated lambda syntax](20-language-reference.md#nullary-lambdas).
175 These functions have access to a number of variables:
177 Variable | Description
178 -------------|---------------
179 user | The User object (for notifications).
180 service | The Service object (for service checks/notifications/event handlers).
181 host | The Host object.
182 command | The command object (e.g. a CheckCommand object for checks).
186 vars.text = {{ host.check_interval }}
188 In addition to these variables the `macro` function can be used to retrieve the
189 value of arbitrary macro expressions:
192 if (macro("$address$") == "127.0.0.1") {
193 log("Running a check for localhost!")
199 The `resolve_arguments` can be used to resolve a command and its arguments much in
200 the same fashion Icinga does this for the `command` and `arguments` attributes for
201 commands. The `by_ssh` command uses this functionality to let users specify a
202 command and arguments that should be executed via SSH:
206 var command = macro("$by_ssh_command$")
207 var arguments = macro("$by_ssh_arguments$")
209 if (typeof(command) == String && !arguments) {
213 var escaped_args = []
214 for (arg in resolve_arguments(command, arguments)) {
215 escaped_args.add(escape_shell_arg(arg))
217 return escaped_args.join(" ")
222 Acessing object attributes at runtime inside these functions is described in the
223 [advanced topics](5-advanced-topics.md#access-object-attributes-at-runtime) chapter.
225 ## <a id="runtime-macros"></a> Runtime Macros
227 Macros can be used to access other objects' attributes at runtime. For example they
228 are used in command definitions to figure out which IP address a check should be
231 object CheckCommand "my-ping" {
232 import "plugin-check-command"
234 command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]
237 "-w" = "$ping_wrta$,$ping_wpl$%"
238 "-c" = "$ping_crta$,$ping_cpl$%"
239 "-p" = "$ping_packets$"
242 vars.ping_address = "$address$"
250 vars.ping_packets = 5
253 object Host "router" {
254 check_command = "my-ping"
258 In this example we are using the `$address$` macro to refer to the host's `address`
261 We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
262 automatically tries to find the closest match for the attribute you specified. The
263 exact rules for this are explained in the next section.
266 ### <a id="macro-evaluation-order"></a> Evaluation Order
268 When executing commands Icinga 2 checks the following objects in this order to look
269 up macros and their respective values:
271 1. User object (only for notifications)
275 5. Global custom attributes in the `Vars` constant
277 This execution order allows you to define default values for custom attributes
278 in your command objects.
280 Here's how you can override the custom attribute `ping_packets` from the previous
283 object Service "ping" {
284 host_name = "localhost"
285 check_command = "my-ping"
287 vars.ping_packets = 10 // Overrides the default value of 5 given in the command
290 If a custom attribute isn't defined anywhere an empty value is used and a warning is
291 written to the Icinga 2 log.
293 You can also directly refer to a specific attribute - thereby ignoring these evaluation
294 rules - by specifying the full attribute name:
296 $service.vars.ping_wrta$
298 This retrieves the value of the `ping_wrta` custom attribute for the service. This
299 returns an empty value if the service does not have such a custom attribute no matter
300 whether another object such as the host has this attribute.
303 ### <a id="host-runtime-macros"></a> Host Runtime Macros
305 The following host custom attributes are available in all commands that are executed for
309 -----------------------------|--------------
310 host.name | The name of the host object.
311 host.display_name | The value of the `display_name` attribute.
312 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
313 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
314 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
315 host.check_attempt | The current check attempt number.
316 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
317 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
318 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
319 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
320 host.last_state_change | The last state change's timestamp.
321 host.downtime_depth | The number of active downtimes.
322 host.duration_sec | The time since the last state change.
323 host.latency | The host's check latency.
324 host.execution_time | The host's check execution time.
325 host.output | The last check's output.
326 host.perfdata | The last check's performance data.
327 host.last_check | The timestamp when the last check was executed.
328 host.check_source | The monitoring instance that performed the last check.
329 host.num_services | Number of services associated with the host.
330 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
331 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
332 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
333 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
335 ### <a id="service-runtime-macros"></a> Service Runtime Macros
337 The following service macros are available in all commands that are executed for
341 ---------------------------|--------------
342 service.name | The short name of the service object.
343 service.display_name | The value of the `display_name` attribute.
344 service.check_command | The short name of the command along with any arguments to be used for the check.
345 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
346 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
347 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
348 service.check_attempt | The current check attempt number.
349 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
350 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
351 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
352 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
353 service.last_state_change | The last state change's timestamp.
354 service.downtime_depth | The number of active downtimes.
355 service.duration_sec | The time since the last state change.
356 service.latency | The service's check latency.
357 service.execution_time | The service's check execution time.
358 service.output | The last check's output.
359 service.perfdata | The last check's performance data.
360 service.last_check | The timestamp when the last check was executed.
361 service.check_source | The monitoring instance that performed the last check.
363 ### <a id="command-runtime-macros"></a> Command Runtime Macros
365 The following custom attributes are available in all commands:
368 -----------------------|--------------
369 command.name | The name of the command object.
371 ### <a id="user-runtime-macros"></a> User Runtime Macros
373 The following custom attributes are available in all commands that are executed for
377 -----------------------|--------------
378 user.name | The name of the user object.
379 user.display_name | The value of the display_name attribute.
381 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros
384 -----------------------|--------------
385 notification.type | The type of the notification.
386 notification.author | The author of the notification comment, if existing.
387 notification.comment | The comment of the notification, if existing.
389 ### <a id="global-runtime-macros"></a> Global Runtime Macros
391 The following macros are available in all executed commands:
394 -----------------------|--------------
395 icinga.timet | Current UNIX timestamp.
396 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
397 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
398 icinga.date | Current date. Example: `2014-01-03`
399 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
400 icinga.uptime | Current uptime of the Icinga 2 process.
402 The following macros provide global statistics:
405 ----------------------------------|--------------
406 icinga.num_services_ok | Current number of services in state 'OK'.
407 icinga.num_services_warning | Current number of services in state 'Warning'.
408 icinga.num_services_critical | Current number of services in state 'Critical'.
409 icinga.num_services_unknown | Current number of services in state 'Unknown'.
410 icinga.num_services_pending | Current number of pending services.
411 icinga.num_services_unreachable | Current number of unreachable services.
412 icinga.num_services_flapping | Current number of flapping services.
413 icinga.num_services_in_downtime | Current number of services in downtime.
414 icinga.num_services_acknowledged | Current number of acknowledged service problems.
415 icinga.num_hosts_up | Current number of hosts in state 'Up'.
416 icinga.num_hosts_down | Current number of hosts in state 'Down'.
417 icinga.num_hosts_unreachable | Current number of unreachable hosts.
418 icinga.num_hosts_flapping | Current number of flapping hosts.
419 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
420 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
423 ## <a id="using-apply"></a> Apply Rules
425 Instead of assigning each object ([Service](6-object-types.md#objecttype-service),
426 [Notification](6-object-types.md#objecttype-notification), [Dependency](6-object-types.md#objecttype-dependency),
427 [ScheduledDowntime](6-object-types.md#objecttype-scheduleddowntime))
428 based on attribute identifiers for example `host_name` objects can be [applied](20-language-reference.md#apply).
430 Before you start using the apply rules keep the following in mind:
432 * Define the best match.
433 * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes) for these hosts/services?
434 * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
435 * A generic pattern [match](20-language-reference.md#function-calls) on the host/service name?
436 * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](20-language-reference.md#expression-operators)
437 * All expressions must return a boolean value (an empty string is equal to `false` e.g.)
441 > You can set/override object attributes in apply rules using the respectively available
442 > objects in that scope (host and/or service objects).
444 [Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
445 for not only matching for their existance or values in apply expressions, but also assign
446 ("inherit") their values into the generated objected from apply rules.
448 * [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
449 * [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
450 * [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
451 * [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)
453 A more advanced example is using [apply with for loops on arrays or
454 dictionaries](3-monitoring-basics.md#using-apply-for) for example provided by
455 [custom atttributes](3-monitoring-basics.md#custom-attributes) or groups.
459 > Building configuration in that dynamic way requires detailed information
460 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
461 > after successful [configuration validation](8-cli-commands.md#config-validation).
464 ### <a id="using-apply-expressions"></a> Apply Rules Expressions
466 You can use simple or advanced combinations of apply rule expressions. Each
467 expression must evaluate into the boolean `true` value. An empty string
468 will be for instance interpreted as `false`. In a similar fashion undefined
469 attributes will return `false`.
473 assign where host.vars.attribute_does_not_exist
475 Multiple `assign where` condition rows are evaluated as `OR` condition.
477 You can combine multiple expressions for matching only a subset of objects. In some cases,
478 you want to be able to add more than one assign/ignore where expression which matches
479 a specific condition. To achieve this you can use the logical `and` and `or` operators.
482 Match all `*mysql*` patterns in the host name and (`&&`) custom attribute `prod_mysql_db`
483 matches the `db-*` pattern. All hosts with the custom attribute `test_server` set to `true`
484 should be ignored, or any host name ending with `*internal` pattern.
486 object HostGroup "mysql-server" {
487 display_name = "MySQL Server"
489 assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
490 ignore where host.vars.test_server == true
491 ignore where match("*internal", host.name)
494 Similar example for advanced notification apply rule filters: If the service
495 attribute `notes` contains the `has gold support 24x7` string `AND` one of the
496 two condition passes: Either the `customer` host custom attribute is set to `customer-xy`
497 `OR` the host custom attribute `always_notify` is set to `true`.
499 The notification is ignored for services whose host name ends with `*internal`
500 `OR` the `priority` custom attribute is [less than](20-language-reference.md#expression-operators) `2`.
502 template Notification "cust-xy-notification" {
503 users = [ "noc-xy", "mgmt-xy" ]
504 command = "mail-service-notification"
507 apply Notification "notify-cust-xy-mysql" to Service {
508 import "cust-xy-notification"
510 assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
511 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
514 ### <a id="using-apply-services"></a> Apply Services to Hosts
516 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
517 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
519 The example for `ssh` applies a service object to all hosts with the `address`
520 attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.
522 apply Service "ssh" {
523 import "generic-service"
525 check_command = "ssh"
527 assign where host.address && host.vars.os == "Linux"
531 Other detailed scenario examples are used in their respective chapters, for example
532 [apply services with custom command arguments](3-monitoring-basics.md#command-passing-parameters).
534 ### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
536 Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
540 apply Notification "mail-noc" to Service {
541 import "mail-service-notification"
543 user_groups = [ "noc" ]
545 assign where host.vars.notification.mail
549 In this example the `mail-noc` notification will be created as object for all services having the
550 `notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
551 and all members of the user group `noc` will get notified.
553 It is also possible to generally apply a notification template and dynamically overwrite values from
554 the template by checking for custom attributes. This can be achieved by using [conditional statements](20-language-reference.md#conditional-statements):
556 apply Notification "host-mail-noc" to Host {
557 import "mail-host-notification"
559 // replace interval inherited from `mail-host-notification` template with new notfication interval set by a host custom attribute
560 if (host.vars.notification_interval) {
561 interval = host.vars.notification_interval
564 // same with notification period
565 if (host.vars.notification_period) {
566 interval = host.vars.notification_period
569 // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
570 if (host.vars.notification_type == "sms") {
571 command = "sms-host-notification"
573 command = "mail-host-notification"
576 user_groups = [ "noc" ]
578 assign where host.address
581 In the example above, the notification template `mail-host-notification`, which contains all relevant
582 notification settings, is applied on all host objects where the `host.address` is defined.
583 Each host object is then checked for custom attributes (`host.vars.notification_interval`,
584 `host.vars.notification_period` and `host.vars.notification_type`). Depending if the custom
585 attibute is set or which value it has, the value from the notification template is dynamically
588 The corresponding Host object could look like this:
590 object Host "host1" {
591 import "host-linux-prod"
592 display_name = "host1"
593 address = "192.168.1.50"
594 vars.notification_interval = 1h
595 vars.notification_period = "24x7"
596 vars.notification_type = "sms"
599 ### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
601 Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
603 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
605 The sample configuration includes an example in [downtimes.conf](4-configuring-icinga-2.md#downtimes-conf).
607 Detailed examples can be found in the [recurring downtimes](5-advanced-topics.md#recurring-downtimes) chapter.
610 ### <a id="using-apply-for"></a> Using Apply For Rules
612 Next to the standard way of using [apply rules](3-monitoring-basics.md#using-apply)
613 there is the requirement of generating apply rules objects based on set (array or
616 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
617 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
619 Take the following example: A host provides the snmp oids for different service check
620 types. This could look like the following example:
622 object Host "router-v6" {
623 check_command = "hostalive"
626 vars.oids["if01"] = "1.1.1.1.1"
627 vars.oids["temp"] = "1.1.1.1.2"
628 vars.oids["bgp"] = "1.1.1.1.5"
631 Now we want to create service checks for `if01` and `temp` but not `bgp`.
632 Furthermore we want to pass the snmp oid stored as dictionary value to the
633 custom attribute called `vars.snmp_oid` - this is the command argument required
634 by the [snmp](7-icinga-template-library.md#plugin-check-command-snmp) check command.
635 The service's `display_name` should be set to the identifier inside the dictionary.
637 apply Service for (identifier => oid in host.vars.oids) {
638 check_command = "snmp"
639 display_name = identifier
642 ignore where identifier == "bgp" //don't generate service for bgp checks
645 Icinga 2 evaluates the `apply for` rule for all objects with the custom attribute
646 `oids` set. It then iterates over all list items inside the `for` loop and evaluates the
647 `assign/ignore where` expressions. You can access the loop variable
648 in these expressions, e.g. for ignoring certain values.
649 In this example we'd ignore the `bgp` identifier and avoid generating an unwanted service.
650 We could extend the configuration by also matching the `oid` value on certain regex/wildcard
651 patterns for example.
655 > You don't need an `assign where` expression only checking for existance
656 > of the custom attribute.
658 That way you'll save duplicated apply rules by combining them into one
659 generic `apply for` rule generating the object name with or without a prefix.
662 #### <a id="using-apply-for-custom-attribute-override"></a> Apply For and Custom Attribute Override
664 Imagine a different more advanced example: You are monitoring your network device (host)
665 with many interfaces (services). The following requirements/problems apply:
667 * Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc)
668 * Each interface has its own vlan tag
669 * Some interfaces have QoS enabled
670 * Additional attributes such as `display_name` or `notes, `notes_url` and `action_url` must be
671 dynamically generated
674 Tip: Define the snmp community as global constant in your [constants.conf](4-configuring-icinga-2.md#constants-conf) file.
676 const IftrafficSnmpCommunity = "public"
678 By defining the `interfaces` dictionary with three example interfaces on the `cisco-catalyst-6509-34`
679 host object, you'll make sure to pass the [custom attribute](3-monitoring-basics.md#custom-attributes)
680 storage required by the for loop in the service apply rule.
682 object Host "cisco-catalyst-6509-34" {
683 import "generic-host"
684 display_name = "Catalyst 6509 #34 VIE21"
685 address = "127.0.1.4"
687 /* "GigabitEthernet0/2" is the interface name,
688 * and key name in service apply for later on
690 vars.interfaces["GigabitEthernet0/2"] = {
691 /* define all custom attributes with the
692 * same name required for command parameters/arguments
693 * in service apply (look into your CheckCommand definition)
695 iftraffic_units = "g"
696 iftraffic_community = IftrafficSnmpCommunity
697 iftraffic_bandwidth = 1
701 vars.interfaces["GigabitEthernet0/4"] = {
702 iftraffic_units = "g"
703 //iftraffic_community = IftrafficSnmpCommunity
704 iftraffic_bandwidth = 1
708 vars.interfaces["MgmtInterface1"] = {
709 iftraffic_community = IftrafficSnmpCommunity
711 interface_address = "127.99.0.100" #special management ip
715 You can also omit the `"if-"` string, then all generated service names are directly
716 taken from the `if_name` variable value.
718 The config dictionary contains all key-value pairs for the specific interface in one
719 loop cycle, like `iftraffic_units`, `vlan`, and `qos` for the specified interface.
721 You can either map the custom attributes from the `interface_config` dictionary to
722 local custom attributes stashed into `vars`. If the names match the required command
723 argument parameters already (for example `iftraffic_units`), you could also add the
724 `interface_config` dictionary to the `vars` dictionary using the `+=` operator.
726 After `vars` is fully populated, all object attributes can be set calculated from
727 provided host attributes. For strings, you can use string concatention with the `+` operator.
729 You can also specifiy the display_name, check command, interval, notes, notes_url, action_url, etc.
730 attributes that way. Attribute strings can be [concatenated](20-language-reference.md#expression-operators),
731 for example for adding a more detailed service `display_name`.
733 This example also uses [if conditions](20-language-reference.md#conditional-statements)
734 if specific values are not set, adding a local default value.
735 The other way around you can override specific custom attributes inherited from a service template,
738 /* loop over the host.vars.interfaces dictionary
739 * for (key => value in dict) means `interface_name` as key
740 * and `interface_config` as value. Access config attributes
741 * with the indexer (`.`) character.
743 apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
744 import "generic-service"
745 check_command = "iftraffic"
746 display_name = "IF-" + interface_name
748 /* use the key as command argument (no duplication of values in host.vars.interfaces) */
749 vars.iftraffic_interface = interface_name
751 /* map the custom attributes as command arguments */
752 vars.iftraffic_units = interface_config.iftraffic_units
753 vars.iftraffic_community = interface_config.iftraffic_community
755 /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
756 * are the _exact_ same as required as command parameter by the check command
759 vars += interface_config
761 /* set a default value for units and bandwidth */
762 if (interface_config.iftraffic_units == "") {
763 vars.iftraffic_units = "m"
765 if (interface_config.iftraffic_bandwidth == "") {
766 vars.iftraffic_bandwidth = 1
768 if (interface_config.vlan == "") {
769 vars.vlan = "not set"
771 if (interface_config.qos == "") {
775 /* set the global constant if not explicitely
776 * not provided by the `interfaces` dictionary on the host
778 if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
779 vars.iftraffic_community = IftrafficSnmpCommunity
782 /* Calculate some additional object attributes after populating the `vars` dictionary */
783 notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with ' QoS '" + vars.qos + "'"
784 notes_url = "http://foreman.company.com/hosts/" + host.name
785 action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
790 This example makes use of the [check_iftraffic](https://exchange.icinga.org/exchange/iftraffic) plugin.
791 The `CheckCommand` definition can be found in the
792 [contributed plugin check commands](7-icinga-template-library.md#plugins-contrib-command-iftraffic)
793 - make sure to include them in your [icinga2 configuration file](4-configuring-icinga-2.md#icinga2-conf).
798 > Building configuration in that dynamic way requires detailed information
799 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
800 > after successful [configuration validation](8-cli-commands.md#config-validation).
802 Verify that the apply-for-rule successfully created the service objects with the
803 inherited custom attributes:
806 # icinga2 object list --type Service --name *catalyst*
808 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
811 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
812 * iftraffic_bandwidth = 1
813 * iftraffic_community = "public"
814 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
815 * iftraffic_interface = "GigabitEthernet0/2"
816 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
817 * iftraffic_units = "g"
818 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
823 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
826 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
827 * iftraffic_bandwidth = 1
828 * iftraffic_community = "public"
829 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
830 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
831 * iftraffic_interface = "GigabitEthernet0/4"
832 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
833 * iftraffic_units = "g"
834 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
838 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
841 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
842 * iftraffic_bandwidth = 1
843 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
844 * iftraffic_community = "public"
845 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
846 * iftraffic_interface = "MgmtInterface1"
847 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
848 * iftraffic_units = "m"
849 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
850 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
851 * interface_address = "127.99.0.100"
853 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
857 ### <a id="using-apply-object-attributes"></a> Use Object Attributes in Apply Rules
859 Since apply rules are evaluated after the generic objects, you
860 can reference existing host and/or service object attributes as
861 values for any object attribute specified in that apply rule.
863 object Host "opennebula-host" {
864 import "generic-host"
867 vars.hosting["xyz"] = {
869 customer_name = "Customer xyz"
871 support_contract = "gold"
873 vars.hosting["abc"] = {
875 customer_name = "Customer xyz"
877 support_contract = "silver"
881 apply Service for (customer => config in host.vars.hosting) {
882 import "generic-service"
883 check_command = "ping4"
885 vars.qos = "disabled"
889 vars.http_uri = "/" + vars.customer + "/" + config.http_uri
891 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
893 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
895 notes_url = "http://foreman.company.com/hosts/" + host.name
896 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
899 ## <a id="groups"></a> Groups
901 A group is a collection of similar objects. Groups are primarily used as a
902 visualization aid in web interfaces.
904 Group membership is defined at the respective object itself. If
905 you have a hostgroup name `windows` for example, and want to assign
906 specific hosts to this group for later viewing the group on your
907 alert dashboard, first create a HostGroup object:
909 object HostGroup "windows" {
910 display_name = "Windows Servers"
913 Then add your hosts to this group:
915 template Host "windows-server" {
916 groups += [ "windows" ]
919 object Host "mssql-srv1" {
920 import "windows-server"
922 vars.mssql_port = 1433
925 object Host "mssql-srv2" {
926 import "windows-server"
928 vars.mssql_port = 1433
931 This can be done for service and user groups the same way:
933 object UserGroup "windows-mssql-admins" {
934 display_name = "Windows MSSQL Admins"
937 template User "generic-windows-mssql-users" {
938 groups += [ "windows-mssql-admins" ]
941 object User "win-mssql-noc" {
942 import "generic-windows-mssql-users"
944 email = "noc@example.com"
947 object User "win-mssql-ops" {
948 import "generic-windows-mssql-users"
950 email = "ops@example.com"
953 ### <a id="group-assign-intro"></a> Group Membership Assign
955 Instead of manually assigning each object to a group you can also assign objects
956 to a group based on their attributes:
958 object HostGroup "prod-mssql" {
959 display_name = "Production MSSQL Servers"
961 assign where host.vars.mssql_port && host.vars.prod_mysql_db
962 ignore where host.vars.test_server == true
963 ignore where match("*internal", host.name)
966 In this example all hosts with the `vars` attribute `mssql_port`
967 will be added as members to the host group `mssql`. However, all `*internal`
968 hosts or with the `test_server` attribute set to `true` are not added to this
971 Details on the `assign where` syntax can be found in the
972 [Language Reference](20-language-reference.md#apply)
974 ## <a id="notifications"></a> Notifications
976 Notifications for service and host problems are an integral part of your
979 When a host or service is in a downtime, a problem has been acknowledged or
980 the dependency logic determined that the host/service is unreachable, no
981 notifications are sent. You can configure additional type and state filters
982 refining the notifications being actually sent.
984 There are many ways of sending notifications, e.g. by e-mail, XMPP,
985 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
986 Instead it relies on external mechanisms such as shell scripts to notify users.
987 More notification methods are listed in the [addons and plugins](14-addons-plugins.md#notification-scripts-interfaces)
990 A notification specification requires one or more users (and/or user groups)
991 who will be notified in case of problems. These users must have all custom
992 attributes defined which will be used in the `NotificationCommand` on execution.
994 The user `icingaadmin` in the example below will get notified only on `WARNING` and
995 `CRITICAL` states and `problem` and `recovery` notification types.
997 object User "icingaadmin" {
998 display_name = "Icinga 2 Admin"
999 enable_notifications = true
1000 states = [ OK, Warning, Critical ]
1001 types = [ Problem, Recovery ]
1002 email = "icinga@localhost"
1005 If you don't set the `states` and `types` configuration attributes for the `User`
1006 object, notifications for all states and types will be sent.
1008 Details on troubleshooting notification problems can be found [here](17-troubleshooting.md#troubleshooting).
1012 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1013 > in order to execute notification commands.
1015 You should choose which information you (and your notified users) are interested in
1016 case of emergency, and also which information does not provide any value to you and
1019 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
1021 You can add all shared attributes to a `Notification` template which is inherited
1022 to the defined notifications. That way you'll save duplicated attributes in each
1023 `Notification` object. Attributes can be overridden locally.
1025 template Notification "generic-notification" {
1028 command = "mail-service-notification"
1030 states = [ Warning, Critical, Unknown ]
1031 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1032 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1037 The time period `24x7` is included as example configuration with Icinga 2.
1039 Use the `apply` keyword to create `Notification` objects for your services:
1041 apply Notification "notify-cust-xy-mysql" to Service {
1042 import "generic-notification"
1044 users = [ "noc-xy", "mgmt-xy" ]
1046 assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true
1047 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
1051 Instead of assigning users to notifications, you can also add the `user_groups`
1052 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1053 send notifications to all group members.
1057 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1058 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1060 ### <a id="notification-escalations"></a> Notification Escalations
1062 When a problem notification is sent and a problem still exists at the time of re-notification
1063 you may want to escalate the problem to the next support level. A different approach
1064 is to configure the default notification by email, and escalate the problem via SMS
1065 if not already solved.
1067 You can define notification start and end times as additional configuration
1068 attributes making the `Notification` object a so-called `notification escalation`.
1069 Using templates you can share the basic notification attributes such as users or the
1070 `interval` (and override them for the escalation then).
1072 Using the example from above, you can define additional users being escalated for SMS
1073 notifications between start and end time.
1075 object User "icinga-oncall-2nd-level" {
1076 display_name = "Icinga 2nd Level"
1078 vars.mobile = "+1 555 424642"
1081 object User "icinga-oncall-1st-level" {
1082 display_name = "Icinga 1st Level"
1084 vars.mobile = "+1 555 424642"
1087 Define an additional [NotificationCommand](3-monitoring-basics.md#notification-commands) for SMS notifications.
1091 > The example is not complete as there are many different SMS providers.
1092 > Please note that sending SMS notifications will require an SMS provider
1093 > or local hardware with a SIM card active.
1095 object NotificationCommand "sms-notification" {
1097 PluginDir + "/send_sms_notification",
1102 The two new notification escalations are added onto the local host
1103 and its service `ping4` using the `generic-notification` template.
1104 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1105 command) after `30m` until `1h`.
1109 > The `interval` was set to 15m in the `generic-notification`
1110 > template example. Lower that value in your escalations by using a secondary
1111 > template or by overriding the attribute directly in the `notifications` array
1112 > position for `escalation-sms-2nd-level`.
1114 If the problem does not get resolved nor acknowledged preventing further notifications
1115 the `escalation-sms-1st-level` user will be escalated `1h` after the initial problem was
1116 notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
1118 apply Notification "mail" to Service {
1119 import "generic-notification"
1121 command = "mail-notification"
1122 users = [ "icingaadmin" ]
1124 assign where service.name == "ping4"
1127 apply Notification "escalation-sms-2nd-level" to Service {
1128 import "generic-notification"
1130 command = "sms-notification"
1131 users = [ "icinga-oncall-2nd-level" ]
1138 assign where service.name == "ping4"
1141 apply Notification "escalation-sms-1st-level" to Service {
1142 import "generic-notification"
1144 command = "sms-notification"
1145 users = [ "icinga-oncall-1st-level" ]
1152 assign where service.name == "ping4"
1155 ### <a id="notification-delay"></a> Notification Delay
1157 Sometimes the problem in question should not be notified when the notification is due
1158 (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
1159 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
1160 postpone the notification window for 15 minutes. Leave out the `end` key - if not set,
1161 Icinga 2 will not check against any end time for this notification. Make sure to
1162 specify a relatively low notification `interval` to get notified soon enough again.
1164 apply Notification "mail" to Service {
1165 import "generic-notification"
1167 command = "mail-notification"
1168 users = [ "icingaadmin" ]
1172 times.begin = 15m // delay notification window
1174 assign where service.name == "ping4"
1177 ### <a id="disable-renotification"></a> Disable Re-notifications
1179 If you prefer to be notified only once, you can disable re-notifications by setting the
1180 `interval` attribute to `0`.
1182 apply Notification "notify-once" to Service {
1183 import "generic-notification"
1185 command = "mail-notification"
1186 users = [ "icingaadmin" ]
1188 interval = 0 // disable re-notification
1190 assign where service.name == "ping4"
1193 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
1195 If there are no notification state and type filter attributes defined at the `Notification`
1196 or `User` object Icinga 2 assumes that all states and types are being notified.
1198 Available state and type filters for notifications are:
1200 template Notification "generic-notification" {
1202 states = [ Warning, Critical, Unknown ]
1203 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1204 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1207 If you are familiar with Icinga 1.x `notification_options` please note that they have been split
1208 into type and state to allow more fine granular filtering for example on downtimes and flapping.
1209 You can filter for acknowledgements and custom notifications too.
1212 ## <a id="commands"></a> Commands
1214 Icinga 2 uses three different command object types to specify how
1215 checks should be performed, notifications should be sent, and
1216 events should be handled.
1218 ### <a id="check-commands"></a> Check Commands
1220 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects define the command line how
1223 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects are referenced by
1224 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1225 using the `check_command` attribute.
1229 > Make sure that the [checker](8-cli-commands.md#features) feature is enabled in order to
1232 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1234 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects require the [ITL template](7-icinga-template-library.md#itl-plugin-check-command)
1235 `plugin-check-command` to support native plugin based check methods.
1237 Unless you have done so already, download your check plugin and put it
1238 into the [PluginDir](4-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1239 `check_mysql` plugin contained in the Monitoring Plugins package.
1241 The plugin path and all command arguments are made a list of
1242 double-quoted string arguments for proper shell escaping.
1244 Call the `check_disk` plugin with the `--help` parameter to see
1245 all available options. Our example defines warning (`-w`) and
1246 critical (`-c`) thresholds for the disk usage. Without any
1247 partition defined (`-p`) it will check all local partitions.
1249 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1252 This program tests connections to a MySQL server
1255 check_mysql [-d database] [-H host] [-P port] [-s socket]
1256 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1257 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
1259 Next step is to understand how [command parameters](3-monitoring-basics.md#command-passing-parameters)
1260 are being passed from a host or service object, and add a [CheckCommand](6-object-types.md#objecttype-checkcommand)
1261 definition based on these required parameters and/or default values.
1263 Please continue reading in the [plugins section](14-addons-plugins.md#plugins) for additional integration examples.
1265 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1267 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1268 by the executed check command.
1270 The check command parameters for ITL provided plugin check command definitions are documented
1271 [here](7-icinga-template-library.md#plugin-check-commands), for example
1272 [disk](7-icinga-template-library.md#plugin-check-command-disk).
1274 In order to practice passing command parameters you should [integrate your own plugin](3-monitoring-basics.md#command-plugin-integration).
1276 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](2-getting-started.md#setting-up-check-plugins).
1278 Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
1279 (freely definable naming schema) and optional their default threshold values. You can
1280 then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
1281 on the command line.
1285 > Use a common command type as prefix for your command arguments to increase
1286 > readability. `mysql_user` helps understanding the context better than just
1287 > `user` as argument.
1289 The default custom attributes can be overridden by the custom attributes
1290 defined in the host or service using the check command `my-mysql`. The custom attributes
1291 can also be inherited from a parent template using additive inheritance (`+=`).
1293 # vim /etc/icinga2/conf.d/commands.conf
1295 object CheckCommand "my-mysql" {
1296 import "plugin-check-command"
1298 command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir
1301 "-H" = "$mysql_host$"
1304 value = "$mysql_user$"
1306 "-p" = "$mysql_password$"
1307 "-P" = "$mysql_port$"
1308 "-s" = "$mysql_socket$"
1309 "-a" = "$mysql_cert$"
1310 "-d" = "$mysql_database$"
1311 "-k" = "$mysql_key$"
1312 "-C" = "$mysql_ca_cert$"
1313 "-D" = "$mysql_ca_dir$"
1314 "-L" = "$mysql_ciphers$"
1315 "-f" = "$mysql_optfile$"
1316 "-g" = "$mysql_group$"
1318 set_if = "$mysql_check_slave$"
1319 description = "Check if the slave thread is running properly."
1322 set_if = "$mysql_ssl$"
1323 description = "Use ssl encryption"
1327 vars.mysql_check_slave = false
1328 vars.mysql_ssl = false
1329 vars.mysql_host = "$address$"
1332 The check command definition also sets `mysql_host` to the `$address$` default value. You can override
1333 this command parameter if for example your MySQL host is not running on the same server's ip address.
1335 Make sure pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
1336 `MysqlUsername` and `MysqlPassword` are specified as [global constants](4-configuring-icinga-2.md#constants-conf)
1339 # vim /etc/icinga2/conf.d/services.conf
1341 apply Service "mysql-icinga-db-health" {
1342 import "generic-service"
1344 check_command = "my-mysql"
1346 vars.mysql_user = MysqlUsername
1347 vars.mysql_password = MysqlPassword
1349 vars.mysql_database = "icinga"
1350 vars.mysql_host = "192.168.33.11"
1352 assign where match("icinga2*", host.name)
1353 ignore where host.vars.no_health_check == true
1357 Take a different example: The example host configuration in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
1358 also applies an `ssh` service check. Your host's ssh port is not the default `22`, but set to `2022`.
1359 You can pass the command parameter as custom attribute `ssh_port` directly inside the service apply rule
1360 inside [services.conf](4-configuring-icinga-2.md#services-conf):
1362 apply Service "ssh" {
1363 import "generic-service"
1365 check_command = "ssh"
1366 vars.ssh_port = 2022 //custom command parameter
1368 assign where (host.address || host.address6) && host.vars.os == "Linux"
1371 If you prefer this being configured at the host instead of the service, modify the host configuration
1372 object instead. The runtime macro resolving order is described [here](3-monitoring-basics.md#macro-evaluation-order).
1374 object Host NodeName {
1376 vars.ssh_port = 2022
1379 #### <a id="command-passing-parameters-apply-for"></a> Passing Check Command Parameters Using Apply For
1381 The host `localhost` with the generated services from the `basic-partitions` dictionary (see
1382 [apply for](3-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
1383 with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
1386 The custom attribute `disk_partition` can either hold a single string or an array of
1387 string values for passing multiple partitions to the `check_disk` check plugin.
1389 object Host "my-server" {
1390 import "generic-host"
1391 address = "127.0.0.1"
1394 vars.local_disks["basic-partitions"] = {
1395 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1399 apply Service for (disk => config in host.vars.local_disks) {
1400 import "generic-service"
1401 check_command = "my-disk"
1405 vars.disk_wfree = "10%"
1406 vars.disk_cfree = "5%"
1410 More details on using arrays in custom attributes can be found in
1411 [this chapter](3-monitoring-basics.md#custom-attributes).
1414 #### <a id="command-arguments"></a> Command Arguments
1416 By defining a check command line using the `command` attribute Icinga 2
1417 will resolve all macros in the static string or array. Sometimes it is
1418 required to extend the arguments list based on a met condition evaluated
1419 at command execution. Or making arguments optional - only set if the
1420 macro value can be resolved by Icinga 2.
1422 object CheckCommand "check_http" {
1423 import "plugin-check-command"
1425 command = [ PluginDir + "/check_http" ]
1428 "-H" = "$http_vhost$"
1429 "-I" = "$http_address$"
1431 "-p" = "$http_port$"
1433 set_if = "$http_ssl$"
1436 set_if = "$http_sni$"
1439 value = "$http_auth_pair$"
1440 description = "Username:password on sites with basic authentication"
1443 set_if = "$http_ignore_body$"
1445 "-r" = "$http_expect_body_regex$"
1446 "-w" = "$http_warn_time$"
1447 "-c" = "$http_critical_time$"
1448 "-e" = "$http_expect$"
1451 vars.http_address = "$address$"
1452 vars.http_ssl = false
1453 vars.http_sni = false
1456 The example shows the `check_http` check command defining the most common
1457 arguments. Each of them is optional by default and will be omitted if
1458 the value is not set. For example if the service calling the check command
1459 does not have `vars.http_port` set, it won't get added to the command
1462 If the `vars.http_ssl` custom attribute is set in the service, host or command
1463 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
1464 numeric value to the command line. String values are not supported.
1466 If the macro value cannot be resolved, Icinga 2 will not add the defined argument
1467 to the final command argument array. Empty strings for macro values won't omit
1470 That way you can use the `check_http` command definition for both, with and
1471 without SSL enabled checks saving you duplicated command definitions.
1473 Details on all available options can be found in the
1474 [CheckCommand object definition](6-object-types.md#objecttype-checkcommand).
1477 #### <a id="command-environment-variables"></a> Environment Variables
1479 The `env` command object attribute specifies a list of environment variables with values calculated
1480 from either runtime macros or custom attributes which should be exported as environment variables
1481 prior to executing the command.
1483 This is useful for example for hiding sensitive information on the command line output
1484 when passing credentials to database checks:
1486 object CheckCommand "mysql-health" {
1487 import "plugin-check-command"
1490 PluginDir + "/check_mysql"
1494 "-H" = "$mysql_address$"
1495 "-d" = "$mysql_database$"
1498 vars.mysql_address = "$address$"
1499 vars.mysql_database = "icinga"
1500 vars.mysql_user = "icinga_check"
1501 vars.mysql_pass = "password"
1503 env.MYSQLUSER = "$mysql_user$"
1504 env.MYSQLPASS = "$mysql_pass$"
1509 ### <a id="notification-commands"></a> Notification Commands
1511 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
1512 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
1514 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
1515 [Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
1517 `NotificationCommand` objects require the [ITL template](7-icinga-template-library.md#itl-plugin-notification-command)
1518 `plugin-notification-command` to support native plugin-based notifications.
1522 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1523 > in order to execute notification commands.
1525 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1526 the current check output) sending an email to the user(s) associated with the
1527 notification itself (`$user.email$`).
1529 If you want to specify default values for some of the custom attribute definitions,
1530 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1532 object NotificationCommand "mail-service-notification" {
1533 import "plugin-notification-command"
1535 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1538 NOTIFICATIONTYPE = "$notification.type$"
1539 SERVICEDESC = "$service.name$"
1540 HOSTALIAS = "$host.display_name$"
1541 HOSTADDRESS = "$address$"
1542 SERVICESTATE = "$service.state$"
1543 LONGDATETIME = "$icinga.long_date_time$"
1544 SERVICEOUTPUT = "$service.output$"
1545 NOTIFICATIONAUTHORNAME = "$notification.author$"
1546 NOTIFICATIONCOMMENT = "$notification.comment$"
1547 HOSTDISPLAYNAME = "$host.display_name$"
1548 SERVICEDISPLAYNAME = "$service.display_name$"
1549 USEREMAIL = "$user.email$"
1553 The command attribute in the `mail-service-notification` command refers to the following
1554 shell script. The macros specified in the `env` array are exported
1555 as environment variables and can be used in the notification script:
1558 template=$(cat <<TEMPLATE
1561 Notification Type: $NOTIFICATIONTYPE
1563 Service: $SERVICEDESC
1565 Address: $HOSTADDRESS
1566 State: $SERVICESTATE
1568 Date/Time: $LONGDATETIME
1570 Additional Info: $SERVICEOUTPUT
1572 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
1576 /usr/bin/printf "%b" $template | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" $USEREMAIL
1580 > This example is for `exim` only. Requires changes for `sendmail` and
1583 While it's possible to specify the entire notification command right
1584 in the NotificationCommand object it is generally advisable to create a
1585 shell script in the `/etc/icinga2/scripts` directory and have the
1586 NotificationCommand object refer to that.
1588 ### <a id="event-commands"></a> Event Commands
1590 Unlike notifications, event commands for hosts/services are called on every
1591 check execution if one of these conditions match:
1593 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1594 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1595 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1597 [EventCommand](6-object-types.md#objecttype-eventcommand) objects are referenced by
1598 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1599 using the `event_command` attribute.
1601 Therefore the `EventCommand` object should define a command line
1602 evaluating the current service state and other service runtime attributes
1603 available through runtime vars. Runtime macros such as `$service.state_type$`
1604 and `$service.state$` will be processed by Icinga 2 helping on fine-granular
1605 events being triggered.
1607 Common use case scenarios are a failing HTTP check requiring an immediate
1608 restart via event command, or if an application is locked and requires
1609 a restart upon detection.
1611 `EventCommand` objects require the ITL template `plugin-event-command`
1612 to support native plugin based checks.
1614 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
1616 The following example will triggert a restart of the `httpd` daemon
1617 via ssh when the `http` service check fails. If the service state is
1618 `OK`, it will not trigger any event action.
1623 * icinga user with public key authentication
1624 * icinga user with sudo permissions for restarting the httpd daemon.
1628 # ls /home/icinga/.ssh/
1632 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1635 Define a generic [EventCommand](6-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1636 which can be used for all event commands triggered using ssh:
1638 /* pass event commands through ssh */
1639 object EventCommand "event_by_ssh" {
1640 import "plugin-event-command"
1642 command = [ PluginDir + "/check_by_ssh" ]
1645 "-H" = "$event_by_ssh_address$"
1646 "-p" = "$event_by_ssh_port$"
1647 "-C" = "$event_by_ssh_command$"
1648 "-l" = "$event_by_ssh_logname$"
1649 "-i" = "$event_by_ssh_identity$"
1651 set_if = "$event_by_ssh_quiet$"
1653 "-w" = "$event_by_ssh_warn$"
1654 "-c" = "$event_by_ssh_crit$"
1655 "-t" = "$event_by_ssh_timeout$"
1658 vars.event_by_ssh_address = "$address$"
1659 vars.event_by_ssh_quiet = false
1662 The actual event command only passes the `event_by_ssh_command` attribute.
1663 The `event_by_ssh_service` custom attribute takes care of passing the correct
1664 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
1665 is only restarted when the service is not in an `OK` state.
1668 object EventCommand "event_by_ssh_restart_service" {
1669 import "event_by_ssh"
1671 //only restart the daemon if state > 0 (not-ok)
1672 //requires sudo permissions for the icinga user
1673 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1677 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1678 which service should be restarted using the `event_by_ssh_service` attribute.
1680 object Service "http" {
1681 import "generic-service"
1682 host_name = "remote-http-host"
1683 check_command = "http"
1685 event_command = "event_by_ssh_restart_service"
1686 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1688 //vars.event_by_ssh_logname = "icinga"
1689 //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa.pub"
1693 Each host with this service then must define the `httpd_name` custom attribute
1694 (for example generated from your cmdb):
1696 object Host "remote-http-host" {
1697 import "generic-host"
1698 address = "192.168.1.100"
1700 vars.httpd_name = "apache2"
1703 You can testdrive this example by manually stopping the `httpd` daemon
1704 on your `remote-http-host`. Enable the `debuglog` feature and tail the
1705 `/var/log/icinga2/debug.log` file.
1707 Remote Host Terminal:
1709 # date; service apache2 status
1710 Mon Sep 15 18:57:39 CEST 2014
1711 Apache2 is running (pid 23651).
1712 # date; service apache2 stop
1713 Mon Sep 15 18:57:47 CEST 2014
1714 [ ok ] Stopping web server: apache2 ... waiting .
1716 Icinga 2 Host Terminal:
1718 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
1719 [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
1720 [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
1721 [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
1722 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
1723 [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
1725 Remote Host Terminal:
1727 # date; service apache2 status
1728 Mon Sep 15 18:58:44 CEST 2014
1729 Apache2 is running (pid 24908).
1732 ## <a id="dependencies"></a> Dependencies
1734 Icinga 2 uses host and service [Dependency](6-object-types.md#objecttype-dependency) objects
1735 for determing their network reachability.
1737 A service can depend on a host, and vice versa. A service has an implicit
1738 dependency (parent) to its host. A host to host dependency acts implicitly
1739 as host parent relation.
1740 When dependencies are calculated, not only the immediate parent is taken into
1741 account but all parents are inherited.
1743 The `parent_host_name` and `parent_service_name` attributes are mandatory for
1744 service dependencies, `parent_host_name` is required for host dependencies.
1745 [Apply rules](3-monitoring-basics.md#using-apply) will allow you to
1746 [determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
1747 dynamic fashion if required.
1749 parent_host_name = "core-router"
1750 parent_service_name = "uplink-port"
1752 Notifications are suppressed by default if a host or service becomes unreachable.
1753 You can control that option by defining the `disable_notifications` attribute.
1755 disable_notifications = false
1757 If the dependency should be triggered in the parent object's soft state, you
1758 need to set `ignore_soft_states` to `false`.
1760 The dependency state filter must be defined based on the parent object being
1761 either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
1763 The following example will make the dependency fail and trigger it if the parent
1764 object is **not** in one of these states:
1766 states = [ OK, Critical, Unknown ]
1768 Rephrased: If the parent service object changes into the `Warning` state, this
1769 dependency will fail and render all child objects (hosts or services) unreachable.
1771 You can determine the child's reachability by querying the `is_reachable` attribute
1772 in for example [DB IDO](23-appendix.md#schema-db-ido-extensions).
1774 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
1776 Icinga 2 automatically adds an implicit dependency for services on their host. That way
1777 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
1778 does not overwrite other dependencies and implicitely sets `disable_notifications = true` and
1779 `states = [ Up ]` for all service objects.
1781 Service checks are still executed. If you want to prevent them from happening, you can
1782 apply the following dependency to all services setting their host as `parent_host_name`
1783 and disabling the checks. `assign where true` matches on all `Service` objects.
1785 apply Dependency "disable-host-service-checks" to Service {
1786 disable_checks = true
1790 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
1792 A common scenario is the Icinga 2 server behind a router. Checking internet
1793 access by pinging the Google DNS server `google-dns` is a common method, but
1794 will fail in case the `dsl-router` host is down. Therefore the example below
1795 defines a host dependency which acts implicitly as parent relation too.
1797 Furthermore the host may be reachable but ping probes are dropped by the
1798 router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
1799 further checks for the `ping4` service on host `google-dns` service should
1800 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1802 object Host "dsl-router" {
1803 import "generic-host"
1804 address = "192.168.1.1"
1807 object Host "google-dns" {
1808 import "generic-host"
1812 apply Service "ping4" {
1813 import "generic-service"
1815 check_command = "ping4"
1817 assign where host.address
1820 apply Dependency "internet" to Host {
1821 parent_host_name = "dsl-router"
1822 disable_checks = true
1823 disable_notifications = true
1825 assign where host.name != "dsl-router"
1828 apply Dependency "internet" to Service {
1829 parent_host_name = "dsl-router"
1830 parent_service_name = "ping4"
1831 disable_checks = true
1833 assign where host.name != "dsl-router"
1836 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
1838 You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
1839 child attributes e.g. `parent_host_name` to other object's
1842 A common example are virtual machines hosted on a master. The object
1843 name of that master is auto-generated from your CMDB or VMWare inventory
1844 into the host's custom attributes (or a generic template for your
1847 Define your master host object:
1850 object Host "master.example.com" {
1851 import "generic-host"
1854 Add a generic template defining all common host attributes:
1856 /* generic template for your virtual machines */
1857 template Host "generic-vm" {
1858 import "generic-host"
1861 Add a template for all hosts on your example.com cloud setting
1862 custom attribute `vm_parent` to `master.example.com`:
1864 template Host "generic-vm-example.com" {
1866 vars.vm_parent = "master.example.com"
1869 Define your guest hosts:
1871 object Host "www.example1.com" {
1872 import "generic-vm-master.example.com"
1875 object Host "www.example2.com" {
1876 import "generic-vm-master.example.com"
1879 Apply the host dependency to all child hosts importing the
1880 `generic-vm` template and set the `parent_host_name`
1881 to the previously defined custom attribute `host.vars.vm_parent`.
1883 apply Dependency "vm-host-to-parent-master" to Host {
1884 parent_host_name = host.vars.vm_parent
1885 assign where "generic-vm" in host.templates
1888 You can extend this example, and make your services depend on the
1889 `master.example.com` host too. Their local scope allows you to use
1890 `host.vars.vm_parent` similar to the example above.
1892 apply Dependency "vm-service-to-parent-master" to Service {
1893 parent_host_name = host.vars.vm_parent
1894 assign where "generic-vm" in host.templates
1897 That way you don't need to wait for your guest hosts becoming
1898 unreachable when the master host goes down. Instead the services
1899 will detect their reachability immediately when executing checks.
1903 > This method with setting locally scoped variables only works in
1904 > apply rules, but not in object definitions.
1907 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1909 Another classic example are agent based checks. You would define a health check
1910 for the agent daemon responding to your requests, and make all other services
1911 querying that daemon depend on that health check.
1913 The following configuration defines two nrpe based service checks `nrpe-load`
1914 and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
1915 `nrpe-health` service.
1917 apply Service "nrpe-health" {
1918 import "generic-service"
1919 check_command = "nrpe"
1920 assign where match("nrpe-*", host.name)
1923 apply Service "nrpe-load" {
1924 import "generic-service"
1925 check_command = "nrpe"
1926 vars.nrpe_command = "check_load"
1927 assign where match("nrpe-*", host.name)
1930 apply Service "nrpe-disk" {
1931 import "generic-service"
1932 check_command = "nrpe"
1933 vars.nrpe_command = "check_disk"
1934 assign where match("nrpe-*", host.name)
1937 object Host "nrpe-server" {
1938 import "generic-host"
1939 address = "192.168.1.5"
1942 apply Dependency "disable-nrpe-checks" to Service {
1943 parent_service_name = "nrpe-health"
1946 disable_checks = true
1947 disable_notifications = true
1948 assign where service.check_command == "nrpe"
1949 ignore where service.name == "nrpe-health"
1952 The `disable-nrpe-checks` dependency is applied to all services
1953 on the `nrpe-service` host using the `nrpe` check_command attribute
1954 but not the `nrpe-health` service itself.