# Monitoring Basics <a id="monitoring-basics"></a>

This part of the Icinga 2 documentation provides an overview of all the basic
monitoring concepts you need to know to run Icinga 2.
Keep in mind that these examples were written for a Linux server. If you are
using Windows, you will need to change the services accordingly. See the [ITL reference](10-icinga-template-library.md#windows-plugins)
for further information.

## Attribute Value Types <a id="attribute-value-types"></a>

The Icinga 2 configuration uses different value types for attributes.

Type | Example Value
-------------------------------------------------------|---------------------------------------------------------
[Number](17-language-reference.md#numeric-literals) | `5`
[Duration](17-language-reference.md#duration-literals) | `1m`
[String](17-language-reference.md#string-literals) | `"These are notes"`
[Boolean](17-language-reference.md#boolean-literals) | `true`
[Array](17-language-reference.md#array) | `[ "value1", "value2" ]`
[Dictionary](17-language-reference.md#dictionary) | `{ "key1" = "value1", "key2" = false }`

It is important to use the correct value type for object attributes,
as otherwise the [configuration validation](11-cli-commands.md#config-validation) will fail.
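
As a rough illustration, the sketch below sets one attribute of each value type on a single host object. The host name `value-types-demo` and the custom attributes under `vars` are hypothetical names chosen for this example; the other attributes are standard `Host` attributes.

```
object Host "value-types-demo" {      // hypothetical host name
  check_command = "dummy"             // String
  max_check_attempts = 5              // Number
  check_interval = 1m                 // Duration
  enable_notifications = true         // Boolean
  notes = "These are notes"           // String

  vars.tags = [ "value1", "value2" ]  // Array (hypothetical custom attribute)
  vars.settings = {                   // Dictionary (hypothetical custom attribute)
    "key1" = "value1"
    "key2" = false
  }
}
```

Assigning a mismatching type, for instance an array to `check_interval`, would be expected to fail configuration validation.
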

## Hosts and Services <a id="hosts-services"></a>

Icinga 2 can be used to monitor the availability of hosts and services. Hosts
and services can be virtually anything which can be checked in some way:

* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Other local or network-accessible services

Host objects provide a mechanism to group services that are running
on the same physical device.

Here is an example of a host object which defines two child services:

```
object Host "my-server1" {
  address = "192.0.2.10" // placeholder address, replace with your host's address
  check_command = "hostalive"
}

object Service "ping4" {
  host_name = "my-server1"
  check_command = "ping4"
}

object Service "http" {
  host_name = "my-server1"
  check_command = "http"
}
```

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

The `address` attribute is used by check commands to determine which network
address is associated with the host object.

Details on troubleshooting check problems can be found [here](15-troubleshooting.md#troubleshooting).

### Host States <a id="host-states"></a>

Hosts can be in any one of the following states:

Name | Description
------------|--------------
UP | The host is available.
DOWN | The host is unavailable.

### Service States <a id="service-states"></a>

Services can be in any one of the following states:

Name | Description
------------|--------------
OK | The service is working properly.
WARNING | The service is experiencing some problems but is still considered to be in working condition.
CRITICAL | The service is in a critical state.
UNKNOWN | The check could not determine the service's state.


### Check Result State Mapping <a id="check-result-state-mapping"></a>

[Check plugins](05-service-monitoring.md#service-monitoring-plugins) return
with an exit code which is converted into a state number.
Services map the states directly, while hosts treat `0` or `1` as `UP`
and any other value as `DOWN`.

Value | Host State | Service State
------|------------|--------------
0 | Up | OK
1 | Up | Warning
2 | Down | Critical
3 | Down | Unknown


### Hard and Soft States <a id="hard-soft-states"></a>

When detecting a problem with a host/service, Icinga re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

Name | Description
------------|--------------
HARD | The host/service's state hasn't recently changed. `check_interval` applies here.
SOFT | The host/service has recently changed state and is being re-checked with `retry_interval`.

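
To make the timing concrete, here is a minimal sketch of a service that sets all three attributes explicitly. The service name `http-demo` is hypothetical, the host `my-server1` is assumed to exist, and the interval values are arbitrary choices for illustration.

```
object Service "http-demo" {   // hypothetical service name
  host_name = "my-server1"
  check_command = "http"

  check_interval = 5m          // applies while the service is in a HARD state
  retry_interval = 30s         // applies while the service is in a SOFT state
  max_check_attempts = 3       // attempts before a problem becomes HARD
}
```

Under these assumed values, a first non-OK result puts the service into a `SOFT` state and re-checks follow every 30 seconds. If the third attempt still fails, the service enters a `HARD` state and notifications are sent.
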

### Host and Service Checks <a id="host-service-checks"></a>

Hosts and services determine their state by running checks at regular intervals.

```
object Host "router" {
  check_command = "hostalive"
  address = "192.0.2.1" // placeholder address
}
```

The `hostalive` command is one of several built-in check commands. It sends ICMP
echo requests to the IP address specified in the `address` attribute to determine
whether a host is online.

> `hostalive` is the same as `ping` but with different default thresholds.
> Both use the `ping` CLI command to execute sequential checks.
>
> If you need faster ICMP checks, look into the [icmp](10-icinga-template-library.md#plugin-check-command-icmp) CheckCommand.

A number of other [built-in check commands](10-icinga-template-library.md#icinga-template-library) are also
available. In addition to these commands, the next few chapters explain in
detail how to set up your own check commands.

#### Host Check Alternatives <a id="host-check-alternatives"></a>

If the host is not reachable with ICMP, HTTP, etc. you can
also use the [dummy](10-icinga-template-library.md#itl-dummy) CheckCommand to set a default state.

```
object Host "dummy-host" {
  check_command = "dummy"
  vars.dummy_state = 0 //Up
  vars.dummy_text = "Everything OK."
}
```

This method is also used when you send in [external check results](08-advanced-topics.md#external-check-results).

A more advanced technique is to calculate an overall state
based on all services. This is described [here](08-advanced-topics.md#access-object-attributes-at-runtime-cluster-check).

## Templates <a id="object-inheritance-using-templates"></a>

Templates may be used to apply a set of identical attributes to more than one
object:

```
template Service "generic-service" {
  max_check_attempts = 3
  enable_perfdata = true
}

apply Service "ping4" {
  import "generic-service"

  check_command = "ping4"

  assign where host.address
}

apply Service "ping6" {
  import "generic-service"

  check_command = "ping6"

  assign where host.address6
}
```

In this example the `ping4` and `ping6` services inherit properties from the
template `generic-service`.

Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object that imports it.

You can also import existing non-template objects.

> Templates and objects share the same namespace, i.e. you can't define a template
> that has the same name as an object.
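
The override behaviour can be sketched as follows. The template and service names here are hypothetical and chosen so that they do not clash with other objects in this chapter.

```
template Service "demo-service" {    // hypothetical template
  max_check_attempts = 3
  enable_perfdata = true
}

apply Service "demo-ping4" {         // hypothetical service
  import "demo-service"

  check_command = "ping4"
  max_check_attempts = 5             // overrides the value 3 from the template

  assign where host.address
}
```

The generated services keep `enable_perfdata = true` from the template but use five check attempts, because the assignment after the `import` takes precedence.
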

### Multiple Templates <a id="object-inheritance-using-multiple-templates"></a>

The following example uses [custom attributes](03-monitoring-basics.md#custom-attributes) which
are provided in each template. The `web-server` template is used as the
base template for any host providing web services. In addition to that it
specifies the custom attribute `webserver_type`, e.g. `apache`. Since this
template is also the base template, we import the `generic-host` template here.
This provides the `check_command` attribute by default and we don't need
to set it anywhere later on.

```
template Host "web-server" {
  import "generic-host"
  vars = {
    webserver_type = "apache"
  }
}
```

The `wp-server` host template specifies a WordPress instance and sets
the `application_type` custom attribute. Please note the `+=` [operator](17-language-reference.md#dictionary-operators)
which adds [dictionary](17-language-reference.md#dictionary) items,
but does not override any previous `vars` attribute.

```
template Host "wp-server" {
  vars += {
    application_type = "wordpress"
  }
}
```

The final host object imports both templates. The order is important here:
First the base template `web-server` is added to the object, then additional
attributes are imported from the `wp-server` template.

```
object Host "wp.example.com" {
  import "web-server"
  import "wp-server"

  address = "192.168.56.200"
}
```

If you want to override specific attributes inherited from templates, you can
specify them on the host object.

```
object Host "wp1.example.com" {
  import "web-server"
  import "wp-server"

  vars.webserver_type = "nginx" //overrides attribute from base template

  address = "192.168.56.201"
}
```


## Custom Attributes <a id="custom-attributes"></a>

In addition to built-in attributes you can define your own attributes
inside the `vars` attribute:

```
object Host "localhost" {
  check_command = "ssh"
  vars.ssh_port = 2222
}
```

`vars` is a [dictionary](17-language-reference.md#dictionary) where you
can set specific keys to values. The example above uses the shorter
[indexer](17-language-reference.md#indexer) syntax.

An alternative representation can be written like this:

```
object Host "localhost" {
  check_command = "ssh"
  vars["ssh_port"] = 2222
}
```

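
A third spelling, shown here as a sketch, assigns a whole dictionary literal to `vars`. Note that a plain `=` replaces any `vars` inherited from imported templates, so the `+=` operator is usually the safer choice when templates are involved.

```
object Host "localhost" {
  check_command = "ssh"

  // Dictionary literal form; equivalent to vars.ssh_port = 2222
  // as long as no imported template has already set vars.
  vars = {
    ssh_port = 2222
  }
}
```
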

### Custom Attribute Values <a id="custom-attributes-values"></a>

Valid values for custom attributes include:

* [Strings](17-language-reference.md#string-literals), [numbers](17-language-reference.md#numeric-literals) and [booleans](17-language-reference.md#boolean-literals)
* [Arrays](17-language-reference.md#array) and [dictionaries](17-language-reference.md#dictionary)
* [Functions](03-monitoring-basics.md#custom-attributes-functions)

You can also define nested values such as dictionaries in dictionaries.

This example defines the custom attribute `disks` as a dictionary.
The first key, `disk /`, is itself set to a dictionary
with one key-value pair.

```
vars.disks["disk /"] = {
  disk_partitions = "/"
}
```

This can be written as a resolved structure like this:

```
vars = {
  disks = {
    "disk /" = {
      disk_partitions = "/"
    }
  }
}
```

Keep this in mind when trying to access specific sub-keys
in apply rules or functions.
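
As a sketch of what accessing a sub-key can look like, the apply rule below reads the nested `disk_partitions` value from a `disks` dictionary with a `disk /` key. The service name `root-disk` is hypothetical, and the rule assumes that the `disk` CheckCommand from the ITL is available.

```
apply Service "root-disk" {    // hypothetical service name
  check_command = "disk"

  // Index into the nested dictionary: disks -> "disk /" -> disk_partitions
  vars.disk_partitions = host.vars.disks["disk /"].disk_partitions

  // Only generate the service where the nested key actually exists
  assign where host.vars.disks && host.vars.disks.contains("disk /")
}
```

The key `disk /` contains a space, so it has to be accessed with the indexer syntax rather than with a dot.
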

Another example, taken from the example configuration:

```
vars.notification["mail"] = {
  groups = [ "icingaadmins" ]
}
```

This defines the `notification` custom attribute as a dictionary
with the key `mail`. Its value is a dictionary with the key `groups`,
which itself has an array as its value. Note: this array has exactly the
same form as the `user_groups` attribute for [notification apply rules](03-monitoring-basics.md#using-apply-notifications).

The resolved structure looks like this:

```
vars.notification = {
  mail = {
    groups = [ "icingaadmins" ]
  }
}
```


### Functions as Custom Attributes <a id="custom-attributes-functions"></a>

Icinga 2 lets you specify [functions](17-language-reference.md#functions) for custom attributes.
The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
the function and uses whatever value the function returns:

```
object CheckCommand "random-value" {
  command = [ PluginDir + "/check_dummy", "0", "$text$" ]

  vars.text = {{ Math.random() * 100 }}
}
```

This example uses the [abbreviated lambda syntax](17-language-reference.md#nullary-lambdas).

These functions have access to a number of variables:

Variable | Description
-------------|---------------
user | The User object (for notifications).
service | The Service object (for service checks/notifications/event handlers).
host | The Host object.
command | The command object (e.g. a CheckCommand object for checks).

For example:

```
vars.text = {{ host.check_interval }}
```

In addition to these variables the [macro](18-library-reference.md#scoped-functions-macro) function can be used to retrieve the
value of arbitrary macro expressions:

```
vars.text = {{
  if (macro("$address$") == "127.0.0.1") {
    log("Running a check for localhost!")
  }

  return "Some text" // placeholder return value
}}
```

The `resolve_arguments` function can be used to resolve a command and its arguments in much
the same fashion as Icinga does for the `command` and `arguments` attributes of
commands. The `by_ssh` command uses this functionality to let users specify a
command and arguments that should be executed via SSH:

```
// Excerpt: function used as the value of a command argument
"-C" = {{
  var command = macro("$by_ssh_command$")
  var arguments = macro("$by_ssh_arguments$")

  // A plain command string without separate arguments is used as-is
  if (typeof(command) == String && !arguments) {
    return command
  }

  // Otherwise resolve the arguments and shell-escape each of them
  var escaped_args = []
  for (arg in resolve_arguments(command, arguments)) {
    escaped_args.add(escape_shell_arg(arg))
  }
  return escaped_args.join(" ")
}}
```

Accessing object attributes at runtime inside these functions is described in the
[advanced topics](08-advanced-topics.md#access-object-attributes-at-runtime) chapter.

## Runtime Macros <a id="runtime-macros"></a>

Macros can be used to access other objects' attributes at runtime. For example they
are used in command definitions to figure out which IP address a check should be
run against:

```
object CheckCommand "my-ping" {
  command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]

  arguments = {
    "-w" = "$ping_wrta$,$ping_wpl$%"
    "-c" = "$ping_crta$,$ping_cpl$%"
    "-p" = "$ping_packets$"
  }

  vars.ping_address = "$address$"

  // Defaults for ping_wrta, ping_wpl, ping_crta and ping_cpl
  // are set in the same way (values not shown here).
  vars.ping_packets = 5
}

object Host "router" {
  check_command = "my-ping"
  address = "192.0.2.1" // placeholder address
}
```

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
automatically tries to find the closest match for the attribute you specified. The
exact rules for this are explained in the next section.

> When using the `$` sign as a single character you must escape it with an
> additional dollar character (`$$`).
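
For instance, a literal dollar sign in a command line could be written as in the sketch below. The command name `echo-price` and the use of `/bin/echo` are assumptions made purely for illustration.

```
object CheckCommand "echo-price" {   // hypothetical command
  // "$$" yields a single literal "$", while "$host.name$" is still resolved as a macro
  command = [ "/bin/echo", "Price: $$5 on $host.name$" ]
}
```
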

### Evaluation Order <a id="macro-evaluation-order"></a>

When executing commands Icinga 2 checks the following objects in this order to look
up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom attributes in the `Vars` constant

This execution order allows you to define default values for custom attributes
in your command objects.

Here's how you can override the custom attribute `ping_packets` from the previous
example:

```
object Service "ping" {
  host_name = "localhost"
  check_command = "my-ping"

  vars.ping_packets = 10 // Overrides the default value of 5 given in the command
}
```

If a custom attribute isn't defined anywhere, an empty value is used and a warning is
written to the Icinga 2 log.
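
The lookup order can be illustrated with the following sketch, which builds on a `my-ping` command that defaults `ping_packets` to `5`. The host and service names and the address are hypothetical, and the host-level `ping_packets` value is added only to show precedence.

```
object Host "eval-order-demo" {      // hypothetical host
  check_command = "hostalive"
  address = "192.0.2.20"             // placeholder address

  vars.ping_packets = 3              // host-level value
}

object Service "ping-demo" {         // hypothetical service
  host_name = "eval-order-demo"
  check_command = "my-ping"

  vars.ping_packets = 10             // service-level value, looked up before the host
}
```

For the `ping-demo` service check, `$ping_packets$` should resolve to `10`. Removing the service-level line would make the host's `3` apply, and removing that as well falls back to the command's default of `5`.
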

You can also directly refer to a specific attribute, thereby ignoring these evaluation
rules, by specifying the full attribute name:

```
$service.vars.ping_wrta$
```

This retrieves the value of the `ping_wrta` custom attribute for the service. It
returns an empty value if the service does not have such a custom attribute, no matter
whether another object such as the host has this attribute.

### Host Runtime Macros <a id="host-runtime-macros"></a>

The following host macros are available in all commands that are executed for
hosts or services:

Name | Description
-----------------------------|--------------
host.name | The name of the host object.
host.display\_name | The value of the `display_name` attribute.
host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
host.state\_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
host.state\_type | The host's current state type. Can be one of `SOFT` and `HARD`.
host.check\_attempt | The current check attempt number.
host.max\_check\_attempts | The maximum number of checks which are executed before changing to a hard state.
host.last\_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
host.last\_state\_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
host.last\_state\_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
host.last\_state\_change | The last state change's timestamp.
host.downtime\_depth | The number of active downtimes.
host.duration\_sec | The time since the last state change.
host.latency | The host's check latency.
host.execution\_time | The host's check execution time.
host.output | The last check's output.
host.perfdata | The last check's performance data.
host.last\_check | The timestamp when the last check was executed.
host.check\_source | The monitoring instance that performed the last check.
host.num\_services | Number of services associated with the host.
host.num\_services\_ok | Number of services associated with the host which are in an `OK` state.
host.num\_services\_warning | Number of services associated with the host which are in a `WARNING` state.
host.num\_services\_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
host.num\_services\_critical | Number of services associated with the host which are in a `CRITICAL` state.

In addition to these specific runtime macros, [host object](09-object-types.md#objecttype-host)
attributes can be accessed too.

### Service Runtime Macros <a id="service-runtime-macros"></a>

The following service macros are available in all commands that are executed for
services:

Name | Description
-----------------------------|--------------
service.name | The short name of the service object.
service.display\_name | The value of the `display_name` attribute.
service.check\_command | The short name of the command along with any arguments to be used for the check.
service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
service.state\_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
service.state\_type | The service's current state type. Can be one of `SOFT` and `HARD`.
service.check\_attempt | The current check attempt number.
service.max\_check\_attempts | The maximum number of checks which are executed before changing to a hard state.
service.last\_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
service.last\_state\_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
service.last\_state\_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
service.last\_state\_change | The last state change's timestamp.
service.downtime\_depth | The number of active downtimes.
service.duration\_sec | The time since the last state change.
service.latency | The service's check latency.
service.execution\_time | The service's check execution time.
service.output | The last check's output.
service.perfdata | The last check's performance data.
service.last\_check | The timestamp when the last check was executed.
service.check\_source | The monitoring instance that performed the last check.

In addition to these specific runtime macros, [service object](09-object-types.md#objecttype-service)
attributes can be accessed too.
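
As a small sketch of how these macros can be used, the service below builds its check output from host and service runtime macros. It assumes the `dummy` CheckCommand from the ITL and an existing host named `my-server1`; the service name `macro-demo` is hypothetical.

```
object Service "macro-demo" {   // hypothetical service name
  host_name = "my-server1"
  check_command = "dummy"

  vars.dummy_state = 0
  // Runtime macros are resolved when the check is executed
  vars.dummy_text = "$service.name$ on $host.display_name$: host is $host.state$ with $host.num_services$ services"
}
```
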

### Command Runtime Macros <a id="command-runtime-macros"></a>

The following macros are available in all commands:

Name | Description
-----------------------|--------------
command.name | The name of the command object.

### User Runtime Macros <a id="user-runtime-macros"></a>

The following macros are available in all commands that are executed for
users:

Name | Description
-----------------------|--------------
user.name | The name of the user object.
user.display\_name | The value of the `display_name` attribute.

In addition to these specific runtime macros, [user object](09-object-types.md#objecttype-user)
attributes can be accessed too.

### Notification Runtime Macros <a id="notification-runtime-macros"></a>

Name | Description
-----------------------|--------------
notification.type | The type of the notification.
notification.author | The author of the notification comment if existing.
notification.comment | The comment of the notification if existing.

In addition to these specific runtime macros, [notification object](09-object-types.md#objecttype-notification)
attributes can be accessed too.

### Global Runtime Macros <a id="global-runtime-macros"></a>

The following macros are available in all executed commands:

Name | Description
-------------------------|--------------
icinga.timet | Current UNIX timestamp.
icinga.long\_date\_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
icinga.short\_date\_time | Current date and time. Example: `2014-01-03 11:23:08`
icinga.date | Current date. Example: `2014-01-03`
icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
icinga.uptime | Current uptime of the Icinga 2 process.

The following macros provide global statistics:

Name | Description
------------------------------------|------------------------------------
icinga.num\_services\_ok | Current number of services in state 'OK'.
icinga.num\_services\_warning | Current number of services in state 'Warning'.
icinga.num\_services\_critical | Current number of services in state 'Critical'.
icinga.num\_services\_unknown | Current number of services in state 'Unknown'.
icinga.num\_services\_pending | Current number of pending services.
icinga.num\_services\_unreachable | Current number of unreachable services.
icinga.num\_services\_flapping | Current number of flapping services.
icinga.num\_services\_in\_downtime | Current number of services in downtime.
icinga.num\_services\_acknowledged | Current number of acknowledged service problems.
icinga.num\_hosts\_up | Current number of hosts in state 'Up'.
icinga.num\_hosts\_down | Current number of hosts in state 'Down'.
icinga.num\_hosts\_unreachable | Current number of unreachable hosts.
icinga.num\_hosts\_pending | Current number of pending hosts.
icinga.num\_hosts\_flapping | Current number of flapping hosts.
icinga.num\_hosts\_in\_downtime | Current number of hosts in downtime.
icinga.num\_hosts\_acknowledged | Current number of acknowledged host problems.

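
One place where notification and global macros typically come together is the environment of a notification script. The following is only a sketch: the command name, the script path and the environment variable names are hypothetical.

```
object NotificationCommand "stats-notification" {       // hypothetical command
  command = [ "/usr/local/bin/stats-notification.sh" ]  // hypothetical script path

  // Each macro is resolved at notification time and
  // passed to the script as an environment variable.
  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    LONGDATETIME = "$icinga.long_date_time$"
    HOSTSDOWN = "$icinga.num_hosts_down$"
    SERVICESCRITICAL = "$icinga.num_services_critical$"
  }
}
```
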

## Apply Rules <a id="using-apply"></a>

Several object types require an object relation, e.g. [Service](09-object-types.md#objecttype-service),
[Notification](09-object-types.md#objecttype-notification), [Dependency](09-object-types.md#objecttype-dependency),
[ScheduledDowntime](09-object-types.md#objecttype-scheduleddowntime) objects. The
object relations are documented in the linked chapters.

If you, for example, create a service object, you have to specify the [host_name](09-object-types.md#objecttype-service)
attribute and reference an existing host object.

```
object Service "ping4" {
  check_command = "ping4"
  host_name = "icinga2-client1.localdomain"
}
```

This isn't comfortable when managing a huge set of configuration objects which could
[match](03-monitoring-basics.md#using-apply-expressions) on a common pattern.

Instead you want to use **[apply](17-language-reference.md#apply) rules**.

If you want basic monitoring for all your hosts, add a `ping4` service apply rule
for all hosts which have the `address` attribute specified. That is just one rule for 1000 hosts
instead of 1000 service objects; apply rules will automatically generate them for you.

```
apply Service "ping4" {
  check_command = "ping4"
  assign where host.address
}
```

More explanations on `assign where` expressions can be found [here](03-monitoring-basics.md#using-apply-expressions).

### Apply Rules: Prerequisites <a id="using-apply-prerquisites"></a>

Before you start with apply rules keep the following in mind:

* Define the best match.
    * A set of unique [custom attributes](03-monitoring-basics.md#custom-attributes) for these hosts/services?
    * Or [group](03-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup which should have a service set?
    * A generic pattern [match](18-library-reference.md#global-functions-match) on the host/service name?
    * [Multiple expressions combined](03-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](17-language-reference.md#expression-operators)
* All expressions must return a boolean value (for example, an empty string is treated as `false`).

More specific object type requirements are described in these chapters:

* [Apply services to hosts](03-monitoring-basics.md#using-apply-services)
* [Apply notifications to hosts and services](03-monitoring-basics.md#using-apply-notifications)
* [Apply dependencies to hosts and services](03-monitoring-basics.md#using-apply-dependencies)
* [Apply scheduled downtimes to hosts and services](03-monitoring-basics.md#using-apply-scheduledowntimes)

### Apply Rules: Usage Examples <a id="using-apply-usage-examples"></a>

You can set/override object attributes in apply rules using the respectively available
objects in that scope (host and/or service objects).

```
vars.application_type = host.vars.application_type
```

[Custom attributes](03-monitoring-basics.md#custom-attributes) can also store
nested dictionaries and arrays. That way you can use them not only for matching
on their existence or values in apply expressions, but also to assign
("inherit") their values into the objects generated from apply rules.

Remember the examples shown for [custom attribute values](03-monitoring-basics.md#custom-attributes-values):

```
vars.notification["mail"] = {
  groups = [ "icingaadmins" ]
}
```

You can do two things here:

* Check for the existence of the `notification` custom attribute and its nested dictionary key `mail`.
If this is boolean true, the notification object will be generated.
* Assign the value of the `groups` key to the `user_groups` attribute.

```
apply Notification "mail-icingaadmin" to Host {
  // ... other notification attributes ...

  user_groups = host.vars.notification.mail.groups

  assign where host.vars.notification.mail
}
```

A more advanced example is to use [apply rules with for loops on arrays or
dictionaries](03-monitoring-basics.md#using-apply-for) provided by
[custom attributes](03-monitoring-basics.md#custom-attributes) or groups.

Remember the examples shown for [custom attribute values](03-monitoring-basics.md#custom-attributes-values):

```
vars.disks["disk /"] = {
  disk_partitions = "/"
}
```

You can iterate over all dictionary keys defined in `disks`.
You can optionally use the value to specify additional object attributes.

```
apply Service for (disk => config in host.vars.disks) {
  // ... other service attributes ...

  vars.disk_partitions = config.disk_partitions
}
```

Please read the [apply for chapter](03-monitoring-basics.md#using-apply-for)
for more specific insights.

> Building configuration in that dynamic way requires detailed information
> about the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
> after successful [configuration validation](11-cli-commands.md#config-validation).

### Apply Rules Expressions <a id="using-apply-expressions"></a>

You can use simple or advanced combinations of apply rule expressions. Each
expression must evaluate to the boolean `true` value in order to match. An empty string,
for instance, is interpreted as `false`. In a similar fashion, undefined
attributes evaluate to `false`.

```
assign where host.vars.attribute_does_not_exist
```

Multiple `assign where` condition rows are evaluated as an `OR` condition.

You can combine multiple expressions for matching only a subset of objects. In some cases,
you want to be able to add more than one assign/ignore where expression which matches
a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.
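
The difference between several `assign where` rows and a single combined expression can be sketched like this. The service names and the custom attributes `os` and `no_ssh` are assumptions made for illustration.

```
// Two assign rows: a host matches if EITHER condition is true.
apply Service "ssh-any-unix" {       // hypothetical service name
  check_command = "ssh"

  assign where host.vars.os == "Linux"
  assign where host.vars.os == "FreeBSD"
  ignore where host.vars.no_ssh == true
}

// One row with explicit operators: the same OR, narrowed by an AND.
apply Service "ssh-with-address" {   // hypothetical service name
  check_command = "ssh"

  assign where host.address && (host.vars.os == "Linux" || host.vars.os == "FreeBSD")
}
```

The parentheses in the second rule make the intended grouping explicit instead of relying on operator precedence.
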

#### Apply Rules Expressions Examples <a id="using-apply-expressions-examples"></a>

Assign a service to a specific host in a host group [array](18-library-reference.md#array-type) using the [in operator](17-language-reference.md#expression-operators):

```
assign where "hostgroup-dev" in host.groups
```

Assign an object when a custom attribute is [equal](17-language-reference.md#expression-operators) to a value:

```
assign where host.vars.application_type == "database"

assign where service.vars.sms_notify == true
```

Assign an object if a dictionary [contains](18-library-reference.md#dictionary-contains) a given key:

```
assign where host.vars.app_dict.contains("app")
```

Match the host name by using a [case insensitive match](18-library-reference.md#global-functions-match):

```
assign where match("webserver*", host.name)
```

Match the host name by using a [regular expression](18-library-reference.md#global-functions-regex). Please note the [escaped](17-language-reference.md#string-literals-escape-sequences) backslash character:

```
assign where regex("^webserver-\\d+", host.name)
```

The following host group [matches](18-library-reference.md#global-functions-match) all hosts whose name matches the `*mysql*` pattern and (`&&`) whose custom attribute `prod_mysql_db`
matches the `db-*` pattern. Hosts with the custom attribute `test_server` set to `true`
are ignored, as are hosts whose name matches the `*internal` pattern.

```
object HostGroup "mysql-server" {
  display_name = "MySQL Server"

  assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
  ignore where host.vars.test_server == true
  ignore where match("*internal", host.name)
}
```

A similar example shows advanced notification apply rule filters: the notification is assigned if the service
attribute `notes` [matches](18-library-reference.md#global-functions-match) the `has gold support 24x7` string `AND` one of the
following two conditions passes: either the `customer` host custom attribute is set to `customer-xy`
`OR` the host custom attribute `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `*internal`
`OR` whose `priority` custom attribute is [less than](17-language-reference.md#expression-operators) `2`
`AND` whose host has the `is_clustered` custom attribute set to `true`.

```
template Notification "cust-xy-notification" {
  users = [ "noc-xy", "mgmt-xy" ]
  command = "mail-service-notification"
}

apply Notification "notify-cust-xy-mysql" to Service {
  import "cust-xy-notification"

  assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
  ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
}
```

More advanced examples are covered [here](08-advanced-topics.md#use-functions-assign-where).

### Apply Services to Hosts <a id="using-apply-services"></a>

The sample configuration already includes a detailed example in [hosts.conf](04-configuring-icinga-2.md#hosts-conf)
and [services.conf](04-configuring-icinga-2.md#services-conf) for this use case.

The example for `ssh` applies a service object to all hosts with the `address`
attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  assign where host.address && host.vars.os == "Linux"
}
```

Other detailed examples are used in their respective chapters, for example
[apply services with custom command arguments](03-monitoring-basics.md#command-passing-parameters).

### Apply Notifications to Hosts and Services <a id="using-apply-notifications"></a>

Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

```
apply Notification "mail-noc" to Service {
  import "mail-service-notification"

  user_groups = [ "noc" ]

  assign where host.vars.notification.mail
}
```

In this example the `mail-noc` notification will be created as an object for all services whose host has the
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.
908 It is also possible to generally apply a notification template and dynamically overwrite values from
909 the template by checking for custom attributes. This can be achieved by using [conditional statements](17-language-reference.md#conditional-statements):
912 apply Notification "host-mail-noc" to Host {
913 import "mail-host-notification"
915 // replace interval inherited from `mail-host-notification` template with new notification interval set by a host custom attribute
916 if (host.vars.notification_interval) {
917 interval = host.vars.notification_interval
920 // same with notification period
921 if (host.vars.notification_period) {
922 period = host.vars.notification_period
925 // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
926 if (host.vars.notification_type == "sms") {
927 command = "sms-host-notification"
929 command = "mail-host-notification"
932 user_groups = [ "noc" ]
934 assign where host.address
938 In the example above the notification template `mail-host-notification`
939 contains all relevant notification settings.
940 The apply rule is applied on all host objects where the `host.address` is defined.
942 If the host object has a specific custom attribute set, its value is inherited
943 into the local notification object scope, e.g. `host.vars.notification_interval`,
944 `host.vars.notification_period` and `host.vars.notification_type`.
945 This overwrites attributes already specified in the imported `mail-host-notification` template.
948 The corresponding host object could look like this:
951 object Host "host1" {
952 import "host-linux-prod"
953 display_name = "host1"
954 address = "192.168.1.50"
955 vars.notification_interval = 1h
956 vars.notification_period = "24x7"
957 vars.notification_type = "sms"
961 ### Apply Dependencies to Hosts and Services <a id="using-apply-dependencies"></a>
963 Detailed examples can be found in the [dependencies](03-monitoring-basics.md#dependencies) chapter.
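As a short sketch of what such a rule can look like, the following applies a host dependency to every host except the parent itself. The parent host `dsl-router` is a hypothetical name chosen for this illustration.

```
apply Dependency "internet" to Host {
  parent_host_name = "dsl-router"   // hypothetical parent host
  disable_checks = true             // skip checks while the parent is down
  disable_notifications = true      // suppress notifications as well

  assign where host.name != "dsl-router"
}
```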
965 ### Apply Recurring Downtimes to Hosts and Services <a id="using-apply-scheduledowntimes"></a>
967 The sample configuration includes an example in [downtimes.conf](04-configuring-icinga-2.md#downtimes-conf).
969 Detailed examples can be found in the [recurring downtimes](08-advanced-topics.md#recurring-downtimes) chapter.
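A minimal sketch of a recurring downtime apply rule is shown here. The author name, the time ranges and the `backup` service group are assumptions for illustration only.

```
apply ScheduledDowntime "backup-downtime" to Service {
  author = "icingaadmin"
  comment = "Scheduled downtime for backup"

  /* weekly recurring time ranges */
  ranges = {
    monday = "02:00-03:00"
    tuesday = "02:00-03:00"
  }

  assign where "backup" in service.groups
}
```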
972 ### Using Apply For Rules <a id="using-apply-for"></a>
974 In addition to standard [apply rules](03-monitoring-basics.md#using-apply),
975 you can create objects from a set (array or
976 dictionary) using [apply for](17-language-reference.md#apply-for) expressions.
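For the array case, a minimal sketch could look like this. The host, its `vhosts` custom attribute and the `http-` name prefix are hypothetical. `http_vhost` is assumed to be the virtual host parameter of the ITL `http` check command.

```
object Host "web-array-example" {
  check_command = "hostalive"
  address = "192.0.2.40"

  vars.vhosts = [ "www.example.com", "shop.example.com" ]
}

/* one service per array element, named "http-<vhost>" */
apply Service "http-" for (vhost in host.vars.vhosts) {
  check_command = "http"
  vars.http_vhost = vhost
}
```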
978 The sample configuration already includes a detailed example in [hosts.conf](04-configuring-icinga-2.md#hosts-conf)
979 and [services.conf](04-configuring-icinga-2.md#services-conf) for this use case.
981 Consider the following example: a host provides the SNMP OIDs for different service check
982 types. It could look like this:
985 object Host "router-v6" {
986 check_command = "hostalive"
987 address6 = "2001:db8:1234::42"
989 vars.oids["if01"] = "1.1.1.1.1"
990 vars.oids["temp"] = "1.1.1.1.2"
991 vars.oids["bgp"] = "1.1.1.1.5"
995 The idea is to create service objects for `if01` and `temp` but not `bgp`.
996 The oid value should also be used as service custom attribute `snmp_oid`.
997 This is the command argument required by the [snmp](10-icinga-template-library.md#plugin-check-command-snmp) check command.
999 The service's `display_name` should be set to the identifier inside the dictionary.
1003 apply Service for (identifier => oid in host.vars.oids) {
1004 check_command = "snmp"
1005 display_name = identifier
1008 ignore where identifier == "bgp" //don't generate service for bgp checks
1012 Icinga 2 evaluates the `apply for` rule for all objects with the custom attribute `oids` set.
1014 It iterates over all dictionary items inside the `for` loop and evaluates the
1015 `assign/ignore where` expressions. You can access the loop variable
1016 in these expressions, e.g. to ignore specific values.
1018 In this example the `bgp` identifier is ignored. This avoids generating
1019 unwanted services. A different approach would be to match the `oid` value with a
1020 [regex](18-library-reference.md#global-functions-regex)/[wildcard match](18-library-reference.md#global-functions-match) pattern for example.
1023 ignore where regex("^\d.\d.\d.\d.5$", oid)
1028 > You don't need an `assign where` expression which checks for the existence of the
1029 > `oids` custom attribute.
1031 This method saves you from creating multiple apply rules. It also moves
1032 the attribute specification logic from the service to the host.
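The wildcard alternative mentioned above could be sketched as follows. This is an illustration only: the `*.5` pattern is an assumption that fits the example OIDs on `router-v6`, where only the `bgp` entry ends in `.5`.

```
apply Service for (identifier => oid in host.vars.oids) {
  check_command = "snmp"
  display_name = identifier
  vars.snmp_oid = oid

  // wildcard alternative to the regex: skip every OID ending in ".5"
  ignore where match("*.5", oid)
}
```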
1035 #### Apply For and Custom Attribute Override <a id="using-apply-for-custom-attribute-override"></a>
1037 Imagine a different, more advanced example: You are monitoring your network device (host)
1038 with many interfaces (services). The following requirements/problems apply:
1040 * Each interface service should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc.)
1041 * Each interface has its own VLAN tag
1042 * Some interfaces have QoS enabled
1043 * Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
1044 dynamically generated.
1049 > Define the SNMP community as global constant in your [constants.conf](04-configuring-icinga-2.md#constants-conf) file.
1052 const IftrafficSnmpCommunity = "public"
1055 Define the `interfaces` [custom attribute](03-monitoring-basics.md#custom-attributes)
1056 on the `cisco-catalyst-6509-34` host object and add three example interfaces as dictionary keys.
1058 Specify additional attributes inside the nested dictionary
1059 as learned with [custom attribute values](03-monitoring-basics.md#custom-attributes-values):
1062 object Host "cisco-catalyst-6509-34" {
1063 import "generic-host"
1064 display_name = "Catalyst 6509 #34 VIE21"
1065 address = "127.0.1.4"
1067 /* "GigabitEthernet0/2" is the interface name,
1068 * and key name in service apply for later on
1070 vars.interfaces["GigabitEthernet0/2"] = {
1071 /* define all custom attributes with the
1072 * same name required for command parameters/arguments
1073 * in service apply (look into your CheckCommand definition)
1075 iftraffic_units = "g"
1076 iftraffic_community = IftrafficSnmpCommunity
1077 iftraffic_bandwidth = 1
1081 vars.interfaces["GigabitEthernet0/4"] = {
1082 iftraffic_units = "g"
1083 //iftraffic_community = IftrafficSnmpCommunity
1084 iftraffic_bandwidth = 1
1088 vars.interfaces["MgmtInterface1"] = {
1089 iftraffic_community = IftrafficSnmpCommunity
1091 interface_address = "127.99.0.100" #special management ip
1096 Start with the apply for definition and iterate over `host.vars.interfaces`.
1097 This is a dictionary and should use the variables `interface_name` as key
1098 and `interface_config` as value for each generated object scope.
1100 `"if-"` specifies the object name prefix for each service which results
1101 in `if-<interface_name>` for each iteration.
1104 /* loop over the host.vars.interfaces dictionary
1105 * for (key => value in dict) means `interface_name` as key
1106 * and `interface_config` as value. Access config attributes
1107 * with the indexer (`.`) character.
1109 apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
1112 Import the `generic-service` template, assign the [iftraffic](10-icinga-template-library.md#plugin-contrib-command-iftraffic)
1113 `check_command`. Use the dictionary key `interface_name` to set a proper `display_name`
1114 string for external interfaces.
1117 import "generic-service"
1118 check_command = "iftraffic"
1119 display_name = "IF-" + interface_name
1122 The `interface_name` key's value is the same string used as command parameter for `iftraffic`:
1126 /* use the key as command argument (no duplication of values in host.vars.interfaces) */
1127 vars.iftraffic_interface = interface_name
1130 Remember that `interface_config` is a nested dictionary. In the first iteration it looks like this:
1134 interface_config = {
1135 iftraffic_units = "g"
1136 iftraffic_community = IftrafficSnmpCommunity
1137 iftraffic_bandwidth = 1
1143 Access the dictionary keys with the [indexer](17-language-reference.md#indexer) syntax
1144 and assign them to custom attributes used as command parameters for the `iftraffic` check command.
1148 /* map the custom attributes as command arguments */
1149 vars.iftraffic_units = interface_config.iftraffic_units
1150 vars.iftraffic_community = interface_config.iftraffic_community
1153 If you just want to inherit all attributes specified inside the `interface_config`
1154 dictionary, add it to the generated service custom attributes like this:
1157 /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
1158 * are the _exact_ same as required as command parameter by the check command
1161 vars += interface_config
1164 If the user did not specify default values for required service custom attributes,
1165 add them here. This also helps to avoid unwanted configuration validation errors or
1166 runtime failures. Please read more about conditional statements [here](17-language-reference.md#conditional-statements).
1169 /* set a default value for units and bandwidth */
1170 if (interface_config.iftraffic_units == "") {
1171 vars.iftraffic_units = "m"
1173 if (interface_config.iftraffic_bandwidth == "") {
1174 vars.iftraffic_bandwidth = 1
1176 if (interface_config.vlan == "") {
1177 vars.vlan = "not set"
1179 if (interface_config.qos == "") {
1180 vars.qos = "not set"
1184 If the host object did not specify a custom SNMP community,
1185 set a default value specified by the [global constant](17-language-reference.md#constants) `IftrafficSnmpCommunity`.
1188 /* set the global constant if not explicitly
1189 * provided by the `interfaces` dictionary on the host
1191 if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
1192 vars.iftraffic_community = IftrafficSnmpCommunity
1196 Use the provided values to [calculate](17-language-reference.md#expression-operators)
1197 more object attributes which can be e.g. seen in external interfaces.
1200 /* Calculate some additional object attributes after populating the `vars` dictionary */
1201 notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with QoS '" + vars.qos + "'"
1202 notes_url = "https://foreman.company.com/hosts/" + host.name
1203 action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
1209 > Building configuration in that dynamic way requires detailed information
1210 > of the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
1211 > after successful [configuration validation](11-cli-commands.md#config-validation).
1213 Verify that the apply-for-rule successfully created the service objects with the
1214 inherited custom attributes:
1218 # icinga2 object list --type Service --name *catalyst*
1220 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
1223 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
1224 * iftraffic_bandwidth = 1
1225 * iftraffic_community = "public"
1226 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
1227 * iftraffic_interface = "GigabitEthernet0/2"
1228 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
1229 * iftraffic_units = "g"
1230 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
1235 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
1238 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
1239 * iftraffic_bandwidth = 1
1240 * iftraffic_community = "public"
1241 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
1242 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
1243 * iftraffic_interface = "GigabitEthernet0/4"
1244 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
1245 * iftraffic_units = "g"
1246 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
1250 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
1253 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
1254 * iftraffic_bandwidth = 1
1255 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
1256 * iftraffic_community = "public"
1257 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
1258 * iftraffic_interface = "MgmtInterface1"
1259 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
1260 * iftraffic_units = "m"
1261 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
1262 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
1263 * interface_address = "127.99.0.100"
1265 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
1269 ### Use Object Attributes in Apply Rules <a id="using-apply-object-attributes"></a>
1271 Since apply rules are evaluated after the generic objects, you
1272 can reference existing host and/or service object attributes as
1273 values for any object attribute specified in that apply rule.
1276 object Host "opennebula-host" {
1277 import "generic-host"
1278 address = "10.1.1.2"
1280 vars.hosting["cust1"] = {
1282 customer_name = "Customer 1"
1283 customer_id = "7568"
1284 support_contract = "gold"
1286 vars.hosting["cust2"] = {
1288 customer_name = "Customer 2"
1289 customer_id = "7569"
1290 support_contract = "silver"
1295 `hosting` is a custom attribute with the Dictionary value type.
1296 This is mandatory to iterate with the `key => value` notation
1297 in the below apply for rule.
1300 apply Service for (customer => config in host.vars.hosting) {
1301 import "generic-service"
1302 check_command = "ping4"
1304 vars.qos = "disabled"
1308 vars.http_uri = "/" + customer + "/" + config.http_uri
1310 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
1312 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
1314 notes_url = "https://foreman.company.com/hosts/" + host.name
1315 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
1319 Each loop iteration has different values for `customer` and `config`:
1328 customer_name = "Customer 1"
1329 customer_id = "7568"
1330 support_contract = "gold"
1340 customer_name = "Customer 2"
1341 customer_id = "7569"
1342 support_contract = "silver"
1346 You can now add the `config` dictionary into `vars`.
1352 Now it looks like the following in the first iteration:
1358 customer_name = "Customer 1"
1359 customer_id = "7568"
1360 support_contract = "gold"
1364 Remember, you know this structure already. Custom
1365 attributes can also be accessed by using the [indexer](17-language-reference.md#indexer) syntax.
1369 vars.http_uri = ... + config.http_uri
1372 can also be written as
1376 vars.http_uri = ... + vars.http_uri
1380 ## Groups <a id="groups"></a>
1382 A group is a collection of similar objects. Groups are primarily used as a
1383 visualization aid in web interfaces.
1385 Group membership is defined at the respective object itself. If
1386 you have a host group named `windows`, for example, and want to assign
1387 specific hosts to this group for later viewing the group on your
1388 alert dashboard, first create a HostGroup object:
1391 object HostGroup "windows" {
1392 display_name = "Windows Servers"
1396 Then add your hosts to this group:
1399 template Host "windows-server" {
1400 groups += [ "windows" ]
1403 object Host "mssql-srv1" {
1404 import "windows-server"
1406 vars.mssql_port = 1433
1409 object Host "mssql-srv2" {
1410 import "windows-server"
1412 vars.mssql_port = 1433
1416 This can be done for service and user groups the same way:
1419 object UserGroup "windows-mssql-admins" {
1420 display_name = "Windows MSSQL Admins"
1423 template User "generic-windows-mssql-users" {
1424 groups += [ "windows-mssql-admins" ]
1427 object User "win-mssql-noc" {
1428 import "generic-windows-mssql-users"
1430 email = "noc@example.com"
1433 object User "win-mssql-ops" {
1434 import "generic-windows-mssql-users"
1436 email = "ops@example.com"
1440 ### Group Membership Assign <a id="group-assign-intro"></a>
1442 Instead of manually assigning each object to a group you can also assign objects
1443 to a group based on their attributes:
1446 object HostGroup "prod-mssql" {
1447 display_name = "Production MSSQL Servers"
1449 assign where host.vars.mssql_port && host.vars.prod_mysql_db
1450 ignore where host.vars.test_server == true
1451 ignore where match("*internal", host.name)
1455 In this example all hosts with the `vars` attributes `mssql_port` and `prod_mysql_db`
1456 will be added as members to the host group `prod-mssql`. However, all
1457 hosts whose name [matches](18-library-reference.md#global-functions-match) the pattern `*internal`
1458 or with the `test_server` attribute set to `true` are **not** added to this group.
1460 Details on the `assign where` syntax can be found in the
1461 [Language Reference](17-language-reference.md#apply).
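Service groups accept the same kind of assignment. The following is a small sketch, where the group name and the `ping*` pattern are assumptions for illustration:

```
object ServiceGroup "ping" {
  display_name = "Ping Checks"

  /* add every service whose name starts with "ping" */
  assign where match("ping*", service.name)
}
```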
1463 ## Notifications <a id="alert-notifications"></a>
1465 Notifications for service and host problems are an integral part of your monitoring setup.
1468 When a host or service is in a downtime, a problem has been acknowledged or
1469 the dependency logic determined that the host/service is unreachable, no
1470 notifications are sent. You can configure additional type and state filters
1471 to refine which notifications are actually sent.
1473 There are many ways of sending notifications, e.g. by email, XMPP,
1474 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
1475 Instead it relies on external mechanisms such as shell scripts to notify users.
1476 More notification methods are listed in the [addons and plugins](13-addons.md#notification-scripts-interfaces)
1479 A notification specification requires one or more users (and/or user groups)
1480 who will be notified in case of problems. These users must have all custom
1481 attributes defined which will be used in the `NotificationCommand` on execution.
1483 The user `icingaadmin` in the example below will get notified only on `Warning` and
1484 `Critical` problems. In addition to that, `Recovery` notifications are sent (they require the `OK` state).
1488 object User "icingaadmin" {
1489 display_name = "Icinga 2 Admin"
1490 enable_notifications = true
1491 states = [ OK, Warning, Critical ]
1492 types = [ Problem, Recovery ]
1493 email = "icinga@localhost"
1497 If you don't set the `states` and `types` configuration attributes for the `User`
1498 object, notifications for all states and types will be sent.
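For comparison, a user without any filters could be sketched like this. The object name is hypothetical. Because neither `states` nor `types` is set, nothing is filtered at the user level.

```
object User "icingaadmin-all" {
  display_name = "Icinga 2 Admin (unfiltered)"
  enable_notifications = true
  email = "icinga@localhost"
  /* no `states` or `types` set: all states and types are notified */
}
```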
1500 Details on troubleshooting notification problems can be found [here](15-troubleshooting.md#troubleshooting).
1504 > Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled
1505 > in order to execute notification commands.
1507 You should choose which information you (and your notified users) are interested in
1508 case of emergency, and also which information does not provide any value to you and
1511 An example notification command is explained [here](03-monitoring-basics.md#notification-commands).
1513 You can add all shared attributes to a `Notification` template which the defined
1514 notifications import. That way you avoid duplicating attributes in each
1515 `Notification` object. Attributes can be overridden locally.
1518 template Notification "generic-notification" {
1521 command = "mail-service-notification"
1523 states = [ Warning, Critical, Unknown ]
1524 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1525 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1531 The time period `24x7` is included as example configuration with Icinga 2.
1533 Use the `apply` keyword to create `Notification` objects for your services:
1536 apply Notification "notify-cust-xy-mysql" to Service {
1537 import "generic-notification"
1539 users = [ "noc-xy", "mgmt-xy" ]
1541 assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
1542 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
1547 Instead of assigning users to notifications, you can also add the `user_groups`
1548 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1549 send notifications to all group members.
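A minimal sketch of the group-based variant is shown below. The notification name is hypothetical, and a `UserGroup` object named `noc` is assumed to exist.

```
apply Notification "notify-noc-mysql" to Service {
  import "generic-notification"

  user_groups = [ "noc" ]   // every member of the `noc` user group is notified

  assign where host.vars.customer == "customer-xy"
}
```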
1553 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1554 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1556 Icinga 2 v2.10 allows you to configure `Acknowledgement` and/or `Recovery`
1557 without a `Problem` notification. These notifications will be sent without
1558 any problem notifications beforehand, and can be used for e.g. ticket systems.
1561 types = [ Acknowledgement, Recovery ]
1564 ### Notifications: Users from Host/Service <a id="alert-notifications-users-host-service"></a>
1566 A common pattern is to store the users and user groups
1567 on the host or service objects instead of the notification object.
1570 The sample configuration provided in [hosts.conf](04-configuring-icinga-2.md#hosts-conf) and [notifications.conf](04-configuring-icinga-2.md#notifications-conf)
1571 already provides an example for this use case.
1575 > Please make sure to read the [apply](03-monitoring-basics.md#using-apply) and
1576 > [custom attribute values](03-monitoring-basics.md#custom-attributes-values) chapter to
1577 > fully understand these examples.
1580 Specify the users and groups as nested custom attributes on the host object:
1583 object Host "icinga2-client1.localdomain" {
1586 vars.notification["mail"] = {
1587 groups = [ "icingaadmins" ]
1588 users = [ "icingaadmin" ]
1590 vars.notification["sms"] = {
1591 users = [ "icingaadmin" ]
1596 As you can see, there is the option to use two different notification
1597 apply rules here: One for `mail` and one for `sms`.
1599 This example assigns the `users` and `groups` nested keys from the `notification`
1600 custom attribute to the actual notification object attributes.
1602 Since errors are hard to debug if host objects don't specify the required
1603 configuration attributes, you can add a safety condition which logs which
1604 host object is affected.
1607 critical/config: Host 'icinga2-client3.localdomain' does not specify required user/user_groups configuration attributes for notification 'mail-icingaadmin'.
1610 You can also use the [script debugger](20-script-debugger.md#script-debugger) for more advanced insights.
1613 apply Notification "mail-host-notification" to Host {
1616 /* Log which host does not specify required user/user_groups attributes. This will fail immediately during config validation and help a lot. */
1617 if (len(host.vars.notification.mail.users) == 0 && len(host.vars.notification.mail.groups) == 0) {
1618 log(LogCritical, "config", "Host '" + host.name + "' does not specify required user/user_groups configuration attributes for notification '" + name + "'.")
1621 users = host.vars.notification.mail.users
1622 user_groups = host.vars.notification.mail.groups
1624 assign where host.vars.notification.mail && typeof(host.vars.notification.mail) == Dictionary
1627 apply Notification "sms-host-notification" to Host {
1630 /* Log which host does not specify required user/user_groups attributes. This will fail immediately during config validation and help a lot. */
1631 if (len(host.vars.notification.sms.users) == 0 && len(host.vars.notification.sms.groups) == 0) {
1632 log(LogCritical, "config", "Host '" + host.name + "' does not specify required user/user_groups configuration attributes for notification '" + name + "'.")
1635 users = host.vars.notification.sms.users
1636 user_groups = host.vars.notification.sms.groups
1638 assign where host.vars.notification.sms && typeof(host.vars.notification.sms) == Dictionary
1642 The example above uses [typeof](18-library-reference.md#global-functions-typeof) as safety function to ensure that
1643 the `mail` key really provides a dictionary as value. Otherwise
1644 the configuration validation could fail if an admin adds something
1645 like this on another host:
1648 vars.notification.mail = "yes"
1652 You can also do a more fine-grained assignment on the service object:
1655 apply Service "http" {
1658 vars.notification["mail"] = {
1659 groups = [ "icingaadmins" ]
1660 users = [ "icingaadmin" ]
1667 This notification apply rule is different to the one above. The service
1668 notification users and groups are inherited from the service and if not set,
1669 from the host object. A default user is set too.
1672 apply Notification "mail-host-notification" to Service {
1675 if (service.vars.notification.mail.users) {
1676 users = service.vars.notification.mail.users
1677 } else if (host.vars.notification.mail.users) {
1678 users = host.vars.notification.mail.users
1680 /* Default user who receives everything. */
1681 users = [ "icingaadmin" ]
1684 if (service.vars.notification.mail.groups) {
1685 user_groups = service.vars.notification.mail.groups
1686 } else if (host.vars.notification.mail.groups) {
1687 user_groups = host.vars.notification.mail.groups
1690 assign where host.vars.notification.mail && typeof(host.vars.notification.mail) == Dictionary
1694 ### Notification Escalations <a id="notification-escalations"></a>
1696 When a problem notification is sent and a problem still exists at the time of re-notification
1697 you may want to escalate the problem to the next support level. A different approach
1698 is to configure the default notification by email, and escalate the problem via SMS
1699 if not already solved.
1701 You can define notification start and end times as additional configuration
1702 attributes making the `Notification` object a so-called `notification escalation`.
1703 Using templates you can share the basic notification attributes such as users or the
1704 `interval` (and override them for the escalation then).
1706 Using the example from above, you can define additional users being escalated for SMS
1707 notifications between start and end time.
1710 object User "icinga-oncall-2nd-level" {
1711 display_name = "Icinga 2nd Level"
1713 vars.mobile = "+1 555 424642"
1716 object User "icinga-oncall-1st-level" {
1717 display_name = "Icinga 1st Level"
1719 vars.mobile = "+1 555 424642"
1723 Define an additional [NotificationCommand](03-monitoring-basics.md#notification-commands) for SMS notifications.
1727 > The example is not complete as there are many different SMS providers.
1728 > Please note that sending SMS notifications will require an SMS provider
1729 > or local hardware with an active SIM card.
1732 object NotificationCommand "sms-notification" {
1734 PluginDir + "/send_sms_notification",
1740 The two new notification escalations are added onto the local host
1741 and its service `ping4` using the `generic-notification` template.
1742 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1743 command) after `30m` until `1h`.
1747 > The `interval` was set to 15m in the `generic-notification`
1748 > template example. Lower that value in your escalations by using a secondary
1749 > template or by overriding the attribute directly in the `notifications` array
1750 > position for `escalation-sms-2nd-level`.
1752 If the problem is neither resolved nor acknowledged (which would prevent further notifications),
1753 the `icinga-oncall-1st-level` user will be notified `1h` after the initial problem was
1754 notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
1757 apply Notification "mail" to Service {
1758 import "generic-notification"
1760 command = "mail-notification"
1761 users = [ "icingaadmin" ]
1763 assign where service.name == "ping4"
1766 apply Notification "escalation-sms-2nd-level" to Service {
1767 import "generic-notification"
1769 command = "sms-notification"
1770 users = [ "icinga-oncall-2nd-level" ]
1777 assign where service.name == "ping4"
1780 apply Notification "escalation-sms-1st-level" to Service {
1781 import "generic-notification"
1783 command = "sms-notification"
1784 users = [ "icinga-oncall-1st-level" ]
1791 assign where service.name == "ping4"
1795 ### Notification Delay <a id="notification-delay"></a>
1797 Sometimes the problem in question should not be announced when the notification is due
1798 (the object reaching the `HARD` state), but after a certain period. In Icinga 2
1799 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
1800 postpone the notification window for 15 minutes. Leave out the `end` key -- if not set,
1801 Icinga 2 will not check against any end time for this notification. Make sure to
1802 specify a relatively low notification `interval` to get notified soon enough again.
1805 apply Notification "mail" to Service {
1806 import "generic-notification"
1808 command = "mail-notification"
1809 users = [ "icingaadmin" ]
1813 times.begin = 15m // delay notification window
1815 assign where service.name == "ping4"
1819 ### Disable Re-notifications <a id="disable-renotification"></a>
1821 If you prefer to be notified only once, you can disable re-notifications by setting the
1822 `interval` attribute to `0`.
1825 apply Notification "notify-once" to Service {
1826 import "generic-notification"
1828 command = "mail-notification"
1829 users = [ "icingaadmin" ]
1831 interval = 0 // disable re-notification
1833 assign where service.name == "ping4"
1837 ### Notification Filters by State and Type <a id="notification-filters-state-type"></a>
1839 If there are no notification state and type filter attributes defined at the `Notification`
1840 or `User` object, Icinga 2 assumes that all states and types are being notified.
1842 Available state and type filters for notifications are:
1845 template Notification "generic-notification" {
1847 states = [ OK, Warning, Critical, Unknown ]
1848 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1849 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1854 ## Commands <a id="commands"></a>
1856 Icinga 2 uses three different command object types to specify how
1857 checks should be performed, notifications should be sent, and
1858 events should be handled.
1860 ### Check Commands <a id="check-commands"></a>
1862 [CheckCommand](09-object-types.md#objecttype-checkcommand) objects define the command line how
1865 [CheckCommand](09-object-types.md#objecttype-checkcommand) objects are referenced by
1866 [Host](09-object-types.md#objecttype-host) and [Service](09-object-types.md#objecttype-service) objects
1867 using the `check_command` attribute.
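As a minimal sketch of such a reference, the service below points to the `ssh` check command by name. It reuses the `my-server1` host from the earlier example, and the service itself is hypothetical.

```
object Service "ssh" {
  host_name = "my-server1"
  check_command = "ssh"   // name of the CheckCommand object to execute
}
```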
1871 > Make sure that the [checker](11-cli-commands.md#enable-features) feature is enabled in order to execute checks.
1874 #### Integrate the Plugin with a CheckCommand Definition <a id="command-plugin-integration"></a>
1876 Unless you have done so already, download your check plugin and put it
1877 into the [PluginDir](04-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1878 `check_mysql` plugin contained in the Monitoring Plugins package.
1880 The plugin path and all command arguments are made a list of
1881 double-quoted string arguments for proper shell escaping.
1883 Call the `check_mysql` plugin with the `--help` parameter to see
1884 all available options. Our example passes the host (`-H`),
1885 user (`-u`) and password (`-p`) options listed in the usage
1886 output below.
1889 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1891 This program tests connections to a MySQL server
1894 check_mysql [-d database] [-H host] [-P port] [-s socket]
1895 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1896 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
The next step is to understand how [command parameters](03-monitoring-basics.md#command-passing-parameters)
are passed from a host or service object, and to add a [CheckCommand](09-object-types.md#objecttype-checkcommand)
definition based on these required parameters and/or default values.

1903 Please continue reading in the [plugins section](05-service-monitoring.md#service-monitoring-plugins) for additional integration examples.
1905 #### Passing Check Command Parameters from Host or Service <a id="command-passing-parameters"></a>
1907 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1908 by the executed check command.
1910 The check command parameters for ITL provided plugin check command definitions are documented
1911 [here](10-icinga-template-library.md#icinga-template-library), for example
1912 [disk](10-icinga-template-library.md#plugin-check-command-disk).
1914 In order to practice passing command parameters you should [integrate your own plugin](03-monitoring-basics.md#command-plugin-integration).
1916 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](02-getting-started.md#setting-up-check-plugins).
Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
(the naming schema is up to you), and optionally their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](03-monitoring-basics.md#command-arguments)
on the command line.

> **Tip**
>
> Use a common command type as prefix for your command arguments to increase
> readability. `mysql_user` conveys the context better than just
> `user` as argument.

1929 The default custom attributes can be overridden by the custom attributes
1930 defined in the host or service using the check command `my-mysql`. The custom attributes
1931 can also be inherited from a parent template using additive inheritance (`+=`).
```
# vim /etc/icinga2/conf.d/commands.conf

object CheckCommand "my-mysql" {
  command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir

  arguments = {
    "-H" = "$mysql_host$"
    "-u" = {
      required = true
      value = "$mysql_user$"
    }
    "-p" = "$mysql_password$"
    "-P" = "$mysql_port$"
    "-s" = "$mysql_socket$"
    "-a" = "$mysql_cert$"
    "-d" = "$mysql_database$"
    "-k" = "$mysql_key$"
    "-C" = "$mysql_ca_cert$"
    "-D" = "$mysql_ca_dir$"
    "-L" = "$mysql_ciphers$"
    "-f" = "$mysql_optfile$"
    "-g" = "$mysql_group$"
    "-S" = {
      set_if = "$mysql_check_slave$"
      description = "Check if the slave thread is running properly."
    }
    "-l" = {
      set_if = "$mysql_ssl$"
      description = "Use ssl encryption"
    }
  }

  vars.mysql_check_slave = false
  vars.mysql_ssl = false
  vars.mysql_host = "$address$"
}
```

The check command definition also sets `mysql_host` to the `$address$` default value. You can override
this command parameter if, for example, your MySQL server is not reachable at the host's IP address.

Make sure to pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
In this example, `MysqlUsername` and `MysqlPassword` are specified as [global constants](04-configuring-icinga-2.md#constants-conf).

```
# vim /etc/icinga2/conf.d/services.conf

apply Service "mysql-icinga-db-health" {
  import "generic-service"

  check_command = "my-mysql"

  vars.mysql_user = MysqlUsername
  vars.mysql_password = MysqlPassword

  vars.mysql_database = "icinga"
  vars.mysql_host = "192.168.33.11"

  assign where match("icinga2*", host.name)
  ignore where host.vars.no_health_check == true
}
```

Here is a different example: the example host configuration in [hosts.conf](04-configuring-icinga-2.md#hosts-conf)
also applies an `ssh` service check. Suppose your host's SSH port is not the default `22`, but `2022`.
You can pass the command parameter as the custom attribute `ssh_port` directly inside the service apply rule
in [services.conf](04-configuring-icinga-2.md#services-conf):

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"
  vars.ssh_port = 2022 //custom command parameter

  assign where (host.address || host.address6) && host.vars.os == "Linux"
}
```

2015 If you prefer this being configured at the host instead of the service, modify the host configuration
2016 object instead. The runtime macro resolving order is described [here](03-monitoring-basics.md#macro-evaluation-order).
```
object Host "icinga2-client1.localdomain" {
  //... other host attributes
  vars.ssh_port = 2022
}
```

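As a minimal sketch of the resolving order, assume that both a host and one of its services set `ssh_port`. The host name `my-ssh-host`, its address and the port values are illustrative assumptions and not part of the shipped example configuration. Custom attributes on the service are looked up before those on the host, so this check would run against port `2222`:

```
object Host "my-ssh-host" {          // hypothetical host
  import "generic-host"
  address = "192.0.2.10"             // placeholder address, adjust to your environment
  vars.ssh_port = 2022               // host-wide value
}

object Service "ssh" {
  import "generic-service"
  host_name = "my-ssh-host"
  check_command = "ssh"
  vars.ssh_port = 2222               // service value is found first and wins
}
```

Removing `vars.ssh_port` from the service makes the check fall back to the host's value.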
2025 #### Passing Check Command Parameters Using Apply For <a id="command-passing-parameters-apply-for"></a>
The host `my-server` with the services generated from the `basic-partitions` dictionary (see
[apply for](03-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
free disk space).

The custom attribute `disk_partitions` can hold either a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.

```
object Host "my-server" {
  import "generic-host"
  address = "127.0.0.1"

  vars.local_disks["basic-partitions"] = {
    disk_partitions = [ "/", "/tmp", "/var", "/home" ]
  }
}

apply Service for (disk => config in host.vars.local_disks) {
  import "generic-service"
  check_command = "my-disk"

  vars += config // copy the dictionary entries (e.g. disk_partitions) into the service

  vars.disk_wfree = "10%"
  vars.disk_cfree = "5%"
}
```

2058 More details on using arrays in custom attributes can be found in
2059 [this chapter](03-monitoring-basics.md#custom-attributes).
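
The `my-disk` check command referenced above is not defined in this section. A minimal sketch of what it could look like follows, assuming the `check_disk` plugin is installed in `PluginDir`. The object body is an illustration, not the definition shipped with the ITL. An array value is expanded into one `-p` argument per element, and the `order` attribute is used here so that the partitions are added after the thresholds:

```
object CheckCommand "my-disk" {      // sketch, not the ITL "disk" command
  command = [ PluginDir + "/check_disk" ]

  arguments = {
    "-w" = "$disk_wfree$"
    "-c" = "$disk_cfree$"
    "-p" = {
      value = "$disk_partitions$"    // single string or array of strings
      repeat_key = true              // repeat -p for every array element (the default)
      order = 1                      // sort after arguments without an explicit order
    }
  }
}
```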
2062 #### Command Arguments <a id="command-arguments"></a>
When you define a check command line using the `command` attribute, Icinga 2
resolves all macros in the static string or array. Sometimes you need to
extend the argument list based on a condition evaluated
at command execution, or to make arguments optional so that they are only set if
Icinga 2 can resolve the macro value.

```
object CheckCommand "http" {
  command = [ PluginDir + "/check_http" ]

  arguments = {
    "-H" = "$http_vhost$"
    "-I" = "$http_address$"
    "-p" = "$http_port$"
    "-S" = {
      set_if = "$http_ssl$"
    }
    "--sni" = {
      set_if = "$http_sni$"
    }
    "-a" = {
      value = "$http_auth_pair$"
      description = "Username:password on sites with basic authentication"
    }
    "--no-body" = {
      set_if = "$http_ignore_body$"
    }
    "-r" = "$http_expect_body_regex$"
    "-w" = "$http_warn_time$"
    "-c" = "$http_critical_time$"
    "-e" = "$http_expect$"
  }

  vars.http_address = "$address$"
  vars.http_ssl = false
  vars.http_sni = false
}
```

The example shows the `check_http` check command defining the most common
arguments. Each of them is optional by default and is omitted if
the value is not set. For example, if the service calling the check command
does not have `vars.http_port` set, it won't get added to the command
line.

If the `vars.http_ssl` custom attribute is set in the service, host or command
object definition, Icinga 2 will add the `-S` argument to the command line based
on the boolean or numeric value of `set_if`. String values are not supported.

If the macro value cannot be resolved, Icinga 2 will not add the defined argument
to the final command argument array. Empty strings for macro values won't omit
the argument.

That way you can use the `check_http` command definition for checks both with and
without SSL enabled, saving you duplicated command definitions.

2121 Details on all available options can be found in the
2122 [CheckCommand object definition](09-object-types.md#objecttype-checkcommand).
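
To make this concrete, here is a small sketch of a service that uses the `http` check command with SSL enabled. The service name, the host `my-server1` and the virtual host name are assumptions chosen for illustration:

```
object Service "https" {
  import "generic-service"
  host_name = "my-server1"             // hypothetical host
  check_command = "http"

  vars.http_vhost = "www.example.com"  // passed as -H
  vars.http_ssl = true                 // set_if evaluates to true, so -S is added
  // vars.http_port is not set, so -p is omitted entirely
}
```

With these values the resulting command line would contain `-H www.example.com`, `-I` with the host's address, and `-S`, but no `-p`.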
2124 ##### Command Arguments: set_if <a id="command-arguments-set-if"></a>
2126 The `set_if` attribute in command arguments can be used to only add
2127 this parameter if the runtime macro value is boolean `true`.
2129 Best practice is to define and pass only [boolean](17-language-reference.md#boolean-literals) values here.
2130 [Numeric](17-language-reference.md#numeric-literals) values are allowed too.
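
A short sketch of both variants is shown here. The argument keys `-x` and `-y` and the attribute names `test_b` and `test_n` are placeholders for illustration only. The comments reflect the assumption that a numeric value is treated like a boolean, with zero meaning "do not add":

```
vars.test_b = true
vars.test_n = 3.0

arguments = {
  "-x" = {
    set_if = "$test_b$"   // boolean true: -x is added
  }
  "-y" = {
    set_if = "$test_n$"   // non-zero number: -y is added
  }
}
```
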
If you accidentally use a [String](17-language-reference.md#string-literals) value, this could lead to
undefined behaviour.

2151 If you still want to work with String values and other variants, you can also
2152 use runtime evaluated functions for `set_if`.
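```
vars.test_s = "1.1.2.1"

arguments = {
  "-z" = {                // example argument key
    set_if = {{
      var str = macro("$test_s$")

      // backslashes are doubled inside the string literal
      return regex("^\\d\\.\\d\\.\\d\\.\\d$", str)
    }}
  }
}
```
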
2166 References: [abbreviated lambda syntax](17-language-reference.md#nullary-lambdas), [macro](18-library-reference.md#scoped-functions-macro), [regex](18-library-reference.md#global-functions-regex).
2169 #### Environment Variables <a id="command-environment-variables"></a>
2171 The `env` command object attribute specifies a list of environment variables with values calculated
2172 from custom attributes which should be exported as environment variables prior to executing the command.
2174 This is useful for example for hiding sensitive information on the command line output
2175 when passing credentials to database checks:
```
object CheckCommand "mysql" {
  command = [ PluginDir + "/check_mysql" ]

  arguments = {
    "-H" = "$mysql_address$"
    "-d" = "$mysql_database$"
  }

  vars.mysql_address = "$address$"
  vars.mysql_database = "icinga"
  vars.mysql_user = "icinga_check"
  vars.mysql_pass = "password"

  env.MYSQLUSER = "$mysql_user$"
  env.MYSQLPASS = "$mysql_pass$"
}
```

2196 The executed command line visible with `ps` or `top` looks like this and hides
2197 the database credentials in the user's environment.
```
/usr/lib/nagios/plugins/check_mysql -H 192.168.56.101 -d icinga
```

> **Note**
>
> If the CheckCommand also supports setting the parameter in the command line,
> make sure to use a different name for the custom attribute. Otherwise Icinga 2
> adds the command line parameter.

If a specific CheckCommand object provided with the [Icinga Template Library](10-icinga-template-library.md#icinga-template-library)
needs additional environment variables, you can import it into a new custom
CheckCommand object and add additional `env` keys. Example for the [mysql_health](10-icinga-template-library.md#plugin-contrib-command-mysql_health)
CheckCommand:

```
object CheckCommand "mysql_health_env" {
  import "mysql_health"

  // https://labs.consol.de/nagios/check_mysql_health/
  env.NAGIOS__SERVICEMYSQL_USER = "$mysql_health_env_username$"
  env.NAGIOS__SERVICEMYSQL_PASS = "$mysql_health_env_password$"
}
```

Then specify the custom attributes `mysql_health_env_username` and `mysql_health_env_password`
in the service object.

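
A service using this command might look like the following sketch. The service name and the host `my-server1` are assumptions, and the credentials simply reuse the placeholder values from the `mysql` example above:

```
object Service "mysql-health" {               // hypothetical service
  import "generic-service"
  host_name = "my-server1"                    // hypothetical host
  check_command = "mysql_health_env"

  // exported to the plugin as NAGIOS__SERVICEMYSQL_USER / NAGIOS__SERVICEMYSQL_PASS
  vars.mysql_health_env_username = "icinga_check"
  vars.mysql_health_env_password = "password"
}
```
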
> **Note**
>
> Keep in mind that the values are still visible with the [debug console](11-cli-commands.md#cli-command-console)
> and the inspect mode in the [Icinga Director](https://icinga.com/docs/director/latest/).

You can also set global environment variables in the application's
sysconfig configuration file, e.g. `HOME` or specific library paths
for Oracle. Beware that these environment variables can be used
by any CheckCommand object and executed plugin and can leak sensitive
information.

2238 ### Notification Commands <a id="notification-commands"></a>
2240 [NotificationCommand](09-object-types.md#objecttype-notificationcommand)
2241 objects define how notifications are delivered to external interfaces
2242 (email, XMPP, IRC, Twitter, etc.).
2243 [NotificationCommand](09-object-types.md#objecttype-notificationcommand)
2244 objects are referenced by [Notification](09-object-types.md#objecttype-notification)
2245 objects using the `command` attribute.
> **Note**
>
> Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled
> in order to execute notification commands.

2252 While it's possible to specify an entire notification command right
2253 in the NotificationCommand object it is generally advisable to create a
2254 shell script in the `/etc/icinga2/scripts` directory and have the
2255 NotificationCommand object refer to that.
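
As a rough sketch of that approach, a NotificationCommand object that only points at a script and hands over a few runtime macros as environment variables could look like this. The object name, the script name and the environment variable names are hypothetical; use whatever your own script expects:

```
object NotificationCommand "my-host-notification" {   // hypothetical name
  // SysconfDir usually resolves to /etc
  command = [ SysconfDir + "/icinga2/scripts/my-host-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    HOSTSTATE = "$host.state$"
    HOSTOUTPUT = "$host.output$"
    USEREMAIL = "$user.email$"
  }
}
```

Keeping the logic in the script means the object definition stays short and the script can be tested on its own from the shell.
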
A fresh Icinga 2 install comes with two example scripts for host
and service notifications by email. Based on the Icinga 2 runtime macros
(such as `$service.output$` for the current check output) it's possible
to send email to the user(s) associated with the notification itself
(`$user.email$`). Feel free to take these scripts as a starting point
for your own individual notification solution, and keep in mind that
nearly everything is technically possible.

2265 Information needed to generate notifications is passed to the scripts as
2266 arguments. The NotificationCommand objects `mail-host-notification` and
2267 `mail-service-notification` correspond to the shell scripts
2268 `mail-host-notification.sh` and `mail-service-notification.sh` in
2269 `/etc/icinga2/scripts` and define default values for arguments. These
2270 defaults can always be overwritten locally.
> **Note**
>
> This example requires the `mail` binary installed on the Icinga 2
> master.

2277 #### Notification Commands in 2.7 <a id="notification-command-2-7"></a>
2279 Icinga 2 v2.7.0 introduced new notification scripts which support both
2280 environment variables and command line parameters.
2282 Therefore the `NotificationCommand` objects inside the [commands.conf](04-configuring-icinga-2.md#commands-conf)
2283 and `Notification` apply rules inside the [notifications.conf](04-configuring-icinga-2.md#notifications-conf)
2284 configuration files have been updated. Your configuration needs to be
2285 updated next to the notification scripts themselves.
> **Note**
>
> Several parameters have been changed. Please review the notification
> script parameters and configuration objects before updating your production
> environment.

2293 The safest way is to incorporate the configuration updates from
2294 v2.7.0 inside the [commands.conf](04-configuring-icinga-2.md#commands-conf) and [notifications.conf](04-configuring-icinga-2.md#notifications-conf)
2295 configuration files.
2297 A quick-fix is shown below:
```
@@ -5,7 +5,8 @@ object NotificationCommand "mail-host-notification" {

   env = {
     NOTIFICATIONTYPE = "$notification.type$"
-    HOSTALIAS = "$host.display_name$"
+    HOSTNAME = "$host.name$"
+    HOSTDISPLAYNAME = "$host.display_name$"
     HOSTADDRESS = "$address$"
     HOSTSTATE = "$host.state$"
     LONGDATETIME = "$icinga.long_date_time$"
@@ -22,8 +23,9 @@ object NotificationCommand "mail-service-notification" {

   env = {
     NOTIFICATIONTYPE = "$notification.type$"
-    SERVICEDESC = "$service.name$"
-    HOSTALIAS = "$host.display_name$"
+    SERVICENAME = "$service.name$"
+    HOSTNAME = "$host.name$"
+    HOSTDISPLAYNAME = "$host.display_name$"
     HOSTADDRESS = "$address$"
     SERVICESTATE = "$service.state$"
     LONGDATETIME = "$icinga.long_date_time$"
```

2325 #### mail-host-notification <a id="mail-host-notification"></a>
2327 The `mail-host-notification` NotificationCommand object uses the
2328 example notification script located in `/etc/icinga2/scripts/mail-host-notification.sh`.
Here is a quick overview of the arguments that can be used. See also [host runtime
macros](03-monitoring-basics.md#host-runtime-macros) for further
information.

  Name                           | Description
  -------------------------------|---------------------------------------
  `notification_date`            | **Required.** Date and time. Defaults to `$icinga.long_date_time$`.
  `notification_hostname`        | **Required.** The host's `FQDN`. Defaults to `$host.name$`.
  `notification_hostdisplayname` | **Required.** The host's display name. Defaults to `$host.display_name$`.
  `notification_hostoutput`      | **Required.** Output from host check. Defaults to `$host.output$`.
  `notification_useremail`       | **Required.** The notification's recipient(s). Defaults to `$user.email$`.
  `notification_hoststate`       | **Required.** Current state of host. Defaults to `$host.state$`.
  `notification_type`            | **Required.** Type of notification. Defaults to `$notification.type$`.
  `notification_address`         | **Optional.** The host's IPv4 address. Defaults to `$address$`.
  `notification_address6`        | **Optional.** The host's IPv6 address. Defaults to `$address6$`.
  `notification_author`          | **Optional.** Comment author. Defaults to `$notification.author$`.
  `notification_comment`         | **Optional.** Comment text. Defaults to `$notification.comment$`.
  `notification_from`            | **Optional.** Define a valid From: string (e.g. `"Icinga 2 Host Monitoring <icinga@example.com>"`). Requires `GNU mailutils` (Debian/Ubuntu) or `mailx` (RHEL/SUSE).
  `notification_icingaweb2url`   | **Optional.** Define URL to your Icinga Web 2 (e.g. `"https://www.example.com/icingaweb2"`)
  `notification_logtosyslog`     | **Optional.** Set `true` to log notification events to syslog; useful for debugging. Defaults to `false`.

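
The optional arguments can be overridden as custom attributes on the Notification object. The following is a sketch under a few assumptions: the rule name is made up, a `User` object called `icingaadmin` is expected to exist, and the `assign` expression is only a placeholder for your own filter:

```
apply Notification "mail-host-admins" to Host {    // hypothetical rule name
  command = "mail-host-notification"
  users = [ "icingaadmin" ]                        // assumes this User object exists

  vars.notification_from = "Icinga 2 Host Monitoring <icinga@example.com>"
  vars.notification_icingaweb2url = "https://www.example.com/icingaweb2"
  vars.notification_logtosyslog = true             // helpful while debugging

  assign where host.address                        // placeholder filter
}
```
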
2351 #### mail-service-notification <a id="mail-service-notification"></a>
2353 The `mail-service-notification` NotificationCommand object uses the
2354 example notification script located in `/etc/icinga2/scripts/mail-service-notification.sh`.
Here is a quick overview of the arguments that can be used. See also [service runtime
macros](03-monitoring-basics.md#service-runtime-macros) for further
information.

  Name                              | Description
  ----------------------------------|---------------------------------------
  `notification_date`               | **Required.** Date and time. Defaults to `$icinga.long_date_time$`.
  `notification_hostname`           | **Required.** The host's `FQDN`. Defaults to `$host.name$`.
  `notification_servicename`        | **Required.** The service name. Defaults to `$service.name$`.
  `notification_hostdisplayname`    | **Required.** Host display name. Defaults to `$host.display_name$`.
  `notification_servicedisplayname` | **Required.** Service display name. Defaults to `$service.display_name$`.
  `notification_serviceoutput`      | **Required.** Output from service check. Defaults to `$service.output$`.
  `notification_useremail`          | **Required.** The notification's recipient(s). Defaults to `$user.email$`.
  `notification_servicestate`       | **Required.** Current state of service. Defaults to `$service.state$`.
  `notification_type`               | **Required.** Type of notification. Defaults to `$notification.type$`.
  `notification_address`            | **Optional.** The host's IPv4 address. Defaults to `$address$`.
  `notification_address6`           | **Optional.** The host's IPv6 address. Defaults to `$address6$`.
  `notification_author`             | **Optional.** Comment author. Defaults to `$notification.author$`.
  `notification_comment`            | **Optional.** Comment text. Defaults to `$notification.comment$`.
  `notification_from`               | **Optional.** Define a valid From: string (e.g. `"Icinga 2 Host Monitoring <icinga@example.com>"`). Requires `GNU mailutils` (Debian/Ubuntu) or `mailx` (RHEL/SUSE).
  `notification_icingaweb2url`      | **Optional.** Define URL to your Icinga Web 2 (e.g. `"https://www.example.com/icingaweb2"`)
  `notification_logtosyslog`        | **Optional.** Set `true` to log notification events to syslog; useful for debugging. Defaults to `false`.

2380 ## Dependencies <a id="dependencies"></a>
2382 Icinga 2 uses host and service [Dependency](09-object-types.md#objecttype-dependency) objects
2383 for determining their network reachability.
2385 A service can depend on a host, and vice versa. A service has an implicit
2386 dependency (parent) to its host. A host to host dependency acts implicitly
2387 as host parent relation.
2388 When dependencies are calculated, not only the immediate parent is taken into
2389 account but all parents are inherited.
2391 The `parent_host_name` and `parent_service_name` attributes are mandatory for
2392 service dependencies, `parent_host_name` is required for host dependencies.
2393 [Apply rules](03-monitoring-basics.md#using-apply) will allow you to
2394 [determine these attributes](03-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
2395 dynamic fashion if required.
```
parent_host_name = "core-router"
parent_service_name = "uplink-port"
```

Notifications are suppressed by default if a host or service becomes unreachable.
You can control that option by defining the `disable_notifications` attribute.

```
disable_notifications = false
```

If the dependency should be triggered in the parent object's soft state, you
need to set `ignore_soft_states` to `false`.

The dependency state filter must be defined based on the parent object being
either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).

The following example will make the dependency fail and trigger it if the parent
object is **not** in one of these states:

```
states = [ OK, Critical, Unknown ]
```

2422 > **In other words**
2424 > If the parent service object changes into the `Warning` state, this
2425 > dependency will fail and render all child objects (hosts or services) unreachable.
You can determine the child's reachability by querying the `is_reachable` attribute,
for example in the [DB IDO](24-appendix.md#schema-db-ido-extensions).

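
Putting the attributes discussed above together, a complete service dependency could be sketched as follows. The rule name and the custom attribute `host.vars.uplink` used in the `assign` expression are assumptions made for this illustration:

```
apply Dependency "uplink" to Service {             // hypothetical rule name
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  states = [ OK, Critical, Unknown ]   // the dependency fails while the parent is in Warning
  ignore_soft_states = false           // react to the parent's soft states as well
  disable_notifications = false        // keep notifying for unreachable children

  assign where host.vars.uplink == "core-router"   // assumed custom attribute
}
```
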
2430 ### Implicit Dependencies for Services on Host <a id="dependencies-implicit-host-service"></a>
Icinga 2 automatically adds an implicit dependency for services on their host. That way
service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
`states = [ Up ]` for all service objects.

Service checks are still executed. If you want to prevent them from happening, you can
apply the following dependency to all services setting their host as `parent_host_name`
and disabling the checks. `assign where true` matches on all `Service` objects.

```
apply Dependency "disable-host-service-checks" to Service {
  parent_host_name = host.name
  disable_checks = true
  assign where true
}
```

2448 ### Dependencies for Network Reachability <a id="dependencies-network-reachability"></a>
2450 A common scenario is the Icinga 2 server behind a router. Checking internet
2451 access by pinging the Google DNS server `google-dns` is a common method, but
2452 will fail in case the `dsl-router` host is down. Therefore the example below
2453 defines a host dependency which acts implicitly as parent relation too.
Furthermore the host may be reachable but ping probes are dropped by the
router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
further checks for the `ping4` service on host `google-dns` should
be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.

```
object Host "dsl-router" {
  import "generic-host"
  address = "192.168.1.1"
}

object Host "google-dns" {
  import "generic-host"
  address = "8.8.8.8"
}

apply Service "ping4" {
  import "generic-service"

  check_command = "ping4"

  assign where host.address
}

apply Dependency "internet" to Host {
  parent_host_name = "dsl-router"
  disable_checks = true
  disable_notifications = true

  assign where host.name != "dsl-router"
}

apply Dependency "internet" to Service {
  parent_host_name = "dsl-router"
  parent_service_name = "ping4"
  disable_checks = true

  assign where host.name != "dsl-router"
}
```

2496 ### Apply Dependencies based on Custom Attributes <a id="dependencies-apply-custom-attributes"></a>
You can use [apply rules](03-monitoring-basics.md#using-apply) to set parent or
child attributes, e.g. `parent_host_name`, to other objects'
attributes.

A common example are virtual machines hosted on a master. The object
name of that master is auto-generated from your CMDB or VMware inventory
into the host's custom attributes (or a generic template for your
cloud).

Define your master host object:

```
object Host "master.example.com" {
  import "generic-host"
}
```

Add a generic template defining all common host attributes:

```
/* generic template for your virtual machines */
template Host "generic-vm" {
  import "generic-host"
}
```

Add a template for all hosts on your example.com cloud setting
custom attribute `vm_parent` to `master.example.com`:

```
template Host "generic-vm-example.com" {
  import "generic-vm"
  vars.vm_parent = "master.example.com"
}
```

Define your guest hosts:

```
object Host "www.example1.com" {
  import "generic-vm-example.com"
}

object Host "www.example2.com" {
  import "generic-vm-example.com"
}
```

2547 Apply the host dependency to all child hosts importing the
2548 `generic-vm` template and set the `parent_host_name`
2549 to the previously defined custom attribute `host.vars.vm_parent`.
```
apply Dependency "vm-host-to-parent-master" to Host {
  parent_host_name = host.vars.vm_parent
  assign where "generic-vm" in host.templates
}
```

You can extend this example, and make your services depend on the
`master.example.com` host too. Their local scope allows you to use
`host.vars.vm_parent` similar to the example above.

```
apply Dependency "vm-service-to-parent-master" to Service {
  parent_host_name = host.vars.vm_parent
  assign where "generic-vm" in host.templates
}
```

2569 That way you don't need to wait for your guest hosts becoming
2570 unreachable when the master host goes down. Instead the services
2571 will detect their reachability immediately when executing checks.
> **Note**
>
> This method with setting locally scoped variables only works in
> apply rules, but not in object definitions.

2579 ### Dependencies for Agent Checks <a id="dependencies-agent-checks"></a>
2581 Another classic example are agent based checks. You would define a health check
2582 for the agent daemon responding to your requests, and make all other services
2583 querying that daemon depend on that health check.
2585 The following configuration defines two nrpe based service checks `nrpe-load`
2586 and `nrpe-disk` applied to the host `nrpe-server` [matched](18-library-reference.md#global-functions-match)
2587 by its name. The health check is defined as `nrpe-health` service.
```
apply Service "nrpe-health" {
  import "generic-service"
  check_command = "nrpe"
  assign where match("nrpe-*", host.name)
}

apply Service "nrpe-load" {
  import "generic-service"
  check_command = "nrpe"
  vars.nrpe_command = "check_load"
  assign where match("nrpe-*", host.name)
}

apply Service "nrpe-disk" {
  import "generic-service"
  check_command = "nrpe"
  vars.nrpe_command = "check_disk"
  assign where match("nrpe-*", host.name)
}

object Host "nrpe-server" {
  import "generic-host"
  address = "192.168.1.5"
}

apply Dependency "disable-nrpe-checks" to Service {
  parent_host_name = host.name // the parent service lives on the same host
  parent_service_name = "nrpe-health"

  disable_checks = true
  disable_notifications = true
  assign where service.check_command == "nrpe"
  ignore where service.name == "nrpe-health"
}
```

The `disable-nrpe-checks` dependency is applied to all services
on the `nrpe-server` host using the `nrpe` check_command attribute
but not the `nrpe-health` service itself.

2631 ### Event Commands <a id="event-commands"></a>
2633 Unlike notifications, event commands for hosts/services are called on every
2634 check execution if one of these conditions matches:
2636 * The host/service is in a [soft state](03-monitoring-basics.md#hard-soft-states)
2637 * The host/service state changes into a [hard state](03-monitoring-basics.md#hard-soft-states)
2638 * The host/service state recovers from a [soft or hard state](03-monitoring-basics.md#hard-soft-states) to [OK](03-monitoring-basics.md#service-states)/[Up](03-monitoring-basics.md#host-states)
2640 [EventCommand](09-object-types.md#objecttype-eventcommand) objects are referenced by
2641 [Host](09-object-types.md#objecttype-host) and [Service](09-object-types.md#objecttype-service) objects
2642 with the `event_command` attribute.
Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
available through runtime variables. Runtime macros such as `$service.state_type$`
and `$service.state$` will be processed by Icinga 2 and help with fine-granular
control over the events being triggered.

If the host/service is located on a client as [command endpoint](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
the event command will be executed on the client itself (similar to the check
command).

Common use case scenarios are a failing HTTP check which requires an immediate
restart via event command. Another example would be an application that is not
responding and therefore requires a restart. You can also use event handlers
to forward more details on state changes and events than the typical notification
alerts provide.

2660 #### Use Event Commands to Send Information from the Master <a id="event-command-send-information-from-master"></a>
2662 This example sends a web request from the master node to an external tool
2663 for every event triggered on a `businessprocess` service.
2665 Define an [EventCommand](09-object-types.md#objecttype-eventcommand)
2666 object `send_to_businesstool` which sends state changes to the external tool.
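```
object EventCommand "send_to_businesstool" {
  // Assumed command line: the expected request shown further down is a curl PUT.
  command = [ "/usr/bin/curl", "-s", "-X", "PUT" ]

  arguments = {
    "url" = {   // the key name is arbitrary because skip_key omits it
      value ="$businesstool_url$"
      skip_key = true
    }
    "-d" = "$businesstool_message$"
  }

  vars.businesstool_url = "http://localhost:8080/businesstool"
  vars.businesstool_message = "$host.name$ $service.name$ $service.state$ $service.state_type$ $service.check_attempt$"
}
```
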
2689 Set the `event_command` attribute to `send_to_businesstool` on the Service.
```
object Service "businessprocess" {
  host_name = "businessprocess"

  check_command = "icingacli-businessprocess"
  vars.icingacli_businessprocess_process = "icinga"
  vars.icingacli_businessprocess_config = "training"

  event_command = "send_to_businesstool"
}
```

In order to test this scenario you can run a listener on port `8080`
(for example `nc -l 8080`).

This allows you to catch the web request. You can also enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output)
and search for the event command execution log message.

```
tail -f /var/log/icinga2/debug.log | grep EventCommand
```

Feed in a check result via REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result)
or via Icinga Web 2.

Expected result:

```
PUT /businesstool HTTP/1.1
User-Agent: curl/7.29.0
Host: localhost:8080
Content-Type: application/x-www-form-urlencoded

businessprocess businessprocess CRITICAL SOFT 1
```

2733 #### Use Event Commands to Restart Service Daemon via Command Endpoint on Linux <a id="event-command-restart-service-daemon-command-endpoint-linux"></a>
2735 This example triggers a restart of the `httpd` service on the local system
2736 when the `procs` service check executed via Command Endpoint fails. It only
2737 triggers if the service state is `Critical` and attempts to restart the
2738 service before a notification is sent.
Requirements:

* Icinga 2 as client on the remote node
* icinga user with sudo permissions to the httpd daemon

Example on CentOS 7:

```
icinga ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart httpd
```

2752 Note: Distributions might use a different name. On Debian/Ubuntu the service is called `apache2`.
Define an [EventCommand](09-object-types.md#objecttype-eventcommand) object `restart_service`
which triggers local service restarts. Put it into a [global zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync)
to sync its configuration to all clients.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-templates/eventcommands.conf

object EventCommand "restart_service" {
  command = [ PluginDir + "/restart_service" ]

  arguments = {
    "-s" = "$service.state$"
    "-t" = "$service.state_type$"
    "-a" = "$service.check_attempt$"
    "-S" = "$restart_service$"
  }

  vars.restart_service = "$procs_command$"
}
```

This event command triggers the following script which restarts the service.
The script only restarts the service if the service state is `CRITICAL`. Warning and Unknown states
are ignored as they do not indicate an immediate failure.

2780 [root@icinga2-client1.localdomain /]# vim /usr/lib64/nagios/plugins/restart_service
2784 while getopts "s:t:a:S:" opt; do
2787 servicestate=$OPTARG
2790 servicestatetype=$OPTARG
2793 serviceattempt=$OPTARG
2801 if ( [ -z $servicestate ] || [ -z $servicestatetype ] || [ -z $serviceattempt ] || [ -z $service ] ); then
2802 echo "USAGE: $0 -s servicestate -z servicestatetype -a serviceattempt -S service"
2805 # Only restart on the third attempt of a critical event
2806 if ( [ $servicestate == "CRITICAL" ] && [ $servicestatetype == "SOFT" ] && [ $serviceattempt -eq 3 ] ); then
2807 sudo /usr/bin/systemctl restart $service
2811 [root@icinga2-client1.localdomain /]# chmod +x /usr/lib64/nagios/plugins/restart_service
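To check the restart condition without touching a real daemon, the decision can be isolated in a small function and exercised with a few state combinations. This is a minimal sketch; the function name `should_restart` is hypothetical and not part of the script above.

```shell
#!/bin/bash

# Hypothetical helper: succeeds only for the third SOFT CRITICAL check attempt,
# mirroring the condition used in the restart script.
should_restart() {
  local state=$1 state_type=$2 attempt=$3
  [ "$state" = "CRITICAL" ] && [ "$state_type" = "SOFT" ] && [ "$attempt" -eq 3 ]
}

# Dry run over a few state combinations; nothing is restarted here.
for args in "CRITICAL SOFT 1" "CRITICAL SOFT 3" "WARNING SOFT 3" "CRITICAL HARD 4"; do
  # $args is intentionally unquoted so it splits into three arguments.
  if should_restart $args; then
    echo "$args: restart"
  else
    echo "$args: skip"
  fi
done
```

Only the `CRITICAL SOFT 3` combination reports a restart; all others are skipped.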
Add a service on the master node which is executed via command endpoint on the client.
Set the `event_command` attribute to `restart_service`, the name of the previously defined
EventCommand object.

    [root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-client1.localdomain.conf

    object Service "Process httpd" {
      check_command = "procs"
      event_command = "restart_service"
      max_check_attempts = 4

      host_name = "icinga2-client1.localdomain"
      command_endpoint = "icinga2-client1.localdomain"

      vars.procs_command = "httpd"
      vars.procs_warning = "1:10"
      vars.procs_critical = "1:"
    }
In order to test this configuration, stop the `httpd` daemon on the remote host `icinga2-client1.localdomain`.

    [root@icinga2-client1.localdomain /]# systemctl stop httpd

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) and search for the
executed command line.

    [root@icinga2-client1.localdomain /]# tail -f /var/log/icinga2/debug.log | grep restart_service
#### Use Event Commands to Restart Service Daemon via Command Endpoint on Windows <a id="event-command-restart-service-daemon-command-endpoint-windows"></a>

This example triggers a restart of the `httpd` service on the remote system
when the `service-windows` service check executed via Command Endpoint fails.
It only triggers if the service state is `Critical` and attempts to restart the
service before a notification is sent.

Requirements:

* Icinga 2 as client on the remote node
* Icinga 2 service with permissions to execute PowerShell scripts (which is the default)
Define an [EventCommand](09-object-types.md#objecttype-eventcommand) object `restart_service-windows`
which allows triggering local service restarts. Put it into a [global zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync)
to sync its configuration to all clients.

    [root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-templates/eventcommands.conf

    object EventCommand "restart_service-windows" {
      command = [
        "C:\\Windows\\SysWOW64\\WindowsPowerShell\\v1.0\\powershell.exe",
        PluginDir + "/restart_service.ps1"
      ]

      arguments = {
        "-ServiceState" = "$service.state$"
        "-ServiceStateType" = "$service.state_type$"
        "-ServiceAttempt" = "$service.check_attempt$"
        "-Service" = "$restart_service$"
        "; exit" = {
          order = 99
          value = "$$LASTEXITCODE"
        }
      }

      vars.restart_service = "$service_win_service$"
    }
This event command triggers the following script which restarts the service.
The script only restarts the service if the service state is `CRITICAL`. Warning and Unknown states
are ignored as they do not indicate an immediate failure.

Add the `restart_service.ps1` PowerShell script into `C:\Program Files\Icinga2\sbin`:

    param(
      [string]$Service = '',
      [string]$ServiceState = '',
      [string]$ServiceStateType = '',
      [int]$ServiceAttempt = 0
    )

    if (!$Service -Or !$ServiceState -Or !$ServiceStateType -Or !$ServiceAttempt) {
      $scriptName = GCI $MyInvocation.PSCommandPath | Select -Expand Name;
      Write-Host "USAGE: $scriptName -ServiceState servicestate -ServiceStateType servicestatetype -ServiceAttempt serviceattempt -Service service" -ForegroundColor red;
      exit 3;
    }

    # Only restart on the third attempt of a critical event
    if ($ServiceState -eq "CRITICAL" -And $ServiceStateType -eq "SOFT" -And $ServiceAttempt -eq 3) {
      Restart-Service $Service;
    }

    exit 0;
Add a service on the master node which is executed via command endpoint on the client.
Set the `event_command` attribute to `restart_service-windows`, the name of the previously defined
EventCommand object.

    [root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-client2.localdomain.conf

    object Service "Service httpd" {
      check_command = "service-windows"
      event_command = "restart_service-windows"
      max_check_attempts = 4

      host_name = "icinga2-client2.localdomain"
      command_endpoint = "icinga2-client2.localdomain"

      vars.service_win_service = "httpd"
    }
In order to test this configuration, stop the `httpd` service on the remote host `icinga2-client2.localdomain`.

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) and search for the
executed command line in `C:\ProgramData\icinga2\var\log\icinga2\debug.log`.
#### Use Event Commands to Restart Service Daemon via SSH <a id="event-command-restart-service-daemon-ssh"></a>

This example triggers a restart of the `httpd` daemon
via SSH when the `http` service check fails.

Requirements:

* SSH connection allowed (firewall, packet filters)
* icinga user with public key authentication
* icinga user with sudo permissions to restart the httpd daemon

Example:

    # ls /home/icinga/.ssh/

    icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
Define a generic [EventCommand](09-object-types.md#objecttype-eventcommand) object `event_by_ssh`
which can be used for all event commands triggered using SSH:

    [root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/local_eventcommands.conf

    /* pass event commands through ssh */
    object EventCommand "event_by_ssh" {
      command = [ PluginDir + "/check_by_ssh" ]

      arguments = {
        "-H" = "$event_by_ssh_address$"
        "-p" = "$event_by_ssh_port$"
        "-C" = "$event_by_ssh_command$"
        "-l" = "$event_by_ssh_logname$"
        "-i" = "$event_by_ssh_identity$"
        "-q" = {
          set_if = "$event_by_ssh_quiet$"
        }
        "-w" = "$event_by_ssh_warn$"
        "-c" = "$event_by_ssh_crit$"
        "-t" = "$event_by_ssh_timeout$"
      }

      vars.event_by_ssh_address = "$address$"
      vars.event_by_ssh_quiet = false
    }
The actual event command only sets the `event_by_ssh_command` custom attribute.
The `event_by_ssh_service` custom attribute takes care of passing the correct
daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
is only restarted when the service is not in an `OK` state.

    object EventCommand "event_by_ssh_restart_service" {
      import "event_by_ssh"

      //only restart the daemon if state > 0 (not-ok)
      //requires sudo permissions for the icinga user
      vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo systemctl restart $event_by_ssh_service$"
    }
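The effect of the `test ... -gt 0 &&` guard can be tried locally before involving SSH or sudo. The following sketch renders the command string as it would look after macro expansion and runs it for each service state ID. The helper name `render_event_command` is hypothetical, and `echo` stands in for `sudo` so that nothing is actually restarted.

```shell
#!/bin/bash

# Hypothetical helper: render the remote command as it would look after macro
# expansion. For a safe dry run, `echo` stands in for `sudo`.
render_event_command() {
  local state_id=$1 daemon=$2
  printf 'test %s -gt 0 && echo systemctl restart %s' "$state_id" "$daemon"
}

# State IDs: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
for state_id in 0 1 2 3; do
  # `|| true` keeps the loop going when the guard fails for state 0.
  output=$(sh -c "$(render_event_command "$state_id" apache2)" || true)
  echo "state_id=$state_id: ${output:-no action}"
done
```

For state ID 0 the guard fails and no command is run; for every non-OK state the restart command would be executed.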
Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
which service should be restarted using the `event_by_ssh_service` attribute.

    apply Service "http" {
      import "generic-service"
      check_command = "http"

      event_command = "event_by_ssh_restart_service"
      vars.event_by_ssh_service = "$host.vars.httpd_name$"

      //vars.event_by_ssh_logname = "icinga"
      //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"

      assign where host.vars.httpd_name
    }
Specify the `httpd_name` custom attribute on the host to assign the
service and to set the name of the daemon restarted by the event handler.

    object Host "remote-http-host" {
      import "generic-host"
      address = "192.168.1.100"

      vars.httpd_name = "apache2"
    }
In order to test this configuration, stop the `httpd` daemon on the remote host `icinga2-client1.localdomain`.

    [root@icinga2-client1.localdomain /]# systemctl stop httpd

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) and search for the
executed command line.

    [root@icinga2-client1.localdomain /]# tail -f /var/log/icinga2/debug.log | grep by_ssh