1 # <a id="monitoring-basics"></a> Monitoring Basics
3 This part of the Icinga 2 documentation provides an overview of all the basic
4 monitoring concepts you need to know to run Icinga 2.
6 ## <a id="hosts-services"></a> Hosts and Services
8 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
9 and services can be virtually anything which can be checked in some way:
11 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
15 * Other local or network-accessible services
17 Host objects provide a mechanism to group services that are running
18 on the same physical device.
20 Here is an example of a host object which defines two child services:

    object Host "my-server1" {
      address = "192.0.2.10"  // example address; replace with your host's address
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http"
    }

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

43 The `address` attribute is used by check commands to determine which network
44 address is associated with the host object.
46 Details on troubleshooting check problems can be found [here](13-troubleshooting.md#troubleshooting).
48 ### <a id="host-states"></a> Host States
50 Hosts can be in any of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.

57 ### <a id="service-states"></a> Service States
59 Services can be in any of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

68 ### <a id="hard-soft-states"></a> Hard and Soft States
When Icinga detects a problem with a host or service, it re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed.
  SOFT        | The host/service has recently changed state and is being re-checked.

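
As a rough sketch of how the re-check settings interact, the host below would be re-checked every 30 seconds while it is in a `SOFT` state and would switch to a `HARD` state with the third consecutive failed check. The host name, address and interval values are assumptions chosen for illustration only.

```icinga2
object Host "soft-hard-example" {
  address = "192.0.2.10"       // example address (assumption)
  check_command = "hostalive"

  check_interval = 5m          // regular check interval while the state is stable
  retry_interval = 30s         // interval used for re-checks during a SOFT state
  max_check_attempts = 3       // the third failed check in a row results in a HARD state
}
```

With these values a transient failure that clears within the first two re-checks never reaches a `HARD` state and therefore does not trigger a notification.
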
83 ### <a id="host-service-checks"></a> Host and Service Checks
Hosts and services determine their state by running checks at regular intervals.

    object Host "router" {
      check_command = "hostalive"
      address = "192.0.2.1"  // example address; replace with your router's address
    }

92 The `hostalive` command is one of several built-in check commands. It sends ICMP
93 echo requests to the IP address specified in the `address` attribute to determine
94 whether a host is online.
A number of other [built-in check commands](#plugin-check-commands) are also
available. In addition to these commands, the next few chapters explain in
detail how to set up your own check commands.
101 ## <a id="object-inheritance-using-templates"></a> Templates
Templates may be used to apply a set of identical attributes to more than one
object:

    template Service "generic-service" {
      max_check_attempts = 3
      enable_perfdata = true
    }

    apply Service "ping4" {
      import "generic-service"

      check_command = "ping4"

      assign where host.address
    }

    apply Service "ping6" {
      import "generic-service"

      check_command = "ping6"

      assign where host.address6
    }

130 In this example the `ping4` and `ping6` services inherit properties from the
131 template `generic-service`.
Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object if necessary.

You can also import existing non-template objects. Note that templates
and objects share the same namespace, i.e. you can't define a template
that has the same name as an object.
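
A minimal sketch of overriding an inherited attribute, assuming the `generic-service` template from this section is defined. The service name `ping4-patient` and the value `5` are made up for illustration.

```icinga2
apply Service "ping4-patient" {
  // Import first, so that the assignments below take precedence.
  import "generic-service"

  check_command = "ping4"
  max_check_attempts = 5   // overrides the value inherited from "generic-service"

  assign where host.address
}
```

All other attributes, such as `enable_perfdata`, keep the values defined in the template.
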
142 ## <a id="custom-attributes"></a> Custom Attributes
144 In addition to built-in attributes you can define your own attributes:

    object Host "localhost" {
      vars.ssh_port = 2222  // example custom attribute; name and value are illustrative
    }

150 Valid values for custom attributes include:
152 * Strings and numbers
153 * Arrays and dictionaries
156 ### <a id="custom-attributes-functions"></a> Functions as Custom Attributes
158 Icinga lets you specify functions for custom attributes. The special case here
159 is that whenever Icinga needs the value for such a custom attribute it runs
160 the function and uses whatever value the function returns:

    object CheckCommand "random-value" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_dummy", "0", "$text$" ]

      vars.text = {{ Math.random() * 100 }}
    }

170 This example uses the [abbreviated lambda syntax](16-language-reference.md#nullary-lambdas).
172 These functions have access to a number of variables:
174 Variable | Description
175 -------------|---------------
176 user | The User object (for notifications).
177 service | The Service object (for service checks/notifications/event handlers).
178 host | The Host object.
179 command | The command object (e.g. a CheckCommand object for checks).

Here's an example:

    vars.text = {{ host.check_interval }}

185 In addition to these variables the `macro` function can be used to retrieve the
186 value of arbitrary macro expressions:

    vars.text = {{
      if (macro("$address$") == "127.0.0.1") {
        log("Running a check for localhost!")
      }

      return "Some text"  // illustrative return value
    }}

196 The [Object Accessor Functions](17-library-reference.md#object-accessor-functions) can be used to retrieve references
197 to other objects by name.
199 ## <a id="runtime-macros"></a> Runtime Macros
Macros can be used to access other objects' attributes at runtime. For example, they
are used in command definitions to figure out which IP address a check should be
run against:

    object CheckCommand "my-ping" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]

      arguments = {
        "-w" = "$ping_wrta$,$ping_wpl$%"
        "-c" = "$ping_crta$,$ping_cpl$%"
        "-p" = "$ping_packets$"
      }

      vars.ping_address = "$address$"
      // Threshold defaults (vars.ping_wrta, vars.ping_wpl, vars.ping_crta,
      // vars.ping_cpl) also belong here; their values are not shown.
      vars.ping_packets = 5
    }

    object Host "router" {
      check_command = "my-ping"
      address = "192.0.2.1"  // example address
    }

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

233 We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
234 automatically tries to find the closest match for the attribute you specified. The
235 exact rules for this are explained in the next section.
238 ### <a id="macro-evaluation-order"></a> Evaluation Order
240 When executing commands Icinga 2 checks the following objects in this order to look
241 up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom attributes in the `Vars` constant

This evaluation order allows you to define default values for custom attributes
in your command objects.

Here's how you can override the custom attribute `ping_packets` from the previous
example:

    object Service "ping" {
      host_name = "localhost"
      check_command = "my-ping"

      vars.ping_packets = 10 // Overrides the default value of 5 given in the command
    }

If a custom attribute isn't defined anywhere, an empty value is used and a warning is
written to the Icinga 2 log.

You can also directly refer to a specific attribute, thereby ignoring these evaluation
rules, by specifying the full attribute name:

    $service.vars.ping_wrta$

This retrieves the value of the `ping_wrta` custom attribute for the service. It
returns an empty value if the service does not have such a custom attribute, regardless
of whether another object such as the host has this attribute.
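
The following sketch shows what a command using only fully qualified macros might look like. The command name `my-ping-strict` is hypothetical. Because every threshold is read from `service.vars`, values defined on the host or on the command itself are not used as a fallback.

```icinga2
object CheckCommand "my-ping-strict" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_ping", "-H", "$address$" ]

  arguments = {
    // Only the service's own custom attributes are consulted here.
    "-w" = "$service.vars.ping_wrta$,$service.vars.ping_wpl$%"
    "-c" = "$service.vars.ping_crta$,$service.vars.ping_cpl$%"
  }
}
```

A service using this command therefore has to define all four `ping_*` custom attributes itself.
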
275 ### <a id="host-runtime-macros"></a> Host Runtime Macros
The following host macros are available in all commands that are executed for
hosts or services:

  Name                         | Description
  -----------------------------|--------------
282 host.name | The name of the host object.
283 host.display_name | The value of the `display_name` attribute.
284 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
285 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
286 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
287 host.check_attempt | The current check attempt number.
288 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
289 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
290 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
291 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
292 host.last_state_change | The last state change's timestamp.
293 host.downtime_depth | The number of active downtimes.
294 host.duration_sec | The time since the last state change.
295 host.latency | The host's check latency.
296 host.execution_time | The host's check execution time.
297 host.output | The last check's output.
298 host.perfdata | The last check's performance data.
299 host.last_check | The timestamp when the last check was executed.
300 host.check_source | The monitoring instance that performed the last check.
301 host.num_services | Number of services associated with the host.
302 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
303 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
304 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
305 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
307 ### <a id="service-runtime-macros"></a> Service Runtime Macros
The following service macros are available in all commands that are executed for
services:

  Name                       | Description
  ---------------------------|--------------
314 service.name | The short name of the service object.
315 service.display_name | The value of the `display_name` attribute.
316 service.check_command | The short name of the command along with any arguments to be used for the check.
317 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
318 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
319 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
320 service.check_attempt | The current check attempt number.
321 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
322 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
323 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
324 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
325 service.last_state_change | The last state change's timestamp.
326 service.downtime_depth | The number of active downtimes.
327 service.duration_sec | The time since the last state change.
328 service.latency | The service's check latency.
329 service.execution_time | The service's check execution time.
330 service.output | The last check's output.
331 service.perfdata | The last check's performance data.
332 service.last_check | The timestamp when the last check was executed.
333 service.check_source | The monitoring instance that performed the last check.
335 ### <a id="command-runtime-macros"></a> Command Runtime Macros
The following macros are available in all commands:

  Name                   | Description
  -----------------------|--------------
341 command.name | The name of the command object.
343 ### <a id="user-runtime-macros"></a> User Runtime Macros
The following macros are available in all commands that are executed for
users:

  Name                   | Description
  -----------------------|--------------
350 user.name | The name of the user object.
351 user.display_name | The value of the display_name attribute.
353 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros

  Name                   | Description
  -----------------------|--------------
357 notification.type | The type of the notification.
358 notification.author | The author of the notification comment, if existing.
359 notification.comment | The comment of the notification, if existing.
361 ### <a id="global-runtime-macros"></a> Global Runtime Macros
363 The following macros are available in all executed commands:

  Name                   | Description
  -----------------------|--------------
367 icinga.timet | Current UNIX timestamp.
368 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
369 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
370 icinga.date | Current date. Example: `2014-01-03`
371 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
372 icinga.uptime | Current uptime of the Icinga 2 process.
374 The following macros provide global statistics:

  Name                              | Description
  ----------------------------------|--------------
378 icinga.num_services_ok | Current number of services in state 'OK'.
379 icinga.num_services_warning | Current number of services in state 'Warning'.
380 icinga.num_services_critical | Current number of services in state 'Critical'.
381 icinga.num_services_unknown | Current number of services in state 'Unknown'.
382 icinga.num_services_pending | Current number of pending services.
383 icinga.num_services_unreachable | Current number of unreachable services.
384 icinga.num_services_flapping | Current number of flapping services.
385 icinga.num_services_in_downtime | Current number of services in downtime.
386 icinga.num_services_acknowledged | Current number of acknowledged service problems.
387 icinga.num_hosts_up | Current number of hosts in state 'Up'.
388 icinga.num_hosts_down | Current number of hosts in state 'Down'.
389 icinga.num_hosts_unreachable | Current number of unreachable hosts.
390 icinga.num_hosts_flapping | Current number of flapping hosts.
391 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
392 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
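
To illustrate how the runtime macros listed above can be consumed, here is a hedged sketch of a notification command that passes some of them to a script as environment variables. The command name, the script path and the environment variable names are assumptions. The sketch also assumes that the `plugin-notification-command` template from the Icinga Template Library is available.

```icinga2
object NotificationCommand "example-service-notification" {
  import "plugin-notification-command"

  // Hypothetical script path; replace it with your own notification script.
  command = [ SysconfDir + "/icinga2/scripts/example-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    SERVICENAME = "$service.name$"
    SERVICESTATE = "$service.state$"
    SERVICEOUTPUT = "$service.output$"
    USERNAME = "$user.name$"
    LONGDATETIME = "$icinga.long_date_time$"
  }
}
```

The script can then read these values from its environment without having to parse command-line arguments.
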
397 ## <a id="using-apply"></a> Apply Rules
Instead of assigning each object ([Service](6-object-types.md#objecttype-service),
[Notification](6-object-types.md#objecttype-notification), [Dependency](6-object-types.md#objecttype-dependency),
[ScheduledDowntime](6-object-types.md#objecttype-scheduleddowntime))
explicitly through an attribute identifier such as `host_name`, objects can be [applied](16-language-reference.md#apply) based on rules.
404 Before you start using the apply rules keep the following in mind:
* Define the best match.
    * A set of unique [custom attributes](#custom-attributes-apply) for these hosts/services?
    * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
    * A generic pattern [match](16-language-reference.md#function-calls) on the host/service name?
    * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](16-language-reference.md#expression-operators)
* All expressions must return a boolean value (for example, an empty string is treated as `false`).
415 > You can set/override object attributes in apply rules using the respectively available
416 > objects in that scope (host and/or service objects).
[Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
not only for matching on their existence or values in apply expressions, but also to assign
("inherit") their values to the objects generated from apply rules.
422 * [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
423 * [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
* [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
425 * [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)
A more advanced example is using [apply with for loops on arrays or
dictionaries](#using-apply-for), for example provided by
[custom attributes](#custom-attributes-apply) or groups.
433 > Building configuration in that dynamic way requires detailed information
434 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
435 > after successful [configuration validation](8-cli-commands.md#config-validation).
438 ### <a id="using-apply-expressions"></a> Apply Rules Expressions
You can use simple or advanced combinations of apply rule expressions. An
expression matches only if it evaluates to the boolean `true` value. An empty string,
for instance, is interpreted as `false`. In a similar fashion, undefined
attributes evaluate to `false`. For example, the following expression evaluates to `false`:

    assign where host.vars.attribute_does_not_exist

449 Multiple `assign where` condition rows are evaluated as `OR` condition.
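
As a small sketch of this `OR` behaviour, the rule below matches a host if either row is true. The service name `ssh-example` and the `os` values are assumptions for illustration, and the `generic-service` template is assumed to exist.

```icinga2
apply Service "ssh-example" {
  import "generic-service"
  check_command = "ssh"

  // Either condition is sufficient: the two rows are combined with OR.
  assign where host.vars.os == "Linux"
  assign where host.vars.os == "FreeBSD"

  // Equivalent single expression:
  // assign where host.vars.os == "Linux" || host.vars.os == "FreeBSD"
}
```
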
451 You can combine multiple expressions for matching only a subset of objects. In some cases,
452 you want to be able to add more than one assign/ignore where expression which matches
453 a specific condition. To achieve this you can use the logical `and` and `or` operators.
The following example matches all hosts whose name matches the `*mysql*` pattern and (`&&`) whose custom attribute `prod_mysql_db`
matches the `db-*` pattern. Hosts with the custom attribute `test_server` set to `true`,
as well as hosts whose name ends with `internal`, are ignored.

    object HostGroup "mysql-server" {
      display_name = "MySQL Server"

      assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
      ignore where host.vars.test_server == true
      ignore where match("*internal", host.name)
    }

A similar example shows advanced notification apply rule filters: the notification is assigned if the service
attribute `notes` contains the string `has gold support 24x7` `AND` one of
two conditions is met: either the host custom attribute `customer` is set to `customer-xy`
`OR` the host custom attribute `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `internal`,
`OR` whose `priority` custom attribute is [less than](16-language-reference.md#expression-operators) `2`
while the host custom attribute `is_clustered` is set to `true`.

    template Notification "cust-xy-notification" {
      users = [ "noc-xy", "mgmt-xy" ]
      command = "mail-service-notification"
    }

    apply Notification "notify-cust-xy-mysql" to Service {
      import "cust-xy-notification"

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }

491 ### <a id="using-apply-services"></a> Apply Services to Hosts
493 The sample configuration already includes a detailed example in [hosts.conf](5-configuring-icinga-2.md#hosts-conf)
494 and [services.conf](5-configuring-icinga-2.md#services-conf) for this use case.
496 The example for `ssh` applies a service object to all hosts with the `address`
497 attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.

    apply Service "ssh" {
      import "generic-service"

      check_command = "ssh"

      assign where host.address && host.vars.os == "Linux"
    }

508 Other detailed scenario examples are used in their respective chapters, for example
509 [apply services with custom command arguments](#using-apply-services-command-arguments).
511 ### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

    apply Notification "mail-noc" to Service {
      import "mail-service-notification"

      user_groups = [ "noc" ]

      assign where host.vars.notification.mail
    }

In this example the `mail-noc` notification will be created as an object for all services whose host has the
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.
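
For the rule above to match, a host needs the `notification.mail` custom attribute. A minimal sketch of such a host follows. The host name, the address and the dictionary contents are assumptions, and the `generic-host` template is assumed to exist.

```icinga2
object Host "mail-notified-host" {
  import "generic-host"
  address = "192.0.2.20"   // example address (assumption)

  // Defining this dictionary with at least one entry is intended to
  // satisfy `assign where host.vars.notification.mail`.
  vars.notification["mail"] = {
    groups = [ "noc" ]
  }
}
```
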
530 ### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
532 Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
534 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
The sample configuration includes an example in [downtimes.conf](5-configuring-icinga-2.md#downtimes-conf).

538 Detailed examples can be found in the [recurring downtimes](4-advanced-topics.md#recurring-downtimes) chapter.
541 ### <a id="using-apply-for"></a> Using Apply For Rules
In addition to the standard apply rules, you may need to generate
objects from a set (an array or a dictionary). An apply for rule saves you from writing
many nearly identical apply rules by combining them into one generic rule, which generates
each object name with or without a prefix.

548 The sample configuration already includes a detailed example in [hosts.conf](5-configuring-icinga-2.md#hosts-conf)
549 and [services.conf](5-configuring-icinga-2.md#services-conf) for this use case.
Imagine a different example: You are monitoring your switches (hosts) with many
interfaces (services). The following requirements/problems apply:

* Each interface service check should be named with a prefix and a running number
* Each interface has its own vlan tag
* Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
dynamically generated

By defining the `interfaces` dictionary with three example interfaces on the `core-switch`
host object, you provide the data that the for loop in the service apply
rule iterates over.

    object Host "core-switch" {
      import "generic-host"
      address = "127.0.0.1"

      vars.interfaces["0"] = {
        // port, vlan and qos entries are not shown here
        address = "127.0.0.2"
      }
      vars.interfaces["1"] = {
        // port and vlan entries are not shown here
        address = "127.0.1.2"
      }
      vars.interfaces["2"] = {
        // port and vlan entries are not shown here
        address = "127.0.2.2"
      }
    }

You can also omit the `"if-"` string; then all generated service names are taken
directly from the `if_name` variable value.

The `config` dictionary contains all key-value pairs for the specific interface in one
loop cycle, like `port`, `vlan`, `address` and `qos` for the `0` interface.

By defining a default value for the custom attribute `qos` in the `vars` dictionary
before adding the `config` dictionary, we'll ensure that this attribute is always defined.

After `vars` is fully populated, all object attributes can be set. For strings, you can use
string concatenation with the `+` operator.

You can also specify the check command that way.

    apply Service "if-" for (if_name => config in host.vars.interfaces) {
      import "generic-service"
      check_command = "ping4"

      vars.qos = "disabled"
      vars += config  // merge the interface's key-value pairs into vars

      display_name = "if-" + if_name + "-" + vars.vlan

      notes = "Interface check for Port " + string(vars.port) + " in VLAN " + vars.vlan + " on Address " + vars.address + " QoS " + vars.qos
      notes_url = "http://foreman.company.com/hosts/" + host.name
      action_url = "http://snmp.checker.company.com/" + host.name + "if-" + if_name
    }

Note that numbers must be explicitly cast to strings when they are concatenated with strings.
This can be achieved by wrapping them in the [string()](16-language-reference.md#function-calls) function.
620 > Building configuration in that dynamic way requires detailed information
621 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
622 > after successful [configuration validation](8-cli-commands.md#config-validation).
625 ### <a id="using-apply-object attributes"></a> Use Object Attributes in Apply Rules
627 Since apply rules are evaluated after the generic objects, you
628 can reference existing host and/or service object attributes as
629 values for any object attribute specified in that apply rule.

    object Host "opennebula-host" {
      import "generic-host"

      vars.hosting["xyz"] = {
        // http_uri and customer_id entries are not shown here
        customer_name = "Customer xyz"
        support_contract = "gold"
      }
      vars.hosting["abc"] = {
        // http_uri and customer_id entries are not shown here
        customer_name = "Customer xyz"
        support_contract = "silver"
      }
    }

    apply Service for (customer => config in host.vars.hosting) {
      import "generic-service"
      check_command = "ping4"

      vars.qos = "disabled"
      vars += config  // merge the customer's key-value pairs into vars

      // `customer` is the loop key, i.e. "xyz" or "abc"
      vars.http_uri = "/" + customer + "/" + config.http_uri

      display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id

      notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."

      notes_url = "http://foreman.company.com/hosts/" + host.name
      action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
    }

667 ## <a id="groups"></a> Groups
669 A group is a collection of similar objects. Groups are primarily used as a
670 visualization aid in web interfaces.
Group membership is defined at the respective object itself. If
you have a hostgroup named `windows`, for example, and want to assign
specific hosts to this group for later viewing the group on your
alert dashboard, first create a HostGroup object:

    object HostGroup "windows" {
      display_name = "Windows Servers"
    }

681 Then add your hosts to this group:

    template Host "windows-server" {
      groups += [ "windows" ]
    }

    object Host "mssql-srv1" {
      import "windows-server"

      vars.mssql_port = 1433
    }

    object Host "mssql-srv2" {
      import "windows-server"

      vars.mssql_port = 1433
    }

699 This can be done for service and user groups the same way:

    object UserGroup "windows-mssql-admins" {
      display_name = "Windows MSSQL Admins"
    }

    template User "generic-windows-mssql-users" {
      groups += [ "windows-mssql-admins" ]
    }

    object User "win-mssql-noc" {
      import "generic-windows-mssql-users"

      email = "noc@example.com"
    }

    object User "win-mssql-ops" {
      import "generic-windows-mssql-users"

      email = "ops@example.com"
    }

721 ### <a id="group-assign-intro"></a> Group Membership Assign
723 Instead of manually assigning each object to a group you can also assign objects
724 to a group based on their attributes:

    object HostGroup "prod-mssql" {
      display_name = "Production MSSQL Servers"

      assign where host.vars.mssql_port && host.vars.prod_mysql_db
      ignore where host.vars.test_server == true
      ignore where match("*internal", host.name)
    }

In this example all hosts with the `vars` attributes `mssql_port` and `prod_mysql_db`
will be added as members to the host group `prod-mssql`. However, hosts whose name
ends with `internal` or which have the `test_server` attribute set to `true` are not added to this
group.

Details on the `assign where` syntax can be found in the
[Language Reference](16-language-reference.md#apply).
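
The same mechanism should work for the other group types. As a sketch, the following service group collects all services whose name starts with `ping`. The group name and the match pattern are assumptions for illustration.

```icinga2
object ServiceGroup "ping" {
  display_name = "Ping Checks"

  // In a ServiceGroup both `service` and `host` are available in the expression.
  assign where match("ping*", service.name)
}
```
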
742 ## <a id="notifications"></a> Notifications
Notifications for service and host problems are an integral part of your
monitoring setup.

747 When a host or service is in a downtime, a problem has been acknowledged or
748 the dependency logic determined that the host/service is unreachable, no
749 notifications are sent. You can configure additional type and state filters
750 refining the notifications being actually sent.
752 There are many ways of sending notifications, e.g. by e-mail, XMPP,
753 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
754 Instead it relies on external mechanisms such as shell scripts to notify users.
756 A notification specification requires one or more users (and/or user groups)
757 who will be notified in case of problems. These users must have all custom
758 attributes defined which will be used in the `NotificationCommand` on execution.
The user `icingaadmin` in the example below will get notified only for the `problem` and
`recovery` notification types, and only if the state is `OK`, `WARNING` or `CRITICAL`.

    object User "icingaadmin" {
      display_name = "Icinga 2 Admin"
      enable_notifications = true
      states = [ OK, Warning, Critical ]
      types = [ Problem, Recovery ]
      email = "icinga@localhost"
    }

771 If you don't set the `states` and `types` configuration attributes for the `User`
772 object, notifications for all states and types will be sent.
774 Details on troubleshooting notification problems can be found [here](13-troubleshooting.md#troubleshooting).
778 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
779 > in order to execute notification commands.
You should choose which information you (and your notified users) are interested in
in case of emergency, and also which information does not provide any value to you and
your environment.

785 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
You can add all shared attributes to a `Notification` template which is then imported
by the defined notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.

    template Notification "generic-notification" {
      interval = 15m

      command = "mail-service-notification"

      states = [ Warning, Critical, Unknown ]
      types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]

      period = "24x7"
    }

803 The time period `24x7` is included as example configuration with Icinga 2.
805 Use the `apply` keyword to create `Notification` objects for your services:

    apply Notification "notify-cust-xy-mysql" to Service {
      import "generic-notification"

      users = [ "noc-xy", "mgmt-xy" ]

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }

817 Instead of assigning users to notifications, you can also add the `user_groups`
818 attribute with a list of user groups to the `Notification` object. Icinga 2 will
819 send notifications to all group members.
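
A minimal sketch of the `user_groups` variant, assuming the `generic-notification` template from above is defined. The group, user and notification names as well as the e-mail address are made up for illustration.

```icinga2
object UserGroup "noc" {
  display_name = "Network Operations Center"
}

object User "noc-operator" {
  groups = [ "noc" ]
  email = "noc-operator@example.com"
}

apply Notification "mail-noc-group" to Service {
  import "generic-notification"

  // Every member of the "noc" group is notified; no `users` list is needed.
  user_groups = [ "noc" ]

  assign where service.name == "ping4"
}
```

Adding another user to the `noc` group then requires no change to the `Notification` object.
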
823 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
824 > states for services, `Down` for hosts) will receive `Recovery` notifications.
826 ### <a id="notification-escalations"></a> Notification Escalations
828 When a problem notification is sent and a problem still exists at the time of re-notification
829 you may want to escalate the problem to the next support level. A different approach
830 is to configure the default notification by email, and escalate the problem via SMS
831 if not already solved.
833 You can define notification start and end times as additional configuration
834 attributes making the `Notification` object a so-called `notification escalation`.
835 Using templates you can share the basic notification attributes such as users or the
836 `interval` (and override them for the escalation then).
838 Using the example from above, you can define additional users being escalated for SMS
839 notifications between start and end time.

    object User "icinga-oncall-2nd-level" {
      display_name = "Icinga 2nd Level"

      vars.mobile = "+1 555 424642"
    }

    object User "icinga-oncall-1st-level" {
      display_name = "Icinga 1st Level"

      vars.mobile = "+1 555 424642"
    }

853 Define an additional [NotificationCommand](#notification) for SMS notifications.
857 > The example is not complete as there are many different SMS providers.
858 > Please note that sending SMS notifications will require an SMS provider
859 > or local hardware with a SIM card active.

    object NotificationCommand "sms-notification" {
      // Further arguments depend on the SMS provider and are not shown.
      command = [ PluginDir + "/send_sms_notification" ]
    }

868 The two new notification escalations are added onto the local host
869 and its service `ping4` using the `generic-notification` template.
870 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
871 command) after `30m` until `1h`.
875 > The `interval` was set to 15m in the `generic-notification`
876 > template example. Lower that value in your escalations by using a secondary
877 > template or by overriding the attribute directly in the `notifications` array
878 > position for `escalation-sms-2nd-level`.
If the problem is neither resolved nor acknowledged (an acknowledgement would prevent further
notifications), the `escalation-sms-1st-level` notification escalates to the `icinga-oncall-1st-level`
user `1h` after the initial problem was notified, but only for one hour (`2h` as `end` key for the
`times` dictionary).

    apply Notification "mail" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-2nd-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-2nd-level" ]

      times = {
        begin = 30m
        end = 1h
      }

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-1st-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-1st-level" ]

      times = {
        begin = 1h
        end = 2h
      }

      assign where service.name == "ping4"
    }

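The note above suggests lowering the `interval` for escalations through a secondary template. A minimal sketch of that idea follows; the template name `generic-escalation` and the `5m` value are assumptions, and the `apply` rule would replace the `escalation-sms-2nd-level` definition rather than sit next to it.

```
// hypothetical secondary template: inherits from "generic-notification"
// but re-notifies more often
template Notification "generic-escalation" {
  import "generic-notification"

  interval = 5m   // assumed value, lower than the 15m of the base template
}

apply Notification "escalation-sms-2nd-level" to Service {
  import "generic-escalation"

  command = "sms-notification"
  users = [ "icinga-oncall-2nd-level" ]

  times = {
    begin = 30m
    end = 1h
  }

  assign where service.name == "ping4"
}
```

With a shorter interval the escalation has several chances to fire inside its 30 minute window.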
921 ### <a id="notification-delay"></a> Notification Delay
Sometimes a problem should not be notified at the moment the notification becomes due
(when the object reaches the `HARD` state) but only after a defined period of time. In Icinga 2
you can use the `times` dictionary and set `begin = 15m` to postpone the notification window
by 15 minutes. Leave out the `end` key: if it is not set, Icinga 2 will not check against
any end time for this notification. Make sure to specify a relatively low notification
`interval` so that you are notified again soon enough.
930 apply Notification "mail" to Service {
931 import "generic-notification"
933 command = "mail-notification"
934 users = [ "icingaadmin" ]
938 times.begin = 15m // delay notification window
940 assign where service.name == "ping4"
943 ### <a id="disable-renotification"></a> Disable Re-notifications
945 If you prefer to be notified only once, you can disable re-notifications by setting the
946 `interval` attribute to `0`.

    apply Notification "notify-once" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      interval = 0 // disable re-notification

      assign where service.name == "ping4"
    }

959 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
If no notification state and type filter attributes are defined on the `Notification`
or `User` object, Icinga 2 assumes that all states and types are being notified.
964 Available state and type filters for notifications are:
966 template Notification "generic-notification" {
968 states = [ Warning, Critical, Unknown ]
969 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
970 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state filters to allow more fine-grained filtering, for example on downtimes and flapping.
You can filter for acknowledgements and custom notifications too.
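Since the filters can also be set on the `User` object, here is a small sketch of a per-user filter. The user name, display name and email address are hypothetical, and listing `OK` next to `Critical` is a cautious assumption so that recovery notifications are not dropped by the state filter.

```
// hypothetical user who only wants critical problems and their recoveries
object User "critical-only-oncall" {
  display_name = "Critical-only On-call"
  email = "oncall@localhost"

  states = [ OK, Critical ]       // no Warning or Unknown
  types = [ Problem, Recovery ]   // no downtime, flapping or custom notifications
}
```

A notification is only sent to this user if it passes both the filters of the `Notification` object and the filters defined here.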
978 ## <a id="commands"></a> Commands
980 Icinga 2 uses three different command object types to specify how
981 checks should be performed, notifications should be sent, and
982 events should be handled.
984 ### <a id="check-commands"></a> Check Commands
986 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects define the command line how
989 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects are referenced by
990 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
991 using the `check_command` attribute.
995 > Make sure that the [checker](8-cli-commands.md#features) feature is enabled in order to
998 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1000 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects require the [ITL template](7-icinga-template-library.md#itl-plugin-check-command)
1001 `plugin-check-command` to support native plugin based check methods.
1003 Unless you have done so already, download your check plugin and put it
1004 into the [PluginDir](5-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1005 `check_disk` plugin contained in the Monitoring Plugins package.
The plugin path and all command arguments are passed as a list of
double-quoted strings for proper shell escaping.
1010 Call the `check_disk` plugin with the `--help` parameter to see
1011 all available options. Our example defines warning (`-w`) and
1012 critical (`-c`) thresholds for the disk usage. Without any
1013 partition defined (`-p`) it will check all local partitions.
1015 icinga@icinga2 $ /usr/lib/nagios/plugins/check_disk --help
1017 This plugin checks the amount of used disk space on a mounted file system
1018 and generates an alert if free space is less than one of the threshold values
1022 check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
1023 [-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
1024 [-t timeout] [-u unit] [-v] [-X type] [-N type]
1029 > Don't execute plugins as `root` and always use the absolute path to the plugin! Trust us.
The next step is to understand how command parameters are passed from
a host or service object, and to add a [CheckCommand](6-object-types.md#objecttype-checkcommand)
definition based on these required parameters and/or default values.
1035 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1037 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1038 by the executed check command.
Define the default check command custom attributes `disk_wfree` and `disk_cfree`
(the naming schema is freely definable) and their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
on the command line.
1047 > Use a common command type as prefix for your command arguments to increase
1048 > readability. `disk_wfree` helps understanding the context better than just
1049 > `wfree` as argument.
1051 The default custom attributes can be overridden by the custom attributes
1052 defined in the service using the check command `my-disk`. The custom attributes
1053 can also be inherited from a parent template using additive inheritance (`+=`).
1055 object CheckCommand "my-disk" {
1056 import "plugin-check-command"
1058 command = [ PluginDir + "/check_disk" ]
1061 "-w" = "$disk_wfree$%"
1062 "-c" = "$disk_cfree$%"
1063 "-W" = "$disk_inode_wfree$%"
1064 "-K" = "$disk_inode_cfree$%"
1065 "-p" = "$disk_partitions$"
1066 "-x" = "$disk_partitions_excluded$"
1069 vars.disk_wfree = 20
1070 vars.disk_cfree = 10
1075 > A proper example for the `check_disk` plugin is already shipped with Icinga 2
1076 > ready to use with the [plugin check commands](7-icinga-template-library.md#plugin-check-command-disk).
1078 The host `localhost` with the applied service `basic-partitions` checks a basic set of disk partitions
1079 with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
The custom attribute `disk_partitions` can either hold a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.
1085 object Host "my-server" {
1086 import "generic-host"
1087 address = "127.0.0.1"
1090 vars.local_disks["basic-partitions"] = {
1091 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1095 apply Service for (disk => config in host.vars.local_disks) {
1096 import "generic-service"
1097 check_command = "my-disk"
1101 vars.disk_wfree = 10
1106 More details on using arrays in custom attributes can be found in
1107 [this chapter](#runtime-custom-attributes).
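For comparison with the array form, a minimal sketch of a service that passes a single partition as a plain string. The service name `root-partition` and the threshold value are assumptions; `my-disk` and `my-server` are the objects defined above.

```
object Service "root-partition" {
  import "generic-service"

  host_name = "my-server"
  check_command = "my-disk"

  vars.disk_partitions = "/"   // single string instead of an array
  vars.disk_wfree = 15         // assumed threshold, overrides the command default of 20
}
```

`disk_cfree` is not set here, so the default of `10` from the `my-disk` command definition applies.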
1110 #### <a id="command-arguments"></a> Command Arguments
When a check command line is defined using the `command` attribute, Icinga 2
resolves all macros in the static string or array. Sometimes you need to
extend the argument list based on a condition evaluated at command execution,
or to make arguments optional so that they are only set if Icinga 2 can
resolve the macro value.
1118 object CheckCommand "check_http" {
1119 import "plugin-check-command"
1121 command = [ PluginDir + "/check_http" ]
1124 "-H" = "$http_vhost$"
1125 "-I" = "$http_address$"
1127 "-p" = "$http_port$"
1129 set_if = "$http_ssl$"
1132 set_if = "$http_sni$"
1135 value = "$http_auth_pair$"
1136 description = "Username:password on sites with basic authentication"
1139 set_if = "$http_ignore_body$"
1141 "-r" = "$http_expect_body_regex$"
1142 "-w" = "$http_warn_time$"
1143 "-c" = "$http_critical_time$"
1144 "-e" = "$http_expect$"
1147 vars.http_address = "$address$"
1148 vars.http_ssl = false
1149 vars.http_sni = false
1152 The example shows the `check_http` check command defining the most common
1153 arguments. Each of them is optional by default and will be omitted if
1154 the value is not set. For example if the service calling the check command
1155 does not have `vars.http_port` set, it won't get added to the command
1158 If the `vars.http_ssl` custom attribute is set in the service, host or command
1159 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
1160 numeric value to the command line. String values are not supported.
1162 If the macro value cannot be resolved, Icinga 2 will not add the defined argument
1163 to the final command argument array. Empty strings for macro values won't omit
That way you can use the same `check_http` command definition for checks with and
without SSL enabled, which saves you from duplicating command definitions.
1169 Details on all available options can be found in the
1170 [CheckCommand object definition](6-object-types.md#objecttype-checkcommand).
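To illustrate how the optional arguments behave, here is a sketch of a service using the `check_http` command from above. The service name `https-vhost` and the vhost value are hypothetical; `my-server` is the host from the earlier disk example.

```
object Service "https-vhost" {
  import "generic-service"

  host_name = "my-server"
  check_command = "check_http"

  vars.http_ssl = true                  // set_if evaluates to true, so "-S" is added
  vars.http_vhost = "www.example.com"   // resolves, so "-H" is added with this value
  // http_port is left unset, so "-p" is omitted from the command line
}
```

A second service on the same host can leave `http_ssl` at its default of `false` and reuse the same command for plain HTTP.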
1173 ### <a id="notification-commands"></a> Notification Commands
[NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
interfaces (E-Mail, XMPP, IRC, Twitter, etc.).
1178 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
1179 [Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
1181 `NotificationCommand` objects require the [ITL template](7-icinga-template-library.md#itl-plugin-notification-command)
1182 `plugin-notification-command` to support native plugin-based notifications.
1186 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1187 > in order to execute notification commands.
1189 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1190 the current check output) sending an email to the user(s) associated with the
1191 notification itself (`$user.email$`).
1193 If you want to specify default values for some of the custom attribute definitions,
1194 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1196 object NotificationCommand "mail-service-notification" {
1197 import "plugin-notification-command"
1199 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1202 NOTIFICATIONTYPE = "$notification.type$"
1203 SERVICEDESC = "$service.name$"
1204 HOSTALIAS = "$host.display_name$"
1205 HOSTADDRESS = "$address$"
1206 SERVICESTATE = "$service.state$"
1207 LONGDATETIME = "$icinga.long_date_time$"
1208 SERVICEOUTPUT = "$service.output$"
1209 NOTIFICATIONAUTHORNAME = "$notification.author$"
1210 NOTIFICATIONCOMMENT = "$notification.comment$"
1211 HOSTDISPLAYNAME = "$host.display_name$"
1212 SERVICEDISPLAYNAME = "$service.display_name$"
1213 USEREMAIL = "$user.email$"
The `command` attribute in the `mail-service-notification` command refers to the following
shell script. The macros specified in the `env` dictionary are exported
as environment variables and can be used in the notification script:
1222 template=$(cat <<TEMPLATE
1225 Notification Type: $NOTIFICATIONTYPE
1227 Service: $SERVICEDESC
1229 Address: $HOSTADDRESS
1230 State: $SERVICESTATE
1232 Date/Time: $LONGDATETIME
1234 Additional Info: $SERVICEOUTPUT
1236 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
/usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" "$USEREMAIL"
1244 > This example is for `exim` only. Requires changes for `sendmail` and
1247 While it's possible to specify the entire notification command right
1248 in the NotificationCommand object it is generally advisable to create a
1249 shell script in the `/etc/icinga2/scripts` directory and have the
1250 NotificationCommand object refer to that.
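As stated at the beginning of this section, a `Notification` object references the command through its `command` attribute. A minimal sketch, assuming the `generic-notification` template and the `icingaadmin` user from the notification examples; the notification name `mail-service` is hypothetical.

```
apply Notification "mail-service" to Service {
  import "generic-notification"

  command = "mail-service-notification"   // name of the NotificationCommand object
  users = [ "icingaadmin" ]               // $user.email$ is resolved for each notified user

  assign where service.name == "ping4"
}
```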
1252 ### <a id="event-commands"></a> Event Commands
Unlike notifications, event commands for hosts/services are called on every
check execution if one of these conditions matches:
1257 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1258 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1259 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1261 [EventCommand](6-object-types.md#objecttype-eventcommand) objects are referenced by
1262 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1263 using the `event_command` attribute.
Therefore the `EventCommand` object should define a command line that
evaluates the current service state and other service runtime attributes
available through runtime macros. Runtime macros such as `$service.state_type$`
and `$service.state$` are resolved by Icinga 2 and help you trigger events
in a fine-grained manner.
1271 Common use case scenarios are a failing HTTP check requiring an immediate
1272 restart via event command, or if an application is locked and requires
1273 a restart upon detection.
1275 `EventCommand` objects require the ITL template `plugin-event-command`
1276 to support native plugin based checks.
1278 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
The following example will trigger a restart of the `httpd` daemon
1281 via ssh when the `http` service check fails. If the service state is
1282 `OK`, it will not trigger any event action.
1287 * icinga user with public key authentication
1288 * icinga user with sudo permissions for restarting the httpd daemon.
1292 # ls /home/icinga/.ssh/
1296 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1299 Define a generic [EventCommand](6-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1300 which can be used for all event commands triggered using ssh:
1302 /* pass event commands through ssh */
1303 object EventCommand "event_by_ssh" {
1304 import "plugin-event-command"
1306 command = [ PluginDir + "/check_by_ssh" ]
1309 "-H" = "$event_by_ssh_address$"
1310 "-p" = "$event_by_ssh_port$"
1311 "-C" = "$event_by_ssh_command$"
1312 "-l" = "$event_by_ssh_logname$"
1313 "-i" = "$event_by_ssh_identity$"
1315 set_if = "$event_by_ssh_quiet$"
1317 "-w" = "$event_by_ssh_warn$"
1318 "-c" = "$event_by_ssh_crit$"
1319 "-t" = "$event_by_ssh_timeout$"
1322 vars.event_by_ssh_address = "$address$"
1323 vars.event_by_ssh_quiet = false
1326 The actual event command only passes the `event_by_ssh_command` attribute.
1327 The `event_by_ssh_service` custom attribute takes care of passing the correct
1328 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
is only restarted when the service is in a non-`OK` state.
1332 object EventCommand "event_by_ssh_restart_service" {
1333 import "event_by_ssh"
1335 //only restart the daemon if state > 0 (not-ok)
1336 //requires sudo permissions for the icinga user
1337 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1341 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1342 which service should be restarted using the `event_by_ssh_service` attribute.
1344 object Service "http" {
1345 import "generic-service"
1346 host_name = "remote-http-host"
1347 check_command = "http"
1349 event_command = "event_by_ssh_restart_service"
1350 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1352 //vars.event_by_ssh_logname = "icinga"
//vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"
Each host with this service must then define the `httpd_name` custom attribute
(for example generated from your CMDB):
1360 object Host "remote-http-host" {
1361 import "generic-host"
1362 address = "192.168.1.100"
1364 vars.httpd_name = "apache2"
You can test-drive this example by manually stopping the `httpd` daemon
on your `remote-http-host`. Enable the `debuglog` feature and tail the
`/var/log/icinga2/debug.log` file.
1371 Remote Host Terminal:
1373 # date; service apache2 status
1374 Mon Sep 15 18:57:39 CEST 2014
1375 Apache2 is running (pid 23651).
1376 # date; service apache2 stop
1377 Mon Sep 15 18:57:47 CEST 2014
1378 [ ok ] Stopping web server: apache2 ... waiting .
1380 Icinga 2 Host Terminal:
1382 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
1383 [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
1384 [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
1385 [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
1386 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
1387 [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
1389 Remote Host Terminal:
1391 # date; service apache2 status
1392 Mon Sep 15 18:58:44 CEST 2014
1393 Apache2 is running (pid 24908).
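The log above shows the event handler firing on the first soft state change. If a restart should only happen once the problem is confirmed, the `$service.state_type$` macro mentioned earlier can be added to the test. This is a sketch under the assumption that the macro resolves to the string `HARD`; the command name `event_by_ssh_restart_service_hard` is hypothetical.

```
object EventCommand "event_by_ssh_restart_service_hard" {
  import "event_by_ssh"

  // restart only if the state is not OK (> 0) and has reached a hard state
  vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && test $service.state_type$ = HARD && sudo /etc/init.d/$event_by_ssh_service$ restart"
}
```

A service would reference it through `event_command` in the same way as `event_by_ssh_restart_service`.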
1396 ## <a id="dependencies"></a> Dependencies
Icinga 2 uses host and service [Dependency](6-object-types.md#objecttype-dependency) objects
for determining their network reachability.
1401 A service can depend on a host, and vice versa. A service has an implicit
1402 dependency (parent) to its host. A host to host dependency acts implicitly
1403 as host parent relation.
1404 When dependencies are calculated, not only the immediate parent is taken into
1405 account but all parents are inherited.
The `parent_host_name` and `parent_service_name` attributes are mandatory for
service dependencies; only `parent_host_name` is required for host dependencies.
1409 [Apply rules](3-monitoring-basics.md#using-apply) will allow you to
1410 [determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
1411 dynamic fashion if required.
1413 parent_host_name = "core-router"
1414 parent_service_name = "uplink-port"
1416 Notifications are suppressed by default if a host or service becomes unreachable.
1417 You can control that option by defining the `disable_notifications` attribute.
1419 disable_notifications = false
1421 The dependency state filter must be defined based on the parent object being
1422 either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
1424 The following example will make the dependency fail and trigger it if the parent
1425 object is **not** in one of these states:
1427 states = [ OK, Critical, Unknown ]
1429 Rephrased: If the parent service object changes into the `Warning` state, this
1430 dependency will fail and render all child objects (hosts or services) unreachable.
You can determine the child's reachability by querying the `is_reachable` attribute,
for example in [DB IDO](19-apendix.md#schema-db-ido-extensions).
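The attribute fragments shown in this section can be combined into one rule. The following is a sketch only: the dependency name `uplink` and the custom attribute `host.vars.uplink` are hypothetical, while the parent names and the state filter are taken from the fragments above.

```
apply Dependency "uplink" to Service {
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  states = [ OK, Critical, Unknown ]   // the dependency fails when the parent is in Warning
  disable_notifications = false        // keep notifying even if the child is unreachable

  assign where host.vars.uplink == "core-router"   // hypothetical custom attribute
  ignore where host.name == "core-router"          // the parent must not depend on itself
}
```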
1435 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
Icinga 2 automatically adds an implicit dependency for services on their host. That way
service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
`states = [ Up ]` for all service objects.
1442 Service checks are still executed. If you want to prevent them from happening, you can
1443 apply the following dependency to all services setting their host as `parent_host_name`
1444 and disabling the checks. `assign where true` matches on all `Service` objects.

    apply Dependency "disable-host-service-checks" to Service {
      disable_checks = true
      assign where true
    }

1451 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
1453 A common scenario is the Icinga 2 server behind a router. Checking internet
1454 access by pinging the Google DNS server `google-dns` is a common method, but
1455 will fail in case the `dsl-router` host is down. Therefore the example below
1456 defines a host dependency which acts implicitly as parent relation too.
Furthermore the host may be reachable but ping probes are dropped by the
router's firewall. In case the `ping4` service check on `dsl-router` fails, all
further checks for the `ping4` service on host `google-dns` should
be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1463 object Host "dsl-router" {
1464 import "generic-host"
1465 address = "192.168.1.1"
1468 object Host "google-dns" {
1469 import "generic-host"
1473 apply Service "ping4" {
1474 import "generic-service"
1476 check_command = "ping4"
1478 assign where host.address
1481 apply Dependency "internet" to Host {
1482 parent_host_name = "dsl-router"
1483 disable_checks = true
1484 disable_notifications = true
1486 assign where host.name != "dsl-router"
1489 apply Dependency "internet" to Service {
1490 parent_host_name = "dsl-router"
1491 parent_service_name = "ping4"
1492 disable_checks = true
1494 assign where host.name != "dsl-router"
1497 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
child attributes, e.g. `parent_host_name`, to other object's
A common example is virtual machines hosted on a master. The object
name of that master is auto-generated from your CMDB or VMware inventory
into the host's custom attributes (or a generic template for your
1508 Define your master host object:
1511 object Host "master.example.com" {
1512 import "generic-host"
1515 Add a generic template defining all common host attributes:
1517 /* generic template for your virtual machines */
1518 template Host "generic-vm" {
1519 import "generic-host"
1522 Add a template for all hosts on your example.com cloud setting
1523 custom attribute `vm_parent` to `master.example.com`:
1525 template Host "generic-vm-example.com" {
1527 vars.vm_parent = "master.example.com"
1530 Define your guest hosts:

    object Host "www.example1.com" {
      import "generic-vm-example.com"
    }

    object Host "www.example2.com" {
      import "generic-vm-example.com"
    }

1540 Apply the host dependency to all child hosts importing the
1541 `generic-vm` template and set the `parent_host_name`
1542 to the previously defined custom attribute `host.vars.vm_parent`.
1544 apply Dependency "vm-host-to-parent-master" to Host {
1545 parent_host_name = host.vars.vm_parent
1546 assign where "generic-vm" in host.templates
1549 You can extend this example, and make your services depend on the
1550 `master.example.com` host too. Their local scope allows you to use
1551 `host.vars.vm_parent` similar to the example above.
1553 apply Dependency "vm-service-to-parent-master" to Service {
1554 parent_host_name = host.vars.vm_parent
1555 assign where "generic-vm" in host.templates
1558 That way you don't need to wait for your guest hosts becoming
1559 unreachable when the master host goes down. Instead the services
1560 will detect their reachability immediately when executing checks.
1564 > This method with setting locally scoped variables only works in
1565 > apply rules, but not in object definitions.
1568 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
Another classic example is agent-based checks. You would define a health check
for the agent daemon responding to your requests, and make all other services
querying that daemon depend on that health check.

The following configuration defines two NRPE-based service checks `nrpe-load`
and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
the `nrpe-health` service.
1578 apply Service "nrpe-health" {
1579 import "generic-service"
1580 check_command = "nrpe"
1581 assign where match("nrpe-*", host.name)
1584 apply Service "nrpe-load" {
1585 import "generic-service"
1586 check_command = "nrpe"
1587 vars.nrpe_command = "check_load"
1588 assign where match("nrpe-*", host.name)
1591 apply Service "nrpe-disk" {
1592 import "generic-service"
1593 check_command = "nrpe"
1594 vars.nrpe_command = "check_disk"
1595 assign where match("nrpe-*", host.name)
1598 object Host "nrpe-server" {
1599 import "generic-host"
1600 address = "192.168.1.5"
1603 apply Dependency "disable-nrpe-checks" to Service {
1604 parent_service_name = "nrpe-health"
1607 disable_checks = true
1608 disable_notifications = true
1609 assign where service.check_command == "nrpe"
1610 ignore where service.name == "nrpe-health"
The `disable-nrpe-checks` dependency is applied to all services
on the `nrpe-server` host that use the `nrpe` check command,
except for the `nrpe-health` service itself.