1 # <a id="monitoring-basics"></a> Monitoring Basics
3 This part of the Icinga 2 documentation provides an overview of all the basic
4 monitoring concepts you need to know to run Icinga 2.
5 Keep in mind these examples are made with a linux server in mind, if you are
6 using Windows you will need to change the services accordingly. See the [ITL reference](7-icinga-template-library.md#windows-plugins)
7 for further information.
9 ## <a id="hosts-services"></a> Hosts and Services
11 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
12 and services can be virtually anything which can be checked in some way:
14 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
18 * Other local or network-accessible services
20 Host objects provide a mechanism to group services that are running
21 on the same physical device.
23 Here is an example of a host object which defines two child services:
25 object Host "my-server1" {
27 check_command = "hostalive"
30 object Service "ping4" {
31 host_name = "my-server1"
32 check_command = "ping4"
35 object Service "http" {
36 host_name = "my-server1"
37 check_command = "http"
40 The example creates two services `ping4` and `http` which belong to the
43 It also specifies that the host should perform its own check using the `hostalive`
46 The `address` attribute is used by check commands to determine which network
47 address is associated with the host object.
49 Details on troubleshooting check problems can be found [here](16-troubleshooting.md#troubleshooting).
51 ### <a id="host-states"></a> Host States
53 Hosts can be in any of the following states:
56 ------------|--------------
57 UP | The host is available.
58 DOWN | The host is unavailable.
60 ### <a id="service-states"></a> Service States
62 Services can be in any of the following states:
65 ------------|--------------
66 OK | The service is working properly.
67 WARNING | The service is experiencing some problems but is still considered to be in working condition.
68 CRITICAL | The service is in a critical state.
69 UNKNOWN | The check could not determine the service's state.
71 ### <a id="hard-soft-states"></a> Hard and Soft States
73 When detecting a problem with a host/service Icinga re-checks the object a number of
74 times (based on the `max_check_attempts` and `retry_interval` settings) before sending
75 notifications. This ensures that no unnecessary notifications are sent for
76 transient failures. During this time the object is in a `SOFT` state.
78 After all re-checks have been executed and the object is still in a non-OK
79 state the host/service switches to a `HARD` state and notifications are sent.
82 ------------|--------------
83 HARD | The host/service's state hasn't recently changed.
84 SOFT | The host/service has recently changed state and is being re-checked.
86 ### <a id="host-service-checks"></a> Host and Service Checks
88 Hosts and services determine their state by running checks in a regular interval.
90 object Host "router" {
91 check_command = "hostalive"
95 The `hostalive` command is one of several built-in check commands. It sends ICMP
96 echo requests to the IP address specified in the `address` attribute to determine
97 whether a host is online.
99 A number of other [built-in check commands](7-icinga-template-library.md#plugin-check-commands) are also
100 available. In addition to these commands the next few chapters will explain in
101 detail how to set up your own check commands.
104 ## <a id="object-inheritance-using-templates"></a> Templates
106 Templates may be used to apply a set of identical attributes to more than one
109 template Service "generic-service" {
110 max_check_attempts = 3
113 enable_perfdata = true
116 apply Service "ping4" {
117 import "generic-service"
119 check_command = "ping4"
121 assign where host.address
124 apply Service "ping6" {
125 import "generic-service"
127 check_command = "ping6"
129 assign where host.address6
133 In this example the `ping4` and `ping6` services inherit properties from the
134 template `generic-service`.
136 Objects as well as templates themselves can import an arbitrary number of
137 other templates. Attributes inherited from a template can be overridden in the
140 You can also import existing non-template objects. Note that templates
141 and objects share the same namespace, i.e. you can't define a template
142 that has the same name like an object.
145 ## <a id="custom-attributes"></a> Custom Attributes
147 In addition to built-in attributes you can define your own attributes:
149 object Host "localhost" {
153 Valid values for custom attributes include:
155 * [Strings](18-language-reference.md#string-literals), [numbers](18-language-reference.md#numeric-literals) and [booleans](18-language-reference.md#boolean-literals)
156 * [Arrays](18-language-reference.md#array) and [dictionaries](18-language-reference.md#dictionary)
157 * [Functions](3-monitoring-basics.md#custom-attributes-functions)
159 ### <a id="custom-attributes-functions"></a> Functions as Custom Attributes
161 Icinga 2 lets you specify [functions](18-language-reference.md#functions) for custom attributes.
162 The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
163 the function and uses whatever value the function returns:
165 object CheckCommand "random-value" {
166 import "plugin-check-command"
168 command = [ PluginDir + "/check_dummy", "0", "$text$" ]
170 vars.text = {{ Math.random() * 100 }}
173 This example uses the [abbreviated lambda syntax](18-language-reference.md#nullary-lambdas).
175 These functions have access to a number of variables:
177 Variable | Description
178 -------------|---------------
179 user | The User object (for notifications).
180 service | The Service object (for service checks/notifications/event handlers).
181 host | The Host object.
182 command | The command object (e.g. a CheckCommand object for checks).
186 vars.text = {{ host.check_interval }}
188 In addition to these variables the `macro` function can be used to retrieve the
189 value of arbitrary macro expressions:
192 if (macro("$address$") == "127.0.0.1") {
193 log("Running a check for localhost!")
199 The `resolve_arguments` can be used to resolve a command and its arguments much in
200 the same fashion Icinga does this for the `command` and `arguments` attributes for
201 commands. The `by_ssh` command uses this functionality to let users specify a
202 command and arguments that should be executed via SSH:
206 var command = macro("$by_ssh_command$")
207 var arguments = macro("$by_ssh_arguments$")
209 if (typeof(command) == String && !arguments) {
213 var escaped_args = []
214 for (arg in resolve_arguments(command, arguments)) {
215 escaped_args.add(escape_shell_arg(arg))
217 return escaped_args.join(" ")
222 Acessing object attributes at runtime inside these functions is described in the
223 [advanced topics](5-advanced-topics.md#access-object-attributes-at-runtime) chapter.
225 ## <a id="runtime-macros"></a> Runtime Macros
227 Macros can be used to access other objects' attributes at runtime. For example they
228 are used in command definitions to figure out which IP address a check should be
231 object CheckCommand "my-ping" {
232 import "plugin-check-command"
234 command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]
237 "-w" = "$ping_wrta$,$ping_wpl$%"
238 "-c" = "$ping_crta$,$ping_cpl$%"
239 "-p" = "$ping_packets$"
242 vars.ping_address = "$address$"
250 vars.ping_packets = 5
253 object Host "router" {
254 check_command = "my-ping"
258 In this example we are using the `$address$` macro to refer to the host's `address`
261 We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
262 automatically tries to find the closest match for the attribute you specified. The
263 exact rules for this are explained in the next section.
267 > When using the `$` sign as single character you must escape it with an
268 > additional dollar character (`$$`).
271 ### <a id="macro-evaluation-order"></a> Evaluation Order
273 When executing commands Icinga 2 checks the following objects in this order to look
274 up macros and their respective values:
276 1. User object (only for notifications)
280 5. Global custom attributes in the `Vars` constant
282 This execution order allows you to define default values for custom attributes
283 in your command objects.
285 Here's how you can override the custom attribute `ping_packets` from the previous
288 object Service "ping" {
289 host_name = "localhost"
290 check_command = "my-ping"
292 vars.ping_packets = 10 // Overrides the default value of 5 given in the command
295 If a custom attribute isn't defined anywhere an empty value is used and a warning is
296 written to the Icinga 2 log.
298 You can also directly refer to a specific attribute - thereby ignoring these evaluation
299 rules - by specifying the full attribute name:
301 $service.vars.ping_wrta$
303 This retrieves the value of the `ping_wrta` custom attribute for the service. This
304 returns an empty value if the service does not have such a custom attribute no matter
305 whether another object such as the host has this attribute.
308 ### <a id="host-runtime-macros"></a> Host Runtime Macros
310 The following host custom attributes are available in all commands that are executed for
314 -----------------------------|--------------
315 host.name | The name of the host object.
316 host.display_name | The value of the `display_name` attribute.
317 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
318 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
319 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
320 host.check_attempt | The current check attempt number.
321 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
322 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
323 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
324 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
325 host.last_state_change | The last state change's timestamp.
326 host.downtime_depth | The number of active downtimes.
327 host.duration_sec | The time since the last state change.
328 host.latency | The host's check latency.
329 host.execution_time | The host's check execution time.
330 host.output | The last check's output.
331 host.perfdata | The last check's performance data.
332 host.last_check | The timestamp when the last check was executed.
333 host.check_source | The monitoring instance that performed the last check.
334 host.num_services | Number of services associated with the host.
335 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
336 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
337 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
338 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
340 ### <a id="service-runtime-macros"></a> Service Runtime Macros
342 The following service macros are available in all commands that are executed for
346 ---------------------------|--------------
347 service.name | The short name of the service object.
348 service.display_name | The value of the `display_name` attribute.
349 service.check_command | The short name of the command along with any arguments to be used for the check.
350 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
351 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
352 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
353 service.check_attempt | The current check attempt number.
354 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
355 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
356 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
357 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
358 service.last_state_change | The last state change's timestamp.
359 service.downtime_depth | The number of active downtimes.
360 service.duration_sec | The time since the last state change.
361 service.latency | The service's check latency.
362 service.execution_time | The service's check execution time.
363 service.output | The last check's output.
364 service.perfdata | The last check's performance data.
365 service.last_check | The timestamp when the last check was executed.
366 service.check_source | The monitoring instance that performed the last check.
368 ### <a id="command-runtime-macros"></a> Command Runtime Macros
370 The following custom attributes are available in all commands:
373 -----------------------|--------------
374 command.name | The name of the command object.
376 ### <a id="user-runtime-macros"></a> User Runtime Macros
378 The following custom attributes are available in all commands that are executed for
382 -----------------------|--------------
383 user.name | The name of the user object.
384 user.display_name | The value of the display_name attribute.
386 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros
389 -----------------------|--------------
390 notification.type | The type of the notification.
391 notification.author | The author of the notification comment, if existing.
392 notification.comment | The comment of the notification, if existing.
394 ### <a id="global-runtime-macros"></a> Global Runtime Macros
396 The following macros are available in all executed commands:
399 -----------------------|--------------
400 icinga.timet | Current UNIX timestamp.
401 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
402 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
403 icinga.date | Current date. Example: `2014-01-03`
404 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
405 icinga.uptime | Current uptime of the Icinga 2 process.
407 The following macros provide global statistics:
410 ----------------------------------|--------------
411 icinga.num_services_ok | Current number of services in state 'OK'.
412 icinga.num_services_warning | Current number of services in state 'Warning'.
413 icinga.num_services_critical | Current number of services in state 'Critical'.
414 icinga.num_services_unknown | Current number of services in state 'Unknown'.
415 icinga.num_services_pending | Current number of pending services.
416 icinga.num_services_unreachable | Current number of unreachable services.
417 icinga.num_services_flapping | Current number of flapping services.
418 icinga.num_services_in_downtime | Current number of services in downtime.
419 icinga.num_services_acknowledged | Current number of acknowledged service problems.
420 icinga.num_hosts_up | Current number of hosts in state 'Up'.
421 icinga.num_hosts_down | Current number of hosts in state 'Down'.
422 icinga.num_hosts_unreachable | Current number of unreachable hosts.
423 icinga.num_hosts_flapping | Current number of flapping hosts.
424 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
425 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
428 ## <a id="using-apply"></a> Apply Rules
430 Instead of assigning each object ([Service](6-object-types.md#objecttype-service),
431 [Notification](6-object-types.md#objecttype-notification), [Dependency](6-object-types.md#objecttype-dependency),
432 [ScheduledDowntime](6-object-types.md#objecttype-scheduleddowntime))
433 based on attribute identifiers for example `host_name` objects can be [applied](18-language-reference.md#apply).
435 Before you start using the apply rules keep the following in mind:
437 * Define the best match.
438 * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes) for these hosts/services?
439 * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
440 * A generic pattern [match](18-language-reference.md#function-calls) on the host/service name?
441 * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](18-language-reference.md#expression-operators)
442 * All expressions must return a boolean value (an empty string is equal to `false` e.g.)
446 > You can set/override object attributes in apply rules using the respectively available
447 > objects in that scope (host and/or service objects).
449 [Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
450 for not only matching for their existance or values in apply expressions, but also assign
451 ("inherit") their values into the generated objected from apply rules.
453 * [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
454 * [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
455 * [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
456 * [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)
458 A more advanced example is using [apply with for loops on arrays or
459 dictionaries](3-monitoring-basics.md#using-apply-for) for example provided by
460 [custom atttributes](3-monitoring-basics.md#custom-attributes) or groups.
464 > Building configuration in that dynamic way requires detailed information
465 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
466 > after successful [configuration validation](8-cli-commands.md#config-validation).
469 ### <a id="using-apply-expressions"></a> Apply Rules Expressions
471 You can use simple or advanced combinations of apply rule expressions. Each
472 expression must evaluate into the boolean `true` value. An empty string
473 will be for instance interpreted as `false`. In a similar fashion undefined
474 attributes will return `false`.
478 assign where host.vars.attribute_does_not_exist
480 Multiple `assign where` condition rows are evaluated as `OR` condition.
482 You can combine multiple expressions for matching only a subset of objects. In some cases,
483 you want to be able to add more than one assign/ignore where expression which matches
484 a specific condition. To achieve this you can use the logical `and` and `or` operators.
487 Match all `*mysql*` patterns in the host name and (`&&`) custom attribute `prod_mysql_db`
488 matches the `db-*` pattern. All hosts with the custom attribute `test_server` set to `true`
489 should be ignored, or any host name ending with `*internal` pattern.
491 object HostGroup "mysql-server" {
492 display_name = "MySQL Server"
494 assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
495 ignore where host.vars.test_server == true
496 ignore where match("*internal", host.name)
499 Similar example for advanced notification apply rule filters: If the service
500 attribute `notes` contains the `has gold support 24x7` string `AND` one of the
501 two condition passes: Either the `customer` host custom attribute is set to `customer-xy`
502 `OR` the host custom attribute `always_notify` is set to `true`.
504 The notification is ignored for services whose host name ends with `*internal`
505 `OR` the `priority` custom attribute is [less than](18-language-reference.md#expression-operators) `2`.
507 template Notification "cust-xy-notification" {
508 users = [ "noc-xy", "mgmt-xy" ]
509 command = "mail-service-notification"
512 apply Notification "notify-cust-xy-mysql" to Service {
513 import "cust-xy-notification"
515 assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
516 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
519 ### <a id="using-apply-services"></a> Apply Services to Hosts
521 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
522 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
524 The example for `ssh` applies a service object to all hosts with the `address`
525 attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.
527 apply Service "ssh" {
528 import "generic-service"
530 check_command = "ssh"
532 assign where host.address && host.vars.os == "Linux"
536 Other detailed scenario examples are used in their respective chapters, for example
537 [apply services with custom command arguments](3-monitoring-basics.md#command-passing-parameters).
539 ### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
541 Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
545 apply Notification "mail-noc" to Service {
546 import "mail-service-notification"
548 user_groups = [ "noc" ]
550 assign where host.vars.notification.mail
554 In this example the `mail-noc` notification will be created as object for all services having the
555 `notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
556 and all members of the user group `noc` will get notified.
558 It is also possible to generally apply a notification template and dynamically overwrite values from
559 the template by checking for custom attributes. This can be achieved by using [conditional statements](18-language-reference.md#conditional-statements):
561 apply Notification "host-mail-noc" to Host {
562 import "mail-host-notification"
564 // replace interval inherited from `mail-host-notification` template with new notfication interval set by a host custom attribute
565 if (host.vars.notification_interval) {
566 interval = host.vars.notification_interval
569 // same with notification period
570 if (host.vars.notification_period) {
571 period = host.vars.notification_period
574 // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
575 if (host.vars.notification_type == "sms") {
576 command = "sms-host-notification"
578 command = "mail-host-notification"
581 user_groups = [ "noc" ]
583 assign where host.address
586 In the example above, the notification template `mail-host-notification`, which contains all relevant
587 notification settings, is applied on all host objects where the `host.address` is defined.
588 Each host object is then checked for custom attributes (`host.vars.notification_interval`,
589 `host.vars.notification_period` and `host.vars.notification_type`). Depending if the custom
590 attibute is set or which value it has, the value from the notification template is dynamically
593 The corresponding Host object could look like this:
595 object Host "host1" {
596 import "host-linux-prod"
597 display_name = "host1"
598 address = "192.168.1.50"
599 vars.notification_interval = 1h
600 vars.notification_period = "24x7"
601 vars.notification_type = "sms"
604 ### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
606 Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
608 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
610 The sample configuration includes an example in [downtimes.conf](4-configuring-icinga-2.md#downtimes-conf).
612 Detailed examples can be found in the [recurring downtimes](5-advanced-topics.md#recurring-downtimes) chapter.
615 ### <a id="using-apply-for"></a> Using Apply For Rules
617 Next to the standard way of using [apply rules](3-monitoring-basics.md#using-apply)
618 there is the requirement of applying objects based on a set (array or
619 dictionary) using [apply for](18-language-reference.md#apply-for) expressions.
621 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
622 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
624 Take the following example: A host provides the snmp oids for different service check
625 types. This could look like the following example:
627 object Host "router-v6" {
628 check_command = "hostalive"
631 vars.oids["if01"] = "1.1.1.1.1"
632 vars.oids["temp"] = "1.1.1.1.2"
633 vars.oids["bgp"] = "1.1.1.1.5"
636 Now we want to create service checks for `if01` and `temp` but not `bgp`.
637 Furthermore we want to pass the snmp oid stored as dictionary value to the
638 custom attribute called `vars.snmp_oid` - this is the command argument required
639 by the [snmp](7-icinga-template-library.md#plugin-check-command-snmp) check command.
640 The service's `display_name` should be set to the identifier inside the dictionary.
642 apply Service for (identifier => oid in host.vars.oids) {
643 check_command = "snmp"
644 display_name = identifier
647 ignore where identifier == "bgp" //don't generate service for bgp checks
650 Icinga 2 evaluates the `apply for` rule for all objects with the custom attribute
651 `oids` set. It then iterates over all list items inside the `for` loop and evaluates the
652 `assign/ignore where` expressions. You can access the loop variable
653 in these expressions, e.g. for ignoring certain values.
654 In this example we'd ignore the `bgp` identifier and avoid generating an unwanted service.
655 We could extend the configuration by also matching the `oid` value on certain regex/wildcard
656 patterns for example.
660 > You don't need an `assign where` expression only checking for existance
661 > of the custom attribute.
663 That way you'll save duplicated apply rules by combining them into one
664 generic `apply for` rule generating the object name with or without a prefix.
667 #### <a id="using-apply-for-custom-attribute-override"></a> Apply For and Custom Attribute Override
669 Imagine a different more advanced example: You are monitoring your network device (host)
670 with many interfaces (services). The following requirements/problems apply:
672 * Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc)
673 * Each interface has its own vlan tag
674 * Some interfaces have QoS enabled
675 * Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
676 dynamically generated
679 Tip: Define the snmp community as global constant in your [constants.conf](4-configuring-icinga-2.md#constants-conf) file.
681 const IftrafficSnmpCommunity = "public"
683 By defining the `interfaces` dictionary with three example interfaces on the `cisco-catalyst-6509-34`
684 host object, you'll make sure to pass the [custom attribute](3-monitoring-basics.md#custom-attributes)
685 storage required by the for loop in the service apply rule.
687 object Host "cisco-catalyst-6509-34" {
688 import "generic-host"
689 display_name = "Catalyst 6509 #34 VIE21"
690 address = "127.0.1.4"
692 /* "GigabitEthernet0/2" is the interface name,
693 * and key name in service apply for later on
695 vars.interfaces["GigabitEthernet0/2"] = {
696 /* define all custom attributes with the
697 * same name required for command parameters/arguments
698 * in service apply (look into your CheckCommand definition)
700 iftraffic_units = "g"
701 iftraffic_community = IftrafficSnmpCommunity
702 iftraffic_bandwidth = 1
706 vars.interfaces["GigabitEthernet0/4"] = {
707 iftraffic_units = "g"
708 //iftraffic_community = IftrafficSnmpCommunity
709 iftraffic_bandwidth = 1
713 vars.interfaces["MgmtInterface1"] = {
714 iftraffic_community = IftrafficSnmpCommunity
716 interface_address = "127.99.0.100" #special management ip
720 You can also omit the `"if-"` string, then all generated service names are directly
721 taken from the `if_name` variable value.
723 The config dictionary contains all key-value pairs for the specific interface in one
724 loop cycle, like `iftraffic_units`, `vlan`, and `qos` for the specified interface.
726 You can either map the custom attributes from the `interface_config` dictionary to
727 local custom attributes stashed into `vars`. If the names match the required command
728 argument parameters already (for example `iftraffic_units`), you could also add the
729 `interface_config` dictionary to the `vars` dictionary using the `+=` operator.
731 After `vars` is fully populated, all object attributes can be set calculated from
732 provided host attributes. For strings, you can use string concatention with the `+` operator.
734 You can also specifiy the display_name, check command, interval, notes, notes_url, action_url, etc.
735 attributes that way. Attribute strings can be [concatenated](18-language-reference.md#expression-operators),
736 for example for adding a more detailed service `display_name`.
738 This example also uses [if conditions](18-language-reference.md#conditional-statements)
739 if specific values are not set, adding a local default value.
740 The other way around you can override specific custom attributes inherited from a service template,
743 /* loop over the host.vars.interfaces dictionary
744 * for (key => value in dict) means `interface_name` as key
745 * and `interface_config` as value. Access config attributes
746 * with the indexer (`.`) character.
748 apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
749 import "generic-service"
750 check_command = "iftraffic"
751 display_name = "IF-" + interface_name
753 /* use the key as command argument (no duplication of values in host.vars.interfaces) */
754 vars.iftraffic_interface = interface_name
756 /* map the custom attributes as command arguments */
757 vars.iftraffic_units = interface_config.iftraffic_units
758 vars.iftraffic_community = interface_config.iftraffic_community
760 /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
761 * are the _exact_ same as required as command parameter by the check command
764 vars += interface_config
766 /* set a default value for units and bandwidth */
767 if (interface_config.iftraffic_units == "") {
768 vars.iftraffic_units = "m"
770 if (interface_config.iftraffic_bandwidth == "") {
771 vars.iftraffic_bandwidth = 1
773 if (interface_config.vlan == "") {
774 vars.vlan = "not set"
776 if (interface_config.qos == "") {
780 /* set the global constant if not explicitely
781 * not provided by the `interfaces` dictionary on the host
783 if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
784 vars.iftraffic_community = IftrafficSnmpCommunity
787 /* Calculate some additional object attributes after populating the `vars` dictionary */
788 notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with ' QoS '" + vars.qos + "'"
789 notes_url = "http://foreman.company.com/hosts/" + host.name
790 action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
795 This example makes use of the [check_iftraffic](https://exchange.icinga.org/exchange/iftraffic) plugin.
796 The `CheckCommand` definition can be found in the
797 [contributed plugin check commands](7-icinga-template-library.md#plugins-contrib-command-iftraffic)
798 - make sure to include them in your [icinga2 configuration file](4-configuring-icinga-2.md#icinga2-conf).
803 > Building configuration in that dynamic way requires detailed information
804 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
805 > after successful [configuration validation](8-cli-commands.md#config-validation).
807 Verify that the apply-for-rule successfully created the service objects with the
808 inherited custom attributes:
811 # icinga2 object list --type Service --name *catalyst*
813 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
816 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
817 * iftraffic_bandwidth = 1
818 * iftraffic_community = "public"
819 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
820 * iftraffic_interface = "GigabitEthernet0/2"
821 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
822 * iftraffic_units = "g"
823 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
828 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
831 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
832 * iftraffic_bandwidth = 1
833 * iftraffic_community = "public"
834 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
835 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
836 * iftraffic_interface = "GigabitEthernet0/4"
837 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
838 * iftraffic_units = "g"
839 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
843 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
846 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
847 * iftraffic_bandwidth = 1
848 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
849 * iftraffic_community = "public"
850 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
851 * iftraffic_interface = "MgmtInterface1"
852 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
853 * iftraffic_units = "m"
854 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
855 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
856 * interface_address = "127.99.0.100"
858 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
862 ### <a id="using-apply-object-attributes"></a> Use Object Attributes in Apply Rules
864 Since apply rules are evaluated after the generic objects, you
865 can reference existing host and/or service object attributes as
866 values for any object attribute specified in that apply rule.
868 object Host "opennebula-host" {
869 import "generic-host"
872 vars.hosting["xyz"] = {
874 customer_name = "Customer xyz"
876 support_contract = "gold"
878 vars.hosting["abc"] = {
880 customer_name = "Customer xyz"
882 support_contract = "silver"
886 apply Service for (customer => config in host.vars.hosting) {
887 import "generic-service"
888 check_command = "ping4"
890 vars.qos = "disabled"
894 vars.http_uri = "/" + vars.customer + "/" + config.http_uri
896 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
898 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
900 notes_url = "http://foreman.company.com/hosts/" + host.name
901 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
904 ## <a id="groups"></a> Groups
906 A group is a collection of similar objects. Groups are primarily used as a
907 visualization aid in web interfaces.
909 Group membership is defined at the respective object itself. If
910 you have a hostgroup name `windows` for example, and want to assign
911 specific hosts to this group for later viewing the group on your
912 alert dashboard, first create a HostGroup object:
914 object HostGroup "windows" {
915 display_name = "Windows Servers"
918 Then add your hosts to this group:
920 template Host "windows-server" {
921 groups += [ "windows" ]
924 object Host "mssql-srv1" {
925 import "windows-server"
927 vars.mssql_port = 1433
930 object Host "mssql-srv2" {
931 import "windows-server"
933 vars.mssql_port = 1433
936 This can be done for service and user groups the same way:
938 object UserGroup "windows-mssql-admins" {
939 display_name = "Windows MSSQL Admins"
942 template User "generic-windows-mssql-users" {
943 groups += [ "windows-mssql-admins" ]
946 object User "win-mssql-noc" {
947 import "generic-windows-mssql-users"
949 email = "noc@example.com"
952 object User "win-mssql-ops" {
953 import "generic-windows-mssql-users"
955 email = "ops@example.com"
958 ### <a id="group-assign-intro"></a> Group Membership Assign
960 Instead of manually assigning each object to a group you can also assign objects
961 to a group based on their attributes:
963 object HostGroup "prod-mssql" {
964 display_name = "Production MSSQL Servers"
966 assign where host.vars.mssql_port && host.vars.prod_mysql_db
967 ignore where host.vars.test_server == true
968 ignore where match("*internal", host.name)
971 In this example all hosts with the `vars` attribute `mssql_port`
972 will be added as members to the host group `mssql`. However, all `\*internal`
973 hosts or with the `test_server` attribute set to `true` are not added to this
976 Details on the `assign where` syntax can be found in the
977 [Language Reference](18-language-reference.md#apply)
979 ## <a id="notifications"></a> Notifications
981 Notifications for service and host problems are an integral part of your
984 When a host or service is in a downtime, a problem has been acknowledged or
985 the dependency logic determined that the host/service is unreachable, no
986 notifications are sent. You can configure additional type and state filters
987 refining the notifications being actually sent.
989 There are many ways of sending notifications, e.g. by e-mail, XMPP,
990 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
991 Instead it relies on external mechanisms such as shell scripts to notify users.
992 More notification methods are listed in the [addons and plugins](14-addons-plugins.md#notification-scripts-interfaces)
995 A notification specification requires one or more users (and/or user groups)
996 who will be notified in case of problems. These users must have all custom
997 attributes defined which will be used in the `NotificationCommand` on execution.
999 The user `icingaadmin` in the example below will get notified only on `WARNING` and
1000 `CRITICAL` states and `problem` and `recovery` notification types.
1002 object User "icingaadmin" {
1003 display_name = "Icinga 2 Admin"
1004 enable_notifications = true
1005 states = [ OK, Warning, Critical ]
1006 types = [ Problem, Recovery ]
1007 email = "icinga@localhost"
1010 If you don't set the `states` and `types` configuration attributes for the `User`
1011 object, notifications for all states and types will be sent.
1013 Details on troubleshooting notification problems can be found [here](16-troubleshooting.md#troubleshooting).
1017 > Make sure that the [notification](8-cli-commands.md#enable-features) feature is enabled
1018 > in order to execute notification commands.
1020 You should choose which information you (and your notified users) are interested in
1021 case of emergency, and also which information does not provide any value to you and
1024 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
1026 You can add all shared attributes to a `Notification` template which is inherited
1027 to the defined notifications. That way you'll save duplicated attributes in each
1028 `Notification` object. Attributes can be overridden locally.
1030 template Notification "generic-notification" {
1033 command = "mail-service-notification"
1035 states = [ Warning, Critical, Unknown ]
1036 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1037 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1042 The time period `24x7` is included as example configuration with Icinga 2.
1044 Use the `apply` keyword to create `Notification` objects for your services:
1046 apply Notification "notify-cust-xy-mysql" to Service {
1047 import "generic-notification"
1049 users = [ "noc-xy", "mgmt-xy" ]
1051 assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true
1052 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
1056 Instead of assigning users to notifications, you can also add the `user_groups`
1057 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1058 send notifications to all group members.
1062 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1063 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1065 ### <a id="notification-escalations"></a> Notification Escalations
1067 When a problem notification is sent and a problem still exists at the time of re-notification
1068 you may want to escalate the problem to the next support level. A different approach
1069 is to configure the default notification by email, and escalate the problem via SMS
1070 if not already solved.
1072 You can define notification start and end times as additional configuration
1073 attributes making the `Notification` object a so-called `notification escalation`.
1074 Using templates you can share the basic notification attributes such as users or the
1075 `interval` (and override them for the escalation then).
1077 Using the example from above, you can define additional users being escalated for SMS
1078 notifications between start and end time.
1080 object User "icinga-oncall-2nd-level" {
1081 display_name = "Icinga 2nd Level"
1083 vars.mobile = "+1 555 424642"
1086 object User "icinga-oncall-1st-level" {
1087 display_name = "Icinga 1st Level"
1089 vars.mobile = "+1 555 424642"
1092 Define an additional [NotificationCommand](3-monitoring-basics.md#notification-commands) for SMS notifications.
1096 > The example is not complete as there are many different SMS providers.
1097 > Please note that sending SMS notifications will require an SMS provider
1098 > or local hardware with a SIM card active.
1100 object NotificationCommand "sms-notification" {
1102 PluginDir + "/send_sms_notification",
1107 The two new notification escalations are added onto the local host
1108 and its service `ping4` using the `generic-notification` template.
1109 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1110 command) after `30m` until `1h`.
1114 > The `interval` was set to 15m in the `generic-notification`
1115 > template example. Lower that value in your escalations by using a secondary
1116 > template or by overriding the attribute directly in the `notifications` array
1117 > position for `escalation-sms-2nd-level`.
1119 If the problem does not get resolved nor acknowledged preventing further notifications
1120 the `escalation-sms-1st-level` user will be escalated `1h` after the initial problem was
1121 notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
1123 apply Notification "mail" to Service {
1124 import "generic-notification"
1126 command = "mail-notification"
1127 users = [ "icingaadmin" ]
1129 assign where service.name == "ping4"
1132 apply Notification "escalation-sms-2nd-level" to Service {
1133 import "generic-notification"
1135 command = "sms-notification"
1136 users = [ "icinga-oncall-2nd-level" ]
1143 assign where service.name == "ping4"
1146 apply Notification "escalation-sms-1st-level" to Service {
1147 import "generic-notification"
1149 command = "sms-notification"
1150 users = [ "icinga-oncall-1st-level" ]
1157 assign where service.name == "ping4"
1160 ### <a id="notification-delay"></a> Notification Delay
1162 Sometimes the problem in question should not be notified when the notification is due
1163 (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
1164 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
1165 postpone the notification window for 15 minutes. Leave out the `end` key - if not set,
1166 Icinga 2 will not check against any end time for this notification. Make sure to
1167 specify a relatively low notification `interval` to get notified soon enough again.
1169 apply Notification "mail" to Service {
1170 import "generic-notification"
1172 command = "mail-notification"
1173 users = [ "icingaadmin" ]
1177 times.begin = 15m // delay notification window
1179 assign where service.name == "ping4"
1182 ### <a id="disable-renotification"></a> Disable Re-notifications
1184 If you prefer to be notified only once, you can disable re-notifications by setting the
1185 `interval` attribute to `0`.
1187 apply Notification "notify-once" to Service {
1188 import "generic-notification"
1190 command = "mail-notification"
1191 users = [ "icingaadmin" ]
1193 interval = 0 // disable re-notification
1195 assign where service.name == "ping4"
1198 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
1200 If there are no notification state and type filter attributes defined at the `Notification`
1201 or `User` object Icinga 2 assumes that all states and types are being notified.
1203 Available state and type filters for notifications are:
1205 template Notification "generic-notification" {
1207 states = [ Warning, Critical, Unknown ]
1208 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1209 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1212 If you are familiar with Icinga 1.x `notification_options` please note that they have been split
1213 into type and state to allow more fine granular filtering for example on downtimes and flapping.
1214 You can filter for acknowledgements and custom notifications too.
1217 ## <a id="commands"></a> Commands
1219 Icinga 2 uses three different command object types to specify how
1220 checks should be performed, notifications should be sent, and
1221 events should be handled.
1223 ### <a id="check-commands"></a> Check Commands
1225 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects define the command line how
1228 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects are referenced by
1229 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1230 using the `check_command` attribute.
1234 > Make sure that the [checker](8-cli-commands.md#enable-features) feature is enabled in order to
1237 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1239 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects require the [ITL template](7-icinga-template-library.md#itl-plugin-check-command)
1240 `plugin-check-command` to support native plugin based check methods.
1242 Unless you have done so already, download your check plugin and put it
1243 into the [PluginDir](4-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1244 `check_mysql` plugin contained in the Monitoring Plugins package.
1246 The plugin path and all command arguments are made a list of
1247 double-quoted string arguments for proper shell escaping.
1249 Call the `check_disk` plugin with the `--help` parameter to see
1250 all available options. Our example defines warning (`-w`) and
1251 critical (`-c`) thresholds for the disk usage. Without any
1252 partition defined (`-p`) it will check all local partitions.
1254 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1257 This program tests connections to a MySQL server
1260 check_mysql [-d database] [-H host] [-P port] [-s socket]
1261 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1262 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
1264 Next step is to understand how [command parameters](3-monitoring-basics.md#command-passing-parameters)
1265 are being passed from a host or service object, and add a [CheckCommand](6-object-types.md#objecttype-checkcommand)
1266 definition based on these required parameters and/or default values.
1268 Please continue reading in the [plugins section](14-addons-plugins.md#plugins) for additional integration examples.
1270 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1272 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1273 by the executed check command.
1275 The check command parameters for ITL provided plugin check command definitions are documented
1276 [here](7-icinga-template-library.md#plugin-check-commands), for example
1277 [disk](7-icinga-template-library.md#plugin-check-command-disk).
1279 In order to practice passing command parameters you should [integrate your own plugin](3-monitoring-basics.md#command-plugin-integration).
1281 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](2-getting-started.md#setting-up-check-plugins).
1283 Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
1284 (freely definable naming schema) and optional their default threshold values. You can
1285 then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
1286 on the command line.
1290 > Use a common command type as prefix for your command arguments to increase
1291 > readability. `mysql_user` helps understanding the context better than just
1292 > `user` as argument.
1294 The default custom attributes can be overridden by the custom attributes
1295 defined in the host or service using the check command `my-mysql`. The custom attributes
1296 can also be inherited from a parent template using additive inheritance (`+=`).
1298 # vim /etc/icinga2/conf.d/commands.conf
1300 object CheckCommand "my-mysql" {
1301 import "plugin-check-command"
1303 command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir
1306 "-H" = "$mysql_host$"
1309 value = "$mysql_user$"
1311 "-p" = "$mysql_password$"
1312 "-P" = "$mysql_port$"
1313 "-s" = "$mysql_socket$"
1314 "-a" = "$mysql_cert$"
1315 "-d" = "$mysql_database$"
1316 "-k" = "$mysql_key$"
1317 "-C" = "$mysql_ca_cert$"
1318 "-D" = "$mysql_ca_dir$"
1319 "-L" = "$mysql_ciphers$"
1320 "-f" = "$mysql_optfile$"
1321 "-g" = "$mysql_group$"
1323 set_if = "$mysql_check_slave$"
1324 description = "Check if the slave thread is running properly."
1327 set_if = "$mysql_ssl$"
1328 description = "Use ssl encryption"
1332 vars.mysql_check_slave = false
1333 vars.mysql_ssl = false
1334 vars.mysql_host = "$address$"
1337 The check command definition also sets `mysql_host` to the `$address$` default value. You can override
1338 this command parameter if for example your MySQL host is not running on the same server's ip address.
1340 Make sure pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
1341 `MysqlUsername` and `MysqlPassword` are specified as [global constants](4-configuring-icinga-2.md#constants-conf)
1344 # vim /etc/icinga2/conf.d/services.conf
1346 apply Service "mysql-icinga-db-health" {
1347 import "generic-service"
1349 check_command = "my-mysql"
1351 vars.mysql_user = MysqlUsername
1352 vars.mysql_password = MysqlPassword
1354 vars.mysql_database = "icinga"
1355 vars.mysql_host = "192.168.33.11"
1357 assign where match("icinga2*", host.name)
1358 ignore where host.vars.no_health_check == true
1362 Take a different example: The example host configuration in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
1363 also applies an `ssh` service check. Your host's ssh port is not the default `22`, but set to `2022`.
1364 You can pass the command parameter as custom attribute `ssh_port` directly inside the service apply rule
1365 inside [services.conf](4-configuring-icinga-2.md#services-conf):
1367 apply Service "ssh" {
1368 import "generic-service"
1370 check_command = "ssh"
1371 vars.ssh_port = 2022 //custom command parameter
1373 assign where (host.address || host.address6) && host.vars.os == "Linux"
1376 If you prefer this being configured at the host instead of the service, modify the host configuration
1377 object instead. The runtime macro resolving order is described [here](3-monitoring-basics.md#macro-evaluation-order).
1379 object Host NodeName {
1381 vars.ssh_port = 2022
1384 #### <a id="command-passing-parameters-apply-for"></a> Passing Check Command Parameters Using Apply For
1386 The host `localhost` with the generated services from the `basic-partitions` dictionary (see
1387 [apply for](3-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
1388 with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
1391 The custom attribute `disk_partition` can either hold a single string or an array of
1392 string values for passing multiple partitions to the `check_disk` check plugin.
1394 object Host "my-server" {
1395 import "generic-host"
1396 address = "127.0.0.1"
1399 vars.local_disks["basic-partitions"] = {
1400 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1404 apply Service for (disk => config in host.vars.local_disks) {
1405 import "generic-service"
1406 check_command = "my-disk"
1410 vars.disk_wfree = "10%"
1411 vars.disk_cfree = "5%"
1415 More details on using arrays in custom attributes can be found in
1416 [this chapter](3-monitoring-basics.md#custom-attributes).
1419 #### <a id="command-arguments"></a> Command Arguments
1421 By defining a check command line using the `command` attribute Icinga 2
1422 will resolve all macros in the static string or array. Sometimes it is
1423 required to extend the arguments list based on a met condition evaluated
1424 at command execution. Or making arguments optional - only set if the
1425 macro value can be resolved by Icinga 2.
1427 object CheckCommand "check_http" {
1428 import "plugin-check-command"
1430 command = [ PluginDir + "/check_http" ]
1433 "-H" = "$http_vhost$"
1434 "-I" = "$http_address$"
1436 "-p" = "$http_port$"
1438 set_if = "$http_ssl$"
1441 set_if = "$http_sni$"
1444 value = "$http_auth_pair$"
1445 description = "Username:password on sites with basic authentication"
1448 set_if = "$http_ignore_body$"
1450 "-r" = "$http_expect_body_regex$"
1451 "-w" = "$http_warn_time$"
1452 "-c" = "$http_critical_time$"
1453 "-e" = "$http_expect$"
1456 vars.http_address = "$address$"
1457 vars.http_ssl = false
1458 vars.http_sni = false
1461 The example shows the `check_http` check command defining the most common
1462 arguments. Each of them is optional by default and will be omitted if
1463 the value is not set. For example if the service calling the check command
1464 does not have `vars.http_port` set, it won't get added to the command
1467 If the `vars.http_ssl` custom attribute is set in the service, host or command
1468 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
1469 numeric value to the command line. String values are not supported.
1471 If the macro value cannot be resolved, Icinga 2 will not add the defined argument
1472 to the final command argument array. Empty strings for macro values won't omit
1475 That way you can use the `check_http` command definition for both, with and
1476 without SSL enabled checks saving you duplicated command definitions.
1478 Details on all available options can be found in the
1479 [CheckCommand object definition](6-object-types.md#objecttype-checkcommand).
1482 #### <a id="command-environment-variables"></a> Environment Variables
1484 The `env` command object attribute specifies a list of environment variables with values calculated
1485 from either runtime macros or custom attributes which should be exported as environment variables
1486 prior to executing the command.
1488 This is useful for example for hiding sensitive information on the command line output
1489 when passing credentials to database checks:
1491 object CheckCommand "mysql-health" {
1492 import "plugin-check-command"
1495 PluginDir + "/check_mysql"
1499 "-H" = "$mysql_address$"
1500 "-d" = "$mysql_database$"
1503 vars.mysql_address = "$address$"
1504 vars.mysql_database = "icinga"
1505 vars.mysql_user = "icinga_check"
1506 vars.mysql_pass = "password"
1508 env.MYSQLUSER = "$mysql_user$"
1509 env.MYSQLPASS = "$mysql_pass$"
1514 ### <a id="notification-commands"></a> Notification Commands
1516 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
1517 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
1519 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
1520 [Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
1522 `NotificationCommand` objects require the [ITL template](7-icinga-template-library.md#itl-plugin-notification-command)
1523 `plugin-notification-command` to support native plugin-based notifications.
1527 > Make sure that the [notification](8-cli-commands.md#enable-features) feature is enabled
1528 > in order to execute notification commands.
1530 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1531 the current check output) sending an email to the user(s) associated with the
1532 notification itself (`$user.email$`).
1534 If you want to specify default values for some of the custom attribute definitions,
1535 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1537 object NotificationCommand "mail-service-notification" {
1538 import "plugin-notification-command"
1540 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1543 NOTIFICATIONTYPE = "$notification.type$"
1544 SERVICEDESC = "$service.name$"
1545 HOSTALIAS = "$host.display_name$"
1546 HOSTADDRESS = "$address$"
1547 SERVICESTATE = "$service.state$"
1548 LONGDATETIME = "$icinga.long_date_time$"
1549 SERVICEOUTPUT = "$service.output$"
1550 NOTIFICATIONAUTHORNAME = "$notification.author$"
1551 NOTIFICATIONCOMMENT = "$notification.comment$"
1552 HOSTDISPLAYNAME = "$host.display_name$"
1553 SERVICEDISPLAYNAME = "$service.display_name$"
1554 USEREMAIL = "$user.email$"
1558 The command attribute in the `mail-service-notification` command refers to the following
1559 shell script. The macros specified in the `env` array are exported
1560 as environment variables and can be used in the notification script:
1563 template=$(cat <<TEMPLATE
1566 Notification Type: $NOTIFICATIONTYPE
1568 Service: $SERVICEDESC
1570 Address: $HOSTADDRESS
1571 State: $SERVICESTATE
1573 Date/Time: $LONGDATETIME
1575 Additional Info: $SERVICEOUTPUT
1577 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
1581 /usr/bin/printf "%b" $template | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" $USEREMAIL
1585 > This example is for `exim` only. Requires changes for `sendmail` and
1588 While it's possible to specify the entire notification command right
1589 in the NotificationCommand object it is generally advisable to create a
1590 shell script in the `/etc/icinga2/scripts` directory and have the
1591 NotificationCommand object refer to that.
1593 ### <a id="event-commands"></a> Event Commands
1595 Unlike notifications, event commands for hosts/services are called on every
1596 check execution if one of these conditions match:
1598 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1599 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1600 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1602 [EventCommand](6-object-types.md#objecttype-eventcommand) objects are referenced by
1603 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1604 using the `event_command` attribute.
1606 Therefore the `EventCommand` object should define a command line
1607 evaluating the current service state and other service runtime attributes
1608 available through runtime vars. Runtime macros such as `$service.state_type$`
1609 and `$service.state$` will be processed by Icinga 2 helping on fine-granular
1610 events being triggered.
1612 Common use case scenarios are a failing HTTP check requiring an immediate
1613 restart via event command, or if an application is locked and requires
1614 a restart upon detection.
1616 `EventCommand` objects require the ITL template `plugin-event-command`
1617 to support native plugin based checks.
1619 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
1621 The following example will triggert a restart of the `httpd` daemon
1622 via ssh when the `http` service check fails. If the service state is
1623 `OK`, it will not trigger any event action.
1628 * icinga user with public key authentication
1629 * icinga user with sudo permissions for restarting the httpd daemon.
1633 # ls /home/icinga/.ssh/
1637 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1640 Define a generic [EventCommand](6-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1641 which can be used for all event commands triggered using ssh:
1643 /* pass event commands through ssh */
1644 object EventCommand "event_by_ssh" {
1645 import "plugin-event-command"
1647 command = [ PluginDir + "/check_by_ssh" ]
1650 "-H" = "$event_by_ssh_address$"
1651 "-p" = "$event_by_ssh_port$"
1652 "-C" = "$event_by_ssh_command$"
1653 "-l" = "$event_by_ssh_logname$"
1654 "-i" = "$event_by_ssh_identity$"
1656 set_if = "$event_by_ssh_quiet$"
1658 "-w" = "$event_by_ssh_warn$"
1659 "-c" = "$event_by_ssh_crit$"
1660 "-t" = "$event_by_ssh_timeout$"
1663 vars.event_by_ssh_address = "$address$"
1664 vars.event_by_ssh_quiet = false
1667 The actual event command only passes the `event_by_ssh_command` attribute.
1668 The `event_by_ssh_service` custom attribute takes care of passing the correct
1669 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
1670 is only restarted when the service is not in an `OK` state.
1673 object EventCommand "event_by_ssh_restart_service" {
1674 import "event_by_ssh"
1676 //only restart the daemon if state > 0 (not-ok)
1677 //requires sudo permissions for the icinga user
1678 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1682 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1683 which service should be restarted using the `event_by_ssh_service` attribute.
1685 object Service "http" {
1686 import "generic-service"
1687 host_name = "remote-http-host"
1688 check_command = "http"
1690 event_command = "event_by_ssh_restart_service"
1691 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1693 //vars.event_by_ssh_logname = "icinga"
1694 //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa.pub"
1698 Each host with this service then must define the `httpd_name` custom attribute
1699 (for example generated from your cmdb):
1701 object Host "remote-http-host" {
1702 import "generic-host"
1703 address = "192.168.1.100"
1705 vars.httpd_name = "apache2"
1708 You can testdrive this example by manually stopping the `httpd` daemon
1709 on your `remote-http-host`. Enable the `debuglog` feature and tail the
1710 `/var/log/icinga2/debug.log` file.
1712 Remote Host Terminal:
1714 # date; service apache2 status
1715 Mon Sep 15 18:57:39 CEST 2014
1716 Apache2 is running (pid 23651).
1717 # date; service apache2 stop
1718 Mon Sep 15 18:57:47 CEST 2014
1719 [ ok ] Stopping web server: apache2 ... waiting .
1721 Icinga 2 Host Terminal:
1723 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
1724 [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
1725 [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
1726 [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
1727 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
1728 [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
1730 Remote Host Terminal:
1732 # date; service apache2 status
1733 Mon Sep 15 18:58:44 CEST 2014
1734 Apache2 is running (pid 24908).
1737 ## <a id="dependencies"></a> Dependencies
1739 Icinga 2 uses host and service [Dependency](6-object-types.md#objecttype-dependency) objects
1740 for determing their network reachability.
1742 A service can depend on a host, and vice versa. A service has an implicit
1743 dependency (parent) to its host. A host to host dependency acts implicitly
1744 as host parent relation.
1745 When dependencies are calculated, not only the immediate parent is taken into
1746 account but all parents are inherited.
1748 The `parent_host_name` and `parent_service_name` attributes are mandatory for
1749 service dependencies, `parent_host_name` is required for host dependencies.
1750 [Apply rules](3-monitoring-basics.md#using-apply) will allow you to
1751 [determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
1752 dynamic fashion if required.
1754 parent_host_name = "core-router"
1755 parent_service_name = "uplink-port"
1757 Notifications are suppressed by default if a host or service becomes unreachable.
1758 You can control that option by defining the `disable_notifications` attribute.
1760 disable_notifications = false
1762 If the dependency should be triggered in the parent object's soft state, you
1763 need to set `ignore_soft_states` to `false`.
1765 The dependency state filter must be defined based on the parent object being
1766 either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
1768 The following example will make the dependency fail and trigger it if the parent
1769 object is **not** in one of these states:
1771 states = [ OK, Critical, Unknown ]
1773 Rephrased: If the parent service object changes into the `Warning` state, this
1774 dependency will fail and render all child objects (hosts or services) unreachable.
1776 You can determine the child's reachability by querying the `is_reachable` attribute
1777 in for example [DB IDO](23-appendix.md#schema-db-ido-extensions).
1779 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
1781 Icinga 2 automatically adds an implicit dependency for services on their host. That way
1782 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
1783 does not overwrite other dependencies and implicitely sets `disable_notifications = true` and
1784 `states = [ Up ]` for all service objects.
1786 Service checks are still executed. If you want to prevent them from happening, you can
1787 apply the following dependency to all services setting their host as `parent_host_name`
1788 and disabling the checks. `assign where true` matches on all `Service` objects.
1790 apply Dependency "disable-host-service-checks" to Service {
1791 disable_checks = true
1795 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
1797 A common scenario is the Icinga 2 server behind a router. Checking internet
1798 access by pinging the Google DNS server `google-dns` is a common method, but
1799 will fail in case the `dsl-router` host is down. Therefore the example below
1800 defines a host dependency which acts implicitly as parent relation too.
1802 Furthermore the host may be reachable but ping probes are dropped by the
1803 router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
1804 further checks for the `ping4` service on host `google-dns` service should
1805 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1807 object Host "dsl-router" {
1808 import "generic-host"
1809 address = "192.168.1.1"
1812 object Host "google-dns" {
1813 import "generic-host"
1817 apply Service "ping4" {
1818 import "generic-service"
1820 check_command = "ping4"
1822 assign where host.address
1825 apply Dependency "internet" to Host {
1826 parent_host_name = "dsl-router"
1827 disable_checks = true
1828 disable_notifications = true
1830 assign where host.name != "dsl-router"
1833 apply Dependency "internet" to Service {
1834 parent_host_name = "dsl-router"
1835 parent_service_name = "ping4"
1836 disable_checks = true
1838 assign where host.name != "dsl-router"
1841 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
1843 You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
1844 child attributes e.g. `parent_host_name` to other object's
1847 A common example are virtual machines hosted on a master. The object
1848 name of that master is auto-generated from your CMDB or VMWare inventory
1849 into the host's custom attributes (or a generic template for your
1852 Define your master host object:
1855 object Host "master.example.com" {
1856 import "generic-host"
1859 Add a generic template defining all common host attributes:
1861 /* generic template for your virtual machines */
1862 template Host "generic-vm" {
1863 import "generic-host"
1866 Add a template for all hosts on your example.com cloud setting
1867 custom attribute `vm_parent` to `master.example.com`:
1869 template Host "generic-vm-example.com" {
1871 vars.vm_parent = "master.example.com"
1874 Define your guest hosts:
1876 object Host "www.example1.com" {
1877 import "generic-vm-master.example.com"
1880 object Host "www.example2.com" {
1881 import "generic-vm-master.example.com"
1884 Apply the host dependency to all child hosts importing the
1885 `generic-vm` template and set the `parent_host_name`
1886 to the previously defined custom attribute `host.vars.vm_parent`.
1888 apply Dependency "vm-host-to-parent-master" to Host {
1889 parent_host_name = host.vars.vm_parent
1890 assign where "generic-vm" in host.templates
1893 You can extend this example, and make your services depend on the
1894 `master.example.com` host too. Their local scope allows you to use
1895 `host.vars.vm_parent` similar to the example above.
1897 apply Dependency "vm-service-to-parent-master" to Service {
1898 parent_host_name = host.vars.vm_parent
1899 assign where "generic-vm" in host.templates
1902 That way you don't need to wait for your guest hosts becoming
1903 unreachable when the master host goes down. Instead the services
1904 will detect their reachability immediately when executing checks.
1908 > This method with setting locally scoped variables only works in
1909 > apply rules, but not in object definitions.
1912 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1914 Another classic example are agent based checks. You would define a health check
1915 for the agent daemon responding to your requests, and make all other services
1916 querying that daemon depend on that health check.
1918 The following configuration defines two nrpe based service checks `nrpe-load`
1919 and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
1920 `nrpe-health` service.
1922 apply Service "nrpe-health" {
1923 import "generic-service"
1924 check_command = "nrpe"
1925 assign where match("nrpe-*", host.name)
1928 apply Service "nrpe-load" {
1929 import "generic-service"
1930 check_command = "nrpe"
1931 vars.nrpe_command = "check_load"
1932 assign where match("nrpe-*", host.name)
1935 apply Service "nrpe-disk" {
1936 import "generic-service"
1937 check_command = "nrpe"
1938 vars.nrpe_command = "check_disk"
1939 assign where match("nrpe-*", host.name)
1942 object Host "nrpe-server" {
1943 import "generic-host"
1944 address = "192.168.1.5"
1947 apply Dependency "disable-nrpe-checks" to Service {
1948 parent_service_name = "nrpe-health"
1951 disable_checks = true
1952 disable_notifications = true
1953 assign where service.check_command == "nrpe"
1954 ignore where service.name == "nrpe-health"
1957 The `disable-nrpe-checks` dependency is applied to all services
1958 on the `nrpe-service` host using the `nrpe` check_command attribute
1959 but not the `nrpe-health` service itself.