# <a id="monitoring-basics"></a> Monitoring Basics

This part of the Icinga 2 documentation provides an overview of all the basic
monitoring concepts you need to know to run Icinga 2.
Note that these examples assume a Linux server. If you are
using Windows, you will need to change the services accordingly. See the [ITL reference](10-icinga-template-library.md#windows-plugins)
for further information.

9 ## <a id="hosts-services"></a> Hosts and Services
11 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
12 and services can be virtually anything which can be checked in some way:
14 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
18 * Other local or network-accessible services
20 Host objects provide a mechanism to group services that are running
21 on the same physical device.
23 Here is an example of a host object which defines two child services:

    object Host "my-server1" {
      address = "192.0.2.10" // example address (assumption); replace with your host's IP
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http"
    }

The example creates two services, `ping4` and `http`, which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

46 The `address` attribute is used by check commands to determine which network
47 address is associated with the host object.
49 Details on troubleshooting check problems can be found [here](15-troubleshooting.md#troubleshooting).
### <a id="host-states"></a> Host States

Hosts can be in any of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.

### <a id="service-states"></a> Service States

Services can be in any of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

### <a id="hard-soft-states"></a> Hard and Soft States

When detecting a problem with a host/service, Icinga re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed.
  SOFT        | The host/service has recently changed state and is being re-checked.

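
As a sketch of how these settings fit together, the following service is re-checked at a shorter interval while it is in a `SOFT` state. The object name, host name and interval values are illustrative assumptions, not recommended defaults.

```
object Service "http" {
  host_name = "my-server1"
  check_command = "http"

  check_interval = 5m       // regular check interval (illustrative value)
  retry_interval = 1m       // shorter interval for re-checks in a SOFT state (illustrative value)
  max_check_attempts = 3    // number of attempts before a problem becomes HARD
}
```

With these values, a failing check would be repeated roughly once per minute and reach a `HARD` state on the third consecutive failed attempt.
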
### <a id="host-service-checks"></a> Host and Service Checks

Hosts and services determine their state by running checks at regular intervals.

    object Host "router" {
      check_command = "hostalive"
      address = "192.0.2.1" // example address (assumption); replace with your router's IP
    }

95 The `hostalive` command is one of several built-in check commands. It sends ICMP
96 echo requests to the IP address specified in the `address` attribute to determine
97 whether a host is online.
99 A number of other [built-in check commands](10-icinga-template-library.md#plugin-check-commands) are also
100 available. In addition to these commands the next few chapters will explain in
101 detail how to set up your own check commands.
## <a id="object-inheritance-using-templates"></a> Templates

Templates may be used to apply a set of identical attributes to more than one
object:

    template Service "generic-service" {
      max_check_attempts = 3
      enable_perfdata = true
    }

    apply Service "ping4" {
      import "generic-service"

      check_command = "ping4"

      assign where host.address
    }

    apply Service "ping6" {
      import "generic-service"

      check_command = "ping6"

      assign where host.address6
    }

133 In this example the `ping4` and `ping6` services inherit properties from the
134 template `generic-service`.
Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object if necessary.

You can also import existing non-template objects. Note that templates
and objects share the same namespace, i.e. you can't define a template
that has the same name as an object.

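
A minimal sketch of overriding an inherited attribute, assuming the `generic-service` template shown above and a hypothetical host named `my-server1`:

```
object Service "ping4" {
  import "generic-service"

  host_name = "my-server1"   // hypothetical host name
  check_command = "ping4"

  // Overrides the value of 3 inherited from the "generic-service" template.
  max_check_attempts = 5
}
```

All other attributes of the template, such as `enable_perfdata`, are still inherited unchanged.
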
## <a id="custom-attributes"></a> Custom Attributes

In addition to built-in attributes you can define your own attributes:

    object Host "localhost" {
      vars.ssh_port = 2222 // illustrative custom attribute (assumption)
    }

153 Valid values for custom attributes include:
155 * [Strings](17-language-reference.md#string-literals), [numbers](17-language-reference.md#numeric-literals) and [booleans](17-language-reference.md#boolean-literals)
156 * [Arrays](17-language-reference.md#array) and [dictionaries](17-language-reference.md#dictionary)
157 * [Functions](3-monitoring-basics.md#custom-attributes-functions)
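
The value types listed above could be combined on a single host as in the following sketch. All attribute names and values are made up for illustration.

```
object Host "localhost" {
  check_command = "hostalive"
  address = "127.0.0.1"

  vars.os = "Linux"                 // string
  vars.ssh_port = 2222              // number
  vars.is_production = true         // boolean
  vars.disks = [ "/", "/var" ]      // array
  vars.notification = {             // dictionary, may be nested
    mail = { groups = [ "icingaadmins" ] }
  }
}
```
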
159 ### <a id="custom-attributes-functions"></a> Functions as Custom Attributes
161 Icinga 2 lets you specify [functions](17-language-reference.md#functions) for custom attributes.
162 The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
163 the function and uses whatever value the function returns:

    object CheckCommand "random-value" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_dummy", "0", "$text$" ]

      vars.text = {{ Math.random() * 100 }}
    }

173 This example uses the [abbreviated lambda syntax](17-language-reference.md#nullary-lambdas).
175 These functions have access to a number of variables:
177 Variable | Description
178 -------------|---------------
179 user | The User object (for notifications).
180 service | The Service object (for service checks/notifications/event handlers).
181 host | The Host object.
182 command | The command object (e.g. a CheckCommand object for checks).
186 vars.text = {{ host.check_interval }}
188 In addition to these variables the `macro` function can be used to retrieve the
189 value of arbitrary macro expressions:

    if (macro("$address$") == "127.0.0.1") {
      log("Running a check for localhost!")
    }

The `resolve_arguments` function can be used to resolve a command and its arguments in much
the same fashion as Icinga does for the `command` and `arguments` attributes of
commands. The `by_ssh` command uses this functionality to let users specify a
command and arguments that should be executed via SSH:

    var command = macro("$by_ssh_command$")
    var arguments = macro("$by_ssh_arguments$")

    if (typeof(command) == String && !arguments) {
      return command
    }

    var escaped_args = []
    for (arg in resolve_arguments(command, arguments)) {
      escaped_args.add(escape_shell_arg(arg))
    }
    return escaped_args.join(" ")

Accessing object attributes at runtime inside these functions is described in the
[advanced topics](8-advanced-topics.md#access-object-attributes-at-runtime) chapter.

## <a id="runtime-macros"></a> Runtime Macros

Macros can be used to access other objects' attributes at runtime. For example they
are used in command definitions to figure out which IP address a check should be
run against:

    object CheckCommand "my-ping" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]

      arguments = {
        "-w" = "$ping_wrta$,$ping_wpl$%"
        "-c" = "$ping_crta$,$ping_cpl$%"
        "-p" = "$ping_packets$"
      }

      vars.ping_address = "$address$"

      vars.ping_packets = 5
    }

    object Host "router" {
      check_command = "my-ping"
      address = "192.0.2.1" // example address (assumption); replace with your router's IP
    }

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

261 We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
262 automatically tries to find the closest match for the attribute you specified. The
263 exact rules for this are explained in the next section.
> When using the `$` sign as a single character, you must escape it with an
> additional dollar character (`$$`).

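
For instance, a literal dollar sign in a command argument could be written as in this sketch. The command name and output text are hypothetical; `check_dummy` is used only because it echoes the text it is given.

```
object CheckCommand "dummy-price" {
  import "plugin-check-command"

  // "$$" is resolved to a single literal "$" when the command line is built.
  command = [ PluginDir + "/check_dummy", "0", "Price: 5$$" ]
}
```
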
### <a id="macro-evaluation-order"></a> Evaluation Order

When executing commands Icinga 2 checks the following objects in this order to look
up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom attributes in the `Vars` constant

This evaluation order allows you to define default values for custom attributes
in your command objects.

Here's how you can override the custom attribute `ping_packets` from the previous
example:

    object Service "ping" {
      host_name = "localhost"
      check_command = "my-ping"

      vars.ping_packets = 10 // Overrides the default value of 5 given in the command
    }

295 If a custom attribute isn't defined anywhere, an empty value is used and a warning is
296 written to the Icinga 2 log.
298 You can also directly refer to a specific attribute -- thereby ignoring these evaluation
299 rules -- by specifying the full attribute name:
301 $service.vars.ping_wrta$
303 This retrieves the value of the `ping_wrta` custom attribute for the service. This
304 returns an empty value if the service does not have such a custom attribute no matter
305 whether another object such as the host has this attribute.
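
As a sketch, a command definition could pin its thresholds to the host object by using fully qualified macro names. The command name is hypothetical, and the custom attribute names are reused from the `my-ping` example.

```
object CheckCommand "my-ping-host-thresholds" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_ping", "-H", "$address$" ]

  arguments = {
    // Always read the thresholds from the host object;
    // values set on the service or the command are not considered.
    "-w" = "$host.vars.ping_wrta$,$host.vars.ping_wpl$%"
    "-c" = "$host.vars.ping_crta$,$host.vars.ping_cpl$%"
  }
}
```
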
### <a id="host-runtime-macros"></a> Host Runtime Macros

The following host macros are available in all commands that are executed for
hosts or services:

314 -----------------------------|--------------
315 host.name | The name of the host object.
316 host.display_name | The value of the `display_name` attribute.
317 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
318 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
319 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
320 host.check_attempt | The current check attempt number.
321 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
322 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
323 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
324 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
325 host.last_state_change | The last state change's timestamp.
326 host.downtime_depth | The number of active downtimes.
327 host.duration_sec | The time since the last state change.
328 host.latency | The host's check latency.
329 host.execution_time | The host's check execution time.
330 host.output | The last check's output.
331 host.perfdata | The last check's performance data.
332 host.last_check | The timestamp when the last check was executed.
333 host.check_source | The monitoring instance that performed the last check.
334 host.num_services | Number of services associated with the host.
335 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
336 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
337 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
338 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
### <a id="service-runtime-macros"></a> Service Runtime Macros

The following service macros are available in all commands that are executed for
services:

346 ---------------------------|--------------
347 service.name | The short name of the service object.
348 service.display_name | The value of the `display_name` attribute.
349 service.check_command | The short name of the command along with any arguments to be used for the check.
350 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
351 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
352 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
353 service.check_attempt | The current check attempt number.
354 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
355 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
356 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
357 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
358 service.last_state_change | The last state change's timestamp.
359 service.downtime_depth | The number of active downtimes.
360 service.duration_sec | The time since the last state change.
361 service.latency | The service's check latency.
362 service.execution_time | The service's check execution time.
363 service.output | The last check's output.
364 service.perfdata | The last check's performance data.
365 service.last_check | The timestamp when the last check was executed.
366 service.check_source | The monitoring instance that performed the last check.
### <a id="command-runtime-macros"></a> Command Runtime Macros

The following command macros are available in all commands:

  Name                   | Description
  -----------------------|--------------
  command.name           | The name of the command object.

### <a id="user-runtime-macros"></a> User Runtime Macros

The following user macros are available in all commands that are executed for
users:

  Name                   | Description
  -----------------------|--------------
  user.name              | The name of the user object.
  user.display_name      | The value of the `display_name` attribute.

### <a id="notification-runtime-macros"></a> Notification Runtime Macros

  Name                   | Description
  -----------------------|--------------
  notification.type      | The type of the notification.
  notification.author    | The author of the notification comment, if one exists.
  notification.comment   | The comment of the notification, if one exists.

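
Runtime macros can be passed to notification scripts as environment variables through the `env` attribute of a `NotificationCommand`. The following is a sketch only: the command name, script path and environment variable names are assumptions for illustration.

```
object NotificationCommand "my-host-notification" {
  import "plugin-notification-command"

  // Hypothetical script location; adjust to your installation.
  command = [ SysconfDir + "/icinga2/scripts/my-host-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    NOTIFICATIONAUTHOR = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTNAME = "$host.name$"
    HOSTSTATE = "$host.state$"
    HOSTOUTPUT = "$host.output$"
    USERNAME = "$user.name$"
  }
}
```
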
394 ### <a id="global-runtime-macros"></a> Global Runtime Macros
396 The following macros are available in all executed commands:
399 -----------------------|--------------
400 icinga.timet | Current UNIX timestamp.
401 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
402 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
403 icinga.date | Current date. Example: `2014-01-03`
404 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
405 icinga.uptime | Current uptime of the Icinga 2 process.
407 The following macros provide global statistics:
410 ----------------------------------|--------------
411 icinga.num_services_ok | Current number of services in state 'OK'.
412 icinga.num_services_warning | Current number of services in state 'Warning'.
413 icinga.num_services_critical | Current number of services in state 'Critical'.
414 icinga.num_services_unknown | Current number of services in state 'Unknown'.
415 icinga.num_services_pending | Current number of pending services.
416 icinga.num_services_unreachable | Current number of unreachable services.
417 icinga.num_services_flapping | Current number of flapping services.
418 icinga.num_services_in_downtime | Current number of services in downtime.
419 icinga.num_services_acknowledged | Current number of acknowledged service problems.
420 icinga.num_hosts_up | Current number of hosts in state 'Up'.
421 icinga.num_hosts_down | Current number of hosts in state 'Down'.
422 icinga.num_hosts_unreachable | Current number of unreachable hosts.
423 icinga.num_hosts_pending | Current number of pending hosts.
424 icinga.num_hosts_flapping | Current number of flapping hosts.
425 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
426 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
## <a id="using-apply"></a> Apply Rules

Instead of assigning each object ([Service](9-object-types.md#objecttype-service),
[Notification](9-object-types.md#objecttype-notification), [Dependency](9-object-types.md#objecttype-dependency),
[ScheduledDowntime](9-object-types.md#objecttype-scheduleddowntime))
individually through attribute identifiers such as `host_name`, objects can be [applied](17-language-reference.md#apply).

Before you start using apply rules, keep the following in mind:

* Define the best match.
    * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes) for these hosts/services?
    * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
    * A generic pattern [match](17-language-reference.md#function-calls) on the host/service name?
    * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](17-language-reference.md#expression-operators)
* All expressions must return a boolean value (for example, an empty string evaluates to `false`)

447 > You can set/override object attributes in apply rules using the respectively available
448 > objects in that scope (host and/or service objects).
[Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
not only for matching on their existence or values in apply expressions, but also to assign
("inherit") their values to the objects generated by apply rules.

454 * [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
455 * [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
456 * [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
457 * [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)
A more advanced example is using [apply with for loops on arrays or
dictionaries](3-monitoring-basics.md#using-apply-for), for example provided by
[custom attributes](3-monitoring-basics.md#custom-attributes) or groups.

465 > Building configuration in that dynamic way requires detailed information
466 > of the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
467 > after successful [configuration validation](11-cli-commands.md#config-validation).
### <a id="using-apply-expressions"></a> Apply Rules Expressions

You can use simple or advanced combinations of apply rule expressions. Each
expression must evaluate to the boolean value `true` for the rule to match. An empty
string, for instance, is interpreted as `false`. Similarly, undefined
attributes evaluate to `false`.

479 assign where host.vars.attribute_does_not_exist
Multiple `assign where` lines are combined with a logical `OR`.

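
For example, the two `assign where` lines in the following sketch behave like a single expression joined with `||`. The `os` values are illustrative.

```
apply Service "ssh" {
  check_command = "ssh"

  // Either line is sufficient for the rule to match a host.
  assign where host.vars.os == "Linux"
  assign where host.vars.os == "FreeBSD"
}
```
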
You can combine multiple expressions to match only a subset of objects. In some cases,
you want to add more than one `assign where` or `ignore where` expression matching
a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.

The following example matches all hosts whose name matches the `*mysql*` pattern and (`&&`) whose
custom attribute `prod_mysql_db` matches the `db-*` pattern. Hosts with the custom attribute
`test_server` set to `true`, or with a host name matching the `*internal` pattern, are ignored.

    object HostGroup "mysql-server" {
      display_name = "MySQL Server"

      assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
      ignore where host.vars.test_server == true
      ignore where match("*internal", host.name)
    }

A similar example with advanced notification apply rule filters: the notification is applied if the service
attribute `notes` matches the `has gold support 24x7` string `AND` one of
two conditions holds: either the host custom attribute `customer` is set to `customer-xy`
`OR` the host custom attribute `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `internal`,
`OR` whose `priority` custom attribute is [less than](17-language-reference.md#expression-operators) `2`
`AND` whose host custom attribute `is_clustered` is set to `true`.

    template Notification "cust-xy-notification" {
      users = [ "noc-xy", "mgmt-xy" ]
      command = "mail-service-notification"
    }

    apply Notification "notify-cust-xy-mysql" to Service {
      import "cust-xy-notification"

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }

520 More advanced examples are covered [here](8-advanced-topics.md#use-functions-assign-where).
522 ### <a id="using-apply-services"></a> Apply Services to Hosts
524 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
525 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
527 The example for `ssh` applies a service object to all hosts with the `address`
528 attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.

    apply Service "ssh" {
      import "generic-service"

      check_command = "ssh"

      assign where host.address && host.vars.os == "Linux"
    }

539 Other detailed examples are used in their respective chapters, for example
540 [apply services with custom command arguments](3-monitoring-basics.md#command-passing-parameters).
### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services

Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

    apply Notification "mail-noc" to Service {
      import "mail-service-notification"

      user_groups = [ "noc" ]

      assign where host.vars.notification.mail
    }

In this example the `mail-noc` notification will be created as an object for all services whose host has the
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.

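
A host that would be matched by this rule could look like the following sketch. The host name, address and dictionary content are assumptions; only the presence of a non-empty `vars.notification.mail` matters for the `assign where` expression.

```
object Host "my-server1" {
  check_command = "hostalive"
  address = "192.0.2.10" // example address

  // A non-empty dictionary, so host.vars.notification.mail evaluates to true.
  vars.notification["mail"] = {
    groups = [ "noc" ]
  }
}
```
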
It is also possible to generally apply a notification template and dynamically overwrite values from
the template by checking for custom attributes. This can be achieved by using [conditional statements](17-language-reference.md#conditional-statements):

    apply Notification "host-mail-noc" to Host {
      import "mail-host-notification"

      // replace interval inherited from `mail-host-notification` template with new notification interval set by a host custom attribute
      if (host.vars.notification_interval) {
        interval = host.vars.notification_interval
      }

      // same with notification period
      if (host.vars.notification_period) {
        period = host.vars.notification_period
      }

      // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
      if (host.vars.notification_type == "sms") {
        command = "sms-host-notification"
      } else {
        command = "mail-host-notification"
      }

      user_groups = [ "noc" ]

      assign where host.address
    }

In the example above, the notification template `mail-host-notification`, which contains all relevant
notification settings, is applied to all host objects where `host.address` is defined.
Each host object is then checked for custom attributes (`host.vars.notification_interval`,
`host.vars.notification_period` and `host.vars.notification_type`). Depending on whether the custom
attribute is set and which value it has, the value from the notification template is dynamically
overwritten.

The corresponding host object could look like this:

    object Host "host1" {
      import "host-linux-prod"
      display_name = "host1"
      address = "192.168.1.50"
      vars.notification_interval = 1h
      vars.notification_period = "24x7"
      vars.notification_type = "sms"
    }

607 ### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
609 Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
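
As a brief sketch, a host dependency on a common parent could be applied like this. The rule name and the parent host name `dsl-router` are hypothetical.

```
apply Dependency "internet" to Host {
  parent_host_name = "dsl-router"

  disable_checks = true
  disable_notifications = true

  // Every host except the parent itself depends on the parent.
  assign where host.name != "dsl-router"
}
```
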
611 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
613 The sample configuration includes an example in [downtimes.conf](4-configuring-icinga-2.md#downtimes-conf).
615 Detailed examples can be found in the [recurring downtimes](8-advanced-topics.md#recurring-downtimes) chapter.
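
A minimal sketch of such a rule is shown below. The author, comment, time ranges and the `backup_downtime` custom attribute are illustrative assumptions.

```
apply ScheduledDowntime "backup-downtime" to Service {
  author = "icingaadmin"
  comment = "Scheduled downtime for backup"

  // Recurring time ranges per weekday.
  ranges = {
    monday = "02:00-03:00"
    tuesday = "02:00-03:00"
  }

  // Hypothetical custom attribute marking services with a backup window.
  assign where service.vars.backup_downtime
}
```
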
### <a id="using-apply-for"></a> Using Apply For Rules

In addition to standard [apply rules](3-monitoring-basics.md#using-apply),
you can generate objects from a set (array or
dictionary) using [apply for](17-language-reference.md#apply-for) expressions.

624 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
625 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
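
The pattern can be sketched as follows. The host name, address and vhost entries are assumptions for illustration; `http_uri` is assumed to match a parameter of the `http` check command.

```
object Host "web1" {
  check_command = "hostalive"
  address = "192.0.2.20" // example address

  vars.http_vhosts["http"] = {
    http_uri = "/"
  }
  vars.http_vhosts["status"] = {
    http_uri = "/status"
  }
}

// Generates one service per dictionary key, here "http" and "status".
apply Service for (http_vhost => config in host.vars.http_vhosts) {
  check_command = "http"

  // Copy the per-vhost settings into the service's custom attributes.
  vars += config
}
```
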
Take the following example: a host provides the SNMP OIDs for different service check
types:

    object Host "router-v6" {
      check_command = "hostalive"

      vars.oids["if01"] = "1.1.1.1.1"
      vars.oids["temp"] = "1.1.1.1.2"
      vars.oids["bgp"] = "1.1.1.1.5"
    }

Now we want to create service checks for `if01` and `temp`, but not `bgp`.
Furthermore we want to pass the SNMP OID stored as the dictionary value to the
custom attribute called `vars.snmp_oid`, which is the command argument required
by the [snmp](10-icinga-template-library.md#plugin-check-command-snmp) check command.
The service's `display_name` should be set to the identifier inside the dictionary.

    apply Service for (identifier => oid in host.vars.oids) {
      check_command = "snmp"
      display_name = identifier
      vars.snmp_oid = oid

      ignore where identifier == "bgp" //don't generate service for bgp checks
    }

Icinga 2 evaluates the `apply for` rule for all hosts with the custom attribute
`oids` set. It then iterates over all dictionary items inside the `for` loop and evaluates the
`assign/ignore where` expressions. You can access the loop variables
in these expressions, e.g. for ignoring certain values.
In this example we ignore the `bgp` identifier and avoid generating an unwanted service.
We could extend the configuration by also matching the `oid` value against regex/wildcard
patterns, for example.

> You don't need an `assign where` expression that only checks for the existence
> of the custom attribute.

666 That way you'll save duplicated apply rules by combining them into one
667 generic `apply for` rule generating the object name with or without a prefix.
#### <a id="using-apply-for-custom-attribute-override"></a> Apply For and Custom Attribute Override

Imagine a different, more advanced example: you are monitoring a network device (host)
with many interfaces (services). The following requirements apply:

* Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc.)
* Each interface has its own VLAN tag
* Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
dynamically generated

Tip: Define the SNMP community as a global constant in your [constants.conf](4-configuring-icinga-2.md#constants-conf) file.

    const IftrafficSnmpCommunity = "public"

By defining the `interfaces` dictionary with three example interfaces on the `cisco-catalyst-6509-34`
host object, you provide the [custom attribute](3-monitoring-basics.md#custom-attributes)
data that the for loop in the service apply rule iterates over.

    object Host "cisco-catalyst-6509-34" {
      import "generic-host"
      display_name = "Catalyst 6509 #34 VIE21"
      address = "127.0.1.4"

      /* "GigabitEthernet0/2" is the interface name,
       * and key name in service apply for later on
       */
      vars.interfaces["GigabitEthernet0/2"] = {
        /* define all custom attributes with the
         * same name required for command parameters/arguments
         * in service apply (look into your CheckCommand definition)
         */
        iftraffic_units = "g"
        iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }
      vars.interfaces["GigabitEthernet0/4"] = {
        iftraffic_units = "g"
        //iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }
      vars.interfaces["MgmtInterface1"] = {
        iftraffic_community = IftrafficSnmpCommunity
        interface_address = "127.99.0.100" #special management ip
      }
    }

You can also omit the `"if-"` string; all generated service names are then taken directly
from the `interface_name` variable value.

The `interface_config` dictionary contains all key-value pairs for the specific interface in one
loop cycle, like `iftraffic_units`, `vlan`, and `qos` for the specified interface.

You can map the custom attributes from the `interface_config` dictionary to
local custom attributes stashed into `vars`. If the names already match the required command
argument parameters (for example `iftraffic_units`), you can instead add the
`interface_config` dictionary to the `vars` dictionary using the `+=` operator.

After `vars` is fully populated, all object attributes can be calculated from the
provided host attributes. For strings, you can use string concatenation with the `+` operator.

You can also specify the `display_name`, check command, interval, `notes`, `notes_url`, `action_url`, etc.
attributes that way. Attribute strings can be [concatenated](17-language-reference.md#expression-operators),
for example for adding a more detailed service `display_name`.

This example also uses [if conditions](17-language-reference.md#conditional-statements)
to add a local default value if specific values are not set.
Conversely, you can override specific custom attributes inherited from a service template if they are set.

    /* loop over the host.vars.interfaces dictionary
     * for (key => value in dict) means `interface_name` as key
     * and `interface_config` as value. Access config attributes
     * with the indexer (`.`) character.
     */
    apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
      import "generic-service"
      check_command = "iftraffic"
      display_name = "IF-" + interface_name

      /* use the key as command argument (no duplication of values in host.vars.interfaces) */
      vars.iftraffic_interface = interface_name

      /* map the custom attributes as command arguments */
      vars.iftraffic_units = interface_config.iftraffic_units
      vars.iftraffic_community = interface_config.iftraffic_community

      /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
       * are the _exact_ same as required as command parameter by the check command
       */
      vars += interface_config

      /* set a default value for units and bandwidth */
      if (interface_config.iftraffic_units == "") {
        vars.iftraffic_units = "m"
      }
      if (interface_config.iftraffic_bandwidth == "") {
        vars.iftraffic_bandwidth = 1
      }
      if (interface_config.vlan == "") {
        vars.vlan = "not set"
      }
      if (interface_config.qos == "") {
        vars.qos = "not set" // default value assumed by analogy with `vlan`
      }

      /* set the global constant if the community is not explicitly
       * provided by the `interfaces` dictionary on the host
       */
      if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
        vars.iftraffic_community = IftrafficSnmpCommunity
      }

      /* Calculate some additional object attributes after populating the `vars` dictionary */
      notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with QoS '" + vars.qos + "'"
      notes_url = "http://foreman.company.com/hosts/" + host.name
      action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
    }

797 This example makes use of the [check_iftraffic](https://exchange.icinga.org/exchange/iftraffic) plugin.
798 The `CheckCommand` definition can be found in the
799 [contributed plugin check commands](10-icinga-template-library.md#plugin-contrib-command-iftraffic)
800 -- make sure to include them in your [icinga2 configuration file](4-configuring-icinga-2.md#icinga2-conf).
> Building configuration in such a dynamic way requires detailed knowledge
> of the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
> after successful [configuration validation](11-cli-commands.md#config-validation) to inspect them.
809 Verify that the apply-for-rule successfully created the service objects with the
810 inherited custom attributes:
813 # icinga2 object list --type Service --name *catalyst*
815 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
818 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
819 * iftraffic_bandwidth = 1
820 * iftraffic_community = "public"
821 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
822 * iftraffic_interface = "GigabitEthernet0/2"
823 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
824 * iftraffic_units = "g"
825 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
830 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
833 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
834 * iftraffic_bandwidth = 1
835 * iftraffic_community = "public"
836 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
837 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
838 * iftraffic_interface = "GigabitEthernet0/4"
839 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
840 * iftraffic_units = "g"
841 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
845 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
848 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
849 * iftraffic_bandwidth = 1
850 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
851 * iftraffic_community = "public"
852 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
853 * iftraffic_interface = "MgmtInterface1"
854 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
855 * iftraffic_units = "m"
856 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
857 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
858 * interface_address = "127.99.0.100"
860 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
864 ### <a id="using-apply-object-attributes"></a> Use Object Attributes in Apply Rules
866 Since apply rules are evaluated after the generic objects, you
867 can reference existing host and/or service object attributes as
868 values for any object attribute specified in that apply rule.
870 object Host "opennebula-host" {
871 import "generic-host"
874 vars.hosting["xyz"] = {
876 customer_name = "Customer xyz"
878 support_contract = "gold"
880 vars.hosting["abc"] = {
customer_name = "Customer abc"
884 support_contract = "silver"
888 apply Service for (customer => config in host.vars.hosting) {
889 import "generic-service"
890 check_command = "ping4"
892 vars.qos = "disabled"
vars.http_uri = "/" + customer + "/" + config.http_uri
898 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
900 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
902 notes_url = "http://foreman.company.com/hosts/" + host.name
903 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
906 ## <a id="groups"></a> Groups
908 A group is a collection of similar objects. Groups are primarily used as a
909 visualization aid in web interfaces.
Group membership is defined on the member object itself. For example, if
you want to assign specific hosts to a host group named `windows` so that
you can view the group on your alert dashboard later, first create a
`HostGroup` object:
916 object HostGroup "windows" {
917 display_name = "Windows Servers"
920 Then add your hosts to this group:
922 template Host "windows-server" {
923 groups += [ "windows" ]
926 object Host "mssql-srv1" {
927 import "windows-server"
929 vars.mssql_port = 1433
932 object Host "mssql-srv2" {
933 import "windows-server"
935 vars.mssql_port = 1433
938 This can be done for service and user groups the same way:
940 object UserGroup "windows-mssql-admins" {
941 display_name = "Windows MSSQL Admins"
944 template User "generic-windows-mssql-users" {
945 groups += [ "windows-mssql-admins" ]
948 object User "win-mssql-noc" {
949 import "generic-windows-mssql-users"
951 email = "noc@example.com"
954 object User "win-mssql-ops" {
955 import "generic-windows-mssql-users"
957 email = "ops@example.com"
960 ### <a id="group-assign-intro"></a> Group Membership Assign
962 Instead of manually assigning each object to a group you can also assign objects
963 to a group based on their attributes:
965 object HostGroup "prod-mssql" {
966 display_name = "Production MSSQL Servers"
968 assign where host.vars.mssql_port && host.vars.prod_mysql_db
969 ignore where host.vars.test_server == true
970 ignore where match("*internal", host.name)
In this example all hosts with the custom attributes `mssql_port` and
`prod_mysql_db` set will be added as members to the host group `prod-mssql`.
However, hosts whose name matches `*internal` or which have the `test_server`
attribute set to `true` are not added to this group.
978 Details on the `assign where` syntax can be found in the
979 [Language Reference](17-language-reference.md#apply).
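
Assign rules are available for service groups as well. The following is a minimal
sketch; the group name `http-checks` and the match pattern are assumptions made up
for illustration and are not part of the example configuration:

```
/* hypothetical group: collects all services whose name starts with "http" */
object ServiceGroup "http-checks" {
  display_name = "HTTP Checks"

  assign where match("http*", service.name)

  /* skip services on hosts flagged as test servers, as in the host group above */
  ignore where host.vars.test_server == true
}
```

Inside a `ServiceGroup` assign rule both the `service` and the `host` object can be referenced.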
981 ## <a id="notifications"></a> Notifications
Notifications for service and host problems are an integral part of your
monitoring setup.
No notifications are sent while a host or service is in a downtime, after a
problem has been acknowledged, or when the dependency logic has determined that
the host/service is unreachable. You can configure additional type and state
filters to further refine which notifications are actually sent.
991 There are many ways of sending notifications, e.g. by email, XMPP,
992 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
993 Instead it relies on external mechanisms such as shell scripts to notify users.
More notification methods are listed in the [addons and plugins](13-addons.md#notification-scripts-interfaces)
chapter.
A notification specification requires one or more users (and/or user groups)
who will be notified in case of problems. These users must define all custom
attributes which the `NotificationCommand` uses on execution.
The user `icingaadmin` in the example below will only get notified about the
`Problem` and `Recovery` notification types, and only for the `OK`, `WARNING` and
`CRITICAL` states.
1004 object User "icingaadmin" {
1005 display_name = "Icinga 2 Admin"
1006 enable_notifications = true
1007 states = [ OK, Warning, Critical ]
1008 types = [ Problem, Recovery ]
1009 email = "icinga@localhost"
1012 If you don't set the `states` and `types` configuration attributes for the `User`
1013 object, notifications for all states and types will be sent.
1015 Details on troubleshooting notification problems can be found [here](15-troubleshooting.md#troubleshooting).
1019 > Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled
1020 > in order to execute notification commands.
You should decide which information you (and your notified users) need in
case of an emergency, and which information does not provide any value to you and
your environment.
1026 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
You can add all shared attributes to a `Notification` template which is then imported
by the actual notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.
1032 template Notification "generic-notification" {
1035 command = "mail-service-notification"
1037 states = [ Warning, Critical, Unknown ]
1038 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1039 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
1044 The time period `24x7` is included as example configuration with Icinga 2.
1046 Use the `apply` keyword to create `Notification` objects for your services:
1048 apply Notification "notify-cust-xy-mysql" to Service {
1049 import "generic-notification"
1051 users = [ "noc-xy", "mgmt-xy" ]
assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
1054 ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
1058 Instead of assigning users to notifications, you can also add the `user_groups`
1059 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1060 send notifications to all group members.
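
As a sketch of the `user_groups` attribute: the notification name `notify-mssql-admins`
is hypothetical, while the user group `windows-mssql-admins` is the one defined in the
[Groups](3-monitoring-basics.md#groups) section.

```
/* sketch only: notifies every member of an existing user group */
apply Notification "notify-mssql-admins" to Service {
  import "generic-notification"

  /* group members are resolved by Icinga 2, no `users` list needed */
  user_groups = [ "windows-mssql-admins" ]

  assign where host.vars.mssql_port
}
```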
1064 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1065 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1067 ### <a id="notification-escalations"></a> Notification Escalations
If a problem still exists when the re-notification is due, you may want to
escalate it to the next support level. Another approach is to send the default
notification by email and escalate the problem via SMS if it has not been
solved by then.
1074 You can define notification start and end times as additional configuration
1075 attributes making the `Notification` object a so-called `notification escalation`.
1076 Using templates you can share the basic notification attributes such as users or the
1077 `interval` (and override them for the escalation then).
Using the example from above, you can define additional users who are notified by SMS
between the escalation's start and end time.
1082 object User "icinga-oncall-2nd-level" {
1083 display_name = "Icinga 2nd Level"
1085 vars.mobile = "+1 555 424642"
1088 object User "icinga-oncall-1st-level" {
1089 display_name = "Icinga 1st Level"
1091 vars.mobile = "+1 555 424642"
1094 Define an additional [NotificationCommand](3-monitoring-basics.md#notification-commands) for SMS notifications.
1098 > The example is not complete as there are many different SMS providers.
1099 > Please note that sending SMS notifications will require an SMS provider
1100 > or local hardware with an active SIM card.
1102 object NotificationCommand "sms-notification" {
1104 PluginDir + "/send_sms_notification",
1109 The two new notification escalations are added onto the local host
1110 and its service `ping4` using the `generic-notification` template.
1111 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1112 command) after `30m` until `1h`.
> The `interval` was set to 15m in the `generic-notification`
> template example. Lower that value in your escalations by using a secondary
> template or by overriding the attribute directly in the
> `escalation-sms-2nd-level` notification.
If the problem is neither resolved nor acknowledged (which would prevent further
notifications), the user `icinga-oncall-1st-level` will be notified through the
`escalation-sms-1st-level` notification `1h` after the initial problem notification,
but only for one hour (`2h` as `end` key in the `times` dictionary).
1125 apply Notification "mail" to Service {
1126 import "generic-notification"
1128 command = "mail-notification"
1129 users = [ "icingaadmin" ]
1131 assign where service.name == "ping4"
1134 apply Notification "escalation-sms-2nd-level" to Service {
1135 import "generic-notification"
1137 command = "sms-notification"
1138 users = [ "icinga-oncall-2nd-level" ]
1145 assign where service.name == "ping4"
1148 apply Notification "escalation-sms-1st-level" to Service {
1149 import "generic-notification"
1151 command = "sms-notification"
1152 users = [ "icinga-oncall-1st-level" ]
1159 assign where service.name == "ping4"
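
The `times` dictionary is what turns a notification into an escalation. As a sketch,
the 2nd-level escalation with its window written out could look like this, assuming
the `30m` to `1h` window described above:

```
apply Notification "escalation-sms-2nd-level" to Service {
  import "generic-notification"

  command = "sms-notification"
  users = [ "icinga-oncall-2nd-level" ]

  /* escalation window: start after 30 minutes, stop after 1 hour */
  times = {
    begin = 30m
    end = 1h
  }

  assign where service.name == "ping4"
}
```

The 1st-level escalation would use `begin = 1h` and `end = 2h` in the same way.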
1162 ### <a id="notification-delay"></a> Notification Delay
1164 Sometimes the problem in question should not be announced when the notification is due
1165 (the object reaching the `HARD` state), but after a certain period. In Icinga 2
1166 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
1167 postpone the notification window for 15 minutes. Leave out the `end` key -- if not set,
1168 Icinga 2 will not check against any end time for this notification. Make sure to
1169 specify a relatively low notification `interval` to get notified soon enough again.
1171 apply Notification "mail" to Service {
1172 import "generic-notification"
1174 command = "mail-notification"
1175 users = [ "icingaadmin" ]
1179 times.begin = 15m // delay notification window
1181 assign where service.name == "ping4"
1184 ### <a id="disable-renotification"></a> Disable Re-notifications
1186 If you prefer to be notified only once, you can disable re-notifications by setting the
1187 `interval` attribute to `0`.
1189 apply Notification "notify-once" to Service {
1190 import "generic-notification"
1192 command = "mail-notification"
1193 users = [ "icingaadmin" ]
1195 interval = 0 // disable re-notification
1197 assign where service.name == "ping4"
1200 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
1202 If there are no notification state and type filter attributes defined at the `Notification`
1203 or `User` object, Icinga 2 assumes that all states and types are being notified.
1205 Available state and type filters for notifications are:
1207 template Notification "generic-notification" {
1209 states = [ Warning, Critical, Unknown ]
1210 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1211 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state filters to allow finer-grained filtering, for example on downtimes and flapping.
You can filter for acknowledgements and custom notifications too.
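
The same filters can narrow down what a single user receives. A minimal sketch,
assuming a hypothetical user `change-manager` who only cares about downtime and
flapping events:

```
/* hypothetical user: only downtime and flapping notification types pass the filter */
object User "change-manager" {
  display_name = "Change Manager"
  email = "change-manager@example.com"

  types = [ DowntimeStart, DowntimeEnd, DowntimeRemoved,
            FlappingStart, FlappingEnd ]
}
```

No `states` filter is set for this user, so no state is filtered out on the user level.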
1219 ## <a id="commands"></a> Commands
1221 Icinga 2 uses three different command object types to specify how
1222 checks should be performed, notifications should be sent, and
1223 events should be handled.
1225 ### <a id="check-commands"></a> Check Commands
[CheckCommand](9-object-types.md#objecttype-checkcommand) objects define the command line with which
a check is called.
1230 [CheckCommand](9-object-types.md#objecttype-checkcommand) objects are referenced by
1231 [Host](9-object-types.md#objecttype-host) and [Service](9-object-types.md#objecttype-service) objects
1232 using the `check_command` attribute.
> Make sure that the [checker](11-cli-commands.md#enable-features) feature is enabled in order to
> execute checks.
1239 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1241 [CheckCommand](9-object-types.md#objecttype-checkcommand) objects require the [ITL template](10-icinga-template-library.md#itl-plugin-check-command)
1242 `plugin-check-command` to support native plugin based check methods.
1244 Unless you have done so already, download your check plugin and put it
1245 into the [PluginDir](4-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1246 `check_mysql` plugin contained in the Monitoring Plugins package.
The plugin path and all command arguments are passed as a list of
double-quoted strings to ensure proper shell escaping.

Call the `check_mysql` plugin with the `--help` parameter to see
all available options. The usage output lists the parameters, such as the
host (`-H`), user (`-u`) and password (`-p`), which the `CheckCommand`
definition needs to map to custom attributes.
1256 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1258 This program tests connections to a MySQL server
1261 check_mysql [-d database] [-H host] [-P port] [-s socket]
1262 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1263 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
The next step is to understand how [command parameters](3-monitoring-basics.md#command-passing-parameters)
are passed from a host or service object, and to add a [CheckCommand](9-object-types.md#objecttype-checkcommand)
definition based on these required parameters and/or default values.
1269 Please continue reading in the [plugins section](5-service-monitoring.md#service-monitoring-plugins) for additional integration examples.
1271 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1273 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1274 by the executed check command.
1276 The check command parameters for ITL provided plugin check command definitions are documented
1277 [here](10-icinga-template-library.md#plugin-check-commands), for example
1278 [disk](10-icinga-template-library.md#plugin-check-command-disk).
1280 In order to practice passing command parameters you should [integrate your own plugin](3-monitoring-basics.md#command-plugin-integration).
1282 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](2-getting-started.md#setting-up-check-plugins).
Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
(the naming schema is freely definable), and optionally their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
on the command line.
1291 > Use a common command type as prefix for your command arguments to increase
1292 > readability. `mysql_user` helps understanding the context better than just
1293 > `user` as argument.
1295 The default custom attributes can be overridden by the custom attributes
1296 defined in the host or service using the check command `my-mysql`. The custom attributes
1297 can also be inherited from a parent template using additive inheritance (`+=`).
1299 # vim /etc/icinga2/conf.d/commands.conf
1301 object CheckCommand "my-mysql" {
1302 import "plugin-check-command"
1304 command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir
1307 "-H" = "$mysql_host$"
1310 value = "$mysql_user$"
1312 "-p" = "$mysql_password$"
1313 "-P" = "$mysql_port$"
1314 "-s" = "$mysql_socket$"
1315 "-a" = "$mysql_cert$"
1316 "-d" = "$mysql_database$"
1317 "-k" = "$mysql_key$"
1318 "-C" = "$mysql_ca_cert$"
1319 "-D" = "$mysql_ca_dir$"
1320 "-L" = "$mysql_ciphers$"
1321 "-f" = "$mysql_optfile$"
1322 "-g" = "$mysql_group$"
1324 set_if = "$mysql_check_slave$"
1325 description = "Check if the slave thread is running properly."
1328 set_if = "$mysql_ssl$"
1329 description = "Use ssl encryption"
1333 vars.mysql_check_slave = false
1334 vars.mysql_ssl = false
1335 vars.mysql_host = "$address$"
The check command definition also sets `mysql_host` to the `$address$` default value. You can override
this command parameter if, for example, your MySQL server is not reachable at the host's own IP address.

Make sure to pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
`MysqlUsername` and `MysqlPassword` are specified as [global constants](4-configuring-icinga-2.md#constants-conf)
in this example.
1345 # vim /etc/icinga2/conf.d/services.conf
1347 apply Service "mysql-icinga-db-health" {
1348 import "generic-service"
1350 check_command = "my-mysql"
1352 vars.mysql_user = MysqlUsername
1353 vars.mysql_password = MysqlPassword
1355 vars.mysql_database = "icinga"
1356 vars.mysql_host = "192.168.33.11"
1358 assign where match("icinga2*", host.name)
1359 ignore where host.vars.no_health_check == true
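
The additive inheritance (`+=`) mentioned earlier can be sketched as follows. The
template name `mysql-service`, the service name and the port values are assumptions
made up for illustration. The host `my-server1` is borrowed from the first example
in this chapter.

```
/* hypothetical template carrying shared MySQL settings */
template Service "mysql-service" {
  vars.mysql_port = 3306
  vars.mysql_database = "icinga"
}

object Service "mysql-replica-health" {
  import "generic-service"
  import "mysql-service"

  host_name = "my-server1"
  check_command = "my-mysql"

  /* merge into the inherited `vars` dictionary instead of replacing it */
  vars += {
    mysql_port = 3307
    mysql_check_slave = true
  }
}
```

The resulting service keeps the inherited `mysql_database` and overrides `mysql_port`.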
1363 Take a different example: The example host configuration in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
1364 also applies an `ssh` service check. Your host's ssh port is not the default `22`, but set to `2022`.
1365 You can pass the command parameter as custom attribute `ssh_port` directly inside the service apply rule
1366 inside [services.conf](4-configuring-icinga-2.md#services-conf):
1368 apply Service "ssh" {
1369 import "generic-service"
1371 check_command = "ssh"
1372 vars.ssh_port = 2022 //custom command parameter
1374 assign where (host.address || host.address6) && host.vars.os == "Linux"
1377 If you prefer this being configured at the host instead of the service, modify the host configuration
1378 object instead. The runtime macro resolving order is described [here](3-monitoring-basics.md#macro-evaluation-order).
1380 object Host NodeName {
1382 vars.ssh_port = 2022
1385 #### <a id="command-passing-parameters-apply-for"></a> Passing Check Command Parameters Using Apply For
The host `my-server` with the services generated from the `basic-partitions` dictionary entry (see
[apply for](3-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
with modified custom attributes (warning threshold at `10%`, critical threshold at `5%`
free disk space).

The custom attribute `disk_partitions` can either hold a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.
1395 object Host "my-server" {
1396 import "generic-host"
1397 address = "127.0.0.1"
1400 vars.local_disks["basic-partitions"] = {
1401 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1405 apply Service for (disk => config in host.vars.local_disks) {
1406 import "generic-service"
1407 check_command = "my-disk"
1411 vars.disk_wfree = "10%"
1412 vars.disk_cfree = "5%"
1416 More details on using arrays in custom attributes can be found in
1417 [this chapter](3-monitoring-basics.md#custom-attributes).
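
To illustrate the two forms, here is a minimal sketch with hypothetical service names.
It assumes that the `my-disk` check command used above maps `disk_partitions` to the
plugin's partition argument:

```
/* single partition: the custom attribute holds a plain string */
object Service "disk-root" {
  import "generic-service"
  host_name = "my-server"
  check_command = "my-disk"

  vars.disk_partitions = "/"
}

/* several partitions: the custom attribute holds an array of strings */
object Service "disk-data" {
  import "generic-service"
  host_name = "my-server"
  check_command = "my-disk"

  vars.disk_partitions = [ "/var", "/home" ]
}
```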
1420 #### <a id="command-arguments"></a> Command Arguments
When you define a check command line using the `command` attribute, Icinga 2
resolves all macros in the static string or array. Sometimes you need to
extend the argument list based on a condition evaluated at command execution
time, or to make arguments optional so that they are only set if the
macro value can be resolved by Icinga 2.
1428 object CheckCommand "check_http" {
1429 import "plugin-check-command"
1431 command = [ PluginDir + "/check_http" ]
1434 "-H" = "$http_vhost$"
1435 "-I" = "$http_address$"
1437 "-p" = "$http_port$"
1439 set_if = "$http_ssl$"
1442 set_if = "$http_sni$"
1445 value = "$http_auth_pair$"
1446 description = "Username:password on sites with basic authentication"
1449 set_if = "$http_ignore_body$"
1451 "-r" = "$http_expect_body_regex$"
1452 "-w" = "$http_warn_time$"
1453 "-c" = "$http_critical_time$"
1454 "-e" = "$http_expect$"
1457 vars.http_address = "$address$"
1458 vars.http_ssl = false
1459 vars.http_sni = false
1462 The example shows the `check_http` check command defining the most common
1463 arguments. Each of them is optional by default and will be omitted if
1464 the value is not set. For example, if the service calling the check command
does not have `vars.http_port` set, the `-p` argument won't get added to the command
line.
If the `vars.http_ssl` custom attribute is set in the service, host or command
object definition, Icinga 2 will add the `-S` argument to the command line
depending on the numeric value the `set_if` macro resolves to. String values
are not supported.

If the macro value cannot be resolved, Icinga 2 will not add the defined argument
to the final command argument array. Empty strings for macro values won't omit
the argument.

That way you can use the `check_http` command definition for checks both with and
without SSL enabled, saving you duplicated command definitions.
1479 Details on all available options can be found in the
1480 [CheckCommand object definition](9-object-types.md#objecttype-checkcommand).
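
As a usage sketch for the `check_http` command definition above: the service name,
the virtual host name and the host `my-server1` are assumptions for illustration only.

```
object Service "https-shop" {
  import "generic-service"
  host_name = "my-server1"
  check_command = "check_http"

  vars.http_vhost = "shop.example.com"  // resolves, so -H is added
  vars.http_ssl = true                  // set_if condition is met, so -S is added

  /* vars.http_port is not set, so -p is omitted */
}
```

With this service, `-H` and `-S` are added to the command line, `-I` is filled from the
command's default `vars.http_address`, and `-p` is left out.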
1483 #### <a id="command-environment-variables"></a> Environment Variables
1485 The `env` command object attribute specifies a list of environment variables with values calculated
1486 from either runtime macros or custom attributes which should be exported as environment variables
1487 prior to executing the command.
This is useful, for example, to keep sensitive information such as the credentials
for database checks out of the command line:
1492 object CheckCommand "mysql-health" {
1493 import "plugin-check-command"
1496 PluginDir + "/check_mysql"
1500 "-H" = "$mysql_address$"
1501 "-d" = "$mysql_database$"
1504 vars.mysql_address = "$address$"
1505 vars.mysql_database = "icinga"
1506 vars.mysql_user = "icinga_check"
1507 vars.mysql_pass = "password"
1509 env.MYSQLUSER = "$mysql_user$"
1510 env.MYSQLPASS = "$mysql_pass$"
1515 ### <a id="notification-commands"></a> Notification Commands
1517 [NotificationCommand](9-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
1518 interfaces (email, XMPP, IRC, Twitter, etc.).
1520 [NotificationCommand](9-object-types.md#objecttype-notificationcommand) objects are referenced by
1521 [Notification](9-object-types.md#objecttype-notification) objects using the `command` attribute.
1523 `NotificationCommand` objects require the [ITL template](10-icinga-template-library.md#itl-plugin-notification-command)
1524 `plugin-notification-command` to support native plugin-based notifications.
1528 > Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled
1529 > in order to execute notification commands.
1531 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1532 the current check output) sending an email to the user(s) associated with the
1533 notification itself (`$user.email$`).
1535 If you want to specify default values for some of the custom attribute definitions,
1536 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1538 object NotificationCommand "mail-service-notification" {
1539 import "plugin-notification-command"
1541 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1544 NOTIFICATIONTYPE = "$notification.type$"
1545 SERVICEDESC = "$service.name$"
1546 HOSTALIAS = "$host.display_name$"
1547 HOSTADDRESS = "$address$"
1548 SERVICESTATE = "$service.state$"
1549 LONGDATETIME = "$icinga.long_date_time$"
1550 SERVICEOUTPUT = "$service.output$"
1551 NOTIFICATIONAUTHORNAME = "$notification.author$"
1552 NOTIFICATIONCOMMENT = "$notification.comment$"
1553 HOSTDISPLAYNAME = "$host.display_name$"
1554 SERVICEDISPLAYNAME = "$service.display_name$"
1555 USEREMAIL = "$user.email$"
The `command` attribute in the `mail-service-notification` command refers to the following
shell script. The macros specified in the `env` dictionary are exported
as environment variables and can be used in the notification script:
1564 template=$(cat <<TEMPLATE
1567 Notification Type: $NOTIFICATIONTYPE
1569 Service: $SERVICEDESC
1571 Address: $HOSTADDRESS
1572 State: $SERVICESTATE
1574 Date/Time: $LONGDATETIME
1576 Additional Info: $SERVICEOUTPUT
1578 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
/usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" "$USEREMAIL"
1586 > This example is for `exim` only. Requires changes for `sendmail` and
1589 While it's possible to specify the entire notification command right
1590 in the NotificationCommand object it is generally advisable to create a
1591 shell script in the `/etc/icinga2/scripts` directory and have the
1592 NotificationCommand object refer to that.
1594 ### <a id="event-commands"></a> Event Commands
1596 Unlike notifications, event commands for hosts/services are called on every
1597 check execution if one of these conditions matches:
1599 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1600 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1601 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1603 [EventCommand](9-object-types.md#objecttype-eventcommand) objects are referenced by
1604 [Host](9-object-types.md#objecttype-host) and [Service](9-object-types.md#objecttype-service) objects
1605 using the `event_command` attribute.
Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
available through runtime macros. Runtime macros such as `$service.state_type$`
and `$service.state$` are resolved by Icinga 2 and allow you to control in a
fine-grained way which events get triggered.
If you are using a client as [command endpoint](6-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
the event command will be executed on the client itself (similar to the check
command).
1617 Common use case scenarios are a failing HTTP check requiring an immediate
1618 restart via event command, or if an application is locked and requires
1619 a restart upon detection.
`EventCommand` objects require the ITL template `plugin-event-command`
to support native plugin-based event commands.
1624 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
1626 The following example will trigger a restart of the `httpd` daemon
1627 via ssh when the `http` service check fails. If the service state is
1628 `OK`, it will not trigger any event action.
1633 * icinga user with public key authentication
1634 * icinga user with sudo permissions for restarting the httpd daemon.
1638 # ls /home/icinga/.ssh/
1642 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1645 Define a generic [EventCommand](9-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1646 which can be used for all event commands triggered using ssh:
1648 /* pass event commands through ssh */
1649 object EventCommand "event_by_ssh" {
1650 import "plugin-event-command"
1652 command = [ PluginDir + "/check_by_ssh" ]
1655 "-H" = "$event_by_ssh_address$"
1656 "-p" = "$event_by_ssh_port$"
1657 "-C" = "$event_by_ssh_command$"
1658 "-l" = "$event_by_ssh_logname$"
1659 "-i" = "$event_by_ssh_identity$"
1661 set_if = "$event_by_ssh_quiet$"
1663 "-w" = "$event_by_ssh_warn$"
1664 "-c" = "$event_by_ssh_crit$"
1665 "-t" = "$event_by_ssh_timeout$"
1668 vars.event_by_ssh_address = "$address$"
1669 vars.event_by_ssh_quiet = false
1672 The actual event command only passes the `event_by_ssh_command` attribute.
1673 The `event_by_ssh_service` custom attribute takes care of passing the correct
1674 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
1675 is only restarted when the service is not in an `OK` state.
1678 object EventCommand "event_by_ssh_restart_service" {
1679 import "event_by_ssh"
1681 //only restart the daemon if state > 0 (not-ok)
1682 //requires sudo permissions for the icinga user
1683 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1687 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1688 which service should be restarted using the `event_by_ssh_service` attribute.
1690 object Service "http" {
1691 import "generic-service"
1692 host_name = "remote-http-host"
1693 check_command = "http"
1695 event_command = "event_by_ssh_restart_service"
1696 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1698 //vars.event_by_ssh_logname = "icinga"
//vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"
Each host with this service must then define the `httpd_name` custom attribute
(for example generated from your CMDB):

    object Host "remote-http-host" {
      import "generic-host"
      address = "192.168.1.100"

      vars.httpd_name = "apache2"
    }
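
If several hosts define `httpd_name`, the service does not have to be written out once per host. The following is a minimal sketch, assuming the `generic-service` template and the `event_by_ssh_restart_service` command shown above; the `assign where` expression is an illustrative choice and not part of the original example.

```
apply Service "http" {
  import "generic-service"
  check_command = "http"

  event_command = "event_by_ssh_restart_service"
  vars.event_by_ssh_service = "$host.vars.httpd_name$"

  // only hosts that define the custom attribute get this service
  assign where host.vars.httpd_name
}
```

Use such a rule instead of the `object Service "http"` definition, not in addition to it, so that the service is not defined twice for `remote-http-host`.
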
You can test-drive this example by manually stopping the web server daemon
(`apache2` in this example) on your `remote-http-host`. Enable the `debuglog`
feature and tail the `/var/log/icinga2/debug.log` file.
Remote Host Terminal:

    # date; service apache2 status
    Mon Sep 15 18:57:39 CEST 2014
    Apache2 is running (pid 23651).
    # date; service apache2 stop
    Mon Sep 15 18:57:47 CEST 2014
    [ ok ] Stopping web server: apache2 ... waiting .

Icinga 2 Host Terminal:

    [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
    [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
    [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
    [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
    [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
    [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0

Remote Host Terminal:

    # date; service apache2 status
    Mon Sep 15 18:58:44 CEST 2014
    Apache2 is running (pid 24908).
1742 ## <a id="dependencies"></a> Dependencies
Icinga 2 uses host and service [Dependency](9-object-types.md#objecttype-dependency) objects
to determine network reachability.

A service can depend on a host, and vice versa. A service has an implicit
dependency (parent) on its host. A host-to-host dependency implicitly acts
as a host parent relation.
When dependencies are calculated, not only the immediate parent is taken into
account: all parents further up the chain are inherited as well.
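
To make the inheritance concrete, here is a minimal sketch of a two-level chain. All object and host names are assumptions for illustration, and the three hosts are assumed to be defined elsewhere.

```
// hypothetical chain: my-server1 -> access-switch -> core-router
object Dependency "switch-behind-router" {
  parent_host_name = "core-router"
  child_host_name = "access-switch"
}

object Dependency "server-behind-switch" {
  parent_host_name = "access-switch"
  child_host_name = "my-server1"
}
```

With both objects in place, a failure of `core-router` is expected to render `my-server1` unreachable as well, because the parent of its parent is taken into account.
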
Both `parent_host_name` and `parent_service_name` are mandatory for dependencies
whose parent is a service; only `parent_host_name` is required when the parent is a host.
[Apply rules](3-monitoring-basics.md#using-apply) allow you to
[determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
dynamic fashion if required.

    parent_host_name = "core-router"
    parent_service_name = "uplink-port"
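
Put together, a complete service dependency could look like the following sketch. The object name and the child `my-server1` / `ping4` are assumptions for illustration; only the two parent attributes come from the fragment above.

```
object Dependency "uplink-reachability" {
  // parent: the service the child relies on
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  // child: hypothetical service that becomes unreachable
  // when the parent fails
  child_host_name = "my-server1"
  child_service_name = "ping4"
}
```
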
Notifications are suppressed by default if a host or service becomes unreachable.
You can change that behaviour by setting the `disable_notifications` attribute.

    disable_notifications = false
1767 If the dependency should be triggered in the parent object's soft state, you
1768 need to set `ignore_soft_states` to `false`.
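
As a sketch, a dependency that already reacts to the parent's soft states might look as follows. The rule name and the `assign where` expression are assumptions; `core-router` is reused from the fragment above.

```
apply Dependency "uplink-soft" to Host {
  parent_host_name = "core-router"

  // evaluate the parent's soft states too,
  // instead of waiting for a hard state change
  ignore_soft_states = false

  assign where host.name != "core-router"
}
```
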
The dependency state filter must be defined based on the parent object being
either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).

The following example will make the dependency fail and trigger it if the parent
object is **not** in one of these states:

    states = [ OK, Critical, Unknown ]
1778 Rephrased: If the parent service object changes into the `Warning` state, this
1779 dependency will fail and render all child objects (hosts or services) unreachable.
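
For a host parent the filter uses host states instead. A minimal sketch, with a hypothetical rule name and the `core-router` parent reused from above:

```
apply Dependency "router-up" to Service {
  // no parent_service_name: the parent is the host itself
  parent_host_name = "core-router"

  // the dependency fails as soon as the parent host is not Up
  states = [ Up ]

  assign where host.name != "core-router"
}
```
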
You can determine the child's reachability by querying the `is_reachable` attribute,
for example in [DB IDO](23-appendix.md#schema-db-ido-extensions).
1784 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
Icinga 2 automatically adds an implicit dependency for services on their host. That way
service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
`states = [ Up ]` for all service objects.
Service checks are still executed. If you want to prevent them from happening, you can
apply the following dependency to all services, which uses their host as the parent
and disables the checks. `assign where true` matches all `Service` objects.

    apply Dependency "disable-host-service-checks" to Service {
      disable_checks = true
      assign where true
    }
1800 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
A common scenario is the Icinga 2 server behind a router. Checking internet
access by pinging the Google DNS server `google-dns` is a common method, but
will fail in case the `dsl-router` host is down. Therefore the example below
defines a host dependency which implicitly acts as a parent relation too.

Furthermore the host may be reachable but ping probes are dropped by the
router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
further checks for the `ping4` service on host `google-dns` should
be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.

    object Host "dsl-router" {
      import "generic-host"
      address = "192.168.1.1"
    }

    object Host "google-dns" {
      import "generic-host"
      address = "8.8.8.8"
    }

    apply Service "ping4" {
      import "generic-service"

      check_command = "ping4"

      assign where host.address
    }

    apply Dependency "internet" to Host {
      parent_host_name = "dsl-router"
      disable_checks = true
      disable_notifications = true

      assign where host.name != "dsl-router"
    }

    apply Dependency "internet" to Service {
      parent_host_name = "dsl-router"
      parent_service_name = "ping4"
      disable_checks = true

      assign where host.name != "dsl-router"
    }
1846 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
child attributes, e.g. `parent_host_name`, to other objects' attributes.

A common example is virtual machines hosted on a master. The object
name of that master is auto-generated from your CMDB or VMware inventory
into the host's custom attributes (or a generic template for your cloud).

Define your master host object:

    object Host "master.example.com" {
      import "generic-host"
    }
Add a generic template defining all common host attributes:

    /* generic template for your virtual machines */
    template Host "generic-vm" {
      import "generic-host"
    }
Add a template for all hosts on your example.com cloud which imports `generic-vm`
and sets the custom attribute `vm_parent` to `master.example.com`:

    template Host "generic-vm-example.com" {
      import "generic-vm"
      vars.vm_parent = "master.example.com"
    }
Define your guest hosts, importing the `generic-vm-example.com` template:

    object Host "www.example1.com" {
      import "generic-vm-example.com"
    }

    object Host "www.example2.com" {
      import "generic-vm-example.com"
    }
Apply the host dependency to all child hosts importing the
`generic-vm` template and set the `parent_host_name`
to the previously defined custom attribute `host.vars.vm_parent`.

    apply Dependency "vm-host-to-parent-master" to Host {
      parent_host_name = host.vars.vm_parent
      assign where "generic-vm" in host.templates
    }
You can extend this example and make your services depend on the
`master.example.com` host too. Their local scope allows you to use
`host.vars.vm_parent` in the same way as in the example above.

    apply Dependency "vm-service-to-parent-master" to Service {
      parent_host_name = host.vars.vm_parent
      assign where "generic-vm" in host.templates
    }
That way you don't need to wait for your guest hosts to become
unreachable when the master host goes down. Instead the services
will detect their reachability immediately when executing checks.
1913 > This method with setting locally scoped variables only works in
1914 > apply rules, but not in object definitions.
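
To illustrate the note: in a plain object definition there is no `host` variable in scope, so the parent has to be spelled out literally. A sketch using the host names from this section (the object name is made up for illustration):

```
object Dependency "vm-www-example1-to-master" {
  // literal names are required here; host.vars.vm_parent
  // is only available inside apply rules
  parent_host_name = "master.example.com"
  child_host_name = "www.example1.com"
}
```
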
1917 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
Another classic example is agent-based checks. You would define a health check
for the agent daemon responding to your requests, and make all other services
querying that daemon depend on that health check.
The following configuration defines two NRPE-based service checks, `nrpe-load`
and `nrpe-disk`, applied to the `nrpe-server` host. The health check is defined as
the `nrpe-health` service.

    apply Service "nrpe-health" {
      import "generic-service"
      check_command = "nrpe"
      assign where match("nrpe-*", host.name)
    }

    apply Service "nrpe-load" {
      import "generic-service"
      check_command = "nrpe"
      vars.nrpe_command = "check_load"
      assign where match("nrpe-*", host.name)
    }

    apply Service "nrpe-disk" {
      import "generic-service"
      check_command = "nrpe"
      vars.nrpe_command = "check_disk"
      assign where match("nrpe-*", host.name)
    }

    object Host "nrpe-server" {
      import "generic-host"
      address = "192.168.1.5"
    }

    apply Dependency "disable-nrpe-checks" to Service {
      parent_service_name = "nrpe-health"

      disable_checks = true
      disable_notifications = true
      assign where service.check_command == "nrpe"
      ignore where service.name == "nrpe-health"
    }
The `disable-nrpe-checks` dependency is applied to all services
on the `nrpe-server` host that use the `nrpe` check command,
except for the `nrpe-health` service itself.