1 # <a id="monitoring-basics"></a> Monitoring Basics
3 This part of the Icinga 2 documentation provides an overview of all the basic
4 monitoring concepts you need to know to run Icinga 2.
Keep in mind that these examples assume a Linux server. If you are
using Windows, you will need to change the services accordingly. See the [ITL reference](7-icinga-template-library.md#windows-plugins)
for further information.
9 ## <a id="hosts-services"></a> Hosts and Services
11 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
12 and services can be virtually anything which can be checked in some way:
14 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
18 * Other local or network-accessible services
20 Host objects provide a mechanism to group services that are running
21 on the same physical device.
Here is an example of a host object which defines two child services:

    object Host "my-server1" {
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http"
    }

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

46 The `address` attribute is used by check commands to determine which network
47 address is associated with the host object.
49 Details on troubleshooting check problems can be found [here](16-troubleshooting.md#troubleshooting).
51 ### <a id="host-states"></a> Host States
Hosts can be in any of the following states:

Name        | Description
------------|--------------
UP          | The host is available.
DOWN        | The host is unavailable.

60 ### <a id="service-states"></a> Service States
Services can be in any of the following states:

Name        | Description
------------|--------------
OK          | The service is working properly.
WARNING     | The service is experiencing some problems but is still considered to be in working condition.
CRITICAL    | The service is in a critical state.
UNKNOWN     | The check could not determine the service's state.

71 ### <a id="hard-soft-states"></a> Hard and Soft States
When Icinga detects a problem with a host or service, it re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

If the object is still in a non-OK state after all re-checks have been executed,
the host/service switches to a `HARD` state and notifications are sent.

Name        | Description
------------|--------------
HARD        | The host/service's state hasn't recently changed.
SOFT        | The host/service has recently changed state and is being re-checked.

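
As a minimal sketch of how these settings interact, the following service is re-checked every minute while it is in a `SOFT` state and switches to a `HARD` state with the third consecutive non-OK result. The host name `retry-example` and all interval values are hypothetical:

```icinga2
object Service "http" {
  host_name = "retry-example"   // hypothetical host, assumed to exist elsewhere in your configuration
  check_command = "http"

  check_interval = 5m           // regular check interval
  retry_interval = 1m           // interval used for re-checks while in a SOFT state
  max_check_attempts = 3        // the third consecutive non-OK result triggers the HARD state
}
```

Lower `retry_interval` values shorten the time until a notification is sent, at the price of more check executions during an outage.
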
86 ### <a id="host-service-checks"></a> Host and Service Checks
Hosts and services determine their state by running checks at regular intervals.

    object Host "router" {
      check_command = "hostalive"
    }

95 The `hostalive` command is one of several built-in check commands. It sends ICMP
96 echo requests to the IP address specified in the `address` attribute to determine
97 whether a host is online.
99 A number of other [built-in check commands](7-icinga-template-library.md#plugin-check-commands) are also
100 available. In addition to these commands the next few chapters will explain in
101 detail how to set up your own check commands.
104 ## <a id="object-inheritance-using-templates"></a> Templates
Templates may be used to apply a set of identical attributes to more than one
object:

    template Service "generic-service" {
      max_check_attempts = 3
      enable_perfdata = true
    }

    apply Service "ping4" {
      import "generic-service"

      check_command = "ping4"

      assign where host.address
    }

    apply Service "ping6" {
      import "generic-service"

      check_command = "ping6"

      assign where host.address6
    }

133 In this example the `ping4` and `ping6` services inherit properties from the
134 template `generic-service`.
Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object if necessary.

You can also import existing non-template objects. Note that templates
and objects share the same namespace, i.e. you can't define a template
that has the same name as an object.

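
As a minimal sketch of overriding an inherited attribute, the following apply rule imports `generic-service` and replaces its `max_check_attempts` value. The service name `ping4-slow-link` and the custom attribute `slow_link` are hypothetical and only serve as illustration:

```icinga2
apply Service "ping4-slow-link" {
  import "generic-service"

  check_command = "ping4"

  // Hypothetical override: replaces the value 3 inherited from "generic-service"
  max_check_attempts = 5

  // `slow_link` is an assumed custom attribute, not part of the sample configuration
  assign where host.address && host.vars.slow_link == true
}
```

Attributes set after the `import` statement take precedence because the template is evaluated first.
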
145 ## <a id="custom-attributes"></a> Custom Attributes
147 In addition to built-in attributes you can define your own attributes:
149 object Host "localhost" {
153 Valid values for custom attributes include:
155 * [Strings](19-language-reference.md#string-literals), [numbers](19-language-reference.md#numeric-literals) and [booleans](19-language-reference.md#boolean-literals)
156 * [Arrays](19-language-reference.md#array) and [dictionaries](19-language-reference.md#dictionary)
157 * [Functions](3-monitoring-basics.md#custom-attributes-functions)
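
The following sketch shows one custom attribute per value type. The host name, the address (taken from the documentation range `192.0.2.0/24`) and all attribute names are hypothetical:

```icinga2
object Host "custom-attribute-example" {
  check_command = "hostalive"
  address = "192.0.2.10"

  vars.os = "Linux"                                  // string
  vars.ssh_port = 2222                               // number
  vars.production = true                             // boolean
  vars.dns_servers = [ "192.0.2.53", "192.0.2.54" ]  // array
  vars.disks = {                                     // dictionary
    root = "/"
    data = "/srv"
  }
}
```
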
159 ### <a id="custom-attributes-functions"></a> Functions as Custom Attributes
161 Icinga 2 lets you specify [functions](19-language-reference.md#functions) for custom attributes.
162 The special case here is that whenever Icinga 2 needs the value for such a custom attribute it runs
163 the function and uses whatever value the function returns:

    object CheckCommand "random-value" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_dummy", "0", "$text$" ]

      vars.text = {{ Math.random() * 100 }}
    }

173 This example uses the [abbreviated lambda syntax](19-language-reference.md#nullary-lambdas).
175 These functions have access to a number of variables:
177 Variable | Description
178 -------------|---------------
179 user | The User object (for notifications).
180 service | The Service object (for service checks/notifications/event handlers).
181 host | The Host object.
182 command | The command object (e.g. a CheckCommand object for checks).
186 vars.text = {{ host.check_interval }}
188 In addition to these variables the `macro` function can be used to retrieve the
189 value of arbitrary macro expressions:

      if (macro("$address$") == "127.0.0.1") {
        log("Running a check for localhost!")
      }

Accessing object attributes at runtime inside these functions is described in the
[advanced topics](5-advanced-topics.md#access-object-attributes-at-runtime) chapter.
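
Putting these pieces together, a function-valued custom attribute that uses `macro` might look like the following sketch. The command name `dummy-address-text` and the returned strings are hypothetical; `check_dummy` is assumed to be available in `PluginDir`, as in the `random-value` example:

```icinga2
object CheckCommand "dummy-address-text" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_dummy", "0", "$text$" ]

  // Evaluated each time Icinga 2 needs the value of `text`
  vars.text = {{
    if (macro("$address$") == "127.0.0.1") {
      return "Checked localhost"
    }

    return "Checked " + macro("$address$")
  }}
}
```
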
202 ## <a id="runtime-macros"></a> Runtime Macros
Macros can be used to access other objects' attributes at runtime. For example they
are used in command definitions to figure out which IP address a check should be
run against:

    object CheckCommand "my-ping" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_ping", "-H", "$ping_address$" ]

      arguments = {
        "-w" = "$ping_wrta$,$ping_wpl$%"
        "-c" = "$ping_crta$,$ping_cpl$%"
        "-p" = "$ping_packets$"
      }

      vars.ping_address = "$address$"

      vars.ping_packets = 5
    }

    object Host "router" {
      check_command = "my-ping"
    }

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

238 We can also directly refer to custom attributes, e.g. by using `$ping_wrta$`. Icinga
239 automatically tries to find the closest match for the attribute you specified. The
240 exact rules for this are explained in the next section.
243 ### <a id="macro-evaluation-order"></a> Evaluation Order
When executing commands Icinga 2 checks the following objects in this order to look
up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom attributes in the `Vars` constant

This evaluation order allows you to define default values for custom attributes
in your command objects.

Here's how you can override the custom attribute `ping_packets` from the previous
example:

    object Service "ping" {
      host_name = "localhost"
      check_command = "my-ping"

      vars.ping_packets = 10 // Overrides the default value of 5 given in the command
    }

267 If a custom attribute isn't defined anywhere an empty value is used and a warning is
268 written to the Icinga 2 log.
270 You can also directly refer to a specific attribute - thereby ignoring these evaluation
271 rules - by specifying the full attribute name:

    $service.vars.ping_wrta$

275 This retrieves the value of the `ping_wrta` custom attribute for the service. This
276 returns an empty value if the service does not have such a custom attribute no matter
277 whether another object such as the host has this attribute.
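
To illustrate the difference, the following sketch of a check command uses full attribute names for its thresholds, so only the service's own custom attributes are consulted. The command name `my-ping-service-thresholds` is hypothetical; the threshold attribute names are the ones used in the `my-ping` example:

```icinga2
object CheckCommand "my-ping-service-thresholds" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_ping", "-H", "$address$" ]

  arguments = {
    // Full attribute names bypass the evaluation order: values defined on
    // the host or on this command object are never used for these macros.
    "-w" = "$service.vars.ping_wrta$,$service.vars.ping_wpl$%"
    "-c" = "$service.vars.ping_crta$,$service.vars.ping_cpl$%"
  }
}
```

With this variant every service has to define all four threshold attributes itself, since defaults set in the command's `vars` are not picked up.
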
280 ### <a id="host-runtime-macros"></a> Host Runtime Macros
The following host macros are available in all commands that are executed for
hosts or services:

Name                         | Description
-----------------------------|--------------
287 host.name | The name of the host object.
288 host.display_name | The value of the `display_name` attribute.
289 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
290 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
291 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
292 host.check_attempt | The current check attempt number.
293 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
294 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
295 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
296 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
297 host.last_state_change | The last state change's timestamp.
298 host.downtime_depth | The number of active downtimes.
299 host.duration_sec | The time since the last state change.
300 host.latency | The host's check latency.
301 host.execution_time | The host's check execution time.
302 host.output | The last check's output.
303 host.perfdata | The last check's performance data.
304 host.last_check | The timestamp when the last check was executed.
305 host.check_source | The monitoring instance that performed the last check.
306 host.num_services | Number of services associated with the host.
307 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
308 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
309 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
310 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
312 ### <a id="service-runtime-macros"></a> Service Runtime Macros
The following service macros are available in all commands that are executed for
services:

Name                       | Description
---------------------------|--------------
319 service.name | The short name of the service object.
320 service.display_name | The value of the `display_name` attribute.
321 service.check_command | The short name of the command along with any arguments to be used for the check.
322 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
323 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
324 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
325 service.check_attempt | The current check attempt number.
326 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
327 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
328 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
329 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
330 service.last_state_change | The last state change's timestamp.
331 service.downtime_depth | The number of active downtimes.
332 service.duration_sec | The time since the last state change.
333 service.latency | The service's check latency.
334 service.execution_time | The service's check execution time.
335 service.output | The last check's output.
336 service.perfdata | The last check's performance data.
337 service.last_check | The timestamp when the last check was executed.
338 service.check_source | The monitoring instance that performed the last check.
340 ### <a id="command-runtime-macros"></a> Command Runtime Macros
The following macros are available in all commands:

Name                   | Description
-----------------------|--------------
346 command.name | The name of the command object.
348 ### <a id="user-runtime-macros"></a> User Runtime Macros
The following macros are available in all commands that are executed for
users:

Name                   | Description
-----------------------|--------------
user.name              | The name of the user object.
user.display_name      | The value of the `display_name` attribute.

358 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros

Name                   | Description
-----------------------|--------------
362 notification.type | The type of the notification.
363 notification.author | The author of the notification comment, if existing.
364 notification.comment | The comment of the notification, if existing.
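
To illustrate how these macros end up in a command, here is a minimal sketch of a notification command that passes host, user and notification macros to a script through environment variables. The command name `log-notification` and the script path are hypothetical; the `plugin-notification-command` template and the `SysconfDir` constant are assumed to be available in your installation:

```icinga2
object NotificationCommand "log-notification" {
  import "plugin-notification-command"

  // Hypothetical script; replace with your own notification handler
  command = [ SysconfDir + "/icinga2/scripts/log-notification.sh" ]

  // Runtime macros are resolved right before the command is executed
  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    NOTIFICATIONAUTHOR = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTNAME = "$host.name$"
    HOSTSTATE = "$host.state$"
    HOSTOUTPUT = "$host.output$"
    USERNAME = "$user.name$"
    LONGDATETIME = "$icinga.long_date_time$"
  }
}
```
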
366 ### <a id="global-runtime-macros"></a> Global Runtime Macros
The following macros are available in all executed commands:

Name                   | Description
-----------------------|--------------
372 icinga.timet | Current UNIX timestamp.
373 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
374 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
375 icinga.date | Current date. Example: `2014-01-03`
376 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
377 icinga.uptime | Current uptime of the Icinga 2 process.

The following macros provide global statistics:

Name                              | Description
----------------------------------|--------------
383 icinga.num_services_ok | Current number of services in state 'OK'.
384 icinga.num_services_warning | Current number of services in state 'Warning'.
385 icinga.num_services_critical | Current number of services in state 'Critical'.
386 icinga.num_services_unknown | Current number of services in state 'Unknown'.
387 icinga.num_services_pending | Current number of pending services.
388 icinga.num_services_unreachable | Current number of unreachable services.
389 icinga.num_services_flapping | Current number of flapping services.
390 icinga.num_services_in_downtime | Current number of services in downtime.
391 icinga.num_services_acknowledged | Current number of acknowledged service problems.
392 icinga.num_hosts_up | Current number of hosts in state 'Up'.
393 icinga.num_hosts_down | Current number of hosts in state 'Down'.
394 icinga.num_hosts_unreachable | Current number of unreachable hosts.
395 icinga.num_hosts_flapping | Current number of flapping hosts.
396 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
397 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
400 ## <a id="using-apply"></a> Apply Rules
Instead of assigning each object ([Service](6-object-types.md#objecttype-service),
[Notification](6-object-types.md#objecttype-notification), [Dependency](6-object-types.md#objecttype-dependency),
[ScheduledDowntime](6-object-types.md#objecttype-scheduleddowntime))
individually based on attribute identifiers such as `host_name`, objects can be [applied](19-language-reference.md#apply).
407 Before you start using the apply rules keep the following in mind:

* Define the best match.
    * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes) for these hosts/services?
    * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
    * A generic pattern [match](19-language-reference.md#function-calls) on the host/service name?
    * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](19-language-reference.md#expression-operators)
* All expressions must return a boolean value (an empty string is equal to `false` e.g.)

418 > You can set/override object attributes in apply rules using the respectively available
419 > objects in that scope (host and/or service objects).
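
The matching strategies listed above could look like the following sketch. The service names, the host group `linux-servers`, the custom attribute `os` and the name pattern `web-*` are hypothetical; the `ssh` and `load` check commands are assumed to come from the ITL:

```icinga2
// Match on group membership combined with a custom attribute
apply Service "ssh-linux-group" {
  import "generic-service"
  check_command = "ssh"

  assign where "linux-servers" in host.groups && host.vars.os == "Linux"
}

// Match on a generic pattern in the host name
apply Service "load-web" {
  import "generic-service"
  check_command = "load"

  assign where match("web-*", host.name)
}
```
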

[Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can
not only match on their existence or values in apply expressions, but also assign
("inherit") their values to the objects generated from apply rules.

425 * [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
426 * [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
427 * [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
428 * [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)

A more advanced example is using [apply with for loops on arrays or
dictionaries](3-monitoring-basics.md#using-apply-for), for example provided by
[custom attributes](3-monitoring-basics.md#custom-attributes) or groups.

436 > Building configuration in that dynamic way requires detailed information
437 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
438 > after successful [configuration validation](8-cli-commands.md#config-validation).
441 ### <a id="using-apply-expressions"></a> Apply Rules Expressions
You can use simple or advanced combinations of apply rule expressions. Each
expression must evaluate to the boolean `true` value for the rule to match. An empty
string, for instance, is interpreted as `false`. Similarly, undefined
attributes evaluate to `false`.

450 assign where host.vars.attribute_does_not_exist
452 Multiple `assign where` condition rows are evaluated as `OR` condition.
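
For instance, in the following sketch the service is generated if either of the two `assign where` rows matches, unless the `ignore where` row matches as well. The service name and the custom attributes `os` and `no_load_check` are hypothetical; the `load` check command is assumed to come from the ITL:

```icinga2
apply Service "load-unix" {
  import "generic-service"
  check_command = "load"

  // Separate rows are combined with OR ...
  assign where host.vars.os == "Linux"
  assign where host.vars.os == "FreeBSD"

  // ... and a matching ignore row always excludes the host
  ignore where host.vars.no_load_check == true
}
```
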

You can combine multiple expressions to match only a subset of objects. In some cases
you want to add more than one `assign where`/`ignore where` expression which matches
a specific condition. To achieve this you can use the logical `and` (`&&`) and `or` (`||`) operators.

The following example matches all hosts whose name matches the `*mysql*` pattern and (`&&`) whose
custom attribute `prod_mysql_db` matches the `db-*` pattern. Hosts with the custom attribute
`test_server` set to `true` are ignored, as are hosts whose name matches the `*internal` pattern.

    object HostGroup "mysql-server" {
      display_name = "MySQL Server"

      assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
      ignore where host.vars.test_server == true
      ignore where match("*internal", host.name)
    }

A similar example uses advanced notification apply rule filters: the notification is
assigned if the service attribute `notes` contains the `has gold support 24x7` string `AND` one of
two conditions passes: either the `customer` host custom attribute is set to `customer-xy`
`OR` the host custom attribute `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `*internal`
`OR` whose `priority` custom attribute is [less than](19-language-reference.md#expression-operators) `2`
while (`AND`) the host custom attribute `is_clustered` is set to `true`.

    template Notification "cust-xy-notification" {
      users = [ "noc-xy", "mgmt-xy" ]
      command = "mail-service-notification"
    }

    apply Notification "notify-cust-xy-mysql" to Service {
      import "cust-xy-notification"

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }

493 ### <a id="using-apply-services"></a> Apply Services to Hosts
495 The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
496 and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.
498 The example for `ssh` applies a service object to all hosts with the `address`
499 attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.

    apply Service "ssh" {
      import "generic-service"

      check_command = "ssh"

      assign where host.address && host.vars.os == "Linux"
    }

510 Other detailed scenario examples are used in their respective chapters, for example
511 [apply services with custom command arguments](3-monitoring-basics.md#command-passing-parameters).
513 ### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

    apply Notification "mail-noc" to Service {
      import "mail-service-notification"

      user_groups = [ "noc" ]

      assign where host.vars.notification.mail
    }

In this example the `mail-noc` notification will be created as object for all services whose host has the
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.

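
A host matching the `mail-noc` rule could look like the following sketch. The host name and address are hypothetical; the only requirement taken from the rule is that `vars.notification.mail` is set to a non-empty value:

```icinga2
object Host "mail-noc-example" {
  check_command = "hostalive"
  address = "192.0.2.20"   // documentation address, replace with your own

  // Any non-empty value makes `host.vars.notification.mail` evaluate to true
  vars.notification["mail"] = {
    groups = [ "noc" ]
  }
}
```
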
532 It is also possible to generally apply a notification template and dynamically overwrite values from
533 the template by checking for custom attributes. This can be achieved by using [conditional statements](19-language-reference.md#conditional-statements):

    apply Notification "host-mail-noc" to Host {
      import "mail-host-notification"

      // replace interval inherited from `mail-host-notification` template with new notification interval set by a host custom attribute
      if (host.vars.notification_interval) {
        interval = host.vars.notification_interval
      }

      // same with notification period
      if (host.vars.notification_period) {
        period = host.vars.notification_period
      }

      // Send SMS instead of email if the host's custom attribute `notification_type` is set to `sms`
      if (host.vars.notification_type == "sms") {
        command = "sms-host-notification"
      } else {
        command = "mail-host-notification"
      }

      user_groups = [ "noc" ]

      assign where host.address
    }

In the example above, the notification template `mail-host-notification`, which contains all relevant
notification settings, is applied to all host objects where `host.address` is defined.
Each host object is then checked for custom attributes (`host.vars.notification_interval`,
`host.vars.notification_period` and `host.vars.notification_type`). Depending on whether the custom
attribute is set or which value it has, the value from the notification template is dynamically
overwritten.

567 The corresponding Host object could look like this:

    object Host "host1" {
      import "host-linux-prod"
      display_name = "host1"
      address = "192.168.1.50"
      vars.notification_interval = 1h
      vars.notification_period = "24x7"
      vars.notification_type = "sms"
    }

578 ### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
580 Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
582 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
584 The sample configuration includes an example in [downtimes.conf](4-configuring-icinga-2.md#downtimes-conf).
586 Detailed examples can be found in the [recurring downtimes](5-advanced-topics.md#recurring-downtimes) chapter.
589 ### <a id="using-apply-for"></a> Using Apply For Rules
In addition to the standard way of using [apply rules](3-monitoring-basics.md#using-apply)
there is the requirement of generating apply rule objects based on a set (array or
dictionary).

The sample configuration already includes a detailed example in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
and [services.conf](4-configuring-icinga-2.md#services-conf) for this use case.

Take the following example: A host provides the SNMP OIDs for different service check
types. This could look like the following:

    object Host "router-v6" {
      check_command = "hostalive"

      vars.oids["if01"] = "1.1.1.1.1"
      vars.oids["temp"] = "1.1.1.1.2"
      vars.oids["bgp"] = "1.1.1.1.5"
    }

Now we want to create service checks for `if01` and `temp` but not `bgp`.
Furthermore we want to pass the SNMP OID stored as dictionary value to the
custom attribute called `vars.snmp_oid`. This is the command argument required
by the [snmp](7-icinga-template-library.md#plugin-check-command-snmp) check command.
The service's `display_name` should be set to the identifier inside the dictionary.

    apply Service for (identifier => oid in host.vars.oids) {
      check_command = "snmp"
      display_name = identifier
      vars.snmp_oid = oid

      ignore where identifier == "bgp" //don't generate service for bgp checks
    }

653 Icinga 2 evaluates the `apply for` rule for all objects with the custom attribute
654 `oids` set. It then iterates over all list items inside the `for` loop and evaluates the
655 `assign/ignore where` expressions. You can access the loop variable
656 in these expressions, e.g. for ignoring certain values.
657 In this example we'd ignore the `bgp` identifier and avoid generating an unwanted service.
658 We could extend the configuration by also matching the `oid` value on certain regex/wildcard
659 patterns for example.

> You don't need an `assign where` expression which only checks for the existence
> of the custom attribute.

666 That way you'll save duplicated apply rules by combining them into one
667 generic `apply for` rule generating the object name with or without a prefix.
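
As a sketch of the prefix variant, the rule below prepends the hypothetical prefix `snmp-` to each dictionary key, so the `oids` dictionary of the `router-v6` host would produce the services `snmp-if01` and `snmp-temp`. The additional `match()` filter on the `oid` value and its pattern are likewise only an illustration:

```icinga2
apply Service "snmp-" for (identifier => oid in host.vars.oids) {
  check_command = "snmp"
  display_name = "SNMP " + identifier
  vars.snmp_oid = oid

  // Skip a single key and, as an assumed example, a whole OID subtree
  ignore where identifier == "bgp" || match("1.1.1.1.9*", oid)
}
```
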
670 #### <a id="using-apply-for-custom-attribute-override"></a> Apply For and Custom Attribute Override
672 Imagine a different more advanced example: You are monitoring your network device (host)
673 with many interfaces (services). The following requirements/problems apply:
675 * Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc)
676 * Each interface has its own vlan tag
677 * Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be dynamically generated
682 Tip: Define the snmp community as global constant in your [constants.conf](4-configuring-icinga-2.md#constants-conf) file.
684 const IftrafficSnmpCommunity = "public"
686 By defining the `interfaces` dictionary with three example interfaces on the `cisco-catalyst-6509-34`
687 host object, you'll make sure to pass the [custom attribute](3-monitoring-basics.md#custom-attributes)
688 storage required by the for loop in the service apply rule.

    object Host "cisco-catalyst-6509-34" {
      import "generic-host"
      display_name = "Catalyst 6509 #34 VIE21"
      address = "127.0.1.4"

      /* "GigabitEthernet0/2" is the interface name,
       * and key name in service apply for later on
       */
      vars.interfaces["GigabitEthernet0/2"] = {
        /* define all custom attributes with the
         * same name required for command parameters/arguments
         * in service apply (look into your CheckCommand definition)
         */
        iftraffic_units = "g"
        iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }

      vars.interfaces["GigabitEthernet0/4"] = {
        iftraffic_units = "g"
        //iftraffic_community = IftrafficSnmpCommunity
        iftraffic_bandwidth = 1
      }

      vars.interfaces["MgmtInterface1"] = {
        iftraffic_community = IftrafficSnmpCommunity
        interface_address = "127.99.0.100" #special management ip
      }
    }

You can also omit the `"if-"` string; all generated service names are then taken directly
from the `interface_name` variable value.

The `interface_config` dictionary contains all key-value pairs for the specific interface in one
loop cycle, like `iftraffic_units`, `vlan`, and `qos` for the specified interface.

You can map the custom attributes from the `interface_config` dictionary to
local custom attributes stashed into `vars`. If the names already match the required command
argument parameters (for example `iftraffic_units`), you can instead add the
`interface_config` dictionary to the `vars` dictionary using the `+=` operator.

After `vars` is fully populated, all object attributes can be calculated from the
provided host attributes. For strings, you can use string concatenation with the `+` operator.

You can also specify the `display_name`, check command, interval, `notes`, `notes_url`, `action_url`, etc.
attributes that way. Attribute strings can be [concatenated](19-language-reference.md#expression-operators),
for example for adding a more detailed service `display_name`.

This example also uses [if conditions](19-language-reference.md#conditional-statements)
to add a local default value if specific values are not set.
The other way around you can override specific custom attributes inherited from a service template
if they are set.

    /* loop over the host.vars.interfaces dictionary
     * for (key => value in dict) means `interface_name` as key
     * and `interface_config` as value. Access config attributes
     * with the indexer (`.`) character.
     */
    apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
      import "generic-service"
      check_command = "iftraffic"
      display_name = "IF-" + interface_name

      /* use the key as command argument (no duplication of values in host.vars.interfaces) */
      vars.iftraffic_interface = interface_name

      /* map the custom attributes as command arguments */
      vars.iftraffic_units = interface_config.iftraffic_units
      vars.iftraffic_community = interface_config.iftraffic_community

      /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
       * are the _exact_ same as required as command parameter by the check command
       */
      vars += interface_config

      /* set a default value for units and bandwidth */
      if (interface_config.iftraffic_units == "") {
        vars.iftraffic_units = "m"
      }
      if (interface_config.iftraffic_bandwidth == "") {
        vars.iftraffic_bandwidth = 1
      }
      if (interface_config.vlan == "") {
        vars.vlan = "not set"
      }
      if (interface_config.qos == "") {
        vars.qos = "not set"
      }

      /* set the global constant if not explicitly
       * provided by the `interfaces` dictionary on the host
       */
      if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
        vars.iftraffic_community = IftrafficSnmpCommunity
      }

      /* Calculate some additional object attributes after populating the `vars` dictionary */
      notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with ' QoS '" + vars.qos + "'"
      notes_url = "http://foreman.company.com/hosts/" + host.name
      action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + interface_name
    }

This example makes use of the [check_iftraffic](https://exchange.icinga.org/exchange/iftraffic) plugin.
The `CheckCommand` definition can be found in the
[contributed plugin check commands](7-icinga-template-library.md#plugins-contrib-command-iftraffic).
Make sure to include them in your [icinga2 configuration file](4-configuring-icinga-2.md#icinga2-conf).
806 > Building configuration in that dynamic way requires detailed information
807 > of the generated objects. Use the `object list` [CLI command](8-cli-commands.md#cli-command-object)
808 > after successful [configuration validation](8-cli-commands.md#config-validation).
810 Verify that the apply-for-rule successfully created the service objects with the
811 inherited custom attributes:
814 # icinga2 object list --type Service --name *catalyst*
816 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
819 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
820 * iftraffic_bandwidth = 1
821 * iftraffic_community = "public"
822 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
823 * iftraffic_interface = "GigabitEthernet0/2"
824 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
825 * iftraffic_units = "g"
826 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
831 Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
834 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
835 * iftraffic_bandwidth = 1
836 * iftraffic_community = "public"
837 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
838 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
839 * iftraffic_interface = "GigabitEthernet0/4"
840 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
841 * iftraffic_units = "g"
842 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
846 Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
849 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
850 * iftraffic_bandwidth = 1
851 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
852 * iftraffic_community = "public"
853 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
854 * iftraffic_interface = "MgmtInterface1"
855 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
856 * iftraffic_units = "m"
857 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
858 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
859 * interface_address = "127.99.0.100"
861 % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
865 ### <a id="using-apply-object-attributes"></a> Use Object Attributes in Apply Rules
867 Since apply rules are evaluated after the generic objects, you
868 can reference existing host and/or service object attributes as
869 values for any object attribute specified in that apply rule.
871 object Host "opennebula-host" {
872 import "generic-host"
875 vars.hosting["xyz"] = {
877 customer_name = "Customer xyz"
879 support_contract = "gold"
881 vars.hosting["abc"] = {
        customer_name = "Customer abc"
885 support_contract = "silver"
889 apply Service for (customer => config in host.vars.hosting) {
890 import "generic-service"
891 check_command = "ping4"
893 vars.qos = "disabled"
      vars.http_uri = "/" + customer + "/" + config.http_uri
899 display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
901 notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
903 notes_url = "http://foreman.company.com/hosts/" + host.name
904 action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
907 ## <a id="groups"></a> Groups
909 A group is a collection of similar objects. Groups are primarily used as a
910 visualization aid in web interfaces.
Group membership is defined at the respective object itself. If
you have a host group named `windows`, for example, and want to assign
specific hosts to this group so that you can view the group later on your
alert dashboard, first create a `HostGroup` object:
917 object HostGroup "windows" {
918 display_name = "Windows Servers"
921 Then add your hosts to this group:
923 template Host "windows-server" {
924 groups += [ "windows" ]
927 object Host "mssql-srv1" {
928 import "windows-server"
930 vars.mssql_port = 1433
933 object Host "mssql-srv2" {
934 import "windows-server"
936 vars.mssql_port = 1433
939 This can be done for service and user groups the same way:
941 object UserGroup "windows-mssql-admins" {
942 display_name = "Windows MSSQL Admins"
945 template User "generic-windows-mssql-users" {
946 groups += [ "windows-mssql-admins" ]
949 object User "win-mssql-noc" {
950 import "generic-windows-mssql-users"
952 email = "noc@example.com"
955 object User "win-mssql-ops" {
956 import "generic-windows-mssql-users"
958 email = "ops@example.com"
961 ### <a id="group-assign-intro"></a> Group Membership Assign
963 Instead of manually assigning each object to a group you can also assign objects
964 to a group based on their attributes:
966 object HostGroup "prod-mssql" {
967 display_name = "Production MSSQL Servers"
969 assign where host.vars.mssql_port && host.vars.prod_mysql_db
970 ignore where host.vars.test_server == true
971 ignore where match("*internal", host.name)
In this example, all hosts that have both the `mssql_port` and the `prod_mysql_db`
custom attribute set will be added as members to the host group `prod-mssql`.
However, hosts whose name matches `*internal` or which have the `test_server`
attribute set to `true` are not added to this group.
Details on the `assign where` syntax can be found in the
[Language Reference](19-language-reference.md#apply).
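
Service groups can be populated with the same mechanism. The following is a minimal sketch only: the group name `mssql-checks` and the `mssql*` service name pattern are hypothetical and not part of the example configuration.

```icinga2
/* hypothetical service group, names chosen for illustration only */
object ServiceGroup "mssql-checks" {
  display_name = "MSSQL Service Checks"

  /* add every service whose name starts with "mssql" */
  assign where match("mssql*", service.name)

  /* skip services on hosts flagged as test servers */
  ignore where host.vars.test_server == true
}
```

Because the rule is evaluated for services, both `service` and `host` attributes can be used in the expressions.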
982 ## <a id="notifications"></a> Notifications
Notifications for service and host problems are an integral part of your
monitoring setup.

When a host or service is in a downtime, a problem has been acknowledged, or
the dependency logic has determined that the host/service is unreachable, no
notifications are sent. You can configure additional type and state filters
to further refine which notifications are actually sent.
There are many ways of sending notifications, e.g. by e-mail, XMPP,
IRC, Twitter, etc. On its own, Icinga 2 does not know how to send notifications.
Instead, it relies on external mechanisms such as shell scripts to notify users.
More notification methods are listed in the [addons and plugins](13-addons-plugins.md#notification-scripts-interfaces)
chapter.
998 A notification specification requires one or more users (and/or user groups)
999 who will be notified in case of problems. These users must have all custom
1000 attributes defined which will be used in the `NotificationCommand` on execution.
The user `icingaadmin` in the example below will only get notified for the `OK`, `WARNING` and
`CRITICAL` states and for the `Problem` and `Recovery` notification types.
1005 object User "icingaadmin" {
1006 display_name = "Icinga 2 Admin"
1007 enable_notifications = true
1008 states = [ OK, Warning, Critical ]
1009 types = [ Problem, Recovery ]
1010 email = "icinga@localhost"
1013 If you don't set the `states` and `types` configuration attributes for the `User`
1014 object, notifications for all states and types will be sent.
1016 Details on troubleshooting notification problems can be found [here](16-troubleshooting.md#troubleshooting).
1020 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1021 > in order to execute notification commands.
You should choose which information you (and your notified users) are interested in
case of emergency, and also which information does not provide any value to you and
your environment.
1027 An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
You can add all shared attributes to a `Notification` template which is inherited
by the defined notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.

    template Notification "generic-notification" {
      interval = 15m

      command = "mail-service-notification"

      states = [ Warning, Critical, Unknown ]
      types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]

      period = "24x7"
    }

1045 The time period `24x7` is included as example configuration with Icinga 2.
1047 Use the `apply` keyword to create `Notification` objects for your services:

    apply Notification "notify-cust-xy-mysql" to Service {
      import "generic-notification"

      users = [ "noc-xy", "mgmt-xy" ]

      assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
      ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
    }

1059 Instead of assigning users to notifications, you can also add the `user_groups`
1060 attribute with a list of user groups to the `Notification` object. Icinga 2 will
1061 send notifications to all group members.
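
A minimal sketch of the `user_groups` variant follows. The group name `cust-xy-oncall`, the notification name and the e-mail address are hypothetical; the user `noc-xy` and the `generic-notification` template are taken from the examples above.

```icinga2
/* hypothetical user group for illustration */
object UserGroup "cust-xy-oncall" {
  display_name = "Customer XY On-Call"
}

object User "noc-xy" {
  /* membership is defined at the user object */
  groups = [ "cust-xy-oncall" ]
  email = "noc-xy@example.com"
}

apply Notification "notify-cust-xy-group" to Service {
  import "generic-notification"

  /* every member of the listed groups is notified */
  user_groups = [ "cust-xy-oncall" ]

  assign where host.vars.customer == "customer-xy"
}
```

Adding or removing a user from the group then changes who is notified without touching the `Notification` rule.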
1065 > Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
1066 > states for services, `Down` for hosts) will receive `Recovery` notifications.
1068 ### <a id="notification-escalations"></a> Notification Escalations
1070 When a problem notification is sent and a problem still exists at the time of re-notification
1071 you may want to escalate the problem to the next support level. A different approach
1072 is to configure the default notification by email, and escalate the problem via SMS
1073 if not already solved.
1075 You can define notification start and end times as additional configuration
1076 attributes making the `Notification` object a so-called `notification escalation`.
1077 Using templates you can share the basic notification attributes such as users or the
1078 `interval` (and override them for the escalation then).
Using the example from above, you can define additional users who are notified by SMS
when the problem is escalated between the start and end time.
1083 object User "icinga-oncall-2nd-level" {
1084 display_name = "Icinga 2nd Level"
1086 vars.mobile = "+1 555 424642"
1089 object User "icinga-oncall-1st-level" {
1090 display_name = "Icinga 1st Level"
1092 vars.mobile = "+1 555 424642"
1095 Define an additional [NotificationCommand](3-monitoring-basics.md#notification-commands) for SMS notifications.
1099 > The example is not complete as there are many different SMS providers.
1100 > Please note that sending SMS notifications will require an SMS provider
1101 > or local hardware with a SIM card active.
1103 object NotificationCommand "sms-notification" {
1105 PluginDir + "/send_sms_notification",
1110 The two new notification escalations are added onto the local host
1111 and its service `ping4` using the `generic-notification` template.
1112 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
1113 command) after `30m` until `1h`.
> The `interval` was set to 15m in the `generic-notification`
> template example. Lower that value in your escalations by using a secondary
> template or by overriding the attribute directly in the
> `escalation-sms-2nd-level` notification.
If the problem is neither resolved nor acknowledged (either of which would prevent further
notifications), the user `icinga-oncall-1st-level` will be notified `1h` after the initial
problem notification, but only for one hour (`2h` as `end` key for the `times` dictionary).

    apply Notification "mail" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-2nd-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-2nd-level" ]

      /* escalation window: from 30 minutes until 1 hour after the problem */
      times = {
        begin = 30m
        end = 1h
      }

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-1st-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-1st-level" ]

      /* escalation window: from 1 hour until 2 hours after the problem */
      times = {
        begin = 1h
        end = 2h
      }

      assign where service.name == "ping4"
    }

1163 ### <a id="notification-delay"></a> Notification Delay
Sometimes a problem should not be notified as soon as the notification is due
(when the object reaches the `HARD` state) but only after a defined delay. In Icinga 2
you can use the `times` dictionary and set `begin = 15m` if you want to
postpone the notification window by 15 minutes. Leave out the `end` key: if it is not set,
Icinga 2 will not check against any end time for this notification. Make sure to
specify a relatively low notification `interval` so that you are notified again soon enough.
1172 apply Notification "mail" to Service {
1173 import "generic-notification"
1175 command = "mail-notification"
1176 users = [ "icingaadmin" ]
1180 times.begin = 15m // delay notification window
1182 assign where service.name == "ping4"
1185 ### <a id="disable-renotification"></a> Disable Re-notifications
1187 If you prefer to be notified only once, you can disable re-notifications by setting the
1188 `interval` attribute to `0`.
1190 apply Notification "notify-once" to Service {
1191 import "generic-notification"
1193 command = "mail-notification"
1194 users = [ "icingaadmin" ]
1196 interval = 0 // disable re-notification
1198 assign where service.name == "ping4"
1201 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
1203 If there are no notification state and type filter attributes defined at the `Notification`
1204 or `User` object Icinga 2 assumes that all states and types are being notified.
1206 Available state and type filters for notifications are:
1208 template Notification "generic-notification" {
1210 states = [ Warning, Critical, Unknown ]
1211 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
1212 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state to allow more fine-grained filtering, for example on downtimes and flapping.
You can filter for acknowledgements and custom notifications too.
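
As an illustration of type filtering, the following sketch overrides the `types` attribute inherited from the `generic-notification` template so that only acknowledgements and custom notifications are sent. The notification name `ack-and-custom` is hypothetical; the command and user are taken from the earlier examples.

```icinga2
apply Notification "ack-and-custom" to Service {
  import "generic-notification"

  command = "mail-service-notification"
  users = [ "icingaadmin" ]

  /* a plain assignment replaces the type filter inherited from the template */
  types = [ Acknowledgement, Custom ]

  assign where service.name == "ping4"
}
```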
1220 ## <a id="commands"></a> Commands
1222 Icinga 2 uses three different command object types to specify how
1223 checks should be performed, notifications should be sent, and
1224 events should be handled.
1226 ### <a id="check-commands"></a> Check Commands
[CheckCommand](6-object-types.md#objecttype-checkcommand) objects define the command line with which
a check is called.
1231 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects are referenced by
1232 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1233 using the `check_command` attribute.
> Make sure that the [checker](8-cli-commands.md#features) feature is enabled in order to
> execute checks.
1240 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
1242 [CheckCommand](6-object-types.md#objecttype-checkcommand) objects require the [ITL template](7-icinga-template-library.md#itl-plugin-check-command)
1243 `plugin-check-command` to support native plugin based check methods.
1245 Unless you have done so already, download your check plugin and put it
1246 into the [PluginDir](4-configuring-icinga-2.md#constants-conf) directory. The following example uses the
1247 `check_mysql` plugin contained in the Monitoring Plugins package.
The plugin path and all command arguments are specified as a list of
double-quoted string arguments for proper shell escaping.

Call the `check_mysql` plugin with the `--help` parameter to see
all available options:
1257 icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help
1260 This program tests connections to a MySQL server
1263 check_mysql [-d database] [-H host] [-P port] [-s socket]
1264 [-u user] [-p password] [-S] [-l] [-a cert] [-k key]
1265 [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group]
The next step is to understand how [command parameters](3-monitoring-basics.md#command-passing-parameters)
are passed from a host or service object, and to add a [CheckCommand](6-object-types.md#objecttype-checkcommand)
definition based on these required parameters and/or default values.
1271 Please continue reading in the [plugins section](13-addons-plugins.md#plugins) for additional integration examples.
1273 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
1275 Check command parameters are defined as custom attributes which can be accessed as runtime macros
1276 by the executed check command.
1278 The check command parameters for ITL provided plugin check command definitions are documented
1279 [here](7-icinga-template-library.md#plugin-check-commands), for example
1280 [disk](7-icinga-template-library.md#plugin-check-command-disk).
1282 In order to practice passing command parameters you should [integrate your own plugin](3-monitoring-basics.md#command-plugin-integration).
1284 The following example will use `check_mysql` provided by the [Monitoring Plugins installation](2-getting-started.md#setting-up-check-plugins).
Define the default check command custom attributes, for example `mysql_user` and `mysql_password`
(the naming schema is freely definable), and optionally their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
on the command line.
1293 > Use a common command type as prefix for your command arguments to increase
1294 > readability. `mysql_user` helps understanding the context better than just
1295 > `user` as argument.
1297 The default custom attributes can be overridden by the custom attributes
1298 defined in the host or service using the check command `my-mysql`. The custom attributes
1299 can also be inherited from a parent template using additive inheritance (`+=`).
1301 # vim /etc/icinga2/conf.d/commands.conf
1303 object CheckCommand "my-mysql" {
1304 import "plugin-check-command"
1306 command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir
1309 "-H" = "$mysql_host$"
1312 value = "$mysql_user$"
1314 "-p" = "$mysql_password$"
1315 "-P" = "$mysql_port$"
1316 "-s" = "$mysql_socket$"
1317 "-a" = "$mysql_cert$"
1318 "-d" = "$mysql_database$"
1319 "-k" = "$mysql_key$"
1320 "-C" = "$mysql_ca_cert$"
1321 "-D" = "$mysql_ca_dir$"
1322 "-L" = "$mysql_ciphers$"
1323 "-f" = "$mysql_optfile$"
1324 "-g" = "$mysql_group$"
1326 set_if = "$mysql_check_slave$"
1327 description = "Check if the slave thread is running properly."
1330 set_if = "$mysql_ssl$"
1331 description = "Use ssl encryption"
1335 vars.mysql_check_slave = false
1336 vars.mysql_ssl = false
1337 vars.mysql_host = "$address$"
The check command definition also sets `mysql_host` to the `$address$` default value. You can override
this command parameter if, for example, your MySQL server is not reachable at the host's own IP address.

Make sure to pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
`MysqlUsername` and `MysqlPassword` are specified as [global constants](4-configuring-icinga-2.md#constants-conf)
in this example.
1347 # vim /etc/icinga2/conf.d/services.conf
1349 apply Service "mysql-icinga-db-health" {
1350 import "generic-service"
1352 check_command = "my-mysql"
1354 vars.mysql_user = MysqlUsername
1355 vars.mysql_password = MysqlPassword
1357 vars.mysql_database = "icinga"
1358 vars.mysql_host = "192.168.33.11"
1360 assign where match("icinga2*", host.name)
1361 ignore where host.vars.no_health_check == true
1365 Take a different example: The example host configuration in [hosts.conf](4-configuring-icinga-2.md#hosts-conf)
1366 also applies an `ssh` service check. Your host's ssh port is not the default `22`, but set to `2022`.
1367 You can pass the command parameter as custom attribute `ssh_port` directly inside the service apply rule
1368 inside [services.conf](4-configuring-icinga-2.md#services-conf):
1370 apply Service "ssh" {
1371 import "generic-service"
1373 check_command = "ssh"
1374 vars.ssh_port = 2022 //custom command parameter
1376 assign where (host.address || host.address6) && host.vars.os == "Linux"
1379 If you prefer this being configured at the host instead of the service, modify the host configuration
1380 object instead. The runtime macro resolving order is described [here](3-monitoring-basics.md#macro-evaluation-order).
1382 object Host NodeName {
1384 vars.ssh_port = 2022
1387 #### <a id="command-passing-parameters-apply-for"></a> Passing Check Command Parameters Using Apply For
The host `my-server` with the services generated from the `basic-partitions` dictionary (see
[apply for](3-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
free disk space).

The custom attribute `disk_partitions` can either hold a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.
1397 object Host "my-server" {
1398 import "generic-host"
1399 address = "127.0.0.1"
1402 vars.local_disks["basic-partitions"] = {
1403 disk_partitions = [ "/", "/tmp", "/var", "/home" ]
1407 apply Service for (disk => config in host.vars.local_disks) {
1408 import "generic-service"
1409 check_command = "my-disk"
1413 vars.disk_wfree = "10%"
1414 vars.disk_cfree = "5%"
1418 More details on using arrays in custom attributes can be found in
1419 [this chapter](3-monitoring-basics.md#custom-attributes).
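
To illustrate both forms, here is a sketch of a host that defines one dictionary entry with a single partition string and one with an array. The host name `my-server2`, its address and the entry names are hypothetical; the `disk_partitions` key follows the example above and is assumed to be the attribute that the `my-disk` check command expects.

```icinga2
object Host "my-server2" {
  import "generic-host"
  address = "127.0.0.2"

  /* single string: one partition is passed to the plugin */
  vars.local_disks["root-only"] = {
    disk_partitions = "/"
  }

  /* array: several partitions are passed to the plugin */
  vars.local_disks["data"] = {
    disk_partitions = [ "/var", "/home" ]
  }
}
```

With the apply-for rule shown above, this host would get one service per dictionary entry.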
1422 #### <a id="command-arguments"></a> Command Arguments
When you define a check command line using the `command` attribute, Icinga 2
will resolve all macros in the static string or array. Sometimes it is
required to extend the arguments list based on a condition evaluated
at command execution, or to make arguments optional so that they are only set if the
macro value can be resolved by Icinga 2.
1430 object CheckCommand "check_http" {
1431 import "plugin-check-command"
1433 command = [ PluginDir + "/check_http" ]
1436 "-H" = "$http_vhost$"
1437 "-I" = "$http_address$"
1439 "-p" = "$http_port$"
1441 set_if = "$http_ssl$"
1444 set_if = "$http_sni$"
1447 value = "$http_auth_pair$"
1448 description = "Username:password on sites with basic authentication"
1451 set_if = "$http_ignore_body$"
1453 "-r" = "$http_expect_body_regex$"
1454 "-w" = "$http_warn_time$"
1455 "-c" = "$http_critical_time$"
1456 "-e" = "$http_expect$"
1459 vars.http_address = "$address$"
1460 vars.http_ssl = false
1461 vars.http_sni = false
The example shows the `check_http` check command defining the most common
arguments. Each of them is optional by default and will be omitted if
the value is not set. For example, if the service calling the check command
does not have `vars.http_port` set, it won't get added to the command
line.
1470 If the `vars.http_ssl` custom attribute is set in the service, host or command
1471 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
1472 numeric value to the command line. String values are not supported.
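
A sketch of how a service might enable this flag, assuming the `check_http` CheckCommand defined above and the `my-server1` host from the beginning of this chapter. The service name `https` and the virtual host name are hypothetical.

```icinga2
object Service "https" {
  import "generic-service"
  host_name = "my-server1"
  check_command = "check_http"

  /* boolean value: the set_if condition for "-S" evaluates to true */
  vars.http_ssl = true

  /* optional argument: "-H" is only added because this value is set */
  vars.http_vhost = "www.example.com"
}
```

Services that leave `vars.http_ssl` untouched keep the command's default of `false`, so `-S` is not added for them.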
If the macro value cannot be resolved, Icinga 2 will not add the defined argument
to the final command argument array. Empty strings for macro values won't omit
the argument.

That way you can use the `check_http` command definition for checks both with and
without SSL enabled, saving you duplicated command definitions.
1481 Details on all available options can be found in the
1482 [CheckCommand object definition](6-object-types.md#objecttype-checkcommand).
1485 #### <a id="command-environment-variables"></a> Environment Variables
1487 The `env` command object attribute specifies a list of environment variables with values calculated
1488 from either runtime macros or custom attributes which should be exported as environment variables
1489 prior to executing the command.
1491 This is useful for example for hiding sensitive information on the command line output
1492 when passing credentials to database checks:
1494 object CheckCommand "mysql-health" {
1495 import "plugin-check-command"
1498 PluginDir + "/check_mysql"
1502 "-H" = "$mysql_address$"
1503 "-d" = "$mysql_database$"
1506 vars.mysql_address = "$address$"
1507 vars.mysql_database = "icinga"
1508 vars.mysql_user = "icinga_check"
1509 vars.mysql_pass = "password"
1511 env.MYSQLUSER = "$mysql_user$"
1512 env.MYSQLPASS = "$mysql_pass$"
1517 ### <a id="notification-commands"></a> Notification Commands
1519 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
1520 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
1522 [NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
1523 [Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
1525 `NotificationCommand` objects require the [ITL template](7-icinga-template-library.md#itl-plugin-notification-command)
1526 `plugin-notification-command` to support native plugin-based notifications.
1530 > Make sure that the [notification](8-cli-commands.md#features) feature is enabled
1531 > in order to execute notification commands.
1533 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
1534 the current check output) sending an email to the user(s) associated with the
1535 notification itself (`$user.email$`).
1537 If you want to specify default values for some of the custom attribute definitions,
1538 you can add a `vars` dictionary as shown for the `CheckCommand` object.
1540 object NotificationCommand "mail-service-notification" {
1541 import "plugin-notification-command"
1543 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
1546 NOTIFICATIONTYPE = "$notification.type$"
1547 SERVICEDESC = "$service.name$"
1548 HOSTALIAS = "$host.display_name$"
1549 HOSTADDRESS = "$address$"
1550 SERVICESTATE = "$service.state$"
1551 LONGDATETIME = "$icinga.long_date_time$"
1552 SERVICEOUTPUT = "$service.output$"
1553 NOTIFICATIONAUTHORNAME = "$notification.author$"
1554 NOTIFICATIONCOMMENT = "$notification.comment$"
1555 HOSTDISPLAYNAME = "$host.display_name$"
1556 SERVICEDISPLAYNAME = "$service.display_name$"
1557 USEREMAIL = "$user.email$"
The `command` attribute in the `mail-service-notification` command refers to the following
shell script. The macros specified in the `env` dictionary are exported
as environment variables and can be used in the notification script:

    #!/usr/bin/env bash
    # build the mail body from the environment variables exported by Icinga 2
    template=$(cat <<TEMPLATE

    Notification Type: $NOTIFICATIONTYPE

    Service: $SERVICEDESC
    Host: $HOSTALIAS
    Address: $HOSTADDRESS
    State: $SERVICESTATE

    Date/Time: $LONGDATETIME

    Additional Info: $SERVICEOUTPUT

    Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
    TEMPLATE
    )

    # quote the variables so that whitespace and line breaks are preserved
    /usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" "$USEREMAIL"

> This example is for `exim` only. Requires changes for `sendmail` and
> other MTAs.
1591 While it's possible to specify the entire notification command right
1592 in the NotificationCommand object it is generally advisable to create a
1593 shell script in the `/etc/icinga2/scripts` directory and have the
1594 NotificationCommand object refer to that.
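
Default values for a notification command, as mentioned above, could be sketched like this. The command name, the `MAILFROM` environment variable and the `notification_from` custom attribute are hypothetical; the script shown earlier does not read `MAILFROM`, so it would need to be extended to make use of it.

```icinga2
object NotificationCommand "mail-service-notification-from" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    SERVICEDESC = "$service.name$"
    USEREMAIL = "$user.email$"
    /* hypothetical variable, resolved from the custom attribute below */
    MAILFROM = "$notification_from$"
  }

  /* default value, used when no other object provides notification_from */
  vars.notification_from = "icinga@localhost"
}
```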
1596 ### <a id="event-commands"></a> Event Commands
Unlike notifications, event commands for hosts/services are called on every
check execution if one of these conditions matches:
1601 * The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
1602 * The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
1603 * The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
1605 [EventCommand](6-object-types.md#objecttype-eventcommand) objects are referenced by
1606 [Host](6-object-types.md#objecttype-host) and [Service](6-object-types.md#objecttype-service) objects
1607 using the `event_command` attribute.
Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
available through runtime macros. Runtime macros such as `$service.state_type$`
and `$service.state$` will be processed by Icinga 2, which helps to trigger
events in a fine-grained manner.
1615 Common use case scenarios are a failing HTTP check requiring an immediate
1616 restart via event command, or if an application is locked and requires
1617 a restart upon detection.
1619 `EventCommand` objects require the ITL template `plugin-event-command`
1620 to support native plugin based checks.
1622 #### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
The following example will trigger a restart of the `httpd` daemon
via ssh when the `http` service check fails. If the service state is
`OK`, it will not trigger any event action.
1631 * icinga user with public key authentication
1632 * icinga user with sudo permissions for restarting the httpd daemon.
1636 # ls /home/icinga/.ssh/
1640 icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
1643 Define a generic [EventCommand](6-object-types.md#objecttype-eventcommand) object `event_by_ssh`
1644 which can be used for all event commands triggered using ssh:
1646 /* pass event commands through ssh */
1647 object EventCommand "event_by_ssh" {
1648 import "plugin-event-command"
1650 command = [ PluginDir + "/check_by_ssh" ]
1653 "-H" = "$event_by_ssh_address$"
1654 "-p" = "$event_by_ssh_port$"
1655 "-C" = "$event_by_ssh_command$"
1656 "-l" = "$event_by_ssh_logname$"
1657 "-i" = "$event_by_ssh_identity$"
1659 set_if = "$event_by_ssh_quiet$"
1661 "-w" = "$event_by_ssh_warn$"
1662 "-c" = "$event_by_ssh_crit$"
1663 "-t" = "$event_by_ssh_timeout$"
1666 vars.event_by_ssh_address = "$address$"
1667 vars.event_by_ssh_quiet = false
1670 The actual event command only passes the `event_by_ssh_command` attribute.
1671 The `event_by_ssh_service` custom attribute takes care of passing the correct
1672 daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
1673 is only restarted when the service is not in an `OK` state.
1676 object EventCommand "event_by_ssh_restart_service" {
1677 import "event_by_ssh"
1679 //only restart the daemon if state > 0 (not-ok)
1680 //requires sudo permissions for the icinga user
1681 vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
1685 Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
1686 which service should be restarted using the `event_by_ssh_service` attribute.
1688 object Service "http" {
1689 import "generic-service"
1690 host_name = "remote-http-host"
1691 check_command = "http"
1693 event_command = "event_by_ssh_restart_service"
1694 vars.event_by_ssh_service = "$host.vars.httpd_name$"
1696 //vars.event_by_ssh_logname = "icinga"
      //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"
Each host with this service must then define the `httpd_name` custom attribute
(for example generated from your CMDB):
1704 object Host "remote-http-host" {
1705 import "generic-host"
1706 address = "192.168.1.100"
1708 vars.httpd_name = "apache2"
1711 You can test-drive this example by manually stopping the `httpd` daemon
1712 on your `remote-http-host`. Enable the `debuglog` feature and tail the
1713 `/var/log/icinga2/debug.log` file.
1715 Remote Host Terminal:
1717 # date; service apache2 status
1718 Mon Sep 15 18:57:39 CEST 2014
1719 Apache2 is running (pid 23651).
1720 # date; service apache2 stop
1721 Mon Sep 15 18:57:47 CEST 2014
1722 [ ok ] Stopping web server: apache2 ... waiting .
1724 Icinga 2 Host Terminal:
1726 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
1727 [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
1728 [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
1729 [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
1730 [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
1731 [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
1733 Remote Host Terminal:
1735 # date; service apache2 status
1736 Mon Sep 15 18:58:44 CEST 2014
1737 Apache2 is running (pid 24908).
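
Because `event_by_ssh_restart_service` reads the daemon name from the `event_by_ssh_service` custom attribute, the same event command can in principle be reused for other daemons. The following is a minimal sketch, assuming a hypothetical `smtp` service on the same host and assuming the remote init script is called `postfix`. Neither is part of the example above.

```
/* hypothetical second service reusing the generic restart handler */
object Service "smtp" {
  import "generic-service"
  host_name = "remote-http-host"
  check_command = "smtp"

  event_command = "event_by_ssh_restart_service"

  // assumption: the daemon is restarted via /etc/init.d/postfix on the remote host
  vars.event_by_ssh_service = "postfix"
}
```

As with the `http` service, the restart is only attempted when the service is in a non-`OK` state, and the `icinga` user needs matching sudo permissions on the remote host.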
1740 ## <a id="dependencies"></a> Dependencies
1742 Icinga 2 uses host and service [Dependency](6-object-types.md#objecttype-dependency) objects
1743 for determining their network reachability.
1745 A service can depend on a host, and vice versa. A service has an implicit
1746 dependency (parent) on its host. A host-to-host dependency acts implicitly
1747 as a host parent relation.
1748 When dependencies are calculated, not only the immediate parent is taken into
1749 account; all parents further up the chain are inherited as well.
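
As a minimal sketch of how parents are inherited, assume three hypothetical hosts `core-router`, `access-switch` and `my-server1` that already exist as `Host` objects. In a standalone object definition, `child_host_name` names the dependent host. The two dependencies form a chain, so following the rule above `my-server1` would be treated as unreachable when `core-router` fails, even though no dependency links the two directly.

```
/* hypothetical chain: core-router -> access-switch -> my-server1 */
object Dependency "switch-needs-router" {
  parent_host_name = "core-router"
  child_host_name = "access-switch"
}

object Dependency "server-needs-switch" {
  parent_host_name = "access-switch"
  child_host_name = "my-server1"
}
```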
1751 The `parent_host_name` and `parent_service_name` attributes are mandatory for
1752 service dependencies; only `parent_host_name` is required for host dependencies.
1753 [Apply rules](3-monitoring-basics.md#using-apply) will allow you to
1754 [determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
1755 dynamic fashion if required.
1757 parent_host_name = "core-router"
1758 parent_service_name = "uplink-port"
1760 Notifications are suppressed by default if a host or service becomes unreachable.
1761 You can control that option by defining the `disable_notifications` attribute.
1763 disable_notifications = false
1765 If the dependency should be triggered in the parent object's soft state, you
1766 need to set `ignore_soft_states` to `false`.
1768 The dependency state filter must be defined based on the parent object being
1769 either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
1771 The following example will make the dependency fail and trigger it if the parent
1772 object is **not** in one of these states:
1774 states = [ OK, Critical, Unknown ]
1776 Rephrased: If the parent service object changes into the `Warning` state, this
1777 dependency will fail and render all child objects (hosts or services) unreachable.
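
Put together, the attributes described so far could be combined into a single object. This is only a sketch: the child host `my-server1` and its `http` service are assumptions chosen for illustration, and the dependency name is made up.

```
/* sketch only: child host and service names are assumptions */
object Dependency "http-needs-uplink" {
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  child_host_name = "my-server1"
  child_service_name = "http"

  // the dependency fails when the parent service is in the Warning state
  states = [ OK, Critical, Unknown ]

  // also evaluate the parent's soft states
  ignore_soft_states = false

  // keep sending notifications for the child even if it is unreachable
  disable_notifications = false
}
```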
1779 You can determine the child's reachability by querying the `is_reachable` attribute,
1780 for example in [DB IDO](22-appendix.md#schema-db-ido-extensions).
1782 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
1784 Icinga 2 automatically adds an implicit dependency for services on their host. That way
1785 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
1786 does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
1787 `states = [ Up ]` for all service objects.
1789 Service checks are still executed. If you want to prevent them from happening, you can
1790 apply the following dependency to all services setting their host as `parent_host_name`
1791 and disabling the checks. `assign where true` matches on all `Service` objects.
1793 apply Dependency "disable-host-service-checks" to Service {
1794 disable_checks = true
1798 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
1800 A common scenario is the Icinga 2 server behind a router. Checking internet
1801 access by pinging the Google DNS server `google-dns` is a common method, but
1802 will fail in case the `dsl-router` host is down. Therefore the example below
1803 defines a host dependency which also acts implicitly as a parent relation.
1805 Furthermore the host may be reachable while ping probes are dropped by the
1806 router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
1807 further checks for the `ping4` service on host `google-dns` should
1808 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1810 object Host "dsl-router" {
1811 import "generic-host"
1812 address = "192.168.1.1"
1815 object Host "google-dns" {
1816 import "generic-host"
1820 apply Service "ping4" {
1821 import "generic-service"
1823 check_command = "ping4"
1825 assign where host.address
1828 apply Dependency "internet" to Host {
1829 parent_host_name = "dsl-router"
1830 disable_checks = true
1831 disable_notifications = true
1833 assign where host.name != "dsl-router"
1836 apply Dependency "internet" to Service {
1837 parent_host_name = "dsl-router"
1838 parent_service_name = "ping4"
1839 disable_checks = true
1841 assign where host.name != "dsl-router"
1844 ### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
1846 You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
1847 child attributes, e.g. `parent_host_name`, to other objects' attributes.
1850 A common example is virtual machines hosted on a master. The object
1851 name of that master is auto-generated from your CMDB or VMware inventory
1852 into the host's custom attributes (or a generic template for your
1855 Define your master host object:
1858 object Host "master.example.com" {
1859 import "generic-host"
1862 Add a generic template defining all common host attributes:
1864 /* generic template for your virtual machines */
1865 template Host "generic-vm" {
1866 import "generic-host"
1869 Add a template for all hosts on your example.com cloud setting
1870 custom attribute `vm_parent` to `master.example.com`:
1872 template Host "generic-vm-example.com" {
1874 vars.vm_parent = "master.example.com"
1877 Define your guest hosts:
1879 object Host "www.example1.com" {
1880 import "generic-vm-example.com"
1883 object Host "www.example2.com" {
1884 import "generic-vm-example.com"
1887 Apply the host dependency to all child hosts importing the
1888 `generic-vm` template and set the `parent_host_name`
1889 to the previously defined custom attribute `host.vars.vm_parent`.
1891 apply Dependency "vm-host-to-parent-master" to Host {
1892 parent_host_name = host.vars.vm_parent
1893 assign where "generic-vm" in host.templates
1896 You can extend this example, and make your services depend on the
1897 `master.example.com` host too. Their local scope allows you to use
1898 `host.vars.vm_parent` similar to the example above.
1900 apply Dependency "vm-service-to-parent-master" to Service {
1901 parent_host_name = host.vars.vm_parent
1902 assign where "generic-vm" in host.templates
1905 That way you don't need to wait for your guest hosts to become
1906 unreachable when the master host goes down. Instead the services
1907 will detect their reachability immediately when executing checks.
1911 > This method with setting locally scoped variables only works in
1912 > apply rules, but not in object definitions.
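
To illustrate the note: a standalone object definition has no `host` variable in scope, so the parent has to be named literally. A minimal sketch, reusing the host names from the example above (the dependency name is made up):

```
/* without an apply rule, each parent/child pair is spelled out by hand */
object Dependency "vm-host-to-parent-master-example1" {
  parent_host_name = "master.example.com"
  child_host_name = "www.example1.com"
}
```

One such object would be needed per guest host, which is why the apply rule is the more convenient choice once `vm_parent` is available as a custom attribute.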
1915 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1917 Another classic example is agent-based checks. You would define a health check
1918 for the agent daemon responding to your requests, and make all other services
1919 querying that daemon depend on that health check.
1921 The following configuration defines two NRPE-based service checks `nrpe-load`
1922 and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
1923 `nrpe-health` service.
1925 apply Service "nrpe-health" {
1926 import "generic-service"
1927 check_command = "nrpe"
1928 assign where match("nrpe-*", host.name)
1931 apply Service "nrpe-load" {
1932 import "generic-service"
1933 check_command = "nrpe"
1934 vars.nrpe_command = "check_load"
1935 assign where match("nrpe-*", host.name)
1938 apply Service "nrpe-disk" {
1939 import "generic-service"
1940 check_command = "nrpe"
1941 vars.nrpe_command = "check_disk"
1942 assign where match("nrpe-*", host.name)
1945 object Host "nrpe-server" {
1946 import "generic-host"
1947 address = "192.168.1.5"
1950 apply Dependency "disable-nrpe-checks" to Service {
1951 parent_service_name = "nrpe-health"
1954 disable_checks = true
1955 disable_notifications = true
1956 assign where service.check_command == "nrpe"
1957 ignore where service.name == "nrpe-health"
1960 The `disable-nrpe-checks` dependency is applied to all services
1961 on the `nrpe-server` host that use the `nrpe` check command,
1962 but not to the `nrpe-health` service itself.