1 # <a id="monitoring-basics"></a> Monitoring Basics
3 This part of the Icinga 2 documentation provides an overview of all the basic
4 monitoring concepts you need to know to run Icinga 2.
6 ## <a id="hosts-services"></a> Hosts and Services
8 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
9 and services can be virtually anything which can be checked in some way:
11 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
15 * Other local or network-accessible services
17 Host objects provide a mechanism to group services that are running
18 on the same physical device.
20 Here is an example of a host object which defines two child services:
22 object Host "my-server1" {
24 check_command = "hostalive"
27 object Service "ping4" {
28 host_name = "my-server1"
29 check_command = "ping4"
32 object Service "http" {
33 host_name = "my-server1"
34 check_command = "http"
37 The example creates two services `ping4` and `http` which belong to the
40 It also specifies that the host should perform its own check using the `hostalive`
43 The `address` attribute is used by check commands to determine which network
44 address is associated with the host object.
46 Details on troubleshooting check problems can be found [here](#troubleshooting).
48 ### <a id="host-states"></a> Host States
50 Hosts can be in any of the following states:
53 ------------|--------------
54 UP | The host is available.
55 DOWN | The host is unavailable.
57 ### <a id="service-states"></a> Service States
59 Services can be in any of the following states:
62 ------------|--------------
63 OK | The service is working properly.
64 WARNING | The service is experiencing some problems but is still considered to be in working condition.
65 CRITICAL | The service is in a critical state.
66 UNKNOWN | The check could not determine the service's state.
68 ### <a id="hard-soft-states"></a> Hard and Soft States
70 When detecting a problem with a host/service Icinga re-checks the object a number of
71 times (based on the `max_check_attempts` and `retry_interval` settings) before sending
72 notifications. This ensures that no unnecessary notifications are sent for
73 transient failures. During this time the object is in a `SOFT` state.
75 After all re-checks have been executed and the object is still in a non-OK
76 state the host/service switches to a `HARD` state and notifications are sent.
79 ------------|--------------
80 HARD | The host/service's state hasn't recently changed.
81 SOFT | The host/service has recently changed state and is being re-checked.
83 ### <a id="host-service-checks"></a> Host and Service Checks
85 Hosts and Services determine their state from a check result returned from a check
86 execution to the Icinga 2 application. By default the `generic-host` example template
87 will define `hostalive` as host check. If your host is unreachable for ping, you should
88 consider using a different check command, for instance the `http` check command, or if
89 there is no check available, the `dummy` check command.
91 object Host "uncheckable-host" {
92 check_command = "dummy"
94 vars.dummy_text = "Pretending to be OK."
97 Service checks could also use a `dummy` check, but the common strategy is to
98 [integrate an existing plugin](#command-plugin-integration) as
99 [check command](#check-commands) and [reference](#command-passing-parameters)
100 that in your [Service](#objecttype-service) object definition.
102 ## <a id="configuration-best-practice"></a> Configuration Best Practice
104 The [Getting Started](#getting-started) chapter already introduced various aspects
105 of the Icinga 2 configuration language. If you are ready to configure additional
106 hosts, services, notifications, dependencies, etc, you should think about the
107 requirements first and then decide for a possible strategy.
109 There are many ways of creating Icinga 2 configuration objects:
111 * Manually with your preferred editor, for example vi(m), nano, notepad, etc.
112 * Generated by a configuration management tool such as Puppet, Chef, Ansible, etc.
113 * A configuration addon for Icinga 2
114 * A custom exporter script from your CMDB or inventory tool
117 In order to find the best strategy for your own configuration, ask yourself the following questions:
119 * Do your hosts share a common group of services (for example linux hosts with disk, load, etc checks)?
120 * Only a small set of users receives notifications and escalations for all hosts/services?
122 If you can at least answer one of these questions with yes, look for the [apply rules](#using-apply) logic
123 instead of defining objects on a per host and service basis.
125 * You are required to define specific configuration for each host/service?
126 * Does your configuration generation tool already know about the host-service-relationship?
128 Then you should look for the object specific configuration setting `host_name` etc accordingly.
130 Finding the best files and directory tree for your configuration is up to you. Make sure that
131 the [icinga2.conf](#icinga2-conf) configuration file includes them, and then think about:
133 * tree-based on locations, hostgroups, specific host attributes with sub levels of directories.
134 * flat `hosts.conf`, `services.conf`, etc files for rule based configuration.
135 * generated configuration with one file per host and a global configuration for groups, users, etc.
136 * one big file generated from an external application (probably a bad idea for maintaining changes).
139 In either way of choosing the right strategy you should additionally check the following:
141 * Are there any specific attributes describing the host/service you could set as `vars` custom attributes?
142 You can later use them for applying assign/ignore rules, or export them into external interfaces.
143 * Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules.
144 * Use templates to store generic attributes for your objects and apply rules making your configuration more readable.
145 Details can be found in the [using templates](#using-templates) chapter.
146 * Apply rules may overlap. Keep a central place (for example, `services.conf` or `notifications.conf`) storing
147 the configuration instead of defining apply rules deep in your configuration tree.
148 * Every plugin used as check, notification or event command requires a `Command` definition.
149 Further details can be looked up in the [check commands](#check-commands) chapter.
151 If you happen to have further questions, do not hesitate to join the [community support channels](https://support.icinga.org)
152 and ask community members for their experience and best practices.
155 ### <a id="object-inheritance-using-templates"></a> Object Inheritance Using Templates
157 Templates may be used to apply a set of identical attributes to more than one
160 template Service "generic-service" {
161 max_check_attempts = 3
164 enable_perfdata = true
167 object Service "ping4" {
168 import "generic-service"
170 host_name = "localhost"
171 check_command = "ping4"
174 object Service "ping6" {
175 import "generic-service"
177 host_name = "localhost"
178 check_command = "ping6"
181 In this example the `ping4` and `ping6` services inherit properties from the
182 template `generic-service`.
184 Objects as well as templates themselves can import an arbitrary number of
185 templates. Attributes inherited from a template can be overridden in the
188 ### <a id="using-apply"></a> Apply objects based on rules
190 Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`)
191 based on attribute identifiers for example `host_name` objects can be [applied](#apply).
193 Detailed scenario examples are used in their respective chapters, for example
194 [apply services with custom command arguments](#using-apply-services-command-arguments).
196 #### <a id="using-apply-services"></a> Apply Services to Hosts
198 apply Service "load" {
199 import "generic-service"
201 check_command = "load"
203 assign where "linux-server" in host.groups
204 ignore where host.vars.no_load_check
207 In this example the `load` service will be created as object for all hosts in the `linux-server`
208 host group. If the `no_load_check` custom attribute is set, the host will be
211 #### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
213 Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
216 apply Notification "mail-noc" to Service {
217 import "mail-service-notification"
218 command = "mail-service-notification"
219 user_groups = [ "noc" ]
221 assign where service.vars.sla == "24x7"
224 In this example the `mail-noc` notification will be created as object for all services having the
225 `sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification`
226 and all members of the user group `noc` will get notified.
228 #### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
230 Detailed examples can be found in the [dependencies](#dependencies) chapter.
232 ### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
234 Detailed examples can be found in the [recurring downtimes](#recurring-downtimes) chapter.
237 ### <a id="groups"></a> Groups
239 Groups are used for combining hosts, services, and users into
240 accessible configuration attributes and views in external (web)
243 Group membership is defined at the respective object itself. If
244 you have a hostgroup name `windows` for example, and want to assign
245 specific hosts to this group for later viewing the group on your
246 alert dashboard, first create the hostgroup:
248 object HostGroup "windows" {
249 display_name = "Windows Servers"
252 Then add your hosts to this hostgroup
254 template Host "windows-server" {
255 groups += [ "windows" ]
258 object Host "mssql-srv1" {
259 import "windows-server"
261 vars.mssql_port = 1433
264 object Host "mssql-srv2" {
265 import "windows-server"
267 vars.mssql_port = 1433
270 This can be done for service and user groups the same way. Additionally
271 the user groups are associated as attributes in `Notification` objects.
273 object UserGroup "windows-mssql-admins" {
274 display_name = "Windows MSSQL Admins"
277 template User "generic-windows-mssql-users" {
278 groups += [ "windows-mssql-admins" ]
281 object User "win-mssql-noc" {
282 import "generic-windows-mssql-users"
284 email = "noc@example.com"
287 object User "win-mssql-ops" {
288 import "generic-windows-mssql-users"
290 email = "ops@example.com"
293 #### <a id="group-assign"></a> Group Membership Assign
295 If there is a certain number of hosts, services, or users matching a pattern
296 it's reasonable to assign the group object to these members.
297 Details on the `assign where` syntax can be found [here](#apply)
299 object HostGroup "mssql" {
300 display_name = "MSSQL Servers"
301 assign where host.vars.mssql_port
304 In this inherited example from above all hosts with the `vars` attribute `mssql_port`
305 set will be added as members to the host group `mssql`.
307 ## <a id="notifications"></a> Notifications
309 Notifications for service and host problems are an integral part of your
312 When a host or service is in a downtime, a problem has been acknowledged or
313 the dependency logic determined that the host/service is unreachable, no
314 notifications are sent. You can configure additional type and state filters
315 refining the notifications being actually sent.
317 There are many ways of sending notifications, e.g. by e-mail, XMPP,
318 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
319 Instead it relies on external mechanisms such as shell scripts to notify users.
321 A notification specification requires one or more users (and/or user groups)
322 who will be notified in case of problems. These users must have all custom
323 attributes defined which will be used in the `NotificationCommand` on execution.
325 The user `icingaadmin` in the example below will get notified only on `WARNING` and
326 `CRITICAL` states and `problem` and `recovery` notification types.
328 object User "icingaadmin" {
329 display_name = "Icinga 2 Admin"
330 enable_notifications = true
331 states = [ OK, Warning, Critical ]
332 types = [ Problem, Recovery ]
333 email = "icinga@localhost"
336 If you don't set the `states` and `types` configuration attributes for the `User`
337 object, notifications for all states and types will be sent.
339 Details on troubleshooting notification problems can be found [here](#troubleshooting).
343 > Make sure that the [notification](#features) feature is enabled on your master instance
344 > in order to execute notification commands.
346 You should choose which information you (and your notified users) are interested in
347 case of emergency, and also which information does not provide any value to you and
350 An example notification command is explained [here](#notification-commands).
352 You can add all shared attributes to a `Notification` template which is inherited
353 to the defined notifications. That way you'll save duplicated attributes in each
354 `Notification` object. Attributes can be overridden locally.
356 template Notification "generic-notification" {
359 command = "mail-service-notification"
361 states = [ Warning, Critical, Unknown ]
362 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
363 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
368 The time period `24x7` is shipped as example configuration with Icinga 2.
370 Use the `apply` keyword to create `Notification` objects for your services:
372 apply Notification "mail" to Service {
373 import "generic-notification"
375 command = "mail-notification"
376 users = [ "icingaadmin" ]
378 assign where service.name == "mysql"
381 Instead of assigning users to notifications, you can also add the `user_groups`
382 attribute with a list of user groups to the `Notification` object. Icinga 2 will
383 send notifications to all group members.
385 ### <a id="notification-escalations"></a> Notification Escalations
387 When a problem notification is sent and a problem still exists at the time of re-notification
388 you may want to escalate the problem to the next support level. A different approach
389 is to configure the default notification by email, and escalate the problem via SMS
390 if not already solved.
392 You can define notification start and end times as additional configuration
393 attributes making the `Notification` object a so-called `notification escalation`.
394 Using templates you can share the basic notification attributes such as users or the
395 `interval` (and override them for the escalation then).
397 Using the example from above, you can define additional users being escalated for SMS
398 notifications between start and end time.
400 object User "icinga-oncall-2nd-level" {
401 display_name = "Icinga 2nd Level"
403 vars.mobile = "+1 555 424642"
406 object User "icinga-oncall-1st-level" {
407 display_name = "Icinga 1st Level"
409 vars.mobile = "+1 555 424642"
412 Define an additional `NotificationCommand` for SMS notifications.
416 > The example is not complete as there are many different SMS providers.
417 > Please note that sending SMS notifications will require an SMS provider
418 > or local hardware with a SIM card active.
420 object NotificationCommand "sms-notification" {
422 PluginDir + "/send_sms_notification",
427 The two new notification escalations are added onto the host `localhost`
428 and its service `ping4` using the `generic-notification` template.
429 The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
430 command) after `30m` until `1h`.
434 > The `interval` was set to 15m in the `generic-notification`
435 > template example. Lower that value in your escalations by using a secondary
436 > template or by overriding the attribute directly in the `notifications` array
437 > position for `escalation-sms-2nd-level`.
439 If the problem does not get resolved nor acknowledged preventing further notifications
440 the `escalation-sms-1st-level` user will be escalated `1h` after the initial problem was
441 notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
443 apply Notification "mail" to Service {
444 import "generic-notification"
446 command = "mail-notification"
447 users = [ "icingaadmin" ]
449 assign where service.name == "ping4"
452 apply Notification "escalation-sms-2nd-level" to Service {
453 import "generic-notification"
455 command = "sms-notification"
456 users = [ "icinga-oncall-2nd-level" ]
463 assign where service.name == "ping4"
466 apply Notification "escalation-sms-1st-level" to Service {
467 import "generic-notification"
469 command = "sms-notification"
470 users = [ "icinga-oncall-1st-level" ]
477 assign where service.name == "ping4"
480 ### <a id="notification-delay"></a> Notification Delay
482 Sometimes the problem in question should not be notified when the notification is due
483 (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
484 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
485 postpone the first notification for 15 minutes. Leave out the `end` key - if not set,
486 Icinga 2 will not check against any end time for this notification.
488 apply Notification "mail" to Service {
489 import "generic-notification"
491 command = "mail-notification"
492 users = [ "icingaadmin" ]
494 times.begin = 15m // delay first notification
496 assign where service.name == "ping4"
499 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
501 If there are no notification state and type filter attributes defined at the `Notification`
502 or `User` object Icinga 2 assumes that all states and types are being notified.
504 Available state and type filters for notifications are:
506 template Notification "generic-notification" {
508 states = [ Warning, Critical, Unknown ]
509 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
510 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
513 If you are familiar with Icinga 1.x `notification_options` please note that they have been split
514 into type and state to allow more fine granular filtering for example on downtimes and flapping.
515 You can filter for acknowledgements and custom notifications too.
518 ## <a id="timeperiods"></a> Time Periods
520 Time Periods define time ranges in Icinga where event actions are
521 triggered, for example whether a service check is executed or not within
522 the `check_period` attribute. Or a notification should be sent to
523 users or not, filtered by the `period` and `notification_period`
524 configuration attributes for `Notification` and `User` objects.
528 > If you are familar with Icinga 1.x - these time period definitions
529 > are called `legacy timeperiods` in Icinga 2.
531 > An Icinga 2 legacy timeperiod requires the `ITL` provided template
532 >`legacy-timeperiod`.
534 The `TimePeriod` attribute `ranges` may contain multiple directives,
535 including weekdays, days of the month, and calendar dates.
536 These types may overlap/override other types in your ranges dictionary.
538 The descending order of precedence is as follows:
540 * Calendar date (2008-01-01)
541 * Specific month date (January 1st)
542 * Generic month date (Day 15)
543 * Offset weekday of specific month (2nd Tuesday in December)
544 * Offset weekday (3rd Monday)
545 * Normal weekday (Tuesday)
547 If you don't set any `check_period` or `notification_period` attribute
548 on your configuration objects Icinga 2 assumes `24x7` as time period
551 object TimePeriod "24x7" {
552 import "legacy-timeperiod"
554 display_name = "Icinga 2 24x7 TimePeriod"
556 "monday" = "00:00-24:00"
557 "tuesday" = "00:00-24:00"
558 "wednesday" = "00:00-24:00"
559 "thursday" = "00:00-24:00"
560 "friday" = "00:00-24:00"
561 "saturday" = "00:00-24:00"
562 "sunday" = "00:00-24:00"
566 If your operation staff should only be notified during workhours
567 create a new timeperiod named `workhours` defining a work day from
570 object TimePeriod "workhours" {
571 import "legacy-timeperiod"
573 display_name = "Icinga 2 8x5 TimePeriod"
575 "monday" = "09:00-17:00"
576 "tuesday" = "09:00-17:00"
577 "wednesday" = "09:00-17:00"
578 "thursday" = "09:00-17:00"
579 "friday" = "09:00-17:00"
583 Use the `period` attribute to assign time periods to
584 `Notification` and `Dependency` objects:
586 object Notification "mail" {
587 import "generic-notification"
589 host_name = "localhost"
591 command = "mail-notification"
592 users = [ "icingaadmin" ]
597 ## <a id="commands"></a> Commands
599 Icinga 2 uses three different command object types to specify how
600 checks should be performed, notifications should be sent, and
601 events should be handled.
603 ### <a id="command-environment-variables"></a> Environment Variables for Commands
605 Please check [Runtime Custom Attributes as Environment Variables](#runtime-custom-attribute-env-vars).
608 ### <a id="check-commands"></a> Check Commands
610 `CheckCommand` objects define the command line how a check is called.
614 > Make sure that the [checker](#features) feature is enabled in order to
617 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
619 `CheckCommand` objects require the [ITL template](#itl-plugin-check-command)
620 `plugin-check-command` to support native plugin based check methods.
622 Unless you have done so already, download your check plugin and put it
623 into the `PluginDir` directory. The following example uses the
624 `check_disk` plugin shipped with the Monitoring Plugins package.
626 The plugin path and all command arguments are made a list of
627 double-quoted string arguments for proper shell escaping.
629 Call the `check_disk` plugin with the `--help` parameter to see
630 all available options. Our example defines warning (`-w`) and
631 critical (`-c`) thresholds for the disk usage. Without any
632 partition defined (`-p`) it will check all local partitions.
634 icinga@icinga2 $ /usr/lib/nagios/plugins/check_disk --help
636 This plugin checks the amount of used disk space on a mounted file system
637 and generates an alert if free space is less than one of the threshold values
641 check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
642 [-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
643 [-t timeout] [-u unit] [-v] [-X type] [-N type]
648 > Don't execute plugins as `root` and always use the absolute path to the plugin! Trust us.
650 Next step is to understand how command parameters are being passed from
651 a host or service object, and add a `CheckCommand` definition based on these
652 required parameters and/or default values.
654 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
656 Unlike Icinga 1.x check command parameters are defined as custom attributes
657 which can be accessed as runtime macros by the executed check command.
659 Define the default check command custom attribute `disk_wfree` and `disk_cfree`
660 (freely definable naming schema) and their default threshold values. You can
661 then use these custom attributes as runtime macros for [command arguments](#command-arguments)
664 The default custom attributes can be overridden by the custom attributes
665 defined in the service using the check command `my-disk`. The custom attributes
666 can also be inherited from a parent template using additive inheritance (`+=`).
669 object CheckCommand "my-disk" {
670 import "plugin-check-command"
672 command = [ PluginDir + "/check_disk" ]
675 "-w" = "$disk_wfree$%"
676 "-c" = "$disk_cfree$%"
684 The host `localhost` with the service `my-disk` checks all disks with modified
685 custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
688 object Host "localhost" {
689 import "generic-host"
691 address = "127.0.0.1"
695 object Service "my-disk" {
696 import "generic-service"
698 host_name = "localhost"
699 check_command = "my-disk"
705 #### <a id="command-arguments"></a> Command Arguments
707 By defining a check command line using the `command` attribute Icinga 2
708 will resolve all macros in the static string or array. Sometimes it is
709 required to extend the arguments list based on a met condition evaluated
710 at command execution. Or making arguments optional - only set if the
711 macro value can be resolved by Icinga 2.
713 object CheckCommand "check_http" {
714 import "plugin-check-command"
716 command = [ PluginDir + "/check_http" ]
719 "-H" = "$http_vhost$"
720 "-I" = "$http_address$"
724 set_if = "$http_ssl$"
727 set_if = "$http_sni$"
730 value = "$http_auth_pair$"
731 description = "Username:password on sites with basic authentication"
734 set_if = "$http_ignore_body$"
736 "-r" = "$http_expect_body_regex$"
737 "-w" = "$http_warn_time$"
738 "-c" = "$http_critical_time$"
739 "-e" = "$http_expect$"
742 vars.http_address = "$address$"
743 vars.http_ssl = false
744 vars.http_sni = false
747 The example shows the `check_http` check command defining the most common
748 arguments. Each of them is optional by default and will be omitted if
749 the value is not set. For example if the service calling the check command
750 does not have `vars.http_port` set, it won't get added to the command
752 If the `vars.http_ssl` custom attribute is set in the service, host or command
753 object definition, Icinga 2 will add the `-S` argument based on the `set_if`
754 option to the command line.
755 That way you can use the `check_http` command definition for both, with and
756 without SSL enabled checks saving you duplicated command definitions.
758 Details on all available options can be found in the
759 [CheckCommand object definition](#objecttype-checkcommand).
761 ### <a id="using-apply-services-command-arguments"></a> Apply Services with custom Command Arguments
763 Imagine the following scenario: The `my-host1` host is reachable using the default port 22, while
764 the `my-host2` host requires a different port on 2222. Both hosts are in the hostgroup `my-linux-servers`.
766 object HostGroup "my-linux-servers" {
767 display_name = "Linux Servers"
768 assign where host.vars.os == "Linux"
771 /* this one has port 22 opened */
772 object Host "my-host1" {
773 import "generic-host"
774 address = "129.168.1.50"
778 /* this one listens on a different ssh port */
779 object Host "my-host2" {
780 import "generic-host"
781 address = "129.168.2.50"
783 vars.custom_ssh_port = 2222
786 All hosts in the `my-linux-servers` hostgroup should get the `my-ssh` service applied based on an
787 [apply rule](#apply). The optional `ssh_port` command argument should be inherited from the host
788 the service is applied to. If not set, the check command `my-ssh` will omit the argument.
790 object CheckCommand "my-ssh" {
791 import "plugin-check-command"
793 command = [ PluginDir + "/check_ssh" ]
798 value = "$ssh_address$"
804 vars.ssh_address = "$address$"
807 /* apply ssh service */
808 apply Service "my-ssh" {
809 import "generic-service"
810 check_command = "my-ssh"
812 //set the command argument for ssh port with a custom host attribute, if set
813 vars.ssh_port = "$host.vars.custom_ssh_port$"
815 assign where "my-linux-servers" in host.groups
818 The `my-host1` will get the `my-ssh` service checking on the default port:
820 [2014-05-26 21:52:23 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '129.168.1.50': PID 27281
822 The `my-host2` will inherit the `custom_ssh_port` variable to the service and execute a different command:
824 [2014-05-26 21:51:32 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '-p', '2222', '129.168.2.50': PID 26956
827 ### <a id="notification-commands"></a> Notification Commands
829 `NotificationCommand` objects define how notifications are delivered to external
830 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
832 `NotificationCommand` objects require the [ITL template](#itl-plugin-notification-command)
833 `plugin-notification-command` to support native plugin-based notifications.
837 > Make sure that the [notification](#features) feature is enabled on your master instance
838 > in order to execute notification commands.
840 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
841 the current check output) sending an email to the user(s) associated with the
842 notification itself (`$user.email$`).
844 If you want to specify default values for some of the custom attribute definitions,
845 you can add a `vars` dictionary as shown for the `CheckCommand` object.
847 object NotificationCommand "mail-service-notification" {
848 import "plugin-notification-command"
850 command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
853 NOTIFICATIONTYPE = "$notification.type$"
854 SERVICEDESC = "$service.name$"
855 HOSTALIAS = "$host.display_name$"
856 HOSTADDRESS = "$address$"
857 SERVICESTATE = "$service.state$"
858 LONGDATETIME = "$icinga.long_date_time$"
859 SERVICEOUTPUT = "$service.output$"
860 NOTIFICATIONAUTHORNAME = "$notification.author$"
861 NOTIFICATIONCOMMENT = "$notification.comment$"
862 HOSTDISPLAYNAME = "$host.display_name$"
863 SERVICEDISPLAYNAME = "$service.display_name$"
864 USEREMAIL = "$user.email$"
868 The command attribute in the `mail-service-notification` command refers to the following
869 shell script. The macros specified in the `env` array are exported
870 as environment variables and can be used in the notification script:
873 template=$(cat <<TEMPLATE
876 Notification Type: $NOTIFICATIONTYPE
878 Service: $SERVICEDESC
880 Address: $HOSTADDRESS
883 Date/Time: $LONGDATETIME
885 Additional Info: $SERVICEOUTPUT
887 Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
891 /usr/bin/printf "%b" $template | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" $USEREMAIL
895 > This example is for `exim` only. Requires changes for `sendmail` and
898 While it's possible to specify the entire notification command right
899 in the NotificationCommand object it is generally advisable to create a
900 shell script in the `/etc/icinga2/scripts` directory and have the
901 NotificationCommand object refer to that.
903 ### <a id="event-commands"></a> Event Commands
905 Unlike notifications event commands are called on every host/service execution
906 if one of these conditions match:
908 * The host/service is in a [soft state](#hard-soft-states)
909 * The host/service state changes into a [hard state](#hard-soft-states)
910 * The host/service state recovers from a [soft or hard state](#hard-soft-states) to [OK](#service-states)/[Up](#host-states)
912 Therefore the `EventCommand` object should define a command line
913 evaluating the current service state and other service runtime attributes
914 available through runtime vars. Runtime macros such as `$service.state_type$`
915 and `$service.state$` will be processed by Icinga 2 helping on fine-granular
916 events being triggered.
918 Common use case scenarios are a failing HTTP check requiring an immediate
919 restart via event command, or if an application is locked and requires
920 a restart upon detection.
922 `EventCommand` objects require the ITL template `plugin-event-command`
923 to support native plugin based checks.
925 When the event command is triggered on a service state change, it will
926 send a check result using the `process_check_result` script forcibly
927 changing the service state back to `OK` (`-r 0`) providing some debug
928 information in the check output (`-o`).
930 object EventCommand "plugin-event-process-check-result" {
931 import "plugin-event-command"
934 PluginDir + "/process_check_result",
936 "-S", "$service.name$",
937 "-c", RunDir + "/icinga2/cmd/icinga2.cmd",
939 "-o", "Event Handler triggered in state '$service.state$' with output '$service.output$'."
944 ## <a id="dependencies"></a> Dependencies
946 Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects
947 for determing their network reachability.
948 The `parent_host_name` and `parent_service_name` attributes are mandatory for
949 service dependencies, `parent_host_name` is required for host dependencies.
951 A service can depend on a host, and vice versa. A service has an implicit
952 dependency (parent) to its host. A host to host dependency acts implicitly
953 as host parent relation.
954 When dependencies are calculated, not only the immediate parent is taken into
955 account but all parents are inherited.
957 Notifications are suppressed if a host or service becomes unreachable.
959 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
961 Icinga 2 automatically adds an implicit dependency for services on their host. That way
962 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
963 does not overwrite other dependencies and implicitely sets `disable_notifications = true` and
964 `states = [ Up ]` for all service objects.
966 Service checks are still executed. If you want to prevent them from happening, you can
967 apply the following dependency to all services setting their host as `parent_host_name`
968 and disabling the checks. `assign where true` matches on all `Service` objects.
970 apply Dependency "disable-host-service-checks" to Service {
971 disable_checks = true
975 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
977 A common scenario is the Icinga 2 server behind a router. Checking internet
978 access by pinging the Google DNS server `google-dns` is a common method, but
979 will fail in case the `dsl-router` host is down. Therefore the example below
980 defines a host dependency which acts implicitly as parent relation too.
982 Furthermore the host may be reachable but ping probes are dropped by the
983 router's firewall. In case the `dsl-router``ping4` service check fails, all
984 further checks for the `ping4` service on host `google-dns` service should
985 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
987 object Host "dsl-router" {
988 address = "192.168.1.1"
991 object Host "google-dns" {
995 apply Service "ping4" {
996 import "generic-service"
998 check_command = "ping4"
1000 assign where host.address
1003 apply Dependency "internet" to Host {
1004 parent_host_name = "dsl-router"
1005 disable_checks = true
1006 disable_notifications = true
1008 assign where host.name != "dsl-router"
1011 apply Dependency "internet" to Service {
1012 parent_host_name = "dsl-router"
1013 parent_service_name = "ping4"
1014 disable_checks = true
1016 assign where host.name != "dsl-router"
1020 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1022 Another classic example are agent based checks. You would define a health check
1023 for the agent daemon responding to your requests, and make all other services
1024 querying that daemon depend on that health check.
1026 The following configuration defines two nrpe based service checks `nrpe-load`
1027 and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
1028 `nrpe-health` service.
1030 apply Service "nrpe-health" {
1031 import "generic-service"
1032 check_command = "nrpe"
1033 assign where match("nrpe-*", host.name)
1036 apply Service "nrpe-load" {
1037 import "generic-service"
1038 check_command = "nrpe"
1039 vars.nrpe_command = "check_load"
1040 assign where match("nrpe-*", host.name)
1043 apply Service "nrpe-disk" {
1044 import "generic-service"
1045 check_command = "nrpe"
1046 vars.nrpe_command = "check_disk"
1047 assign where match("nrpe-*", host.name)
1050 object Host "nrpe-server" {
1051 import "generic-host"
1052 address = "192.168.1.5"
1055 apply Dependency "disable-nrpe-checks" to Service {
1056 parent_service_name = "nrpe-health"
1059 disable_checks = true
1060 disable_notifications = true
1061 assign where service.check_command == "nrpe"
1062 ignore where service.name == "nrpe-health"
1065 The `disable-nrpe-checks` dependency is applied to all services
1066 on the `nrpe-service` host using the `nrpe` check_command attribute
1067 but not the `nrpe-health` service itself.
1070 ## <a id="downtimes"></a> Downtimes
1072 Downtimes can be scheduled for planned server maintenance or
1073 any other targetted service outage you are aware of in advance.
1075 Downtimes will suppress any notifications, and may trigger other
1076 downtimes too. If the downtime was set by accident, or the duration
1077 exceeds the maintenance, you can manually cancel the downtime.
1078 Planned downtimes will also be taken into account for SLA reporting
1079 tools calculating the SLAs based on the state and downtime history.
1081 Multiple downtimes for a single object may overlap. This is useful
1082 when you want to extend your maintenance window taking longer than expected.
1083 If there are multiple downtimes triggered for one object, the overall downtime depth
1084 will be greater than `1`.
1087 If the downtime was scheduled after the problem changed to a critical hard
1088 state triggering a problem notification, and the service recovers during
1089 the downtime window, the recovery notification won't be suppressed.
1091 ### <a id="fixed-flexible-downtimes"></a> Fixed and Flexible Downtimes
1093 A `fixed` downtime will be activated at the defined start time, and
1094 removed at the end time. During this time window the service state
1095 will change to `NOT-OK` and then actually trigger the downtime.
1096 Notifications are suppressed and the downtime depth is incremented.
1098 Common scenarios are a planned distribution upgrade on your linux
1099 servers, or database updates in your warehouse. The customer knows
1100 about a fixed downtime window between 23:00 and 24:00. After 24:00
1101 all problems should be alerted again. Solution is simple -
1102 schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
1104 Unlike a `fixed` downtime, a `flexible` downtime will be triggered
1105 by the state change in the time span defined by start and end time,
1106 and then last for the specified duration in minutes.
1108 Imagine the following scenario: Your service is frequently polled
1109 by users trying to grab free deleted domains for immediate registration.
1110 Between 07:30 and 08:00 the impact will hit for 15 minutes and generate
1111 a network outage visible to the monitoring. The service is still alive,
1112 but answering too slow to Icinga 2 service checks.
1113 For that reason, you may want to schedule a downtime between 07:30 and
1114 08:00 with a duration of 15 minutes. The downtime will then last from
1115 its trigger time until the duration is over. After that, the downtime
1116 is removed (may happen before or after the actual end time!).
1118 ### <a id="scheduling-downtime"></a> Scheduling a downtime
1120 This can either happen through a web interface or by sending an [external command](#external-commands)
1121 to the external command pipe provided by the `ExternalCommandListener` configuration.
1123 Fixed downtimes require a start and end time (a duration will be ignored).
1124 Flexible downtimes need a start and end time for the time span, and a duration
1125 independent from that time span.
1127 ### <a id="triggered-downtimes"></a> Triggered Downtimes
1129 This is optional when scheduling a downtime. If there is already a downtime
1130 scheduled for a future maintenance, the current downtime can be triggered by
1131 that downtime. This renders useful if you have scheduled a host downtime and
1132 are now scheduling a child host's downtime getting triggered by the parent
1133 downtime on NOT-OK state change.
1135 ### <a id="recurring-downtimes"></a> Recurring Downtimes
1137 [ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
1138 recurring downtimes for services.
1142 apply ScheduledDowntime "backup-downtime" to Service {
1143 author = "icingaadmin"
1144 comment = "Scheduled downtime for backup"
1147 monday = "02:00-03:00"
1148 tuesday = "02:00-03:00"
1149 wednesday = "02:00-03:00"
1150 thursday = "02:00-03:00"
1151 friday = "02:00-03:00"
1152 saturday = "02:00-03:00"
1153 sunday = "02:00-03:00"
1156 assign where "backup" in service.groups
1160 ## <a id="comments"></a> Comments
1162 Comments can be added at runtime and are persistent over restarts. You can
1163 add useful information for others on repeating incidents (for example
1164 "last time syslog at 100% cpu on 17.10.2013 due to stale nfs mount") which
1165 is primarly accessible using web interfaces.
1167 Adding and deleting comment actions are possible through the external command pipe
1168 provided with the `ExternalCommandListener` configuration. The caller must
1169 pass the comment id in case of manipulating an existing comment.
1172 ## <a id="acknowledgements"></a> Acknowledgements
1174 If a problem is alerted and notified you may signal the other notification
1175 recipients that you are aware of the problem and will handle it.
1177 By sending an acknowledgement to Icinga 2 (using the external command pipe
1178 provided with `ExternalCommandListener` configuration) all future notifications
1179 are suppressed, a new comment is added with the provided description and
1180 a notification with the type `NotificationFilterAcknowledgement` is sent
1181 to all notified users.
1183 ### <a id="expiring-acknowledgements"></a> Expiring Acknowledgements
1185 Once a problem is acknowledged it may disappear from your `handled problems`
1186 dashboard and no-one ever looks at it again since it will suppress
1189 This `fire-and-forget` action is quite common. If you're sure that a
1190 current problem should be resolved in the future at a defined time,
1191 you can define an expiration time when acknowledging the problem.
1193 Icinga 2 will clear the acknowledgement when expired and start to
1194 re-notify if the problem persists.
1198 ## <a id="custom-attributes"></a> Custom Attributes
1200 ### <a id="runtime-custom-attributes"></a> Using Custom Attributes at Runtime
1202 Custom attributes may be used in command definitions to dynamically change how the command
1205 Additionally there are Icinga 2 features such as the `PerfDataWriter` type
1206 which use custom attributes to format their output.
1210 > Custom attributes are identified by the 'vars' dictionary attribute as short name.
1211 > Accessing the different attribute keys is possible using the '.' accessor.
1213 Custom attributes in command definitions or performance data templates are evaluated at
1214 runtime when executing a command. These custom attributes cannot be used elsewhere
1215 (e.g. in other configuration attributes).
1217 Custom attribute values must be either a string, a number or a boolean value. Arrays
1218 and dictionaries cannot be used.
1220 Here is an example of a command definition which uses user-defined custom attributes:
1222 object CheckCommand "my-ping" {
1223 import "plugin-check-command"
1226 PluginDir + "/check_ping", "-4"
1230 "-H" = "$ping_address$"
1231 "-w" = "$ping_wrta$,$ping_wpl$%"
1232 "-c" = "$ping_crta$,$ping_cpl$%"
1233 "-p" = "$ping_packets$"
1234 "-t" = "$ping_timeout$"
1237 vars.ping_address = "$address$"
1238 vars.ping_wrta = 100
1240 vars.ping_crta = 200
1242 vars.ping_packets = 5
1243 vars.ping_timeout = 0
1246 Custom attribute names used at runtime must be enclosed in two `$` signs, e.g.
1247 `$address$`. When using the `$` sign as single character, you need to escape
1248 it with an additional dollar sign (`$$`). This example also makes use of the
1249 [command arguments](#command-arguments) passed to the command line. `-4` must
1250 be added as additional array key.
1252 ### <a id="runtime-custom-attributes-evaluation-order"></a> Runtime Custom Attributes Evaluation Order
1254 When executing commands Icinga 2 checks the following objects in this order to look
1255 up custom attributes and their respective values:
1257 1. User object (only for notifications)
1261 5. Global custom attributes in the `vars` constant
1263 This execution order allows you to define default values for custom attributes
1264 in your command objects. The `my-ping` command shown above uses this to set
1265 default values for some of the latency thresholds and timeouts.
1267 When using the `my-ping` command you can override some or all of the custom
1268 attributes in the service definition like this:
1270 object Service "ping" {
1271 host_name = "localhost"
1272 check_command = "my-ping"
1274 vars.ping_packets = 10 // Overrides the default value of 5 given in the command
1277 If a custom attribute isn't defined anywhere an empty value is used and a warning is
1278 emitted to the Icinga 2 log.
1282 > By convention every host should have an `address` attribute. Hosts
1283 > which have an IPv6 address should also have an `address6` attribute.
1285 ### <a id="runtime-custom-attribute-env-vars"></a> Runtime Custom Attributes as Environment Variables
1287 The `env` command object attribute specifies a list of environment variables with values calculated
1288 from either runtime macros or custom attributes which should be exported as environment variables
1289 prior to executing the command.
1291 This is useful for example for hiding sensitive information on the command line output
1292 when passing credentials to database checks:
1294 object CheckCommand "mysql-health" {
1295 import "plugin-check-command"
1298 PluginDir + "/check_mysql"
1302 "-H" = "$mysql_address$"
1303 "-d" = "$mysql_database$"
1306 vars.mysql_address = "$address$"
1307 vars.mysql_database = "icinga"
1308 vars.mysql_user = "icinga_check"
1309 vars.mysql_pass = "password"
1311 env.MYSQLUSER = "$mysql_user$"
1312 env.MYSQLPASS = "$mysql_pass$"
1315 ### <a id="multiple-host-addresses-custom-attributes"></a> Multiple Host Addresses using Custom Attributes
1317 The following example defines a `Host` with three different interface addresses defined as
1318 custom attributes in the `vars` dictionary. The `if-eth0` and `if-eth1` services will import
1319 these values into the `address` custom attribute. This attribute is available through the
1320 generic `$address$` runtime macro.
1322 object Host "multi-ip" {
1323 check_command = "dummy"
1324 vars.address_lo = "127.0.0.1"
1325 vars.address_eth0 = "10.0.0.10"
1326 vars.address_eth1 = "192.168.1.10"
1329 apply Service "if-eth0" {
1330 import "generic-service"
1332 vars.address = "$host.vars.address_eth0$"
1333 check_command = "my-generic-interface-check"
1335 assign where host.vars.address_eth0 != ""
1338 apply Service "if-eth1" {
1339 import "generic-service"
1341 vars.address = "$host.vars.address_eth1$"
1342 check_command = "my-generic-interface-check"
1344 assign where host.vars.address_eth1 != ""
1347 object CheckCommand "my-generic-interface-check" {
1348 import "plugin-check-command"
1350 command = "echo \"This would be the service $service.description$ using the address value: $address$\""
1353 The `CheckCommand` object is just an example to help you with testing and
1354 understanding the different custom attributes and runtime macros.
1356 ### <a id="modified-attributes"></a> Modified Attributes
1358 Icinga 2 allows you to modify defined object attributes at runtime different to
1359 the local configuration object attributes. These modified attributes are
1360 stored as bit-shifted-value and made available in backends. Icinga 2 stores
1361 modified attributes in its state file and restores them on restart.
1363 Modified Attributes can be reset using external commands.
1366 ## <a id="runtime-macros"></a> Runtime Macros
1368 Next to custom attributes there are additional runtime macros made available by Icinga 2.
1369 These runtime macros reflect the current object state and may change over time while
1370 custom attributes are configured statically (but can be modified at runtime using
1373 ### <a id="runtime-macro-evaluation-order"></a> Runtime Macro Evaluation Order
1375 Custom attributes can be accessed at [runtime](#runtime-custom-attributes) using their
1376 identifier omitting the `vars.` prefix.
1377 There are special cases when those custom attributes are not set and Icinga 2 provides
1378 a fallback to existing object attributes for example `host.address`.
1380 In the following example the `$address$` macro will be resolved with the value of `vars.address`.
1382 object Host "localhost" {
1383 import "generic-host"
1384 check_command = "my-host-macro-test"
1385 address = "127.0.0.1"
1386 vars.address = "127.2.2.2"
1389 object CheckCommand "my-host-macro-test" {
1390 command = "echo \"address: $address$ host.address: $host.address$ host.vars.address: $host.vars.address$\""
1393 The check command output will look like
1395 "address: 127.2.2.2 host.address: 127.0.0.1 host.vars.address: 127.2.2.2"
1397 If you alter the host object and remove the `vars.address` line, Icinga 2 will fail to look up `$address$` in the
1398 custom attributes dictionary and then look for the host object's attribute.
1400 The check command output will change to
1402 "address: 127.0.0.1 host.address: 127.0.0.1 host.vars.address: "
1405 The same example can be defined for services overriding the `address` field based on a specific host custom attribute.
1407 object Host "localhost" {
1408 import "generic-host"
1409 address = "127.0.0.1"
1410 vars.macro_address = "127.3.3.3"
1413 apply Service "my-macro-test" to Host {
1414 import "generic-service"
1415 check_command = "my-service-macro-test"
1416 vars.address = "$host.vars.macro_address$"
1418 assign where host.address
1421 object CheckCommand "my-service-macro-test" {
1422 command = "echo \"address: $address$ host.address: $host.address$ host.vars.macro_address: $host.vars.macro_address$ service.vars.address: $service.vars.address$\""
1425 When the service check is executed the output looks like
1427 "address: 127.3.3.3 host.address: 127.0.0.1 host.vars.macro_address: 127.3.3.3 service.vars.address: 127.3.3.3"
1429 That way you can easily override existing macros being accessed by their short name like `$address$` and refrain
1430 from defining multiple check commands (one for `$address$` and one for `$host.vars.macro_address$`).
1433 ### <a id="host-runtime-macros"></a> Host Runtime Macros
1435 The following host custom attributes are available in all commands that are executed for
1439 -----------------------------|--------------
1440 host.name | The name of the host object.
1441 host.display_name | The value of the `display_name` attribute.
1442 host.state | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1443 host.state_id | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1444 host.state_type | The host's current state type. Can be one of `SOFT` and `HARD`.
1445 host.check_attempt | The current check attempt number.
1446 host.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
1447 host.last_state | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1448 host.last_state_id | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1449 host.last_state_type | The host's previous state type. Can be one of `SOFT` and `HARD`.
1450 host.last_state_change | The last state change's timestamp.
1451 host.duration_sec | The time since the last state change.
1452 host.latency | The host's check latency.
1453 host.execution_time | The host's check execution time.
1454 host.output | The last check's output.
1455 host.perfdata | The last check's performance data.
1456 host.last_check | The timestamp when the last check was executed.
1457 host.num_services | Number of services associated with the host.
1458 host.num_services_ok | Number of services associated with the host which are in an `OK` state.
1459 host.num_services_warning | Number of services associated with the host which are in a `WARNING` state.
1460 host.num_services_unknown | Number of services associated with the host which are in an `UNKNOWN` state.
1461 host.num_services_critical | Number of services associated with the host which are in a `CRITICAL` state.
1463 ### <a id="service-runtime-macros"></a> Service Runtime Macros
1465 The following service macros are available in all commands that are executed for
1469 ---------------------------|--------------
1470 service.name | The short name of the service object.
1471 service.display_name | The value of the `display_name` attribute.
1472 service.check_command | The short name of the command along with any arguments to be used for the check.
1473 service.state | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1474 service.state_id | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1475 service.state_type | The service's current state type. Can be one of `SOFT` and `HARD`.
1476 service.check_attempt | The current check attempt number.
1477 service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
1478 service.last_state | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1479 service.last_state_id | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1480 service.last_state_type | The service's previous state type. Can be one of `SOFT` and `HARD`.
1481 service.last_state_change | The last state change's timestamp.
1482 service.duration_sec | The time since the last state change.
1483 service.latency | The service's check latency.
1484 service.execution_time | The service's check execution time.
1485 service.output | The last check's output.
1486 service.perfdata | The last check's performance data.
1487 service.last_check | The timestamp when the last check was executed.
1489 ### <a id="command-runtime-macros"></a> Command Runtime Macros
1491 The following custom attributes are available in all commands:
1494 -----------------------|--------------
1495 command.name | The name of the command object.
1497 ### <a id="user-runtime-macros"></a> User Runtime Macros
1499 The following custom attributes are available in all commands that are executed for
1503 -----------------------|--------------
1504 user.name | The name of the user object.
1505 user.display_name | The value of the display_name attribute.
1507 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros
1510 -----------------------|--------------
1511 notification.type | The type of the notification.
1512 notification.author | The author of the notification comment, if existing.
1513 notification.comment | The comment of the notification, if existing.
1515 ### <a id="global-runtime-macros"></a> Global Runtime Macros
1517 The following macros are available in all executed commands:
1520 -----------------------|--------------
1521 icinga.timet | Current UNIX timestamp.
1522 icinga.long_date_time | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
1523 icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
1524 icinga.date | Current date. Example: `2014-01-03`
1525 icinga.time | Current time including timezone information. Example: `11:23:08 +0000`
1526 icinga.uptime | Current uptime of the Icinga 2 process.
1528 The following macros provide global statistics:
1531 ----------------------------------|--------------
1532 icinga.num_services_ok | Current number of services in state 'OK'.
1533 icinga.num_services_warning | Current number of services in state 'Warning'.
1534 icinga.num_services_critical | Current number of services in state 'Critical'.
1535 icinga.num_services_unknown | Current number of services in state 'Unknown'.
1536 icinga.num_services_pending | Current number of pending services.
1537 icinga.num_services_unreachable | Current number of unreachable services.
1538 icinga.num_services_flapping | Current number of flapping services.
1539 icinga.num_services_in_downtime | Current number of services in downtime.
1540 icinga.num_services_acknowledged | Current number of acknowledged service problems.
1541 icinga.num_hosts_up | Current number of hosts in state 'Up'.
1542 icinga.num_hosts_down | Current number of hosts in state 'Down'.
1543 icinga.num_hosts_unreachable | Current number of unreachable hosts.
1544 icinga.num_hosts_flapping | Current number of flapping hosts.
1545 icinga.num_hosts_in_downtime | Current number of hosts in downtime.
1546 icinga.num_hosts_acknowledged | Current number of acknowledged host problems.
1549 ## <a id="check-result-freshness"></a> Check Result Freshness
1551 In Icinga 2 active check freshness is enabled by default. It is determined by the
1552 `check_interval` attribute and no incoming check results in that period of time.
1554 threshold = last check execution time + check interval
1556 Passive check freshness is calculated from the `check_interval` attribute if set.
1558 threshold = last check result time + check interval
1560 If the freshness checks are invalid, a new check is executed defined by the
1561 `check_command` attribute.
1564 ## <a id="check-flapping"></a> Check Flapping
1566 The flapping algorithm used in Icinga 2 does not store the past states but
1567 calculcates the flapping threshold from a single value based on counters and
1568 half-life values. Icinga 2 compares the value with a single flapping threshold
1569 configuration attribute named `flapping_threshold`.
1571 Flapping detection can be enabled or disabled using the `enable_flapping` attribute.
1574 ## <a id="volatile-services"></a> Volatile Services
1576 By default all services remain in a non-volatile state. When a problem
1577 occurs, the `SOFT` state applies and once `max_check_attempts` attribute
1578 is reached with the check counter, a `HARD` state transition happens.
1579 Notifications are only triggered by `HARD` state changes and are then
1580 re-sent defined by the `interval` attribute.
1582 It may be reasonable to have a volatile service which stays in a `HARD`
1583 state type if the service stays in a `NOT-OK` state. That way each
1584 service recheck will automatically trigger a notification unless the
1585 service is acknowledged or in a scheduled downtime.
1588 ## <a id="external-commands"></a> External Commands
1590 Icinga 2 provides an external command pipe for processing commands
1591 triggering specific actions (for example rescheduling a service check
1592 through the web interface).
1594 In order to enable the `ExternalCommandListener` configuration use the
1595 following command and restart Icinga 2 afterwards:
1597 # icinga2-enable-feature command
1599 Icinga 2 creates the command pipe file as `/var/run/icinga2/cmd/icinga2.cmd`
1600 using the default configuration.
1602 Web interfaces and other Icinga addons are able to send commands to
1603 Icinga 2 through the external command pipe, for example for rescheduling
1604 a forced service check:
1606 # /bin/echo "[`date +%s`] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;`date +%s`" >> /var/run/icinga2/cmd/icinga2.cmd
1608 # tail -f /var/log/messages
1610 Oct 17 15:01:25 icinga-server icinga2: Executing external command: [1382014885] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382014885
1611 Oct 17 15:01:25 icinga-server icinga2: Rescheduling next check for service 'ping4'
1613 By default the command pipe file is owned by the group `icingacmd` with read/write
1614 permissions. Add your webserver's user to the group `icingacmd` to
1615 enable sending commands to Icinga 2 through your web interface:
1617 # usermod -G -a icingacmd www-data
1619 Debian packages use `nagios` as the default user and group name. Therefore change `icingacmd` to
1620 `nagios`. The webserver's user is different between distributions as well.
1622 ### <a id="external-command-list"></a> External Command List
1624 A list of currently supported external commands can be found [here](#external-commands-list-detail).
1626 Detailed information on the commands and their required parameters can be found
1627 on the [Icinga 1.x documentation](http://docs.icinga.org/latest/en/extcommands2.html).
1630 ## <a id="logging"></a> Logging
1632 Icinga 2 supports three different types of logging:
1635 * Syslog (on *NIX-based operating systems)
1636 * Console logging (`STDOUT` on tty)
1638 You can enable additional loggers using the `icinga2-enable-feature`
1639 and `icinga2-disable-feature` commands to configure loggers:
1641 Feature | Description
1642 ---------|------------
1643 debuglog | Debug log (path: `/var/log/icinga2/debug.log`, severity: `debug` or higher)
1644 mainlog | Main log (path: `/var/log/icinga2/icinga2.log`, severity: `information` or higher)
1645 syslog | Syslog (severity: `warning` or higher)
1647 By default file the `mainlog` feature is enabled. When running Icinga 2
1648 on a terminal log messages with severity `information` or higher are
1649 written to the console.
1652 ## <a id="performance-data"></a> Performance Data
1654 When a host or service check is executed plugins should provide so-called
1655 `performance data`. Next to that additional check performance data
1656 can be fetched using Icinga 2 runtime macros such as the check latency
1657 or the current service state (or additional custom attributes).
1659 The performance data can be passed to external applications which aggregate and
1660 store them in their backends. These tools usually generate graphs for historical
1661 reporting and trending.
1663 Well-known addons processing Icinga performance data are PNP4Nagios,
1664 inGraph and Graphite.
1666 ### <a id="writing-performance-data-files"></a> Writing Performance Data Files
1668 PNP4Nagios, inGraph and Graphios use performance data collector daemons to fetch
1669 the current performance files for their backend updates.
1671 Therefore the Icinga 2 `PerfdataWriter` object allows you to define
1672 the output template format for host and services backed with Icinga 2
1675 host_format_template = "DATATYPE::HOSTPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tHOSTPERFDATA::$host.perfdata$\tHOSTCHECKCOMMAND::$host.checkcommand$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.statetype$"
1676 service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.name$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.checkcommand$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.statetype$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.statetype$"
1678 The default templates are already provided with the Icinga 2 feature configuration
1679 which can be enabled using
1681 # icinga2-enable-feature perfdata
1683 By default all performance data files are rotated in a 15 seconds interval into
1684 the `/var/spool/icinga2/perfdata/` directory as `host-perfdata.<timestamp>` and
1685 `service-perfdata.<timestamp>`.
1686 External collectors need to parse the rotated performance data files and then
1687 remove the processed files.
1689 ### <a id="graphite-carbon-cache-writer"></a> Graphite Carbon Cache Writer
1691 While there are some Graphite collector scripts and daemons like Graphios available for
1692 Icinga 1.x it's more reasonable to directly process the check and plugin performance
1693 in memory in Icinga 2. Once there are new metrics available, Icinga 2 will directly
1694 write them to the defined Graphite Carbon daemon tcp socket.
1696 You can enable the feature using
1698 # icinga2-enable-feature graphite
1700 By default the `GraphiteWriter` object expects the Graphite Carbon Cache to listen at
1701 `127.0.0.1` on port `2003`.
1703 The current naming schema is
1705 icinga.<hostname>.<metricname>
1706 icinga.<hostname>.<servicename>.<metricname>
1710 ## <a id="status-data"></a> Status Data
1712 Icinga 1.x writes object configuration data and status data in a cyclic
1713 interval to its `objects.cache` and `status.dat` files. Icinga 2 provides
1714 the `StatusDataWriter` object which dumps all configuration objects and
1715 status updates in a regular interval.
1717 # icinga2-enable-feature statusdata
1719 Icinga 1.x Classic UI requires this data set as part of its backend.
1723 > If you are not using any web interface or addon which uses these files
1724 > you can safely disable this feature.
1727 ## <a id="compat-logging"></a> Compat Logging
1729 The Icinga 1.x log format is considered being the `Compat Log`
1730 in Icinga 2 provided with the `CompatLogger` object.
1732 These logs are not only used for informational representation in
1733 external web interfaces parsing the logs, but also to generate
1734 SLA reports and trends in Icinga 1.x Classic UI. Furthermore the
1735 [Livestatus](#livestatus) feature uses these logs for answering queries to
1738 The `CompatLogger` object can be enabled with
1740 # icinga2-enable-feature compatlog
1742 By default, the Icinga 1.x log file called `icinga.log` is located
1743 in `/var/log/icinga2/compat`. Rotated log files are moved into
1744 `var/log/icinga2/compat/archives`.
1746 The format cannot be changed without breaking compatibility to
1747 existing log parsers.
1749 # tail -f /var/log/icinga2/compat/icinga.log
1751 [1382115688] LOG ROTATION: HOURLY
1752 [1382115688] LOG VERSION: 2.0
1753 [1382115688] HOST STATE: CURRENT;localhost;UP;HARD;1;
1754 [1382115688] SERVICE STATE: CURRENT;localhost;disk;WARNING;HARD;1;
1755 [1382115688] SERVICE STATE: CURRENT;localhost;http;OK;HARD;1;
1756 [1382115688] SERVICE STATE: CURRENT;localhost;load;OK;HARD;1;
1757 [1382115688] SERVICE STATE: CURRENT;localhost;ping4;OK;HARD;1;
1758 [1382115688] SERVICE STATE: CURRENT;localhost;ping6;OK;HARD;1;
1759 [1382115688] SERVICE STATE: CURRENT;localhost;processes;WARNING;HARD;1;
1760 [1382115688] SERVICE STATE: CURRENT;localhost;ssh;OK;HARD;1;
1761 [1382115688] SERVICE STATE: CURRENT;localhost;users;OK;HARD;1;
1762 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;disk;1382115705
1763 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;http;1382115705
1764 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;load;1382115705
1765 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382115705
1766 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping6;1382115705
1767 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;processes;1382115705
1768 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ssh;1382115705
1769 [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;users;1382115705
1770 [1382115731] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;localhost;ping6;2;critical test|
1771 [1382115731] SERVICE ALERT: localhost;ping6;CRITICAL;SOFT;2;critical test
1776 ## <a id="db-ido"></a> DB IDO
1778 The IDO (Icinga Data Output) modules for Icinga 2 take care of exporting all
1779 configuration and status information into a database. The IDO database is used
1780 by a number of projects including Icinga Web 1.x and 2.
1782 Details on the installation can be found in the [Getting Started](#configuring-ido)
1783 chapter. Details on the configuration can be found in the
1784 [IdoMysqlConnection](#objecttype-idomysqlconnection) and
1785 [IdoPgsqlConnection](#objecttype-idoPgsqlconnection)
1786 object configuration documentation.
1787 The DB IDO feature supports [High Availability](##high-availability-db-ido) in
1788 the Icinga 2 cluster.
1790 The following example query checks the health of the current Icinga 2 instance
1791 writing its current status to the DB IDO backend table `icinga_programstatus`
1792 every 10 seconds. By default it checks 60 seconds into the past which is a reasonable
1793 amount of time - adjust it for your requirements. If the condition is not met,
1794 the query returns an empty result.
1798 > Use [check plugins](#plugins) to monitor the backend.
1800 Replace the `default` string with your instance name, if different.
1804 # mysql -u root -p icinga -e "SELECT status_update_time FROM icinga_programstatus ps
1805 JOIN icinga_instances i ON ps.instance_id=i.instance_id
1806 WHERE (UNIX_TIMESTAMP(ps.status_update_time) > UNIX_TIMESTAMP(NOW())-60)
1807 AND i.instance_name='default';"
1809 +---------------------+
1810 | status_update_time |
1811 +---------------------+
1812 | 2014-05-29 14:29:56 |
1813 +---------------------+
1816 Example for PostgreSQL:
1818 # export PGPASSWORD=icinga; psql -U icinga -d icinga -c "SELECT ps.status_update_time FROM icinga_programstatus AS ps
1819 JOIN icinga_instances AS i ON ps.instance_id=i.instance_id
1820 WHERE ((SELECT extract(epoch from status_update_time) FROM icinga_programstatus) > (SELECT extract(epoch from now())-60))
1821 AND i.instance_name='default'";
1824 ------------------------
1825 2014-05-29 15:11:38+02
1829 A detailed list on the available table attributes can be found in the [DB IDO Schema documentation](#schema-db-ido).
1832 ## <a id="livestatus"></a> Livestatus
1834 The [MK Livestatus](http://mathias-kettner.de/checkmk_livestatus.html) project
1835 implements a query protocol that lets users query their Icinga instance for
1836 status information. It can also be used to send commands.
1838 Details on the installation can be found in the [Getting Started](#setting-up-livestatus)
1841 ### <a id="livestatus-sockets"></a> Livestatus Sockets
1843 Other to the Icinga 1.x Addon, Icinga 2 supports two socket types
1845 * Unix socket (default)
1848 Details on the configuration can be found in the [LivestatusListener](#objecttype-livestatuslistener)
1849 object configuration.
1851 ### <a id="livestatus-get-queries"></a> Livestatus GET Queries
1855 > All Livestatus queries require an additional empty line as query end identifier.
1856 > The `unixcat` tool is either available by the MK Livestatus project or as separate
1859 There also is a Perl module available in CPAN for accessing the Livestatus socket
1860 programmatically: [Monitoring::Livestatus](http://search.cpan.org/~nierlein/Monitoring-Livestatus-0.74/)
1863 Example using the unix socket:
1865 # echo -e "GET services\n" | unixcat /var/run/icinga2/cmd/livestatus
1867 Example using the tcp socket listening on port `6558`:
1869 # echo -e 'GET services\n' | netcat 127.0.0.1 6558
1871 # cat servicegroups <<EOF
1876 (cat servicegroups; sleep 1) | netcat 127.0.0.1 6558
1879 ### <a id="livestatus-command-queries"></a> Livestatus COMMAND Queries
1881 A list of available external commands and their parameters can be found [here](#external-commands-list-detail)
1883 $ echo -e 'COMMAND <externalcommandstring>' | netcat 127.0.0.1 6558
1886 ### <a id="livestatus-filters"></a> Livestatus Filters
1890 Operator | Negate | Description
1891 ----------|------------------------
1893 ~ | !~ | Regex match
1894 =~ | !=~ | Equality ignoring case
1895 ~~ | !~~ | Regex ignoring case
1898 <= | | Less than or equal
1899 >= | | Greater than or equal
1902 ### <a id="livestatus-stats"></a> Livestatus Stats
1904 Schema: "Stats: aggregatefunction aggregateattribute"
1906 Aggregate Function | Description
1907 -------------------|--------------
1912 std | standard deviation
1913 suminv | sum (1 / value)
1914 avginv | suminv / count
1915 count | ordinary default for any stats query if not aggregate function defined
1920 Filter: has_been_checked = 1
1921 Filter: check_type = 0
1922 Stats: sum execution_time
1924 Stats: sum percent_state_change
1925 Stats: min execution_time
1927 Stats: min percent_state_change
1928 Stats: max execution_time
1930 Stats: max percent_state_change
1932 ResponseHeader: fixed16
1934 ### <a id="livestatus-output"></a> Livestatus Output
1938 CSV Output uses two levels of array separators: The members array separator
1939 is a comma (1st level) while extra info and host|service relation separator
1940 is a pipe (2nd level).
1942 Separators can be set using ASCII codes like:
1944 Separators: 10 59 44 124
1950 ### <a id="livestatus-error-codes"></a> Livestatus Error Codes
1953 ----------|--------------
1955 404 | Table does not exist
1956 452 | Exception on query
1958 ### <a id="livestatus-tables"></a> Livestatus Tables
1960 Table | Join |Description
1961 --------------|-----------|----------------------------
1962 hosts | | host config and status attributes, services counter
1963 hostgroups | | hostgroup config, status attributes and host/service counters
1964 services | hosts | service config and status attributes
1965 servicegroups | | servicegroup config, status attributes and service counters
1966 contacts | | contact config and status attributes
1967 contactgroups | | contact config, members
1968 commands | | command name and line
1969 status | | programstatus, config and stats
1970 comments | services | status attributes
1971 downtimes | services | status attributes
1972 timeperiods | | name and is inside flag
1973 endpoints | | config and status attributes
1974 log | services, hosts, contacts, commands | parses [compatlog](#objecttype-compatlogger) and shows log attributes
1975 statehist | hosts, services | parses [compatlog](#objecttype-compatlogger) and aggregates state change attributes
1977 The `commands` table is populated with `CheckCommand`, `EventCommand` and `NotificationCommand` objects.
1979 A detailed list on the available table attributes can be found in the [Livestatus Schema documentation](#schema-livestatus).
1982 ## <a id="check-result-files"></a> Check Result Files
1984 Icinga 1.x writes its check result files to a temporary spool directory
1985 where they are processed in a regular interval.
1986 While this is extremely inefficient in performance regards it has been
1987 rendered useful for passing passive check results directly into Icinga 1.x
1988 skipping the external command pipe.
1990 Several clustered/distributed environments and check-aggregation addons
1991 use that method. In order to support step-by-step migration of these
1992 environments, Icinga 2 ships the `CheckResultReader` object.
1994 There is no feature configuration available, but it must be defined
1995 on-demand in your Icinga 2 objects configuration.
1997 object CheckResultReader "reader" {
1998 spool_dir = "/data/check-results"