* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Printers
-* Switches / Routers
-* Temperature Sensors
+* Switches / routers
+* Temperature sensors
* Other local or network-accessible services
Host objects provide a mechanism to group services that are running
}
object Service "ping4" {
- host_name = "localhost"
+ host_name = "my-server1"
check_command = "ping4"
}
object Service "http" {
- host_name = "localhost"
- check_command = "http_ip"
+ host_name = "my-server1"
+ check_command = "http"
}
The example creates two services `ping4` and `http` which belong to the
It also specifies that the host should perform its own check using the `hostalive`
check command.
-The `address` custom attribute is used by check commands to determine which network
+The `address` attribute is used by check commands to determine which network
address is associated with the host object.
+Details on troubleshooting check problems can be found [here](8-troubleshooting.md#troubleshooting).
+
### <a id="host-states"></a> Host States
Hosts can be in any of the following states:
HARD | The host/service's state hasn't recently changed.
SOFT | The host/service has recently changed state and is being re-checked.
+### <a id="host-service-checks"></a> Host and Service Checks
+
+Hosts and Services determine their state from a check result returned from a check
+execution to the Icinga 2 application. By default the `generic-host` example template
+will define `hostalive` as host check. If your host is unreachable for ping, you should
+consider using a different check command, for instance the `http` check command, or if
+there is no check available, the `dummy` check command.
+
+ object Host "uncheckable-host" {
+ check_command = "dummy"
+ vars.dummy_state = 1
+ vars.dummy_text = "Pretending to be OK."
+ }
+
+Service checks could also use a `dummy` check, but the common strategy is to
+[integrate an existing plugin](3-monitoring-basics.md#command-plugin-integration) as
+[check command](3-monitoring-basics.md#check-commands) and [reference](3-monitoring-basics.md#command-passing-parameters)
+that in your [Service](12-object-types.md#objecttype-service) object definition.
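+
+As a minimal sketch of such a reference, assuming the `http` check command shipped
+with the Icinga Template Library (the service name and the `http_vhost` and `http_uri`
+values are hypothetical):
+
+    object Service "http-shop" {
+      host_name = "my-server1"
+      check_command = "http"
+
+      // custom attributes are passed to the check command as parameters
+      vars.http_vhost = "shop.example.com"
+      vars.http_uri = "/shop"
+    }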
+
+## <a id="configuration-best-practice"></a> Configuration Best Practice
+
+The [Getting Started](2-getting-started.md#getting-started) chapter already introduced various aspects
+of the Icinga 2 configuration language. If you are ready to configure additional
+hosts, services, notifications, dependencies, etc., you should think about the
+requirements first and then decide on a strategy.
+
+There are many ways of creating Icinga 2 configuration objects:
+
+* Manually with your preferred editor, for example vi(m), nano, notepad, etc.
+* Generated by a [configuration management tool](2-getting-started.md#configuration-tools) such as Puppet, Chef, Ansible, etc.
+* A configuration addon for Icinga 2
+* A custom exporter script from your CMDB or inventory tool
+* your own.
+
+In order to find the best strategy for your own configuration, ask yourself the following questions:
+
+* Do your hosts share a common group of services (for example linux hosts with disk, load, etc checks)?
+* Only a small set of users receives notifications and escalations for all hosts/services?
+
+If you can answer at least one of these questions with yes, look into the [apply rules](3-monitoring-basics.md#using-apply) logic
+instead of defining objects on a per host and service basis.
+
+* You are required to define specific configuration for each host/service?
+* Does your configuration generation tool already know about the host-service-relationship?
+
+Then you should use the object-specific configuration attributes such as `host_name` accordingly.
+
+Finding the best files and directory tree for your configuration is up to you. Make sure that
+the [icinga2.conf](2-getting-started.md#icinga2-conf) configuration file includes them, and then think about:
+
+* tree-based on locations, hostgroups, specific host attributes with sub levels of directories.
+* flat `hosts.conf`, `services.conf`, etc files for rule based configuration.
+* generated configuration with one file per host and a global configuration for groups, users, etc.
+* one big file generated from an external application (probably a bad idea for maintaining changes).
+* your own.
+
+Whichever strategy you choose, you should additionally check the following:
+
+* Are there any specific attributes describing the host/service you could set as `vars` custom attributes?
+You can later use them for applying assign/ignore rules, or export them into external interfaces.
+* Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules.
+* Use templates to store generic attributes for your objects and apply rules making your configuration more readable.
+Details can be found in the [using templates](3-monitoring-basics.md#object-inheritance-using-templates) chapter.
+* Apply rules may overlap. Keep a central place (for example, [services.conf](2-getting-started.md#services-conf) or [notifications.conf](2-getting-started.md#notifications-conf)) storing
+the configuration instead of defining apply rules deep in your configuration tree.
+* Every plugin used as check, notification or event command requires a `Command` definition.
+Further details can be looked up in the [check commands](3-monitoring-basics.md#check-commands) chapter.
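+
+The group-based recommendation above could be sketched as follows. This assumes
+that your hosts set a custom attribute `os`; the group name `linux-servers` and the
+`load` service are illustrative only:
+
+    object HostGroup "linux-servers" {
+      display_name = "Linux Servers"
+
+      // membership is assigned by custom attribute, not per host
+      assign where host.vars.os == "Linux"
+    }
+
+    apply Service "load" {
+      import "generic-service"
+      check_command = "load"
+
+      // the apply rule only needs to know about the group
+      assign where "linux-servers" in host.groups
+    }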
+
+If you happen to have further questions, do not hesitate to join the [community support channels](https://support.icinga.org)
+and ask community members for their experience and best practices.
-## <a id="using-templates"></a> Using Templates
+
+### <a id="object-inheritance-using-templates"></a> Object Inheritance Using Templates
Templates may be used to apply a set of identical attributes to more than one
object:
enable_perfdata = true
}
- object Service "ping4" {
+ template Service "ipv6-service" {
+ notes = "IPv6 critical != IPv4 broken."
+ }
+
+ apply Service "ping4" {
import "generic-service"
- host_name = "localhost"
check_command = "ping4"
+
+ assign where host.address
}
- object Service "ping6" {
+ apply Service "ping6" {
import "generic-service"
+ import "ipv6-service"
- host_name = "localhost"
check_command = "ping6"
+
+ assign where host.address6
}
+
In this example the `ping4` and `ping6` services inherit properties from the
-template `generic-service`.
+template `generic-service`. The `ping6` service additionally imports the `ipv6-service`
+template with the `notes` attribute.
Objects as well as templates themselves can import an arbitrary number of
templates. Attributes inherited from a template can be overridden in the
object if necessary.
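+
+A minimal sketch of overriding an inherited attribute, assuming a hypothetical
+`slow-check` template (the interval values are illustrative only):
+
+    template Service "slow-check" {
+      check_interval = 10m
+    }
+
+    object Service "ping4-frequent" {
+      import "generic-service"
+      import "slow-check"
+
+      host_name = "my-server1"
+      check_command = "ping4"
+
+      // set after the imports, so this value replaces the template's 10m
+      check_interval = 1m
+    }
+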
-## <a id="using-apply"></a> Apply objects based on rules
+You can also import existing non-template objects into objects. This
+requires you to use unique names for templates and objects sharing
+the same namespace.
-Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`)
-based on attribute identifiers for example `host_name` objects can be [applied](#apply).
+Example for importing objects:
- apply Service "load" {
+ object CheckCommand "snmp-simple" {
+ ...
+ vars.snmp_defaults = ...
+ }
+
+ object CheckCommand "snmp-advanced" {
+ import "snmp-simple"
+ ...
+ vars.snmp_advanced = ...
+ }
+
+### <a id="using-apply"></a> Apply objects based on rules
+
+Instead of assigning each object ([Service](12-object-types.md#objecttype-service),
+[Notification](12-object-types.md#objecttype-notification), [Dependency](12-object-types.md#objecttype-dependency),
+[ScheduledDowntime](12-object-types.md#objecttype-scheduleddowntime))
+based on attribute identifiers such as `host_name`, objects can be [applied](10-language-reference.md#apply).
+
+Before you start using the apply rules keep the following in mind:
+
+* Define the best match.
+ * A set of unique [custom attributes](3-monitoring-basics.md#custom-attributes-apply) for these hosts/services?
+ * Or [group](3-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
+ * A generic pattern [match](10-language-reference.md#function-calls) on the host/service name?
+ * [Multiple expressions combined](3-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](10-language-reference.md#expression-operators)
+* All expressions must return a boolean value (for example, an empty string evaluates to `false`).
+
+> **Note**
+>
+> You can set/override object attributes in apply rules using the respectively available
+> objects in that scope (host and/or service objects).
+
+[Custom attributes](3-monitoring-basics.md#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
+not only for matching on their existence or values in apply expressions, but also to assign
+("inherit") their values to the objects generated from apply rules.
+
+* [Apply services to hosts](3-monitoring-basics.md#using-apply-services)
+* [Apply notifications to hosts and services](3-monitoring-basics.md#using-apply-notifications)
+* [Apply dependencies to hosts and services](3-monitoring-basics.md#using-apply-dependencies)
+* [Apply scheduled downtimes to hosts and services](3-monitoring-basics.md#using-apply-scheduledowntimes)
+
+A more advanced example is using [apply with for loops on arrays or
+dictionaries](3-monitoring-basics.md#using-apply-for), for example provided by
+[custom attributes](3-monitoring-basics.md#custom-attributes-apply) or groups.
+
+> **Tip**
+>
+> Building configuration in that dynamic way requires detailed information
+> of the generated objects. Use the `object list` [CLI command](5-cli-commands.md#cli-command-object)
+> after successful [configuration validation](5-cli-commands.md#config-validation).
+
+
+#### <a id="using-apply-expressions"></a> Apply Rules Expressions
+
+You can use simple or advanced combinations of apply rule expressions. Each
+expression must evaluate to the boolean value `true` for the rule to match. An empty string
+is, for instance, interpreted as `false`. In a similar fashion, undefined
+attributes return `false`.
+
+Returns `false`:
+
+ assign where host.vars.attribute_does_not_exist
+
+Multiple `assign where` condition rows are evaluated as `OR` condition.
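+
+As a small sketch of this behaviour (the `os` values are hypothetical), the following
+rule matches hosts running either operating system, just as a single expression
+joined with `||` would:
+
+    apply Service "ssh" {
+      import "generic-service"
+      check_command = "ssh"
+
+      // either row matching is sufficient
+      assign where host.vars.os == "Linux"
+      assign where host.vars.os == "FreeBSD"
+    }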
+
+You can combine multiple expressions for matching only a subset of objects. In some cases,
+you want to be able to add more than one assign/ignore where expression which matches
+a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.
+
+
+The following example matches all hosts whose name matches the `*mysql*` pattern and (`&&`) whose
+custom attribute `prod_mysql_db` matches the `db-*` pattern. Hosts with the custom attribute
+`test_server` set to `true` are ignored, as is any host whose name matches the `*internal` pattern.
+
+ object HostGroup "mysql-server" {
+ display_name = "MySQL Server"
+
+ assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
+ ignore where host.vars.test_server == true
+ ignore where match("*internal", host.name)
+ }
+
+A similar example for advanced notification apply rule filters: the notification is
+applied if the service attribute `notes` contains the `has gold support 24x7` string `AND`
+one of two conditions passes: either the `customer` host custom attribute is set to `customer-xy`
+`OR` the host custom attribute `always_notify` is set to `true`.
+
+The notification is ignored for services whose host name ends with `*internal`,
+`OR` whose `priority` custom attribute is [less than](10-language-reference.md#expression-operators) `2`
+`AND` whose host has the `is_clustered` custom attribute set to `true`.
+
+ template Notification "cust-xy-notification" {
+ users = [ "noc-xy", "mgmt-xy" ]
+ command = "mail-service-notification"
+ }
+
+ apply Notification "notify-cust-xy-mysql" to Service {
+ import "cust-xy-notification"
+
+ assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
+ ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
+ }
+
+
+
+
+#### <a id="using-apply-services"></a> Apply Services to Hosts
+
+The sample configuration already ships a detailed example in [hosts.conf](2-getting-started.md#hosts-conf)
+and [services.conf](2-getting-started.md#services-conf) for this use case.
+
+The example for `ssh` applies a service object to all hosts with the `address`
+attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.
+
+ apply Service "ssh" {
import "generic-service"
- check_command = "load"
+ check_command = "ssh"
- assign where "linux-server" in host.groups
- ignore where host.vars.no_load_check
+ assign where host.address && host.vars.os == "Linux"
}
-In this example the `load` service will be created as object for all hosts in the `linux-server`
-host group. If the `no_load_check` custom attribute is set, the host will be
-ignored.
+
+Other detailed scenario examples are used in their respective chapters, for example
+[apply services with custom command arguments](3-monitoring-basics.md#using-apply-services-command-arguments).
+
+#### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:
+
apply Notification "mail-noc" to Service {
import "mail-service-notification"
- command = "mail-service-notification"
+
user_groups = [ "noc" ]
- assign where service.vars.sla == "24x7"
+ assign where host.vars.notification.mail
}
+
In this example the `mail-noc` notification will be created as object for all services having the
-`sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification`
+`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.
-`Dependency` and `ScheduledDowntime` objects can be applied in a similar fashion.
+#### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
+
+Detailed examples can be found in the [dependencies](3-monitoring-basics.md#dependencies) chapter.
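+
+A minimal sketch of a dependency apply rule, assuming a hypothetical parent host
+named `dsl-router` through which all other hosts are reached:
+
+    apply Dependency "internet" to Host {
+      parent_host_name = "dsl-router"
+
+      // suppress checks and notifications while the parent is down
+      disable_checks = true
+      disable_notifications = true
+
+      // the router must not depend on itself
+      assign where host.name != "dsl-router"
+    }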
+
+#### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
+The sample configuration ships an example in [downtimes.conf](2-getting-started.md#downtimes-conf).
-## <a id="groups"></a> Groups
+Detailed examples can be found in the [recurring downtimes](3-monitoring-basics.md#recurring-downtimes) chapter.
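+
+A minimal sketch of a recurring downtime apply rule. The service group `backup`,
+the author, and the time ranges are assumptions for illustration:
+
+    apply ScheduledDowntime "backup-downtime" to Service {
+      author = "icingaadmin"
+      comment = "Scheduled downtime for backup"
+
+      // weekday => time range in which the downtime is active
+      ranges = {
+        monday = "02:00-03:00"
+        tuesday = "02:00-03:00"
+      }
+
+      assign where "backup" in service.groups
+    }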
+
+
+#### <a id="using-apply-for"></a> Using Apply For Rules
+
+In addition to the standard way of using apply rules, you may need to generate
+objects from a set (array or dictionary). That way you can save
+a lot of duplicated apply rules by combining them into one generic rule which generates
+the object name with or without a prefix.
+
+The sample configuration already ships a detailed example in [hosts.conf](2-getting-started.md#hosts-conf)
+and [services.conf](2-getting-started.md#services-conf) for this use case.
+
+Imagine a different example: You are monitoring your switch (hosts) with many
+interfaces (services). The following requirements/problems apply:
+
+* Each interface service check should be named with a prefix and a running number
+* Each interface has its own vlan tag
+* Some interfaces have QoS enabled
+* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
+dynamically generated
+
+By defining the `interfaces` dictionary with three example interfaces on the `core-switch`
+host object, you provide the data which the for loop in the service apply
+rule iterates over.
+
+
+ object Host "core-switch" {
+ import "generic-host"
+ address = "127.0.0.1"
+
+ vars.interfaces["0"] = {
+ port = 1
+ vlan = "internal"
+ address = "127.0.0.2"
+ qos = "enabled"
+ }
+ vars.interfaces["1"] = {
+ port = 2
+ vlan = "mgmt"
+ address = "127.0.1.2"
+ }
+ vars.interfaces["2"] = {
+ port = 3
+ vlan = "remote"
+ address = "127.0.2.2"
+ }
+ }
+
+You can also omit the `"if-"` string, then all generated service names are directly
+taken from the `if_name` variable value.
+
+The config dictionary contains all key-value pairs for the specific interface in one
+loop cycle, like `port`, `vlan`, `address` and `qos` for the `0` interface.
+
+By defining a default value for the custom attribute `qos` in the `vars` dictionary
+before adding the `config` dictionary we'll ensure that this attribute is always defined.
+
+After `vars` is fully populated, all object attributes can be set. For strings, you can use
+string concatenation with the `+` operator.
+
+You can also specify the check command that way.
+
+ apply Service "if-" for (if_name => config in host.vars.interfaces) {
+ import "generic-service"
+ check_command = "ping4"
+
+ vars.qos = "disabled"
+ vars += config
+
+ display_name = "if-" + if_name + "-" + vars.vlan
+
+ notes = "Interface check for Port " + string(vars.port) + " in VLAN " + vars.vlan + " on Address " + vars.address + " QoS " + vars.qos
+ notes_url = "http://foreman.company.com/hosts/" + host.name
+ action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + if_name
+
+ assign where host.vars.interfaces
+ }
+
+Note that numbers must be explicitly cast to strings when adding them to strings.
+This can be achieved by wrapping them into the [string()](10-language-reference.md#function-calls) function.
+
+> **Tip**
+>
+> Building configuration in that dynamic way requires detailed information
+> of the generated objects. Use the `object list` [CLI command](5-cli-commands.md#cli-command-object)
+> after successful [configuration validation](5-cli-commands.md#config-validation).
+
+
+#### <a id="using-apply-object attributes"></a> Use Object Attributes in Apply Rules
+
+Since apply rules are evaluated after the generic objects, you
+can reference existing host and/or service object attributes as
+values for any object attribute specified in that apply rule.
+
+ object Host "opennebula-host" {
+ import "generic-host"
+ address = "10.1.1.2"
+
+ vars.hosting["xyz"] = {
+ http_uri = "/shop"
+ customer_name = "Customer xyz"
+ customer_id = "7568"
+ support_contract = "gold"
+ }
+ vars.hosting["abc"] = {
+ http_uri = "/shop"
+ customer_name = "Customer xyz"
+ customer_id = "7568"
+ support_contract = "silver"
+ }
+ }
+
+ apply Service for (customer => config in host.vars.hosting) {
+ import "generic-service"
+ check_command = "ping4"
+
+ vars.qos = "disabled"
+
+ vars += config
+
+ vars.http_uri = "/" + customer + "/" + config.http_uri
+
+ display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
+
+ notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
+
+ notes_url = "http://foreman.company.com/hosts/" + host.name
+ action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
+
+ assign where host.vars.hosting
+ }
+
+### <a id="groups"></a> Groups
Groups are used for combining hosts, services, and users into
accessible configuration attributes and views in external (web)
groups += [ "windows-mssql-admins" ]
}
- object User "win-mssql-noc" {
- import "generic-windows-mssql-users"
+ object User "win-mssql-noc" {
+ import "generic-windows-mssql-users"
+
+ email = "noc@example.com"
+ }
+
+ object User "win-mssql-ops" {
+ import "generic-windows-mssql-users"
+
+ email = "ops@example.com"
+ }
+
+#### <a id="group-assign-intro"></a> Group Membership Assign
+
+If there is a certain number of hosts, services, or users matching a pattern
+it's reasonable to assign the group object to these members.
+Details on the `assign where` syntax can be found [here](10-language-reference.md#apply).
+
+ object HostGroup "prod-mssql" {
+ display_name = "Production MSSQL Servers"
+ assign where host.vars.mssql_port && host.vars.prod_mysql_db
+ ignore where host.vars.test_server == true
+ ignore where match("*internal", host.name)
+ }
+
+In this example all hosts with the `vars` attributes `mssql_port` and `prod_mysql_db`
+set will be added as members to the host group `prod-mssql`. All hosts whose name matches `*internal`
+or which have the `test_server` attribute set to `true` will be ignored.
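+
+Service groups can be assigned in the same way. A minimal sketch, assuming your
+ping services are named with a common `ping` prefix:
+
+    object ServiceGroup "ping" {
+      display_name = "Ping Checks"
+
+      // matches service names such as ping4 and ping6
+      assign where match("ping*", service.name)
+    }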
+
+## <a id="notifications"></a> Notifications
+
+Notifications for service and host problems are an integral part of your
+monitoring setup.
+
+When a host or service is in a downtime, a problem has been acknowledged or
+the dependency logic determined that the host/service is unreachable, no
+notifications are sent. You can configure additional type and state filters
+refining the notifications being actually sent.
+
+There are many ways of sending notifications, e.g. by e-mail, XMPP,
+IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
+Instead it relies on external mechanisms such as shell scripts to notify users.
+
+A notification specification requires one or more users (and/or user groups)
+who will be notified in case of problems. These users must have all custom
+attributes defined which will be used in the `NotificationCommand` on execution.
+
+The user `icingaadmin` in the example below will get notified only on `OK`, `WARNING` and
+`CRITICAL` states and `problem` and `recovery` notification types.
+
+ object User "icingaadmin" {
+ display_name = "Icinga 2 Admin"
+ enable_notifications = true
+ states = [ OK, Warning, Critical ]
+ types = [ Problem, Recovery ]
+ email = "icinga@localhost"
+ }
+
+If you don't set the `states` and `types` configuration attributes for the `User`
+object, notifications for all states and types will be sent.
+
+Details on troubleshooting notification problems can be found [here](8-troubleshooting.md#troubleshooting).
+
+> **Note**
+>
+> Make sure that the [notification](5-cli-commands.md#features) feature is enabled on your master instance
+> in order to execute notification commands.
+
+You should decide which information you (and your notified users) need in
+case of an emergency, and also which information does not provide any value to you and
+your environment.
+
+An example notification command is explained [here](3-monitoring-basics.md#notification-commands).
+
+You can add all shared attributes to a `Notification` template which is imported
+by the defined notifications. That way you avoid duplicating attributes in each
+`Notification` object. Attributes can be overridden locally.
+
+ template Notification "generic-notification" {
+ interval = 15m
+
+ command = "mail-service-notification"
+
+ states = [ Warning, Critical, Unknown ]
+ types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
+ FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
+
+ period = "24x7"
+ }
+
+The time period `24x7` is shipped as example configuration with Icinga 2.
+
+Use the `apply` keyword to create `Notification` objects for your services:
+
+ apply Notification "notify-cust-xy-mysql" to Service {
+ import "generic-notification"
+
+ users = [ "noc-xy", "mgmt-xy" ]
+
+ assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
+ ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
+ }
+
+
+Instead of assigning users to notifications, you can also add the `user_groups`
+attribute with a list of user groups to the `Notification` object. Icinga 2 will
+send notifications to all group members.
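+
+A minimal sketch of this approach. The group name `noc-xy-group`, the user
+`noc-xy-alice`, and the e-mail address are hypothetical:
+
+    object UserGroup "noc-xy-group" {
+      display_name = "NOC Customer XY"
+    }
+
+    object User "noc-xy-alice" {
+      // group membership is declared on the user
+      groups = [ "noc-xy-group" ]
+      email = "alice@example.com"
+    }
+
+    apply Notification "notify-noc-xy-group" to Service {
+      import "generic-notification"
+
+      // all members of the group are notified
+      user_groups = [ "noc-xy-group" ]
+
+      assign where host.vars.customer == "customer-xy"
+    }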
+
+> **Note**
+>
+> Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
+> states for services, `Down` for hosts) will receive `Recovery` notifications.
+
+### <a id="notification-escalations"></a> Notification Escalations
+
+When a problem notification is sent and a problem still exists at the time of re-notification
+you may want to escalate the problem to the next support level. A different approach
+is to configure the default notification by email, and escalate the problem via SMS
+if not already solved.
+
+You can define notification start and end times as additional configuration
+attributes making the `Notification` object a so-called `notification escalation`.
+Using templates you can share the basic notification attributes such as users or the
+`interval` (and override them for the escalation then).
+
+Using the example from above, you can define additional users being escalated for SMS
+notifications between start and end time.
+
+ object User "icinga-oncall-2nd-level" {
+ display_name = "Icinga 2nd Level"
+
+ vars.mobile = "+1 555 424642"
+ }
+
+ object User "icinga-oncall-1st-level" {
+ display_name = "Icinga 1st Level"
+
+ vars.mobile = "+1 555 424642"
+ }
+
+Define an additional [NotificationCommand](#notification) for SMS notifications.
+
+> **Note**
+>
+> The example is not complete as there are many different SMS providers.
+> Please note that sending SMS notifications will require an SMS provider
+> or local hardware with a SIM card active.
+
+ object NotificationCommand "sms-notification" {
+ command = [
+ PluginDir + "/send_sms_notification",
+ "$mobile$",
+ "..."
+ ]
+ }
+
+The two new notification escalations are added onto the local host
+and its service `ping4` using the `generic-notification` template.
+The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
+command) after `30m` until `1h`.
+
+> **Note**
+>
+> The `interval` was set to 15m in the `generic-notification`
+> template example. Lower that value in your escalations by using a secondary
+> template or by overriding the attribute directly in the
+> `escalation-sms-2nd-level` apply rule.
+
+If the problem is neither resolved nor acknowledged (which would prevent further notifications),
+the `icinga-oncall-1st-level` user will be notified by the `escalation-sms-1st-level` escalation `1h` after the initial problem was
+notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
+
+ apply Notification "mail" to Service {
+ import "generic-notification"
+
+ command = "mail-notification"
+ users = [ "icingaadmin" ]
+
+ assign where service.name == "ping4"
+ }
+
+ apply Notification "escalation-sms-2nd-level" to Service {
+ import "generic-notification"
+
+ command = "sms-notification"
+ users = [ "icinga-oncall-2nd-level" ]
+
+ times = {
+ begin = 30m
+ end = 1h
+ }
+
+ assign where service.name == "ping4"
+ }
+
+ apply Notification "escalation-sms-1st-level" to Service {
+ import "generic-notification"
+
+ command = "sms-notification"
+ users = [ "icinga-oncall-1st-level" ]
+
+ times = {
+ begin = 1h
+ end = 2h
+ }
+
+ assign where service.name == "ping4"
+ }
+
+### <a id="notification-delay"></a> Notification Delay
+
+Sometimes the problem in question should not be notified when the notification is due
+(the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
+you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
+postpone the notification window for 15 minutes. Leave out the `end` key - if not set,
+Icinga 2 will not check against any end time for this notification. Make sure to
+specify a relatively low notification `interval` to get notified soon enough again.
+
+ apply Notification "mail" to Service {
+ import "generic-notification"
+
+ command = "mail-notification"
+ users = [ "icingaadmin" ]
+
+ interval = 5m
+
+ times.begin = 15m // delay notification window
+
+ assign where service.name == "ping4"
+ }
+
+### <a id="disable-renotification"></a> Disable Re-notifications
+
+If you prefer to be notified only once, you can disable re-notifications by setting the
+`interval` attribute to `0`.
+
+ apply Notification "notify-once" to Service {
+ import "generic-notification"
+
+ command = "mail-notification"
+ users = [ "icingaadmin" ]
- email = "noc@example.com"
+ interval = 0 // disable re-notification
+
+ assign where service.name == "ping4"
}
- object User "win-mssql-ops" {
- import "generic-windows-mssql-users"
+### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
- email = "ops@example.com"
- }
+If there are no notification state and type filter attributes defined at the `Notification`
+or `User` object Icinga 2 assumes that all states and types are being notified.
-### <a id="groups"></a> Group Membership Assign
+Available state and type filters for notifications are:
-If there is a certain number of hosts, services or users matching a pattern
-it's reasonable to assign the group object to these members.
-Details on the `assign where` syntax can be found [here](#group-assign)
+ template Notification "generic-notification" {
- object HostGroup "mssql" {
- display_name = "MSSQL Servers"
- assign where host.vars.mssql_port
+ states = [ Warning, Critical, Unknown ]
+ types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
+ FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
}
-In this inherited example from above all hosts with the `var` `mssql_port`
-set will be added as members to the host group `mssql`.
+If you are familiar with Icinga 1.x `notification_options` please note that they have been split
+into type and state to allow more fine-grained filtering, for example on downtimes and flapping.
+You can filter for acknowledgements and custom notifications too.
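+
+As a sketch of such a filter on the `User` object, the following hypothetical on-call user
+is only notified about critical problems and their recoveries. `OK` is included in
+`states` on the assumption that recovery notifications need it, mirroring the
+`icingaadmin` example:
+
+    object User "oncall-critical-only" {
+      display_name = "On-call (critical only)"
+
+      // warnings and unknown states are filtered out for this user
+      states = [ OK, Critical ]
+      types = [ Problem, Recovery ]
+
+      email = "oncall@example.com"
+    }
+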
## <a id="timeperiods"></a> Time Periods
}
-
## <a id="commands"></a> Commands
Icinga 2 uses three different command object types to specify how
-checks should be performed, notifications should be sent and
+checks should be performed, notifications should be sent, and
events should be handled.
### <a id="command-environment-variables"></a> Environment Variables for Commands
-Please check [Runtime Custom Attributes as Environment Variables](#runtime-custom-attribute-env-vars).
+Please check [Runtime Custom Attributes as Environment Variables](3-monitoring-basics.md#runtime-custom-attribute-env-vars).
### <a id="check-commands"></a> Check Commands
-`CheckCommand` objects define the command line how a check is called.
+[CheckCommand](12-object-types.md#objecttype-checkcommand) objects define the command line
+used to call a check.
+
+[CheckCommand](12-object-types.md#objecttype-checkcommand) objects are referenced by
+[Host](12-object-types.md#objecttype-host) and [Service](12-object-types.md#objecttype-service) objects
+using the `check_command` attribute.
+
+> **Note**
+>
+> Make sure that the [checker](5-cli-commands.md#features) feature is enabled in order to
+> execute checks.
#### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
-`CheckCommand` objects require the [ITL template](#itl-plugin-check-command)
+[CheckCommand](12-object-types.md#objecttype-checkcommand) objects require the [ITL template](13-icinga-template-library.md#itl-plugin-check-command)
`plugin-check-command` to support native plugin based check methods.
Unless you have done so already, download your check plugin and put it
-into the `PluginDir` directory. The following example uses the
+into the [PluginDir](2-getting-started.md#constants-conf) directory. The following example uses the
`check_disk` plugin shipped with the Monitoring Plugins package.
The plugin path and all command arguments are made a list of
[-t timeout] [-u unit] [-v] [-X type] [-N type]
...
+> **Note**
+>
+> Don't execute plugins as `root` and always use the absolute path to the plugin! Trust us.
+
Next step is to understand how command parameters are being passed from
-a host or service object, and add a `CheckCommand` definition based on these
-required parameters and/or default values.
+a host or service object, and add a [CheckCommand](12-object-types.md#objecttype-checkcommand)
+definition based on these required parameters and/or default values.
#### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
-Unline Icinga 1.x check command parameters are defined as custom attributes
-which can be accessed as runtime macros by the executed check command.
+Check command parameters are defined as custom attributes which can be accessed as runtime macros
+by the executed check command.
Define the default check command custom attribute `disk_wfree` and `disk_cfree`
(freely definable naming schema) and their default threshold values. You can
-then use these custom attributes as runtime macros on the command line.
+then use these custom attributes as runtime macros for [command arguments](3-monitoring-basics.md#command-arguments)
+on the command line.
+
+> **Tip**
+>
+> Use a common command type as prefix for your command arguments to increase
+> readability. `disk_wfree` helps understanding the context better than just
+> `wfree` as argument.
The default custom attributes can be overridden by the custom attributes
-defined in the service using the check command `disk`. The custom attributes
+defined in the service using the check command `my-disk`. The custom attributes
can also be inherited from a parent template using additive inheritance (`+=`).
- object CheckCommand "disk" {
+ object CheckCommand "my-disk" {
import "plugin-check-command"
- command = [
- PluginDir + "/check_disk",
- "-w", "$disk_wfree$%",
- "-c", "$disk_cfree$%"
- ],
+ command = [ PluginDir + "/check_disk" ]
+
+ arguments = {
+ "-w" = "$disk_wfree$%"
+ "-c" = "$disk_cfree$%"
+ "-W" = "$disk_inode_wfree$%"
+ "-K" = "$disk_inode_cfree$%"
+ "-p" = "$disk_partitions$"
+ "-x" = "$disk_partitions_excluded$"
+ }
vars.disk_wfree = 20
vars.disk_cfree = 10
}
-The host `localhost` with the service `disk` checks all disks with modified
-custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
+> **Note**
+>
+> A proper example for the `check_disk` plugin is already shipped with Icinga 2
+> ready to use with the [plugin check commands](13-icinga-template-library.md#plugin-check-command-disk).
+
+The host `my-server` with the applied service `basic-partitions` checks a basic set of disk partitions
+with modified custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
free disk space).
- object Host "localhost" {
- import "generic-host"
+The custom attribute `disk_partitions` can either hold a single string or an array of
+string values for passing multiple partitions to the `check_disk` check plugin.
+ object Host "my-server" {
+ import "generic-host"
address = "127.0.0.1"
address6 = "::1"
+
+ vars.local_disks["basic-partitions"] = {
+ disk_partitions = [ "/", "/tmp", "/var", "/home" ]
+ }
}
- object Service "disk" {
+ apply Service for (disk => config in host.vars.local_disks) {
import "generic-service"
+ check_command = "my-disk"
- host_name = "localhost"
- check_command = "disk"
+ vars += config
vars.disk_wfree = 10
vars.disk_cfree = 5
+
+ assign where host.vars.local_disks
+ }
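
As a rough illustration of how the apply-for rule above expands, the following Python sketch models it with plain dictionaries. It is a simplified model and not how Icinga 2 is implemented: the function name `expand_apply_for` and the dictionary layout are assumptions made for this example. It only shows that one service is generated per key in `vars.local_disks`, and that values set in the service take precedence over the defaults of the `my-disk` check command.

```python
# Simplified model of `apply Service for (disk => config in host.vars.local_disks)`.
# All names are illustrative; this is not Icinga 2's actual implementation.

# Default custom attributes defined in CheckCommand "my-disk".
COMMAND_DEFAULTS = {"disk_wfree": 20, "disk_cfree": 10}

# Custom attributes of Host "my-server".
HOST_VARS = {
    "local_disks": {
        "basic-partitions": {
            "disk_partitions": ["/", "/tmp", "/var", "/home"],
        },
    },
}


def expand_apply_for(host_vars):
    """Return the effective custom attributes for each generated service."""
    services = {}
    for name, config in host_vars.get("local_disks", {}).items():
        service_vars = {}
        service_vars.update(config)      # vars += config
        service_vars["disk_wfree"] = 10  # vars.disk_wfree = 10
        service_vars["disk_cfree"] = 5   # vars.disk_cfree = 5
        # Attributes set in the service override the check command defaults.
        services[name] = {**COMMAND_DEFAULTS, **service_vars}
    return services


services = expand_apply_for(HOST_VARS)
print(services["basic-partitions"])
```

Adding a second key to `local_disks` would produce a second service with its own set of partitions, without touching the apply rule itself.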
+
+
+More details on using arrays in custom attributes can be found in
+[this chapter](3-monitoring-basics.md#runtime-custom-attributes).
+
+
+#### <a id="command-arguments"></a> Command Arguments
+
+By defining a check command line using the `command` attribute Icinga 2
+will resolve all macros in the static string or array. Sometimes it is
+required to extend the arguments list based on a condition evaluated
+at command execution, or to make arguments optional so that they are only
+set if the macro value can be resolved by Icinga 2.
+
+ object CheckCommand "check_http" {
+ import "plugin-check-command"
+
+ command = [ PluginDir + "/check_http" ]
+
+ arguments = {
+ "-H" = "$http_vhost$"
+ "-I" = "$http_address$"
+ "-u" = "$http_uri$"
+ "-p" = "$http_port$"
+ "-S" = {
+ set_if = "$http_ssl$"
+ }
+ "--sni" = {
+ set_if = "$http_sni$"
+ }
+ "-a" = {
+ value = "$http_auth_pair$"
+ description = "Username:password on sites with basic authentication"
+ }
+ "--no-body" = {
+ set_if = "$http_ignore_body$"
+ }
+ "-r" = "$http_expect_body_regex$"
+ "-w" = "$http_warn_time$"
+ "-c" = "$http_critical_time$"
+ "-e" = "$http_expect$"
+ }
+
+ vars.http_address = "$address$"
+ vars.http_ssl = false
+ vars.http_sni = false
+ }
+
+The example shows the `check_http` check command defining the most common
+arguments. Each of them is optional by default and will be omitted if
+the value is not set. For example if the service calling the check command
+does not have `vars.http_port` set, it won't get added to the command
+line.
+
+If the `vars.http_ssl` custom attribute is set in the service, host or command
+object definition, Icinga 2 will add the `-S` argument based on the `set_if`
+numeric value to the command line. String values are not supported.
+
+If the macro value cannot be resolved, Icinga 2 will not add the defined argument
+to the final command argument array. Empty strings for macro values won't omit
+the argument.
+
+That way you can use the `check_http` command definition for checks both with and
+without SSL enabled, saving you duplicated command definitions.
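
The rules described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Icinga 2's implementation: the helper names `resolve` and `build_command` are hypothetical, the plugin path is only an example, nested macros such as `$address$` are not followed, and argument ordering is simply the order of definition.

```python
import re

MACRO = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def resolve(template, variables):
    """Replace $name$ macros; return None if any macro is not set."""
    missing = False

    def substitute(match):
        nonlocal missing
        name = match.group(1)
        if name not in variables:
            missing = True
            return ""
        return str(variables[name])

    result = MACRO.sub(substitute, template)
    return None if missing else result


def build_command(command, arguments, variables):
    """Build the final argument list from a command and its `arguments`."""
    line = list(command)
    for key, spec in arguments.items():
        if isinstance(spec, str):
            spec = {"value": spec}
        if "set_if" in spec:
            flag = variables.get(spec["set_if"].strip("$"), 0)
            # Only numeric/boolean values are honoured; strings are ignored.
            if isinstance(flag, str) or not flag:
                continue
        if "value" in spec:
            value = resolve(spec["value"], variables)
            if value is None:
                continue  # unresolved macro: omit the whole argument
            line += [key, value]
        else:
            line.append(key)  # bare switch such as -S
    return line


ARGUMENTS = {
    "-I": "$http_address$",
    "-p": "$http_port$",
    "-S": {"set_if": "$http_ssl$"},
}
PLUGIN = ["/usr/lib/nagios/plugins/check_http"]  # example path only

plain = build_command(PLUGIN, ARGUMENTS,
                      {"http_address": "192.168.1.100", "http_ssl": False})
ssl = build_command(PLUGIN, ARGUMENTS,
                    {"http_address": "192.168.1.100", "http_port": 443, "http_ssl": True})
print(plain)
print(ssl)
```

In this model the first call yields only `-I 192.168.1.100`, while the second adds `-p 443` and the `-S` switch. An empty string counts as a resolved value, so the argument is kept in that case.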
+
+Details on all available options can be found in the
+[CheckCommand object definition](12-object-types.md#objecttype-checkcommand).
+
+### <a id="using-apply-services-command-arguments"></a> Apply Services with Custom Command Arguments
+
+Imagine the following scenario: The `my-host1` host is reachable using the default port 22, while
+the `my-host2` host requires a different port on 2222. Both hosts are in the hostgroup `my-linux-servers`.
+
+ object HostGroup "my-linux-servers" {
+ display_name = "Linux Servers"
+ assign where host.vars.os == "Linux"
+ }
+
+ /* this one has port 22 opened */
+ object Host "my-host1" {
+ import "generic-host"
+ address = "129.168.1.50"
+ vars.os = "Linux"
+ }
+
+ /* this one listens on a different ssh port */
+ object Host "my-host2" {
+ import "generic-host"
+ address = "129.168.2.50"
+ vars.os = "Linux"
+ vars.custom_ssh_port = 2222
+ }
+
+All hosts in the `my-linux-servers` hostgroup should get the `my-ssh` service applied based on an
+[apply rule](10-language-reference.md#apply). The optional `ssh_port` command argument should be inherited from the host
+the service is applied to. If not set, the check command `my-ssh` will omit the argument.
+The `host` argument is special: `skip_key` tells Icinga 2 to ignore the key, and directly put the
+value onto the command line. The `order` attribute specifies that this argument is the first one
+(`-1` is smaller than the other defaults).
+
+ object CheckCommand "my-ssh" {
+ import "plugin-check-command"
+
+ command = [ PluginDir + "/check_ssh" ]
+
+ arguments = {
+ "-p" = "$ssh_port$"
+ "host" = {
+ value = "$ssh_address$"
+ skip_key = true
+ order = -1
+ }
+ }
+
+ vars.ssh_address = "$address$"
+ }
+
+ /* apply ssh service */
+ apply Service "my-ssh" {
+ import "generic-service"
+ check_command = "my-ssh"
+
+ //set the command argument for ssh port with a custom host attribute, if set
+ vars.ssh_port = "$host.vars.custom_ssh_port$"
+
+ assign where "my-linux-servers" in host.groups
}
+The `my-host1` host will get the `my-ssh` service checking on the default port:
+
+ [2014-05-26 21:52:23 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '129.168.1.50': PID 27281
+
+The `my-host2` host passes its `custom_ssh_port` variable on to the service, which then executes a different command:
+
+ [2014-05-26 21:51:32 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '-p', '2222', '129.168.2.50': PID 26956
+
### <a id="notification-commands"></a> Notification Commands
-`NotificationCommand` objects define how notifications are delivered to external
+[NotificationCommand](12-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
interfaces (E-Mail, XMPP, IRC, Twitter, etc).
-`NotificationCommand` objects require the [ITL template](#itl-plugin-notification-command)
+[NotificationCommand](12-object-types.md#objecttype-notificationcommand) objects are referenced by
+[Notification](12-object-types.md#objecttype-notification) objects using the `command` attribute.
+
+`NotificationCommand` objects require the [ITL template](13-icinga-template-library.md#itl-plugin-notification-command)
`plugin-notification-command` to support native plugin-based notifications.
+> **Note**
+>
+> Make sure that the [notification](5-cli-commands.md#features) feature is enabled on your master instance
+> in order to execute notification commands.
+
Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
the current check output) sending an email to the user(s) associated with the
notification itself (`$user.email$`).
command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
env = {
- "NOTIFICATIONTYPE" = "$notification.type$"
- "SERVICEDESC" = "$service.name$"
- "HOSTALIAS" = "$host.display_name$",
- "HOSTADDRESS" = "$address$",
- "SERVICESTATE" = "$service.state$",
- "LONGDATETIME" = "$icinga.long_date_time$",
- "SERVICEOUTPUT" = "$service.output$",
- "NOTIFICATIONAUTHORNAME" = "$notification.author$",
- "NOTIFICATIONCOMMENT" = "$notification.comment$",
- "HOSTDISPLAYNAME" = "$host.display_name$",
- "SERVICEDISPLAYNAME" = "$service.display_name$",
- "USEREMAIL" = "$user.email$"
+ NOTIFICATIONTYPE = "$notification.type$"
+ SERVICEDESC = "$service.name$"
+ HOSTALIAS = "$host.display_name$"
+ HOSTADDRESS = "$address$"
+ SERVICESTATE = "$service.state$"
+ LONGDATETIME = "$icinga.long_date_time$"
+ SERVICEOUTPUT = "$service.output$"
+ NOTIFICATIONAUTHORNAME = "$notification.author$"
+ NOTIFICATIONCOMMENT = "$notification.comment$"
+ HOSTDISPLAYNAME = "$host.display_name$"
+ SERVICEDISPLAYNAME = "$service.display_name$"
+ USEREMAIL = "$user.email$"
}
}
/usr/bin/printf "%b" $template | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" $USEREMAIL
+> **Note**
+>
+> This example is for `exim` only. Requires changes for `sendmail` and
+> other MTAs.
+
While it's possible to specify the entire notification command right
in the NotificationCommand object it is generally advisable to create a
shell script in the `/etc/icinga2/scripts` directory and have the
### <a id="event-commands"></a> Event Commands
-Unlike notifications event commands are called on every service state change
-if defined. Therefore the `EventCommand` object should define a command line
+Unlike notifications, event commands for hosts/services are called on every
+check execution if one of these conditions matches:
+
+* The host/service is in a [soft state](3-monitoring-basics.md#hard-soft-states)
+* The host/service state changes into a [hard state](3-monitoring-basics.md#hard-soft-states)
+* The host/service state recovers from a [soft or hard state](3-monitoring-basics.md#hard-soft-states) to [OK](3-monitoring-basics.md#service-states)/[Up](3-monitoring-basics.md#host-states)
+
+[EventCommand](12-object-types.md#objecttype-eventcommand) objects are referenced by
+[Host](12-object-types.md#objecttype-host) and [Service](12-object-types.md#objecttype-service) objects
+using the `event_command` attribute.
+
+Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
-available through runtime vars. Runtime macros such as `$SERVICESTATETYPE$`
-and `$SERVICESTATE$` will be processed by Icinga 2 helping on fine-granular
+available through runtime vars. Runtime macros such as `$service.state_type$`
+and `$service.state$` will be processed by Icinga 2 helping on fine-granular
events being triggered.
Common use case scenarios are a failing HTTP check requiring an immediate
`EventCommand` objects require the ITL template `plugin-event-command`
to support native plugin based checks.
-When the event command is triggered on a service state change, it will
-send a check result using the `process_check_result` script forcibly
-changing the service state back to `OK` (`-r 0`) providing some debug
-information in the check output (`-o`).
+#### <a id="event-command-restart-service-daemon"></a> Use Event Commands to Restart Service Daemon
- object EventCommand "plugin-event-process-check-result" {
- import "plugin-event-command"
+The following example will trigger a restart of the `httpd` daemon
+via ssh when the `http` service check fails. If the service state is
+`OK`, it will not trigger any event action.
- command = [
- PluginDir + "/process_check_result",
- "-H", "$host.name$",
- "-S", "$service.name$",
- "-c", LocalStateDir + "/run/icinga2/cmd/icinga2.cmd",
- "-r", "0",
- "-o", "Event Handler triggered in state '$service.state$' with output '$service.output$'."
- ]
- }
+Requirements:
-### <a id="commands-arguments"></a> Command Arguments
+* ssh connection
+* icinga user with public key authentication
+* icinga user with sudo permissions for restarting the httpd daemon.
-By defining a check command line using the `command` attribute Icinga 2
-will resolve all macros in the static string or array. Sometimes it is
-required to extend the arguments list based on a met condition evaluated
-at command execution. Or making arguments optional - only set if the
-macro value can be resolved by Icinga 2.
+Example on Debian:
+
+ # ls /home/icinga/.ssh/
+ authorized_keys
+
+ # visudo
+ icinga ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
- object CheckCommand "check_http" {
- import "plugin-check-command"
- command = PluginDir + "/check_http"
+Define a generic [EventCommand](12-object-types.md#objecttype-eventcommand) object `event_by_ssh`
+which can be used for all event commands triggered using ssh:
+
+ /* pass event commands through ssh */
+ object EventCommand "event_by_ssh" {
+ import "plugin-event-command"
+
+ command = [ PluginDir + "/check_by_ssh" ]
arguments = {
- "-H" = "$http_vhost$"
- "-I" = "$http_address$"
- "-u" = "$http_uri$"
- "-p" = "$http_port$"
- "-S" = {
- set_if = "$http_ssl$"
+ "-H" = "$event_by_ssh_address$"
+ "-p" = "$event_by_ssh_port$"
+ "-C" = "$event_by_ssh_command$"
+ "-l" = "$event_by_ssh_logname$"
+ "-i" = "$event_by_ssh_identity$"
+ "-q" = {
+ set_if = "$event_by_ssh_quiet$"
}
- "-w" = "$http_warn_time$"
- "-c" = "$http_critical_time$"
+ "-w" = "$event_by_ssh_warn$"
+ "-c" = "$event_by_ssh_crit$"
+ "-t" = "$event_by_ssh_timeout$"
}
- vars.http_address = "$address$"
- vars.http_ssl = false
+ vars.event_by_ssh_address = "$address$"
+ vars.event_by_ssh_quiet = false
}
-The example shows the `check_http` check command defining the most common
-arguments. Each of them is optional by default and will be omitted if
-the value is not set. For example if the service calling the check command
-does not have `vars.http_port` set, it won't get added to the command
-line.
-If the `vars.http_ssl` custom attribute is set in the service, host or command
-object definition, Icinga 2 will add the `-S` argument based on the `set_if`
-option to the command line.
-That way you can use the `check_http` command definition for both, with and
-without SSL enabled checks saving you duplicated command definitions.
+The actual event command only passes the `event_by_ssh_command` attribute.
+The `event_by_ssh_service` custom attribute takes care of passing the correct
+daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
+is only restarted when the service is in a non-`OK` state.
-Details on all available options can be found in the
-[CheckCommand object definition](#objecttype-checkcommand).
+ object EventCommand "event_by_ssh_restart_service" {
+ import "event_by_ssh"
-## <a id="notifications"></a> Notifications
+ //only restart the daemon if state > 0 (not-ok)
+ //requires sudo permissions for the icinga user
+ vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart"
+ }
-Notifications for service and host problems are an integral part of your
-monitoring setup.
-When a host or service is in a downtime, a problem has been acknowledged or
-the dependency logic determined that the host/service is unreachable, no
-notirications are sent. You can configure additional type and state filters
-refining the notifications being actually sent.
+Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
+which service should be restarted using the `event_by_ssh_service` attribute.
-There are many ways of sending notifications, e.g. by e-mail, XMPP,
-IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
-Instead it relies on external mechanisms such as shell scripts to notify users.
+ object Service "http" {
+ import "generic-service"
+ host_name = "remote-http-host"
+ check_command = "http"
-A notification specification requires one or more users (and/or user groups)
-who will be notified in case of problems. These users must have all custom
-attributes defined which will be used in the `NotificationCommand` on execution.
+ event_command = "event_by_ssh_restart_service"
+ vars.event_by_ssh_service = "$host.vars.httpd_name$"
-The user `icingaadmin` in the example below will get notified only on `WARNING` and
-`CRITICAL` states and `problem` and `recovery` notification types.
+ //vars.event_by_ssh_logname = "icinga"
+ //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa.pub"
+ }
- object User "icingaadmin" {
- display_name = "Icinga 2 Admin"
- enable_notifications = true
- states = [ OK, Warning, Critical ]
- types = [ Problem, Recovery ]
- email = "icinga@localhost"
+
+Each host with this service must then define the `httpd_name` custom attribute
+(for example generated from your CMDB):
+
+ object Host "remote-http-host" {
+ import "generic-host"
+ address = "192.168.1.100"
+
+ vars.httpd_name = "apache2"
}
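
To make the macro chain in this example easier to follow, here is a minimal Python sketch of the expansion. It is an illustrative model only: the `expand` helper and the flat lookup table are assumptions for this example, not Icinga 2 internals. The state id `2` stands for a `CRITICAL` service state.

```python
import re

MACRO = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def expand(text, values, depth=0):
    """Recursively replace $name$ macros using the `values` lookup table."""
    if depth > 10:
        raise ValueError("macro recursion too deep")

    def substitute(match):
        # A macro value may itself contain macros, so expand it again.
        return expand(str(values[match.group(1)]), values, depth + 1)

    return MACRO.sub(substitute, text)


values = {
    "service.state_id": 2,                                # CRITICAL
    "event_by_ssh_service": "$host.vars.httpd_name$",     # set on the service
    "host.vars.httpd_name": "apache2",                    # set on the host
}

command = expand(
    "test $service.state_id$ -gt 0 && sudo /etc/init.d/$event_by_ssh_service$ restart",
    values,
)
print(command)
```

With a state id of `0` the expanded string would start with `test 0 -gt 0`, which fails in the remote shell, so the part after `&&` is never executed.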
-If you don't set the `states` and `types`
-configuration attributes for the `User` object, notifications for all states and types
-will be sent.
+You can test-drive this example by manually stopping the `httpd` daemon
+on your `remote-http-host`. Enable the `debuglog` feature and tail the
+`/var/log/icinga2/debug.log` file.
-You should choose which information you (and your notified users) are interested in
-case of emergency, and also which information does not provide any value to you and
-your environment.
+Remote Host Terminal:
-An example notification command is explained [here](#notification-commands).
+ # date; service apache2 status
+ Mon Sep 15 18:57:39 CEST 2014
+ Apache2 is running (pid 23651).
+ # date; service apache2 stop
+ Mon Sep 15 18:57:47 CEST 2014
+ [ ok ] Stopping web server: apache2 ... waiting .
-You can add all shared attributes to a `Notification` template which is inherited
-to the defined notifications. That way you'll save duplicated attributes in each
-`Notification` object. Attributes can be overridden locally.
+Icinga 2 Host Terminal:
+ [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100': PID 32622
+ [2014-09-15 18:58:32 +0200] notice/Process: PID 32622 ('/usr/lib64/nagios/plugins/check_http' '-I' '192.168.1.100') terminated with exit code 2
+ [2014-09-15 18:58:32 +0200] notice/Checkable: State Change: Checkable remote-http-host!http soft state change from OK to CRITICAL detected.
+ [2014-09-15 18:58:32 +0200] notice/Checkable: Executing event handler 'event_by_ssh_restart_service' for service 'remote-http-host!http'
+ [2014-09-15 18:58:32 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100': PID 32623
+ [2014-09-15 18:58:33 +0200] notice/Process: PID 32623 ('/usr/lib64/nagios/plugins/check_by_ssh' '-C' 'test 2 -gt 0 && sudo /etc/init.d/apache2 restart' '-H' '192.168.1.100') terminated with exit code 0
- template Notification "generic-notification" {
- interval = 15m
+Remote Host Terminal:
- command = "mail-service-notification"
+ # date; service apache2 status
+ Mon Sep 15 18:58:44 CEST 2014
+ Apache2 is running (pid 24908).
- states = [ Warning, Critical, Unknown ]
- types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
- FlappingEnd, DowntimeStart,DowntimeEnd, DowntimeRemoved ]
- period = "24x7"
- }
-The time period `24x7` is shipped as example configuration with Icinga 2.
-Use the `apply` keyword to create `Notification` objects for your services:
+## <a id="dependencies"></a> Dependencies
- apply Notification "mail" to Service {
- import "generic-notification"
+Icinga 2 uses host and service [Dependency](12-object-types.md#objecttype-dependency) objects
+for determining their network reachability.
- command = "mail-notification"
- users = [ "icingaadmin" ]
+A service can depend on a host, and vice versa. A service has an implicit
+dependency (parent) to its host. A host to host dependency acts implicitly
+as host parent relation.
+When dependencies are calculated, not only the immediate parent is taken into
+account but all parents are inherited.
- assign where service.name == "mysql"
+The `parent_host_name` and `parent_service_name` attributes are mandatory for
+service dependencies, `parent_host_name` is required for host dependencies.
+[Apply rules](3-monitoring-basics.md#using-apply) will allow you to
+[determine these attributes](3-monitoring-basics.md#dependencies-apply-custom-attributes) in a more
+dynamic fashion if required.
+
+ parent_host_name = "core-router"
+ parent_service_name = "uplink-port"
+
+Notifications are suppressed by default if a host or service becomes unreachable.
+You can control that option by defining the `disable_notifications` attribute.
+
+ disable_notifications = false
+
+The dependency state filter must be defined based on the parent object being
+either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`).
+
+The following example will make the dependency fail and trigger it if the parent
+object is **not** in one of these states:
+
+ states = [ OK, Critical, Unknown ]
+
+Rephrased: If the parent service object changes into the `Warning` state, this
+dependency will fail and render all child objects (hosts or services) unreachable.
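
The state filter and the inheritance of parents can be modelled with a short Python sketch. This is a simplified illustration, not the algorithm Icinga 2 uses: the function names, the `host!service` keys and the child service `my-server1!http` are assumptions for this example, and cyclic dependencies are not handled.

```python
def dependency_holds(parent_state, allowed_states):
    """A dependency holds while the parent is in one of the listed states."""
    return parent_state in allowed_states


def is_reachable(name, dependencies, current_states):
    """An object is reachable only if every dependency up the chain holds."""
    for parent, allowed_states in dependencies.get(name, []):
        if not dependency_holds(current_states[parent], allowed_states):
            return False
        # Parents are inherited: check the parent's own dependencies too.
        if not is_reachable(parent, dependencies, current_states):
            return False
    return True


# child -> list of (parent, states) pairs
DEPENDENCIES = {
    "my-server1!http": [("core-router!uplink-port", ["OK", "Critical", "Unknown"])],
    "core-router!uplink-port": [("core-router", ["Up"])],  # service on its host
}

parent_warning = is_reachable(
    "my-server1!http", DEPENDENCIES,
    {"core-router": "Up", "core-router!uplink-port": "Warning"})
parent_critical = is_reachable(
    "my-server1!http", DEPENDENCIES,
    {"core-router": "Up", "core-router!uplink-port": "Critical"})
router_down = is_reachable(
    "my-server1!http", DEPENDENCIES,
    {"core-router": "Down", "core-router!uplink-port": "OK"})
print(parent_warning, parent_critical, router_down)
```

Only the `Warning` state of the parent service, or a failed dependency further up the chain, makes the child unreachable in this model.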
+
+You can determine the child's reachability by querying the `is_reachable` attribute,
+for example in [DB IDO](14-appendix.md#schema-db-ido-extensions).
+
+### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
+
+Icinga 2 automatically adds an implicit dependency for services on their host. That way
+service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
+does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
+`states = [ Up ]` for all service objects.
+
+Service checks are still executed. If you want to prevent them from happening, you can
+apply the following dependency to all services setting their host as `parent_host_name`
+and disabling the checks. `assign where true` matches on all `Service` objects.
+
+ apply Dependency "disable-host-service-checks" to Service {
+ disable_checks = true
+ assign where true
}
-Instead of assigning users to notifications, you can also add the `user_groups`
-attribute with a list of user groups to the `Notification` object. Icinga 2 will
-send notifications to all group members.
+### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
-### <a id="notification-escalations"></a> Notification Escalations
+A common scenario is the Icinga 2 server behind a router. Checking internet
+access by pinging the Google DNS server `google-dns` is a common method, but
+will fail in case the `dsl-router` host is down. Therefore the example below
+defines a host dependency which acts implicitly as parent relation too.
-When a problem notification is sent and a problem still exists after re-notification
-you may want to escalate the problem to the next support level. A different approach
-is to configure the default notification by email, and escalate the problem via sms
-if not already solved.
+Furthermore the host may be reachable but ping probes are dropped by the
+router's firewall. In case the `dsl-router` `ping4` service check fails, all
+further checks for the `ping4` service on the `google-dns` host should
+be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
+
+ object Host "dsl-router" {
+ import "generic-host"
+ address = "192.168.1.1"
+ }
-You can define notification start and end times as additional configuration
-attributes making the `Notification` object a so-called `notification escalation`.
-Using templates you can share the basic notification attributes such as users or the
-`interval` (and override them for the escalation then).
+ object Host "google-dns" {
+ import "generic-host"
+ address = "8.8.8.8"
+ }
-Using the example from above, you can define additional users being escalated for sms
-notifications between start and end time.
+ apply Service "ping4" {
+ import "generic-service"
- object User "icinga-oncall-2nd-level" {
- display_name = "Icinga 2nd Level"
+ check_command = "ping4"
- vars.mobile = "+1 555 424642"
+ assign where host.address
}
- object User "icinga-oncall-1st-level" {
- display_name = "Icinga 1st Level"
+ apply Dependency "internet" to Host {
+ parent_host_name = "dsl-router"
+ disable_checks = true
+ disable_notifications = true
- vars.mobile = "+1 555 424642"
+ assign where host.name != "dsl-router"
}
-Define an additional `NotificationCommand` for SMS notifications.
-
-> **Note**
->
-> The example is not complete as there are many different SMS providers.
-> Please note that sending SMS notifications will require an SMS provider
-> or local hardware with a SIM card active.
+ apply Dependency "internet" to Service {
+ parent_host_name = "dsl-router"
+ parent_service_name = "ping4"
+ disable_checks = true
- object NotificationCommand "sms-notification" {
- command = [
- PluginDir + "/send_sms_notification",
- "$mobile$",
- "..."
+ assign where host.name != "dsl-router"
}
-The two new notification escalations are added onto the host `localhost`
-and its service `ping4` using the `generic-notification` template.
-The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
-command) after `30m` until `1h`.
+### <a id="dependencies-apply-custom-attributes"></a> Apply Dependencies based on Custom Attributes
-> **Note**
->
-> The `interval` was set to 15m in the `generic-notification`
-> template example. Lower that value in your escalations by using a secondary
-> template or overriding the attribute directly in the `notifications` array
-> position for `escalation-sms-2nd-level`.
+You can use [apply rules](3-monitoring-basics.md#using-apply) to set parent or
+child attributes, e.g. `parent_host_name`, to other object's
+attributes.
-If the problem does not get resolved or acknowledged preventing further notifications
-the `escalation-sms-1st-level` user will be escalated `1h` after the initial problem was
-notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
+A common example is virtual machines hosted on a master. The object
+name of that master is auto-generated from your CMDB or VMware inventory
+into the host's custom attributes (or a generic template for your
+cloud).
- apply Notification "mail" to Service {
- import "generic-notification"
+Define your master host object:
- command = "mail-notification"
- users = [ "icingaadmin" ]
+ /* your master */
+ object Host "master.example.com" {
+ import "generic-host"
+ }
- assign where service.name == "ping4"
+Add a generic template defining all common host attributes:
+
+ /* generic template for your virtual machines */
+ template Host "generic-vm" {
+ import "generic-host"
}
- apply Notification "escalation-sms-2nd-level" to Service {
- import "generic-notification"
+Add a template for all hosts on your example.com cloud setting
+custom attribute `vm_parent` to `master.example.com`:
- command = "sms-notification"
- users = [ "icinga-oncall-2nd-level" ]
+ template Host "generic-vm-master.example.com" {
+ import "generic-vm"
+ vars.vm_parent = "master.example.com"
+ }
- times = {
- begin = 30m
- end = 1h
- }
+Define your guest hosts:
- assign where service.name == "ping4"
+ object Host "www.example1.com" {
+ import "generic-vm-master.example.com"
}
- apply Notification "escalation-sms-1st-level" to Service {
- import "generic-notification"
+ object Host "www.example2.com" {
+ import "generic-vm-master.example.com"
+ }
- command = "sms-notification"
- users = [ "icinga-oncall-1st-level" ]
+Apply the host dependency to all child hosts importing the
+`generic-vm` template and set the `parent_host_name`
+to the previously defined custom attribute `host.vars.vm_parent`.
- times = {
- begin = 1h
- end = 2h
- }
+ apply Dependency "vm-host-to-parent-master" to Host {
+ parent_host_name = host.vars.vm_parent
+ assign where "generic-vm" in host.templates
+ }
- assign where service.name == "ping4"
+You can extend this example, and make your services depend on the
+`master.example.com` host too. Their local scope allows you to use
+`host.vars.vm_parent` similar to the example above.
+
+ apply Dependency "vm-service-to-parent-master" to Service {
+ parent_host_name = host.vars.vm_parent
+ assign where "generic-vm" in host.templates
}
-### <a id="first-notification-delay"></a> First Notification Delay
+That way you don't need to wait for your guest hosts to become
+unreachable when the master host goes down. Instead the services
+will detect their reachability immediately when executing checks.
+
+> **Note**
+>
+> This method with setting locally scoped variables only works in
+> apply rules, but not in object definitions.
-Sometimes the problem in question should not be notified when the first notification
-happens, but a defined time duration afterwards. In Icinga 2 you can use the `times`
-dictionary and set `begin = 15m` as key and value if you want to suppress notifications
-in the first 15 minutes. Leave out the `end` key - if not set, Icinga 2 will not check against any
-end time for this notification.
- apply Notification "mail" to Service {
- import "generic-notification"
+### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
- command = "mail-notification"
- users = [ "icingaadmin" ]
+Another classic example are agent based checks. You would define a health check
+for the agent daemon responding to your requests, and make all other services
+querying that daemon depend on that health check.
- times.begin = 15m // delay first notification
+The following configuration defines two nrpe based service checks `nrpe-load`
+and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
+`nrpe-health` service.
- assign where service.name == "ping4"
+ apply Service "nrpe-health" {
+ import "generic-service"
+ check_command = "nrpe"
+ assign where match("nrpe-*", host.name)
}
-### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
+ apply Service "nrpe-load" {
+ import "generic-service"
+ check_command = "nrpe"
+ vars.nrpe_command = "check_load"
+ assign where match("nrpe-*", host.name)
+ }
-If there are no notification state and type filter attributes defined at the `Notification`
-or `User` object Icinga 2 assumes that all states and types are being notified.
+ apply Service "nrpe-disk" {
+ import "generic-service"
+ check_command = "nrpe"
+ vars.nrpe_command = "check_disk"
+ assign where match("nrpe-*", host.name)
+ }
-Available state and type filters for notifications are:
+ object Host "nrpe-server" {
+ import "generic-host"
+ address = "192.168.1.5"
+ }
- template Notification "generic-notification" {
+ apply Dependency "disable-nrpe-checks" to Service {
+ parent_service_name = "nrpe-health"
- states = [ Warning, Critical, Unknown ]
- types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
- FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
+ states = [ OK ]
+ disable_checks = true
+ disable_notifications = true
+ assign where service.check_command == "nrpe"
+ ignore where service.name == "nrpe-health"
}
-If you are familiar with Icinga 1.x `notification_options` please note that they have been split
-into type and state, and allow more fine granular filtering for example on downtimes and flapping.
-You can filter for acknowledgements and custom notifications too.
+The `disable-nrpe-checks` dependency is applied to all services
+on the `nrpe-server` host whose `check_command` attribute is `nrpe`,
+but not to the `nrpe-health` service itself.
## <a id="downtimes"></a> Downtimes
Planned downtimes will also be taken into account for SLA reporting
tools calculating the SLAs based on the state and downtime history.
-Downtimes may overlap with their start and end times. If there
-are multiple downtimes triggered for one object, the overall downtime depth
-will be more than `1`. This is useful when you want to extend
-your maintenance window taking longer than expected.
+Multiple downtimes for a single object may overlap. This is useful
+when a maintenance window takes longer than expected and you need to extend it.
+If there are multiple downtimes triggered for one object, the overall downtime depth
+will be greater than `1`.
+
If the downtime was scheduled after the problem changed to a critical hard
state triggering a problem notification, and the service recovers during
all problems should be alerted again. Solution is simple -
schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
-Unlike a `fixed` downtime, a `flexible` downtime end does not necessarily
-happen at the provided end time. Instead the downtime will be triggered
-by the state change in the time span defined by start and end time, but
-then last a defined duration in minutes.
+Unlike a `fixed` downtime, a `flexible` downtime is triggered
+by a state change within the time span defined by start and end time,
+and then lasts for the specified duration in minutes.
Imagine the following scenario: Your service is frequently polled
by users trying to grab free deleted domains for immediate registration.
### <a id="scheduling-downtime"></a> Scheduling a downtime
-This can either happen through a web interface or by sending an [external command](#external-commands)
+This can either happen through a web interface or by sending an [external command](3-monitoring-basics.md#external-commands)
to the external command pipe provided by the `ExternalCommandListener` configuration.
Fixed downtimes require a start and end time (a duration will be ignored).
### <a id="recurring-downtimes"></a> Recurring Downtimes
-[ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
+[ScheduledDowntime objects](12-object-types.md#objecttype-scheduleddowntime) can be used to set up
recurring downtimes for services.
Example:
}
-## <a id="comments"></a> Comments
+## <a id="comments-intro"></a> Comments
Comments can be added at runtime and are persistent over restarts. You can
add useful information for others on repeating incidents (for example
## <a id="acknowledgements"></a> Acknowledgements
If a problem is alerted and notified you may signal the other notification
-receipients that you are aware of the problem and will handle it.
+recipients that you are aware of the problem and will handle it.
By sending an acknowledgement to Icinga 2 (using the external command pipe
provided with `ExternalCommandListener` configuration) all future notifications
re-notify if the problem persists.
-## <a id="dependencies"></a> Dependencies
-
-Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects
-for determing their network reachability.
-The `parent_host_name` and `parent_service_name` attributes are mandatory for
-service dependencies, `parent_host_name` is required for host dependencies.
-
-A service can depend on a host, and vice versa. A service has an implicit
-dependency (parent) to its host. A host to host dependency acts implicit
-as host parent relation.
-When dependencies are calculated, not only the immediate parent is taken into
-account but all parents are inherited.
-
-Notifications are suppressed if a host or service becomes unreachable.
-
-A common scenario is the Icinga 2 server behind a router. Checking internet
-access by pinging the Google DNS server `google-dns` is a common method, but
-will fail in case the `dsl-router` host is down. Therefore the example below
-defines a host dependency which acts implicit as parent relation too.
-
-Furthermore the host may be reachable but ping probes are dropped by the
-router's firewall. In case the `dsl-router``ping4` service check fails, all
-further checks for the `ping4` service on host `google-dns` service should
-be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
-
- object Host "dsl-router" {
- address = "192.168.1.1"
- }
-
- object Host "google-dns" {
- address = "8.8.8.8"
- }
- apply Service "ping4" {
- import "generic-service"
+## <a id="custom-attributes"></a> Custom Attributes
- check_command = "ping4"
+### <a id="custom-attributes-apply"></a> Using Custom Attributes for Apply Rules
- assign where host.address
- }
+Custom attributes are not only used at runtime in command definitions to pass
+command arguments, but are also a smart way to define patterns and groups
+for applying objects for dynamic config generation.
- apply Dependency "internet" to Service {
- parent_host_name = "dsl-router"
- disable_checks = true
+There are several ways of using custom attributes with [apply rules](3-monitoring-basics.md#using-apply):
- assign where host.name != "dsl-router"
- }
+* As simple attribute literal ([number](10-language-reference.md#numeric-literals), [string](10-language-reference.md#string-literals),
+[boolean](10-language-reference.md#boolean-literals)) for expression conditions (`assign where`, `ignore where`)
+* As [array](10-language-reference.md#array) or [dictionary](10-language-reference.md#dictionary) attribute with nested values
+(e.g. dictionaries in dictionaries) in [apply for](3-monitoring-basics.md#using-apply-for) rules.
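+
+A minimal sketch of the first case, assuming a hypothetical `vars.os` custom attribute
+and an illustrative host name and address. The `ssh` service is only applied to hosts
+where the attribute matches:
+
+    object Host "my-linux-host" {
+      import "generic-host"
+      address = "192.168.1.20"
+      vars.os = "Linux"          // hypothetical custom attribute
+    }
+
+    apply Service "ssh" {
+      import "generic-service"
+      check_command = "ssh"
+
+      // string literal comparison against the custom attribute
+      assign where host.vars.os == "Linux"
+    }
+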
+Features like [DB IDO](3-monitoring-basics.md#db-ido), [Livestatus](7-livestatus.md#setting-up-livestatus) or [StatusData](3-monitoring-basics.md#status-data)
+dump this column as an encoded JSON string, and set `is_json` or `cv_is_json` (depending on the feature) to `1`.
-## <a id="custom-attributes"></a> Custom Attributes
+If arrays are used in runtime macros (for example `$host.groups$`) all entries
+are separated using the `;` character. If an entry contains a semicolon itself,
+it is escaped like this: `entry1;ent\;ry2;entry3`.
### <a id="runtime-custom-attributes"></a> Using Custom Attributes at Runtime
Custom attributes may be used in command definitions to dynamically change how the command
is executed.
-Additionally there are Icinga 2 features such as the `PerfDataWriter` type
-which use custom attributes to format their output.
+Additionally there are Icinga 2 features such as the [PerfDataWriter](3-monitoring-basics.md#performance-data) feature
+which use custom runtime attributes to format their output.
> **Tip**
>
-> Custom attributes are identified by the 'vars' dictionary attribute as short name.
-> Accessing the different attribute keys is possible using the '.' accessor.
+> Custom attributes are identified by the `vars` dictionary attribute as short name.
+> Accessing the different attribute keys is possible using the [index accessor](10-language-reference.md#indexer) `.`.
Custom attributes in command definitions or performance data templates are evaluated at
-runtime when executing a command. These custom attributes cannot be used elsewhere
-(e.g. in other configuration attributes).
+runtime when executing a command. These custom attributes cannot be used elsewhere,
+for example in other configuration attributes.
+
+Custom attribute values must be either a string, a number, a boolean value or an array.
+Dictionaries cannot be used at the time of writing.
+
+Arrays can be used to pass multiple arguments with or without repeating the key string.
+This helps to pass multiple parameters to check plugins that require them. Prominent
+plugin examples are:
+
+* [check_disk -p](13-icinga-template-library.md#plugin-check-command-disk)
+* [check_nrpe -a](13-icinga-template-library.md#plugin-check-command-nrpe)
+* [check_nscp -l](13-icinga-template-library.md#plugin-check-command-nscp)
+* [check_dns -a](13-icinga-template-library.md#plugin-check-command-dns)
+
+More details on how to use `repeat_key` and other command argument options can be
+found in [this section](12-object-types.md#objecttype-checkcommand-arguments).
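+
+As a minimal sketch of the repeating case, assume a hypothetical `my-disk` command whose
+`-p` option has to be given once per partition. The command name, the `my_disk_partitions`
+attribute and the threshold values are illustrative assumptions only:
+
+    object CheckCommand "my-disk" {
+      import "plugin-check-command"
+      command = [ PluginDir + "/check_disk" ]
+
+      arguments = {
+        "-w" = "20%"
+        "-c" = "10%"
+        "-p" = {
+          value = "$my_disk_partitions$"
+          repeat_key = true      // key is repeated per entry: -p / -p /home
+        }
+      }
+
+      // array value, one entry per partition
+      vars.my_disk_partitions = [ "/", "/home" ]
+    }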
+
+> **Note**
+>
+> If a macro value cannot be resolved, be it a single macro, or a recursive macro
+> containing an array of macros, the entire command argument is skipped.
-Here is an example of a command definition which uses user-defined custom attributes:
+This is an example of a command definition which uses user-defined custom attributes:
- object CheckCommand "my-ping" {
+ object CheckCommand "my-icmp" {
import "plugin-check-command"
+ command = [ "/bin/sudo", PluginDir + "/check_icmp" ]
- command = [
- PluginDir + "/check_ping",
- "-4",
- "-H", "$address$",
- "-w", "$ping_wrta$,$ping_wpl$%",
- "-c", "$ping_crta$,$ping_cpl$%",
- "-p", "$ping_packets$",
- "-t", "$ping_timeout$"
- ]
+ arguments = {
+ "-H" = {
+ value = "$icmp_targets$"
+ repeat_key = false
+ order = 1
+ }
+ "-w" = "$icmp_wrta$,$icmp_wpl$%"
+ "-c" = "$icmp_crta$,$icmp_cpl$%"
+ "-s" = "$icmp_source$"
+ "-n" = "$icmp_packets$"
+ "-i" = "$icmp_packet_interval$"
+ "-I" = "$icmp_target_interval$"
+ "-m" = "$icmp_hosts_alive$"
+ "-b" = "$icmp_data_bytes$"
+ "-t" = "$icmp_timeout$"
+ }
+
+ vars.icmp_wrta = 200.00
+ vars.icmp_wpl = 40
+ vars.icmp_crta = 500.00
+ vars.icmp_cpl = 80
+
+ vars.notes = "Requires setuid root or sudo."
+ }
- vars.ping_wrta = 100
- vars.ping_wpl = 5
- vars.ping_crta = 200
- vars.ping_cpl = 15
- vars.ping_packets = 5
- vars.ping_timeout = 0
+Custom attribute names used at runtime must be enclosed in two `$` signs,
+for example `$address$`.
+
+> **Note**
+>
+> When using the `$` sign as single character, you need to escape it with an
+> additional dollar sign (`$$`).
+
+This example also makes use of the [command arguments](3-monitoring-basics.md#command-arguments) passed
+to the command line.
+
+You can integrate the above example `CheckCommand` definition
+[passing command argument parameters](3-monitoring-basics.md#command-passing-parameters) like this:
+
+ object Host "my-icmp-host" {
+ import "generic-host"
+ address = "192.168.1.10"
+ vars.address_mgmt = "192.168.2.10"
+ vars.address_web = "192.168.10.10"
+ vars.icmp_targets = [ "$address$", "$host.vars.address_mgmt$", "$host.vars.address_web$" ]
}
-Custom attribute names used at runtime must be enclosed in two `$` signs, e.g.
-`$address$`. When using the `$` sign as single character, you need to escape
-it with an additional dollar sign (`$$`).
+ apply Service "my-icmp" {
+ check_command = "my-icmp"
+ check_interval = 1m
+ retry_interval = 30s
+
+ vars.icmp_targets = host.vars.icmp_targets
+
+ assign where host.vars.icmp_targets
+ }
### <a id="runtime-custom-attributes-evaluation-order"></a> Runtime Custom Attributes Evaluation Order
2. Service object
3. Host object
4. Command object
-5. Global custom attributes in the Vars constant
+5. Global custom attributes in the `vars` constant
This execution order allows you to define default values for custom attributes
in your command objects. The `my-ping` command shown above uses this to set
default values for some of the latency thresholds and timeouts.
-When using the `my-ping` command you can override all or some of the custom
+When using the `my-ping` command you can override some or all of the custom
attributes in the service definition like this:
object Service "ping" {
when passing credentials to database checks:
object CheckCommand "mysql-health" {
- import "plugin-check-command",
+ import "plugin-check-command"
+
+ command = [
+ PluginDir + "/check_mysql"
+ ]
- command = PluginDir + "/check_mysql -H $address$ -d $db$",
+ arguments = {
+ "-H" = "$mysql_address$"
+ "-d" = "$mysql_database$"
+ }
- vars.mysql_user = "icinga_check",
+ vars.mysql_address = "$address$"
+ vars.mysql_database = "icinga"
+ vars.mysql_user = "icinga_check"
vars.mysql_pass = "password"
- env.MYSQLUSER = "$mysql_user$",
+ env.MYSQLUSER = "$mysql_user$"
env.MYSQLPASS = "$mysql_pass$"
}
+### <a id="multiple-host-addresses-custom-attributes"></a> Multiple Host Addresses using Custom Attributes
+
+The following example defines a `Host` with three different interface addresses defined as
+custom attributes in the `vars` dictionary. The `if-eth0` and `if-eth1` services will import
+these values into the `address` custom attribute. This attribute is available through the
+generic `$address$` runtime macro.
+
+ object Host "multi-ip" {
+ check_command = "dummy"
+ vars.address_lo = "127.0.0.1"
+ vars.address_eth0 = "10.0.0.10"
+ vars.address_eth1 = "192.168.1.10"
+ }
+
+ apply Service "if-eth0" {
+ import "generic-service"
+
+ vars.address = "$host.vars.address_eth0$"
+ check_command = "my-generic-interface-check"
+
+ assign where host.vars.address_eth0 != ""
+ }
+
+ apply Service "if-eth1" {
+ import "generic-service"
+
+ vars.address = "$host.vars.address_eth1$"
+ check_command = "my-generic-interface-check"
+
+ assign where host.vars.address_eth1 != ""
+ }
+
+ object CheckCommand "my-generic-interface-check" {
+ import "plugin-check-command"
+
+ command = "echo \"This would be the service $service.description$ using the address value: $address$\""
+ }
+
+The `CheckCommand` object is just an example to help you with testing and
+understanding the different custom attributes and runtime macros.
+
### <a id="modified-attributes"></a> Modified Attributes
Icinga 2 allows you to modify defined object attributes at runtime different to
### <a id="runtime-macro-evaluation-order"></a> Runtime Macro Evaluation Order
-Custom attributes can be accessed at [runtime](#runtime-custom-attributes) using their
+Custom attributes can be accessed at [runtime](3-monitoring-basics.md#runtime-custom-attributes) using their
identifier omitting the `vars.` prefix.
There are special cases when those custom attributes are not set and Icinga 2 provides
a fallback to existing object attributes for example `host.address`.
In order to enable the `ExternalCommandListener` configuration use the
following command and restart Icinga 2 afterwards:
- # icinga2-enable-feature command
+ # icinga2 feature enable command
Icinga 2 creates the command pipe file as `/var/run/icinga2/cmd/icinga2.cmd`
using the default configuration.
Oct 17 15:01:25 icinga-server icinga2: Executing external command: [1382014885] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382014885
Oct 17 15:01:25 icinga-server icinga2: Rescheduling next check for service 'ping4'
-By default the command pipe file is owned by the group `icingacmd` with read/write
-permissions. Add your webserver's user to the group `icingacmd` to
-enable sending commands to Icinga 2 through your web interface:
-
- # usermod -G -a icingacmd www-data
-
-Debian packages use `nagios` as the default user and group name. Therefore change `icingacmd` to
-`nagios`.
### <a id="external-command-list"></a> External Command List
-A list of currently supported external commands can be found [here](#external-commands-list-detail)
+A list of currently supported external commands can be found [here](14-appendix.md#external-commands-list-detail).
Detailed information on the commands and their required parameters can be found
on the [Icinga 1.x documentation](http://docs.icinga.org/latest/en/extcommands2.html).
-
-## <a id="event-handlers"></a> Event Handlers
-
-Event handlers are defined as `EventCommand` objects in Icinga 2.
-
-Unlike notifications event commands are called on every host/service execution
-if defined. Therefore the `EventCommand` object should define a command line
-evaluating the current service state and other service runtime attributes
-available through runtime macros. Runtime macros such as `$service.state_type$`
-and `$service.state$` will be processed by Icinga 2 helping on fine-granular
-events being triggered.
-
-Common use case scenarios are a failing HTTP check requiring an immediate
-restart via event command, or if an application is locked and requires
-a restart upon detection.
-
-
## <a id="logging"></a> Logging
Icinga 2 supports three different types of logging:
* Syslog (on *NIX-based operating systems)
* Console logging (`STDOUT` on tty)
-You can enable additional loggers using the `icinga2-enable-feature`
-and `icinga2-disable-feature` commands to configure loggers:
+You can enable additional loggers using the `icinga2 feature enable`
+and `icinga2 feature disable` commands to configure loggers:
Feature | Description
---------|------------
runtime vars.
host_format_template = "DATATYPE::HOSTPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tHOSTPERFDATA::$host.perfdata$\tHOSTCHECKCOMMAND::$host.checkcommand$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.statetype$"
- service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.description$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.checkcommand$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.statetype$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.statetype$"
+ service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.name$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.checkcommand$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.statetype$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.statetype$"
The default templates are already provided with the Icinga 2 feature configuration
which can be enabled using
- # icinga2-enable-feature perfdata
+ # icinga2 feature enable perfdata
By default all performance data files are rotated in a 15 seconds interval into
the `/var/spool/icinga2/perfdata/` directory as `host-perfdata.<timestamp>` and
You can enable the feature using
- # icinga2-enable-feature graphite
+ # icinga2 feature enable graphite
By default the `GraphiteWriter` object expects the Graphite Carbon Cache to listen at
-`127.0.0.1` on port `2003`.
+`127.0.0.1` on TCP port `2003`.
The current naming schema is
icinga.<hostname>.<metricname>
icinga.<hostname>.<servicename>.<metricname>
+You can customize the metric prefix name by using the `host_name_template` and
+`service_name_template` configuration attributes.
+
+The example below uses [runtime macros](3-monitoring-basics.md#runtime-macros) and a
+[global constant](10-language-reference.md#constants) named `GraphiteEnv`. The constant name
+is freely definable and should be put in the [constants.conf](2-getting-started.md#constants-conf) file.
+
+ const GraphiteEnv = "icinga.env1"
+
+ object GraphiteWriter "graphite" {
+ host_name_template = GraphiteEnv + ".$host.name$"
+ service_name_template = GraphiteEnv + ".$host.name$.$service.name$"
+ }
+
+To make sure Icinga 2 writes a valid label into Graphite some characters are replaced
+with `_` in the target name:
+
+ \/.- (and space)
+
+The resulting name in Graphite might look like:
+
+ www-01 / http-cert / response time
+ icinga.www_01.http_cert.response_time
+
+In addition to the performance data retrieved from the check plugin, Icinga 2 sends
+internal check statistic data to Graphite:
+
+ metric | description
+ -------------------|------------------------------------------
+ current_attempt | current check attempt
+ max_check_attempts | maximum check attempts until the hard state is reached
+ reachable | checked object is reachable
+ downtime_depth | number of downtimes this object is in
+ execution_time | check execution time
+ latency | check latency
+ state | current state of the checked object
+ state_type | 0=SOFT, 1=HARD state
+
+The following example illustrates how to configure the storage-schemas for Graphite Carbon
+Cache. Please make sure that the order is correct because the first match wins.
+
+ [icinga_internals]
+ pattern = ^icinga\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
+ retentions = 5m:7d
+
+ [icinga_default]
+ # intervals like PNP4Nagios uses them per default
+ pattern = ^icinga\.
+ retentions = 1m:2d,5m:10d,30m:90d,360m:4y
+
+### <a id="gelfwriter"></a> GELF Writer
+
+The `Graylog Extended Log Format` (short: [GELF](http://www.graylog2.org/resources/gelf))
+can be used to send application logs directly to a TCP socket.
+
+While it has been specified by the [graylog2](http://www.graylog2.org/) project as their
+[input resource standard](http://www.graylog2.org/resources/gelf), other tools such as
+[Logstash](http://www.logstash.net) also support `GELF` as
+[input type](http://logstash.net/docs/latest/inputs/gelf).
+
+You can enable the feature using
+
+ # icinga2 feature enable gelf
+
+By default the `GelfWriter` object expects the GELF receiver to listen at `127.0.0.1` on TCP port `12201`.
+The default `source` attribute is set to `icinga2`. You can customize that for your needs if required.
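+
+A minimal sketch of a `GelfWriter` object that spells out the defaults described above.
+The object name `gelf` is an assumption; adjust `host`, `port` and `source` to match
+your GELF receiver:
+
+    object GelfWriter "gelf" {
+      host = "127.0.0.1"     // GELF receiver address
+      port = 12201           // TCP port of the GELF input
+      source = "icinga2"     // source name reported to the receiver
+    }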
+
+Currently these events are processed:
+
+* Check results
+* State changes
+* Notifications
## <a id="status-data"></a> Status Data
the `StatusDataWriter` object which dumps all configuration objects and
status updates in a regular interval.
- # icinga2-enable-feature statusdata
+ # icinga2 feature enable statusdata
Icinga 1.x Classic UI requires this data set as part of its backend.
> you can safely disable this feature.
-
## <a id="compat-logging"></a> Compat Logging
The Icinga 1.x log format is considered being the `Compat Log`
These logs are not only used for informational representation in
external web interfaces parsing the logs, but also to generate
-SLA reports and trends in Icinga 1.x Classic UI. Futhermore the
-`Livestatus` feature uses these logs for answering queries to
+SLA reports and trends in Icinga 1.x Classic UI. Furthermore the
+[Livestatus](7-livestatus.md#setting-up-livestatus) feature uses these logs for answering queries to
historical tables.
The `CompatLogger` object can be enabled with
- # icinga2-enable-feature compatlog
+ # icinga2 feature enable compatlog
By default, the Icinga 1.x log file called `icinga.log` is located
in `/var/log/icinga2/compat`. Rotated log files are moved into
+
+## <a id="db-ido"></a> DB IDO
+
+The IDO (Icinga Data Output) modules for Icinga 2 take care of exporting all
+configuration and status information into a database. The IDO database is used
+by a number of projects including Icinga Web 1.x and 2.
+
+Details on the installation can be found in the [Configuring DB IDO](2-getting-started.md#configuring-db-ido)
+chapter. Details on the configuration can be found in the
+[IdoMysqlConnection](12-object-types.md#objecttype-idomysqlconnection) and
+[IdoPgsqlConnection](12-object-types.md#objecttype-idopgsqlconnection)
+object configuration documentation.
+The DB IDO feature supports [High Availability](4-monitoring-remote-systems.md#high-availability-db-ido) in
+the Icinga 2 cluster.
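+
+A minimal sketch of an `IdoMysqlConnection` object is shown here. The object name,
+credentials and database name are placeholder assumptions, not recommended values;
+`instance_name` is written out with its default value `default`:
+
+    object IdoMysqlConnection "ido-mysql" {
+      host = "localhost"
+      user = "icinga"              // placeholder credentials
+      password = "icinga"
+      database = "icinga"
+      instance_name = "default"    // identifies this Icinga 2 instance in the database
+    }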
+
+The following example query checks the health of the current Icinga 2 instance,
+which writes its current status to the DB IDO backend table `icinga_programstatus`
+every 10 seconds. By default it checks 60 seconds into the past, which is a reasonable
+amount of time; adjust it to your requirements. If the condition is not met,
+the query returns an empty result.
+
+> **Tip**
+>
+> Use [check plugins](6-addons-plugins.md#plugins) to monitor the backend.
+
+Replace the `default` string with your instance name, if different.
+
+Example for MySQL:
+
+ # mysql -u root -p icinga -e "SELECT status_update_time FROM icinga_programstatus ps
+ JOIN icinga_instances i ON ps.instance_id=i.instance_id
+ WHERE (UNIX_TIMESTAMP(ps.status_update_time) > UNIX_TIMESTAMP(NOW())-60)
+ AND i.instance_name='default';"
+
+ +---------------------+
+ | status_update_time |
+ +---------------------+
+ | 2014-05-29 14:29:56 |
+ +---------------------+
+
+
+Example for PostgreSQL:
+
+ # export PGPASSWORD=icinga; psql -U icinga -d icinga -c "SELECT ps.status_update_time FROM icinga_programstatus AS ps
+ JOIN icinga_instances AS i ON ps.instance_id=i.instance_id
+ WHERE ((SELECT extract(epoch from status_update_time) FROM icinga_programstatus) > (SELECT extract(epoch from now())-60))
+ AND i.instance_name='default'";
+
+ status_update_time
+ ------------------------
+ 2014-05-29 15:11:38+02
+ (1 row)
+
+
+A detailed list of the available table attributes can be found in the [DB IDO Schema documentation](14-appendix.md#schema-db-ido).
+
+
## <a id="check-result-files"></a> Check Result Files
-Icinga 1.x writes its check result files into a temporary spool directory
-where it reads these check result files in a regular interval from.
-While this is extremly inefficient in performance regards it has been
+Icinga 1.x writes its check result files to a temporary spool directory
+where they are processed at regular intervals.
+While this is extremely inefficient in terms of performance, it has been
rendered useful for passing passive check results directly into Icinga 1.x
skipping the external command pipe.
object CheckResultReader "reader" {
spool_dir = "/data/check-results"
}
-
-
-
-
-
-