From c1f4d2243e1382f72cf57d1f2cb5841714c1ca7e Mon Sep 17 00:00:00 2001 From: Michael Friedrich Date: Fri, 7 Nov 2014 03:40:46 +0100 Subject: [PATCH] Documentation: Better apply rule best practice in monitoring basics fixes #7480 fixes #7543 fixes #7187 fixes #7573 --- doc/2-getting-started.md | 2 +- doc/4-monitoring-basics.md | 334 +++++++++++++++++++++++++++---- doc/7-configuring-icinga-2.md | 7 + etc/icinga2/conf.d/services.conf | 1 + 4 files changed, 309 insertions(+), 35 deletions(-) diff --git a/doc/2-getting-started.md b/doc/2-getting-started.md index 11eb00dbe..b6a890c80 100644 --- a/doc/2-getting-started.md +++ b/doc/2-getting-started.md @@ -1219,7 +1219,7 @@ re-implementation of the Livestatus protocol which is compatible with MK Livestatus. Details on the available tables and attributes with Icinga 2 can be found -in the [Livestatus Schema](#schema-livestatus) section. +in the [Livestatus](#livestatus) section. You can enable Livestatus using icinga2 feature enable: diff --git a/doc/4-monitoring-basics.md b/doc/4-monitoring-basics.md index 13477a5ff..c01b069e8 100644 --- a/doc/4-monitoring-basics.md +++ b/doc/4-monitoring-basics.md @@ -37,6 +37,12 @@ Here is an example of a host object which defines two child services: The example creates two services `ping4` and `http` which belong to the host `my-server1`. +> **Note** +> +> When using [apply](#using-apply) rules, a service apply definition will +> implicitly create a relationship to each host by setting the `host_name` +> attribute. + It also specifies that the host should perform its own check using the `hostalive` check command. @@ -109,7 +115,7 @@ requirements first and then decide for a possible strategy. There are many ways of creating Icinga 2 configuration objects: * Manually with your preferred editor, for example vi(m), nano, notepad, etc. -* Generated by a configuration management tool such as Puppet, Chef, Ansible, etc.
+* Generated by a [configuration management tool](#configuration-tools) such as Puppet, Chef, Ansible, etc. * A configuration addon for Icinga 2 * A custom exporter script from your CMDB or inventory tool * your own. @@ -143,7 +149,7 @@ You can later use them for applying assign/ignore rules, or export them into ext * Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules. * Use templates to store generic attributes for your objects and apply rules making your configuration more readable. Details can be found in the [using templates](#using-templates) chapter. -* Apply rules may overlap. Keep a central place (for example, `services.conf` or `notifications.conf`) storing +* Apply rules may overlap. Keep a central place (for example, [services.conf](#services-conf) or [notifications.conf](#notifications-conf)) storing the configuration instead of defining apply rules deep in your configuration tree. * Every plugin used as check, notification or event command requires a `Command` definition. Further details can be looked up in the [check commands](#check-commands) chapter. @@ -164,22 +170,31 @@ object: enable_perfdata = true } - object Service "ping4" { + template Service "ipv6-service" { + notes = "IPv6 critical != IPv4 broken." + } + + apply Service "ping4" { import "generic-service" - host_name = "localhost" check_command = "ping4" + + assign where host.address } - object Service "ping6" { + apply Service "ping6" { import "generic-service" + import "ipv6-service" - host_name = "localhost" check_command = "ping6" + + assign where host.address6 } + In this example the `ping4` and `ping6` services inherit properties from the -template `generic-service`. +template `generic-service`. The `ping6` service additionally imports the `ipv6-service` +template with the `notes` attribute. Objects as well as templates themselves can import an arbitrary number of templates.
Attributes inherited from a template can be overridden in the @@ -187,42 +202,135 @@ object if necessary. ### Apply objects based on rules -Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`) +Instead of assigning each object ([Service](#objecttype-service), +[Notification](#objecttype-notification), [Dependency](#objecttype-dependency), +[ScheduledDowntime](#objecttype-scheduleddowntime)) based on attribute identifiers for example `host_name` objects can be [applied](#apply). -Detailed scenario examples are used in their respective chapters, for example -[apply services with custom command arguments](#using-apply-services-command-arguments). +Before you start using the apply rules keep the following in mind: + +* Define the best match. + * A set of unique [custom attributes](#custom-attributes-apply) for these hosts/services? + * Or [group](#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it? + * A generic pattern [match](#function-calls) on the host/service name? + * [Multiple expressions combined](#using-apply-expressions) with `&&` or `||` [operators](#expression-operators) +* All expressions must return a boolean value (for example, an empty string is equal to `false`). + +> **Note** +> +> You can set/override object attributes in apply rules using the respectively available +> objects in that scope (host and/or service objects). + +[Custom attributes](#custom-attributes) can also store nested dictionaries and arrays. That way you can use them +not only for matching their existence or values in apply expressions, but also assign +("inherit") their values into the objects generated from apply rules.
+ +* [Apply services to hosts](#using-apply-services) +* [Apply notifications to hosts and services](#using-apply-notifications) +* [Apply dependencies to hosts and services](#using-apply-dependencies) +* [Apply scheduled downtimes to hosts and services](#using-apply-scheduledowntimes) + +A more advanced example is using [apply with for loops on arrays or +dictionaries](#using-apply-for) for example provided by +[custom attributes](#custom-attributes-apply) or groups. + +> **Tip** +> +> Building configuration in that dynamic way requires detailed information +> of the generated objects. Use the `object list` [cli command](#cli-command-object) +> after successful [configuration validation](#config-validation). + + +#### Apply Rules Expressions + +You can use simple or advanced combinations of apply rule expressions. Each +expression must evaluate to the boolean `true` value. An empty string, for instance, +will be interpreted as `false`. In a similar fashion undefined +attributes will return `false`. + +Returns `false`: + + assign where host.vars.attribute_does_not_exist + +Multiple `assign where` condition rows are evaluated as an `OR` condition. + +You can combine multiple expressions for matching only a subset of objects. In some cases, +you want to be able to add more than one assign/ignore where expression which matches +a specific condition. To achieve this you can use the logical `and` and `or` operators. + + +Match all `*mysql*` patterns in the host name and (`&&`) custom attribute `prod_mysql_db` +matches the `db-*` pattern. All hosts with the custom attribute `test_server` set to `true` +should be ignored, or any host name matching the `*internal` pattern.
+ + object HostGroup "mysql-server" { + display_name = "MySQL Server" + + assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db) + ignore where host.vars.test_server == true + ignore where match("*internal", host.name) + } + +Similar example for advanced notification apply rule filters: If the service +attribute `notes` contains the `has gold support 24x7` string `AND` one of the +two conditions passes: Either the `customer` host custom attribute is set to `customer-xy` +`OR` the host custom attribute `always_notify` is set to `true`. + +The notification is ignored for services whose host name ends with `*internal` +`OR` the `priority` custom attribute is [less than](#expression-operators) `2`. + + template Notification "cust-xy-notification" { + users = [ "noc-xy", "mgmt-xy" ] + command = "mail-service-notification" + } + + apply Notification "notify-cust-xy-mysql" to Service { + import "cust-xy-notification" + + assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true) + ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.is_clustered == true) + } + + + #### Apply Services to Hosts - apply Service "load" { +The sample configuration already ships a detailed example in [hosts.conf](#hosts-conf) +and [services.conf](#services-conf) for this use case. + +The example for `ssh` applies a service object to all hosts with the `address` +attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`. + + apply Service "ssh" { import "generic-service" - check_command = "load" + check_command = "ssh" - assign where "linux-server" in host.groups - ignore where host.vars.no_load_check + assign where host.address && host.vars.os == "Linux" } -In this example the `load` service will be created as object for all hosts in the `linux-server` -host group.
If the `no_load_check` custom attribute is set, the host will be -ignored. + +Other detailed scenario examples are used in their respective chapters, for example +[apply services with custom command arguments](#using-apply-services-command-arguments). #### Apply Notifications to Hosts and Services Notifications are applied to specific targets (`Host` or `Service`) and work in a similar manner: + apply Notification "mail-noc" to Service { import "mail-service-notification" - command = "mail-service-notification" + user_groups = [ "noc" ] - assign where service.vars.sla == "24x7" + assign where host.vars.notification.mail } + In this example the `mail-noc` notification will be created as object for all services having the -`sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification` +`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification` and all members of the user group `noc` will get notified. #### Apply Dependencies to Hosts and Services @@ -231,9 +339,138 @@ Detailed examples can be found in the [dependencies](#dependencies) chapter. ### Apply Recurring Downtimes to Hosts and Services +The sample configuration ships an example in [downtimes.conf](#downtimes-conf). + Detailed examples can be found in the [recurring downtimes](#recurring-downtimes) chapter. +#### Using Apply For Rules + +Next to the standard way of using apply rules there is + +The sample configuration already ships a detailed example in [hosts.conf](#hosts-conf) +and [services.conf](#services-conf) for this use case. + +Imagine a different example: You are monitoring your switch (hosts) with many +interfaces (services).
The following requirements/problems apply: + +* Each interface service check should be named with a prefix and a running number +* Each interface has its own vlan tag +* Some interfaces have QoS enabled +* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be +dynamically generated + +By defining the `interfaces` dictionary with three example interfaces on the `core-switch` +host object, you provide the data required by the for loop in the service apply +rule. + + + object Host "core-switch" { + import "generic-host" + address = "127.0.0.1" + + vars.interfaces["0"] = { + port = 1 + vlan = "internal" + address = "127.0.0.2" + qos = "enabled" + } + vars.interfaces["1"] = { + port = 2 + vlan = "mgmt" + address = "127.0.1.2" + } + vars.interfaces["2"] = { + port = 3 + vlan = "remote" + address = "127.0.2.2" + } + } + +You can also omit the `"if-"` string, then all generated service names are directly +taken from the `if_name` variable value. + +The `config` dictionary contains all key-value pairs for the specific interface in one +loop cycle, like `port`, `vlan`, `address` and `qos` for the `0` interface. + +By defining a default value for the custom attribute `qos` in the `vars` dictionary +before adding the `config` dictionary we'll ensure that this attribute is always defined. + +After `vars` is fully populated, all object attributes can be set. For strings, you can use +string concatenation with the `+` operator. + +You can also specify the check command that way.
+ + apply Service "if-" for (if_name => config in host.vars.interfaces) { + import "generic-service" + check_command = "ping4" + + vars.qos = "disabled" + vars += config + + display_name = "if-" + if_name + "-" + vars.vlan + + notes = "Interface check for Port " + string(vars.port) + " in VLAN " + vars.vlan + " on Address " + vars.address + " QoS " + vars.qos + notes_url = "http://foreman.company.com/hosts/" + host.name + action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + if_name + + assign where host.vars.interfaces + } + +Note that numbers must be explicitly cast to strings when concatenating them with strings. +This can be achieved by wrapping them in the [string()](#function-calls) function. + +> **Tip** +> +> Building configuration in that dynamic way requires detailed information +> of the generated objects. Use the `object list` [cli command](#cli-command-object) +> after successful [configuration validation](#config-validation). + + +#### Use Object Attributes in Apply Rules + +Since apply rules are evaluated after the generic objects, you +can reference existing host and/or service object attributes as +values for any object attribute specified in that apply rule.
+ + object Host "opennebula-host" { + import "generic-host" + address = "10.1.1.2" + + vars.hosting["xyz"] = { + http_uri = "/shop" + customer_name = "Customer xyz" + customer_id = "7568" + support_contract = "gold" + } + vars.hosting["abc"] = { + http_uri = "/shop" + customer_name = "Customer xyz" + customer_id = "7568" + support_contract = "silver" + } + } + + apply Service for (customer => config in host.vars.hosting) { + import "generic-service" + check_command = "ping4" + + vars.qos = "disabled" + + vars += config + + vars.http_uri = "/" + customer + "/" + config.http_uri + + display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id + + notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")." + + notes_url = "http://foreman.company.com/hosts/" + host.name + action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id + + assign where host.vars.hosting + } + ### Groups Groups are used for combining hosts, services, and users into @@ -296,13 +533,16 @@ If there is a certain number of hosts, services, or users matching a pattern it's reasonable to assign the group object to these members. Details on the `assign where` syntax can be found [here](#apply) - object HostGroup "mssql" { - display_name = "MSSQL Servers" - assign where host.vars.mssql_port + object HostGroup "prod-mssql" { + display_name = "Production MSSQL Servers" + assign where host.vars.mssql_port && host.vars.prod_mysql_db + ignore where host.vars.test_server == true + ignore where match("*internal", host.name) } In this inherited example from above all hosts with the `vars` attribute `mssql_port` -set will be added as members to the host group `mssql`. +set will be added as members to the host group `prod-mssql`. All `*internal` +hosts and hosts with the `test_server` attribute set to `true` will be ignored. ## Notifications @@ -367,17 +607,20 @@ to the defined notifications.
That way you'll save duplicated attributes in each The time period `24x7` is shipped as example configuration with Icinga 2. + + Use the `apply` keyword to create `Notification` objects for your services: - apply Notification "mail" to Service { + apply Notification "notify-cust-xy-mysql" to Service { import "generic-notification" - command = "mail-notification" - users = [ "icingaadmin" ] + users = [ "noc-xy", "mgmt-xy" ] - assign where service.name == "mysql" + assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true) + ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.is_clustered == true) } + Instead of assigning users to notifications, you can also add the `user_groups` attribute with a list of user groups to the `Notification` object. Icinga 2 will send notifications to all group members. @@ -424,7 +667,7 @@ Define an additional [NotificationCommand](#notification) for SMS notifications. "..." } -The two new notification escalations are added onto the host `localhost` +The two new notification escalations are added onto the local host and its service `ping4` using the `generic-notification` template. The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification` command) after `30m` until `1h`. @@ -482,8 +725,9 @@ notified, but only for one hour (`2h` as `end` key for the `times` dictionary). Sometimes the problem in question should not be notified when the notification is due (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to -postpone the first notification for 15 minutes. Leave out the `end` key - if not set, -Icinga 2 will not check against any end time for this notification. +postpone the notification window for 15 minutes.
Leave out the `end` key - if not set, +Icinga 2 will not check against any end time for this notification. Make sure to +specify a relatively low notification `interval` to get notified soon enough again. apply Notification "mail" to Service { import "generic-notification" @@ -491,7 +735,9 @@ Icinga 2 will not check against any end time for this notification. command = "mail-notification" users = [ "icingaadmin" ] - times.begin = 15m // delay first notification + interval = 5m + + times.begin = 15m // delay notification window assign where service.name == "ping4" } @@ -528,7 +774,7 @@ Available state and type filters for notifications are: If you are familiar with Icinga 1.x `notification_options` please note that they have been split into type and state to allow more fine granular filtering for example on downtimes and flapping. -You can filter for acknowledgements and custom notifications too. +You can filter for acknowledgements and custom notifications too. ## Time Periods @@ -1337,13 +1583,33 @@ re-notify if the problem persists. ## Custom Attributes +### Using Custom Attributes for Apply Rules + +Custom attributes are not only used at runtime in command definitions to pass +command arguments, but are also a smart way to define patterns and groups +for applying objects for dynamic config generation. + +There are several ways of using custom attributes with [apply rules](#using-apply): + +* As simple attribute literal ([number](#numeric-literal), [string](#string-literal), +[boolean](#boolean-literal)) for expression conditions (`assign where`, `ignore where`) +* As [array](#array) or [dictionary](#dictionary) attribute with nested values +(e.g. dictionaries in dictionaries) in [apply for](#using-apply-for) rules. + +Features like [DB IDO](#db-ido), [Livestatus](#livestatus) or [StatusData](#status-data) +dump this column as an encoded JSON string, and set `is_json` or `cv_is_json`, respectively, to `1`.
+ +If arrays are used in runtime macros (for example `$host.groups$`) all entries +are separated using the `;` character. If an entry contains a semi-colon itself, +it is escaped like this: `entry1;ent\;ry2;entry3`. + ### Using Custom Attributes at Runtime Custom attributes may be used in command definitions to dynamically change how the command is executed. Additionally there are Icinga 2 features such as the `PerfDataWriter` type -which use custom attributes to format their output. +which use custom runtime attributes to format their output. > **Tip** > diff --git a/doc/7-configuring-icinga-2.md b/doc/7-configuring-icinga-2.md index 85e227c80..b883cf808 100644 --- a/doc/7-configuring-icinga-2.md +++ b/doc/7-configuring-icinga-2.md @@ -280,6 +280,13 @@ Functions can be called using the `()` operator: check_interval = len(MyGroups) * 1m } +> **Tip** +> +> Use these functions in [apply](#using-apply) rule expressions. + + assign where match("192.168.*", host.address) + + Function | Description --------------------------------|----------------------- regex(pattern, text) | Returns true if the regex pattern matches the text, false otherwise. diff --git a/etc/icinga2/conf.d/services.conf b/etc/icinga2/conf.d/services.conf index 58665f8b4..4f794b61d 100644 --- a/etc/icinga2/conf.d/services.conf +++ b/etc/icinga2/conf.d/services.conf @@ -50,6 +50,7 @@ apply Service "ssh" { check_command = "ssh" assign where host.address && host.vars.os == "Linux" + ignore where host.name == "localhost" /* for upgrade safety */ } -- 2.40.0