Documentation: Split 'advanced' into multiple sections.

author Michael Friedrich <michael.friedrich@netways.de>

Tue, 18 Mar 2014 10:18:17 +0000 (11:18 +0100)

committer Michael Friedrich <michael.friedrich@netways.de>

Tue, 18 Mar 2014 10:18:17 +0000 (11:18 +0100)
diff --git a/doc/6-advanced-topics.md b/doc/6-advanced-topics.md

index 70c3752a45cfdb3551003f6ed1fcb041d029240e..7ca11490e723cdf377b599c55b3716c561bde990 100644 (file)
--- a/doc/6-advanced-topics.md
+++ b/doc/6-advanced-topics.md
@@ -1,494 +1 @@
  # <a id="advanced-topics"></a> Advanced Topics
-
-## <a id="downtimes"></a> Downtimes
-
-Downtimes can be scheduled for planned server maintenance or
-any other targeted service outage you are aware of in advance.
-
-Downtimes will suppress any notifications, and may trigger other
-downtimes too. If the downtime was set by accident, or the duration
-exceeds the maintenance, you can manually cancel the downtime.
-Planned downtimes will also be taken into account for SLA reporting
-tools calculating the SLAs based on the state and downtime history.
-
-> **Note**
->
-> Downtimes may overlap with their start and end times. If there
-> are multiple downtimes triggered for one object, the overall downtime depth
-> will be more than `1`. This is useful when you want to extend
-> your maintenance window taking longer than expected.
-
-### <a id="fixed-flexible-downtimes"></a> Fixed and Flexible Downtimes
-
-A `fixed` downtime will be activated at the defined start time, and
-removed at the end time. During this time window the service state
-will change to `NOT-OK` and then actually trigger the downtime.
-Notifications are suppressed and the downtime depth is incremented.
-
-Common scenarios are a planned distribution upgrade on your Linux
-servers, or database updates in your warehouse. The customer knows
-about a fixed downtime window between 23:00 and 24:00. After 24:00
-all problems should be alerted again. The solution is simple:
-schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
-
-Unlike a `fixed` downtime, a `flexible` downtime end does not necessarily
-happen at the provided end time. Instead the downtime will be triggered
-by the state change in the time span defined by start and end time, but
-then last a defined duration in minutes.
-
-Imagine the following scenario: Your service is frequently polled
-by users trying to grab free deleted domains for immediate registration.
-Between 07:30 and 08:00 the impact will hit for 15 minutes and generate
-a network outage visible to the monitoring. The service is still alive,
-but answers Icinga 2 service checks too slowly.
-For that reason, you may want to schedule a downtime between 07:30 and
-08:00 with a duration of 15 minutes. The downtime will then last from
-its trigger time until the duration is over. After that, the downtime
-is removed (may happen before or after the actual end time!).
-
-### <a id="scheduling-downtime"></a> Scheduling a downtime
-
-This can either happen through a web interface (Icinga 1.x Classic UI or Web)
-or by using the external command pipe provided by the `ExternalCommandListener`
-configuration.
-
-Fixed downtimes require a start and end time (a duration will be ignored).
-Flexible downtimes need a start and end time for the time span, and a duration
-independent from that time span.
-
-> **Note**
->
-> Modern web interfaces treat services in a downtime as `handled`.
-
-### <a id="triggered-downtimes"></a> Triggered Downtimes
-
-This is optional when scheduling a downtime. If there is already a downtime
-scheduled for a future maintenance, the current downtime can be triggered by
-that downtime. This is useful if you have scheduled a host downtime and
-are now scheduling a child host's downtime that should be triggered by the
-parent downtime on a `NOT-OK` state change.
-
-### <a id="recurring-downtimes"></a> Recurring Downtimes
-
-[ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
-recurring downtimes for services.
-
-Example:
-
-    template ScheduledDowntime "backup-downtime" {
-      author = "icingaadmin",
-      comment = "Scheduled downtime for backup",
-
-      ranges = {
-        monday = "02:00-03:00",
-        tuesday = "02:00-03:00",
-        wednesday = "02:00-03:00",
-        thursday = "02:00-03:00",
-        friday = "02:00-03:00",
-        saturday = "02:00-03:00",
-        sunday = "02:00-03:00"
-      }
-    }
-
-    object Host "localhost" inherits "generic-host" {
-      ...
-      services["load"] = {
-        templates = [ "generic-service" ],
-
-        check_command = "load",
-
-        scheduled_downtimes["backup"] = {
-          templates = [ "backup-downtime" ]
-        }
-      },
-    }
-
-
-## <a id="comments"></a> Comments
-
-Comments can be added at runtime and are persistent over restarts. You can
-add useful information for others on repeating incidents (for example
-"last time syslog at 100% cpu on 17.10.2013 due to stale nfs mount") which
-is primarily accessible using web interfaces.
-
-Adding and deleting comment actions are possible through the external command pipe
-provided with the `ExternalCommandListener` configuration. The caller must
-pass the comment id in case of manipulating an existing comment.
-
-## <a id="acknowledgements"></a> Acknowledgements
-
-If a problem is alerted and notified you may signal the other notification
-recipients that you are aware of the problem and will handle it.
-
-By sending an acknowledgement to Icinga 2 (using the external command pipe
-provided with `ExternalCommandListener` configuration) all future notifications
-are suppressed, a new comment is added with the provided description and
-a notification with the type `NotificationFilterAcknowledgement` is sent
-to all notified users.
-
-> **Note**
->
-> Modern web interfaces treat acknowledged problems as `handled`.
-
-### <a id="expiring-acknowledgements"></a> Expiring Acknowledgements
-
-Once a problem is acknowledged it may disappear from your `handled problems`
-dashboard and no-one ever looks at it again since it will suppress
-notifications too.
-
-This `fire-and-forget` action is quite common. If you're sure that a
-current problem should be resolved in the future at a defined time,
-you can define an expiration time when acknowledging the problem.
-
-Icinga 2 will clear the acknowledgement when expired and start to
-re-notify if the problem persists.
-
-## <a id="cluster"></a> Cluster
-
-An Icinga 2 cluster consists of two or more nodes and can reside on multiple
-architectures. The base concept of Icinga 2 is the possibility to add additional
-features using components. In case of a cluster setup you have to add the
-cluster feature to all nodes. Before you start configuring the different nodes
-it's necessary to set up the underlying communication layer based on SSL.
-
-### <a id="certificate-authority-certificates"></a> Certificate Authority and Certificates
-
-Icinga 2 comes with two scripts helping you to create CA and node certificates
-for your Icinga 2 cluster.
-
-The first step is the creation of CA using the following command:
-
-    icinga2-build-ca
-
-Please make sure to export a variable containing an empty folder for the created
-CA files:
-
-    export ICINGA_CA="/root/icinga-ca"
-
-In the next step you have to create a certificate and a key file for every node
-using the following command:
-
-    icinga2-build-key icinga-node-1
-
-Please create a certificate and a key file for every node in the Icinga 2
-Cluster and save the CA key in case you want to set up certificates for
-additional nodes at a later date.
-
-### <a id="enable-cluster-configuration"></a> Enable the Cluster Configuration
-
-Until the cluster-component is moved into an independent feature you have to
-enable the required libraries in the icinga2.conf configuration file:
-
-    library "cluster"
-
-### <a id="configure-clusterlistener-object"></a> Configure the ClusterListener Object
-
-The ClusterListener needs to be configured on every node in the cluster with the
-following settings:
-
-  Configuration Setting    | Value
-  -------------------------|------------------------------------
-  ca_path                  | path to ca.crt file
-  cert_path                | path to server certificate
-  key_path                 | path to server key
-  bind_port                | port for incoming and outgoing connections
-  peers                    | array of all reachable nodes
-
-A sample config part can look like this:
-
-    /**
-     * Load cluster library and configure ClusterListener using certificate files
-     */
-    library "cluster"
-
-    object ClusterListener "cluster" {
-      ca_path = "/etc/icinga2/ca/ca.crt",
-      cert_path = "/etc/icinga2/ca/icinga-node-1.crt",
-      key_path = "/etc/icinga2/ca/icinga-node-1.key",
-
-      bind_port = 8888,
-
-      peers = [ "icinga-node-2" ]
-    }
-
-> **Note**
->
-> The certificate files must be readable by the user Icinga 2 is running as. Also,
-> the private key file should not be world-readable.
-
-The `peers` attribute configures the direction used to connect multiple nodes
-together. If you have a three-node cluster consisting of
-
-* node-1
-* node-2
-* node-3
-
-and `node-3` is only reachable from `node-2`, you have to consider this in your
-peer configuration.
-
-### <a id="configure-cluster-endpoints"></a> Configure Cluster Endpoints
-
-In addition to the configured port and hostname every endpoint can have specific
-abilities to send configuration files to other nodes and limit the hosts allowed
-to send configuration files.
-
-  Configuration Setting    | Value
-  -------------------------|------------------------------------
-  host                     | hostname
-  port                     | port
-  accept_config            | defines all nodes allowed to send configs
-  config_files             | defines all files to be sent to that node - MUST BE AN ABSOLUTE PATH
-
-A sample config part can look like this:
-
-    /**
-     * Configure config master endpoint
-     */
-       
-    object Endpoint "icinga-node-1" {
-      host = "icinga-node-1.localdomain",
-      port = 8888,
-      config_files = ["/etc/icinga2/conf.d/*.conf"]
-    }
-
-If you update the configuration files on the configured file sender, it will
-force a restart on all receiving nodes after validating the new config.
-
-A sample config part for a config receiver endpoint can look like this:
-
-    /**
-     * Configure config receiver endpoint
-     */
-
-    object Endpoint "icinga-node-2" {
-      host = "icinga-node-2.localdomain",
-      port = 8888,
-      accept_config = [ "icinga-node-1" ]
-    }
-
-By default these configuration files are saved in /var/lib/icinga2/cluster/config.
-
-In order to load configuration files which were received from a remote Icinga 2
-instance you will have to add the following include directive to your
-`icinga2.conf` configuration file:
-
-    include (IcingaLocalStateDir + "/lib/icinga2/cluster/config/*/*")
-
-### <a id="initial-cluster-sync"></a> Initial Cluster Sync
-
-In order to make sure that all of your cluster nodes have the same state you will
-have to pick one of the nodes as your initial "master" and copy its state file
-to all the other nodes.
-
-You can find the state file in `/var/lib/icinga2/icinga2.state`. Before copying
-the state file you should make sure that all your cluster nodes are properly shut
-down.
-
-
-### <a id="assign-services-to-cluster-nodes"></a> Assign Services to Cluster Nodes
-
-By default all services are distributed among the cluster nodes with the `Checker`
-feature enabled.
-If you require specific services to be only executed by one or more checker nodes
-within the cluster, you must define `authorities` as additional service object
-attribute. Required Endpoints must be defined as array.
-
-    object Host "dmz-host1" inherits "generic-host" {
-      services["dmz-oracledb"] = {
-        templates = [ "generic-service" ],
-        authorities = [ "icinga-node-1" ],
-      }
-    }
-
-> **Tip**
->
-> The most common use case is a classic master-slave setup. The master node
-> does not have the `Checker` feature enabled, and the slave nodes are checking
-> services based on their location, inheriting from a global service template
-> defining the authorities.
-
-### <a id="cluster-health-check"></a> Cluster Health Check
-
-The Icinga 2 [ITL](#itl) ships an internal check command checking all configured
-`Endpoint` objects in the cluster setup. The check result will become critical if
-one or more configured nodes are not connected.
-
-Example:
-
-    object Host "icinga2a" inherits "generic-host" {
-      services["cluster"] = {
-        templates = [ "generic-service" ],
-        check_interval = 1m,
-        check_command = "cluster",
-        authorities = [ "icinga2a" ]
-      },
-    }
-
-> **Note**
->
-> Each cluster node should execute its own local cluster health check to
-> get an idea about network related connection problems from different
-> points of view. Use the `authorities` attribute to assign the service
-> check to the configured node.
-
-### <a id="host-multiple-cluster-nodes"></a> Host With Multiple Cluster Nodes
-
-Special scenarios might require multiple cluster nodes running on a single host.
-By default Icinga 2 and its features will drop their runtime data below the prefix
-`IcingaLocalStateDir`. By default packages will set that path to `/var`.
-You can either set that variable as constant configuration
-definition in [icinga2.conf](#icinga2-conf) or pass it as runtime variable to
-the Icinga 2 daemon.
-
-    # icinga2 -c /etc/icinga2/node1/icinga2.conf -DIcingaLocalStateDir=/opt/node1/var
-
-## <a id="domains"></a> Domains
-
-A [Service](#objecttype-service) object can be restricted using the `domains` attribute
-array specifying endpoint privileges.
-A Domain object specifies the ACLs applied for each [Endpoint](#objecttype-endpoint).
-
-The following example assigns the domain `dmz-db` to the service `dmz-oracledb`. Endpoint
-`icinga-node-dmz-1` does not allow any object modification (no commands, check results) and only
-relays local messages to the remote node(s). The endpoint `icinga-node-dmz-2` processes all
-messages read and write (accept check results, commands and also relay messages to remote
-nodes).
-
-That way the service `dmz-oracledb` on endpoint `icinga-node-dmz-1` will not be modified
-by any cluster event message, and could be checked by the local authority too presenting
-a different state history. `icinga-node-dmz-2` still receives all cluster message updates
-from the `icinga-node-dmz-1` endpoint.
-
-    object Host "dmz-host1" inherits "generic-host" {
-      services["dmz-oracledb"] = {
-        templates = [ "generic-service" ],
-        domains = [ "dmz-db" ],
-        authorities = [ "icinga-node-dmz-1", "icinga-node-dmz-2"],
-      }
-    }
-
-    object Domain "dmz-db" {
-      acl = {
-        icinga-node-dmz-1 = (DomainPrivReadOnly),
-        icinga-node-dmz-2 = (DomainPrivReadWrite)
-      }
-    }
-
-## <a id="dependencies"></a> Dependencies
-
-Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects either directly
-defined or as inline definition as `dependencies` dictionary. The `parent_host` and `parent_service`
-attributes are mandatory, `child_host` and `child_service` attributes are obsolete within
-inline definitions in an existing service object or service inline definition.
-
-A service can depend on a host, and vice versa. A service has an implicit dependency (parent)
-to its host. A host-to-host dependency implicitly acts as a host parent relation.
-When dependencies are calculated, not only the immediate parent is taken into
-account but all parents are inherited.
-
-A common scenario is the Icinga 2 server behind a router. Checking internet
-access by pinging the Google DNS server `google-dns` is a common method, but
-will fail in case the `dsl-router` host is down. Therefore the example below
-defines a host dependency which implicitly acts as a parent relation too.
-
-Furthermore the host may be reachable but ping samples are dropped by the
-router's firewall. In case the `dsl-router` `ping4` service check fails, all
-further checks for the `google-dns` `ping4` service should be suppressed.
-This is achieved by setting the `disable_checks` attribute to `true`.
-
-    object Host "dsl-router" {
-      services["ping4"] = {
-        templates = "generic-service",
-        check_command = "ping4"
-      }
-
-      macros = {
-        address = "192.168.1.1",
-      },
-    }
-
-    object Host "google-dns" {
-      services["ping4"] = {
-        templates = "generic-service",
-        check_command = "ping4",
-        dependencies["dsl-router-ping4"] = {
-          parent_host = "dsl-router",
-          parent_service = "ping4",
-          disable_checks = true
-        }
-      }
-
-      macros = {
-        address = "8.8.8.8",
-      }, 
-      
-      dependencies["dsl-router"] = {
-        parent_host = "dsl-router"
-      },
-
-    }
-
-## <a id="check-result-freshness"></a> Check Result Freshness
-
-In Icinga 2 active check freshness is enabled by default. A check result is
-considered stale if no new result arrives within the period defined by the
-`check_interval` attribute.
-
-    threshold = last check execution time + check interval
-
-Passive check freshness is calculated from the `check_interval` attribute if set.
-
-    threshold = last check result time + check interval
-
-If the freshness threshold is exceeded, a new check defined by the
-`check_command` attribute is executed.
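
The two threshold formulas can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, all times are Unix timestamps in seconds, and treating a result as stale only once the threshold is strictly exceeded is an assumption, not documented Icinga 2 behaviour.

```python
def freshness_threshold(last_check_time, check_interval):
    """threshold = last check (execution or result) time + check interval."""
    return last_check_time + check_interval


def is_stale(last_check_time, check_interval, now):
    """Return True if the threshold has passed without a new result.

    Assumption: the comparison is strict (now > threshold).
    """
    return now > freshness_threshold(last_check_time, check_interval)
```

For an active check, `last_check_time` would be the last check execution time; for a passive check, the time of the last received check result.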
-
-## <a id="check-flapping"></a> Check Flapping
-
-The flapping algorithm used in Icinga 2 does not store the past states but
-calculates the flapping threshold from a single value based on counters and
-half-life values. Icinga 2 compares the value with a single flapping threshold
-configuration attribute named `flapping_threshold`.
-
-> **Note**
->
-> Flapping must be explicitly enabled by setting the `Service` object attribute
-> `enable_flapping = 1`.
-
-## <a id="volatile-services"></a> Volatile Services
-
-By default all services remain in a non-volatile state. When a problem
-occurs, the `SOFT` state applies and once `max_check_attempts` attribute
-is reached with the check counter, a `HARD` state transition happens.
-Notifications are only triggered by `HARD` state changes and are then
-re-sent defined by the `notification_interval` attribute.
-
-It may be reasonable to have a volatile service which stays in a `HARD`
-state type if the service stays in a `NOT-OK` state. That way each
-service recheck will automatically trigger a notification unless the
-service is acknowledged or in a scheduled downtime.
-
-## <a id="modified-attributes"></a> Modified Attributes
-
-Icinga 2 allows you to modify defined object attributes at runtime different to
-the local configuration object attributes. These modified attributes are
-stored as bit-shifted-value and made available in backends. Icinga 2 stores
-modified attributes in its state file and restores them on restart.
-
-Modified Attributes can be reset using external commands.
-
-
-## <a id="plugin-api"></a> Plugin API
-
-Currently the native plugin API inherited from the `Monitoring Plugins` (formerly
-`Nagios Plugins`) project is available.
-Future specifications will be documented here.
-
-### <a id="monitoring-plugin-api"></a> Monitoring Plugin API
-
-The `Monitoring Plugin API` (formerly `Nagios Plugin API`) is defined in the
-[Monitoring Plugins Development Guidelines](https://www.monitoring-plugins.org/doc/guidelines.html).
-
-
-
diff --git a/doc/6.01-downtimes.md b/doc/6.01-downtimes.md

new file mode 100644 (file)

index 0000000..6d2ab0b
--- /dev/null
+++ b/doc/6.01-downtimes.md
@@ -0,0 +1,102 @@
+## <a id="downtimes"></a> Downtimes
+
+Downtimes can be scheduled for planned server maintenance or
+any other targeted service outage you are aware of in advance.
+
+Downtimes will suppress any notifications, and may trigger other
+downtimes too. If the downtime was set by accident, or the duration
+exceeds the maintenance, you can manually cancel the downtime.
+Planned downtimes will also be taken into account for SLA reporting
+tools calculating the SLAs based on the state and downtime history.
+
+> **Note**
+>
+> Downtimes may overlap with their start and end times. If there
+> are multiple downtimes triggered for one object, the overall downtime depth
+> will be more than `1`. This is useful when you want to extend
+> your maintenance window taking longer than expected.
+
+### <a id="fixed-flexible-downtimes"></a> Fixed and Flexible Downtimes
+
+A `fixed` downtime will be activated at the defined start time, and
+removed at the end time. During this time window the service state
+will change to `NOT-OK` and then actually trigger the downtime.
+Notifications are suppressed and the downtime depth is incremented.
+
+Common scenarios are a planned distribution upgrade on your Linux
+servers, or database updates in your warehouse. The customer knows
+about a fixed downtime window between 23:00 and 24:00. After 24:00
+all problems should be alerted again. The solution is simple:
+schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
+
+Unlike a `fixed` downtime, a `flexible` downtime end does not necessarily
+happen at the provided end time. Instead the downtime will be triggered
+by the state change in the time span defined by start and end time, but
+then last a defined duration in minutes.
+
+Imagine the following scenario: Your service is frequently polled
+by users trying to grab free deleted domains for immediate registration.
+Between 07:30 and 08:00 the impact will hit for 15 minutes and generate
+a network outage visible to the monitoring. The service is still alive,
+but answers Icinga 2 service checks too slowly.
+For that reason, you may want to schedule a downtime between 07:30 and
+08:00 with a duration of 15 minutes. The downtime will then last from
+its trigger time until the duration is over. After that, the downtime
+is removed (may happen before or after the actual end time!).
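
The flexible downtime scenario above can be sketched in Python. This is a minimal illustration and not the Icinga 2 implementation: the function name `flexible_downtime_window` is hypothetical, and treating the start and end of the time span as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def flexible_downtime_window(start, end, duration, trigger_time):
    """Return (begin, finish) of the effective downtime, or None.

    The downtime is only triggered if the state change happens inside
    the [start, end] span (inclusive bounds are an assumption). Once
    triggered, it lasts for `duration`, regardless of `end`.
    """
    if trigger_time is None or not (start <= trigger_time <= end):
        return None
    return trigger_time, trigger_time + duration


day = datetime(2014, 3, 18)
span_start = day.replace(hour=7, minute=30)
span_end = day.replace(hour=8, minute=0)

# A problem at 07:50 triggers a 15 minute downtime that ends after the
# configured end time of 08:00.
window = flexible_downtime_window(
    span_start, span_end, timedelta(minutes=15), day.replace(hour=7, minute=50)
)
```

With a trigger at 07:35 the same call would end the downtime at 07:50, before the configured end time.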
+
+### <a id="scheduling-downtime"></a> Scheduling a downtime
+
+This can either happen through a web interface (Icinga 1.x Classic UI or Web)
+or by using the external command pipe provided by the `ExternalCommandListener`
+configuration.
+
+Fixed downtimes require a start and end time (a duration will be ignored).
+Flexible downtimes need a start and end time for the time span, and a duration
+independent from that time span.
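
As a hedged sketch, the following Python helper formats a Nagios-style `SCHEDULE_SVC_DOWNTIME` external command line as it could be written to the command pipe. The helper name `schedule_svc_downtime` is hypothetical. Times are Unix timestamps, and the duration is passed in seconds as in the Nagios external command format. The location of the command pipe depends on your `ExternalCommandListener` configuration and is deliberately not assumed here.

```python
import time


def schedule_svc_downtime(host, service, start, end, fixed, duration,
                          author, comment, trigger_id=0, now=None):
    """Build one external command line (without the trailing newline).

    Field order follows the Nagios external command
    SCHEDULE_SVC_DOWNTIME;host;service;start;end;fixed;trigger_id;
    duration;author;comment. Values must not contain semicolons.
    """
    timestamp = int(time.time()) if now is None else int(now)
    fields = [
        "SCHEDULE_SVC_DOWNTIME", host, service, int(start), int(end),
        1 if fixed else 0,   # 1 = fixed, 0 = flexible
        int(trigger_id),     # 0 = not triggered by another downtime
        int(duration),       # ignored for fixed downtimes
        author, comment,
    ]
    return "[{}] {}".format(timestamp, ";".join(str(f) for f in fields))
```

To submit the command, the returned line plus a newline would be written to the command pipe configured for the `ExternalCommandListener`.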
+
+> **Note**
+>
+> Modern web interfaces treat services in a downtime as `handled`.
+
+### <a id="triggered-downtimes"></a> Triggered Downtimes
+
+This is optional when scheduling a downtime. If there is already a downtime
+scheduled for a future maintenance, the current downtime can be triggered by
+that downtime. This is useful if you have scheduled a host downtime and
+are now scheduling a child host's downtime that should be triggered by the
+parent downtime on a `NOT-OK` state change.
+
+### <a id="recurring-downtimes"></a> Recurring Downtimes
+
+[ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
+recurring downtimes for services.
+
+Example:
+
+    template ScheduledDowntime "backup-downtime" {
+      author = "icingaadmin",
+      comment = "Scheduled downtime for backup",
+
+      ranges = {
+        monday = "02:00-03:00",
+        tuesday = "02:00-03:00",
+        wednesday = "02:00-03:00",
+        thursday = "02:00-03:00",
+        friday = "02:00-03:00",
+        saturday = "02:00-03:00",
+        sunday = "02:00-03:00"
+      }
+    }
+
+    object Host "localhost" inherits "generic-host" {
+      ...
+      services["load"] = {
+        templates = [ "generic-service" ],
+
+        check_command = "load",
+
+        scheduled_downtimes["backup"] = {
+          templates = [ "backup-downtime" ]
+        }
+      },
+    }
+\ No newline at end of file
diff --git a/doc/6.02-comments.md b/doc/6.02-comments.md

new file mode 100644 (file)

index 0000000..10a796e
--- /dev/null
+++ b/doc/6.02-comments.md
@@ -0,0 +1,10 @@
+## <a id="comments"></a> Comments
+
+Comments can be added at runtime and are persistent over restarts. You can
+add useful information for others on repeating incidents (for example
+"last time syslog at 100% cpu on 17.10.2013 due to stale nfs mount") which
+is primarily accessible using web interfaces.
+
+Adding and deleting comment actions are possible through the external command pipe
+provided with the `ExternalCommandListener` configuration. The caller must
+pass the comment id in case of manipulating an existing comment.
+\ No newline at end of file
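
One possible way to build the comment commands described in `6.02-comments.md`, sketched in Python. It assumes the Nagios-style `ADD_SVC_COMMENT` and `DEL_SVC_COMMENT` external commands; the helper names are hypothetical, and the pipe location is left to your `ExternalCommandListener` configuration.

```python
import time


def _stamp(now=None):
    """Unix timestamp used as the command prefix."""
    return int(time.time()) if now is None else int(now)


def add_svc_comment(host, service, author, comment, persistent=True, now=None):
    """ADD_SVC_COMMENT;host;service;persistent;author;comment"""
    return "[{}] ADD_SVC_COMMENT;{};{};{};{};{}".format(
        _stamp(now), host, service, 1 if persistent else 0, author, comment
    )


def del_svc_comment(comment_id, now=None):
    """DEL_SVC_COMMENT;comment_id (the id of an existing comment)."""
    return "[{}] DEL_SVC_COMMENT;{}".format(_stamp(now), int(comment_id))
```

Deleting requires the comment id, which matches the requirement that the caller passes the id when manipulating an existing comment.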
diff --git a/doc/6.03-acknowledgements.md b/doc/6.03-acknowledgements.md

new file mode 100644 (file)

index 0000000..0a03986
--- /dev/null
+++ b/doc/6.03-acknowledgements.md
@@ -0,0 +1,27 @@
+## <a id="acknowledgements"></a> Acknowledgements
+
+If a problem is alerted and notified you may signal the other notification
+recipients that you are aware of the problem and will handle it.
+
+By sending an acknowledgement to Icinga 2 (using the external command pipe
+provided with `ExternalCommandListener` configuration) all future notifications
+are suppressed, a new comment is added with the provided description and
+a notification with the type `NotificationFilterAcknowledgement` is sent
+to all notified users.
+
+> **Note**
+>
+> Modern web interfaces treat acknowledged problems as `handled`.
+
+### <a id="expiring-acknowledgements"></a> Expiring Acknowledgements
+
+Once a problem is acknowledged it may disappear from your `handled problems`
+dashboard and no-one ever looks at it again since it will suppress
+notifications too.
+
+This `fire-and-forget` action is quite common. If you're sure that a
+current problem should be resolved in the future at a defined time,
+you can define an expiration time when acknowledging the problem.
+
+Icinga 2 will clear the acknowledgement when expired and start to
+re-notify if the problem persists.
+\ No newline at end of file
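
An expiring acknowledgement as described in `6.03-acknowledgements.md` could be sent as sketched below. This assumes that the Icinga 1.x-style `ACKNOWLEDGE_SVC_PROBLEM_EXPIRE` external command is accepted by the `ExternalCommandListener`; the helper name and its defaults are hypothetical.

```python
import time


def acknowledge_svc_problem_expire(host, service, expire_time, author, comment,
                                   sticky=2, notify=True, persistent=False,
                                   now=None):
    """Build an acknowledgement command that expires at `expire_time`.

    Field order: host;service;sticky;notify;persistent;timestamp;
    author;comment, where timestamp is the expiration time as a Unix
    timestamp. Values must not contain semicolons.
    """
    timestamp = int(time.time()) if now is None else int(now)
    fields = [
        "ACKNOWLEDGE_SVC_PROBLEM_EXPIRE", host, service,
        int(sticky),              # 2 = sticky in the Nagios convention
        1 if notify else 0,       # send an acknowledgement notification
        1 if persistent else 0,   # keep the comment across restarts
        int(expire_time),
        author, comment,
    ]
    return "[{}] {}".format(timestamp, ";".join(str(f) for f in fields))
```

After `expire_time` has passed, the acknowledgement is cleared and notifications resume if the problem persists, as described above.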
diff --git a/doc/6.04-cluster.md b/doc/6.04-cluster.md

new file mode 100644 (file)

index 0000000..9284d79
--- /dev/null
+++ b/doc/6.04-cluster.md
@@ -0,0 +1,200 @@
+## <a id="cluster"></a> Cluster
+
+An Icinga 2 cluster consists of two or more nodes and can reside on multiple
+architectures. The base concept of Icinga 2 is the possibility to add additional
+features using components. In case of a cluster setup you have to add the
+cluster feature to all nodes. Before you start configuring the different nodes
+it's necessary to set up the underlying communication layer based on SSL.
+
+### <a id="certificate-authority-certificates"></a> Certificate Authority and Certificates
+
+Icinga 2 comes with two scripts helping you to create CA and node certificates
+for your Icinga 2 cluster.
+
+The first step is the creation of CA using the following command:
+
+    icinga2-build-ca
+
+Please make sure to export a variable containing an empty folder for the created
+CA files:
+
+    export ICINGA_CA="/root/icinga-ca"
+
+In the next step you have to create a certificate and a key file for every node
+using the following command:
+
+    icinga2-build-key icinga-node-1
+
+Please create a certificate and a key file for every node in the Icinga 2
+Cluster and save the CA key in case you want to set up certificates for
+additional nodes at a later date.
+
+### <a id="enable-cluster-configuration"></a> Enable the Cluster Configuration
+
+Until the cluster-component is moved into an independent feature you have to
+enable the required libraries in the icinga2.conf configuration file:
+
+    library "cluster"
+
+### <a id="configure-clusterlistener-object"></a> Configure the ClusterListener Object
+
+The ClusterListener needs to be configured on every node in the cluster with the
+following settings:
+
+  Configuration Setting    | Value
+  -------------------------|------------------------------------
+  ca_path                  | path to ca.crt file
+  cert_path                | path to server certificate
+  key_path                 | path to server key
+  bind_port                | port for incoming and outgoing connections
+  peers                    | array of all reachable nodes
+
+A sample config part can look like this:
+
+    /**
+     * Load cluster library and configure ClusterListener using certificate files
+     */
+    library "cluster"
+
+    object ClusterListener "cluster" {
+      ca_path = "/etc/icinga2/ca/ca.crt",
+      cert_path = "/etc/icinga2/ca/icinga-node-1.crt",
+      key_path = "/etc/icinga2/ca/icinga-node-1.key",
+
+      bind_port = 8888,
+
+      peers = [ "icinga-node-2" ]
+    }
+
+> **Note**
+>
+> The certificate files must be readable by the user Icinga 2 is running as. Also,
+> the private key file should not be world-readable.
+
+The `peers` attribute configures the direction used to connect multiple nodes
+together. If you have a three-node cluster consisting of
+
+* node-1
+* node-2
+* node-3
+
+and `node-3` is only reachable from `node-2`, you have to consider this in your
+peer configuration.
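
The three-node example can be checked with a small Python sketch. It is not part of Icinga 2: the function `unreachable_peers` and the `reachable` map are hypothetical, and the assumption that `node-3` itself can reach `node-2` is made only for illustration.

```python
def unreachable_peers(peers, reachable):
    """Return (node, peer) pairs where a node lists a peer it cannot reach.

    peers:     node name -> list of names in its `peers` attribute
    reachable: node name -> set of nodes it can open a connection to
    """
    return [
        (node, peer)
        for node, targets in sorted(peers.items())
        for peer in targets
        if peer not in reachable.get(node, set())
    ]


# node-3 is only reachable from node-2.
reachable = {
    "node-1": {"node-2"},
    "node-2": {"node-1", "node-3"},
    "node-3": {"node-2"},  # assumption for this example
}

good = {"node-1": ["node-2"], "node-2": ["node-3"]}
bad = {"node-1": ["node-2", "node-3"]}
```

Here `unreachable_peers(good, reachable)` returns an empty list, while the `bad` layout reports the pair `("node-1", "node-3")`.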
+
+### <a id="configure-cluster-endpoints"></a> Configure Cluster Endpoints
+
+In addition to the configured port and hostname every endpoint can have specific
+abilities to send configuration files to other nodes and limit the hosts allowed
+to send configuration files.
+
+  Configuration Setting    | Value
+  -------------------------|------------------------------------
+  host                     | hostname
+  port                     | port
+  accept_config            | defines all nodes allowed to send configs
+  config_files             | defines all files to be sent to that node - MUST BE AN ABSOLUTE PATH
+
+A sample config part can look like this:
+
+    /**
+     * Configure config master endpoint
+     */
+
+    object Endpoint "icinga-node-1" {
+      host = "icinga-node-1.localdomain",
+      port = 8888,
+      config_files = ["/etc/icinga2/conf.d/*.conf"]
+    }
+
+If you update the configuration files on the configured file sender, it will
+force a restart on all receiving nodes after validating the new config.
+
+A sample config part for a config receiver endpoint can look like this:
+
+    /**
+     * Configure config receiver endpoint
+     */
+
+    object Endpoint "icinga-node-2" {
+      host = "icinga-node-2.localdomain",
+      port = 8888,
+      accept_config = [ "icinga-node-1" ]
+    }
+
+By default these configuration files are saved in /var/lib/icinga2/cluster/config.
+
+In order to load configuration files which were received from a remote Icinga 2
+instance you will have to add the following include directive to your
+`icinga2.conf` configuration file:
+
+    include (IcingaLocalStateDir + "/lib/icinga2/cluster/config/*/*")
+
+### <a id="initial-cluster-sync"></a> Initial Cluster Sync
+
+In order to make sure that all of your cluster nodes have the same state you will
+have to pick one of the nodes as your initial "master" and copy its state file
+to all the other nodes.
+
+You can find the state file in `/var/lib/icinga2/icinga2.state`. Before copying
+the state file you should make sure that all your cluster nodes are properly shut
+down.
+
+
+### <a id="assign-services-to-cluster-nodes"></a> Assign Services to Cluster Nodes
+
+By default all services are distributed among the cluster nodes with the `Checker`
+feature enabled.
+If you require specific services to be executed only by one or more checker nodes
+within the cluster, you must define `authorities` as an additional service object
+attribute. The required endpoints must be defined as an array.
+
+    object Host "dmz-host1" inherits "generic-host" {
+      services["dmz-oracledb"] = {
+        templates = [ "generic-service" ],
+        authorities = [ "icinga-node-1" ],
+      }
+    }
+
+> **Tip**
+>
+> The most common use case is building a classic Master-Slave setup. The master node
+> does not have the `Checker` feature enabled, and the slave nodes check
+> services based on their location, inheriting from a global service template
+> that defines the authorities.
+
+### <a id="cluster-health-check"></a> Cluster Health Check
+
+The Icinga 2 [ITL](#itl) ships an internal check command which checks all configured
+`Endpoint` objects in the cluster setup. The check result will become critical if
+one or more configured nodes are not connected.
+
+Example:
+
+    object Host "icinga2a" inherits "generic-host" {
+      services["cluster"] = {
+        templates = [ "generic-service" ],
+        check_interval = 1m,
+        check_command = "cluster",
+        authorities = [ "icinga2a" ]
+      },
+    }
+
+> **Note**
+>
+> Each cluster node should execute its own local cluster health check to
+> get an idea about network related connection problems from different
+> points of view. Use the `authorities` attribute to assign the service
+> check to the configured node.
+
+### <a id="host-multiple-cluster-nodes"></a> Host With Multiple Cluster Nodes
+
+Special scenarios might require multiple cluster nodes running on a single host.
+By default Icinga 2 and its features store their runtime data below the prefix
+`IcingaLocalStateDir`, which packages set to `/var`.
+You can either set that variable as a constant configuration
+definition in [icinga2.conf](#icinga2-conf) or pass it as a runtime variable to
+the Icinga 2 daemon.
+
+    # icinga2 -c /etc/icinga2/node1/icinga2.conf -DIcingaLocalStateDir=/opt/node1/var
+\ No newline at end of file
diff --git a/doc/6.05-domains.md b/doc/6.05-domains.md

new file mode 100644 (file)

index 0000000..3e79ae3
--- /dev/null
+++ b/doc/6.05-domains.md
@@ -0,0 +1,31 @@
+## <a id="domains"></a> Domains
+
+A [Service](#objecttype-service) object can be restricted using the `domains` attribute
+array specifying endpoint privileges.
+A Domain object specifies the ACLs applied for each [Endpoint](#objecttype-endpoint).
+
+The following example assigns the domain `dmz-db` to the service `dmz-oracledb`. Endpoint
+`icinga-node-dmz-1` does not allow any object modification (no commands, check results) and only
+relays local messages to the remote node(s). The endpoint `icinga-node-dmz-2` has read and
+write access and processes all messages (it accepts check results and commands, and also
+relays messages to remote nodes).
+
+That way the service `dmz-oracledb` on endpoint `icinga-node-dmz-1` will not be modified
+by any cluster event message, and could also be checked by the local authority, which would
+then present a different state history. `icinga-node-dmz-2` still receives all cluster message
+updates from the `icinga-node-dmz-1` endpoint.
+
+    object Host "dmz-host1" inherits "generic-host" {
+      services["dmz-oracledb"] = {
+        templates = [ "generic-service" ],
+        domains = [ "dmz-db" ],
+        authorities = [ "icinga-node-dmz-1", "icinga-node-dmz-2"],
+      }
+    }
+
+    object Domain "dmz-db" {
+      acl = {
+        icinga-node-dmz-1 = (DomainPrivReadOnly),
+        icinga-node-dmz-2 = (DomainPrivReadWrite)
+      }
+    }
+\ No newline at end of file
diff --git a/doc/6.06-dependencies.md b/doc/6.06-dependencies.md

new file mode 100644 (file)

index 0000000..bbbdd94
--- /dev/null
+++ b/doc/6.06-dependencies.md
@@ -0,0 +1,53 @@
+## <a id="dependencies"></a> Dependencies
+
+Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects, defined either
+directly or inline in a `dependencies` dictionary. The `parent_host` and `parent_service`
+attributes are mandatory; the `child_host` and `child_service` attributes are obsolete within
+inline definitions in an existing service object or service inline definition.
+
+A service can depend on a host, and vice versa. A service has an implicit dependency (parent)
+on its host. A host to host dependency implicitly acts as a host parent relation.
+When dependencies are calculated, not only the immediate parent is taken into
+account but all parents are inherited.
+
+A common scenario is the Icinga 2 server behind a router. Checking internet
+access by pinging the Google DNS server `google-dns` is a common method, but
+will fail in case the `dsl-router` host is down. Therefore the example below
+defines a host dependency which implicitly acts as a parent relation too.
+
+Furthermore the host may be reachable but ping samples are dropped by the
+router's firewall. In case the `dsl-router` `ping4` service check fails, all
+further checks for the `google-dns` `ping4` service should be suppressed.
+This is achieved by setting the `disable_checks` attribute to `true`.
+
+    object Host "dsl-router" {
+      services["ping4"] = {
+        templates = "generic-service",
+        check_command = "ping4"
+      }
+
+      macros = {
+        address = "192.168.1.1",
+      },
+    }
+
+    object Host "google-dns" {
+      services["ping4"] = {
+        templates = "generic-service",
+        check_command = "ping4",
+        dependencies["dsl-router-ping4"] = {
+          parent_host = "dsl-router",
+          parent_service = "ping4",
+          disable_checks = true
+        }
+      }
+
+      macros = {
+        address = "8.8.8.8",
+      },
+
+      dependencies["dsl-router"] = {
+        parent_host = "dsl-router"
+      },
+
+    }
+\ No newline at end of file
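
The statement that "all parents are inherited" can be illustrated with a small transitive lookup. This is only a sketch in Python, not Icinga 2's implementation: the `all_parents` helper, the `direct_parents` map and the `host!service` key naming are assumptions made for this illustration, with the relations taken from the `dsl-router` / `google-dns` example.

```python
def all_parents(obj, direct_parents):
    """Collect every direct and indirect parent of obj (depth-first walk)."""
    seen = set()
    stack = list(direct_parents.get(obj, []))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue  # already visited, also guards against cycles
        seen.add(parent)
        stack.extend(direct_parents.get(parent, []))
    return seen


# Hypothetical relation map: a service implicitly depends on its host,
# plus the explicit dependencies from the example configuration.
direct_parents = {
    "google-dns!ping4": ["google-dns", "dsl-router!ping4"],
    "google-dns": ["dsl-router"],
    "dsl-router!ping4": ["dsl-router"],
}

print(sorted(all_parents("google-dns!ping4", direct_parents)))
```

Under these assumptions the `google-dns` `ping4` service ends up with the `google-dns` host, the `dsl-router` host and the `dsl-router` `ping4` service as parents.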
diff --git a/doc/6.07-check-result-freshness.md b/doc/6.07-check-result-freshness.md

new file mode 100644 (file)

index 0000000..6dac587
--- /dev/null
+++ b/doc/6.07-check-result-freshness.md
@@ -0,0 +1,13 @@
+## <a id="check-result-freshness"></a> Check Result Freshness
+
+In Icinga 2 active check freshness is enabled by default. It is determined by the
+`check_interval` attribute: a result is no longer fresh if no new check result
+arrives within that period of time.
+
+    threshold = last check execution time + check interval
+
+Passive check freshness is calculated from the `check_interval` attribute if set.
+
+    threshold = last check result time + check interval
+
+If a freshness check fails, a new check is executed as defined by the
+`check_command` attribute.
+\ No newline at end of file
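
The threshold formulas above amount to simple date arithmetic. The following Python sketch shows it under stated assumptions: the function names are hypothetical and are not part of Icinga 2, and the caller passes the last check execution time for active checks or the last check result time for passive checks.

```python
from datetime import datetime, timedelta


def freshness_threshold(last_check, check_interval):
    """threshold = last check time + check interval."""
    return last_check + check_interval


def is_stale(last_check, check_interval, now):
    """A result counts as stale once 'now' has passed the threshold."""
    return now > freshness_threshold(last_check, check_interval)


last = datetime(2014, 3, 18, 10, 0, 0)
interval = timedelta(minutes=5)

print(freshness_threshold(last, interval))
print(is_stale(last, interval, datetime(2014, 3, 18, 10, 4, 0)))
print(is_stale(last, interval, datetime(2014, 3, 18, 10, 6, 0)))
```

With a five minute interval, a result from 10:00 is still fresh at 10:04 and stale at 10:06.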
diff --git a/doc/6.08-check-flapping.md b/doc/6.08-check-flapping.md

new file mode 100644 (file)

index 0000000..3381b4b
--- /dev/null
+++ b/doc/6.08-check-flapping.md
@@ -0,0 +1,11 @@
+## <a id="check-flapping"></a> Check Flapping
+
+The flapping algorithm used in Icinga 2 does not store the past states but
+calculates the flapping threshold from a single value based on counters and
+half-life values. Icinga 2 compares the value with a single flapping threshold
+configuration attribute named `flapping_threshold`.
+
+> **Note**
+>
+> Flapping must be explicitly enabled by setting the `Service` object attribute
+> `enable_flapping = 1`.
+\ No newline at end of file
diff --git a/doc/6.09-volatile-services.md b/doc/6.09-volatile-services.md

new file mode 100644 (file)

index 0000000..3cf0b5e
--- /dev/null
+++ b/doc/6.09-volatile-services.md
@@ -0,0 +1,12 @@
+## <a id="volatile-services"></a> Volatile Services
+
+By default all services remain in a non-volatile state. When a problem
+occurs, the `SOFT` state applies and once the check counter reaches the
+`max_check_attempts` attribute, a `HARD` state transition happens.
+Notifications are only triggered by `HARD` state changes and are then
+re-sent as defined by the `notification_interval` attribute.
+
+It may be reasonable to have a volatile service which stays in a `HARD`
+state type if the service stays in a `NOT-OK` state. That way each
+service recheck will automatically trigger a notification unless the
+service is acknowledged or in a scheduled downtime.
+\ No newline at end of file
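
The difference between the default and the volatile behaviour can be sketched as a small state walk. This is a simplified Python illustration of the description above, not Icinga 2's state machine: the `problem_sequence` helper is hypothetical, it only models consecutive `NOT-OK` results, and it ignores `notification_interval` re-sends, acknowledgements and downtimes.

```python
def problem_sequence(failed_checks, max_check_attempts, volatile=False):
    """Return (state_type, notify) for each consecutive NOT-OK check result."""
    steps = []
    for attempt in range(1, failed_checks + 1):
        if volatile:
            # Volatile: every NOT-OK result is HARD and triggers a notification.
            steps.append(("HARD", True))
        elif attempt < max_check_attempts:
            # Non-volatile: still counting attempts, no notification yet.
            steps.append(("SOFT", False))
        else:
            # Notify only on the transition into the HARD state.
            steps.append(("HARD", attempt == max_check_attempts))
    return steps


print(problem_sequence(4, 3))
print(problem_sequence(3, 3, volatile=True))
```

In this sketch a non-volatile service with `max_check_attempts = 3` notifies once, on the third failed check, while a volatile service notifies on every failed recheck.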
diff --git a/doc/6.10-modified-attributes.md b/doc/6.10-modified-attributes.md

new file mode 100644 (file)

index 0000000..d3229a4
--- /dev/null
+++ b/doc/6.10-modified-attributes.md
@@ -0,0 +1,8 @@
+## <a id="modified-attributes"></a> Modified Attributes
+
+Icinga 2 allows you to modify defined object attributes at runtime so that they
+differ from the local configuration object attributes. These modified attributes are
+stored as a bit-shifted value and made available in backends. Icinga 2 stores
+modified attributes in its state file and restores them on restart.
+
+Modified Attributes can be reset using external commands.
+\ No newline at end of file
diff --git a/doc/6.11-plugin-api.md b/doc/6.11-plugin-api.md

new file mode 100644 (file)

index 0000000..c5b61cc
--- /dev/null
+++ b/doc/6.11-plugin-api.md
@@ -0,0 +1,10 @@
+## <a id="plugin-api"></a> Plugin API
+
+Currently the native plugin API inherited from the `Monitoring Plugins` (formerly
+`Nagios Plugins`) project is available.
+Future specifications will be documented here.
+
+### <a id="monitoring-plugin-api"></a> Monitoring Plugin API
+
+The `Monitoring Plugin API` (formerly `Nagios Plugin API`) is defined in the
+[Monitoring Plugins Development Guidelines](https://www.monitoring-plugins.org/doc/guidelines.html).
+\ No newline at end of file
doc/6-advanced-topics.md
doc/6.01-downtimes.md	[new file with mode: 0644]
doc/6.02-comments.md	[new file with mode: 0644]
doc/6.03-acknowledgements.md	[new file with mode: 0644]
doc/6.04-cluster.md	[new file with mode: 0644]
doc/6.05-domains.md	[new file with mode: 0644]
doc/6.06-dependencies.md	[new file with mode: 0644]
doc/6.07-check-result-freshness.md	[new file with mode: 0644]
doc/6.08-check-flapping.md	[new file with mode: 0644]
doc/6.09-volatile-services.md	[new file with mode: 0644]
doc/6.10-modified-attributes.md	[new file with mode: 0644]
doc/6.11-plugin-api.md	[new file with mode: 0644]