* [Register](https://accounts.icinga.org/register) an Icinga account.
* Create a new issue at the [Icinga 2 Development Tracker](https://dev.icinga.org/projects/i2).
-* When reporting a bug, please include the details described in the [Troubleshooting](16-troubleshooting.md#troubleshooting-information-required) chapter (version, configs, logs, etc).
+* When reporting a bug, please include the details described in the [Troubleshooting](16-troubleshooting.md#troubleshooting-information-required) chapter (version, configs, logs, etc.).
## <a id="whats-new"></a> What's New
* You can blacklist remote nodes entirely. They are then ignored on `node update-config`
on the master.
* Your remote instance can have local configuration **and** act as remote command execution bridge.
-* You can use the `global` cluster zones to sync check commands, templates, etc to your remote clients.
+* You can use the `global` cluster zones to sync check commands, templates, etc. to your remote clients.
Be it just for command execution or for helping the local configuration.
* If your remote clients shouldn't have local configuration, remove `conf.d` inclusion from `icinga2`
and simply use the cluster configuration sync.
### <a id="icinga2-client-configuration-command-bridge"></a> Clients as Command Execution Bridge
-Similar to other addons (NRPE, NSClient++, etc) the remote Icinga 2 client will only
+Similar to other addons (NRPE, NSClient++, etc.) the remote Icinga 2 client will only
execute commands the master instance is sending. There are no local host or service
objects configured, only the check command definitions must be configured.
### <a id="icinga2-client-configuration-master-config-sync"></a> Clients with Master Config Sync
This is an advanced configuration mode which requires knowledge about the Icinga 2
-cluster configuration and its object relation (Zones, Endpoints, etc) and the way you
+cluster configuration and its object relation (Zones, Endpoints, etc.) and the way you
will be able to sync the configuration from the master to the remote satellite or client.
Please continue reading in the [distributed monitoring chapter](13-distributed-monitoring-ha.md#distributed-monitoring-high-availability),
* All nodes (endpoints) in a cluster zone provide high availability functionality and trust each other.
* Cluster zones can be built in a Top-Down-design where the child trusts the parent.
-Decide whether to use the built-in [configuration syncronization](13-distributed-monitoring-ha.md#cluster-zone-config-sync) or use an external tool (Puppet, Ansible, Chef, Salt, etc) to manage the configuration deployment.
+Decide whether to use the built-in [configuration synchronization](13-distributed-monitoring-ha.md#cluster-zone-config-sync) or use an external tool (Puppet, Ansible, Chef, Salt, etc.) to manage the configuration deployment.
> **Tip**
> ** Note **
>
-> Only put templates, groups, etc into this zone. DO NOT add checkable objects such as
+> Only put templates, groups, etc. into this zone. DO NOT add checkable objects such as
> hosts or services here. If they are checked by all instances globally, this will lead
> into duplicated check results and unclear state history. Not easy to troubleshoot too -
> you have been warned.
>
> There's a [Vagrant demo setup](https://github.com/Icinga/icinga-vagrant/tree/master/icinga2x-cluster)
> available featuring a two node cluster showcasing several aspects (config sync,
-> remote command execution, etc).
+> remote command execution, etc.).
### <a id="cluster-scenarios-master-satellite-clients"></a> Cluster with Master, Satellites and Remote Clients
### <a id="cluster-scenarios-security"></a> Security in Cluster Scenarios
While there are certain capabilities to ensure the safe communication between all
-nodes (firewalls, policies, software hardening, etc) the Icinga 2 cluster also provides
+nodes (firewalls, policies, software hardening, etc.) the Icinga 2 cluster also provides
additional security itself:
* [SSL certificates](13-distributed-monitoring-ha.md#manual-certificate-generation) are mandatory for cluster communication.
-* Child zones only receive event updates (check results, commands, etc) for their configured updates.
+* Child zones only receive event updates (check results, commands, etc.) for their configured updates.
* Zones cannot influence/interfere other zones. Each checked object is assigned to only one zone.
* All nodes in a zone trust each other.
* [Configuration sync](13-distributed-monitoring-ha.md#zone-config-sync-permissions) is disabled by default.
(or the master is able to connect, depending on firewall policies) which means
remote instances won't see each/connect to each other.
-All events (check results, downtimes, comments, etc) are synced to the master node,
+All events (check results, downtimes, comments, etc.) are synced to the master node,
but the remote nodes can still run local features such as a web interface, reporting,
graphing, etc. in their own specified zone.
Use your distribution's package manager to install the `pnp4nagios` package.
-If you're planning to use it configure it to use the
+If you're planning to use it, configure it to use the
[bulk mode with npcd and npcdmod](http://docs.pnp4nagios.org/pnp-0.6/modes#bulk_mode_with_npcd_and_npcdmod)
in combination with Icinga 2's [PerfdataWriter](15-features.md#performance-data). NPCD collects the performance
data files which Icinga 2 generates.
Graphite consists of 3 software components:
-* carbon - a Twisted daemon that listens for time-series data
-* whisper - a simple database library for storing time-series data (similar in design to RRD)
-* graphite webapp - A Django webapp that renders graphs on-demand using Cairo
+* carbon -- a Twisted daemon that listens for time-series data
+* whisper -- a simple database library for storing time-series data (similar in design to RRD)
+* graphite webapp -- a Django webapp that renders graphs on-demand using Cairo
Use the [GraphiteWriter](15-features.md#graphite-carbon-cache-writer) feature
for sending real-time metrics from Icinga 2 to Graphite.
## <a id="configuration-tools"></a> Configuration Management Tools
-If you require your favourite configuration tool to export Icinga 2 configuration, please get in
+If you require your favourite configuration tool to export the Icinga 2 configuration, please get in
touch with their developers. The Icinga project does not provide a configuration web interface
yet. Follow the [Icinga Blog](https://www.icinga.org/blog/) for updates on this topic.
-If you're looking for puppet manifests, chef cookbooks, ansible recipes, etc - we're happy
+If you're looking for puppet manifests, chef cookbooks, ansible recipes, etc. -- we're happy
to integrate them upstream, so please get in touch with the [Icinga team](https://www.icinga.org/community/get-involved/).
These tools are currently in development and require feedback and tests:
vars.pnp_check_arg1 = "!$nrpe_command$"
}
-If there are warnings about unresolved macros make sure to specify a default value for `vars.pnp_check_arg1` inside the
+If there are warnings about unresolved macros, make sure to specify a default value for `vars.pnp_check_arg1` inside the
In PNP, the custom template for nrpe is then defined in `/etc/pnp4nagios/custom/nrpe.cfg`
and the additional command arg string will be seen in the xml too for other templates.
The following example query checks the health of the current Icinga 2 instance
writing its current status to the DB IDO backend table `icinga_programstatus`
every 10 seconds. By default it checks 60 seconds into the past which is a reasonable
-amount of time - adjust it for your requirements. If the condition is not met,
+amount of time -- adjust it for your requirements. If the condition is not met,
the query returns an empty result.
> **Tip**
>
> Use [check plugins](14-addons-plugins.md#plugins) to monitor the backend.
-Replace the `default` string with your instance name, if different.
+Replace the `default` string with your instance name if different.
Example for MySQL:
> **Note**
>
-> If you are not using any web interface or addon which uses these files
+> If you are not using any web interface or addon which uses these files,
> you can safely disable this feature.
* `icinga2 feature list`
* `icinga2 daemon --validate`
* Relevant output from your main and debug log ( `icinga2 object list --type='filelogger'` )
- * The newest Icinga 2 crash log, if relevant
+ * The newest Icinga 2 crash log if relevant
* Your icinga2.conf and, if you run multiple Icinga 2 instances, your zones.conf
* How was Icinga 2 installed (and which repository in case) and which distribution are you using
* Provide complete configuration snippets explaining your problem in detail
-* If the check command failed - what's the output of your manual plugin tests?
+* If the check command failed, what's the output of your manual plugin tests?
* In case of [debugging](21-development.md#development) Icinga 2, the full back traces and outputs
## <a id="troubleshooting-enable-debug-output"></a> Enable Debug Output
### <a id="checks-not-executed"></a> Checks are not executed
-* Check the [debug log](16-troubleshooting.md#troubleshooting-enable-debug-output) to see if the check command gets executed
-* Verify that failed depedencies do not prevent command execution
-* Make sure that the plugin is executable by the Icinga 2 user (run a manual test)
+* Check the [debug log](16-troubleshooting.md#troubleshooting-enable-debug-output) to see if the check command gets executed.
+* Verify that failed dependencies do not prevent command execution.
+* Make sure that the plugin is executable by the Icinga 2 user (run a manual test).
* Make sure the [checker](8-cli-commands.md#enable-features) feature is enabled.
* Use the Icinga 2 API [event streams](9-icinga2-api.md#icinga2-api-event-streams) to receive live check result streams.
## <a id="notifications-not-sent"></a> Notifications are not sent
-* Check the debug log to see if a notification is triggered
-* If yes, verify that all conditions are satisfied
+* Check the debug log to see if a notification is triggered.
+* If yes, verify that all conditions are satisfied.
* Are any errors on the notification command execution logged?
-Verify the following configuration
+Verify the following configuration:
-* Is the host/service `enable_notifications` attribute set, and if, to which value?
+* Is the host/service `enable_notifications` attribute set, and if so, to which value?
* Do the notification attributes `states`, `types`, `period` match the notification conditions?
* Do the user attributes `states`, `types`, `period` match the notification conditions?
* Are there any notification `begin` and `end` times configured?
* Make sure the [notification](8-cli-commands.md#enable-features) feature is enabled.
* Does the referenced NotificationCommand work when executed as Icinga user on the shell?
-If notifications are to be sent via mail make sure that the mail program specified inside the
+If notifications are to be sent via mail, make sure that the mail program specified inside the
[NotificationCommand object](6-object-types.md#objecttype-notificationcommand) exists.
The name and location depends on the distribution so the preconfigured setting might have to be
changed on your system.
* Packet loss on the connection
* Firewall rules preventing traffic
-Use tools like `netstat`, `tcpdump`, `nmap`, etc to make sure that the cluster communication
+Use tools like `netstat`, `tcpdump`, `nmap`, etc. to make sure that the cluster communication
happens (default port is `5665`).
# tcpdump -n port 5665 -i any
### <a id="troubleshooting-cluster-check-results"></a> Cluster Troubleshooting Overdue Check Results
If your master does not receive check results (or any other events) from the child zones
-(satellite, clients, etc) make sure to check whether the client sending in events
+(satellite, clients, etc.), make sure to check whether the client sending in events
is allowed to do so.
The [cluster naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention)
-applies so if there's a mismatch between your client node's endpoint name and its provided
+applies. So, if there's a mismatch between your client node's endpoint name and its provided
certificate's CN, the master will deny all events.
> **Tip**
## <a id="upgrading-mysql-db"></a> Upgrading the MySQL database
-If you're upgrading an existing Icinga 2 instance you should check the
+If you're upgrading an existing Icinga 2 instance, you should check the
`/usr/share/icinga2-ido-mysql/schema/upgrade` directory for an incremental schema upgrade file.
> **Note**
>
-> If there isn't an upgrade file for your current version available there's nothing to do.
+> If there isn't an upgrade file for your current version available, there's nothing to do.
Apply all database schema upgrade files incrementally.
## <a id="upgrading-postgresql-db"></a> Upgrading the PostgreSQL database
-If you're updating an existing Icinga 2 instance you should check the
+If you're updating an existing Icinga 2 instance, you should check the
`/usr/share/icinga2-ido-pgsql/schema/upgrade` directory for an incremental schema upgrade file.
> **Note**
>
-> If there isn't an upgrade file for your current version available there's nothing to do.
+> If there isn't an upgrade file for your current version available, there's nothing to do.
Apply all database schema upgrade files incrementally.
Identifiers may not contain certain characters (e.g. space) or start
with certain characters (e.g. digits). If you want to use a dictionary
-key that is not a valid identifier you can enclose the key in double
+key that is not a valid identifier, you can enclose the key in double
quotes.
### <a id="array"></a> Array
}
}
-If the `hello` attribute does not already have a value it is automatically initialized to an empty dictionary.
+If the `hello` attribute does not already have a value, it is automatically initialized to an empty dictionary.
## <a id="template-imports"></a> Template Imports
Any valid config attribute can be accessed using the `host` and `service`
variables. For example, `host.address` would return the value of the host's
-"address" attribute - or null if that attribute isn't set.
+"address" attribute -- or null if that attribute isn't set.
More usage examples are documented in the [monitoring basics](3-monitoring-basics.md#using-apply-expressions)
chapter.
function max(...);
Returns the largest argument. A variable number of arguments can be specified.
-If no arguments are given -Infinity is returned.
+If no arguments are given, -Infinity is returned.
### <a id="math-min"></a> Math.min
function min(...);
Returns the smallest argument. A variable number of arguments can be specified.
-If no arguments are given +Infinity is returned.
+If no arguments are given, +Infinity is returned.
### <a id="math-pow"></a> Math.pow
function find(str, start);
Returns the zero-based index at which the string `str` was found in the string. If the string
-was not found -1 is returned. `start` specifies the zero-based index at which `find` should
+was not found, -1 is returned. `start` specifies the zero-based index at which `find` should
start looking for the string (defaults to 0 when not specified).
Example:
function contains(str);
Returns `true` if the string `str` was found in the string. If the string
-was not found `false` is returned. Use [find](19-library-reference.md#string-find)
+was not found, `false` is returned. Use [find](19-library-reference.md#string-find)
for getting the index instead.
Example:
Object prototype;
-Returns the prototype object for the type. When an attribute is accessed on an object that doesn't exist the prototype object is checked to see if an attribute with the requested name exists there. If it does that attribute's value is returned.
+Returns the prototype object for the type. When an attribute that doesn't exist on an object is accessed, the prototype object is checked to see if an attribute with the requested name exists there. If it does, that attribute's value is returned.
The prototype functionality is used to implement methods.
Furthermore, you may also have to install debug symbols for Boost and your C library.
-If you're building your own binaries you should use the `-DCMAKE_BUILD_TYPE=Debug` cmake
+If you're building your own binaries, you should use the `-DCMAKE_BUILD_TYPE=Debug` cmake
build flag for debug builds.
Call GDB with the binary (`/usr/sbin/icinga2` is a wrapper script calling
`/usr/lib64/icinga2/sbin/icinga2` since 2.4) and all arguments and run it in foreground.
-If VFork causes trouble disable it inside the gdb run.
+If VFork causes trouble, disable it inside the gdb run.
# gdb --args /usr/lib64/icinga2/sbin/icinga2 daemon -x debug -DUseVfork=0
(gdb) bt
(gdb) thread apply all bt full
-If Icinga 2 is still running generate a full backtrace from the running
+If Icinga 2 is still running, generate a full backtrace from the running
process and store it into a new file (e.g. for debugging dead locks):
# gdb -p $(pidof icinga2) -batch -ex "thread apply all bt full" -ex "detach" -ex "q" > gdb_bt.log
-If you're opening an issue at [https://dev.icinga.org] make sure
+If you're opening an issue at [https://dev.icinga.org], make sure
to attach as much detail as possible.
### <a id="development-debug-gdb-backtrace-stepping"></a> GDB Backtrace Stepping
vars.CVTEST = "service cv value"
}
-If you are just defining `$CVTEST$` in your command definition its value depends on the
-execution scope - the host check command will fetch the host attribute value of `vars.CVTEST`
+If you are just defining `$CVTEST$` in your command definition, its value depends on the
+execution scope -- the host check command will fetch the host attribute value of `vars.CVTEST`
while the service check command resolves its value to the service attribute attribute `vars.CVTEST`.
> **Note**
}
The `service_notification_options` can be [mapped](22-migrating-from-icinga-1x.md#manual-config-migration-hints-notification-filters)
-into generic `state` and `type` filters, if additional notification filtering is required. `alias` gets
+into generic `state` and `type` filters if additional notification filtering is required. `alias` gets
renamed to `display_name`.
object User "testconfig-user" {
the state filter defined in the `execution_failure_criteria` defines the Icinga 2 `state` attribute.
If the state filter matches, you can define whether to disable checks and notifications or not.
-The following example describes service dependencies. If you migrate from Icinga 1.x you will only
+The following example describes service dependencies. If you migrate from Icinga 1.x, you will only
want to use the classic `Host-to-Host` and `Service-to-Service` dependency relationships.
define service {
* If your current setup consists of instances distributing the check load, you should consider
building a [load distribution](13-distributed-monitoring-ha.md#cluster-scenarios-load-distribution) setup with Icinga 2.
-* If your current setup includes active/passive clustering with external tools like Pacemaker/DRBD
+* If your current setup includes active/passive clustering with external tools like Pacemaker/DRBD,
consider the [High Availability](13-distributed-monitoring-ha.md#cluster-scenarios-high-availability) setup.
-* If you have build your own custom configuration deployment and check result collecting mechanism
+* If you have built your own custom configuration deployment and check result collecting mechanism,
you should re-design your setup and re-evaluate your requirements, and how they may be fulfilled
using the Icinga 2 cluster capabilities.
notifications_enabled 0
}
-Icinga 2 supports objects and (global) variables, but does not make a difference
-if it's the main configuration file, or any included file.
+Icinga 2 supports objects and (global) variables, but does not differentiate
+between the main configuration file and any included file.
icinga2.conf:
check_interval = 5m
}
-Please note that the default time value is seconds, if no duration literal
+Please note that the default time value is seconds if no duration literal
is given. `check_interval = 5` behaves the same as `check_interval = 5s`.
All strings require double quotes in Icinga 2. Therefore a double quote
### <a id="differences-1x-2-macros"></a> Macros
Various object attributes and runtime variables can be accessed as macros in
-commands in Icinga 1.x - Icinga 2 supports all required [custom attributes](3-monitoring-basics.md#custom-attributes).
+commands in Icinga 1.x -- Icinga 2 supports all required [custom attributes](3-monitoring-basics.md#custom-attributes).
#### <a id="differences-1x-2-command-arguments"></a> Command Arguments
-If you have previously used Icinga 1.x you may already be familiar with
+If you have previously used Icinga 1.x, you may already be familiar with
user and argument definitions (e.g., `USER1` or `ARG1`). Unlike in Icinga 1.x
the Icinga 2 custom attributes may have arbitrary names and arguments are no
longer specified in the `check_command` setting.
### <a id="differences-1x-2-async-event-execution"></a> Asynchronous Event Execution
-Unlike Icinga 1.x, Icinga 2 does not block when it waits for a command
-being executed - be it a check, notification, event handler, performance data writing update, etc.
-That way you'll recognize low to zero (check) latencies with Icinga 2.
+Unlike Icinga 1.x, Icinga 2 does not block while it waits for a command
+to be executed -- whether it's a check, a notification, an event
+handler, a performance data writing update, etc. That way you'll
+notice low to zero (check) latencies with Icinga 2.
### <a id="differences-1x-2-checks"></a> Checks
timeout. This was essentially bad when there only was a couple of check plugins
requiring some command timeouts to be extended.
-Icinga 2 allows you to specify the command timeout directly on the command. So
+Icinga 2 allows you to specify the command timeout directly on the command. So,
if your VMVware check plugin takes 15 minutes, [increase the timeout](6-object-types.md#objecttype-checkcommand)
accordingly.
In Nagios / Icinga 1.x a daemon reload does the following:
* receive reload signal SIGHUP
-* stop all events (checks, notifications, etc)
+* stop all events (checks, notifications, etc.)
* read the configuration from disk and validate all config objects in a single threaded fashion
* validation NOT ok: stop the daemon (cannot restore old config state)
* validation ok: start with new objects, dump status.dat / ido
num_services_hard_warn | int | All services in a hard state with Warning state.
num_services_hard_crit | int | All services in a hard state with Critical state.
num_services_hard_unknown | int | All services in a hard state with Unknown state.
- hard_state | int | Returns OK, if state is OK. Returns current state if now a hard state type. Returns last hard state otherwise.
+ hard_state | int | Returns OK if state is OK. Returns current state if now a hard state type. Returns last hard state otherwise.
staleness | int | Indicates time since last check normalized onto the check_interval.
groups | array | All hostgroups this host is a member of.
contact_groups | array | All usergroups associated with this host through notifications.
custom_variable_names | array | .
custom_variable_values | array | .
custom_variables | array | Array of custom variable array pair.
- hard_state | int | Returns OK, if state is OK. Returns current state if now a hard state type. Returns last hard state otherwise.
+ hard_state | int | Returns OK if state is OK. Returns current state if now a hard state type. Returns last hard state otherwise.
staleness | int | Indicates time since last check normalized onto the check_interval.
groups | array | All hostgroups this host is a member of.
contact_groups | array | All usergroups associated with this host through notifications.
Imagine a different more advanced example: You are monitoring your network device (host)
with many interfaces (services). The following requirements/problems apply:
-* Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc)
+* Each interface service check should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc.)
* Each interface has its own vlan tag
* Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
### <a id="notification-commands"></a> Notification Commands
[NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects define how notifications are delivered to external
-interfaces (email, XMPP, IRC, Twitter, etc).
+interfaces (email, XMPP, IRC, Twitter, etc.).
[NotificationCommand](6-object-types.md#objecttype-notificationcommand) objects are referenced by
[Notification](6-object-types.md#objecttype-notification) objects using the `command` attribute.
In order to find the best strategy for your own configuration, ask yourself the following questions:
-* Do your hosts share a common group of services (for example linux hosts with disk, load, etc checks)?
+* Do your hosts share a common group of services (for example linux hosts with disk, load, etc. checks)?
* Only a small set of users receives notifications and escalations for all hosts/services?
If you can at least answer one of these questions with yes, look for the
* You are required to define specific configuration for each host/service?
* Does your configuration generation tool already know about the host-service-relationship?
-Then you should look for the object specific configuration setting `host_name` etc accordingly.
+Then you should look for the object specific configuration setting `host_name` etc. accordingly.
Finding the best files and directory tree for your configuration is up to you. Make sure that
the [icinga2.conf](4-configuring-icinga-2.md#icinga2-conf) configuration file includes them,
and then think about:
* tree-based on locations, hostgroups, specific host attributes with sub levels of directories.
-* flat `hosts.conf`, `services.conf`, etc files for rule based configuration.
+* flat `hosts.conf`, `services.conf`, etc. files for rule based configuration.
* generated configuration with one file per host and a global configuration for groups, users, etc.
* one big file generated from an external application (probably a bad idea for maintaining changes).
* your own.
snmp_crit | **Optional.** The critical threshold.
snmp_interface | **Optional.** Network interface name. Default to regex "eth0".
snmp_interface_perf | **Optional.** Check the input/ouput bandwidth of the interface. Defaults to true.
-snmp_interface_label | **Optional.** Add label before speed in output: in=, out=, errors-out=, etc...
+snmp_interface_label | **Optional.** Add label before speed in output: in=, out=, errors-out=, etc.
snmp_interface_bits_bytes | **Optional.** Output performance data in bits/s or Bytes/s. **Depends** on snmp_interface_kbits set to true. Defaults to true.
snmp_interface_percent | **Optional.** Output performance data in % of max speed. Defaults to false.
snmp_interface_kbits | **Optional.** Make the warning and critical levels in KBits/s. Defaults to true.