
   1 # Icinga 2 Features <a id="icinga2-features"></a>
   2
   3 ## Logging <a id="logging"></a>
   4
   5 Icinga 2 supports three different types of logging:
   6
   7 * File logging
   8 * Syslog (on Linux/UNIX)
   9 * Console logging (`STDOUT` on tty)
  10
You can enable or disable loggers using the `icinga2 feature enable`
and `icinga2 feature disable` commands:
  13
  14 Feature  | Description
  15 ---------|------------
  16 debuglog | Debug log (path: `/var/log/icinga2/debug.log`, severity: `debug` or higher)
  17 mainlog  | Main log (path: `/var/log/icinga2/icinga2.log`, severity: `information` or higher)
  18 syslog   | Syslog (severity: `warning` or higher)
  19
By default, the `mainlog` feature is enabled. When running Icinga 2
on a terminal, log messages with severity `information` or higher are
written to the console.
  23
  24 ### Log Rotation <a id="logging-logrotate"></a>
  25
  26 Packages provide a configuration file for [logrotate](https://linux.die.net/man/8/logrotate)
  27 on Linux/Unix. Typically this is installed into `/etc/logrotate.d/icinga2`
  28 and modifications won't be overridden on upgrade.
  29
Instead of sending the reload HUP signal, logrotate
sends the USR1 signal to notify the Icinga daemon
that it has rotated the log file. Icinga then reopens the log
files:

* `/var/log/icinga2/icinga2.log` (requires `mainlog` enabled)
* `/var/log/icinga2/debug.log` (requires `debuglog` enabled)
* `/var/log/icinga2/error.log`
  38
  39 By default, log files will be rotated daily.
  40
  41 ## Core Backends <a id="core-backends"></a>
  42
  43 ### REST API <a id="core-backends-api"></a>
  44
  45 The REST API is documented [here](12-icinga2-api.md#icinga2-api) as a core feature.
  46
  47 ### IDO Database (DB IDO) <a id="db-ido"></a>
  48
  49 The IDO (Icinga Data Output) feature for Icinga 2 takes care of exporting all
  50 configuration and status information into a database. The IDO database is used
  51 by Icinga Web 2 as data backend.
  52
  53 Details on the installation can be found in the [Configuring DB IDO](02-getting-started.md#configuring-db-ido-mysql)
  54 chapter. Details on the configuration can be found in the
  55 [IdoMysqlConnection](09-object-types.md#objecttype-idomysqlconnection) and
  56 [IdoPgsqlConnection](09-object-types.md#objecttype-idopgsqlconnection)
  57 object configuration documentation.
  58
  59 #### DB IDO Health <a id="db-ido-health"></a>
  60
  61 If the monitoring health indicator is critical in Icinga Web 2,
  62 you can use the following queries to manually check whether Icinga 2
  63 is actually updating the IDO database.
  64
  65 Icinga 2 writes its current status to the `icinga_programstatus` table
  66 every 10 seconds. The query below checks 60 seconds into the past which is a reasonable
  67 amount of time -- adjust it for your requirements. If the condition is not met,
  68 the query returns an empty result.
  69
  70 > **Tip**
  71 >
  72 > Use [check plugins](05-service-monitoring.md#service-monitoring-plugins) to monitor the backend.
  73
  74 Replace the `default` string with your instance name if different.
  75
  76 Example for MySQL:
  77
  78 ```
  79 # mysql -u root -p icinga -e "SELECT status_update_time FROM icinga_programstatus ps
  80   JOIN icinga_instances i ON ps.instance_id=i.instance_id
  81   WHERE (UNIX_TIMESTAMP(ps.status_update_time) > UNIX_TIMESTAMP(NOW())-60)
  82   AND i.instance_name='default';"
  83
  84 +---------------------+
  85 | status_update_time  |
  86 +---------------------+
  87 | 2014-05-29 14:29:56 |
  88 +---------------------+
  89 ```
  90
  91 Example for PostgreSQL:
  92
```
# export PGPASSWORD=icinga; psql -U icinga -d icinga -c "SELECT ps.status_update_time FROM icinga_programstatus AS ps
  JOIN icinga_instances AS i ON ps.instance_id=i.instance_id
  WHERE ((SELECT extract(epoch from status_update_time) FROM icinga_programstatus) > (SELECT extract(epoch from now())-60))
  AND i.instance_name='default'";

status_update_time
------------------------
 2014-05-29 15:11:38+02
(1 row)
```
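
The freshness condition used by both queries can be sketched outside the database as well. The following Python snippet is only an illustration of the comparison, assuming you have already fetched `status_update_time` by some means; the function name `ido_is_fresh` and its parameters are hypothetical and not part of Icinga 2.

```python
from datetime import datetime, timedelta


def ido_is_fresh(status_update_time, now, max_age_seconds=60):
    """Return True if the last status update is newer than now - max_age_seconds.

    Mirrors the strict `>` comparison used in the SQL queries above.
    """
    return now - status_update_time < timedelta(seconds=max_age_seconds)


# Timestamp taken from the MySQL example output above.
last_update = datetime(2014, 5, 29, 14, 29, 56)

print(ido_is_fresh(last_update, datetime(2014, 5, 29, 14, 30, 30)))  # 34 seconds old
print(ido_is_fresh(last_update, datetime(2014, 5, 29, 14, 32, 0)))   # 124 seconds old
```

A check plugin built around such a comparison would report a critical state whenever the function returns `False`.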
 104
 105 A detailed list on the available table attributes can be found in the [DB IDO Schema documentation](24-appendix.md#schema-db-ido).
 106
 107 #### DB IDO in Cluster HA Zones <a id="db-ido-cluster-ha"></a>
 108
 109 The DB IDO feature supports [High Availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-db-ido) in
 110 the Icinga 2 cluster.
 111
By default, both endpoints in a zone calculate which
endpoint activates the feature; the other endpoint
automatically pauses it. If the cluster connection
breaks at some point, the paused IDO feature automatically
takes over (failover).
 117
 118 You can disable this behaviour by setting `enable_ha = false`
 119 in both feature configuration files.
 120
 121 #### DB IDO Cleanup <a id="db-ido-cleanup"></a>
 122
 123 Objects get deactivated when they are deleted from the configuration.
 124 This is visible with the `is_active` column in the `icinga_objects` table.
 125 Therefore all queries need to join this table and add `WHERE is_active=1` as
 126 condition. Deleted objects preserve their history table entries for later SLA
 127 reporting.
 128
Historical data isn't purged by default. You can set the maximum
age of data to keep in the `cleanup` configuration attribute of the
IDO features [IdoMysqlConnection](09-object-types.md#objecttype-idomysqlconnection)
and [IdoPgsqlConnection](09-object-types.md#objecttype-idopgsqlconnection).
 133
 134 Example if you prefer to keep notification history for 30 days:
 135
 136 ```
 137   cleanup = {
 138      notifications_age = 30d
 139      contactnotifications_age = 30d
 140   }
 141 ```
 142
 143 The historical tables are populated depending on the data `categories` specified.
 144 Some tables are empty by default.
 145
 146 #### DB IDO Tuning <a id="db-ido-tuning"></a>
 147
 148 As with any application database, there are ways to optimize and tune the database performance.
 149
 150 General tips for performance tuning:
 151
 152 * [MariaDB KB](https://mariadb.com/kb/en/library/optimization-and-tuning/)
 153 * [PostgreSQL Wiki](https://wiki.postgresql.org/wiki/Performance_Optimization)
 154
Re-creation of indexes, changed column values, etc. will increase the database size. Make sure to
add health checks for this, and monitor the trend in your Grafana dashboards.

There are different approaches to optimizing the tables. Always make sure you have a
current backup, and schedule maintenance downtime for this kind of task!
 160
 161 MySQL:
 162
 163 ```
 164 mariadb> OPTIMIZE TABLE icinga_statehistory;
 165 ```
 166
 167 > **Important**
 168 >
 169 > Tables might not support optimization at runtime. This can take a **long** time.
 170 >
 171 > `Table does not support optimize, doing recreate + analyze instead`.
 172
If you want to optimize all tables in a specified database, you can use the `mysqlcheck` tool.
It also allows you to repair broken tables in case of emergency.
 175
 176 ```
 177 mysqlcheck --optimize icinga
 178 ```
 179
 180 PostgreSQL:
 181
 182 ```
 183 icinga=# vacuum;
 184 VACUUM
 185 ```
 186
 187 > **Note**
 188 >
 189 > Don't use `VACUUM FULL` as this has a severe impact on performance.
 190
 191
 192 ## Metrics <a id="metrics"></a>
 193
 194 Whenever a host or service check is executed, or received via the REST API,
 195 best practice is to provide performance data.
 196
 197 This data is parsed by features sending metrics to time series databases (TSDB):
 198
 199 * [Graphite](14-features.md#graphite-carbon-cache-writer)
 200 * [InfluxDB](14-features.md#influxdb-writer)
 201 * [OpenTSDB](14-features.md#opentsdb-writer)
 202
 203 Metrics, state changes and notifications can be managed with the following integrations:
 204
 205 * [Elastic Stack](14-features.md#elastic-stack-integration)
 206 * [Graylog](14-features.md#graylog-integration)
 207
 208
 209 ### Graphite Writer <a id="graphite-carbon-cache-writer"></a>
 210
 211 [Graphite](13-addons.md#addons-graphing-graphite) is a tool stack for storing
 212 metrics and needs to be running prior to enabling the `graphite` feature.
 213
 214 Icinga 2 writes parsed metrics directly to Graphite's Carbon Cache
 215 TCP port, defaulting to `2003`.
 216
 217 You can enable the feature using
 218
 219 ```
 220 # icinga2 feature enable graphite
 221 ```
 222
 223 By default the [GraphiteWriter](09-object-types.md#objecttype-graphitewriter) feature
 224 expects the Graphite Carbon Cache to listen at `127.0.0.1` on TCP port `2003`.
 225
 226 #### Graphite Schema <a id="graphite-carbon-cache-writer-schema"></a>
 227
 228 The current naming schema is defined as follows. The [Icinga Web 2 Graphite module](https://github.com/icinga/icingaweb2-module-graphite)
 229 depends on this schema.
 230
The default prefix for hosts and services is configured using
[runtime macros](03-monitoring-basics.md#runtime-macros) like this:
 233
 234 ```
 235 icinga2.$host.name$.host.$host.check_command$
 236 icinga2.$host.name$.services.$service.name$.$service.check_command$
 237 ```
 238
 239 You can customize the prefix name by using the `host_name_template` and
 240 `service_name_template` configuration attributes.
 241
The additional levels allow fine-grained filters and templating,
e.g. by using the check command `disk` for specific
graph templates in web applications rendering the Graphite data.
 245
 246 The following characters are escaped in prefix labels:
 247
 248   Character     | Escaped character
 249   --------------|--------------------------
 250   whitespace    | _
 251   .             | _
 252   \             | _
 253   /             | _
 254
 255 Metric values are stored like this:
 256
 257 ```
 258 <prefix>.perfdata.<perfdata-label>.value
 259 ```
 260
 261 The following characters are escaped in performance labels
 262 parsed from plugin output:
 263
 264   Character     | Escaped character
 265   --------------|--------------------------
 266   whitespace    | _
 267   \             | _
 268   /             | _
 269   ::            | .
 270
Note that labels may contain dots (`.`), which add
further levels inside the Graphite tree.
`::` adds support for [multi performance labels](http://my-plugin.de/wiki/projects/check_multi/configuration/performance)
and is therefore replaced by `.`.
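
The escaping rules from the two tables above can be sketched in a few lines of Python. This is an illustration of the documented schema, not the GraphiteWriter implementation; the function names are hypothetical, and the snippet assumes that every single whitespace character is replaced by one underscore.

```python
import re


def escape_prefix_label(label):
    """Escape a prefix label: whitespace, '.', '\\' and '/' each become '_'."""
    return re.sub(r"[\s.\\/]", "_", label)


def escape_perfdata_label(label):
    """Escape a perfdata label: '::' becomes '.', whitespace, '\\' and '/' become '_'."""
    label = label.replace("::", ".")
    return re.sub(r"[\s\\/]", "_", label)


def service_metric_path(host, service, check_command, perf_label):
    """Build the metric path for a service using the default prefix schema."""
    prefix = ".".join([
        "icinga2",
        escape_prefix_label(host),
        "services",
        escape_prefix_label(service),
        escape_prefix_label(check_command),
    ])
    return f"{prefix}.perfdata.{escape_perfdata_label(perf_label)}.value"


print(service_metric_path("www-01.example.org", "disk /", "disk", "/var/log"))
# icinga2.www-01_example_org.services.disk__.disk.perfdata._var_log.value
```

Note how the dots in the host name are escaped, so the FQDN does not create additional levels in the Graphite tree, while `::` in a perfdata label intentionally does.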
 275
 276 By enabling `enable_send_thresholds` Icinga 2 automatically adds the following threshold metrics:
 277
 278 ```
 279 <prefix>.perfdata.<perfdata-label>.min
 280 <prefix>.perfdata.<perfdata-label>.max
 281 <prefix>.perfdata.<perfdata-label>.warn
 282 <prefix>.perfdata.<perfdata-label>.crit
 283 ```
 284
 285 By enabling `enable_send_metadata` Icinga 2 automatically adds the following metadata metrics:
 286
 287 ```
 288 <prefix>.metadata.current_attempt
 289 <prefix>.metadata.downtime_depth
 290 <prefix>.metadata.acknowledgement
 291 <prefix>.metadata.execution_time
 292 <prefix>.metadata.latency
 293 <prefix>.metadata.max_check_attempts
 294 <prefix>.metadata.reachable
 295 <prefix>.metadata.state
 296 <prefix>.metadata.state_type
 297 ```
 298
 299 Metadata metric overview:
 300
 301   metric             | description
 302   -------------------|------------------------------------------
 303   current_attempt    | current check attempt
 304   max_check_attempts | maximum check attempts until the hard state is reached
 305   reachable          | checked object is reachable
 306   downtime_depth     | number of downtimes this object is in
 307   acknowledgement    | whether the object is acknowledged or not
 308   execution_time     | check execution time
 309   latency            | check latency
 310   state              | current state of the checked object
 311   state_type         | 0=SOFT, 1=HARD state
 312
 313 The following example illustrates how to configure the storage schemas for Graphite Carbon
 314 Cache.
 315
 316 ```
 317 [icinga2_default]
 318 # intervals like PNP4Nagios uses them per default
 319 pattern = ^icinga2\.
 320 retentions = 1m:2d,5m:10d,30m:90d,360m:4y
 321 ```
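
To get a feeling for what such a `retentions` line means in terms of stored data points, the following Python sketch computes the resolution and number of points per archive. It is a rough calculation aid, assuming the common `precision:duration` notation with a year counted as 365 days; the helper names are made up for this example.

```python
# Seconds per unit suffix; a year is assumed to be 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert a value such as '5m' or '10d' into seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retentions(retentions):
    """Return a list of (seconds_per_point, number_of_points) per archive."""
    archives = []
    for archive in retentions.split(","):
        precision, duration = archive.split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


for step, points in parse_retentions("1m:2d,5m:10d,30m:90d,360m:4y"):
    print(f"{step:>6}s resolution, {points} points")
```

For the example above this yields 2880, 2880, 4320 and 5840 points for the four archives.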
 322
 323 #### Graphite in Cluster HA Zones <a id="graphite-carbon-cache-writer-cluster-ha"></a>
 324
 325 The Graphite feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
 326 in cluster zones since 2.11.
 327
By default, all endpoints in a zone will activate the feature and start
writing metrics to a Carbon Cache socket. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes metrics while the other
endpoints pause the feature.
 334
 335 When the cluster connection breaks at some point, the remaining endpoint(s)
 336 in that zone will automatically resume the feature. This built-in failover
 337 mechanism ensures that metrics are written even if the cluster fails.
 338
 339 The recommended way of running Graphite in this scenario is a dedicated server
 340 where Carbon Cache/Relay is running as receiver.
 341
 342
 343 ### InfluxDB Writer <a id="influxdb-writer"></a>
 344
 345 Once there are new metrics available, Icinga 2 will directly write them to the
 346 defined InfluxDB HTTP API.
 347
 348 You can enable the feature using
 349
 350 ```
 351 # icinga2 feature enable influxdb
 352 ```
 353
 354 By default the [InfluxdbWriter](09-object-types.md#objecttype-influxdbwriter) feature
 355 expects the InfluxDB daemon to listen at `127.0.0.1` on port `8086`.
 356
 357 Measurement names and tags are fully configurable by the end user. The InfluxdbWriter
 358 object will automatically add a `metric` tag to each data point. This correlates to the
 359 perfdata label. Fields (value, warn, crit, min, max, unit) are created from data if available
 360 and the configuration allows it.  If a value associated with a tag is not able to be
 361 resolved, it will be dropped and not sent to the target host.
 362
Backslashes are allowed in tag keys, tag values and field keys. However, they also act as
escape characters when followed by a space or comma, and cannot be escaped themselves.
As a result, all trailing backslashes in these fields are replaced with an underscore. This
predominantly affects Windows paths, e.g. `C:\` becomes `C:_`.
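
The following Python sketch illustrates the rule described above for a single tag value. It is not the InfluxdbWriter code: the function name is hypothetical, and the snippet assumes the usual line protocol convention of escaping commas, equals signs and spaces with a backslash.

```python
def escape_tag(value):
    """Prepare a tag key, tag value or field key for the InfluxDB line protocol."""
    # A trailing backslash cannot be escaped, so it is replaced with '_'.
    if value.endswith("\\"):
        value = value[:-1] + "_"
    # Commas, equals signs and spaces are escaped with a backslash.
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


print(escape_tag("C:\\"))       # C:_
print(escape_tag("disk /var"))  # disk\ /var
```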
 367
 368 The database is assumed to exist so this object will make no attempt to create it currently.
 369
 370 If [SELinux](22-selinux.md#selinux) is enabled, it will not allow access for Icinga 2 to InfluxDB until the [boolean](22-selinux.md#selinux-policy-booleans)
 371 `icinga2_can_connect_all` is set to true as InfluxDB is not providing its own policy.
 372
 373 More configuration details can be found [here](09-object-types.md#objecttype-influxdbwriter).
 374
 375 #### Instance Tagging <a id="influxdb-writer-instance-tags"></a>
 376
 377 Consider the following service check:
 378
 379 ```
 380 apply Service "disk" for (disk => attributes in host.vars.disks) {
 381   import "generic-service"
 382   check_command = "disk"
 383   display_name = "Disk " + disk
 384   vars.disk_partitions = disk
 385   assign where host.vars.disks
 386 }
 387 ```
 388
This is a typical pattern for checking individual disks, NICs, SSL certificates etc. associated
with a host. It would be useful to have the data points tagged with the specific instance
for that check. This would allow you to query time series data for a check on a host and for a
specific instance, e.g. `/dev/sda`. To do this, simply add the instance to the service variables:
 393
 394 ```
 395 apply Service "disk" for (disk => attributes in host.vars.disks) {
 396   ...
 397   vars.instance = disk
 398   ...
 399 }
 400 ```
 401
 402 Then modify your writer configuration to add this tag to your data points if the instance variable
 403 is associated with the service:
 404
 405 ```
 406 object InfluxdbWriter "influxdb" {
 407   ...
 408   service_template = {
 409     measurement = "$service.check_command$"
 410     tags = {
 411       hostname = "$host.name$"
 412       service = "$service.name$"
 413       instance = "$service.vars.instance$"
 414     }
 415   }
 416   ...
 417 }
 418 ```
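
Services without `vars.instance` are still written, just without the `instance` tag, because tags whose value cannot be resolved are dropped. The Python sketch below imitates that behaviour with a simplified macro lookup. It is an illustration only: `resolve_tags` and the flat `macros` dictionary are hypothetical and much simpler than the real macro resolver in Icinga 2.

```python
import re

# Matches runtime macros such as $host.name$.
MACRO = re.compile(r"\$([^$]+)\$")


def resolve_tags(tag_templates, macros):
    """Resolve macros in tag templates and drop tags that cannot be resolved."""
    resolved = {}
    for tag, template in tag_templates.items():
        names = MACRO.findall(template)
        if any(name not in macros for name in names):
            continue  # unresolved macro: the tag is dropped
        resolved[tag] = MACRO.sub(lambda match: macros[match.group(1)], template)
    return resolved


templates = {
    "hostname": "$host.name$",
    "service": "$service.name$",
    "instance": "$service.vars.instance$",
}

# A disk service with vars.instance set keeps all three tags.
print(resolve_tags(templates, {
    "host.name": "www-01",
    "service.name": "disk /",
    "service.vars.instance": "/",
}))

# A service without vars.instance loses only the instance tag.
print(resolve_tags(templates, {"host.name": "www-01", "service.name": "ping4"}))
```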
 419
 420 #### InfluxDB in Cluster HA Zones <a id="influxdb-writer-cluster-ha"></a>
 421
 422 The InfluxDB feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
 423 in cluster zones since 2.11.
 424
By default, all endpoints in a zone will activate the feature and start
writing metrics to the InfluxDB HTTP API. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes metrics while the other
endpoints pause the feature.
 431
 432 When the cluster connection breaks at some point, the remaining endpoint(s)
 433 in that zone will automatically resume the feature. This built-in failover
 434 mechanism ensures that metrics are written even if the cluster fails.
 435
 436 The recommended way of running InfluxDB in this scenario is a dedicated server
 437 where the InfluxDB HTTP API or Telegraf as Proxy are running.
 438
 439 ### Elastic Stack Integration <a id="elastic-stack-integration"></a>
 440
 441 [Icingabeat](https://github.com/icinga/icingabeat) is an Elastic Beat that fetches data
 442 from the Icinga 2 API and sends it either directly to [Elasticsearch](https://www.elastic.co/products/elasticsearch)
 443 or [Logstash](https://www.elastic.co/products/logstash).
 444
 445 More integrations:
 446
 447 * [Logstash output](https://github.com/Icinga/logstash-output-icinga) for the Icinga 2 API.
 448 * [Logstash Grok Pattern](https://github.com/Icinga/logstash-grok-pattern) for Icinga 2 logs.
 449
 450 #### Elasticsearch Writer <a id="elasticsearch-writer"></a>
 451
 452 This feature forwards check results, state changes and notification events
 453 to an [Elasticsearch](https://www.elastic.co/products/elasticsearch) installation over its HTTP API.
 454
 455 The check results include parsed performance data metrics if enabled.
 456
 457 > **Note**
 458 >
 459 > Elasticsearch 5.x or 6.x are required. This feature has been successfully tested with
 460 > Elasticsearch 5.6.7 and 6.3.1.
 461
 462
 463
 464 Enable the feature and restart Icinga 2.
 465
 466 ```
 467 # icinga2 feature enable elasticsearch
 468 ```
 469
The default configuration expects an Elasticsearch instance running on `localhost` on port `9200`
and writes to an index called `icinga2`.
 472
 473 More configuration details can be found [here](09-object-types.md#objecttype-elasticsearchwriter).
 474
 475 #### Current Elasticsearch Schema <a id="elastic-writer-schema"></a>
 476
 477 The following event types are written to Elasticsearch:
 478
 479 * icinga2.event.checkresult
 480 * icinga2.event.statechange
 481 * icinga2.event.notification
 482
 483 Performance data metrics must be explicitly enabled with the `enable_send_perfdata`
 484 attribute.
 485
 486 Metric values are stored like this:
 487
 488 ```
 489 check_result.perfdata.<perfdata-label>.value
 490 ```
 491
 492 The following characters are escaped in perfdata labels:
 493
 494   Character   | Escaped character
 495   ------------|--------------------------
 496   whitespace  | _
 497   \           | _
 498   /           | _
 499   ::          | .
 500
Note that perfdata labels may contain dots (`.`), which add
further levels inside the tree.
`::` adds support for [multi performance labels](http://my-plugin.de/wiki/projects/check_multi/configuration/performance)
and is therefore replaced by `.`.
 505
 506 Icinga 2 automatically adds the following threshold metrics
 507 if existing:
 508
 509 ```
 510 check_result.perfdata.<perfdata-label>.min
 511 check_result.perfdata.<perfdata-label>.max
 512 check_result.perfdata.<perfdata-label>.warn
 513 check_result.perfdata.<perfdata-label>.crit
 514 ```
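
As a rough illustration of the resulting field names, the Python sketch below builds the flat field set for one perfdata entry and only adds thresholds that exist. The function names and the `thresholds` dictionary are hypothetical; the actual document structure written by the ElasticsearchWriter may differ in detail.

```python
import re


def escape_label(label):
    """Escape a perfdata label: '::' becomes '.', whitespace, '\\' and '/' become '_'."""
    label = label.replace("::", ".")
    return re.sub(r"[\s\\/]", "_", label)


def perfdata_fields(label, value, thresholds=None):
    """Return the field names and values for a single perfdata entry."""
    base = f"check_result.perfdata.{escape_label(label)}"
    fields = {f"{base}.value": value}
    for name in ("min", "max", "warn", "crit"):
        # Thresholds are only added if the plugin provided them.
        if thresholds and thresholds.get(name) is not None:
            fields[f"{base}.{name}"] = thresholds[name]
    return fields


print(perfdata_fields("response time", 0.25, {"warn": 1.0, "crit": 2.0}))
```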
 515
 516 #### Elasticsearch in Cluster HA Zones <a id="elasticsearch-writer-cluster-ha"></a>
 517
 518 The Elasticsearch feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
 519 in cluster zones since 2.11.
 520
By default, all endpoints in a zone will activate the feature and start
writing events to the Elasticsearch HTTP API. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes events while the other
endpoints pause the feature.
 527
 528 When the cluster connection breaks at some point, the remaining endpoint(s)
 529 in that zone will automatically resume the feature. This built-in failover
 530 mechanism ensures that events are written even if the cluster fails.
 531
 532 The recommended way of running Elasticsearch in this scenario is a dedicated server
 533 where you either have the Elasticsearch HTTP API, or a TLS secured HTTP proxy,
 534 or Logstash for additional filtering.
 535
 536 ### Graylog Integration <a id="graylog-integration"></a>
 537
 538 #### GELF Writer <a id="gelfwriter"></a>
 539
 540 The `Graylog Extended Log Format` (short: [GELF](http://docs.graylog.org/en/latest/pages/gelf.html))
 541 can be used to send application logs directly to a TCP socket.
 542
 543 While it has been specified by the [Graylog](https://www.graylog.org) project as their
 544 [input resource standard](http://docs.graylog.org/en/latest/pages/sending_data.html), other tools such as
 545 [Logstash](https://www.elastic.co/products/logstash) also support `GELF` as
 546 [input type](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-gelf.html).
 547
 548 You can enable the feature using
 549
 550 ```
 551 # icinga2 feature enable gelf
 552 ```
 553
By default the `GelfWriter` object expects the GELF receiver to listen at `127.0.0.1` on TCP port `12201`.
The default `source` attribute is set to `icinga2`. You can customize that for your needs if required.

Currently these events are processed:

* Check results
* State changes
* Notifications
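
When setting up the receiver, it can help to send a hand-crafted message to the GELF TCP input before enabling the feature. The Python sketch below builds such a test message following the GELF 1.1 conventions (additional fields prefixed with `_`, frames terminated by a null byte on TCP). The function name and the additional field `service_name` are assumptions for this example and do not describe the exact fields Icinga 2 sends.

```python
import json


def gelf_frame(host, short_message, timestamp, level=6, **additional):
    """Build a null-byte terminated GELF 1.1 message for a TCP input."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": timestamp,
        "level": level,  # syslog severity, 6 = informational
    }
    # Additional fields must be prefixed with an underscore.
    for key, value in additional.items():
        message["_" + key] = value
    return json.dumps(message).encode("utf-8") + b"\x00"


frame = gelf_frame("icinga2", "PING OK", 1401366596, service_name="ping4")
print(frame)

# To send the frame to a receiver on the default address:
#   import socket
#   with socket.create_connection(("127.0.0.1", 12201)) as sock:
#       sock.sendall(frame)
```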
 561
 562 #### Graylog/GELF in Cluster HA Zones <a id="gelf-writer-cluster-ha"></a>
 563
The GELF feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
in cluster zones since 2.11.

By default, all endpoints in a zone will activate the feature and start
writing events to the GELF receiver. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes events while the other
endpoints pause the feature.
 573
 574 When the cluster connection breaks at some point, the remaining endpoint(s)
 575 in that zone will automatically resume the feature. This built-in failover
 576 mechanism ensures that events are written even if the cluster fails.
 577
 578 The recommended way of running Graylog in this scenario is a dedicated server
 579 where you have the Graylog HTTP API listening.
 580
 581 ### OpenTSDB Writer <a id="opentsdb-writer"></a>
 582
While there are some OpenTSDB collector scripts and daemons like tcollector available for
Icinga 1.x, it's more reasonable to directly process the check and plugin performance data
in memory in Icinga 2. Once there are new metrics available, Icinga 2 will directly
write them to the defined TSDB TCP socket.
 587
 588 You can enable the feature using
 589
 590 ```
 591 # icinga2 feature enable opentsdb
 592 ```
 593
 594 By default the `OpenTsdbWriter` object expects the TSD to listen at
 595 `127.0.0.1` on port `4242`.
 596
 597 The current naming schema is
 598
 599 ```
 600 icinga.host.<metricname>
 601 icinga.service.<servicename>.<metricname>
 602 ```
 603
 604 for host and service checks. The tag host is always applied.
 605
 606 To make sure Icinga 2 writes a valid metric into OpenTSDB some characters are replaced
 607 with `_` in the target name:
 608
 609 ```
 610 \  (and space)
 611 ```
 612
 613 The resulting name in OpenTSDB might look like:
 614
 615 ```
 616 www-01 / http-cert / response time
 617 icinga.http_cert.response_time
 618 ```
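
For orientation, a data point sent to the TSD socket uses OpenTSDB's line-based `put` command. The Python sketch below builds such a line for a service metric following the naming schema above. It is an illustration under assumptions: the function names are hypothetical, only backslashes and spaces are replaced as described, and the exact names produced by the OpenTsdbWriter may differ.

```python
def escape_name(name):
    """Replace backslashes and spaces with underscores."""
    return name.replace("\\", "_").replace(" ", "_")


def put_line(service, metric, timestamp, value, host):
    """Build an OpenTSDB 'put' line for a service metric, tagged with the host."""
    name = f"icinga.service.{escape_name(service)}.{escape_name(metric)}"
    return f"put {name} {timestamp} {value} host={host}"


print(put_line("ping4", "response time", 1401366596, 0.25, "www-01"))
# put icinga.service.ping4.response_time 1401366596 0.25 host=www-01
```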
 619
 620 In addition to the performance data retrieved from the check plugin, Icinga 2 sends
 621 internal check statistic data to OpenTSDB:
 622
 623   metric             | description
 624   -------------------|------------------------------------------
 625   current_attempt    | current check attempt
 626   max_check_attempts | maximum check attempts until the hard state is reached
 627   reachable          | checked object is reachable
 628   downtime_depth     | number of downtimes this object is in
 629   acknowledgement    | whether the object is acknowledged or not
 630   execution_time     | check execution time
 631   latency            | check latency
 632   state              | current state of the checked object
 633   state_type         | 0=SOFT, 1=HARD state
 634
While `reachable`, `state` and `state_type` are metrics for the host or service, the
other metrics follow the current naming schema
 637
 638 ```
 639 icinga.check.<metricname>
 640 ```
 641
 642 with the following tags
 643
  tag     | description
  --------|------------------------------------------
  type    | the check type, one of [host, service]
  host    | hostname the check ran on
  service | the service name (if type=service)
 649
> **Note**
>
> You might want to set the `tsd.core.auto_create_metrics` setting to `true`
> in your `opentsdb.conf` configuration file.
 654
 655 #### OpenTSDB in Cluster HA Zones <a id="opentsdb-writer-cluster-ha"></a>
 656
 657 The OpenTSDB feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
 658 in cluster zones since 2.11.
 659
By default, all endpoints in a zone will activate the feature and start
writing metrics to the OpenTSDB listener. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes metrics while the other
endpoints pause the feature.
 666
 667 When the cluster connection breaks at some point, the remaining endpoint(s)
 668 in that zone will automatically resume the feature. This built-in failover
 669 mechanism ensures that metrics are written even if the cluster fails.
 670
 671 The recommended way of running OpenTSDB in this scenario is a dedicated server
 672 where you have OpenTSDB running.
 673
 674
 675 ### Writing Performance Data Files <a id="writing-performance-data-files"></a>
 676
 677 PNP and Graphios use performance data collector daemons to fetch
 678 the current performance files for their backend updates.
 679
Therefore the Icinga 2 [PerfdataWriter](09-object-types.md#objecttype-perfdatawriter)
feature allows you to define the output template format for hosts and services
using Icinga 2 runtime macros.
 683
 684 ```
 685 host_format_template = "DATATYPE::HOSTPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tHOSTPERFDATA::$host.perfdata$\tHOSTCHECKCOMMAND::$host.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$"
 686 service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.name$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.state_type$"
 687 ```
 688
 689 The default templates are already provided with the Icinga 2 feature configuration
 690 which can be enabled using
 691
 692 ```
 693 # icinga2 feature enable perfdata
 694 ```
 695
By default, all performance data files are rotated every 15 seconds into
the `/var/spool/icinga2/perfdata/` directory as `host-perfdata.<timestamp>` and
`service-perfdata.<timestamp>`.
External collectors need to parse the rotated performance data files and then
remove the processed files.
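
A collector reading these files has to split each line into tab-separated `KEY::value` pairs. The Python sketch below shows one way to do that for a line written with the default templates. It is a minimal example rather than a replacement for NPCD or similar collectors; the function name and the sample line are made up for illustration.

```python
def parse_perfdata_line(line):
    """Split one perfdata file line into a dictionary of KEY -> value."""
    fields = {}
    for item in line.rstrip("\n").split("\t"):
        # Split only at the first '::' because values may contain '::' themselves.
        key, _, value = item.partition("::")
        fields[key] = value
    return fields


sample = (
    "DATATYPE::SERVICEPERFDATA\tTIMET::1401366596\tHOSTNAME::www-01\t"
    "SERVICEDESC::ping4\tSERVICEPERFDATA::rta=0.5ms;100;200;0 pl=0%;5;10;0\n"
)

parsed = parse_perfdata_line(sample)
print(parsed["HOSTNAME"], parsed["SERVICEDESC"], parsed["SERVICEPERFDATA"])
```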
 701
 702 #### Perfdata Files in Cluster HA Zones <a id="perfdata-writer-cluster-ha"></a>
 703
 704 The Perfdata feature supports [high availability](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
 705 in cluster zones since 2.11.
 706
By default, all endpoints in a zone will activate the feature and start
writing metrics to the local spool directory. In HA enabled scenarios,
it is possible to set `enable_ha = true` in all feature configuration
files. This allows each endpoint to calculate the feature authority,
so that only one endpoint actively writes metrics while the other
endpoints pause the feature.
 713
 714 When the cluster connection breaks at some point, the remaining endpoint(s)
 715 in that zone will automatically resume the feature. This built-in failover
 716 mechanism ensures that metrics are written even if the cluster fails.
 717
The recommended way of running Perfdata is to mount the perfdata spool
directory via NFS on a central server where PNP with the NPCD collector
is running.
 721
 722
 723
 724
 725 ## Livestatus <a id="setting-up-livestatus"></a>
 726
 727 The [MK Livestatus](https://mathias-kettner.de/checkmk_livestatus.html) project
 728 implements a query protocol that lets users query their Icinga instance for
 729 status information. It can also be used to send commands.
 730
 731 The Livestatus component that is distributed as part of Icinga 2 is a
 732 re-implementation of the Livestatus protocol which is compatible with MK
 733 Livestatus.
 734
 735 > **Tip**
 736 >
 737 > Only install the Livestatus feature if your web interface or addon requires
 738 > you to do so.
 739 > [Icinga Web 2](02-getting-started.md#setting-up-icingaweb2) does not need
 740 > Livestatus.
 741
 742 Details on the available tables and attributes with Icinga 2 can be found
 743 in the [Livestatus Schema](24-appendix.md#schema-livestatus) section.
 744
You can enable Livestatus using `icinga2 feature enable`:
 746
 747 ```
 748 # icinga2 feature enable livestatus
 749 ```
 750
 751 After that you will have to restart Icinga 2:
 752
 753 ```
 754 # systemctl restart icinga2
 755 ```
 756
 757 By default the Livestatus socket is available in `/var/run/icinga2/cmd/livestatus`.
 758
 759 In order for queries and commands to work you will need to add your query user
 760 (e.g. your web server) to the `icingacmd` group:
 761
 762 ```
 763 # usermod -a -G icingacmd www-data
 764 ```
 765
 766 The Debian packages use `nagios` as the user and group name. Make sure to change `icingacmd` to
 767 `nagios` if you're using Debian.
 768
 769 Change `www-data` to the user you're using to run queries.
 770
 771 In order to use the historical tables provided by the livestatus feature (for example, the
 772 `log` table) you need to have the `CompatLogger` feature enabled. By default these logs
 773 are expected to be in `/var/log/icinga2/compat`. A different path can be set using the
 774 `compat_log_path` configuration attribute.
 775
 776 ```
 777 # icinga2 feature enable compatlog
 778 ```
 779
 780 ### Livestatus Sockets <a id="livestatus-sockets"></a>
 781
 782 Unlike the Icinga 1.x addon, Icinga 2 supports two socket types:
 783
 784 * Unix socket (default)
 785 * TCP socket
 786
 787 Details on the configuration can be found in the [LivestatusListener](09-object-types.md#objecttype-livestatuslistener)
 788 object configuration.
 789
 790 ### Livestatus GET Queries <a id="livestatus-get-queries"></a>
 791
 792 > **Note**
 793 >
 794 > All Livestatus queries require an additional empty line as query end identifier.
 795 > The `nc` tool (`netcat`) provides the `-U` parameter to communicate using
 796 > a Unix socket.
 797
 798 There is also a Perl module available on CPAN for accessing the Livestatus socket
 799 programmatically: [Monitoring::Livestatus](http://search.cpan.org/~nierlein/Monitoring-Livestatus-0.74/).
 800
 801
 802 Example using the Unix socket:
 803
 804 ```
 805 # echo -e "GET services\n" | /usr/bin/nc -U /var/run/icinga2/cmd/livestatus
 806 ```
 807 Example using the TCP socket listening on port `6558`:
 808 ```
 809 # echo -e 'GET services\n' | netcat 127.0.0.1 6558
 810
 811 # cat > servicegroups <<EOF
 812 GET servicegroups
 813
 814 EOF
 815
 816 # (cat servicegroups; sleep 1) | netcat 127.0.0.1 6558
 817 ```
 818
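The queries shown above can also be sent from a script. The following is a minimal Python sketch, not an official client: the helper names `build_query` and `query_unix_socket` are made up for this illustration, and the socket path is simply the default mentioned earlier in this section.

```python
import socket


def build_query(table, headers=()):
    """Build a Livestatus GET query terminated by the required empty line."""
    lines = ["GET " + table]
    lines.extend(headers)
    # The final "\n\n" ends the last line and adds the empty terminator line.
    return "\n".join(lines) + "\n\n"


def query_unix_socket(query, path="/var/run/icinga2/cmd/livestatus"):
    """Send a query to the Livestatus Unix socket and return the raw response.

    Assumption: the listener answers and then closes the connection, so
    reading until EOF returns the complete response.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(query.encode("utf-8"))
        # Half-close the connection to signal that the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

A call such as `query_unix_socket(build_query("services"))` corresponds to the `nc -U` example. For the TCP socket, the same approach should work with `socket.create_connection(("127.0.0.1", 6558))` in place of the Unix socket.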
 819 ### Livestatus COMMAND Queries <a id="livestatus-command-queries"></a>
 820
 821 A list of available external commands and their parameters can be found [here](24-appendix.md#external-commands-list-detail).
 822
 823 ```
 824 $ echo -e 'COMMAND <externalcommandstring>' | netcat 127.0.0.1 6558
 825 ```
 826
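As an illustration of what `<externalcommandstring>` may look like, the sketch below formats a `COMMAND` line in Python. It assumes the external command string uses the same `[timestamp] NAME;arg1;arg2` layout as the external command pipe example later in this document; the function name `format_command` is hypothetical.

```python
import time


def format_command(name, args, timestamp=None):
    """Return a Livestatus COMMAND line for an external command.

    Assumed layout: COMMAND [<unix timestamp>] <NAME>;<arg1>;<arg2>...
    """
    if timestamp is None:
        timestamp = int(time.time())
    fields = ";".join([name] + [str(arg) for arg in args])
    # The trailing newline mirrors what `echo` sends in the shell example.
    return "COMMAND [%d] %s\n" % (timestamp, fields)
```

The returned string can be written to the Livestatus socket in the same way as the `echo ... | netcat` example above.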
 827 ### Livestatus Filters <a id="livestatus-filters"></a>
 828
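 829 Filters can be combined and inverted using `And`, `Or` and `Negate`. The following comparison operators are supported: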
 830
 831   Operator  | Negate   | Description
 832   ----------|----------|-------------
 833    =        | !=       | Equality
 834    ~        | !~       | Regex match
 835    =~       | !=~      | Equality ignoring case
 836    ~~       | !~~      | Regex ignoring case
 837    <        |          | Less than
 838    >        |          | Greater than
 839    <=       |          | Less than or equal
 840    >=       |          | Greater than or equal
 841
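To show how filter lines and their combinators fit together, here is a small Python sketch that assembles a query. It assumes the stack-based semantics of the Livestatus protocol: consecutive `Filter:` lines are AND-ed implicitly, `Or: N` combines the last N filters, and `Negate:` inverts the most recent one. The helper names, the `host_name` pattern and the service state values (`1` and `2`) are illustrative assumptions.

```python
def any_of(filters):
    """OR-combine filter lines: emit them followed by `Or: <count>`."""
    filters = list(filters)
    return filters + ["Or: %d" % len(filters)]


def negated(filter_line):
    """Invert a single filter line by appending `Negate:`."""
    return [filter_line, "Negate:"]


def build_filter_query(table, lines):
    """Assemble a GET query and terminate it with the required empty line."""
    return "\n".join(["GET " + table] + list(lines)) + "\n\n"


# Services in state 1 or 2 whose host name does NOT match the regex ^web.
QUERY = build_filter_query(
    "services",
    any_of(["Filter: state = 1", "Filter: state = 2"])
    + negated("Filter: host_name ~ ^web"),
)
```

The negated regex filter could also be written directly with the `!~` operator from the table; `Negate:` is mainly useful for inverting an already combined expression.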
 842
 843 ### Livestatus Stats <a id="livestatus-stats"></a>
 844
 845 Schema: `Stats: aggregatefunction aggregateattribute`
 846
 847   Aggregate Function | Description
 848   -------------------|--------------
 849   sum                | &nbsp;
 850   min                | &nbsp;
 851   max                | &nbsp;
 852   avg                | sum / count
 853   std                | standard deviation
 854   suminv             | sum (1 / value)
 855   avginv             | suminv / count
 856   count              | default for any stats query if no aggregate function is defined
 857
 858 Example:
 859
 860 ```
 861 GET hosts
 862 Filter: has_been_checked = 1
 863 Filter: check_type = 0
 864 Stats: sum execution_time
 865 Stats: sum latency
 866 Stats: sum percent_state_change
 867 Stats: min execution_time
 868 Stats: min latency
 869 Stats: min percent_state_change
 870 Stats: max execution_time
 871 Stats: max latency
 872 Stats: max percent_state_change
 873 OutputFormat: json
 874 ResponseHeader: fixed16
 875 ```
 876
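The example query requests `OutputFormat: json` together with `ResponseHeader: fixed16`. A client then has to split the response into header and body. The Python sketch below assumes the header layout used by MK Livestatus: 16 bytes consisting of a 3-digit status code, a space, the body length right-aligned in 11 characters, and a newline. The function names are hypothetical.

```python
import json


def parse_fixed16(response):
    """Split a raw fixed16 response (bytes) into (status, body)."""
    if len(response) < 16:
        raise ValueError("response shorter than the 16-byte header")
    header = response[:16].decode("ascii")
    status = int(header[0:3])    # bytes 1-3: status code
    length = int(header[4:15])   # bytes 5-15: body length, space padded
    body = response[16:16 + length]
    if len(body) < length:
        raise ValueError("response body is truncated")
    return status, body


def parse_json_response(response):
    """Return the decoded JSON body, or raise if the status is not 200."""
    status, body = parse_fixed16(response)
    if status != 200:
        message = body.decode("utf-8", "replace")
        raise RuntimeError("Livestatus error %d: %s" % (status, message))
    return json.loads(body.decode("utf-8"))
```

Checking the status code before decoding the body keeps error messages from being fed into the JSON parser.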
 877 ### Livestatus Output <a id="livestatus-output"></a>
 878
 879 * CSV
 880
 881 CSV output uses two levels of array separators: the separator for list members
 882 is a comma (1st level), while the separator for extra info and host|service relations
 883 is a pipe (2nd level).
 884
 885 Separators can be set using ASCII codes like:
 886
 887 ```
 888 Separators: 10 59 44 124
 889 ```
 890
 891 * JSON
 892
 893 Default separators.
 894
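To make the `Separators:` header for CSV output more concrete, the following Python sketch decodes the four ASCII codes and uses them to split a CSV response. It assumes the order used by MK Livestatus (row separator, column separator, list separator, host|service separator); the function names are made up for this example.

```python
def parse_separators(header_line):
    """Decode a `Separators:` header into its four separator characters."""
    name, _, codes = header_line.partition(":")
    if name.strip() != "Separators":
        raise ValueError("not a Separators header: %r" % header_line)
    chars = tuple(chr(int(code)) for code in codes.split())
    if len(chars) != 4:
        raise ValueError("expected exactly four ASCII codes")
    return chars


def parse_csv(body, separators):
    """Split a CSV response body into rows and columns."""
    row_sep, col_sep, _list_sep, _host_service_sep = separators
    # Empty rows (such as the one after the trailing separator) are skipped.
    return [row.split(col_sep) for row in body.split(row_sep) if row]
```

With the codes from the example above, `parse_separators("Separators: 10 59 44 124")` yields a newline, a semicolon, a comma and a pipe.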
 895 ### Livestatus Error Codes <a id="livestatus-error-codes"></a>
 896
 897   Code      | Description
 898   ----------|--------------
 899   200       | OK
 900   404       | Table does not exist
 901   452       | Exception on query
 902
 903 ### Livestatus Tables <a id="livestatus-tables"></a>
 904
 905   Table         | Join      | Description
 906   --------------|-----------|----------------------------
 907   hosts         | &nbsp;    | host config and status attributes, services counter
 908   hostgroups    | &nbsp;    | hostgroup config, status attributes and host/service counters
 909   services      | hosts     | service config and status attributes
 910   servicegroups | &nbsp;    | servicegroup config, status attributes and service counters
 911   contacts      | &nbsp;    | contact config and status attributes
 912   contactgroups | &nbsp;    | contact config, members
 913   commands      | &nbsp;    | command name and line
 914   status        | &nbsp;    | programstatus, config and stats
 915   comments      | services  | status attributes
 916   downtimes     | services  | status attributes
 917   timeperiods   | &nbsp;    | name and is inside flag
 918   endpoints     | &nbsp;    | config and status attributes
 919   log           | services, hosts, contacts, commands | parses [compatlog](09-object-types.md#objecttype-compatlogger) and shows log attributes
 920   statehist     | hosts, services | parses [compatlog](09-object-types.md#objecttype-compatlogger) and aggregates state change attributes
 921   hostsbygroup  | hostgroups | host attributes grouped by hostgroup and its attributes
 922   servicesbygroup | servicegroups | service attributes grouped by servicegroup and its attributes
 923   servicesbyhostgroup  | hostgroups | service attributes grouped by hostgroup and its attributes
 924
 925 The `commands` table is populated with `CheckCommand`, `EventCommand` and `NotificationCommand` objects.
 926
 927 A detailed list on the available table attributes can be found in the [Livestatus Schema documentation](24-appendix.md#schema-livestatus).
 928
 929
 930 ## Deprecated Features <a id="deprecated-features"></a>
 931
 932 ### Status Data Files <a id="status-data"></a>
 933
 934 > **Note**
 935 >
 936 > This feature is DEPRECATED and will be removed in future releases.
 937 > Check the [roadmap](https://github.com/Icinga/icinga2/milestones).
 938
 939 Icinga 1.x writes object configuration data and status data at a regular
 940 interval to its `objects.cache` and `status.dat` files. Icinga 2 provides
 941 the `StatusDataWriter` object which dumps all configuration objects and
 942 status updates at a regular interval.
 943
 944 ```
 945 # icinga2 feature enable statusdata
 946 ```
 947
 948 If you are not using any web interface or addon which uses these files,
 949 you can safely disable this feature.
 950
 951 ### Compat Log Files <a id="compat-logging"></a>
 952
 953 > **Note**
 954 >
 955 > This feature is DEPRECATED and will be removed in future releases.
 956 > Check the [roadmap](https://github.com/Icinga/icinga2/milestones).
 957
 958 Icinga 2 refers to the Icinga 1.x log format as the `Compat Log`,
 959 which is written by the `CompatLogger` object.
 960
 961 These logs are used for informational representation in
 962 external web interfaces parsing the logs, but also to generate
 963 SLA reports and trends.
 964 The [Livestatus](14-features.md#setting-up-livestatus) feature uses these logs
 965 for answering queries to historical tables.
 966
 967 The `CompatLogger` object can be enabled with:
 968
 969 ```
 970 # icinga2 feature enable compatlog
 971 ```
 972
 973 By default, the Icinga 1.x log file called `icinga.log` is located
 974 in `/var/log/icinga2/compat`. Rotated log files are moved into
 975 `/var/log/icinga2/compat/archives`.
 976
 977 ### External Command Pipe <a id="external-commands"></a>
 978
 979 > **Note**
 980 >
 981 > Please use the [REST API](12-icinga2-api.md#icinga2-api) as a modern and secure alternative
 982 > for external actions.
 983
 984 > **Note**
 985 >
 986 > This feature is DEPRECATED and will be removed in future releases.
 987 > Check the [roadmap](https://github.com/Icinga/icinga2/milestones).
 988
 989 Icinga 2 provides an external command pipe for processing commands
 990 triggering specific actions (for example rescheduling a service check
 991 through the web interface).
 992
 993 In order to enable the `ExternalCommandListener` configuration use the
 994 following command and restart Icinga 2 afterwards:
 995
 996 ```
 997 # icinga2 feature enable command
 998 ```
 999
1000 Icinga 2 creates the command pipe file as `/var/run/icinga2/cmd/icinga2.cmd`
1001 using the default configuration.
1002
1003 Web interfaces and other Icinga addons are able to send commands to
1004 Icinga 2 through the external command pipe, for example for rescheduling
1005 a forced service check:
1006
1007 ```
1008 # /bin/echo "[`date +%s`] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;`date +%s`" >> /var/run/icinga2/cmd/icinga2.cmd
1009
1010 # tail -f /var/log/messages
1011
1012 Oct 17 15:01:25 icinga-server icinga2: Executing external command: [1382014885] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382014885
1013 Oct 17 15:01:25 icinga-server icinga2: Rescheduling next check for service 'ping4'
1014 ```
1015
1016 A list of currently supported external commands can be found [here](24-appendix.md#external-commands-list-detail).
1017
1018 Detailed information on the commands and their required parameters can be found
1019 in the [Icinga 1.x documentation](https://docs.icinga.com/latest/en/extcommands2.html).
1020
1021
1022 ### Check Result Files <a id="check-result-files"></a>
1023
1024 > **Note**
1025 >
1026 > This feature is DEPRECATED and will be removed in future releases.
1027 > Check the [roadmap](https://github.com/Icinga/icinga2/milestones).
1028
1029 Icinga 1.x writes its check result files to a temporary spool directory
1030 where they are processed at a regular interval.
1031 While this is extremely inefficient in terms of performance, it has
1032 proven useful for passing passive check results directly into Icinga 1.x,
1033 bypassing the external command pipe.
1034
1035 Several clustered/distributed environments and check-aggregation addons
1036 use that method. In order to support step-by-step migration of these
1037 environments, Icinga 2 supports the `CheckResultReader` object.
1038
1039 There is no feature configuration available; the object must be defined
1040 on demand in your Icinga 2 object configuration.
1041
1042 ```
1043 object CheckResultReader "reader" {
1044   spool_dir = "/data/check-results"
1045 }
1046 ```