# <a id="monitoring-basics"></a> Monitoring Basics

This part of the Icinga 2 documentation provides an overview of all the basic
monitoring concepts you need to know to run Icinga 2.

## <a id="hosts-services"></a> Hosts and Services

Icinga 2 can be used to monitor the availability of hosts and services. Hosts
and services can be virtually anything which can be checked in some way:

* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Printers
* Switches / routers
* Temperature sensors
* Other local or network-accessible services

Host objects provide a mechanism to group services that are running
on the same physical device.

Here is an example of a host object and two services that belong to it:

    object Host "my-server1" {
      address = "10.0.0.1"
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http"
    }

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

The `address` attribute is used by check commands to determine which network
address is associated with the host object.

Details on troubleshooting check problems can be found [here](#troubleshooting).

### <a id="host-states"></a> Host States

Hosts can be in any of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.

### <a id="service-states"></a> Service States

Services can be in any of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

### <a id="hard-soft-states"></a> Hard and Soft States

When detecting a problem with a host/service, Icinga re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed.
  SOFT        | The host/service has recently changed state and is being re-checked.

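
As a minimal sketch of how the two settings interact, consider the following
service definition. The service name `smtp` and the interval values are
assumptions chosen for illustration, not defaults taken from this chapter.

```
object Service "smtp" {
  host_name = "my-server1"
  check_command = "smtp"

  check_interval = 5m      // regular interval while the state is stable
  retry_interval = 1m      // shorter interval used for re-checks in a SOFT state
  max_check_attempts = 3   // check attempts before a non-OK state becomes HARD
}
```

With these assumed values, a failing service stays in a `SOFT` state while it is
re-checked once per minute, and switches to `HARD` once the third consecutive
check attempt still reports a problem.
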
### <a id="host-service-checks"></a> Host and Service Checks

Hosts and services determine their state from the check result that a check
execution returns to the Icinga 2 application. By default the `generic-host` example template
defines `hostalive` as the host check. If your host does not respond to ping, you should
consider using a different check command, for instance the `http` check command, or if
there is no check available, the `dummy` check command.

    object Host "uncheckable-host" {
      check_command = "dummy"
      vars.dummy_state = 0
      vars.dummy_text = "Pretending to be OK."
    }

Service checks could also use a `dummy` check, but the common strategy is to
[integrate an existing plugin](#command-plugin-integration) as
[check command](#check-commands) and [reference](#command-passing-parameters)
that in your [Service](#objecttype-service) object definition.

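
A sketch of the first alternative, assuming a hypothetical web server that drops
ICMP but answers HTTP requests (the host name and address are made up for
illustration). The host state is then derived from the `http` check command
instead of `hostalive`:

```
object Host "web-no-ping" {
  address = "10.0.0.2"

  // the host check now depends on the web server answering, not on ping
  check_command = "http"
}
```
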
## <a id="configuration-best-practice"></a> Configuration Best Practice

The [Getting Started](#getting-started) chapter already introduced various aspects
of the Icinga 2 configuration language. If you are ready to configure additional
hosts, services, notifications, dependencies, etc., you should think about the
requirements first and then decide on a strategy.

There are many ways of creating Icinga 2 configuration objects:

* Manually with your preferred editor, for example vi(m), nano, notepad, etc.
* Generated by a configuration management tool such as Puppet, Chef, Ansible, etc.
* A configuration addon for Icinga 2
* A custom exporter script from your CMDB or inventory tool
* Your own.

In order to find the best strategy for your own configuration, ask yourself the following questions:

* Do your hosts share a common group of services (for example Linux hosts with disk, load, etc. checks)?
* Does only a small set of users receive notifications and escalations for all hosts/services?

If you can answer at least one of these questions with yes, look into the [apply rules](#using-apply) logic
instead of defining objects on a per host and service basis.

* Are you required to define specific configuration for each host/service?
* Does your configuration generation tool already know about the host-service relationship?

If so, use object-specific configuration attributes such as `host_name` instead.

Finding the best files and directory tree for your configuration is up to you. Make sure that
the [icinga2.conf](#icinga2-conf) configuration file includes them, and then think about:

* A tree based on locations, hostgroups, or specific host attributes, with sub levels of directories.
* Flat `hosts.conf`, `services.conf`, etc. files for rule-based configuration.
* Generated configuration with one file per host and a global configuration for groups, users, etc.
* One big file generated from an external application (probably a bad idea for maintaining changes).
* Your own.

Whichever strategy you choose, you should additionally check the following:

* Are there any specific attributes describing the host/service you could set as `vars` custom attributes?
You can later use them for applying assign/ignore rules, or export them into external interfaces.
* Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules.
* Use templates to store generic attributes for your objects and apply rules, making your configuration more readable.
Details can be found in the [using templates](#object-inheritance-using-templates) section.
* Apply rules may overlap. Keep a central place (for example, `services.conf` or `notifications.conf`) storing
the configuration instead of defining apply rules deep in your configuration tree.
* Every plugin used as check, notification or event command requires a `Command` definition.
Further details can be looked up in the [check commands](#check-commands) chapter.

If you happen to have further questions, do not hesitate to join the [community support channels](https://support.icinga.org)
and ask community members for their experience and best practices.


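
To illustrate the checklist above, here is a small sketch that stores a
descriptive custom attribute on a host and reuses it in an apply rule. The host
name, the address and the `vars.role` attribute are assumptions chosen for this
example; `generic-host` and `generic-service` are the example templates
referenced elsewhere in this chapter.

```
object Host "db-server1" {
  import "generic-host"

  address = "10.0.0.10"
  vars.role = "database"   // custom attribute describing the host
}

apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  // one central rule instead of one Service object per host
  assign where host.vars.role == "database"
}
```
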
### <a id="object-inheritance-using-templates"></a> Object Inheritance Using Templates

Templates may be used to apply a set of identical attributes to more than one
object:

    template Service "generic-service" {
      max_check_attempts = 3
      check_interval = 5m
      retry_interval = 1m
      enable_perfdata = true
    }

    object Service "ping4" {
      import "generic-service"

      host_name = "localhost"
      check_command = "ping4"
    }

    object Service "ping6" {
      import "generic-service"

      host_name = "localhost"
      check_command = "ping6"
    }

In this example the `ping4` and `ping6` services inherit properties from the
template `generic-service`.

Objects as well as templates themselves can import an arbitrary number of
templates. Attributes inherited from a template can be overridden in the
object if necessary.

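
The following sketch shows both points: a template importing another template,
and attributes being overridden further down the chain. The template name
`frequent-service` and the chosen values are illustrative assumptions.

```
template Service "frequent-service" {
  import "generic-service"

  check_interval = 1m       // overrides the 5m inherited from generic-service
}

object Service "http" {
  import "frequent-service"

  host_name = "localhost"
  check_command = "http"
  max_check_attempts = 5    // overrides the value 3 inherited via the template chain
}
```

Here the `http` service ends up with `check_interval = 1m` and
`max_check_attempts = 5`, while `retry_interval` and `enable_perfdata` keep the
values from `generic-service`.
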
### <a id="using-apply"></a> Apply objects based on rules

Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`)
individually through identifying attributes such as `host_name`, objects can be [applied](#apply) based on rules.

Detailed scenario examples are used in their respective chapters, for example
[apply services with custom command arguments](#using-apply-services-command-arguments).

#### <a id="using-apply-services"></a> Apply Services to Hosts

    apply Service "load" {
      import "generic-service"

      check_command = "load"

      assign where "linux-server" in host.groups
      ignore where host.vars.no_load_check
    }

In this example the `load` service will be created as an object for all hosts in the `linux-server`
host group. If the `no_load_check` custom attribute is set, the host will be
ignored.

#### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services

Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

    apply Notification "mail-noc" to Service {
      import "mail-service-notification"
      command = "mail-service-notification"
      user_groups = [ "noc" ]

      assign where service.vars.sla == "24x7"
    }

In this example the `mail-noc` notification will be created as an object for all services having the
`sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.

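
A host-level counterpart might look like the following sketch. It assumes that a
notification command named `mail-host-notification` exists in your configuration
and that hosts carry the same `sla` custom attribute; both are assumptions made
for illustration.

```
apply Notification "mail-noc-host" to Host {
  command = "mail-host-notification"   // assumed to be defined elsewhere
  user_groups = [ "noc" ]

  // the rule is evaluated against host attributes instead of service attributes
  assign where host.vars.sla == "24x7"
}
```
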
#### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services

Detailed examples can be found in the [dependencies](#dependencies) chapter.

#### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services

Detailed examples can be found in the [recurring downtimes](#recurring-downtimes) chapter.


### <a id="groups"></a> Groups

Groups are used for combining hosts, services, and users into
accessible configuration attributes and views in external (web)
interfaces.

Group membership is defined at the respective object itself. If
you have a hostgroup named `windows` for example, and want to assign
specific hosts to this group for later viewing the group on your
alert dashboard, first create the hostgroup:

    object HostGroup "windows" {
      display_name = "Windows Servers"
    }

Then add your hosts to this hostgroup:

    template Host "windows-server" {
      groups += [ "windows" ]
    }

    object Host "mssql-srv1" {
      import "windows-server"

      vars.mssql_port = 1433
    }

    object Host "mssql-srv2" {
      import "windows-server"

      vars.mssql_port = 1433
    }

This can be done for service and user groups the same way. Additionally
the user groups are associated as attributes in `Notification` objects.

    object UserGroup "windows-mssql-admins" {
      display_name = "Windows MSSQL Admins"
    }

    template User "generic-windows-mssql-users" {
      groups += [ "windows-mssql-admins" ]
    }

    object User "win-mssql-noc" {
      import "generic-windows-mssql-users"

      email = "noc@example.com"
    }

    object User "win-mssql-ops" {
      import "generic-windows-mssql-users"

      email = "ops@example.com"
    }

#### <a id="group-assign"></a> Group Membership Assign

If a number of hosts, services, or users match a common pattern,
it's reasonable to assign the group object to these members.
Details on the `assign where` syntax can be found [here](#apply).

    object HostGroup "mssql" {
      display_name = "MSSQL Servers"
      assign where host.vars.mssql_port
    }

Building on the example above, all hosts with the `vars` attribute `mssql_port`
set will be added as members to the host group `mssql`.

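
The same mechanism works for other group types. As a sketch, a service group
could collect all services whose name starts with `ping`. The group name and the
pattern are assumptions for illustration; `match()` performs a wildcard match of
the pattern against the given string.

```
object ServiceGroup "ping" {
  display_name = "Ping Checks"

  // matches services such as ping4 and ping6
  assign where match("ping*", service.name)
}
```
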
## <a id="notifications"></a> Notifications

Notifications for service and host problems are an integral part of your
monitoring setup.

When a host or service is in a downtime, a problem has been acknowledged or
the dependency logic determined that the host/service is unreachable, no
notifications are sent. You can configure additional type and state filters
refining the notifications being actually sent.

There are many ways of sending notifications, e.g. by e-mail, XMPP,
IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
Instead it relies on external mechanisms such as shell scripts to notify users.

A notification specification requires one or more users (and/or user groups)
who will be notified in case of problems. These users must have all custom
attributes defined which will be used in the `NotificationCommand` on execution.

The user `icingaadmin` in the example below will get notified only on `OK`, `WARNING` and
`CRITICAL` states and on `Problem` and `Recovery` notification types.

    object User "icingaadmin" {
      display_name = "Icinga 2 Admin"
      enable_notifications = true
      states = [ OK, Warning, Critical ]
      types = [ Problem, Recovery ]
      email = "icinga@localhost"
    }

If you don't set the `states` and `types` configuration attributes for the `User`
object, notifications for all states and types will be sent.

Details on troubleshooting notification problems can be found [here](#troubleshooting).

> **Note**
>
> Make sure that the [notification](#features) feature is enabled on your master instance
> in order to execute notification commands.

You should choose which information you (and your notified users) are interested in
in case of emergency, and also which information does not provide any value to you and
your environment.

An example notification command is explained [here](#notification-commands).

You can add all shared attributes to a `Notification` template which the defined
notifications inherit from. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.

    template Notification "generic-notification" {
      interval = 15m

      command = "mail-service-notification"

      states = [ Warning, Critical, Unknown ]
      types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]

      period = "24x7"
    }

The time period `24x7` is shipped as example configuration with Icinga 2.

Use the `apply` keyword to create `Notification` objects for your services:

    apply Notification "mail" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      assign where service.name == "mysql"
    }

Instead of assigning users to notifications, you can also add the `user_groups`
attribute with a list of user groups to the `Notification` object. Icinga 2 will
send notifications to all group members.

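
A sketch of the group-based variant, assuming a hypothetical user group
`db-admins` and a hypothetical user `dba-oncall` (neither is defined elsewhere
in this chapter):

```
object UserGroup "db-admins" {
  display_name = "Database Admins"
}

object User "dba-oncall" {
  groups = [ "db-admins" ]
  email = "dba@example.com"
}

apply Notification "mail-db-admins" to Service {
  import "generic-notification"

  command = "mail-notification"
  user_groups = [ "db-admins" ]   // notifies every member of the group

  assign where service.name == "mysql"
}
```
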
### <a id="notification-escalations"></a> Notification Escalations

When a problem notification is sent and a problem still exists at the time of re-notification
you may want to escalate the problem to the next support level. A different approach
is to configure the default notification by email, and escalate the problem via SMS
if not already solved.

You can define notification start and end times as additional configuration
attributes making the `Notification` object a so-called `notification escalation`.
Using templates you can share the basic notification attributes such as users or the
`interval` (and then override them for the escalation).

Using the example from above, you can define additional users being escalated for SMS
notifications between start and end time.

    object User "icinga-oncall-2nd-level" {
      display_name = "Icinga 2nd Level"

      vars.mobile = "+1 555 424642"
    }

    object User "icinga-oncall-1st-level" {
      display_name = "Icinga 1st Level"

      vars.mobile = "+1 555 424642"
    }

Define an additional `NotificationCommand` for SMS notifications.

> **Note**
>
> The example is not complete as there are many different SMS providers.
> Please note that sending SMS notifications will require an SMS provider
> or local hardware with a SIM card active.

    object NotificationCommand "sms-notification" {
      command = [
        PluginDir + "/send_sms_notification",
        "$mobile$",
        "..."
      ]
    }

The two new notification escalations are applied to all services named `ping4`
using the `generic-notification` template.
The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
command) after `30m` until `1h`.

> **Note**
>
> The `interval` was set to 15m in the `generic-notification`
> template example. Lower that value in your escalations by using a secondary
> template or by overriding the attribute directly in the
> `escalation-sms-2nd-level` notification.

If the problem is neither resolved nor acknowledged (which would prevent further notifications),
the `icinga-oncall-1st-level` user will be escalated `1h` after the initial problem was
notified, but only for one hour (`2h` as `end` key for the `times` dictionary).

    apply Notification "mail" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-2nd-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-2nd-level" ]

      times = {
        begin = 30m
        end = 1h
      }

      assign where service.name == "ping4"
    }

    apply Notification "escalation-sms-1st-level" to Service {
      import "generic-notification"

      command = "sms-notification"
      users = [ "icinga-oncall-1st-level" ]

      times = {
        begin = 1h
        end = 2h
      }

      assign where service.name == "ping4"
    }

### <a id="notification-delay"></a> Notification Delay

Sometimes the problem in question should not be notified when the notification is due
(the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
postpone the first notification for 15 minutes. Leave out the `end` key: if it is not set,
Icinga 2 will not check against any end time for this notification.

    apply Notification "mail" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      times.begin = 15m // delay first notification

      assign where service.name == "ping4"
    }

### <a id="disable-renotification"></a> Disable Re-notifications

If you prefer to be notified only once, you can disable re-notifications by setting the
`interval` attribute to `0`.

    apply Notification "notify-once" to Service {
      import "generic-notification"

      command = "mail-notification"
      users = [ "icingaadmin" ]

      interval = 0 // disable re-notification

      assign where service.name == "ping4"
    }

### <a id="notification-filters-state-type"></a> Notification Filters by State and Type

If there are no notification state and type filter attributes defined at the `Notification`
or `User` object, Icinga 2 assumes that all states and types are being notified.

Available state and type filters for notifications are:

    template Notification "generic-notification" {

      states = [ Warning, Critical, Unknown ]
      types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
    }

If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state to allow more fine-grained filtering, for example on downtimes and flapping.
You can filter for acknowledgements and custom notifications too.


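
As a sketch of narrowing these filters, the following rule only notifies about
`Critical` problems and their recoveries. The notification name is an
assumption, and `OK` is listed on the assumption that recovery notifications are
matched against the state filter as well.

```
apply Notification "mail-critical-only" to Service {
  import "generic-notification"

  command = "mail-notification"
  users = [ "icingaadmin" ]

  states = [ OK, Critical ]       // overrides the state filter from the template
  types = [ Problem, Recovery ]   // no downtime, flapping or custom notifications

  assign where service.name == "ping4"
}
```
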
## <a id="timeperiods"></a> Time Periods

Time Periods define time ranges in Icinga where event actions are
triggered, for example whether a service check is executed or not within
the `check_period` attribute, or whether a notification should be sent to
users or not, filtered by the `period` and `notification_period`
configuration attributes for `Notification` and `User` objects.

> **Note**
>
> If you are familiar with Icinga 1.x, note that these time period definitions
> are called `legacy timeperiods` in Icinga 2.
>
> An Icinga 2 legacy timeperiod requires the `ITL` provided template
> `legacy-timeperiod`.

The `TimePeriod` attribute `ranges` may contain multiple directives,
including weekdays, days of the month, and calendar dates.
These types may overlap/override other types in your ranges dictionary.

The descending order of precedence is as follows:

* Calendar date (2008-01-01)
* Specific month date (January 1st)
* Generic month date (Day 15)
* Offset weekday of specific month (2nd Tuesday in December)
* Offset weekday (3rd Monday)
* Normal weekday (Tuesday)

If you don't set any `check_period` or `notification_period` attribute
on your configuration objects, Icinga 2 assumes `24x7` as time period
as shown below.

    object TimePeriod "24x7" {
      import "legacy-timeperiod"

      display_name = "Icinga 2 24x7 TimePeriod"
      ranges = {
        "monday"    = "00:00-24:00"
        "tuesday"   = "00:00-24:00"
        "wednesday" = "00:00-24:00"
        "thursday"  = "00:00-24:00"
        "friday"    = "00:00-24:00"
        "saturday"  = "00:00-24:00"
        "sunday"    = "00:00-24:00"
      }
    }

If your operation staff should only be notified during workhours,
create a new timeperiod named `workhours` defining a work day from
09:00 to 17:00.

    object TimePeriod "workhours" {
      import "legacy-timeperiod"

      display_name = "Icinga 2 8x5 TimePeriod"
      ranges = {
        "monday"    = "09:00-17:00"
        "tuesday"   = "09:00-17:00"
        "wednesday" = "09:00-17:00"
        "thursday"  = "09:00-17:00"
        "friday"    = "09:00-17:00"
      }
    }

Use the `period` attribute to assign time periods to
`Notification` and `Dependency` objects:

    object Notification "mail" {
      import "generic-notification"

      host_name = "localhost"

      command = "mail-notification"
      users = [ "icingaadmin" ]
      period = "workhours"
    }


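
The `check_period` attribute and the `period` attribute of `User` objects
mentioned at the start of this section take a time period name in the same way.
A minimal sketch, with a hypothetical service and a hypothetical user:

```
object Service "backup-status" {
  import "generic-service"

  host_name = "localhost"
  check_command = "dummy"
  check_period = "workhours"   // checks are only executed within this period
}

object User "office-admin" {
  email = "office@example.com"
  period = "workhours"         // this user is only notified within this period
}
```
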
## <a id="commands"></a> Commands

Icinga 2 uses three different command object types to specify how
checks should be performed, notifications should be sent, and
events should be handled.

### <a id="command-environment-variables"></a> Environment Variables for Commands

Please check [Runtime Custom Attributes as Environment Variables](#runtime-custom-attribute-env-vars).


### <a id="check-commands"></a> Check Commands

`CheckCommand` objects define the command line used to call a check.

> **Note**
>
> Make sure that the [checker](#features) feature is enabled in order to
> execute checks.

#### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition

`CheckCommand` objects require the [ITL template](#itl-plugin-check-command)
`plugin-check-command` to support native plugin based check methods.

Unless you have done so already, download your check plugin and put it
into the `PluginDir` directory. The following example uses the
`check_disk` plugin shipped with the Monitoring Plugins package.

The plugin path and all command arguments are given as a list of
double-quoted string arguments for proper shell escaping.

Call the `check_disk` plugin with the `--help` parameter to see
all available options. Our example defines warning (`-w`) and
critical (`-c`) thresholds for the disk usage. Without any
partition defined (`-p`) it will check all local partitions.

    icinga@icinga2 $ /usr/lib/nagios/plugins/check_disk --help
    ...
    This plugin checks the amount of used disk space on a mounted file system
    and generates an alert if free space is less than one of the threshold values


    Usage:
     check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
    [-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
    [-t timeout] [-u unit] [-v] [-X type] [-N type]
    ...

> **Note**
>
> Don't execute plugins as `root` and always use the absolute path to the plugin! Trust us.

The next step is to understand how command parameters are passed from
a host or service object, and to add a `CheckCommand` definition based on these
required parameters and/or default values.

#### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service

Unlike in Icinga 1.x, check command parameters are defined as custom attributes
which can be accessed as runtime macros by the executed check command.

Define the default check command custom attributes `disk_wfree` and `disk_cfree`
(freely definable naming schema) and their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](#command-arguments)
on the command line.

The default custom attributes can be overridden by the custom attributes
defined in the service using the check command `my-disk`. The custom attributes
can also be inherited from a parent template using additive inheritance (`+=`).

    object CheckCommand "my-disk" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_disk" ]

      arguments = {
        "-w" = "$disk_wfree$%"
        "-c" = "$disk_cfree$%"
      }

      vars.disk_wfree = 20
      vars.disk_cfree = 10
    }

The host `localhost` with the service `my-disk` checks all disks with modified
custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
free disk space).

    object Host "localhost" {
      import "generic-host"

      address = "127.0.0.1"
      address6 = "::1"
    }

    object Service "my-disk" {
      import "generic-service"

      host_name = "localhost"
      check_command = "my-disk"

      vars.disk_wfree = 10
      vars.disk_cfree = 5
    }

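
The inheritance from a parent template mentioned above can be sketched as
follows. The template name `strict-disk` and the service name are assumptions;
the point is that a template can carry the threshold custom attributes and a
service importing it overrides only what differs.

```
template Service "strict-disk" {
  import "generic-service"

  check_command = "my-disk"
  vars.disk_wfree = 15
  vars.disk_cfree = 8
}

object Service "my-disk-strict" {
  import "strict-disk"

  host_name = "localhost"
  vars.disk_cfree = 5    // overrides only the critical threshold
}
```

With this configuration the plugin would be called with `-w 15%` and `-c 5%`.
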
#### <a id="command-arguments"></a> Command Arguments

By defining a check command line using the `command` attribute Icinga 2
will resolve all macros in the static string or array. Sometimes it is
required to extend the arguments list based on a condition evaluated
at command execution, or to make arguments optional so that they are only
set if the macro value can be resolved by Icinga 2.

    object CheckCommand "check_http" {
      import "plugin-check-command"

      command = [ PluginDir + "/check_http" ]

      arguments = {
        "-H" = "$http_vhost$"
        "-I" = "$http_address$"
        "-u" = "$http_uri$"
        "-p" = "$http_port$"
        "-S" = {
          set_if = "$http_ssl$"
        }
        "--sni" = {
          set_if = "$http_sni$"
        }
        "-a" = {
          value = "$http_auth_pair$"
          description = "Username:password on sites with basic authentication"
        }
        "--no-body" = {
          set_if = "$http_ignore_body$"
        }
        "-r" = "$http_expect_body_regex$"
        "-w" = "$http_warn_time$"
        "-c" = "$http_critical_time$"
        "-e" = "$http_expect$"
      }

      vars.http_address = "$address$"
      vars.http_ssl = false
      vars.http_sni = false
    }

The example shows the `check_http` check command defining the most common
arguments. Each of them is optional by default and will be omitted if
the value is not set. For example, if the service calling the check command
does not have `vars.http_port` set, it won't get added to the command
line.

If the `vars.http_ssl` custom attribute is set in the service, host or command
object definition, Icinga 2 will add the `-S` argument based on the `set_if`
numeric value to the command line. String values are not supported.

That way you can use the `check_http` command definition for checks both with and
without SSL enabled, saving you duplicated command definitions.

Details on all available options can be found in the
[CheckCommand object definition](#objecttype-checkcommand).

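
A service using this command could look like the following sketch. The service
name and the vhost value are assumptions for illustration.

```
object Service "https-vhost" {
  import "generic-service"

  host_name = "localhost"
  check_command = "check_http"

  vars.http_vhost = "www.example.com"   // adds -H www.example.com
  vars.http_ssl = true                  // set_if evaluates to true, so -S is added
  // http_port is not set, so -p is omitted from the command line
}
```
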
 779 ### <a id="using-apply-services-command-arguments"></a> Apply Services with custom Command Arguments
 780
 781 Imagine the following scenario: The `my-host1` host is reachable using the default port 22, while
 782 the `my-host2` host requires a different port on 2222. Both hosts are in the hostgroup `my-linux-servers`.
 783
 784     object HostGroup "my-linux-servers" {
 785       display_name = "Linux Servers"
 786       assign where host.vars.os == "Linux"
 787     }
 788
 789     /* this one has port 22 opened */
 790     object Host "my-host1" {
 791       import "generic-host"
 792       address = "129.168.1.50"
 793       vars.os = "Linux"
 794     }
 795
 796     /* this one listens on a different ssh port */
 797     object Host "my-host2" {
 798       import "generic-host"
 799       address = "129.168.2.50"
 800       vars.os = "Linux"
 801       vars.custom_ssh_port = 2222
 802     }
 803
 804 All hosts in the `my-linux-servers` hostgroup should get the `my-ssh` service applied based on an
 805 [apply rule](#apply). The optional `ssh_port` command argument should be inherited from the host
 806 the service is applied to. If not set, the check command `my-ssh` will omit the argument.
 807 The `host` argument is special: `skip_key` tells Icinga 2 to ignore the key, and directly put the
 808 value onto the command line. The `order` attribute specifies that this argument is the first one
 809 (`-1` is smaller than the other defaults).
 810
 811     object CheckCommand "my-ssh" {
 812       import "plugin-check-command"
 813
 814       command = [ PluginDir + "/check_ssh" ]
 815
 816       arguments = {
 817         "-p" = "$ssh_port$"
 818         "host" = {
 819           value = "$ssh_address$"
 820           skip_key = true
 821           order = -1
 822         }
 823       }
 824
 825       vars.ssh_address = "$address$"
 826     }
 827
 828     /* apply ssh service */
 829     apply Service "my-ssh" {
 830       import "generic-service"
 831       check_command = "my-ssh"
 832
 833       //set the command argument for ssh port with a custom host attribute, if set
 834       vars.ssh_port = "$host.vars.custom_ssh_port$"
 835
 836       assign where "my-linux-servers" in host.groups
 837     }
 838
 839 The `my-host1` host will get the `my-ssh` service checking on the default port:
 840
 841     [2014-05-26 21:52:23 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '129.168.1.50': PID 27281
 842
 843 For `my-host2`, the service inherits the `custom_ssh_port` host variable and a different command is executed:
 844
 845     [2014-05-26 21:51:32 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '-p', '2222', '129.168.2.50': PID 26956
 846
 847
 848 ### <a id="notification-commands"></a> Notification Commands
 849
 850 `NotificationCommand` objects define how notifications are delivered to external
 851 interfaces (E-Mail, XMPP, IRC, Twitter, etc).
 852
 853 `NotificationCommand` objects require the [ITL template](#itl-plugin-notification-command)
 854 `plugin-notification-command` to support native plugin-based notifications.
 855
 856 > **Note**
 857 >
 858 > Make sure that the [notification](#features) feature is enabled on your master instance
 859 > in order to execute notification commands.
 860
 861 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
 862 the current check output) sending an email to the user(s) associated with the
 863 notification itself (`$user.email$`).
 864
 865 If you want to specify default values for some of the custom attribute definitions,
 866 you can add a `vars` dictionary as shown for the `CheckCommand` object.
 867
 868     object NotificationCommand "mail-service-notification" {
 869       import "plugin-notification-command"
 870
 871       command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
 872
 873       env = {
 874         NOTIFICATIONTYPE = "$notification.type$"
 875         SERVICEDESC = "$service.name$"
 876         HOSTALIAS = "$host.display_name$"
 877         HOSTADDRESS = "$address$"
 878         SERVICESTATE = "$service.state$"
 879         LONGDATETIME = "$icinga.long_date_time$"
 880         SERVICEOUTPUT = "$service.output$"
 881         NOTIFICATIONAUTHORNAME = "$notification.author$"
 882         NOTIFICATIONCOMMENT = "$notification.comment$"
 883         HOSTDISPLAYNAME = "$host.display_name$"
 884         SERVICEDISPLAYNAME = "$service.display_name$"
 885         USEREMAIL = "$user.email$"
 886       }
 887     }
 888
 889 The `command` attribute in the `mail-service-notification` command refers to the following
 890 shell script. The macros specified in the `env` dictionary are exported
 891 as environment variables and can be used in the notification script:
 892
 893     #!/usr/bin/env bash
 894     template=$(cat <<TEMPLATE
 895     ***** Icinga  *****
 896
 897     Notification Type: $NOTIFICATIONTYPE
 898
 899     Service: $SERVICEDESC
 900     Host: $HOSTALIAS
 901     Address: $HOSTADDRESS
 902     State: $SERVICESTATE
 903
 904     Date/Time: $LONGDATETIME
 905
 906     Additional Info: $SERVICEOUTPUT
 907
 908     Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
 909     TEMPLATE
 910     )
 911
 912     /usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" $USEREMAIL
 913
 914 > **Note**
 915 >
 916 > This example is for `exim` only. Requires changes for `sendmail` and
 917 > other MTAs.
 918
 919 While it's possible to specify the entire notification command right
 920 in the `NotificationCommand` object, it is generally advisable to create a
 921 shell script in the `/etc/icinga2/scripts` directory and have the
 922 `NotificationCommand` object refer to that.
 923
 924 ### <a id="event-commands"></a> Event Commands
 925
 926 Unlike notifications, event commands are called on every host/service check execution
 927 if one of these conditions matches:
 928
 929 * The host/service is in a [soft state](#hard-soft-states)
 930 * The host/service state changes into a [hard state](#hard-soft-states)
 931 * The host/service state recovers from a [soft or hard state](#hard-soft-states) to [OK](#service-states)/[Up](#host-states)
 932
 933 Therefore the `EventCommand` object should define a command line
 934 evaluating the current service state and other service runtime attributes
 935 available through runtime macros. Runtime macros such as `$service.state_type$`
 936 and `$service.state$` are resolved by Icinga 2, which lets the command decide
 937 precisely in which situations it should act.
 938
 939 Common use cases are a failing HTTP check that requires an immediate
 940 restart via event command, or an application that is locked and needs
 941 a restart upon detection.
 942
 943 `EventCommand` objects require the ITL template `plugin-event-command`
 944 to support native plugin-based event commands.
 945
 946 When the event command is triggered on a service state change, it will
 947 send a check result using the `process_check_result` script forcibly
 948 changing the service state back to `OK` (`-r 0`) providing some debug
 949 information in the check output (`-o`).
 950
 951     object EventCommand "plugin-event-process-check-result" {
 952       import "plugin-event-command"
 953
 954       command = [
 955         PluginDir + "/process_check_result",
 956         "-H", "$host.name$",
 957         "-S", "$service.name$",
 958         "-c", RunDir + "/icinga2/cmd/icinga2.cmd",
 959         "-r", "0",
 960         "-o", "Event Handler triggered in state '$service.state$' with output '$service.output$'."
 961       ]
 962     }
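
For illustration, the check result submitted by this event command can be thought of as a single line written to the external command pipe. The sketch below assumes the Nagios-compatible `PROCESS_SERVICE_CHECK_RESULT` command format. The helper name `process_service_check_result` is hypothetical, and whether the `process_check_result` script builds exactly this line is an assumption, not something stated above. The host and service names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a PROCESS_SERVICE_CHECK_RESULT command line.
# Arguments: now host service return_code output
# "now" is a UNIX timestamp; return_code 0 means OK.
process_service_check_result() {
  printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Force the "http" service on "my-server1" back to OK.
process_service_check_result 1400000000 my-server1 http 0 "Event Handler triggered"
```

To submit the result, the printed line would be redirected into the command pipe, for example the `icinga2.cmd` file referenced by the `-c` option above.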
 963
 964
 965 ## <a id="dependencies"></a> Dependencies
 966
 967 Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects
 968 for determining their network reachability.
 969 The `parent_host_name` and `parent_service_name` attributes are mandatory for
 970 service dependencies, `parent_host_name` is required for host dependencies.
 971
 972 A service can depend on a host, and vice versa. A service has an implicit
 973 dependency (parent) to its host. A host to host dependency acts implicitly
 974 as host parent relation.
 975 When dependencies are calculated, not only the immediate parent is taken into
 976 account but all parents are inherited.
 977
 978 Notifications are suppressed if a host or service becomes unreachable.
 979
 980 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
 981
 982 Icinga 2 automatically adds an implicit dependency for services on their host. That way
 983 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
 984 does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
 985 `states = [ Up ]` for all service objects.
 986
 987 Service checks are still executed. If you want to prevent them from happening, you can
 988 apply the following dependency to all services setting their host as `parent_host_name`
 989 and disabling the checks. `assign where true` matches on all `Service` objects.
 990
 991     apply Dependency "disable-host-service-checks" to Service {
 992       disable_checks = true
 993       assign where true
 994     }
 995
 996 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
 997
 998 A common scenario is the Icinga 2 server behind a router. Checking internet
 999 access by pinging the Google DNS server `google-dns` is a common method, but
1000 will fail in case the `dsl-router` host is down. Therefore the example below
1001 defines a host dependency which acts implicitly as parent relation too.
1002
1003 Furthermore the host may be reachable but ping probes are dropped by the
1004 router's firewall. In case the `ping4` service check on `dsl-router` fails, all
1005 further checks for the `ping4` service on the `google-dns` host should
1006 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
1007
1008     object Host "dsl-router" {
1009       address = "192.168.1.1"
1010     }
1011
1012     object Host "google-dns" {
1013       address = "8.8.8.8"
1014     }
1015
1016     apply Service "ping4" {
1017       import "generic-service"
1018
1019       check_command = "ping4"
1020
1021       assign where host.address
1022     }
1023
1024     apply Dependency "internet" to Host {
1025       parent_host_name = "dsl-router"
1026       disable_checks = true
1027       disable_notifications = true
1028
1029       assign where host.name != "dsl-router"
1030     }
1031
1032     apply Dependency "internet" to Service {
1033       parent_host_name = "dsl-router"
1034       parent_service_name = "ping4"
1035       disable_checks = true
1036
1037       assign where host.name != "dsl-router"
1038     }
1039
1040
1041 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1042
1043 Another classic example is agent-based checks. You would define a health check
1044 for the agent daemon responding to your requests, and make all other services
1045 querying that daemon depend on that health check.
1046
1047 The following configuration defines two nrpe based service checks `nrpe-load`
1048 and `nrpe-disk` applied to the `nrpe-server`. The health check is defined as
1049 `nrpe-health` service.
1050
1051     apply Service "nrpe-health" {
1052       import "generic-service"
1053       check_command = "nrpe"
1054       assign where match("nrpe-*", host.name)
1055     }
1056
1057     apply Service "nrpe-load" {
1058       import "generic-service"
1059       check_command = "nrpe"
1060       vars.nrpe_command = "check_load"
1061       assign where match("nrpe-*", host.name)
1062     }
1063
1064     apply Service "nrpe-disk" {
1065       import "generic-service"
1066       check_command = "nrpe"
1067       vars.nrpe_command = "check_disk"
1068       assign where match("nrpe-*", host.name)
1069     }
1070
1071     object Host "nrpe-server" {
1072       import "generic-host"
1073       address = "192.168.1.5"
1074     }
1075
1076     apply Dependency "disable-nrpe-checks" to Service {
1077       parent_service_name = "nrpe-health"
1078
1079       states = [ OK ]
1080       disable_checks = true
1081       disable_notifications = true
1082       assign where service.check_command == "nrpe"
1083       ignore where service.name == "nrpe-health"
1084     }
1085
1086 The `disable-nrpe-checks` dependency is applied to all services
1087 on the `nrpe-server` host whose `check_command` attribute is `nrpe`,
1088 but not to the `nrpe-health` service itself.
1089
1090
1091 ## <a id="downtimes"></a> Downtimes
1092
1093 Downtimes can be scheduled for planned server maintenance or
1094 any other targeted service outage you are aware of in advance.
1095
1096 Downtimes will suppress any notifications, and may trigger other
1097 downtimes too. If the downtime was set by accident, or the duration
1098 exceeds the maintenance, you can manually cancel the downtime.
1099 Planned downtimes will also be taken into account for SLA reporting
1100 tools calculating the SLAs based on the state and downtime history.
1101
1102 Multiple downtimes for a single object may overlap. This is useful
1103 when maintenance takes longer than expected and you want to extend the window.
1104 If there are multiple downtimes triggered for one object, the overall downtime depth
1105 will be greater than `1`.
1106
1107
1108 If the downtime was scheduled after the problem changed to a critical hard
1109 state triggering a problem notification, and the service recovers during
1110 the downtime window, the recovery notification won't be suppressed.
1111
1112 ### <a id="fixed-flexible-downtimes"></a> Fixed and Flexible Downtimes
1113
1114 A `fixed` downtime will be activated at the defined start time, and
1115 removed at the end time. During this time window the service state
1116 will change to `NOT-OK` and then actually trigger the downtime.
1117 Notifications are suppressed and the downtime depth is incremented.
1118
1119 Common scenarios are a planned distribution upgrade on your Linux
1120 servers, or database updates in your warehouse. The customer knows
1121 about a fixed downtime window between 23:00 and 24:00. After 24:00
1122 all problems should be alerted again. The solution is simple:
1123 schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
1124
1125 Unlike a `fixed` downtime, a `flexible` downtime will be triggered
1126 by the state change in the time span defined by start and end time,
1127 and then last for the specified duration in minutes.
1128
1129 Imagine the following scenario: Your service is frequently polled
1130 by users trying to grab free deleted domains for immediate registration.
1131 Between 07:30 and 08:00 the impact will hit for 15 minutes and generate
1132 a network outage visible to the monitoring. The service is still alive,
1133 but answering too slowly to Icinga 2 service checks.
1134 For that reason, you may want to schedule a downtime between 07:30 and
1135 08:00 with a duration of 15 minutes. The downtime will then last from
1136 its trigger time until the duration is over. After that, the downtime
1137 is removed (may happen before or after the actual end time!).
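
The timing of the scenario above can be worked through with a small calculation: the downtime ends at trigger time plus duration, regardless of where the scheduled window ends. The helper name `flexible_downtime_end` is made up for this sketch, and it only does clock arithmetic; it does not talk to Icinga 2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the time (HH:MM) at which a flexible
# downtime is removed.
# Arguments: trigger_hour trigger_minute duration_minutes
flexible_downtime_end() {
  total=$(( $1 * 60 + $2 + $3 ))
  printf '%02d:%02d\n' $(( total / 60 % 24 )) $(( total % 60 ))
}

# Window 07:30-08:00, duration 15 minutes.
flexible_downtime_end 7 40 15   # prints 07:55, before the window ends
flexible_downtime_end 7 55 15   # prints 08:10, after the window ends
```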
1138
1139 ### <a id="scheduling-downtime"></a> Scheduling a downtime
1140
1141 A downtime can be scheduled either through a web interface or by sending an [external command](#external-commands)
1142 to the external command pipe provided by the `ExternalCommandListener` configuration.
1143
1144 Fixed downtimes require a start and end time (a duration will be ignored).
1145 Flexible downtimes need a start and end time for the time span, and a duration
1146 independent from that time span.
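
As a rough sketch, both kinds of downtime can be expressed as one line in the Nagios-compatible `SCHEDULE_SVC_DOWNTIME` external command format. The helper name `schedule_svc_downtime`, the host and service names, the author and all timestamps are placeholders. Two things are assumptions to verify against the external command reference for your version: that the duration field of this command is given in seconds, and that a trigger id of `0` means the downtime is not triggered by another one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a SCHEDULE_SVC_DOWNTIME command line.
# Arguments: now host service start end fixed duration author comment
# Times are UNIX timestamps; fixed is 1 (fixed) or 0 (flexible).
# The trigger id field is hard-coded to 0 (assumed: no triggering downtime).
schedule_svc_downtime() {
  printf '[%s] SCHEDULE_SVC_DOWNTIME;%s;%s;%s;%s;%s;0;%s;%s;%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"
}

# Fixed downtime: one-hour window; the duration field (0 here) is ignored.
schedule_svc_downtime 1400000000 my-server1 http 1400022000 1400025600 1 0 \
  icingaadmin "Distribution upgrade"

# Flexible downtime: 30-minute window, lasting 900 (assumed: seconds) once triggered.
schedule_svc_downtime 1400000000 my-server1 http 1400045400 1400047200 0 900 \
  icingaadmin "Expected load peak"
```

The functions only print the command line. To actually schedule the downtime, the output would be redirected into the command pipe, whose path depends on your `ExternalCommandListener` configuration.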
1147
1148 ### <a id="triggered-downtimes"></a> Triggered Downtimes
1149
1150 Triggering is optional when scheduling a downtime. If there is already a downtime
1151 scheduled for a future maintenance, the new downtime can be triggered by
1152 that downtime. This is useful if you have scheduled a host downtime and
1153 are now scheduling a child host's downtime which should get triggered by the parent
1154 downtime on a NOT-OK state change.
1155
1156 ### <a id="recurring-downtimes"></a> Recurring Downtimes
1157
1158 [ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
1159 recurring downtimes for services.
1160
1161 Example:
1162
1163     apply ScheduledDowntime "backup-downtime" to Service {
1164       author = "icingaadmin"
1165       comment = "Scheduled downtime for backup"
1166
1167       ranges = {
1168         monday = "02:00-03:00"
1169         tuesday = "02:00-03:00"
1170         wednesday = "02:00-03:00"
1171         thursday = "02:00-03:00"
1172         friday = "02:00-03:00"
1173         saturday = "02:00-03:00"
1174         sunday = "02:00-03:00"
1175       }
1176
1177       assign where "backup" in service.groups
1178     }
1179
1180
1181 ## <a id="comments"></a> Comments
1182
1183 Comments can be added at runtime and are persistent over restarts. You can
1184 add useful information for others on repeating incidents (for example
1185 "last time syslog at 100% cpu on 17.10.2013 due to stale nfs mount"). Comments
1186 are primarily accessible using web interfaces.
1187
1188 Comments can be added and deleted through the external command pipe
1189 provided with the `ExternalCommandListener` configuration. The caller must
1190 pass the comment id when manipulating an existing comment.
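
A minimal sketch of the two operations, assuming the Nagios-compatible `ADD_SVC_COMMENT` and `DEL_SVC_COMMENT` external command formats. The helper names, the host and service names and the comment id `42` are hypothetical. The persistent flag is set to `1` here as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ADD_SVC_COMMENT command line.
# Arguments: now host service author comment
add_svc_comment() {
  printf '[%s] ADD_SVC_COMMENT;%s;%s;1;%s;%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: print a DEL_SVC_COMMENT command line.
# Arguments: now comment_id
del_svc_comment() {
  printf '[%s] DEL_SVC_COMMENT;%s\n' "$1" "$2"
}

add_svc_comment 1400000000 my-server1 http icingaadmin "Stale nfs mount again"
del_svc_comment 1400000000 42
```

Adding a comment needs the host and service name, while deleting one only needs the id of the existing comment.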
1191
1192
1193 ## <a id="acknowledgements"></a> Acknowledgements
1194
1195 If a problem is alerted and notified, you may signal the other notification
1196 recipients that you are aware of the problem and will handle it.
1197
1198 By sending an acknowledgement to Icinga 2 (using the external command pipe
1199 provided with `ExternalCommandListener` configuration) all future notifications
1200 are suppressed, a new comment is added with the provided description and
1201 a notification with the type `NotificationFilterAcknowledgement` is sent
1202 to all notified users.
1203
1204 ### <a id="expiring-acknowledgements"></a> Expiring Acknowledgements
1205
1206 Once a problem is acknowledged it may disappear from your `unhandled problems`
1207 dashboard and no-one ever looks at it again since it will suppress
1208 notifications too.
1209
1210 This `fire-and-forget` action is quite common. If you're sure that a
1211 current problem should be resolved in the future at a defined time,
1212 you can define an expiration time when acknowledging the problem.
1213
1214 Icinga 2 will clear the acknowledgement when expired and start to
1215 re-notify if the problem persists.
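
An expiring acknowledgement could be sent as sketched below, assuming the `ACKNOWLEDGE_SVC_PROBLEM_EXPIRE` external command format, in which the expiration timestamp sits between the flags and the author. The helper name and all values are hypothetical. The sticky, notify and persistent flags are all set to `1` as an assumption; check the external command reference for their exact meaning.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ACKNOWLEDGE_SVC_PROBLEM_EXPIRE command line.
# Arguments: now host service hours author comment
# The acknowledgement expires "hours" hours after "now" (UNIX timestamp).
acknowledge_svc_problem_expire() {
  expire=$(( $1 + $4 * 3600 ))
  printf '[%s] ACKNOWLEDGE_SVC_PROBLEM_EXPIRE;%s;%s;1;1;1;%s;%s;%s\n' \
    "$1" "$2" "$3" "$expire" "$5" "$6"
}

# Acknowledge the problem for four hours.
acknowledge_svc_problem_expire 1400000000 my-server1 http 4 icingaadmin "Vendor fix expected today"
```

Computing the expiration relative to the current time keeps the caller from having to handle absolute timestamps.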
1216
1217
1218
1219 ## <a id="custom-attributes"></a> Custom Attributes
1220
1221 ### <a id="runtime-custom-attributes"></a> Using Custom Attributes at Runtime
1222
1223 Custom attributes may be used in command definitions to dynamically change how the command
1224 is executed.
1225
1226 Additionally there are Icinga 2 features such as the `PerfDataWriter` type
1227 which use custom attributes to format their output.
1228
1229 > **Tip**
1230 >
1231 > Custom attributes are identified by the 'vars' dictionary attribute as short name.
1232 > Accessing the different attribute keys is possible using the '.' accessor.
1233
1234 Custom attributes in command definitions or performance data templates are evaluated at
1235 runtime when executing a command. These custom attributes cannot be used elsewhere
1236 (e.g. in other configuration attributes).
1237
1238 Custom attribute values must be either a string, a number or a boolean value. Arrays
1239 and dictionaries cannot be used.
1240
1241 Here is an example of a command definition which uses user-defined custom attributes:
1242
1243     object CheckCommand "my-ping" {
1244       import "plugin-check-command"
1245
1246       command = [
1247         PluginDir + "/check_ping", "-4"
1248       ]
1249
1250       arguments = {
1251         "-H" = "$ping_address$"
1252         "-w" = "$ping_wrta$,$ping_wpl$%"
1253         "-c" = "$ping_crta$,$ping_cpl$%"
1254         "-p" = "$ping_packets$"
1255         "-t" = "$ping_timeout$"
1256       }
1257
1258       vars.ping_address = "$address$"
1259       vars.ping_wrta = 100
1260       vars.ping_wpl = 5
1261       vars.ping_crta = 200
1262       vars.ping_cpl = 15
1263       vars.ping_packets = 5
1264       vars.ping_timeout = 0
1265     }
1266
1267 Custom attribute names used at runtime must be enclosed in two `$` signs, e.g.
1268 `$address$`. When using the `$` sign as single character, you need to escape
1269 it with an additional dollar sign (`$$`). This example also makes use of the
1270 [command arguments](#command-arguments) passed to the command line. `-4` must
1271 be added as an additional element of the `command` array.
1272
1273 ### <a id="runtime-custom-attributes-evaluation-order"></a> Runtime Custom Attributes Evaluation Order
1274
1275 When executing commands Icinga 2 checks the following objects in this order to look
1276 up custom attributes and their respective values:
1277
1278 1. User object (only for notifications)
1279 2. Service object
1280 3. Host object
1281 4. Command object
1282 5. Global custom attributes in the `vars` constant
1283
1284 This evaluation order allows you to define default values for custom attributes
1285 in your command objects. The `my-ping` command shown above uses this to set
1286 default values for some of the latency thresholds and timeouts.
1287
1288 When using the `my-ping` command you can override some or all of the custom
1289 attributes in the service definition like this:
1290
1291     object Service "ping" {
1292       host_name = "localhost"
1293       check_command = "my-ping"
1294
1295       vars.ping_packets = 10 // Overrides the default value of 5 given in the command
1296     }
1297
1298 If a custom attribute isn't defined anywhere an empty value is used and a warning is
1299 emitted to the Icinga 2 log.
1300
1301 > **Best Practice**
1302 >
1303 > By convention every host should have an `address` attribute. Hosts
1304 > which have an IPv6 address should also have an `address6` attribute.
1305
1306 ### <a id="runtime-custom-attribute-env-vars"></a> Runtime Custom Attributes as Environment Variables
1307
1308 The `env` command object attribute specifies a dictionary of environment variables whose values are
1309 calculated from either runtime macros or custom attributes. They are exported to the environment
1310 prior to executing the command.
1311
1312 This is useful, for example, for keeping sensitive information off the command line
1313 when passing credentials to database checks:
1314
1315     object CheckCommand "mysql-health" {
1316       import "plugin-check-command"
1317
1318       command = [
1319         PluginDir + "/check_mysql"
1320       ]
1321
1322       arguments = {
1323         "-H" = "$mysql_address$"
1324         "-d" = "$mysql_database$"
1325       }
1326
1327       vars.mysql_address = "$address$"
1328       vars.mysql_database = "icinga"
1329       vars.mysql_user = "icinga_check"
1330       vars.mysql_pass = "password"
1331
1332       env.MYSQLUSER = "$mysql_user$"
1333       env.MYSQLPASS = "$mysql_pass$"
1334     }
1335
1336 ### <a id="multiple-host-addresses-custom-attributes"></a> Multiple Host Addresses using Custom Attributes
1337
1338 The following example defines a `Host` with three different interface addresses defined as
1339 custom attributes in the `vars` dictionary. The `if-eth0` and `if-eth1` services will import
1340 these values into the `address` custom attribute. This attribute is available through the
1341 generic `$address$` runtime macro.
1342
1343     object Host "multi-ip" {
1344       check_command = "dummy"
1345       vars.address_lo = "127.0.0.1"
1346       vars.address_eth0 = "10.0.0.10"
1347       vars.address_eth1 = "192.168.1.10"
1348     }
1349
1350     apply Service "if-eth0" {
1351       import "generic-service"
1352
1353       vars.address = "$host.vars.address_eth0$"
1354       check_command = "my-generic-interface-check"
1355
1356       assign where host.vars.address_eth0 != ""
1357     }
1358
1359     apply Service "if-eth1" {
1360       import "generic-service"
1361
1362       vars.address = "$host.vars.address_eth1$"
1363       check_command = "my-generic-interface-check"
1364
1365       assign where host.vars.address_eth1 != ""
1366     }
1367
1368     object CheckCommand "my-generic-interface-check" {
1369       import "plugin-check-command"
1370
1371       command = "echo \"This would be the service $service.name$ using the address value: $address$\""
1372     }
1373
1374 The `CheckCommand` object is just an example to help you with testing and
1375 understanding the different custom attributes and runtime macros.
1376
1377 ### <a id="modified-attributes"></a> Modified Attributes
1378
1379 Icinga 2 allows you to modify object attributes at runtime so that they differ from
1380 the values defined in the local configuration. These modified attributes are
1381 stored as a bit-shifted value and made available in backends. Icinga 2 stores
1382 modified attributes in its state file and restores them on restart.
1383
1384 Modified Attributes can be reset using external commands.
1385
1386
1387 ## <a id="runtime-macros"></a> Runtime Macros
1388
1389 Next to custom attributes there are additional runtime macros made available by Icinga 2.
1390 These runtime macros reflect the current object state and may change over time while
1391 custom attributes are configured statically (but can be modified at runtime using
1392 external commands).
1393
1394 ### <a id="runtime-macro-evaluation-order"></a> Runtime Macro Evaluation Order
1395
1396 Custom attributes can be accessed at [runtime](#runtime-custom-attributes) using their
1397 identifier omitting the `vars.` prefix.
1398 There are special cases when those custom attributes are not set and Icinga 2 provides
1399 a fallback to existing object attributes, for example `host.address`.
1400
1401 In the following example the `$address$` macro will be resolved with the value of `vars.address`.
1402
1403     object Host "localhost" {
1404       import "generic-host"
1405       check_command = "my-host-macro-test"
1406       address = "127.0.0.1"
1407       vars.address = "127.2.2.2"
1408     }
1409
1410     object CheckCommand "my-host-macro-test" {
1411       command = "echo \"address: $address$ host.address: $host.address$ host.vars.address: $host.vars.address$\""
1412     }
1413
1414 The check command output will look like
1415
1416     "address: 127.2.2.2 host.address: 127.0.0.1 host.vars.address: 127.2.2.2"
1417
1418 If you alter the host object and remove the `vars.address` line, Icinga 2 will fail to look up `$address$` in the
1419 custom attributes dictionary and then look for the host object's attribute.
1420
1421 The check command output will change to
1422
1423     "address: 127.0.0.1 host.address: 127.0.0.1 host.vars.address: "
1424
1425
1426 The same example can be defined for services overriding the `address` field based on a specific host custom attribute.
1427
1428     object Host "localhost" {
1429       import "generic-host"
1430       address = "127.0.0.1"
1431       vars.macro_address = "127.3.3.3"
1432     }
1433
1434     apply Service "my-macro-test" to Host {
1435       import "generic-service"
1436       check_command = "my-service-macro-test"
1437       vars.address = "$host.vars.macro_address$"
1438
1439       assign where host.address
1440     }
1441
1442     object CheckCommand "my-service-macro-test" {
1443       command = "echo \"address: $address$ host.address: $host.address$ host.vars.macro_address: $host.vars.macro_address$ service.vars.address: $service.vars.address$\""
1444     }
1445
1446 When the service check is executed the output looks like
1447
1448     "address: 127.3.3.3 host.address: 127.0.0.1 host.vars.macro_address: 127.3.3.3 service.vars.address: 127.3.3.3"
1449
1450 That way you can easily override existing macros being accessed by their short name like `$address$` and refrain
1451 from defining multiple check commands (one for `$address$` and one for `$host.vars.macro_address$`).
1452
1453
1454 ### <a id="host-runtime-macros"></a> Host Runtime Macros
1455
1456 The following host macros are available in all commands that are executed for
1457 hosts or services:
1458
1459   Name                         | Description
1460   -----------------------------|--------------
1461   host.name                    | The name of the host object.
1462   host.display_name            | The value of the `display_name` attribute.
1463   host.state                   | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1464   host.state_id                | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1465   host.state_type              | The host's current state type. Can be one of `SOFT` and `HARD`.
1466   host.check_attempt           | The current check attempt number.
1467   host.max_check_attempts      | The maximum number of checks which are executed before changing to a hard state.
1468   host.last_state              | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1469   host.last_state_id           | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1470   host.last_state_type         | The host's previous state type. Can be one of `SOFT` and `HARD`.
1471   host.last_state_change       | The last state change's timestamp.
1472   host.duration_sec            | The time since the last state change.
1473   host.latency                 | The host's check latency.
1474   host.execution_time          | The host's check execution time.
1475   host.output                  | The last check's output.
1476   host.perfdata                | The last check's performance data.
1477   host.last_check              | The timestamp when the last check was executed.
1478   host.num_services            | Number of services associated with the host.
1479   host.num_services_ok         | Number of services associated with the host which are in an `OK` state.
1480   host.num_services_warning    | Number of services associated with the host which are in a `WARNING` state.
1481   host.num_services_unknown    | Number of services associated with the host which are in an `UNKNOWN` state.
1482   host.num_services_critical   | Number of services associated with the host which are in a `CRITICAL` state.
1483
1484 ### <a id="service-runtime-macros"></a> Service Runtime Macros
1485
1486 The following service macros are available in all commands that are executed for
1487 services:
1488
1489   Name                       | Description
1490   ---------------------------|--------------
1491   service.name               | The short name of the service object.
1492   service.display_name       | The value of the `display_name` attribute.
1493   service.check_command      | The short name of the command along with any arguments to be used for the check.
1494   service.state              | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1495   service.state_id           | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1496   service.state_type         | The service's current state type. Can be one of `SOFT` and `HARD`.
1497   service.check_attempt      | The current check attempt number.
1498   service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
1499   service.last_state         | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1500   service.last_state_id      | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1501   service.last_state_type    | The service's previous state type. Can be one of `SOFT` and `HARD`.
1502   service.last_state_change  | The last state change's timestamp.
1503   service.duration_sec       | The time since the last state change.
1504   service.latency            | The service's check latency.
1505   service.execution_time     | The service's check execution time.
1506   service.output             | The last check's output.
1507   service.perfdata           | The last check's performance data.
1508   service.last_check         | The timestamp when the last check was executed.
1509
1510 ### <a id="command-runtime-macros"></a> Command Runtime Macros
1511
1512 The following macros are available in all commands:
1513
1514   Name                   | Description
1515   -----------------------|--------------
1516   command.name           | The name of the command object.
1517
1518 ### <a id="user-runtime-macros"></a> User Runtime Macros
1519
1520 The following macros are available in all commands that are executed for
1521 users:
1522
1523   Name                   | Description
1524   -----------------------|--------------
1525   user.name              | The name of the user object.
1526   user.display_name      | The value of the `display_name` attribute.
1527
1528 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros
1529
1530   Name                   | Description
1531   -----------------------|--------------
1532   notification.type      | The type of the notification.
1533   notification.author    | The author of the notification comment, if existing.
1534   notification.comment   | The comment of the notification, if existing.
1535
1536 ### <a id="global-runtime-macros"></a> Global Runtime Macros
1537
1538 The following macros are available in all executed commands:
1539
1540   Name                   | Description
1541   -----------------------|--------------
1542   icinga.timet           | Current UNIX timestamp.
1543   icinga.long_date_time  | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
1544   icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
1545   icinga.date            | Current date. Example: `2014-01-03`
1546   icinga.time            | Current time including timezone information. Example: `11:23:08 +0000`
1547   icinga.uptime          | Current uptime of the Icinga 2 process.
1548
1549 The following macros provide global statistics:
1550
1551   Name                              | Description
1552   ----------------------------------|--------------
1553   icinga.num_services_ok            | Current number of services in state 'OK'.
1554   icinga.num_services_warning       | Current number of services in state 'Warning'.
1555   icinga.num_services_critical      | Current number of services in state 'Critical'.
1556   icinga.num_services_unknown       | Current number of services in state 'Unknown'.
1557   icinga.num_services_pending       | Current number of pending services.
1558   icinga.num_services_unreachable   | Current number of unreachable services.
1559   icinga.num_services_flapping      | Current number of flapping services.
1560   icinga.num_services_in_downtime   | Current number of services in downtime.
1561   icinga.num_services_acknowledged  | Current number of acknowledged service problems.
1562   icinga.num_hosts_up               | Current number of hosts in state 'Up'.
1563   icinga.num_hosts_down             | Current number of hosts in state 'Down'.
1564   icinga.num_hosts_unreachable      | Current number of unreachable hosts.
1565   icinga.num_hosts_flapping         | Current number of flapping hosts.
1566   icinga.num_hosts_in_downtime      | Current number of hosts in downtime.
1567   icinga.num_hosts_acknowledged     | Current number of acknowledged host problems.
1568
1569
1570 ## <a id="check-result-freshness"></a> Check Result Freshness
1571
1572 In Icinga 2 active check freshness is enabled by default. A check result is no
1573 longer fresh if no new check result arrives within the `check_interval` period.
1574
1575     threshold = last check execution time + check interval
1576
1577 Passive check freshness is calculated from the `check_interval` attribute if set.
1578
1579     threshold = last check result time + check interval
1580
1581 If a check result is no longer fresh, a new check is executed using the command
1582 defined by the `check_command` attribute.
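
The threshold formula above can be sketched in POSIX shell. This only illustrates
the arithmetic and is not how Icinga 2 implements it; the function name `is_stale`
and its arguments are hypothetical, and all values are UNIX timestamps or seconds.

```shell
# Hypothetical helper illustrating the freshness formula
#   threshold = last check time + check interval
# Arguments: <last_check_timestamp> <check_interval_seconds> <now_timestamp>
# Returns 0 (true) if "now" is past the threshold, i.e. the result is stale.
is_stale() {
  last_check=$1
  check_interval=$2
  now=$3
  threshold=$((last_check + check_interval))
  [ "$now" -gt "$threshold" ]
}
```

With a last check at `1382014885` and a check interval of `300` seconds, the
threshold is `1382015185`; any later timestamp makes `is_stale` return true.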
1583
1584
1585 ## <a id="check-flapping"></a> Check Flapping
1586
1587 The flapping algorithm used in Icinga 2 does not store the past states but
1588 calculates a single flapping value based on counters and half-life values.
1589 Icinga 2 compares this value with a single flapping threshold configuration
1590 attribute named `flapping_threshold`.
1591
1592 Flapping detection can be enabled or disabled using the `enable_flapping` attribute.
1593
1594
1595 ## <a id="volatile-services"></a> Volatile Services
1596
1597 By default all services remain in a non-volatile state. When a problem
1598 occurs, the `SOFT` state applies and once the check counter reaches the
1599 `max_check_attempts` value, a `HARD` state transition happens.
1600 Notifications are only triggered by `HARD` state changes and are then
1601 re-sent at the interval defined by the `interval` attribute.
1602
1603 It may be reasonable to have a volatile service which stays in a `HARD`
1604 state type if the service stays in a `NOT-OK` state. That way each
1605 service recheck will automatically trigger a notification unless the
1606 service is acknowledged or in a scheduled downtime.
1607
1608
1609 ## <a id="external-commands"></a> External Commands
1610
1611 Icinga 2 provides an external command pipe for processing commands
1612 triggering specific actions (for example rescheduling a service check
1613 through the web interface).
1614
1615 To enable the `ExternalCommandListener` configuration, run the
1616 following command and restart Icinga 2 afterwards:
1617
1618     # icinga2-enable-feature command
1619
1620 Icinga 2 creates the command pipe file as `/var/run/icinga2/cmd/icinga2.cmd`
1621 using the default configuration.
1622
1623 Web interfaces and other Icinga addons are able to send commands to
1624 Icinga 2 through the external command pipe, for example for rescheduling
1625 a forced service check:
1626
1627     # /bin/echo "[`date +%s`] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;`date +%s`" >> /var/run/icinga2/cmd/icinga2.cmd
1628
1629     # tail -f /var/log/messages
1630
1631     Oct 17 15:01:25 icinga-server icinga2: Executing external command: [1382014885] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382014885
1632     Oct 17 15:01:25 icinga-server icinga2: Rescheduling next check for service 'ping4'
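
The command line written to the pipe above follows the pattern
`[<timestamp>] <COMMAND>;<argument>;...`. A minimal POSIX shell sketch of a helper
that builds such a line is shown here. The function names and the
`ICINGA2_CMD_PIPE` variable are hypothetical and not part of Icinga 2; the pipe
path is the default mentioned above.

```shell
# Format one external command line: "[<timestamp>] <COMMAND>;<arg1>;<arg2>;..."
# Arguments: <timestamp> <command> [args...]
format_external_command() {
  timestamp=$1
  shift
  # Join the command name and its arguments with ';'
  joined=$(IFS=';'; printf '%s' "$*")
  printf '[%s] %s\n' "$timestamp" "$joined"
}

# Hypothetical wrapper: append the formatted command to the command pipe.
# ICINGA2_CMD_PIPE is an assumed override variable, not an Icinga 2 setting.
send_external_command() {
  pipe=${ICINGA2_CMD_PIPE:-/var/run/icinga2/cmd/icinga2.cmd}
  format_external_command "$(date +%s)" "$@" >> "$pipe"
}
```

Calling `send_external_command SCHEDULE_FORCED_SVC_CHECK localhost ping4 "$(date +%s)"`
would then write the same kind of line as the `echo` example above.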
1633
1634
1635 ### <a id="external-command-list"></a> External Command List
1636
1637 A list of currently supported external commands can be found [here](#external-commands-list-detail).
1638
1639 Detailed information on the commands and their required parameters can be found
1640 in the [Icinga 1.x documentation](http://docs.icinga.org/latest/en/extcommands2.html).
1641
1642 ## <a id="logging"></a> Logging
1643
1644 Icinga 2 supports three different types of logging:
1645
1646 * File logging
1647 * Syslog (on *NIX-based operating systems)
1648 * Console logging (`STDOUT` on tty)
1649
1650 You can enable or disable loggers using the `icinga2-enable-feature`
1651 and `icinga2-disable-feature` commands:
1652
1653 Feature  | Description
1654 ---------|------------
1655 debuglog | Debug log (path: `/var/log/icinga2/debug.log`, severity: `debug` or higher)
1656 mainlog  | Main log (path: `/var/log/icinga2/icinga2.log`, severity: `information` or higher)
1657 syslog   | Syslog (severity: `warning` or higher)
1658
1659 By default the `mainlog` feature is enabled. When running Icinga 2
1660 on a terminal, log messages with severity `information` or higher are
1661 written to the console.
1662
1663
1664 ## <a id="performance-data"></a> Performance Data
1665
1666 When a host or service check is executed, plugins should provide so-called
1667 `performance data`. In addition, further check performance data
1668 can be fetched using Icinga 2 runtime macros such as the check latency
1669 or the current service state (or additional custom attributes).
1670
1671 The performance data can be passed to external applications which aggregate and
1672 store them in their backends. These tools usually generate graphs for historical
1673 reporting and trending.
1674
1675 Well-known addons processing Icinga performance data are PNP4Nagios,
1676 inGraph and Graphite.
1677
1678 ### <a id="writing-performance-data-files"></a> Writing Performance Data Files
1679
1680 PNP4Nagios, inGraph and Graphios use performance data collector daemons to fetch
1681 the current performance files for their backend updates.
1682
1683 Therefore the Icinga 2 `PerfdataWriter` object allows you to define
1684 the output template format for hosts and services, filled in with Icinga 2
1685 runtime macros.
1686
1687     host_format_template = "DATATYPE::HOSTPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tHOSTPERFDATA::$host.perfdata$\tHOSTCHECKCOMMAND::$host.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$"
1688     service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.name$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.state_type$"
1689
1690 The default templates are already provided with the Icinga 2 feature configuration
1691 which can be enabled using
1692
1693     # icinga2-enable-feature perfdata
1694
1695 By default all performance data files are rotated every 15 seconds into
1696 the `/var/spool/icinga2/perfdata/` directory as `host-perfdata.<timestamp>` and
1697 `service-perfdata.<timestamp>`.
1698 External collectors need to parse the rotated performance data files and then
1699 remove the processed files.
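
As an illustration of what such a collector has to do, here is a minimal POSIX
shell sketch. It assumes the tab-separated `KEY::VALUE` layout of the templates
shown above; the function names are hypothetical, and a real collector would
forward the values to its backend instead of printing them.

```shell
# Print the value of one KEY::VALUE field from a tab-separated perfdata line.
# Arguments: <line> <key>
perfdata_field() {
  printf '%s\n' "$1" | awk -F '\t' -v key="$2" '
    {
      for (i = 1; i <= NF; i++) {
        # Each field has the form KEY::VALUE
        sep = index($i, "::")
        if (sep > 0 && substr($i, 1, sep - 1) == key) {
          print substr($i, sep + 2)
          exit
        }
      }
    }'
}

# Hypothetical collector loop: print host name and perfdata for every line of
# each rotated service perfdata file, then remove the processed file.
# Argument: spool directory (defaults to the path mentioned above)
process_perfdata_files() {
  spool_dir=${1:-/var/spool/icinga2/perfdata}
  for file in "$spool_dir"/service-perfdata.*; do
    [ -f "$file" ] || continue
    while IFS= read -r line; do
      printf '%s %s\n' \
        "$(perfdata_field "$line" HOSTNAME)" \
        "$(perfdata_field "$line" SERVICEPERFDATA)"
    done < "$file"
    rm -f "$file"
  done
}
```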
1700
1701 ### <a id="graphite-carbon-cache-writer"></a> Graphite Carbon Cache Writer
1702
1703 While there are some Graphite collector scripts and daemons like Graphios available for
1704 Icinga 1.x, it is more reasonable to process the check and plugin performance data directly
1705 in memory in Icinga 2. Once new metrics are available, Icinga 2 writes them directly
1706 to the TCP socket of the configured Graphite Carbon daemon.
1707
1708 You can enable the feature using
1709
1710     # icinga2-enable-feature graphite
1711
1712 By default the `GraphiteWriter` object expects the Graphite Carbon Cache to listen at
1713 `127.0.0.1` on port `2003`.
1714
1715 The current naming schema is
1716
1717     icinga.<hostname>.<metricname>
1718     icinga.<hostname>.<servicename>.<metricname>
1719
1720 To make sure Icinga 2 writes a valid label into Graphite some characters are replaced
1721 with `_` in the target name:
1722
1723     \/.-  (and space)
1724
1725 The resulting name in Graphite might look like:
1726
1727     www-01 / http-cert / response time
1728     icinga.www_01.http_cert.response_time
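
The replacement rule and naming schema described above can be reproduced with a
short POSIX shell sketch, for example to predict a metric name before searching
for it in Graphite. The function names are hypothetical; Icinga 2 performs this
escaping internally.

```shell
# Replace backslash, slash, dot, dash and space with '_'.
graphite_escape() {
  printf '%s\n' "$1" | sed 's#[\\/. -]#_#g'
}

# Build a metric path following the naming schema above.
# Arguments: <hostname> <metricname>  or  <hostname> <servicename> <metricname>
graphite_path() {
  host=$(graphite_escape "$1")
  if [ $# -eq 3 ]; then
    printf 'icinga.%s.%s.%s\n' "$host" "$(graphite_escape "$2")" "$(graphite_escape "$3")"
  else
    printf 'icinga.%s.%s\n' "$host" "$(graphite_escape "$2")"
  fi
}
```

`graphite_path 'www-01' 'http-cert' 'response time'` prints
`icinga.www_01.http_cert.response_time`, matching the example above.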
1729
1730 In addition to the performance data retrieved from the check plugin, Icinga 2 sends
1731 internal check statistic data to Graphite:
1732
1733   metric             | description
1734   -------------------|------------------------------------------
1735   current_attempt    | current check attempt
1736   max_check_attempts | maximum check attempts until the hard state is reached
1737   reachable          | checked object is reachable
1738   execution_time     | check execution time
1739   latency            | check latency
1740   state              | current state of the checked object
1741   state_type         | 0=SOFT, 1=HARD state
1742
1743 The following example illustrates how to configure the storage-schemas for Graphite Carbon
1744 Cache. Please make sure that the order is correct because the first match wins.
1745
1746     [icinga_internals]
1747     pattern = ^icinga\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
1748     retentions = 5m:7d
1749
1750     [icinga_default]
1751     # intervals like PNP4Nagios uses them per default
1752     pattern = ^icinga\.
1753     retentions = 1m:2d,5m:10d,30m:90d,360m:4y
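
To see why the order matters, the following POSIX shell sketch checks a metric
name against the two patterns from the example in file order. The helper name
`matching_schema` is hypothetical, and `grep -E` is only used here as an
approximation of the regular expression matching done by Carbon.

```shell
# Print the storage-schemas section that would apply to a metric name,
# testing the patterns from the example above in order (first match wins).
matching_schema() {
  metric=$1
  internals='^icinga\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)'
  if printf '%s\n' "$metric" | grep -Eq "$internals"; then
    echo icinga_internals
  elif printf '%s\n' "$metric" | grep -Eq '^icinga\.'; then
    echo icinga_default
  else
    echo none
  fi
}
```

Every internal metric also matches `^icinga\.`, so if the `[icinga_default]`
section came first, the `[icinga_internals]` retentions would never be applied.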
1754
1755 ## <a id="status-data"></a> Status Data
1756
1757 Icinga 1.x writes object configuration data and status data at a regular
1758 interval to its `objects.cache` and `status.dat` files. Icinga 2 provides
1759 the `StatusDataWriter` object which dumps all configuration objects and
1760 status updates at a regular interval.
1761
1762     # icinga2-enable-feature statusdata
1763
1764 Icinga 1.x Classic UI requires this data set as part of its backend.
1765
1766 > **Note**
1767 >
1768 > If you are not using any web interface or addon which uses these files
1769 > you can safely disable this feature.
1770
1771
1772 ## <a id="compat-logging"></a> Compat Logging
1773
1774 The Icinga 1.x log format is called the `Compat Log` in Icinga 2
1775 and is provided by the `CompatLogger` object.
1776
1777 These logs are not only used for informational representation in
1778 external web interfaces parsing the logs, but also to generate
1779 SLA reports and trends in Icinga 1.x Classic UI. Furthermore the
1780 [Livestatus](#livestatus) feature uses these logs for answering queries to
1781 historical tables.
1782
1783 The `CompatLogger` object can be enabled with
1784
1785     # icinga2-enable-feature compatlog
1786
1787 By default, the Icinga 1.x log file called `icinga.log` is located
1788 in `/var/log/icinga2/compat`. Rotated log files are moved into
1789 `/var/log/icinga2/compat/archives`.
1790
1791 The format cannot be changed without breaking compatibility to
1792 existing log parsers.
1793
1794     # tail -f /var/log/icinga2/compat/icinga.log
1795
1796     [1382115688] LOG ROTATION: HOURLY
1797     [1382115688] LOG VERSION: 2.0
1798     [1382115688] HOST STATE: CURRENT;localhost;UP;HARD;1;
1799     [1382115688] SERVICE STATE: CURRENT;localhost;disk;WARNING;HARD;1;
1800     [1382115688] SERVICE STATE: CURRENT;localhost;http;OK;HARD;1;
1801     [1382115688] SERVICE STATE: CURRENT;localhost;load;OK;HARD;1;
1802     [1382115688] SERVICE STATE: CURRENT;localhost;ping4;OK;HARD;1;
1803     [1382115688] SERVICE STATE: CURRENT;localhost;ping6;OK;HARD;1;
1804     [1382115688] SERVICE STATE: CURRENT;localhost;processes;WARNING;HARD;1;
1805     [1382115688] SERVICE STATE: CURRENT;localhost;ssh;OK;HARD;1;
1806     [1382115688] SERVICE STATE: CURRENT;localhost;users;OK;HARD;1;
1807     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;disk;1382115705
1808     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;http;1382115705
1809     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;load;1382115705
1810     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382115705
1811     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping6;1382115705
1812     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;processes;1382115705
1813     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ssh;1382115705
1814     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;users;1382115705
1815     [1382115731] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;localhost;ping6;2;critical test|
1816     [1382115731] SERVICE ALERT: localhost;ping6;CRITICAL;SOFT;2;critical test
1817
1818
1819
1820
1821 ## <a id="db-ido"></a> DB IDO
1822
1823 The IDO (Icinga Data Output) modules for Icinga 2 take care of exporting all
1824 configuration and status information into a database. The IDO database is used
1825 by a number of projects including Icinga Web 1.x and 2.
1826
1827 Details on the installation can be found in the [Getting Started](#configuring-ido)
1828 chapter. Details on the configuration can be found in the
1829 [IdoMysqlConnection](#objecttype-idomysqlconnection) and
1830 [IdoPgsqlConnection](#objecttype-idopgsqlconnection)
1831 object configuration documentation.
1832 The DB IDO feature supports [High Availability](#high-availability-db-ido) in
1833 the Icinga 2 cluster.
1834
1835 The following example query checks the health of the current Icinga 2 instance,
1836 which writes its current status to the DB IDO backend table `icinga_programstatus`
1837 every 10 seconds. By default it checks 60 seconds into the past, which is a reasonable
1838 amount of time; adjust it to your requirements. If the condition is not met,
1839 the query returns an empty result.
1840
1841 > **Tip**
1842 >
1843 > Use [check plugins](#plugins) to monitor the backend.
1844
1845 Replace the `default` string with your instance name, if different.
1846
1847 Example for MySQL:
1848
1849     # mysql -u root -p icinga -e "SELECT status_update_time FROM icinga_programstatus ps
1850       JOIN icinga_instances i ON ps.instance_id=i.instance_id
1851       WHERE (UNIX_TIMESTAMP(ps.status_update_time) > UNIX_TIMESTAMP(NOW())-60)
1852       AND i.instance_name='default';"
1853
1854     +---------------------+
1855     | status_update_time  |
1856     +---------------------+
1857     | 2014-05-29 14:29:56 |
1858     +---------------------+
1859
1860
1861 Example for PostgreSQL:
1862
1863     # export PGPASSWORD=icinga; psql -U icinga -d icinga -c "SELECT ps.status_update_time FROM icinga_programstatus AS ps
1864       JOIN icinga_instances AS i ON ps.instance_id=i.instance_id
1865       WHERE ((SELECT extract(epoch from status_update_time) FROM icinga_programstatus) > (SELECT extract(epoch from now())-60))
1866       AND i.instance_name='default'";
1867
1868     status_update_time
1869     ------------------------
1870      2014-05-29 15:11:38+02
1871     (1 row)
1872
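
Because the health queries above return an empty result when the status is too
old, their output can be mapped to a plugin-style result. The following POSIX
shell sketch shows one way to do that. The function name `ido_health_status` is
hypothetical, and it assumes the query output has been reduced to the bare
timestamp, for example with the `mysql` client's `-N` and `-B` options.

```shell
# Map the health query result to a plugin-style status line and return code.
# Argument: the query output (a timestamp, or an empty string)
# Returns 0 (OK) for a non-empty result and 2 (CRITICAL) for an empty one.
ido_health_status() {
  result=$1
  if [ -n "$result" ]; then
    echo "OK - last IDO status update: $result"
    return 0
  else
    echo "CRITICAL - no IDO status update within the checked time window"
    return 2
  fi
}
```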
1873
1874 A detailed list of the available table attributes can be found in the [DB IDO Schema documentation](#schema-db-ido).
1875
1876
1877 ## <a id="livestatus"></a> Livestatus
1878
1879 The [MK Livestatus](http://mathias-kettner.de/checkmk_livestatus.html) project
1880 implements a query protocol that lets users query their Icinga instance for
1881 status information. It can also be used to send commands.
1882
1883 Details on the installation can be found in the [Getting Started](#setting-up-livestatus)
1884 chapter.
1885
1886 ### <a id="livestatus-sockets"></a> Livestatus Sockets
1887
1888 Unlike the Icinga 1.x addon, Icinga 2 supports two socket types:
1889
1890 * Unix socket (default)
1891 * TCP socket
1892
1893 Details on the configuration can be found in the [LivestatusListener](#objecttype-livestatuslistener)
1894 object configuration.
1895
1896 ### <a id="livestatus-get-queries"></a> Livestatus GET Queries
1897
1898 > **Note**
1899 >
1900 > All Livestatus queries require an additional empty line as query end identifier.
1901 > The `unixcat` tool is available either from the MK Livestatus project or as a separate
1902 > binary.
1903
1904 There is also a Perl module available on CPAN for accessing the Livestatus socket
1905 programmatically: [Monitoring::Livestatus](http://search.cpan.org/~nierlein/Monitoring-Livestatus-0.74/)
1906
1907
1908 Example using the Unix socket:
1909
1910     # echo -e "GET services\n" | unixcat /var/run/icinga2/cmd/livestatus
1911
1912 Example using the TCP socket listening on port `6558`:
1913
1914     # echo -e 'GET services\n' | netcat 127.0.0.1 6558
1915
1916     # cat > servicegroups <<EOF
1917     GET servicegroups
1918
1919     EOF
1920
1921     (cat servicegroups; sleep 1) | netcat 127.0.0.1 6558
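
Since every query must end with an additional empty line, a small helper can add
it automatically. This is a minimal POSIX shell sketch; the function name
`livestatus_query` is hypothetical.

```shell
# Print a Livestatus query: one query line per argument,
# followed by the empty line that marks the end of the query.
livestatus_query() {
  printf '%s\n' "$@"
  printf '\n'
}
```

It could be used as `livestatus_query 'GET services' | netcat 127.0.0.1 6558`,
assuming the TCP socket from the example above.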
1922
1923
1924 ### <a id="livestatus-command-queries"></a> Livestatus COMMAND Queries
1925
1926 A list of available external commands and their parameters can be found [here](#external-commands-list-detail).
1927
1928     $ echo -e 'COMMAND <externalcommandstring>' | netcat 127.0.0.1 6558
1929
1930
1931 ### <a id="livestatus-filters"></a> Livestatus Filters
1932
1933 Filters can be combined and inverted using `And`, `Or` and `Negate`.
1934
1935   Operator  | Negate   | Description
1936   ----------|----------|------------------------
1937    =        | !=       | Equality
1938    ~        | !~       | Regex match
1939    =~       | !=~      | Equality ignoring case
1940    ~~       | !~~      | Regex ignoring case
1941    <        |          | Less than
1942    >        |          | Greater than
1943    <=       |          | Less than or equal
1944    >=       |          | Greater than or equal
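
As an example of the operators in the table above, the following POSIX shell
sketch prints a query for critical services on hosts whose name matches a regular
expression. The function name is hypothetical. The sketch assumes that the
`Columns:` header and the `host_name`, `description` and `state` columns behave
as in MK Livestatus, where consecutive `Filter:` lines are combined with a
logical AND.

```shell
# Print a query for services in state 2 (critical) on hosts matching a regex.
# Argument: <host_name_regex>
critical_services_query() {
  host_regex=$1
  cat <<EOF
GET services
Columns: host_name description state
Filter: state = 2
Filter: host_name ~ $host_regex
EOF
  # The additional empty line marks the end of the query
  printf '\n'
}
```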
1945
1946
1947 ### <a id="livestatus-stats"></a> Livestatus Stats
1948
1949 Schema: "Stats: aggregatefunction aggregateattribute"
1950
1951   Aggregate Function | Description
1952   -------------------|--------------
1953   sum                | &nbsp;
1954   min                | &nbsp;
1955   max                | &nbsp;
1956   avg                | sum / count
1957   std                | standard deviation
1958   suminv             | sum (1 / value)
1959   avginv             | suminv / count
1960   count              | default for any stats query if no aggregate function is defined
1961
1962 Example:
1963
1964     GET hosts
1965     Filter: has_been_checked = 1
1966     Filter: check_type = 0
1967     Stats: sum execution_time
1968     Stats: sum latency
1969     Stats: sum percent_state_change
1970     Stats: min execution_time
1971     Stats: min latency
1972     Stats: min percent_state_change
1973     Stats: max execution_time
1974     Stats: max latency
1975     Stats: max percent_state_change
1976     OutputFormat: json
1977     ResponseHeader: fixed16
1978
1979 ### <a id="livestatus-output"></a> Livestatus Output
1980
1981 * CSV
1982
1983 CSV Output uses two levels of array separators: The members array separator
1984 is a comma (1st level) while extra info and host|service relation separator
1985 is a pipe (2nd level).
1986
1987 Separators can be set using ASCII codes like:
1988
1989     Separators: 10 59 44 124
1990
1991 * JSON
1992
1993 Default separators.
1994
1995 ### <a id="livestatus-error-codes"></a> Livestatus Error Codes
1996
1997   Code      | Description
1998   ----------|--------------
1999   200       | OK
2000   404       | Table does not exist
2001   452       | Exception on query
2002
2003 ### <a id="livestatus-tables"></a> Livestatus Tables
2004
2005   Table         | Join      | Description
2006   --------------|-----------|----------------------------
2007   hosts         | &nbsp;    | host config and status attributes, services counter
2008   hostgroups    | &nbsp;    | hostgroup config, status attributes and host/service counters
2009   services      | hosts     | service config and status attributes
2010   servicegroups | &nbsp;    | servicegroup config, status attributes and service counters
2011   contacts      | &nbsp;    | contact config and status attributes
2012   contactgroups | &nbsp;    | contact config, members
2013   commands      | &nbsp;    | command name and line
2014   status        | &nbsp;    | programstatus, config and stats
2015   comments      | services  | status attributes
2016   downtimes     | services  | status attributes
2017   timeperiods   | &nbsp;    | name and is inside flag
2018   endpoints     | &nbsp;    | config and status attributes
2019   log           | services, hosts, contacts, commands | parses [compatlog](#objecttype-compatlogger) and shows log attributes
2020   statehist     | hosts, services | parses [compatlog](#objecttype-compatlogger) and aggregates state change attributes
2021
2022 The `commands` table is populated with `CheckCommand`, `EventCommand` and `NotificationCommand` objects.
2023
2024 A detailed list of the available table attributes can be found in the [Livestatus Schema documentation](#schema-livestatus).
2025
2026
2027 ## <a id="check-result-files"></a> Check Result Files
2028
2029 Icinga 1.x writes its check result files to a temporary spool directory
2030 where they are processed at a regular interval.
2031 While this is extremely inefficient in terms of performance, it has
2032 proven useful for passing passive check results directly into Icinga 1.x,
2033 skipping the external command pipe.
2034
2035 Several clustered/distributed environments and check-aggregation addons
2036 use that method. In order to support step-by-step migration of these
2037 environments, Icinga 2 ships the `CheckResultReader` object.
2038
2039 There is no feature configuration available for it. Instead it must be
2040 defined on demand in your Icinga 2 objects configuration.
2041
2042     object CheckResultReader "reader" {
2043       spool_dir = "/data/check-results"
2044     }