   1 # <a id="monitoring-basics"></a> Monitoring Basics
   2
   3 This part of the Icinga 2 documentation provides an overview of all the basic
   4 monitoring concepts you need to know to run Icinga 2.
   5
   6 ## <a id="hosts-services"></a> Hosts and Services
   7
   8 Icinga 2 can be used to monitor the availability of hosts and services. Hosts
   9 and services can be virtually anything which can be checked in some way:
  10
  11 * Network services (HTTP, SMTP, SNMP, SSH, etc.)
  12 * Printers
  13 * Switches / routers
  14 * Temperature sensors
  15 * Other local or network-accessible services
  16
  17 Host objects provide a mechanism to group services that are running
  18 on the same physical device.
  19
  20 Here is an example of a host object which defines two child services:
  21
  22     object Host "my-server1" {
  23       address = "10.0.0.1"
  24       check_command = "hostalive"
  25     }
  26
  27     object Service "ping4" {
  28       host_name = "my-server1"
  29       check_command = "ping4"
  30     }
  31
  32     object Service "http" {
  33       host_name = "my-server1"
  34       check_command = "http"
  35     }
  36
  37 The example creates two services `ping4` and `http` which belong to the
  38 host `my-server1`.
  39
  40 It also specifies that the host should perform its own check using the `hostalive`
  41 check command.
  42
  43 The `address` attribute is used by check commands to determine which network
  44 address is associated with the host object.
  45
  46 Details on troubleshooting check problems can be found [here](#troubleshooting).
  47
  48 ### <a id="host-states"></a> Host States
  49
  50 Hosts can be in any of the following states:
  51
  52   Name        | Description
  53   ------------|--------------
  54   UP          | The host is available.
  55   DOWN        | The host is unavailable.
  56
  57 ### <a id="service-states"></a> Service States
  58
  59 Services can be in any of the following states:
  60
  61   Name        | Description
  62   ------------|--------------
  63   OK          | The service is working properly.
  64   WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  65   CRITICAL    | The service is in a critical state.
  66   UNKNOWN     | The check could not determine the service's state.
  67
  68 ### <a id="hard-soft-states"></a> Hard and Soft States
  69
When Icinga detects a problem with a host or service, it re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

If the object is still in a non-OK state after all re-checks have been executed,
the host/service switches to a `HARD` state and notifications are sent.
  77
  78   Name        | Description
  79   ------------|--------------
  80   HARD        | The host/service's state hasn't recently changed.
  81   SOFT        | The host/service has recently changed state and is being re-checked.
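
As a rough illustration of how these settings interact, the sketch below sets both attributes on the `http` service from the earlier example. The concrete values are assumptions chosen for illustration, not recommendations.

```
object Service "http" {
  host_name = "my-server1"
  check_command = "http"

  check_interval = 5m      // regular interval between checks
  retry_interval = 1m      // shorter interval used for re-checks while in a SOFT state
  max_check_attempts = 3   // check attempts before a problem state becomes HARD
}
```

With these values, a failing service would be re-checked at one-minute intervals, and notifications would only be sent once the third attempt still reports a problem.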
  82
  83 ### <a id="host-service-checks"></a> Host and Service Checks
  84
Hosts and services determine their state from the check result that a check
execution returns to the Icinga 2 application. By default the `generic-host` example template
defines `hostalive` as the host check. If your host does not respond to ping, you should
consider using a different check command, for instance the `http` check command, or, if
no suitable check is available, the `dummy` check command.
  90
    object Host "uncheckable-host" {
      check_command = "dummy"
      vars.dummy_state = 0 // plugin exit code 0 matches the "OK" text below
      vars.dummy_text = "Pretending to be OK."
    }
  96
  97 Service checks could also use a `dummy` check, but the common strategy is to
  98 [integrate an existing plugin](#command-plugin-integration) as
  99 [check command](#check-commands) and [reference](#command-passing-parameters)
 100 that in your [Service](#objecttype-service) object definition.
 101
 102 ## <a id="configuration-best-practice"></a> Configuration Best Practice
 103
The [Getting Started](#getting-started) chapter already introduced various aspects
of the Icinga 2 configuration language. If you are ready to configure additional
hosts, services, notifications, dependencies, etc., you should think about the
requirements first and then decide on a strategy.
 108
 109 There are many ways of creating Icinga 2 configuration objects:
 110
 111 * Manually with your preferred editor, for example vi(m), nano, notepad, etc.
 112 * Generated by a configuration management tool such as Puppet, Chef, Ansible, etc.
 113 * A configuration addon for Icinga 2
 114 * A custom exporter script from your CMDB or inventory tool
 115 * your own.
 116
To find the best strategy for your own configuration, ask yourself the following questions:

* Do your hosts share a common group of services (for example Linux hosts with disk, load, etc. checks)?
* Does only a small set of users receive notifications and escalations for all hosts/services?

If you can answer at least one of these questions with yes, look at the [apply rules](#using-apply) logic
instead of defining objects on a per host and service basis.

* Are you required to define specific configuration for each host/service?
* Does your configuration generation tool already know about the host-service relationship?

If so, define each object individually and set object-specific attributes such as `host_name` directly.
 129
 130 Finding the best files and directory tree for your configuration is up to you. Make sure that
 131 the [icinga2.conf](#icinga2-conf) configuration file includes them, and then think about:
 132
 133 * tree-based on locations, hostgroups, specific host attributes with sub levels of directories.
 134 * flat `hosts.conf`, `services.conf`, etc files for rule based configuration.
 135 * generated configuration with one file per host and a global configuration for groups, users, etc.
 136 * one big file generated from an external application (probably a bad idea for maintaining changes).
 137 * your own.
 138
Whichever strategy you choose, you should additionally check the following:
 140
 141 * Are there any specific attributes describing the host/service you could set as `vars` custom attributes?
 142 You can later use them for applying assign/ignore rules, or export them into external interfaces.
 143 * Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules.
 144 * Use templates to store generic attributes for your objects and apply rules making your configuration more readable.
 145 Details can be found in the [using templates](#using-templates) chapter.
 146 * Apply rules may overlap. Keep a central place (for example, `services.conf` or `notifications.conf`) storing
 147 the configuration instead of defining apply rules deep in your configuration tree.
 148 * Every plugin used as check, notification or event command requires a `Command` definition.
 149 Further details can be looked up in the [check commands](#check-commands) chapter.
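
To make the points about custom attributes, templates and apply rules more concrete, here is a minimal sketch. The names `linux-host` and `web1`, the address and the `vars.os` attribute are assumptions made up for this illustration; the `ssh` check command is assumed to be available from the ITL.

```
/* Template storing generic attributes, including a descriptive custom attribute. */
template Host "linux-host" {
  check_command = "hostalive"
  vars.os = "Linux"
}

/* The host only adds what is specific to it. */
object Host "web1" {
  import "linux-host"
  address = "192.0.2.10"
}

/* One central apply rule instead of one Service object per host. */
apply Service "ssh" {
  check_command = "ssh"

  assign where host.vars.os == "Linux"
}
```

Keeping the rule in a central file such as `services.conf` makes it easier to spot overlapping apply rules later on.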
 150
 151 If you happen to have further questions, do not hesitate to join the [community support channels](https://support.icinga.org)
 152 and ask community members for their experience and best practices.
 153
 154
 155 ### <a id="object-inheritance-using-templates"></a> Object Inheritance Using Templates
 156
 157 Templates may be used to apply a set of identical attributes to more than one
 158 object:
 159
 160     template Service "generic-service" {
 161       max_check_attempts = 3
 162       check_interval = 5m
 163       retry_interval = 1m
 164       enable_perfdata = true
 165     }
 166
 167     object Service "ping4" {
 168       import "generic-service"
 169
 170       host_name = "localhost"
 171       check_command = "ping4"
 172     }
 173
 174     object Service "ping6" {
 175       import "generic-service"
 176
 177       host_name = "localhost"
 178       check_command = "ping6"
 179     }
 180
 181 In this example the `ping4` and `ping6` services inherit properties from the
 182 template `generic-service`.
 183
 184 Objects as well as templates themselves can import an arbitrary number of
 185 templates. Attributes inherited from a template can be overridden in the
 186 object if necessary.
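
The following sketch illustrates both statements: a template importing another template, and an object overriding an inherited attribute. It builds on the `generic-service` template above; the names `frequent-service` and `ssh` and the chosen values are assumptions for illustration only.

```
/* A template may itself import another template. */
template Service "frequent-service" {
  import "generic-service"

  check_interval = 1m      // overrides the 5m inherited from generic-service
}

object Service "ssh" {
  import "frequent-service"

  host_name = "localhost"
  check_command = "ssh"
  max_check_attempts = 5   // overrides the inherited value of 3 for this object only
}
```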
 187
 188 ### <a id="using-apply"></a> Apply objects based on rules
 189
Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`)
individually through attribute identifiers such as `host_name`, objects can be [applied](#apply) based on rules.
 192
 193 Detailed scenario examples are used in their respective chapters, for example
 194 [apply services with custom command arguments](#using-apply-services-command-arguments).
 195
 196 #### <a id="using-apply-services"></a> Apply Services to Hosts
 197
 198     apply Service "load" {
 199       import "generic-service"
 200
 201       check_command = "load"
 202
 203       assign where "linux-server" in host.groups
 204       ignore where host.vars.no_load_check
 205     }
 206
 207 In this example the `load` service will be created as object for all hosts in the `linux-server`
 208 host group. If the `no_load_check` custom attribute is set, the host will be
 209 ignored.
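
For illustration, the host side of this rule could look like the sketch below. The host names and addresses are assumptions; only the group name `linux-server` and the `no_load_check` attribute are taken from the apply rule above.

```
object HostGroup "linux-server" {
  display_name = "Linux Servers"
}

/* Member of the group: the "load" service is created for this host. */
object Host "my-server2" {
  address = "10.0.0.2"
  check_command = "hostalive"
  groups = [ "linux-server" ]
}

/* Also a member, but excluded by the "ignore where" expression. */
object Host "my-appliance1" {
  address = "10.0.0.3"
  check_command = "hostalive"
  groups = [ "linux-server" ]

  vars.no_load_check = true
}
```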
 210
 211 #### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
 212
 213 Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
 214 manner:
 215
 216     apply Notification "mail-noc" to Service {
 217       import "mail-service-notification"
 218       command = "mail-service-notification"
 219       user_groups = [ "noc" ]
 220
 221       assign where service.vars.sla == "24x7"
 222     }
 223
 224 In this example the `mail-noc` notification will be created as object for all services having the
 225 `sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification`
 226 and all members of the user group `noc` will get notified.
 227
 228 #### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
 229
 230 Detailed examples can be found in the [dependencies](#dependencies) chapter.
 231
#### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
 233
 234 Detailed examples can be found in the [recurring downtimes](#recurring-downtimes) chapter.
 235
 236
 237 ### <a id="groups"></a> Groups
 238
 239 Groups are used for combining hosts, services, and users into
 240 accessible configuration attributes and views in external (web)
 241 interfaces.
 242
Group membership is defined at the respective object itself. If
you have a hostgroup named `windows`, for example, and want to assign
specific hosts to this group so that you can later view the group on your
alert dashboard, first create the hostgroup:
 247
 248     object HostGroup "windows" {
 249       display_name = "Windows Servers"
 250     }
 251
Then add your hosts to this hostgroup:
 253
 254     template Host "windows-server" {
 255       groups += [ "windows" ]
 256     }
 257
 258     object Host "mssql-srv1" {
 259       import "windows-server"
 260
 261       vars.mssql_port = 1433
 262     }
 263
 264     object Host "mssql-srv2" {
 265       import "windows-server"
 266
 267       vars.mssql_port = 1433
 268     }
 269
 270 This can be done for service and user groups the same way. Additionally
 271 the user groups are associated as attributes in `Notification` objects.
 272
 273     object UserGroup "windows-mssql-admins" {
 274       display_name = "Windows MSSQL Admins"
 275     }
 276
 277     template User "generic-windows-mssql-users" {
 278       groups += [ "windows-mssql-admins" ]
 279     }
 280
 281     object User "win-mssql-noc" {
 282       import "generic-windows-mssql-users"
 283
 284       email = "noc@example.com"
 285     }
 286
 287     object User "win-mssql-ops" {
 288       import "generic-windows-mssql-users"
 289
 290       email = "ops@example.com"
 291     }
 292
 293 #### <a id="group-assign"></a> Group Membership Assign
 294
If a number of hosts, services, or users match a pattern,
it is reasonable to let the group object assign these members itself.
Details on the `assign where` syntax can be found [here](#apply).
 298
 299     object HostGroup "mssql" {
 300       display_name = "MSSQL Servers"
 301       assign where host.vars.mssql_port
 302     }
 303
Building on the example above, all hosts with the `vars` attribute `mssql_port`
set will be added as members to the host group `mssql`.
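
The same mechanism is available for service groups. As a small sketch (the group name and the name pattern are assumptions for illustration), the following would collect all services whose name starts with `ping`, such as `ping4` and `ping6`:

```
object ServiceGroup "ping" {
  display_name = "Ping Checks"

  // match() compares the service name against a wildcard pattern
  assign where match("ping*", service.name)
}
```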
 306
 307 ## <a id="notifications"></a> Notifications
 308
 309 Notifications for service and host problems are an integral part of your
 310 monitoring setup.
 311
 312 When a host or service is in a downtime, a problem has been acknowledged or
 313 the dependency logic determined that the host/service is unreachable, no
 314 notifications are sent. You can configure additional type and state filters
 315 refining the notifications being actually sent.
 316
 317 There are many ways of sending notifications, e.g. by e-mail, XMPP,
 318 IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
 319 Instead it relies on external mechanisms such as shell scripts to notify users.
 320
 321 A notification specification requires one or more users (and/or user groups)
 322 who will be notified in case of problems. These users must have all custom
 323 attributes defined which will be used in the `NotificationCommand` on execution.
 324
The user `icingaadmin` in the example below will get notified only on `OK`, `WARNING` and
`CRITICAL` states and for the `Problem` and `Recovery` notification types.
 327
 328     object User "icingaadmin" {
 329       display_name = "Icinga 2 Admin"
 330       enable_notifications = true
 331       states = [ OK, Warning, Critical ]
 332       types = [ Problem, Recovery ]
 333       email = "icinga@localhost"
 334     }
 335
 336 If you don't set the `states` and `types` configuration attributes for the `User`
 337 object, notifications for all states and types will be sent.
 338
 339 Details on troubleshooting notification problems can be found [here](#troubleshooting).
 340
 341 > **Note**
 342 >
 343 > Make sure that the [notification](#features) feature is enabled on your master instance
 344 > in order to execute notification commands.
 345
You should choose which information you (and your notified users) are interested in
in case of an emergency, and also which information does not provide any value to you and
your environment.
 349
 350 An example notification command is explained [here](#notification-commands).
 351
You can add all shared attributes to a `Notification` template which is then imported
by the defined notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.
 355
 356     template Notification "generic-notification" {
 357       interval = 15m
 358
 359       command = "mail-service-notification"
 360
 361       states = [ Warning, Critical, Unknown ]
 362       types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
 363                 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
 364
 365       period = "24x7"
 366     }
 367
 368 The time period `24x7` is shipped as example configuration with Icinga 2.
 369
 370 Use the `apply` keyword to create `Notification` objects for your services:
 371
 372     apply Notification "mail" to Service {
 373       import "generic-notification"
 374
 375       command = "mail-notification"
 376       users = [ "icingaadmin" ]
 377
 378       assign where service.name == "mysql"
 379     }
 380
 381 Instead of assigning users to notifications, you can also add the `user_groups`
 382 attribute with a list of user groups to the `Notification` object. Icinga 2 will
 383 send notifications to all group members.
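
A minimal sketch of the `user_groups` variant might look as follows. The group `dba`, the user `dba-oncall` and the e-mail address are assumptions for illustration; the notification command is inherited from the `generic-notification` template shown above.

```
object UserGroup "dba" {
  display_name = "Database Admins"
}

object User "dba-oncall" {
  groups = [ "dba" ]
  email = "dba@example.com"
}

apply Notification "mail-dba" to Service {
  import "generic-notification"

  // notify every member of the group instead of listing single users
  user_groups = [ "dba" ]

  assign where service.name == "mysql"
}
```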
 384
 385 ### <a id="notification-escalations"></a> Notification Escalations
 386
 387 When a problem notification is sent and a problem still exists at the time of re-notification
 388 you may want to escalate the problem to the next support level. A different approach
 389 is to configure the default notification by email, and escalate the problem via SMS
 390 if not already solved.
 391
 392 You can define notification start and end times as additional configuration
 393 attributes making the `Notification` object a so-called `notification escalation`.
 394 Using templates you can share the basic notification attributes such as users or the
 395 `interval` (and override them for the escalation then).
 396
 397 Using the example from above, you can define additional users being escalated for SMS
 398 notifications between start and end time.
 399
 400     object User "icinga-oncall-2nd-level" {
 401       display_name = "Icinga 2nd Level"
 402
 403       vars.mobile = "+1 555 424642"
 404     }
 405
 406     object User "icinga-oncall-1st-level" {
 407       display_name = "Icinga 1st Level"
 408
 409       vars.mobile = "+1 555 424642"
 410     }
 411
 412 Define an additional `NotificationCommand` for SMS notifications.
 413
 414 > **Note**
 415 >
 416 > The example is not complete as there are many different SMS providers.
 417 > Please note that sending SMS notifications will require an SMS provider
 418 > or local hardware with a SIM card active.
 419
    object NotificationCommand "sms-notification" {
      import "plugin-notification-command"

      command = [
        PluginDir + "/send_sms_notification",
        "$mobile$",
        "..."
      ]
    }
 426
The two new notification escalations are applied to every service named `ping4`
(for example the one on the host `localhost`) using the `generic-notification` template.
The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
command) after `30m` until `1h`.
 431
 432 > **Note**
 433 >
> The `interval` was set to 15m in the `generic-notification`
> template example. Lower that value in your escalations by using a secondary
> template or by overriding the attribute directly in the
> `escalation-sms-2nd-level` notification.
 438
If the problem is neither resolved nor acknowledged (either of which would prevent further
notifications), the `icinga-oncall-1st-level` user will be notified through `escalation-sms-1st-level`
`1h` after the initial problem notification, but only for one hour (`2h` as the `end` key in the `times` dictionary).
 442
 443     apply Notification "mail" to Service {
 444       import "generic-notification"
 445
 446       command = "mail-notification"
 447       users = [ "icingaadmin" ]
 448
 449       assign where service.name == "ping4"
 450     }
 451
 452     apply Notification "escalation-sms-2nd-level" to Service {
 453       import "generic-notification"
 454
 455       command = "sms-notification"
 456       users = [ "icinga-oncall-2nd-level" ]
 457
 458       times = {
 459         begin = 30m
 460         end = 1h
 461       }
 462
 463       assign where service.name == "ping4"
 464     }
 465
 466     apply Notification "escalation-sms-1st-level" to Service {
 467       import "generic-notification"
 468
 469       command = "sms-notification"
 470       users = [ "icinga-oncall-1st-level" ]
 471
 472       times = {
 473         begin = 1h
 474         end = 2h
 475       }
 476
 477       assign where service.name == "ping4"
 478     }
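
Because the escalation window of `escalation-sms-2nd-level` is only 30 minutes long, the inherited `interval` of 15m leaves little room for re-notifications. The following variant is a sketch of overriding the attribute directly in the notification; it is meant to replace the definition above rather than be added next to it, and the value `5m` is an assumption for illustration.

```
apply Notification "escalation-sms-2nd-level" to Service {
  import "generic-notification"

  command = "sms-notification"
  users = [ "icinga-oncall-2nd-level" ]

  interval = 5m   // overrides the 15m inherited from generic-notification

  times = {
    begin = 30m
    end = 1h
  }

  assign where service.name == "ping4"
}
```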
 479
 480 ### <a id="notification-delay"></a> Notification Delay
 481
 482 Sometimes the problem in question should not be notified when the notification is due
 483 (the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
 484 you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
 485 postpone the first notification for 15 minutes. Leave out the `end` key - if not set,
 486 Icinga 2 will not check against any end time for this notification.
 487
 488     apply Notification "mail" to Service {
 489       import "generic-notification"
 490
 491       command = "mail-notification"
 492       users = [ "icingaadmin" ]
 493
 494       times.begin = 15m // delay first notification
 495
 496       assign where service.name == "ping4"
 497     }
 498
 499 ### <a id="notification-filters-state-type"></a> Notification Filters by State and Type
 500
If no notification state and type filter attributes are defined on the `Notification`
or `User` object, Icinga 2 assumes that all states and types are being notified.
 503
 504 Available state and type filters for notifications are:
 505
 506     template Notification "generic-notification" {
 507
 508       states = [ Warning, Critical, Unknown ]
 509       types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
 510                 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
 511     }
 512
If you are familiar with Icinga 1.x `notification_options`, please note that they have been split
into type and state filters to allow finer-grained filtering, for example on downtimes and flapping.
You can filter for acknowledgements and custom notifications too.
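
As a sketch of narrowing these filters, the notification below would only pass critical problems and their recoveries. The notification name is an assumption; the `sla` custom attribute and the `icingaadmin` user are reused from the earlier examples. `OK` is kept in the state filter on the assumption that recovery notifications would otherwise be filtered out by state.

```
apply Notification "mail-critical-only" to Service {
  import "generic-notification"

  users = [ "icingaadmin" ]

  // local filters override the broader ones from the template
  states = [ OK, Critical ]
  types = [ Problem, Recovery ]

  assign where service.vars.sla == "24x7"
}
```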
 516
 517
 518 ## <a id="timeperiods"></a> Time Periods
 519
Time periods define time ranges in Icinga during which event actions are
triggered, for example whether a service check is executed or not within
the `check_period` attribute, or whether a notification should be sent to
users or not, filtered by the `period` and `notification_period`
configuration attributes for `Notification` and `User` objects.
 525
 526 > **Note**
 527 >
> If you are familiar with Icinga 1.x: these time period definitions
> are called `legacy timeperiods` in Icinga 2.
>
> An Icinga 2 legacy timeperiod requires the `ITL` provided template
> `legacy-timeperiod`.
 533
 534 The `TimePeriod` attribute `ranges` may contain multiple directives,
 535 including weekdays, days of the month, and calendar dates.
 536 These types may overlap/override other types in your ranges dictionary.
 537
 538 The descending order of precedence is as follows:
 539
 540 * Calendar date (2008-01-01)
 541 * Specific month date (January 1st)
 542 * Generic month date (Day 15)
 543 * Offset weekday of specific month (2nd Tuesday in December)
 544 * Offset weekday (3rd Monday)
 545 * Normal weekday (Tuesday)
 546
 547 If you don't set any `check_period` or `notification_period` attribute
 548 on your configuration objects Icinga 2 assumes `24x7` as time period
 549 as shown below.
 550
 551     object TimePeriod "24x7" {
 552       import "legacy-timeperiod"
 553
 554       display_name = "Icinga 2 24x7 TimePeriod"
 555       ranges = {
 556         "monday"    = "00:00-24:00"
 557         "tuesday"   = "00:00-24:00"
 558         "wednesday" = "00:00-24:00"
 559         "thursday"  = "00:00-24:00"
 560         "friday"    = "00:00-24:00"
 561         "saturday"  = "00:00-24:00"
 562         "sunday"    = "00:00-24:00"
 563       }
 564     }
 565
 566 If your operation staff should only be notified during workhours
 567 create a new timeperiod named `workhours` defining a work day from
 568 09:00 to 17:00.
 569
 570     object TimePeriod "workhours" {
 571       import "legacy-timeperiod"
 572
 573       display_name = "Icinga 2 8x5 TimePeriod"
 574       ranges = {
 575         "monday"    = "09:00-17:00"
 576         "tuesday"   = "09:00-17:00"
 577         "wednesday" = "09:00-17:00"
 578         "thursday"  = "09:00-17:00"
 579         "friday"    = "09:00-17:00"
 580       }
 581     }
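
The `ranges` dictionary can also mix different directive types. The sketch below extends the work hours with a single calendar date; the period name and the date (a Saturday, so it does not overlap with any weekday entry) are assumptions chosen for illustration.

```
object TimePeriod "workhours-plus-maintenance" {
  import "legacy-timeperiod"

  display_name = "Work hours plus one maintenance Saturday"
  ranges = {
    "monday"     = "09:00-17:00"
    "tuesday"    = "09:00-17:00"
    "wednesday"  = "09:00-17:00"
    "thursday"   = "09:00-17:00"
    "friday"     = "09:00-17:00"
    "2014-12-27" = "10:00-14:00"   // calendar date directive
  }
}
```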
 582
 583 Use the `period` attribute to assign time periods to
 584 `Notification` and `Dependency` objects:
 585
 586     object Notification "mail" {
 587       import "generic-notification"
 588
 589       host_name = "localhost"
 590
 591       command = "mail-notification"
 592       users = [ "icingaadmin" ]
 593       period = "workhours"
 594     }
 595
 596
 597 ## <a id="commands"></a> Commands
 598
 599 Icinga 2 uses three different command object types to specify how
 600 checks should be performed, notifications should be sent, and
 601 events should be handled.
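
As a rough orientation, the three object types share the same basic shape and differ mainly in the ITL template they import. The skeletons below are a sketch only: the object names, the script paths under `/usr/local/bin` and the `check_ping` thresholds are assumptions for illustration.

```
/* Performs a check and returns a state. */
object CheckCommand "my-ping" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_ping", "-H", "$address$",
              "-w", "100.0,20%", "-c", "500.0,60%" ]
}

/* Delivers a notification to a user (hypothetical script). */
object NotificationCommand "my-mail" {
  import "plugin-notification-command"

  command = [ "/usr/local/bin/my-mail-notification.sh" ]
}

/* Reacts to an event such as a state change (hypothetical script). */
object EventCommand "my-restart-httpd" {
  import "plugin-event-command"

  command = [ "/usr/local/bin/my-restart-httpd.sh" ]
}
```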
 602
 603 ### <a id="command-environment-variables"></a> Environment Variables for Commands
 604
 605 Please check [Runtime Custom Attributes as Environment Variables](#runtime-custom-attribute-env-vars).
 606
 607
 608 ### <a id="check-commands"></a> Check Commands
 609
`CheckCommand` objects define the command line that is used to call a check.
 611
 612 > **Note**
 613 >
 614 > Make sure that the [checker](#features) feature is enabled in order to
 615 > execute checks.
 616
 617 #### <a id="command-plugin-integration"></a> Integrate the Plugin with a CheckCommand Definition
 618
 619 `CheckCommand` objects require the [ITL template](#itl-plugin-check-command)
 620 `plugin-check-command` to support native plugin based check methods.
 621
 622 Unless you have done so already, download your check plugin and put it
 623 into the `PluginDir` directory. The following example uses the
 624 `check_disk` plugin shipped with the Monitoring Plugins package.
 625
The plugin path and all command arguments are passed as a list of
double-quoted strings to ensure proper shell escaping.
 628
 629 Call the `check_disk` plugin with the `--help` parameter to see
 630 all available options. Our example defines warning (`-w`) and
 631 critical (`-c`) thresholds for the disk usage. Without any
 632 partition defined (`-p`) it will check all local partitions.
 633
 634     icinga@icinga2 $ /usr/lib/nagios/plugins/check_disk --help
 635     ...
 636     This plugin checks the amount of used disk space on a mounted file system
 637     and generates an alert if free space is less than one of the threshold values
 638
 639
 640     Usage:
 641      check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
 642     [-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
 643     [-t timeout] [-u unit] [-v] [-X type] [-N type]
 644     ...
 645
 646 > **Note**
 647 >
 648 > Don't execute plugins as `root` and always use the absolute path to the plugin! Trust us.
 649
The next step is to understand how command parameters are passed from
a host or service object, and to add a `CheckCommand` definition based on these
required parameters and/or default values.
 653
 654 #### <a id="command-passing-parameters"></a> Passing Check Command Parameters from Host or Service
 655
Unlike in Icinga 1.x, check command parameters are defined as custom attributes
which can be accessed as runtime macros by the executed check command.

Define the default check command custom attributes `disk_wfree` and `disk_cfree`
(the naming schema is freely definable) and their default threshold values. You can
then use these custom attributes as runtime macros for [command arguments](#command-arguments)
on the command line.

The default custom attributes can be overridden by the custom attributes
defined in the service using the check command `my-disk`. The custom attributes
can also be inherited from a parent template using additive inheritance (`+=`).
 667
 668
 669     object CheckCommand "my-disk" {
 670       import "plugin-check-command"
 671
 672       command = [ PluginDir + "/check_disk" ]
 673
 674       arguments = {
 675         "-w" = "$disk_wfree$%"
 676         "-c" = "$disk_cfree$%"
 677       }
 678
 679       vars.disk_wfree = 20
 680       vars.disk_cfree = 10
 681     }
 682
 683
 684 The host `localhost` with the service `my-disk` checks all disks with modified
 685 custom attributes (warning thresholds at `10%`, critical thresholds at `5%`
 686 free disk space).
 687
 688     object Host "localhost" {
 689       import "generic-host"
 690
 691       address = "127.0.0.1"
 692       address6 = "::1"
 693     }
 694
 695     object Service "my-disk" {
 696       import "generic-service"
 697
 698       host_name = "localhost"
 699       check_command = "my-disk"
 700
 701       vars.disk_wfree = 10
 702       vars.disk_cfree = 5
 703     }
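
Overriding is per attribute, so a service does not have to redefine every threshold. In the following sketch (the service name `my-disk-relaxed` is an assumption), only the critical threshold is changed, while the warning threshold is expected to fall back to the default of `20` from the `my-disk` command definition.

```
object Service "my-disk-relaxed" {
  import "generic-service"

  host_name = "localhost"
  check_command = "my-disk"

  // only the critical threshold is overridden here;
  // disk_wfree is not set and is resolved from the CheckCommand default
  vars.disk_cfree = 5
}
```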
 704
 705 #### <a id="command-arguments"></a> Command Arguments
 706
When you define a check command line using the `command` attribute, Icinga 2
resolves all macros in the static string or array. Sometimes it is
necessary to extend the argument list based on a condition evaluated
at command execution, or to make arguments optional so that they are only set if the
macro value can be resolved by Icinga 2.
 712
 713     object CheckCommand "check_http" {
 714       import "plugin-check-command"
 715
 716       command = [ PluginDir + "/check_http" ]
 717
 718       arguments = {
 719         "-H" = "$http_vhost$"
 720         "-I" = "$http_address$"
 721         "-u" = "$http_uri$"
 722         "-p" = "$http_port$"
 723         "-S" = {
 724           set_if = "$http_ssl$"
 725         }
 726         "--sni" = {
 727           set_if = "$http_sni$"
 728         }
 729         "-a" = {
 730           value = "$http_auth_pair$"
 731           description = "Username:password on sites with basic authentication"
 732         }
 733         "--no-body" = {
 734           set_if = "$http_ignore_body$"
 735         }
 736         "-r" = "$http_expect_body_regex$"
 737         "-w" = "$http_warn_time$"
 738         "-c" = "$http_critical_time$"
 739         "-e" = "$http_expect$"
 740       }
 741
 742       vars.http_address = "$address$"
 743       vars.http_ssl = false
 744       vars.http_sni = false
 745     }
 746
The example shows the `check_http` check command defining the most common
arguments. Each of them is optional by default and will be omitted if
the value is not set. For example, if the service calling the check command
does not have `vars.http_port` set, `-p` won't get added to the command
line.

If the `vars.http_ssl` custom attribute is set to `true` in the service, host or command
object definition, Icinga 2 will add the `-S` argument based on the `set_if`
option to the command line.
That way you can use the same `check_http` command definition for checks both with and
without SSL, which saves you from duplicating command definitions.
 757
 758 Details on all available options can be found in the
 759 [CheckCommand object definition](#objecttype-checkcommand).
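
To show how a service would drive these optional arguments, here is a minimal sketch using the `check_http` command defined above. The service name, virtual host and URI are assumptions for illustration.

```
object Service "https-shop" {
  import "generic-service"

  host_name = "localhost"
  check_command = "check_http"

  vars.http_vhost = "shop.example.com"   // adds -H with this value
  vars.http_uri = "/health"              // adds -u with this value
  vars.http_ssl = true                   // set_if evaluates to true, so -S is added
  // http_port is not set, so -p is omitted from the command line
}
```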
 760
 761 ### <a id="using-apply-services-command-arguments"></a> Apply Services with custom Command Arguments
 762
 763 Imagine the following scenario: The `my-host1` host is reachable using the default port 22, while
 764 the `my-host2` host requires a different port on 2222. Both hosts are in the hostgroup `my-linux-servers`.
 765
 766     object HostGroup "my-linux-servers" {
 767       display_name = "Linux Servers"
 768       assign where host.vars.os == "Linux"
 769     }
 770
 771     /* this one has port 22 opened */
 772     object Host "my-host1" {
 773       import "generic-host"
 774       address = "129.168.1.50"
 775       vars.os = "Linux"
 776     }
 777
 778     /* this one listens on a different ssh port */
 779     object Host "my-host2" {
 780       import "generic-host"
 781       address = "129.168.2.50"
 782       vars.os = "Linux"
 783       vars.custom_ssh_port = 2222
 784     }
 785
 786 All hosts in the `my-linux-servers` hostgroup should get the `my-ssh` service applied based on an
 787 [apply rule](#apply). The optional `ssh_port` command argument should be inherited from the host
 788 the service is applied to. If not set, the check command `my-ssh` will omit the argument.
 789
 790     object CheckCommand "my-ssh" {
 791       import "plugin-check-command"
 792
 793       command = [ PluginDir + "/check_ssh" ]
 794
 795       arguments = {
 796         "-p" = "$ssh_port$"
 797         "host" = {
 798           value = "$ssh_address$"
 799           skip_key = true
 800           order = -1
 801         }
 802       }
 803
 804       vars.ssh_address = "$address$"
 805     }
 806
 807     /* apply ssh service */
 808     apply Service "my-ssh" {
 809       import "generic-service"
 810       check_command = "my-ssh"
 811
 812       //set the command argument for ssh port with a custom host attribute, if set
 813       vars.ssh_port = "$host.vars.custom_ssh_port$"
 814
 815       assign where "my-linux-servers" in host.groups
 816     }
 817
 818 The `my-host1` host gets the `my-ssh` service checking the default port:
 819
 820     [2014-05-26 21:52:23 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '129.168.1.50': PID 27281
 821
 822 For `my-host2`, the service inherits the `custom_ssh_port` variable from the host and executes a different command:
 823
 824     [2014-05-26 21:51:32 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ssh', '-p', '2222', '129.168.2.50': PID 26956
 825
 826
 827 ### <a id="notification-commands"></a> Notification Commands
 828
 829 `NotificationCommand` objects define how notifications are delivered to external
 830 interfaces (e-mail, XMPP, IRC, Twitter, etc.).
 831
 832 `NotificationCommand` objects require the [ITL template](#itl-plugin-notification-command)
 833 `plugin-notification-command` to support native plugin-based notifications.
 834
 835 > **Note**
 836 >
 837 > Make sure that the [notification](#features) feature is enabled on your master instance
 838 > in order to execute notification commands.
 839
 840 Below is an example using runtime macros from Icinga 2 (such as `$service.output$` for
 841 the current check output) sending an email to the user(s) associated with the
 842 notification itself (`$user.email$`).
 843
 844 If you want to specify default values for some of the custom attribute definitions,
 845 you can add a `vars` dictionary as shown for the `CheckCommand` object.
 846
 847     object NotificationCommand "mail-service-notification" {
 848       import "plugin-notification-command"
 849
 850       command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]
 851
 852       env = {
 853         NOTIFICATIONTYPE = "$notification.type$"
 854         SERVICEDESC = "$service.name$"
 855         HOSTALIAS = "$host.display_name$"
 856         HOSTADDRESS = "$address$"
 857         SERVICESTATE = "$service.state$"
 858         LONGDATETIME = "$icinga.long_date_time$"
 859         SERVICEOUTPUT = "$service.output$"
 860         NOTIFICATIONAUTHORNAME = "$notification.author$"
 861         NOTIFICATIONCOMMENT = "$notification.comment$"
 862         HOSTDISPLAYNAME = "$host.display_name$"
 863         SERVICEDISPLAYNAME = "$service.display_name$"
 864         USEREMAIL = "$user.email$"
 865       }
 866     }
 867
 868 The `command` attribute in the `mail-service-notification` command refers to the following
 869 shell script. The macros specified in the `env` dictionary are exported
 870 as environment variables and can be used in the notification script:
 871
 872     #!/usr/bin/env bash
 873     template=$(cat <<TEMPLATE
 874     ***** Icinga  *****
 875
 876     Notification Type: $NOTIFICATIONTYPE
 877
 878     Service: $SERVICEDESC
 879     Host: $HOSTALIAS
 880     Address: $HOSTADDRESS
 881     State: $SERVICESTATE
 882
 883     Date/Time: $LONGDATETIME
 884
 885     Additional Info: $SERVICEOUTPUT
 886
 887     Comment: [$NOTIFICATIONAUTHORNAME] $NOTIFICATIONCOMMENT
 888     TEMPLATE
 889     )
 890
 891     /usr/bin/printf "%b" "$template" | mail -s "$NOTIFICATIONTYPE - $HOSTDISPLAYNAME - $SERVICEDISPLAYNAME is $SERVICESTATE" "$USEREMAIL"
 892
 893 > **Note**
 894 >
 895 > This example is for `exim` only. Requires changes for `sendmail` and
 896 > other MTAs.
 897
 898 While it's possible to specify the entire notification command right
 899 in the `NotificationCommand` object, it is generally advisable to create a
 900 shell script in the `/etc/icinga2/scripts` directory and have the
 901 `NotificationCommand` object refer to that.
 902
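To put the command to use, it has to be referenced from a `Notification` object. The following is a minimal sketch; the user `icingaadmin`, its e-mail address and the `assign` condition are assumptions chosen for illustration.

```
object User "icingaadmin" {
  display_name = "Icinga 2 Admin"
  email = "icinga@localhost"   // resolved by $user.email$ in the command
}

apply Notification "mail-icingaadmin" to Service {
  command = "mail-service-notification"
  users = [ "icingaadmin" ]

  // Hypothetical filter: only notify for services named "http".
  assign where service.name == "http"
}
```
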
 903 ### <a id="event-commands"></a> Event Commands
 904
 905 Unlike notifications, event commands are called on every host/service check
 906 execution if one of these conditions matches:
 907
 908 * The host/service is in a [soft state](#hard-soft-states)
 909 * The host/service state changes into a [hard state](#hard-soft-states)
 910 * The host/service state recovers from a [soft or hard state](#hard-soft-states) to [OK](#service-states)/[Up](#host-states)
 911
 912 Therefore the `EventCommand` object should define a command line
 913 evaluating the current service state and other service runtime attributes
 914 available through runtime macros. Runtime macros such as `$service.state_type$`
 915 and `$service.state$` are resolved by Icinga 2 and allow fine-grained control
 916 over which events actually trigger an action.
 917
 918 Common use cases are a failing HTTP check that requires an immediate
 919 restart via event command, or an application that is locked and requires
 920 a restart upon detection.
 921
 922 `EventCommand` objects require the ITL template `plugin-event-command`
 923 to support native plugin-based event commands.
 924
 925 In the following example, when the event command is triggered on a service
 926 state change, it sends a check result using the `process_check_result` script,
 927 forcibly changing the service state back to `OK` (`-r 0`) and providing some
 928 debug information in the check output (`-o`).
 929
 930     object EventCommand "plugin-event-process-check-result" {
 931       import "plugin-event-command"
 932
 933       command = [
 934         PluginDir + "/process_check_result",
 935         "-H", "$host.name$",
 936         "-S", "$service.name$",
 937         "-c", RunDir + "/icinga2/cmd/icinga2.cmd",
 938         "-r", "0",
 939         "-o", "Event Handler triggered in state '$service.state$' with output '$service.output$'."
 940       ]
 941     }
 942
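As an alternative sketch, an event command can hand the state macros to a script and let the script decide what to do. The script path `restart-httpd.sh` and the service below are hypothetical; only the macros and the `event_command` attribute come from Icinga 2 itself.

```
object EventCommand "restart-httpd" {
  import "plugin-event-command"

  command = [
    SysconfDir + "/icinga2/scripts/restart-httpd.sh",  // hypothetical script
    "$service.state$",         // OK, WARNING, CRITICAL or UNKNOWN
    "$service.state_type$",    // SOFT or HARD
    "$service.check_attempt$"  // current check attempt number
  ]
}

object Service "http" {
  host_name = "my-server1"
  check_command = "http"

  // Attach the event command to this service.
  event_command = "restart-httpd"
}
```

The script would then typically act only on selected combinations, for example a `CRITICAL` state that has become `HARD`.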
 943
 944 ## <a id="dependencies"></a> Dependencies
 945
 946 Icinga 2 uses host and service [Dependency](#objecttype-dependency) objects
 947 for determining their network reachability.
 948 The `parent_host_name` and `parent_service_name` attributes are mandatory for
 949 service dependencies; only `parent_host_name` is required for host dependencies.
 950
 951 A service can depend on a host, and vice versa. A service has an implicit
 952 dependency (parent) on its host. A host-to-host dependency acts implicitly
 953 as a host parent relation.
 954 When dependencies are calculated, not only the immediate parent is taken into
 955 account; all ancestors are considered as well.
 956
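A minimal sketch of an explicit service dependency follows. The hosts `db-server` and `web-server` and their services `mysql` and `http` are hypothetical, and the `child_*` attribute names are assumed to mirror the `parent_*` ones.

```
object Dependency "web-needs-db" {
  // The parent: what must be working.
  parent_host_name = "db-server"
  parent_service_name = "mysql"

  // The child: what depends on the parent.
  child_host_name = "web-server"
  child_service_name = "http"

  // The child counts as reachable only while the parent is in one
  // of these states; otherwise its notifications are suppressed.
  states = [ OK ]
  disable_notifications = true
}
```
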
 957 Notifications are suppressed if a host or service becomes unreachable.
 958
 959 ### <a id="dependencies-implicit-host-service"></a> Implicit Dependencies for Services on Host
 960
 961 Icinga 2 automatically adds an implicit dependency for services on their host. That way
 962 service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
 963 does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
 964 `states = [ Up ]` for all service objects.
 965
 966 Service checks are still executed. If you want to prevent them from happening, you can
 967 apply the following dependency to all services setting their host as `parent_host_name`
 968 and disabling the checks. `assign where true` matches on all `Service` objects.
 969
 970     apply Dependency "disable-host-service-checks" to Service {
 971       disable_checks = true
 972       assign where true
 973     }
 974
 975 ### <a id="dependencies-network-reachability"></a> Dependencies for Network Reachability
 976
 977 A common scenario is the Icinga 2 server behind a router. Checking internet
 978 access by pinging the Google DNS server `google-dns` is a common method, but
 979 will fail in case the `dsl-router` host is down. Therefore the example below
 980 defines a host dependency which acts implicitly as parent relation too.
 981
 982 Furthermore the host may be reachable but ping probes are dropped by the
 983 router's firewall. In case the `ping4` service check on `dsl-router` fails, all
 984 further checks for the `ping4` service on host `google-dns` should
 985 be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.
 986
 987     object Host "dsl-router" {
 988       address = "192.168.1.1"
 989     }
 990
 991     object Host "google-dns" {
 992       address = "8.8.8.8"
 993     }
 994
 995     apply Service "ping4" {
 996       import "generic-service"
 997
 998       check_command = "ping4"
 999
1000       assign where host.address
1001     }
1002
1003     apply Dependency "internet" to Host {
1004       parent_host_name = "dsl-router"
1005       disable_checks = true
1006       disable_notifications = true
1007
1008       assign where host.name != "dsl-router"
1009     }
1010
1011     apply Dependency "internet" to Service {
1012       parent_host_name = "dsl-router"
1013       parent_service_name = "ping4"
1014       disable_checks = true
1015
1016       assign where host.name != "dsl-router"
1017     }
1018
1019
1020 ### <a id="dependencies-agent-checks"></a> Dependencies for Agent Checks
1021
1022 Another classic example is agent-based checks. You would define a health check
1023 for the agent daemon responding to your requests, and make all other services
1024 querying that daemon depend on that health check.
1025
1026 The following configuration defines two NRPE-based service checks `nrpe-load`
1027 and `nrpe-disk` applied to the `nrpe-server` host. The health check is defined as
1028 the `nrpe-health` service.
1029
1030     apply Service "nrpe-health" {
1031       import "generic-service"
1032       check_command = "nrpe"
1033       assign where match("nrpe-*", host.name)
1034     }
1035
1036     apply Service "nrpe-load" {
1037       import "generic-service"
1038       check_command = "nrpe"
1039       vars.nrpe_command = "check_load"
1040       assign where match("nrpe-*", host.name)
1041     }
1042
1043     apply Service "nrpe-disk" {
1044       import "generic-service"
1045       check_command = "nrpe"
1046       vars.nrpe_command = "check_disk"
1047       assign where match("nrpe-*", host.name)
1048     }
1049
1050     object Host "nrpe-server" {
1051       import "generic-host"
1052       address = "192.168.1.5"
1053     }
1054
1055     apply Dependency "disable-nrpe-checks" to Service {
1056       parent_service_name = "nrpe-health"
1057
1058       states = [ OK ]
1059       disable_checks = true
1060       disable_notifications = true
1061       assign where service.check_command == "nrpe"
1062       ignore where service.name == "nrpe-health"
1063     }
1064
1065 The `disable-nrpe-checks` dependency is applied to all services
1066 on the `nrpe-server` host that use the `nrpe` check command,
1067 but not to the `nrpe-health` service itself.
1068
1069
1070 ## <a id="downtimes"></a> Downtimes
1071
1072 Downtimes can be scheduled for planned server maintenance or
1073 any other targeted service outage you are aware of in advance.
1074
1075 Downtimes will suppress any notifications, and may trigger other
1076 downtimes too. If the downtime was set by accident, or the duration
1077 exceeds the maintenance, you can manually cancel the downtime.
1078 Planned downtimes will also be taken into account for SLA reporting
1079 tools calculating the SLAs based on the state and downtime history.
1080
1081 Multiple downtimes for a single object may overlap. This is useful
1082 when maintenance takes longer than expected and you want to extend the window.
1083 If there are multiple downtimes triggered for one object, the overall downtime depth
1084 will be greater than `1`.
1085
1086
1087 If the downtime was scheduled after the problem changed to a critical hard
1088 state triggering a problem notification, and the service recovers during
1089 the downtime window, the recovery notification won't be suppressed.
1090
1091 ### <a id="fixed-flexible-downtimes"></a> Fixed and Flexible Downtimes
1092
1093 A `fixed` downtime will be activated at the defined start time, and
1094 removed at the end time. During this time window the service state
1095 will change to `NOT-OK` and then actually trigger the downtime.
1096 Notifications are suppressed and the downtime depth is incremented.
1097
1098 Common scenarios are a planned distribution upgrade on your Linux
1099 servers, or database updates in your warehouse. The customer knows
1100 about a fixed downtime window between 23:00 and 24:00. After 24:00
1101 all problems should be alerted again. The solution is simple:
1102 schedule a `fixed` downtime starting at 23:00 and ending at 24:00.
1103
1104 Unlike a `fixed` downtime, a `flexible` downtime will be triggered
1105 by the state change in the time span defined by start and end time,
1106 and then last for the specified duration in minutes.
1107
1108 Imagine the following scenario: Your service is frequently polled
1109 by users trying to grab free deleted domains for immediate registration.
1110 Between 07:30 and 08:00 the impact will hit for 15 minutes and generate
1111 a network outage visible to the monitoring. The service is still alive,
1112 but answering Icinga 2 service checks too slowly.
1113 For that reason, you may want to schedule a downtime between 07:30 and
1114 08:00 with a duration of 15 minutes. The downtime will then last from
1115 its trigger time until the duration is over. After that, the downtime
1116 is removed (may happen before or after the actual end time!).
1117
1118 ### <a id="scheduling-downtime"></a> Scheduling a downtime
1119
1120 A downtime can be scheduled either through a web interface or by sending an [external command](#external-commands)
1121 to the external command pipe provided by the `ExternalCommandListener` configuration.
1122
1123 Fixed downtimes require a start and end time (a duration will be ignored).
1124 Flexible downtimes need a start and end time for the time span, and a duration
1125 independent from that time span.
1126
1127 ### <a id="triggered-downtimes"></a> Triggered Downtimes
1128
1129 This is optional when scheduling a downtime. If there is already a downtime
1130 scheduled for a future maintenance, the current downtime can be triggered by
1131 that downtime. This is useful if you have scheduled a host downtime and
1132 are now scheduling a child host's downtime which is triggered by the parent
1133 downtime on a NOT-OK state change.
1134
1135 ### <a id="recurring-downtimes"></a> Recurring Downtimes
1136
1137 [ScheduledDowntime objects](#objecttype-scheduleddowntime) can be used to set up
1138 recurring downtimes for services.
1139
1140 Example:
1141
1142     apply ScheduledDowntime "backup-downtime" to Service {
1143       author = "icingaadmin"
1144       comment = "Scheduled downtime for backup"
1145
1146       ranges = {
1147         monday = "02:00-03:00"
1148         tuesday = "02:00-03:00"
1149         wednesday = "02:00-03:00"
1150         thursday = "02:00-03:00"
1151         friday = "02:00-03:00"
1152         saturday = "02:00-03:00"
1153         sunday = "02:00-03:00"
1154       }
1155
1156       assign where "backup" in service.groups
1157     }
1158
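The flexible scenario described earlier can be sketched as a recurring downtime as well. This assumes that the `ScheduledDowntime` object type accepts the `fixed` and `duration` attributes; the service name `whois` and the chosen weekday are placeholders.

```
apply ScheduledDowntime "registration-rush" to Service {
  author = "icingaadmin"
  comment = "Expected slowdown during the domain registration rush"

  // Flexible: triggered by a state change inside the range,
  // then lasts for the given duration.
  fixed = false
  duration = 15m

  ranges = {
    monday = "07:30-08:00"
  }

  assign where service.name == "whois"
}
```
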
1159
1160 ## <a id="comments"></a> Comments
1161
1162 Comments can be added at runtime and are persistent over restarts. You can
1163 add useful information for others on repeating incidents (for example
1164 "last time syslog at 100% cpu on 17.10.2013 due to stale nfs mount") which
1165 is primarily accessible using web interfaces.
1166
1167 Comments can be added and deleted through the external command pipe
1168 provided with the `ExternalCommandListener` configuration. The caller must
1169 pass the comment id when manipulating an existing comment.
1170
1171
1172 ## <a id="acknowledgements"></a> Acknowledgements
1173
1174 If a problem is alerted and notified, you may signal the other notification
1175 recipients that you are aware of the problem and will handle it.
1176
1177 By sending an acknowledgement to Icinga 2 (using the external command pipe
1178 provided with `ExternalCommandListener` configuration) all future notifications
1179 are suppressed, a new comment is added with the provided description and
1180 a notification with the type `NotificationFilterAcknowledgement` is sent
1181 to all notified users.
1182
1183 ### <a id="expiring-acknowledgements"></a> Expiring Acknowledgements
1184
1185 Once a problem is acknowledged it may disappear from your `handled problems`
1186 dashboard and no one ever looks at it again, since the acknowledgement
1187 suppresses notifications too.
1188
1189 This `fire-and-forget` action is quite common. If you're sure that a
1190 current problem should be resolved in the future at a defined time,
1191 you can define an expiration time when acknowledging the problem.
1192
1193 Icinga 2 will clear the acknowledgement when expired and start to
1194 re-notify if the problem persists.
1195
1196
1197
1198 ## <a id="custom-attributes"></a> Custom Attributes
1199
1200 ### <a id="runtime-custom-attributes"></a> Using Custom Attributes at Runtime
1201
1202 Custom attributes may be used in command definitions to dynamically change how the command
1203 is executed.
1204
1205 Additionally there are Icinga 2 features such as the `PerfDataWriter` type
1206 which use custom attributes to format their output.
1207
1208 > **Tip**
1209 >
1210 > Custom attributes are identified by the 'vars' dictionary attribute as short name.
1211 > Accessing the different attribute keys is possible using the '.' accessor.
1212
1213 Custom attributes in command definitions or performance data templates are evaluated at
1214 runtime when executing a command. These custom attributes cannot be used elsewhere
1215 (e.g. in other configuration attributes).
1216
1217 Custom attribute values must be either a string, a number or a boolean value. Arrays
1218 and dictionaries cannot be used.
1219
1220 Here is an example of a command definition which uses user-defined custom attributes:
1221
1222     object CheckCommand "my-ping" {
1223       import "plugin-check-command"
1224
1225       command = [
1226         PluginDir + "/check_ping", "-4"
1227       ]
1228
1229       arguments = {
1230         "-H" = "$ping_address$"
1231         "-w" = "$ping_wrta$,$ping_wpl$%"
1232         "-c" = "$ping_crta$,$ping_cpl$%"
1233         "-p" = "$ping_packets$"
1234         "-t" = "$ping_timeout$"
1235       }
1236
1237       vars.ping_address = "$address$"
1238       vars.ping_wrta = 100
1239       vars.ping_wpl = 5
1240       vars.ping_crta = 200
1241       vars.ping_cpl = 15
1242       vars.ping_packets = 5
1243       vars.ping_timeout = 0
1244     }
1245
1246 Custom attribute names used at runtime must be enclosed in two `$` signs, e.g.
1247 `$address$`. When using the `$` sign as a single character, you need to escape
1248 it with an additional dollar sign (`$$`). This example also makes use of the
1249 [command arguments](#command-arguments) passed to the command line. `-4` takes
1250 no value and is therefore added as an additional element of the `command` array.
1251
1252 ### <a id="runtime-custom-attributes-evaluation-order"></a> Runtime Custom Attributes Evaluation Order
1253
1254 When executing commands Icinga 2 checks the following objects in this order to look
1255 up custom attributes and their respective values:
1256
1257 1. User object (only for notifications)
1258 2. Service object
1259 3. Host object
1260 4. Command object
1261 5. Global custom attributes in the `vars` constant
1262
1263 This evaluation order allows you to define default values for custom attributes
1264 in your command objects. The `my-ping` command shown above uses this to set
1265 default values for some of the latency thresholds and timeouts.
1266
1267 When using the `my-ping` command you can override some or all of the custom
1268 attributes in the service definition like this:
1269
1270     object Service "ping" {
1271       host_name = "localhost"
1272       check_command = "my-ping"
1273
1274       vars.ping_packets = 10 // Overrides the default value of 5 given in the command
1275     }
1276
1277 If a custom attribute isn't defined anywhere, an empty value is used and a warning is
1278 emitted to the Icinga 2 log.
1279
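To illustrate the evaluation order, the sketch below sets the same custom attributes at different levels. The host name `slow-link-host` and the threshold values are made up for this example.

```
object Host "slow-link-host" {
  address = "10.0.0.20"
  check_command = "hostalive"

  // Host-level values override the defaults in "my-ping".
  vars.ping_wrta = 300
  vars.ping_crta = 500
}

object Service "ping" {
  host_name = "slow-link-host"
  check_command = "my-ping"

  // Service-level value takes precedence over the host-level one.
  vars.ping_crta = 800
}
```

With this configuration `ping_wrta` resolves to the host's `300`, `ping_crta` to the service's `800`, and all remaining attributes fall back to the defaults defined in the `my-ping` command.
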
1280 > **Best Practice**
1281 >
1282 > By convention every host should have an `address` attribute. Hosts
1283 > which have an IPv6 address should also have an `address6` attribute.
1284
1285 ### <a id="runtime-custom-attribute-env-vars"></a> Runtime Custom Attributes as Environment Variables
1286
1287 The `env` command object attribute specifies a dictionary of environment variables whose values
1288 are calculated from either runtime macros or custom attributes and which are exported
1289 prior to executing the command.
1290
1291 This is useful, for example, for hiding sensitive information from the command line
1292 when passing credentials to database checks:
1293
1294     object CheckCommand "mysql-health" {
1295       import "plugin-check-command"
1296
1297       command = [
1298         PluginDir + "/check_mysql"
1299       ]
1300
1301       arguments = {
1302         "-H" = "$mysql_address$"
1303         "-d" = "$mysql_database$"
1304       }
1305
1306       vars.mysql_address = "$address$"
1307       vars.mysql_database = "icinga"
1308       vars.mysql_user = "icinga_check"
1309       vars.mysql_pass = "password"
1310
1311       env.MYSQLUSER = "$mysql_user$"
1312       env.MYSQLPASS = "$mysql_pass$"
1313     }
1314
1315 ### <a id="multiple-host-addresses-custom-attributes"></a> Multiple Host Addresses using Custom Attributes
1316
1317 The following example defines a `Host` with three different interface addresses defined as
1318 custom attributes in the `vars` dictionary. The `if-eth0` and `if-eth1` services will import
1319 these values into the `address` custom attribute. This attribute is available through the
1320 generic `$address$` runtime macro.
1321
1322     object Host "multi-ip" {
1323       check_command = "dummy"
1324       vars.address_lo = "127.0.0.1"
1325       vars.address_eth0 = "10.0.0.10"
1326       vars.address_eth1 = "192.168.1.10"
1327     }
1328
1329     apply Service "if-eth0" {
1330       import "generic-service"
1331
1332       vars.address = "$host.vars.address_eth0$"
1333       check_command = "my-generic-interface-check"
1334
1335       assign where host.vars.address_eth0 != ""
1336     }
1337
1338     apply Service "if-eth1" {
1339       import "generic-service"
1340
1341       vars.address = "$host.vars.address_eth1$"
1342       check_command = "my-generic-interface-check"
1343
1344       assign where host.vars.address_eth1 != ""
1345     }
1346
1347     object CheckCommand "my-generic-interface-check" {
1348       import "plugin-check-command"
1349
1350       command = "echo \"This would be the service $service.name$ using the address value: $address$\""
1351     }
1352
1353 The `CheckCommand` object is just an example to help you with testing and
1354 understanding the different custom attributes and runtime macros.
1355
1356 ### <a id="modified-attributes"></a> Modified Attributes
1357
1358 Icinga 2 allows you to modify object attributes at runtime so that they differ from
1359 the values defined in the local configuration. These modified attributes are
1360 stored as a bit-shifted value and made available in backends. Icinga 2 stores
1361 modified attributes in its state file and restores them on restart.
1362
1363 Modified Attributes can be reset using external commands.
1364
1365
1366 ## <a id="runtime-macros"></a> Runtime Macros
1367
1368 In addition to custom attributes, there are runtime macros made available by Icinga 2.
1369 These runtime macros reflect the current object state and may change over time, while
1370 custom attributes are configured statically (but can be modified at runtime using
1371 external commands).
1372
1373 ### <a id="runtime-macro-evaluation-order"></a> Runtime Macro Evaluation Order
1374
1375 Custom attributes can be accessed at [runtime](#runtime-custom-attributes) using their
1376 identifier omitting the `vars.` prefix.
1377 There are special cases when those custom attributes are not set and Icinga 2 provides
1378 a fallback to existing object attributes, for example `host.address`.
1379
1380 In the following example the `$address$` macro will be resolved with the value of `vars.address`.
1381
1382     object Host "localhost" {
1383       import "generic-host"
1384       check_command = "my-host-macro-test"
1385       address = "127.0.0.1"
1386       vars.address = "127.2.2.2"
1387     }
1388
1389     object CheckCommand "my-host-macro-test" {
1390       command = "echo \"address: $address$ host.address: $host.address$ host.vars.address: $host.vars.address$\""
1391     }
1392
1393 The check command output will look like
1394
1395     "address: 127.2.2.2 host.address: 127.0.0.1 host.vars.address: 127.2.2.2"
1396
1397 If you alter the host object and remove the `vars.address` line, Icinga 2 will fail to look up `$address$` in the
1398 custom attributes dictionary and then look for the host object's attribute.
1399
1400 The check command output will change to
1401
1402     "address: 127.0.0.1 host.address: 127.0.0.1 host.vars.address: "
1403
1404
1405 The same example can be defined for services overriding the `address` field based on a specific host custom attribute.
1406
1407     object Host "localhost" {
1408       import "generic-host"
1409       address = "127.0.0.1"
1410       vars.macro_address = "127.3.3.3"
1411     }
1412
1413     apply Service "my-macro-test" to Host {
1414       import "generic-service"
1415       check_command = "my-service-macro-test"
1416       vars.address = "$host.vars.macro_address$"
1417
1418       assign where host.address
1419     }
1420
1421     object CheckCommand "my-service-macro-test" {
1422       command = "echo \"address: $address$ host.address: $host.address$ host.vars.macro_address: $host.vars.macro_address$ service.vars.address: $service.vars.address$\""
1423     }
1424
1425 When the service check is executed the output looks like
1426
1427     "address: 127.3.3.3 host.address: 127.0.0.1 host.vars.macro_address: 127.3.3.3 service.vars.address: 127.3.3.3"
1428
1429 That way you can easily override existing macros being accessed by their short name like `$address$` and refrain
1430 from defining multiple check commands (one for `$address$` and one for `$host.vars.macro_address$`).
1431
1432
1433 ### <a id="host-runtime-macros"></a> Host Runtime Macros
1434
1435 The following host macros are available in all commands that are executed for
1436 hosts or services:
1437
1438   Name                         | Description
1439   -----------------------------|--------------
1440   host.name                    | The name of the host object.
1441   host.display_name            | The value of the `display_name` attribute.
1442   host.state                   | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1443   host.state_id                | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1444   host.state_type              | The host's current state type. Can be one of `SOFT` and `HARD`.
1445   host.check_attempt           | The current check attempt number.
1446   host.max_check_attempts      | The maximum number of checks which are executed before changing to a hard state.
1447   host.last_state              | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
1448   host.last_state_id           | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
1449   host.last_state_type         | The host's previous state type. Can be one of `SOFT` and `HARD`.
1450   host.last_state_change       | The last state change's timestamp.
1451   host.duration_sec            | The time since the last state change.
1452   host.latency                 | The host's check latency.
1453   host.execution_time          | The host's check execution time.
1454   host.output                  | The last check's output.
1455   host.perfdata                | The last check's performance data.
1456   host.last_check              | The timestamp when the last check was executed.
1457   host.num_services            | Number of services associated with the host.
1458   host.num_services_ok         | Number of services associated with the host which are in an `OK` state.
1459   host.num_services_warning    | Number of services associated with the host which are in a `WARNING` state.
1460   host.num_services_unknown    | Number of services associated with the host which are in an `UNKNOWN` state.
1461   host.num_services_critical   | Number of services associated with the host which are in a `CRITICAL` state.
1462
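A sketch of how host macros might be exported to a notification script, modelled on the `mail-service-notification` example. The command name, the script path and the environment variable names are assumptions.

```
object NotificationCommand "mail-host-notification" {
  import "plugin-notification-command"

  // Hypothetical script, analogous to mail-notification.sh
  command = [ SysconfDir + "/icinga2/scripts/mail-host-notification.sh" ]

  env = {
    HOSTDISPLAYNAME = "$host.display_name$"
    HOSTSTATE = "$host.state$"
    HOSTSTATETYPE = "$host.state_type$"
    HOSTOUTPUT = "$host.output$"
    HOSTSERVICESCRITICAL = "$host.num_services_critical$"
  }
}
```
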
1463 ### <a id="service-runtime-macros"></a> Service Runtime Macros
1464
1465 The following service macros are available in all commands that are executed for
1466 services:
1467
1468   Name                       | Description
1469   ---------------------------|--------------
1470   service.name               | The short name of the service object.
1471   service.display_name       | The value of the `display_name` attribute.
1472   service.check_command      | The short name of the command along with any arguments to be used for the check.
1473   service.state              | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1474   service.state_id           | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1475   service.state_type         | The service's current state type. Can be one of `SOFT` and `HARD`.
1476   service.check_attempt      | The current check attempt number.
1477   service.max_check_attempts | The maximum number of checks which are executed before changing to a hard state.
1478   service.last_state         | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
1479   service.last_state_id      | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
1480   service.last_state_type    | The service's previous state type. Can be one of `SOFT` and `HARD`.
1481   service.last_state_change  | The last state change's timestamp.
1482   service.duration_sec       | The time since the last state change.
1483   service.latency            | The service's check latency.
1484   service.execution_time     | The service's check execution time.
1485   service.output             | The last check's output.
1486   service.perfdata           | The last check's performance data.
1487   service.last_check         | The timestamp when the last check was executed.
1488
1489 ### <a id="command-runtime-macros"></a> Command Runtime Macros
1490
1491 The following macros are available in all commands:
1492
1493   Name                   | Description
1494   -----------------------|--------------
1495   command.name           | The name of the command object.
1496
1497 ### <a id="user-runtime-macros"></a> User Runtime Macros
1498
1499 The following macros are available in all commands that are executed for
1500 users:
1501
1502   Name                   | Description
1503   -----------------------|--------------
1504   user.name              | The name of the user object.
1505   user.display_name      | The value of the `display_name` attribute.
1506
1507 ### <a id="notification-runtime-macros"></a> Notification Runtime Macros
1508
1509   Name                   | Description
1510   -----------------------|--------------
1511   notification.type      | The type of the notification.
1512   notification.author    | The author of the notification comment, if existing.
1513   notification.comment   | The comment of the notification, if existing.
1514
1515 ### <a id="global-runtime-macros"></a> Global Runtime Macros
1516
1517 The following macros are available in all executed commands:
1518
1519   Name                   | Description
1520   -----------------------|--------------
1521   icinga.timet           | Current UNIX timestamp.
1522   icinga.long_date_time  | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
1523   icinga.short_date_time | Current date and time. Example: `2014-01-03 11:23:08`
1524   icinga.date            | Current date. Example: `2014-01-03`
1525   icinga.time            | Current time including timezone information. Example: `11:23:08 +0000`
1526   icinga.uptime          | Current uptime of the Icinga 2 process.
1527
1528 The following macros provide global statistics:
1529
1530   Name                              | Description
1531   ----------------------------------|--------------
1532   icinga.num_services_ok            | Current number of services in state 'OK'.
1533   icinga.num_services_warning       | Current number of services in state 'Warning'.
1534   icinga.num_services_critical      | Current number of services in state 'Critical'.
1535   icinga.num_services_unknown       | Current number of services in state 'Unknown'.
1536   icinga.num_services_pending       | Current number of pending services.
1537   icinga.num_services_unreachable   | Current number of unreachable services.
1538   icinga.num_services_flapping      | Current number of flapping services.
1539   icinga.num_services_in_downtime   | Current number of services in downtime.
1540   icinga.num_services_acknowledged  | Current number of acknowledged service problems.
1541   icinga.num_hosts_up               | Current number of hosts in state 'Up'.
1542   icinga.num_hosts_down             | Current number of hosts in state 'Down'.
1543   icinga.num_hosts_unreachable      | Current number of unreachable hosts.
1544   icinga.num_hosts_flapping         | Current number of flapping hosts.
1545   icinga.num_hosts_in_downtime      | Current number of hosts in downtime.
1546   icinga.num_hosts_acknowledged     | Current number of acknowledged host problems.
1547
1548
1549 ## <a id="check-result-freshness"></a> Check Result Freshness
1550
1551 In Icinga 2 active check freshness is enabled by default. A check result is
1552 considered stale when no new result arrives within the `check_interval` period.
1553
1554     threshold = last check execution time + check interval
1555
1556 Passive check freshness is calculated from the `check_interval` attribute if set.
1557
1558     threshold = last check result time + check interval
1559
1560 If the freshness threshold is exceeded, Icinga 2 executes a new check using the
1561 command defined in the `check_command` attribute.
1562
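The two threshold formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Icinga 2's implementation; the function names are hypothetical, and whether the comparison is strict (`>`) or inclusive is an assumption.

```python
def freshness_threshold(last_check: float, check_interval: float) -> float:
    """Timestamp after which a check result is considered stale.

    last_check is the last check execution time (active checks) or the
    last check result time (passive checks), as a UNIX timestamp.
    """
    return last_check + check_interval


def is_stale(last_check: float, check_interval: float, now: float) -> bool:
    """True if no new result arrived before the threshold (assumed strict '>')."""
    return now > freshness_threshold(last_check, check_interval)
```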
1563
1564 ## <a id="check-flapping"></a> Check Flapping
1565
1566 The flapping algorithm used in Icinga 2 does not store the past states but
1567 calculates a single flapping value based on counters and half-life values.
1568 Icinga 2 compares that value with the threshold configured in the
1569 `flapping_threshold` attribute.
1570
1571 Flapping detection can be enabled or disabled using the `enable_flapping` attribute.
1572
1573
1574 ## <a id="volatile-services"></a> Volatile Services
1575
1576 By default all services are non-volatile. When a problem occurs, the
1577 service enters a `SOFT` state, and once the check counter reaches
1578 `max_check_attempts`, a `HARD` state transition happens.
1579 Notifications are only triggered by `HARD` state changes and are then
1580 re-sent at the interval defined by the `interval` attribute.
1581
1582 It may be reasonable to have a volatile service which stays in a `HARD`
1583 state type if the service stays in a `NOT-OK` state. That way each
1584 service recheck will automatically trigger a notification unless the
1585 service is acknowledged or in a scheduled downtime.
1586
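The difference between regular and volatile services can be sketched as follows. This is a simplified model of the rule described above for problem states only; the function name and the `volatile` flag are hypothetical and do not mirror Icinga 2's internal code.

```python
def problem_state_type(check_attempt: int, max_check_attempts: int,
                       volatile: bool = False) -> str:
    """Return the state type for a service that is in a problem state.

    A non-volatile service stays SOFT until the check counter reaches
    max_check_attempts. A volatile service is treated as HARD on every check.
    """
    if volatile or check_attempt >= max_check_attempts:
        return "HARD"
    return "SOFT"
```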
1587
1588 ## <a id="external-commands"></a> External Commands
1589
1590 Icinga 2 provides an external command pipe for processing commands
1591 triggering specific actions (for example rescheduling a service check
1592 through the web interface).
1593
1594 To enable the `ExternalCommandListener` configuration, run the
1595 following command and restart Icinga 2 afterwards:
1596
1597     # icinga2-enable-feature command
1598
1599 Icinga 2 creates the command pipe file as `/var/run/icinga2/cmd/icinga2.cmd`
1600 using the default configuration.
1601
1602 Web interfaces and other Icinga addons are able to send commands to
1603 Icinga 2 through the external command pipe, for example for rescheduling
1604 a forced service check:
1605
1606     # /bin/echo "[`date +%s`] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;`date +%s`" >> /var/run/icinga2/cmd/icinga2.cmd
1607
1608     # tail -f /var/log/messages
1609
1610     Oct 17 15:01:25 icinga-server icinga2: Executing external command: [1382014885] SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382014885
1611     Oct 17 15:01:25 icinga-server icinga2: Rescheduling next check for service 'ping4'
1612
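The same command line can be assembled programmatically. The sketch below only reproduces the `[timestamp] COMMAND;arg;arg` layout visible in the example above; the helper names are hypothetical, and the default pipe path is the one mentioned in this section.

```python
import time


def format_external_command(name, args, timestamp=None):
    """Build one external command line: [timestamp] NAME;arg1;arg2;..."""
    if timestamp is None:
        timestamp = int(time.time())
    fields = ";".join([name] + [str(arg) for arg in args])
    return f"[{timestamp}] {fields}"


def send_external_command(line, pipe_path="/var/run/icinga2/cmd/icinga2.cmd"):
    """Write the command to the pipe; requires write permission on the pipe."""
    with open(pipe_path, "w") as pipe:
        pipe.write(line + "\n")


# Formatting only, nothing is written to the pipe here.
line = format_external_command(
    "SCHEDULE_FORCED_SVC_CHECK", ["localhost", "ping4", 1382014885],
    timestamp=1382014885,
)
```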
1613 By default the command pipe file is owned by the group `icingacmd` with read/write
1614 permissions. Add your webserver's user to the group `icingacmd` to
1615 enable sending commands to Icinga 2 through your web interface:
1616
1617     # usermod -a -G icingacmd www-data
1618
1619 Debian packages use `nagios` as the default user and group name, so replace `icingacmd`
1620 with `nagios` there. The webserver's user name also differs between distributions.
1621
1622 ### <a id="external-command-list"></a> External Command List
1623
1624 A list of currently supported external commands can be found [here](#external-commands-list-detail).
1625
1626 Detailed information on the commands and their required parameters can be found
1627 in the [Icinga 1.x documentation](http://docs.icinga.org/latest/en/extcommands2.html).
1628
1629
1630 ## <a id="logging"></a> Logging
1631
1632 Icinga 2 supports three different types of logging:
1633
1634 * File logging
1635 * Syslog (on *NIX-based operating systems)
1636 * Console logging (`STDOUT` on tty)
1637
1638 You can enable or disable loggers using the `icinga2-enable-feature`
1639 and `icinga2-disable-feature` commands:
1640
1641 Feature  | Description
1642 ---------|------------
1643 debuglog | Debug log (path: `/var/log/icinga2/debug.log`, severity: `debug` or higher)
1644 mainlog  | Main log (path: `/var/log/icinga2/icinga2.log`, severity: `information` or higher)
1645 syslog   | Syslog (severity: `warning` or higher)
1646
1647 By default the `mainlog` feature is enabled. When running Icinga 2
1648 on a terminal, log messages with severity `information` or higher are
1649 written to the console.
1650
1651
1652 ## <a id="performance-data"></a> Performance Data
1653
1654 When a host or service check is executed, plugins should provide so-called
1655 `performance data`. In addition, further check performance data
1656 can be fetched using Icinga 2 runtime macros such as the check latency
1657 or the current service state (or additional custom attributes).
1658
1659 The performance data can be passed to external applications which aggregate and
1660 store them in their backends. These tools usually generate graphs for historical
1661 reporting and trending.
1662
1663 Well-known addons processing Icinga performance data are PNP4Nagios,
1664 inGraph and Graphite.
1665
1666 ### <a id="writing-performance-data-files"></a> Writing Performance Data Files
1667
1668 PNP4Nagios, inGraph and Graphios use performance data collector daemons to fetch
1669 the current performance files for their backend updates.
1670
1671 For that purpose the Icinga 2 `PerfdataWriter` object allows you to define
1672 the output template format for hosts and services using Icinga 2
1673 runtime macros.
1674
1675     host_format_template = "DATATYPE::HOSTPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tHOSTPERFDATA::$host.perfdata$\tHOSTCHECKCOMMAND::$host.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$"
1676     service_format_template = "DATATYPE::SERVICEPERFDATA\tTIMET::$icinga.timet$\tHOSTNAME::$host.name$\tSERVICEDESC::$service.name$\tSERVICEPERFDATA::$service.perfdata$\tSERVICECHECKCOMMAND::$service.check_command$\tHOSTSTATE::$host.state$\tHOSTSTATETYPE::$host.state_type$\tSERVICESTATE::$service.state$\tSERVICESTATETYPE::$service.state_type$"
1677
1678 The default templates are already provided with the Icinga 2 feature configuration
1679 which can be enabled using
1680
1681     # icinga2-enable-feature perfdata
1682
1683 By default all performance data files are rotated every 15 seconds into
1684 the `/var/spool/icinga2/perfdata/` directory as `host-perfdata.<timestamp>` and
1685 `service-perfdata.<timestamp>`.
1686 External collectors need to parse the rotated performance data files and then
1687 remove the processed files.
1688
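An external collector might parse one line of a rotated file roughly like this. The sketch assumes that the `\t` sequences in the format template are written as real tab characters and that each field keeps the `KEY::value` layout; the function name and the sample values are made up for illustration.

```python
def parse_perfdata_line(line: str) -> dict:
    """Split a tab-separated line of KEY::value fields into a dictionary."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        key, separator, value = field.partition("::")
        if separator:  # ignore fragments without a KEY::value layout
            record[key] = value
    return record


# Illustrative sample line, not taken from a real spool file.
sample = (
    "DATATYPE::SERVICEPERFDATA\tTIMET::1382115688\tHOSTNAME::localhost\t"
    "SERVICEDESC::ping4\tSERVICEPERFDATA::rta=0.05ms;;;0 pl=0%;;;0"
)
record = parse_perfdata_line(sample)
```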
1689 ### <a id="graphite-carbon-cache-writer"></a> Graphite Carbon Cache Writer
1690
1691 While there are some Graphite collector scripts and daemons like Graphios available for
1692 Icinga 1.x, it is more reasonable to process the check and plugin performance data
1693 directly in memory in Icinga 2. Once new metrics are available, Icinga 2 writes
1694 them directly to the TCP socket of the configured Graphite Carbon daemon.
1695
1696 You can enable the feature using
1697
1698     # icinga2-enable-feature graphite
1699
1700 By default the `GraphiteWriter` object expects the Graphite Carbon Cache to listen at
1701 `127.0.0.1` on port `2003`.
1702
1703 The current naming schema is
1704
1705     icinga.<hostname>.<metricname>
1706     icinga.<hostname>.<servicename>.<metricname>
1707
1708 To make sure Icinga 2 writes a valid label into Graphite some characters are replaced
1709 with `_` in the target name:
1710
1711     \/.-  (and space)
1712
1713 The resulting name in Graphite might look like:
1714
1715     www-01 / http-cert / response time
1716     icinga.www_01.http_cert.response_time
1717
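The character replacement and the naming schema can be reproduced with a short sketch. It only mirrors the rules stated above (backslash, slash, dot, hyphen and space become `_`); the function names are hypothetical and the real writer may handle further cases.

```python
import re

# Characters listed above that are replaced with an underscore.
_UNSAFE = re.compile(r"[\\/.\- ]")


def escape_metric(name: str) -> str:
    """Replace characters that are not valid in a Graphite label."""
    return _UNSAFE.sub("_", name)


def metric_path(host: str, metric: str, service: str = None) -> str:
    """Build icinga.<hostname>[.<servicename>].<metricname>."""
    parts = ["icinga", escape_metric(host)]
    if service is not None:
        parts.append(escape_metric(service))
    parts.append(escape_metric(metric))
    return ".".join(parts)
```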
1718 In addition to the performance data retrieved from the check plugin, Icinga 2 sends
1719 internal check statistic data to Graphite:
1720
1721   metric             | description
1722   -------------------|------------------------------------------
1723   current_attempt    | current check attempt
1724   max_check_attempts | maximum check attempts until the hard state is reached
1725   reachable          | checked object is reachable
1726   execution_time     | check execution time
1727   latency            | check latency
1728   state              | current state of the checked object
1729   state_type         | 0=SOFT, 1=HARD state
1730
1731 The following example illustrates how to configure the storage-schemas for Graphite Carbon
1732 Cache. Please make sure that the order is correct because the first match wins.
1733
1734     [icinga_internals]
1735     pattern = ^icinga\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
1736     retentions = 5m:7d
1737
1738     [icinga_default]
1739     # intervals like PNP4Nagios uses them per default
1740     pattern = ^icinga\.
1741     retentions = 1m:2d,5m:10d,30m:90d,360m:4y
1742
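Why the order matters can be seen in a small sketch of first-match-wins selection. It reuses the two patterns from the example above and assumes that Carbon searches each pattern against the metric name; the function name is hypothetical.

```python
import re

# Sections in file order: (name, pattern, retentions).
SCHEMAS = [
    ("icinga_internals",
     r"^icinga\..*\.(max_check_attempts|reachable|current_attempt"
     r"|execution_time|latency|state|state_type)",
     "5m:7d"),
    ("icinga_default", r"^icinga\.", "1m:2d,5m:10d,30m:90d,360m:4y"),
]


def match_schema(metric: str):
    """Return (section, retentions) of the first matching section, or None."""
    for name, pattern, retentions in SCHEMAS:
        if re.search(pattern, metric):
            return name, retentions
    return None
```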
1743 ## <a id="status-data"></a> Status Data
1744
1745 Icinga 1.x writes object configuration data and status data in a cyclic
1746 interval to its `objects.cache` and `status.dat` files. Icinga 2 provides
1747 the `StatusDataWriter` object which dumps all configuration objects and
1748 status updates in a regular interval.
1749
1750     # icinga2-enable-feature statusdata
1751
1752 Icinga 1.x Classic UI requires this data set as part of its backend.
1753
1754 > **Note**
1755 >
1756 > If you are not using any web interface or addon which uses these files
1757 > you can safely disable this feature.
1758
1759
1760 ## <a id="compat-logging"></a> Compat Logging
1761
1762 The Icinga 1.x log format is called the `Compat Log`
1763 in Icinga 2 and is provided by the `CompatLogger` object.
1764
1765 These logs are not only used for informational representation in
1766 external web interfaces parsing the logs, but also to generate
1767 SLA reports and trends in Icinga 1.x Classic UI. Furthermore the
1768 [Livestatus](#livestatus) feature uses these logs for answering queries to
1769 historical tables.
1770
1771 The `CompatLogger` object can be enabled with
1772
1773     # icinga2-enable-feature compatlog
1774
1775 By default, the Icinga 1.x log file called `icinga.log` is located
1776 in `/var/log/icinga2/compat`. Rotated log files are moved into
1777 `/var/log/icinga2/compat/archives`.
1778
1779 The format cannot be changed without breaking compatibility to
1780 existing log parsers.
1781
1782     # tail -f /var/log/icinga2/compat/icinga.log
1783
1784     [1382115688] LOG ROTATION: HOURLY
1785     [1382115688] LOG VERSION: 2.0
1786     [1382115688] HOST STATE: CURRENT;localhost;UP;HARD;1;
1787     [1382115688] SERVICE STATE: CURRENT;localhost;disk;WARNING;HARD;1;
1788     [1382115688] SERVICE STATE: CURRENT;localhost;http;OK;HARD;1;
1789     [1382115688] SERVICE STATE: CURRENT;localhost;load;OK;HARD;1;
1790     [1382115688] SERVICE STATE: CURRENT;localhost;ping4;OK;HARD;1;
1791     [1382115688] SERVICE STATE: CURRENT;localhost;ping6;OK;HARD;1;
1792     [1382115688] SERVICE STATE: CURRENT;localhost;processes;WARNING;HARD;1;
1793     [1382115688] SERVICE STATE: CURRENT;localhost;ssh;OK;HARD;1;
1794     [1382115688] SERVICE STATE: CURRENT;localhost;users;OK;HARD;1;
1795     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;disk;1382115705
1796     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;http;1382115705
1797     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;load;1382115705
1798     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping4;1382115705
1799     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ping6;1382115705
1800     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;processes;1382115705
1801     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;ssh;1382115705
1802     [1382115706] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;users;1382115705
1803     [1382115731] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;localhost;ping6;2;critical test|
1804     [1382115731] SERVICE ALERT: localhost;ping6;CRITICAL;SOFT;2;critical test
1805
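The layout of the log lines shown above (`[timestamp] TYPE: field;field;...`) lends itself to simple parsing. The following is a minimal sketch based only on those sample lines; the function name is hypothetical, and entries that do not follow this layout are returned as `None`.

```python
import re

# [timestamp] ENTRY TYPE: semicolon-separated payload
_LINE = re.compile(r"^\[(\d+)\] ([^:]+): (.*)$")


def parse_compat_line(line: str):
    """Parse one compat log line into timestamp, type and payload fields."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    timestamp, entry_type, payload = match.groups()
    return {
        "timestamp": int(timestamp),
        "type": entry_type,
        "fields": payload.split(";"),
    }
```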
1806
1807
1808
1809 ## <a id="db-ido"></a> DB IDO
1810
1811 The IDO (Icinga Data Output) modules for Icinga 2 take care of exporting all
1812 configuration and status information into a database. The IDO database is used
1813 by a number of projects including Icinga Web 1.x and 2.
1814
1815 Details on the installation can be found in the [Getting Started](#configuring-ido)
1816 chapter. Details on the configuration can be found in the
1817 [IdoMysqlConnection](#objecttype-idomysqlconnection) and
1818 [IdoPgsqlConnection](#objecttype-idopgsqlconnection)
1819 object configuration documentation.
1820 The DB IDO feature supports [High Availability](#high-availability-db-ido) in
1821 the Icinga 2 cluster.
1822
1823 The following example query checks the health of the current Icinga 2 instance,
1824 which writes its current status to the DB IDO backend table `icinga_programstatus`
1825 every 10 seconds. The query looks 60 seconds into the past, which is a reasonable
1826 amount of time; adjust it to your requirements. If the condition is not met,
1827 the query returns an empty result.
1828
1829 > **Tip**
1830 >
1831 > Use [check plugins](#plugins) to monitor the backend.
1832
1833 Replace the `default` string with your instance name, if different.
1834
1835 Example for MySQL:
1836
1837     # mysql -u root -p icinga -e "SELECT status_update_time FROM icinga_programstatus ps
1838       JOIN icinga_instances i ON ps.instance_id=i.instance_id
1839       WHERE (UNIX_TIMESTAMP(ps.status_update_time) > UNIX_TIMESTAMP(NOW())-60)
1840       AND i.instance_name='default';"
1841
1842     +---------------------+
1843     | status_update_time  |
1844     +---------------------+
1845     | 2014-05-29 14:29:56 |
1846     +---------------------+
1847
1848
1849 Example for PostgreSQL:
1850
1851     # export PGPASSWORD=icinga; psql -U icinga -d icinga -c "SELECT ps.status_update_time FROM icinga_programstatus AS ps
1852       JOIN icinga_instances AS i ON ps.instance_id=i.instance_id
1853       WHERE (extract(epoch FROM ps.status_update_time) > extract(epoch FROM now())-60)
1854       AND i.instance_name='default'";
1855
1856     status_update_time
1857     ------------------------
1858      2014-05-29 15:11:38+02
1859     (1 row)
1860
1861
1862 A detailed list on the available table attributes can be found in the [DB IDO Schema documentation](#schema-db-ido).
1863
1864
1865 ## <a id="livestatus"></a> Livestatus
1866
1867 The [MK Livestatus](http://mathias-kettner.de/checkmk_livestatus.html) project
1868 implements a query protocol that lets users query their Icinga instance for
1869 status information. It can also be used to send commands.
1870
1871 Details on the installation can be found in the [Getting Started](#setting-up-livestatus)
1872 chapter.
1873
1874 ### <a id="livestatus-sockets"></a> Livestatus Sockets
1875
1876 Unlike the Icinga 1.x addon, Icinga 2 supports two socket types:
1877
1878 * Unix socket (default)
1879 * TCP socket
1880
1881 Details on the configuration can be found in the [LivestatusListener](#objecttype-livestatuslistener)
1882 object configuration.
1883
1884 ### <a id="livestatus-get-queries"></a> Livestatus GET Queries
1885
1886 > **Note**
1887 >
1888 > All Livestatus queries require an additional empty line as query end identifier.
1889 > The `unixcat` tool is provided by the MK Livestatus project or is available as a
1890 > separate binary.
1891
1892 There is also a Perl module available on CPAN for accessing the Livestatus socket
1893 programmatically: [Monitoring::Livestatus](http://search.cpan.org/~nierlein/Monitoring-Livestatus-0.74/)
1894
1895
1896 Example using the Unix socket:
1897
1898     # echo -e "GET services\n" | unixcat /var/run/icinga2/cmd/livestatus
1899
1900 Example using the TCP socket listening on port `6558`:
1901
1902     # echo -e 'GET services\n' | netcat 127.0.0.1 6558
1903
1904     # cat > servicegroups <<EOF
1905     GET servicegroups
1906
1907     EOF
1908
1909     # (cat servicegroups; sleep 1) | netcat 127.0.0.1 6558
1910
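The queries above can also be sent from a script. The sketch below builds a query that ends with the required empty line and sends it over the Unix socket; the function names are hypothetical, and the default socket path is the one used in the example above.

```python
import socket


def build_query(table: str, headers=()) -> str:
    """Build a GET query; the trailing empty line marks the end of the query."""
    lines = [f"GET {table}"] + list(headers)
    return "\n".join(lines) + "\n\n"


def query_unix_socket(query: str, path="/var/run/icinga2/cmd/livestatus") -> str:
    """Send a query to the Livestatus Unix socket and return the raw response."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the query is complete
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")
```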
1911
1912 ### <a id="livestatus-command-queries"></a> Livestatus COMMAND Queries
1913
1914 A list of available external commands and their parameters can be found [here](#external-commands-list-detail).
1915
1916     $ echo -e 'COMMAND <externalcommandstring>' | netcat 127.0.0.1 6558
1917
1918
1919 ### <a id="livestatus-filters"></a> Livestatus Filters
1920
1921 and, or, negate
1922
1923   Operator  | Negate   | Description
1924   ----------|----------|-------------
1925    =        | !=       | Equality
1926    ~        | !~       | Regex match
1927    =~       | !=~      | Equality ignoring case
1928    ~~       | !~~      | Regex ignoring case
1929    <        |          | Less than
1930    >        |          | Greater than
1931    <=       |          | Less than or equal
1932    >=       |          | Greater than or equal
1933
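The meaning of the operators in the table above can be sketched as a small matcher. This is an illustration only: the function names are hypothetical, treating a regex match as a search anywhere in the value is an assumption, and the sketch compares numbers as floats.

```python
import re

# Only these operators have a negated form in the table above.
_NEGATABLE = {"=", "~", "=~", "~~"}


def _compare(value, op: str, ref) -> bool:
    if op == "=":
        return str(value) == str(ref)
    if op == "~":
        return re.search(str(ref), str(value)) is not None
    if op == "=~":
        return str(value).lower() == str(ref).lower()
    if op == "~~":
        return re.search(str(ref), str(value), re.IGNORECASE) is not None
    if op == "<":
        return float(value) < float(ref)
    if op == ">":
        return float(value) > float(ref)
    if op == "<=":
        return float(value) <= float(ref)
    if op == ">=":
        return float(value) >= float(ref)
    raise ValueError(f"unknown operator: {op}")


def matches(value, op: str, ref) -> bool:
    """Evaluate 'value op ref', supporting the negated forms (!=, !~, ...)."""
    if op.startswith("!") and op[1:] in _NEGATABLE:
        return not _compare(value, op[1:], ref)
    return _compare(value, op, ref)
```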
1934
1935 ### <a id="livestatus-stats"></a> Livestatus Stats
1936
1937 Schema: "Stats: aggregatefunction aggregateattribute"
1938
1939   Aggregate Function | Description
1940   -------------------|--------------
1941   sum                | &nbsp;
1942   min                | &nbsp;
1943   max                | &nbsp;
1944   avg                | sum / count
1945   std                | standard deviation
1946   suminv             | sum (1 / value)
1947   avginv             | suminv / count
1948   count              | default for any stats query if no aggregate function is defined
1949
1950 Example:
1951
1952     GET hosts
1953     Filter: has_been_checked = 1
1954     Filter: check_type = 0
1955     Stats: sum execution_time
1956     Stats: sum latency
1957     Stats: sum percent_state_change
1958     Stats: min execution_time
1959     Stats: min latency
1960     Stats: min percent_state_change
1961     Stats: max execution_time
1962     Stats: max latency
1963     Stats: max percent_state_change
1964     OutputFormat: json
1965     ResponseHeader: fixed16
1966
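The aggregate functions from the table above can be written out in a few lines. This is a sketch of the arithmetic only: the function name is hypothetical, `std` is computed as the population standard deviation (an assumption), and `count` simply returns the number of values.

```python
import math


def aggregate(function: str, values):
    """Apply one Stats aggregate function to a list of numeric values."""
    count = len(values)
    if function == "count":
        return count
    if function == "sum":
        return sum(values)
    if function == "min":
        return min(values)
    if function == "max":
        return max(values)
    if function == "avg":
        return sum(values) / count
    if function == "suminv":
        return sum(1.0 / value for value in values)
    if function == "avginv":
        return sum(1.0 / value for value in values) / count
    if function == "std":
        mean = sum(values) / count
        return math.sqrt(sum((value - mean) ** 2 for value in values) / count)
    raise ValueError(f"unknown aggregate function: {function}")
```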
1967 ### <a id="livestatus-output"></a> Livestatus Output
1968
1969 * CSV
1970
1971 CSV Output uses two levels of array separators: The members array separator
1972 is a comma (1st level) while extra info and host|service relation separator
1973 is a pipe (2nd level).
1974
1975 Separators can be set using ASCII codes like:
1976
1977     Separators: 10 59 44 124
1978
1979 * JSON
1980
1981 Default separators.
1982
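The ASCII codes in the `Separators` header shown above decode to the actual separator characters. The sketch below assumes the usual MK Livestatus ordering (row, column, list, host|service); the function names and the sample payload are hypothetical.

```python
def decode_separators(header_value: str):
    """Turn '10 59 44 124' into the separator characters."""
    return [chr(int(code)) for code in header_value.split()]


def split_rows(payload: str, separators):
    """Split CSV output into rows and columns using the first two separators."""
    row_separator, column_separator = separators[0], separators[1]
    return [row.split(column_separator)
            for row in payload.split(row_separator) if row]


separators = decode_separators("10 59 44 124")
rows = split_rows("localhost;ping4\nlocalhost;http\n", separators)
```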
1983 ### <a id="livestatus-error-codes"></a> Livestatus Error Codes
1984
1985   Code      | Description
1986   ----------|--------------
1987   200       | OK
1988   404       | Table does not exist
1989   452       | Exception on query
1990
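When a query sets `ResponseHeader: fixed16`, the status code arrives in a 16-byte header in front of the response body. The sketch below assumes the layout described for MK Livestatus (three digits of status code, a space, the body length padded to eleven characters, and a newline); the function name is hypothetical.

```python
# Status codes from the table above.
STATUS_DESCRIPTIONS = {
    200: "OK",
    404: "Table does not exist",
    452: "Exception on query",
}


def parse_fixed16_header(header: bytes):
    """Return (status_code, body_length) from a 16-byte response header."""
    if len(header) != 16 or not header.endswith(b"\n"):
        raise ValueError("expected a 16-byte header ending with a newline")
    text = header.decode("ascii")
    status = int(text[0:3])
    length = int(text[4:15])  # int() ignores the padding spaces
    return status, length
```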
1991 ### <a id="livestatus-tables"></a> Livestatus Tables
1992
1993   Table         | Join      | Description
1994   --------------|-----------|----------------------------
1995   hosts         | &nbsp;    | host config and status attributes, services counter
1996   hostgroups    | &nbsp;    | hostgroup config, status attributes and host/service counters
1997   services      | hosts     | service config and status attributes
1998   servicegroups | &nbsp;    | servicegroup config, status attributes and service counters
1999   contacts      | &nbsp;    | contact config and status attributes
2000   contactgroups | &nbsp;    | contactgroup config, members
2001   commands      | &nbsp;    | command name and line
2002   status        | &nbsp;    | programstatus, config and stats
2003   comments      | services  | status attributes
2004   downtimes     | services  | status attributes
2005   timeperiods   | &nbsp;    | name and is inside flag
2006   endpoints     | &nbsp;    | config and status attributes
2007   log           | services, hosts, contacts, commands | parses [compatlog](#objecttype-compatlogger) and shows log attributes
2008   statehist     | hosts, services | parses [compatlog](#objecttype-compatlogger) and aggregates state change attributes
2009
2010 The `commands` table is populated with `CheckCommand`, `EventCommand` and `NotificationCommand` objects.
2011
2012 A detailed list on the available table attributes can be found in the [Livestatus Schema documentation](#schema-livestatus).
2013
2014
2015 ## <a id="check-result-files"></a> Check Result Files
2016
2017 Icinga 1.x writes its check result files to a temporary spool directory
2018 where they are processed in a regular interval.
2019 While this is extremely inefficient in terms of performance, it has
2020 proven useful for passing passive check results directly into Icinga 1.x,
2021 skipping the external command pipe.
2022
2023 Several clustered/distributed environments and check-aggregation addons
2024 use that method. In order to support step-by-step migration of these
2025 environments, Icinga 2 ships the `CheckResultReader` object.
2026
2027 There is no feature configuration available; the object must be defined
2028 on demand in your Icinga 2 objects configuration.
2029
2030     object CheckResultReader "reader" {
2031       spool_dir = "/data/check-results"
2032     }