The acknowledgement is removed if a state change occurs or if the host/service
recovers (OK/Up state).
-If you acknowlege a problem once you've received a `Critical` notification,
+If you acknowledge a problem once you've received a `Critical` notification,
the acknowledgement will be removed if there is a state transition to `Warning`.
```
OK -> WARNING -> CRITICAL -> WARNING -> OK
preferred.
The following example defines a time period called `holidays` where
-notifications should be supressed:
+notifications should be suppressed:
object TimePeriod "holidays" {
import "legacy-timeperiod"
-
+
ranges = {
"january 1" = "00:00-24:00" //new year's day
"july 4" = "00:00-24:00" //independence day
object TimePeriod "weekends-excluded" {
import "legacy-timeperiod"
-
+
ranges = {
"saturday" = "00:00-09:00,18:00-24:00"
"sunday" = "00:00-09:00,18:00-24:00"
object TimePeriod "prod-notification" {
import "legacy-timeperiod"
-
+
excludes = [ "holidays", "weekends-excluded" ]
-
+
ranges = {
"monday" = "00:00-24:00"
"tuesday" = "00:00-24:00"
}
```
-References: [get_service](18-library-reference.md#objref-get_service), [nacro](18-library-reference.md#scoped-functions-macro), [DateTime](18-library-reference.md#datetime-type).
+References: [get_service](18-library-reference.md#objref-get_service), [macro](18-library-reference.md#scoped-functions-macro), [DateTime](18-library-reference.md#datetime-type).
Example output in Icinga Web 2:
Icinga 2 supports optional detection of hosts and services that are "flapping".
-Flapping occurs when a service or host changes state too frequently, resulting
-in a storm of problem and recovery notifications. Flapping can be the source of
-configuration problems (i.e. thresholds set too low), troublesome services,
-or real network problems.
+Flapping occurs when a service or host changes state too frequently, which would result in a storm of problem and
+recovery notifications. With flapping detection enabled, a flapping notification is sent and other notifications are
+suppressed until the object calms down, that is, until checks have returned the same status a few times. Flapping
+detection can help detect configuration problems (wrong thresholds), troublesome services, or network problems.
Flapping detection can be enabled or disabled using the `enable_flapping` attribute.
-The `flapping_threshold` attributes allows to specify the percentage of state changes
-when a [host](09-object-types.md#objecttype-host) or [service](objecttype-service) is considered to flap.
+The `flapping_threshold_high` and `flapping_threshold_low` attributes specify the thresholds that control
+when a [host](09-object-types.md#objecttype-host) or [service](09-object-types.md#objecttype-service) is considered to be flapping.
+
+The default thresholds are 30% for high and 25% for low. If the computed flapping value exceeds the high threshold, a
+host or service is considered flapping until the value drops below the low threshold.
+
+`FlappingStart` and `FlappingEnd` notifications will be sent out accordingly, if configured. See the chapter on
+[notifications](alert-notifications) for details.
+
+> Note: There is no distinction between hard and soft states with flapping. All state changes count, and notifications
+> will be sent out regardless of the object's state.
+
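As a minimal sketch of how the attributes described above fit together, the following host object enables flapping detection and sets both thresholds explicitly to the documented defaults. The object name `flapping-demo`, the `generic-host` template and the address are hypothetical placeholders, not part of the original text.

```
// Hypothetical host used for illustration only.
object Host "flapping-demo" {
  import "generic-host"            // assumed template name
  address = "192.0.2.10"           // documentation-range address, placeholder

  enable_flapping = true           // turn flapping detection on
  flapping_threshold_high = 30     // considered flapping above 30%
  flapping_threshold_low = 25      // flapping ends once the value drops below 25%
}
```

Keeping the low threshold below the high one leaves a gap between the two, so an object hovering around a single value does not toggle in and out of the flapping state.
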
+### How it works <a id="check-flapping-how-it-works"></a>
+
+Icinga 2 saves the last 20 state changes for every host and service. See the graphic below:
+
+![Icinga 2 Flapping State Timeline](images/advanced-topics/flapping-state-graph.png)
-Note: There are known issues with flapping detection. Please refrain from enabling
-flapping until [#4982](https://github.com/Icinga/icinga2/issues/4982) is fixed.
+All the states are weighted, with the most recent one being worth the most (1.15) and the 20th the least (0.8). The
+states in between are evenly distributed. The final flapping value is the sum of the weighted state changes divided by
+the total count of 20.
+
+In the example above, the added states would have a total value of 7.82 (`0.84 + 0.86 + 0.88 + 0.9 + 0.98 + 1.06 + 1.12 + 1.18`).
+This yields a flapping percentage of 39.1% (`7.82 / 20 * 100`). As the default upper flapping threshold is 30%, it would be
+considered flapping.
+
+If the next seven check results were not state changes, the flapping percentage would fall below the lower threshold
+of 25% and the host or service would therefore recover from flapping.
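
The arithmetic from the example above can be reproduced with a few lines of Icinga 2 script, for instance in `icinga2 console`. This is only a sketch of the calculation as described in the text, not the actual implementation; the variable names are made up for illustration, and the weights are copied from the example.

```
// Weights of the slots that contain a state change (taken from the example above).
var weights = [ 0.84, 0.86, 0.88, 0.9, 0.98, 1.06, 1.12, 1.18 ]

// Sum the weighted state changes.
var weighted_sum = 0
for (weight in weights) {
  weighted_sum += weight
}

// Divide by the total count of 20 stored states and convert to a percentage.
var flapping_percentage = weighted_sum / 20 * 100

// Compare against the default high threshold of 30%.
var is_flapping = flapping_percentage > 30

log(flapping_percentage)
log(is_flapping)
```

Slots without a state change contribute nothing to the sum, which is why a run of identical check results lowers the percentage over time.
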
## Volatile Services <a id="volatile-services"></a>
object User "short-dummy" {
}
-
+
object UserGroup "short-dummy-group" {
assign where user.name == "short-dummy"
}
-
+
apply Notification "mail-admins-short" to Host {
import "mail-host-notification"
command = "mail-host-notification-test"
}
log("Running command")
log(mailscript)
-
+
var cmd = [ SysconfDir + "/icinga2/scripts/" + mailscript ]
log(LogCritical, "me", cmd)
return cmd
}}
-
+
env = {
}
}
}
}
}
-
+
apply Service "ping4" {
import "generic-service"
check_command = "ping4"
-
+
vars.ping_wrta = group_specific_value("slow-lan", 300, 100)
vars.ping_crta = group_specific_value("slow-lan", 500, 200)
-
+
assign where true
}
warn | Value | Warning threshold value.
min | Value | Minimum value returned by the check.
max | Value | Maximum value returned by the check.
-
-