2019-09-10 13:32:51 -04:00
|
|
|
[role="xpack"]
|
2017-03-28 17:23:01 -04:00
|
|
|
[[actions]]
|
|
|
|
== Actions
|
|
|
|
|
2017-11-09 10:13:56 -05:00
|
|
|
When a watch's condition is met, its actions are executed unless it is being
|
2019-09-30 13:18:50 -04:00
|
|
|
<<actions-ack-throttle,throttled>>. A watch can perform multiple actions.
|
2017-06-27 20:16:51 -04:00
|
|
|
The actions are executed one at a time and each action executes independently.
|
|
|
|
Any failures encountered while executing an action are recorded in the
|
2017-03-28 17:23:01 -04:00
|
|
|
action result and in the watch history.
|
|
|
|
|
2017-06-27 20:16:51 -04:00
|
|
|
NOTE: If no actions are defined for a watch, no actions are executed.
|
2017-03-28 17:23:01 -04:00
|
|
|
However, a `watch_record` is still written to the watch history.
|
2017-06-27 20:16:51 -04:00
|
|
|
|
2017-03-28 17:23:01 -04:00
|
|
|
Actions have access to the payload in the execution context. They can use it to
|
|
|
|
support their execution in any way they need. For example, the payload might
|
|
|
|
serve as a model for a templated email body.
|
|
|
|
|
2017-06-27 20:16:51 -04:00
|
|
|
{watcher} supports the following types of actions:
|
2019-09-30 13:18:50 -04:00
|
|
|
<<actions-email,`email`>>, <<actions-webhook,`webhook`>>, <<actions-index,`index`>>,
|
|
|
|
<<actions-logging,`logging`>>, <<actions-slack,`slack`>>,
|
|
|
|
and <<actions-pagerduty,`pagerduty`>>.
|
2017-03-28 17:23:01 -04:00
|
|
|
|
|
|
|
[float]
|
|
|
|
[[actions-ack-throttle]]
|
2019-09-30 13:18:50 -04:00
|
|
|
=== Acknowledgement and throttling
|
2017-03-28 17:23:01 -04:00
|
|
|
|
|
|
|
During the watch execution, once the condition is met, a decision is made per
|
|
|
|
configured action as to whether it should be throttled. The main purpose of
|
|
|
|
action throttling is to prevent too many executions of the same action for the
|
|
|
|
same watch.
|
|
|
|
|
|
|
|
For example, suppose you have a watch that detects errors in an application's log
|
|
|
|
entries. The watch is triggered every five minutes and searches for errors during
|
|
|
|
the last hour. In this case, if there are errors, there is a period of time where
|
|
|
|
the watch is checked and its actions are executed multiple times based on the same
|
|
|
|
errors. As a result, the system administrator receives multiple notifications about
|
|
|
|
the same issue, which can be annoying.
|
|
|
|
|
|
|
|
To address this issue, {watcher} supports time-based throttling. You can define
|
|
|
|
a throttling period as part of the action configuration to limit how often the
|
|
|
|
action is executed. When you set a throttling period, {watcher} prevents repeated
|
|
|
|
execution of the action if it has already executed within the throttling period
|
|
|
|
time frame (`now - throttling period`).
|
|
|
|
|
|
|
|
The following snippet shows a watch for the scenario described above - associating
|
|
|
|
a throttle period with the `email_administrator` action:
|
|
|
|
|
2019-09-09 12:35:50 -04:00
|
|
|
[source,console]
|
2017-03-28 17:23:01 -04:00
|
|
|
--------------------------------------------------
|
2018-12-08 13:57:16 -05:00
|
|
|
PUT _watcher/watch/error_logs_alert
|
2017-03-28 17:23:01 -04:00
|
|
|
{
|
|
|
|
"metadata" : {
|
|
|
|
"color" : "red"
|
|
|
|
},
|
|
|
|
"trigger" : {
|
|
|
|
"schedule" : {
|
|
|
|
"interval" : "5m"
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"input" : {
|
|
|
|
"search" : {
|
|
|
|
"request" : {
|
|
|
|
"indices" : "log-events",
|
|
|
|
"body" : {
|
|
|
|
"size" : 0,
|
|
|
|
"query" : { "match" : { "status" : "error" } }
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"condition" : {
|
2018-12-05 13:49:06 -05:00
|
|
|
"compare" : { "ctx.payload.hits.total.value" : { "gt" : 5 }}
|
2017-03-28 17:23:01 -04:00
|
|
|
},
|
|
|
|
"actions" : {
|
|
|
|
"email_administrator" : {
|
|
|
|
"throttle_period": "15m", <1>
|
|
|
|
"email" : { <2>
|
|
|
|
"to" : "sys.admino@host.domain",
|
2018-12-05 13:49:06 -05:00
|
|
|
"subject" : "Encountered {{ctx.payload.hits.total.value}} errors",
|
2017-03-28 17:23:01 -04:00
|
|
|
"body" : "Too many error in the system, see attached data",
|
|
|
|
"attachments" : {
|
|
|
|
"attached_data" : {
|
|
|
|
"data" : {
|
|
|
|
"format" : "json"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"priority" : "high"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
2019-09-09 12:35:50 -04:00
|
|
|
|
2017-03-28 17:23:01 -04:00
|
|
|
<1> There will be at least 15 minutes between subsequent `email_administrator`
|
|
|
|
action executions.
|
2019-09-30 13:18:50 -04:00
|
|
|
<2> See <<actions-email>> for more information.
|
2017-03-28 17:23:01 -04:00
|
|
|
|
|
|
|
You can also define a throttle period at the watch level. The watch-level
|
|
|
|
throttle period serves as the default throttle period for all of the actions
|
|
|
|
defined in the watch:
|
|
|
|
|
2019-09-09 12:35:50 -04:00
|
|
|
[source,console]
|
2017-03-28 17:23:01 -04:00
|
|
|
--------------------------------------------------
|
2018-12-08 13:57:16 -05:00
|
|
|
PUT _watcher/watch/log_event_watch
|
2017-03-28 17:23:01 -04:00
|
|
|
{
|
|
|
|
"trigger" : {
|
2017-11-09 10:13:56 -05:00
|
|
|
"schedule" : { "interval" : "5m" }
|
2017-03-28 17:23:01 -04:00
|
|
|
},
|
|
|
|
"input" : {
|
2017-11-09 10:13:56 -05:00
|
|
|
"search" : {
|
|
|
|
"request" : {
|
|
|
|
"indices" : "log-events",
|
|
|
|
"body" : {
|
|
|
|
"size" : 0,
|
|
|
|
"query" : { "match" : { "status" : "error" } }
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2017-03-28 17:23:01 -04:00
|
|
|
},
|
|
|
|
"condition" : {
|
2018-12-05 13:49:06 -05:00
|
|
|
"compare" : { "ctx.payload.hits.total.value" : { "gt" : 5 }}
|
2017-03-28 17:23:01 -04:00
|
|
|
},
|
|
|
|
"throttle_period" : "15m", <1>
|
|
|
|
"actions" : {
|
|
|
|
"email_administrator" : {
|
|
|
|
"email" : {
|
|
|
|
"to" : "sys.admino@host.domain",
|
2018-12-05 13:49:06 -05:00
|
|
|
"subject" : "Encountered {{ctx.payload.hits.total.value}} errors",
|
2017-03-28 17:23:01 -04:00
|
|
|
"body" : "Too many error in the system, see attached data",
|
|
|
|
"attachments" : {
|
|
|
|
"attached_data" : {
|
|
|
|
"data" : {
|
|
|
|
"format" : "json"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"priority" : "high"
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"notify_pager" : {
|
|
|
|
"webhook" : {
|
|
|
|
"method" : "POST",
|
|
|
|
"host" : "pager.service.domain",
|
|
|
|
"port" : 1234,
|
|
|
|
"path" : "/{{watch_id}}",
|
2018-12-05 13:49:06 -05:00
|
|
|
"body" : "Encountered {{ctx.payload.hits.total.value}} errors"
|
2017-03-28 17:23:01 -04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
<1> There will be at least 15 minutes between subsequent action executions
|
|
|
|
(applies to both `email_administrator` and `notify_pager` actions)
|
|
|
|
|
|
|
|
If you do not define a throttle period at the action or watch level, the global
|
|
|
|
default throttle period is applied. Initially, this is set to 5 seconds. To
|
|
|
|
change the global default, configure the `xpack.watcher.execution.default_throttle_period`
|
|
|
|
setting in `elasticsearch.yml`:
|
|
|
|
|
|
|
|
[source,yaml]
|
|
|
|
--------------------------------------------------
|
|
|
|
xpack.watcher.execution.default_throttle_period: 15m
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
{watcher} also supports acknowledgement-based throttling. You can acknowledge a
|
2019-09-30 13:18:50 -04:00
|
|
|
watch using the <<watcher-api-ack-watch,ack watch API>> to prevent the
|
2017-06-27 20:16:51 -04:00
|
|
|
watch actions from being executed again while the watch condition remains `true`.
|
|
|
|
This essentially tells {watcher} "I received the notification and I'm handling
|
|
|
|
it, please do not notify me about this error again". An acknowledged watch action
|
|
|
|
remains in the `acked` state until the watch's condition evaluates to `false`.
|
|
|
|
When that happens, the action's state changes to `awaits_successful_execution`.
|
|
|
|
|
2019-09-30 13:18:50 -04:00
|
|
|
To acknowledge an action, you use the <<watcher-api-ack-watch,ack watch API>>:
|
2017-03-28 17:23:01 -04:00
|
|
|
|
2019-09-09 12:35:50 -04:00
|
|
|
[source,console]
|
2017-03-28 17:23:01 -04:00
|
|
|
----------------------------------------------------------------------
|
2018-12-08 13:57:16 -05:00
|
|
|
POST _watcher/watch/<id>/_ack/<action_ids>
|
2017-03-28 17:23:01 -04:00
|
|
|
----------------------------------------------------------------------
|
|
|
|
// TEST[skip:https://github.com/elastic/x-plugins/issues/2513]
|
|
|
|
|
|
|
|
Where `<id>` is the id of the watch and `<action_ids>` is a comma-separated list
|
|
|
|
of the action ids you want to acknowledge. To acknowledge all actions, omit the
|
|
|
|
`actions` parameter.
|
|
|
|
|
|
|
|
The following diagram illustrates the throttling decisions made for each action
|
|
|
|
of a watch during its execution:
|
|
|
|
|
2017-04-17 22:58:19 -04:00
|
|
|
image::images/action-throttling.jpg[align="center"]
|
2017-03-28 17:23:01 -04:00
|
|
|
|
2019-09-11 19:32:47 -04:00
|
|
|
[role="xpack"]
|
2019-07-03 05:04:57 -04:00
|
|
|
[[action-foreach]]
|
|
|
|
=== Running an action for each element in an array
|
|
|
|
|
|
|
|
You can use the `foreach` field in an action to trigger the configured action
|
|
|
|
for every element within that array.
|
|
|
|
|
2019-08-27 17:57:20 -04:00
|
|
|
In order to protect from long running watches, you can use the `max_iterations`
|
|
|
|
field to limit the maximum amount of runs that each watch executes. If this limit
|
|
|
|
is reached, the execution is gracefully stopped. If not set, this field defaults
|
|
|
|
to one hundred.
|
2019-07-03 05:04:57 -04:00
|
|
|
|
2019-09-09 12:35:50 -04:00
|
|
|
[source,console]
|
2019-07-03 05:04:57 -04:00
|
|
|
--------------------------------------------------
|
|
|
|
PUT _watcher/watch/log_event_watch
|
|
|
|
{
|
|
|
|
"trigger" : {
|
|
|
|
"schedule" : { "interval" : "5m" }
|
|
|
|
},
|
|
|
|
"input" : {
|
|
|
|
"search" : {
|
|
|
|
"request" : {
|
|
|
|
"indices" : "log-events",
|
|
|
|
"body" : {
|
|
|
|
"query" : { "match" : { "status" : "error" } }
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"condition" : {
|
|
|
|
"compare" : { "ctx.payload.hits.total" : { "gt" : 0 } }
|
|
|
|
},
|
|
|
|
"actions" : {
|
|
|
|
"log_hits" : {
|
|
|
|
"foreach" : "ctx.payload.hits.hits", <1>
|
2019-08-27 17:57:20 -04:00
|
|
|
"max_iterations" : 500,
|
2019-07-03 05:04:57 -04:00
|
|
|
"logging" : {
|
|
|
|
"text" : "Found id {{ctx.payload._id}} with field {{ctx.payload._source.my_field}}"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
<1> The logging statement will be executed for each of the returned search hits.
|
|
|
|
|
2019-09-11 19:32:47 -04:00
|
|
|
[role="xpack"]
|
2017-11-09 10:13:56 -05:00
|
|
|
[[action-conditions]]
|
|
|
|
=== Adding conditions to actions
|
|
|
|
|
|
|
|
When a watch is triggered, its condition determines whether or not to execute the
|
|
|
|
watch actions. Within each action, you can also add a condition per action. These
|
|
|
|
additional conditions enable a single alert to execute different actions depending
|
2019-01-07 08:44:12 -05:00
|
|
|
on a their respective conditions. The following watch would always send an email, when
|
2017-11-09 10:13:56 -05:00
|
|
|
hits are found from the input search, but only trigger the `notify_pager` action when
|
|
|
|
there are more than 5 hits in the search result.
|
|
|
|
|
2019-09-09 12:35:50 -04:00
|
|
|
[source,console]
|
2017-11-09 10:13:56 -05:00
|
|
|
--------------------------------------------------
|
2018-12-08 13:57:16 -05:00
|
|
|
PUT _watcher/watch/log_event_watch
|
2017-11-09 10:13:56 -05:00
|
|
|
{
|
|
|
|
"trigger" : {
|
|
|
|
"schedule" : { "interval" : "5m" }
|
|
|
|
},
|
|
|
|
"input" : {
|
|
|
|
"search" : {
|
|
|
|
"request" : {
|
|
|
|
"indices" : "log-events",
|
|
|
|
"body" : {
|
|
|
|
"size" : 0,
|
|
|
|
"query" : { "match" : { "status" : "error" } }
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"condition" : {
|
2018-12-05 13:49:06 -05:00
|
|
|
"compare" : { "ctx.payload.hits.total.value" : { "gt" : 0 } }
|
2017-11-09 10:13:56 -05:00
|
|
|
},
|
|
|
|
"actions" : {
|
|
|
|
"email_administrator" : {
|
|
|
|
"email" : {
|
|
|
|
"to" : "sys.admino@host.domain",
|
2018-12-05 13:49:06 -05:00
|
|
|
"subject" : "Encountered {{ctx.payload.hits.total.value}} errors",
|
2017-11-09 10:13:56 -05:00
|
|
|
"body" : "Too many error in the system, see attached data",
|
|
|
|
"attachments" : {
|
|
|
|
"attached_data" : {
|
|
|
|
"data" : {
|
|
|
|
"format" : "json"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"priority" : "high"
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"notify_pager" : {
|
|
|
|
"condition": { <1>
|
2018-12-05 13:49:06 -05:00
|
|
|
"compare" : { "ctx.payload.hits.total.value" : { "gt" : 5 } }
|
2017-11-09 10:13:56 -05:00
|
|
|
},
|
|
|
|
"webhook" : {
|
|
|
|
"method" : "POST",
|
|
|
|
"host" : "pager.service.domain",
|
|
|
|
"port" : 1234,
|
|
|
|
"path" : "/{{watch_id}}",
|
2018-12-05 13:49:06 -05:00
|
|
|
"body" : "Encountered {{ctx.payload.hits.total.value}} errors"
|
2017-11-09 10:13:56 -05:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
<1> A `condition` that only applies to the `notify_pager` action, which
|
|
|
|
restricts its execution to when the condition succeeds (at least 5 hits in this case).
|
|
|
|
|
2017-03-28 17:23:01 -04:00
|
|
|
include::actions/email.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/webhook.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/index.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/logging.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/slack.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/pagerduty.asciidoc[]
|
|
|
|
|
|
|
|
include::actions/jira.asciidoc[]
|
|
|
|
|
|
|
|
[float]
|
|
|
|
[[actions-ssl-openjdk]]
|
|
|
|
=== Using SSL/TLS with OpenJDK
|
|
|
|
|
|
|
|
As each distributor is free to choose how to package OpenJDK, it may happen,
|
|
|
|
that even despite the exact same version, an OpenJDK distribution contains
|
|
|
|
different parts under different Linux distributions.
|
|
|
|
|
|
|
|
This can lead to issues with any action or input that uses TLS, like the `jira`,
|
2019-02-25 17:08:46 -05:00
|
|
|
`pagerduty`, `slack`, or `webhook` one, because of missing CA certs.
|
2017-03-28 17:23:01 -04:00
|
|
|
If you encounter TLS errors, when writing watches that connect to TLS endpoints,
|
|
|
|
you should try to upgrade to the latest available OpenJDK distribution for your
|
|
|
|
platform and if that does not help, try to upgrade to Oracle JDK.
|