[[how-watcher-works]]
== How {watcher} Works

You <<watch-definition, add watches>> to automatically perform an action when
certain conditions are met. The conditions are generally based on data you've
loaded into the watch, also known as the _Watch Payload_. This payload can be
loaded from different sources: Elasticsearch, an external HTTP service, or
even a combination of the two.

For example, you could configure a watch to send an email to the sysadmin when a
search in the logs data indicates that there are too many 503 errors in the last
5 minutes.

This topic describes the elements of a watch and how watches operate.

[float]
[[watch-definition]]
=== Watch Definition

A watch consists of a _trigger_, _input_, _condition_, and _actions_. The actions
define what needs to be done once the condition is met. In addition, you can
define _conditions_ and _transforms_ to process and prepare the watch payload before
executing the actions.

<<trigger,Trigger>>::
Determines when the watch is checked. A watch must have a trigger.

<<input,Input>>::
Loads data into the watch payload. If no input is specified, an empty payload is
loaded.

<<condition,Condition>>::
Controls whether the watch actions are executed. If no condition is specified,
the condition defaults to `always`.

<<transform,Transform>>::
Processes the watch payload to prepare it for the watch actions. You can define
transforms at the watch level or define action-specific transforms. Optional.

<<actions,Actions>>::
Specify what happens when the watch condition is met.

[[watch-definition-example]]
For example, the following snippet shows a
{ref}/watcher-api-put-watch.html[Put Watch] request that defines a watch that
looks for log error events:

[source,js]
--------------------------------------------------
PUT _xpack/watcher/watch/log_errors
{
  "metadata" : { <1>
    "color" : "red"
  },
  "trigger" : { <2>
    "schedule" : {
      "interval" : "5m"
    }
  },
  "input" : { <3>
    "search" : {
      "request" : {
        "indices" : "log-events",
        "body" : {
          "size" : 0,
          "query" : { "match" : { "status" : "error" } }
        }
      }
    }
  },
  "condition" : { <4>
    "compare" : { "ctx.payload.hits.total" : { "gt" : 5 }}
  },
  "transform" : { <5>
    "search" : {
      "request" : {
        "indices" : "log-events",
        "body" : {
          "query" : { "match" : { "status" : "error" } }
        }
      }
    }
  },
  "actions" : { <6>
    "my_webhook" : {
      "webhook" : {
        "method" : "POST",
        "host" : "mylisteninghost",
        "port" : 9200,
        "path" : "/{{ctx.watch_id}}",
        "body" : "Encountered {{ctx.payload.hits.total}} errors"
      }
    },
    "email_administrator" : {
      "email" : {
        "to" : "sys.admino@host.domain",
        "subject" : "Encountered {{ctx.payload.hits.total}} errors",
        "body" : "Too many errors in the system, see attached data",
        "attachments" : {
          "attached_data" : {
            "data" : {
              "format" : "json"
            }
          }
        },
        "priority" : "high"
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> Metadata - You can attach optional static metadata to a watch.
<2> Trigger - This schedule trigger executes the watch every 5 minutes.
<3> Input - This input searches for errors in the `log-events` index and
    loads the response into the watch payload.
<4> Condition - This condition checks to see if there are more than 5 error
    events (hits in the search response). If there are, execution
    continues for all `actions`.
<5> Transform - If the watch condition is met, this transform loads all of the
    errors into the watch payload by searching for the errors using
    the default search type, `query_then_fetch`. All of the watch
    actions have access to this payload.
<6> Actions - This watch has two actions. The `my_webhook` action notifies a
    third-party system about the problem. The `email_administrator`
    action sends a high priority email to the system administrator.
    The watch payload that contains the errors is attached to the
    email.

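By contrast with the full example above, the following sketch shows how small a
watch can be when it relies on the defaults described in this section. It omits
the input and the condition, so the payload is empty and the condition defaults
to `always`. The watch ID `heartbeat`, the action name `log_heartbeat`, and the
10 second interval are hypothetical values chosen only for illustration:

[source,js]
--------------------------------------------------
PUT _xpack/watcher/watch/heartbeat
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "actions" : {
    "log_heartbeat" : {
      "logging" : {
        "text" : "Watch {{ctx.watch_id}} executed at {{ctx.execution_time}}"
      }
    }
  }
}
--------------------------------------------------

Because no condition is defined, the `logging` action is eligible to run on
every trigger, subject to throttling.
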
[float]
[[watch-execution]]
=== Watch Execution

[[schedule-scheduler]]
When you add a watch, {watcher} immediately registers its trigger with the
appropriate trigger engine. Watches that have a `schedule` trigger are
registered with the `scheduler` trigger engine.

The scheduler tracks time and triggers watches according to their schedules.
A scheduler that is bound to the {watcher} lifecycle runs on each node that
contains one of the `.watches` shards. Even though all primaries and replicas
are taken into account when a watch is triggered, {watcher} ensures that each
watch is triggered on only one of those shards. The more replica shards you
add, the more widely watch execution can be distributed. If you add or remove
replicas, all watches need to be reloaded. If a shard is relocated, the
primary and all replicas of that particular shard reload.

Because watches are executed on the nodes that hold the watch shards, you can
create dedicated watcher nodes by using shard allocation filtering.

You could configure nodes with a dedicated `node.attr.watcher: true` property and
then configure the `.watches` index like this:

[source,js]
------------------------
PUT .watches/_settings
{
  "index.routing.allocation.include.watcher": "true"
}
------------------------
// CONSOLE
// TEST[skip:indexes don't assign]

When the {watcher} service is stopped, the scheduler stops with it. Trigger
engines use a separate thread pool from the one used to execute watches.

When a watch is triggered, {watcher} queues it up for execution. A `watch_record`
document is created and added to the watch history and the watch's status is set
to `awaits_execution`.

When execution starts, {watcher} creates a watch execution context for the watch.
The execution context provides scripts and templates with access to the watch
metadata, payload, watch ID, execution time, and trigger information. For more
information, see <<watch-execution-context, Watch Execution Context>>.

During the execution process, {watcher}:

. Loads the input data as the payload in the watch execution context. This makes
the data available to all subsequent steps in the execution process. This step
is controlled by the input of the watch.
. Evaluates the watch condition to determine whether or not to continue processing
the watch. If the condition is met (evaluates to `true`), processing advances
to the next step. If it is not met (evaluates to `false`), execution of the watch
stops.
. Applies transforms to the watch payload (if needed).
. Executes the watch actions, provided the condition is met and the watch is not
<<watch-acknowledgment-throttling, throttled>>.

When the watch execution finishes, the execution result is recorded as a
_Watch Record_ in the watch history. The watch record includes the execution
time and duration, whether the watch condition was met, and the status of each
action that was executed.

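To inspect the watch records for a particular watch, you can search the watch
history indices. The following request is a sketch that assumes the default
`.watcher-history*` index naming and the `log_errors` watch ID used elsewhere in
this topic. It returns the most recent records first:

[source,js]
--------------------------------------------------
GET .watcher-history*/_search
{
  "query" : { "match" : { "watch_id" : "log_errors" } },
  "sort" : [
    { "result.execution_time" : "desc" }
  ]
}
--------------------------------------------------
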
The following diagram shows the watch execution process:

image::images/watch-execution.jpg[align="center"]

[float]
[[watch-acknowledgment-throttling]]
=== Watch Acknowledgment and Throttling

{watcher} supports both time-based and acknowledgment-based throttling. This
enables you to prevent actions from being repeatedly executed for the same event.

By default, {watcher} uses time-based throttling with a throttle period of 5
seconds. This means that if a watch is executed every second, its actions are
performed a maximum of once every 5 seconds, even when the condition is always
met. You can configure the throttle period on a per-action basis or at the
watch level.

Acknowledgment-based throttling enables you to tell {watcher} not to send any more
notifications about a watch as long as its condition is met. Once the condition
evaluates to `false`, the acknowledgment is cleared and {watcher} resumes executing
the watch actions normally.

For more information, see <<actions-ack-throttle>>.

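As a rough illustration of the two levels at which a throttle period can be set,
the following fragment uses the `throttle_period` field both on the watch and on
a single action. The periods and the `email_administrator` action name are
example values, and the rest of the watch definition is elided:

[source,js]
--------------------------------------------------
{
  "throttle_period" : "15m", <1>
  "actions" : {
    "email_administrator" : {
      "throttle_period" : "1h", <2>
      "email" : { ... }
    }
  }
}
--------------------------------------------------
<1> Watch-level throttle period, used by actions that do not define their own.
<2> Action-level throttle period, which applies only to this action.

For acknowledgment-based throttling, a request along the following lines
acknowledges a watch, assuming a watch with the ID `log_errors` exists:

[source,js]
--------------------------------------------------
PUT _xpack/watcher/watch/log_errors/_ack
--------------------------------------------------
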
[float]
[[watch-active-state]]
=== Watch Active State

By default, when you add a watch it is immediately set to the _active_ state,
registered with the appropriate trigger engine, and executed according
to its configured trigger.

You can also set a watch to the _inactive_ state. Inactive watches are not
registered with a trigger engine and can never be triggered.

To set a watch to the inactive state when you create it, set the
{ref}/watcher-api-put-watch.html#watcher-api-put-watch-active-state[`active`]
parameter to `false`. To deactivate an existing watch, use the
{ref}/watcher-api-deactivate-watch.html[Deactivate Watch API]. To reactivate an
inactive watch, use the
{ref}/watcher-api-activate-watch.html[Activate Watch API].

NOTE: You can use the {ref}/watcher-api-execute-watch.html[Execute Watch API]
to force the execution of a watch even when it is inactive.

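The following requests sketch one possible sequence for a maintenance window,
assuming a watch with the ID `log_errors` already exists. The first request
deactivates the watch, the second forces a single execution while it is
inactive, and the third reactivates it:

[source,js]
--------------------------------------------------
PUT _xpack/watcher/watch/log_errors/_deactivate

POST _xpack/watcher/watch/log_errors/_execute

PUT _xpack/watcher/watch/log_errors/_activate
--------------------------------------------------
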
Deactivating watches is useful in a variety of situations. For example, if you
have a watch that monitors an external system and you need to take that system
down for maintenance, you can deactivate the watch to prevent it from falsely
reporting availability issues during the maintenance window.

Deactivating a watch also enables you to keep it around for future use without
deleting it from the system.

[float]
[[scripts-templates]]
=== Scripts and Templates

You can use scripts and templates when defining a watch. Scripts and templates
can reference elements in the watch execution context, including the watch payload.
The execution context defines variables you can use in a script and parameter
placeholders in a template.

{watcher} uses the Elasticsearch script infrastructure, which supports
<<inline-templates-scripts,inline>> and <<stored-templates-scripts, stored>>
templates and scripts. Scripts and templates are compiled
and cached by Elasticsearch to optimize recurring execution. Autoloading is also
supported. For more information, see {ref}/modules-scripting.html[Scripting] and
{ref}/modules-scripting-using.html[How to use scripts] in the Elasticsearch
Reference.

[float]
[[watch-execution-context]]
==== Watch Execution Context

The following snippet shows the basic structure of the _Watch Execution Context_:

[source,js]
----------------------------------------------------------------------
{
  "ctx" : {
    "metadata" : { ... }, <1>
    "payload" : { ... }, <2>
    "watch_id" : "<id>", <3>
    "execution_time" : "20150220T00:00:10Z", <4>
    "trigger" : { <5>
      "triggered_time" : "20150220T00:00:10Z",
      "scheduled_time" : "20150220T00:00:00Z"
    },
    "vars" : { ... } <6>
  }
}
----------------------------------------------------------------------
<1> Any static metadata specified in the watch definition.
<2> The current watch payload.
<3> The id of the executing watch.
<4> A timestamp that shows when the watch execution started.
<5> Information about the trigger event. For a `schedule` trigger, this
    consists of the `triggered_time` (when the watch was triggered)
    and the `scheduled_time` (when the watch was scheduled to be triggered).
<6> Dynamic variables that can be set and accessed by different constructs
    during the execution. These variables are scoped to a single execution
    (that is, they're not persisted and can't be used between different
    executions of the same watch).

[float]
[[scripts]]
==== Using Scripts

You can use scripts to define <<condition-script, conditions>> and
<<transform-script, transforms>>. The default scripting language is
{ref}/modules-scripting-painless.html[Painless].

NOTE: Starting with 5.0, Elasticsearch is shipped with the new
      {ref}/modules-scripting-painless.html[Painless] scripting language.
      Painless was created and designed specifically for use in Elasticsearch.
      Beyond providing an extensive feature set, its biggest trait is that it's
      properly sandboxed and safe to use anywhere in the system (including in
      {watcher}) without the need to enable dynamic scripting.

Scripts can reference any of the values in the watch execution context or values
explicitly passed through script parameters.

For example, if the watch metadata contains a `color` field
(e.g. `"metadata" : {"color": "red"}`), you can access its value via the
`ctx.metadata.color` variable. If you pass in a `color` parameter as part of the
condition or transform definition (e.g. `"params" : {"color": "red"}`), you can
access its value via the `color` variable.

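As a small sketch of a script that reads from the execution context, the
following condition combines a payload value with a metadata value. It assumes
the payload comes from a search input (so that `ctx.payload.hits.total` is
present) and that the watch metadata defines a `color` field. The threshold of
5 is an arbitrary example value:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : {
    "source" : "return ctx.payload.hits.total > 5 && ctx.metadata.color == 'red'"
  }
}
----------------------------------------------------------------------
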
[float]
[[templates]]
==== Using Templates

You use templates to define dynamic content for a watch. At execution time,
templates pull in data from the watch execution context. For example, you can use
a template to populate the `subject` field for an `email` action with data stored
in the watch payload. Templates can also access values explicitly passed through
template parameters.

You specify templates using the https://mustache.github.io[Mustache] templating
language.

For example, the following snippet shows how templates enable dynamic subjects
in sent emails:

[source,js]
----------------------------------------------------------------------
{
  "actions" : {
    "email_notification" : {
      "email" : {
        "subject" : "{{ctx.metadata.color}} alert"
      }
    }
  }
}
----------------------------------------------------------------------

[float]
[[inline-templates-scripts]]
===== Inline Templates and Scripts

To define an inline template or script, you simply specify it directly in the
value of a field. For example, the following snippet configures the subject of
the `email` action using an inline template that references the `color` value in
the context metadata.

[source,js]
----------------------------------------------------------------------
"actions" : {
  "email_notification" : {
    "email" : {
      "subject" : "{{ctx.metadata.color}} alert"
    }
  }
}
----------------------------------------------------------------------

For a script, you simply specify the inline script as the value of the `script`
field. For example:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : "return true"
}
----------------------------------------------------------------------

You can also explicitly specify the inline type by using a formal object
definition as the field value. For example:

[source,js]
----------------------------------------------------------------------
"actions" : {
  "email_notification" : {
    "email" : {
      "subject" : {
        "source" : "{{ctx.metadata.color}} alert"
      }
    }
  }
}
----------------------------------------------------------------------

The formal object definition for a script would be:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : {
    "source": "return true"
  }
}
----------------------------------------------------------------------

[float]
[[stored-templates-scripts]]
===== Stored Templates and Scripts

If you {ref}/modules-scripting-using.html#modules-scripting-stored-scripts[store]
your templates and scripts, you can reference them by id.

To reference a stored script or template, you use the formal object definition
and specify its id in the `id` field. For example, the following snippet
references the `email_notification_subject` template:

[source,js]
----------------------------------------------------------------------
{
  ...
  "actions" : {
    "email_notification" : {
      "email" : {
        "subject" : {
          "id" : "email_notification_subject",
          "params" : {
            "color" : "red"
          }
        }
      }
    }
  }
}
----------------------------------------------------------------------
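
For the reference above to resolve, the template has to be stored first. The
following request is a sketch of how that might look with the Elasticsearch
stored scripts API. It assumes a version that accepts the `source` field for
stored scripts, and the template body `{{color}} alert` is a hypothetical
example that reads the `color` parameter passed in `params`:

[source,js]
----------------------------------------------------------------------
POST _scripts/email_notification_subject
{
  "script" : {
    "lang" : "mustache",
    "source" : "{{color}} alert"
  }
}
----------------------------------------------------------------------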