[[getting-started]]
== Getting Started
This getting started guide walks you through installing Watcher and creating your first watches,
and introduces the building blocks you'll use to create custom watches. You can install Watcher
on nodes running Elasticsearch 1.5 or later.
To install and run Watcher:
. Run `bin/plugin install` from `ES_HOME` to install the License plugin:
+
[source,shell]
----------------------------------------------------------
bin/plugin install elasticsearch/license/latest
----------------------------------------------------------
+
NOTE: You need to install the License and Watcher plugins on each node in your cluster.
. Run `bin/plugin install` to install the Watcher plugin:
+
[source,shell]
----------------------------------------------------------
bin/plugin install elasticsearch/watcher/latest
----------------------------------------------------------
+
NOTE: If you are using a <<package-installation, DEB/RPM distribution>> of Elasticsearch,
run the installation with superuser permissions. To perform an offline installation,
<<offline-installation, download the Watcher binaries>>.
. Start Elasticsearch.
+
[source,shell]
----------------------------------------------------------
bin/elasticsearch
----------------------------------------------------------
. To verify that Watcher is set up, call the Watcher `_stats` API:
+
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/_watcher/stats?pretty'
--------------------------------------------------
+
You haven't set up any watches yet, so the `watch_count` is zero and the `execution_thread_pool` queue
is empty:
+
[source,js]
--------------------------------------------------
{
  "watcher_state": "started",
  "watch_count": 0,
  "execution_thread_pool": {
    "queue_size": 0,
    "max_size": 0
  }
}
--------------------------------------------------
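
If you script your setup, you can automate the same check. The following is a minimal Python sketch, assuming the response has the shape shown above. The helper name `watcher_is_ready` is hypothetical and is not part of Watcher.

```python
import json


def watcher_is_ready(stats_json: str) -> bool:
    """Return True if a `_watcher/stats` response reports a started Watcher."""
    stats = json.loads(stats_json)
    return stats.get("watcher_state") == "started"


# Sample response copied from the output shown in this guide.
sample = """{
  "watcher_state": "started",
  "watch_count": 0,
  "execution_thread_pool": { "queue_size": 0, "max_size": 0 }
}"""

print(watcher_is_ready(sample))
```

The sketch only parses text that you pass in, so it runs without a cluster. In practice you would feed it the body returned by the `curl` command above.
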
Ready to start building watches? Choose one of the following scenarios:
* <<watch-log-data, Watch Log Data for Errors>>
* <<watch-cluster-status, Watch Your Cluster Health>>
[[watch-log-data]]
=== Watch Log Data for Errors
You can easily configure a watch that periodically checks your log data for error conditions:
* <<log-add-input, Schedule the watch and define an input>> to search your log data for error events.
* <<log-add-condition, Add a condition>> that checks to see if any errors were found.
* <<log-take-action, Take action>> if there are any errors.
[float]
[[log-add-input]]
==== Schedule the Watch and Add an Input
A watch <<trigger-schedule, schedule>> controls how often a watch is triggered. The watch
<<input, input>> gets the data that you want to evaluate.
To periodically search your log data and load the results into the watch, you use an
<<schedule-interval, interval>> schedule and a <<input-search, search>> input. For example, the
following watch searches the `logs` index for errors every 10 seconds:
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/log_error_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" } <1>
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logs" ],
        "body" : {
          "query" : {
            "match" : { "message": "error" }
          }
        }
      }
    }
  }
}'
--------------------------------------------------
<1> Schedules are typically configured to run less frequently. This example sets the interval to
10 seconds so you can easily see the watches being triggered. Since this watch runs so frequently,
don't forget to <<log-delete, delete the watch>> when you're done experimenting.
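
If you prefer to generate the watch body rather than write it by hand, the structure above maps directly onto nested dictionaries. The following Python sketch builds the same trigger and search input. The function name `build_log_error_watch` and its parameters are illustrative assumptions, not a Watcher API.

```python
import json


def build_log_error_watch(interval: str = "10s", index: str = "logs",
                          term: str = "error") -> dict:
    """Assemble the trigger and search input used by `log_error_watch`."""
    return {
        "trigger": {"schedule": {"interval": interval}},
        "input": {
            "search": {
                "request": {
                    "indices": [index],
                    "body": {"query": {"match": {"message": term}}},
                }
            }
        },
    }


# Serialize the body; you would send it with PUT to
# /_watcher/watch/log_error_watch, as the curl command does.
print(json.dumps(build_log_error_watch(), indent=2))
```

Keeping the interval as a parameter makes it easy to switch from the 10-second demo value to a less frequent schedule later.
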
If you check the watch history, you'll see that the watch is being triggered every 10 seconds.
However, the search isn't returning any results, so nothing is loaded into the watch payload.
For example, the following snippet gets the last ten watch executions (also known as watch records)
from the watch history:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search?pretty' -d '{
  "sort" : [
    { "result.execution_time" : "desc" }
  ]
}'
--------------------------------------------------------------------------------
[float]
[[log-add-condition]]
==== Add a Condition
A <<condition, condition>> evaluates the data you've loaded into the watch and determines if any
action is required. Since you've defined an input that loads log errors into the watch, you can
define a condition that checks to see if any errors were found.
For example, you could add a condition that simply checks to see if the search input returned
any hits.
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/log_error_watch' -d '{
  "trigger" : { "schedule" : { "interval" : "10s" } },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logs" ],
        "body" : {
          "query" : {
            "match" : { "message": "error" }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 }} <1>
  }
}'
--------------------------------------------------
<1> The <<condition-compare, compare>> condition lets you easily compare against values in the
execution context without enabling dynamic scripting.
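
To build intuition for what the compare condition does, the following Python sketch mimics it on a hand-written payload. It is only an illustration under simplifying assumptions: the helpers `resolve_path` and `compare_gt` are hypothetical, and Watcher's own implementation may differ.

```python
from functools import reduce


def resolve_path(ctx: dict, path: str):
    """Walk a dotted path such as 'ctx.payload.hits.total' through nested dicts."""
    keys = path.split(".")[1:]  # drop the leading 'ctx' prefix
    return reduce(lambda node, key: node[key], keys, ctx)


def compare_gt(ctx: dict, path: str, threshold) -> bool:
    """Mimic {"compare": {path: {"gt": threshold}}} for illustration only."""
    return resolve_path(ctx, path) > threshold


# Hypothetical execution contexts: one with no hits, one with a single hit.
empty = {"payload": {"hits": {"total": 0}}}
one_error = {"payload": {"hits": {"total": 1}}}

print(compare_gt(empty, "ctx.payload.hits.total", 0))      # False
print(compare_gt(one_error, "ctx.payload.hits.total", 0))  # True
```
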
The condition result is recorded as part of the `watch_record` each time the watch executes. Since
there are currently no log events in the `logs` index, the watch condition will not be met. If you
search the history for watch executions where the condition was met during the last 10 seconds,
there are no hits:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search?pretty' -d '{
  "query" : {
    "bool" : {
      "must" : [
        { "match" : { "result.condition.met" : true }},
        { "range" : { "result.execution_time" : { "from" : "now-10s"}}}
      ]
    }
  }
}'
--------------------------------------------------------------------------------
For the condition in the example above to evaluate to `true`, you need to add an event to the
`logs` index that contains an error.
For example, the following snippet adds a 404 error to the `logs` index:
[source,js]
--------------------------------------------------
curl -XPOST 'http://localhost:9200/logs/event' -d '{
  "timestamp" : "2015-05-17T18:12:07.613Z",
  "request" : "GET index.html",
  "status_code" : 404,
  "message" : "Error: File not found"
}'
--------------------------------------------------
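
When you experiment, you may want test events that carry the current time rather than a fixed timestamp. The following Python sketch builds an event with the same fields as the example above. The function name `build_log_event` and the optional `now` parameter are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def build_log_event(request: str, status_code: int, message: str,
                    now: datetime = None) -> dict:
    """Build a log event with a UTC timestamp in millisecond precision."""
    now = now or datetime.now(timezone.utc)
    # Format as e.g. 2015-05-17T18:12:07.613Z, matching the sample event.
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"
    return {
        "timestamp": timestamp,
        "request": request,
        "status_code": status_code,
        "message": message,
    }


event = build_log_event("GET index.html", 404, "Error: File not found")
# The JSON below is what you would POST to /logs/event.
print(json.dumps(event, indent=2))
```
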
Once you add this event, the condition evaluates to `true` the next time the watch executes.
You can verify this by searching the watch history:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search?pretty' -d '{
  "query" : {
    "bool" : {
      "must" : [
        { "match" : { "result.condition.met" : true }},
        { "range" : { "result.execution_time" : { "from" : "now-10s"}}}
      ]
    }
  }
}'
--------------------------------------------------------------------------------
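
If you want to inspect the search response programmatically, you can filter the returned watch records yourself. The following Python sketch assumes each hit's `_source` nests `result.execution_time` and `result.condition.met`, as the field paths in the queries above suggest. The sample response is hand-written for illustration and is not real Watcher output.

```python
def met_executions(search_response: dict) -> list:
    """Return the execution times of watch records whose condition was met."""
    records = (hit["_source"] for hit in search_response["hits"]["hits"])
    return [
        record["result"]["execution_time"]
        for record in records
        if record["result"]["condition"]["met"]
    ]


# Hypothetical response containing only the fields this sketch relies on.
sample = {"hits": {"hits": [
    {"_source": {"result": {"execution_time": "2015-05-17T18:12:10.000Z",
                            "condition": {"met": True}}}},
    {"_source": {"result": {"execution_time": "2015-05-17T18:12:00.000Z",
                            "condition": {"met": False}}}},
]}}

print(met_executions(sample))
```
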
[float]
[[log-take-action]]
==== Take Action
Recording `watch_records` in the watch history is nice, but the real power of Watcher is being able
to do something when the watch condition is met. The watch's <<actions, actions>> define what to
do when the watch condition evaluates to `true`: you can send emails, call third-party webhooks,
write documents to an Elasticsearch index, or log messages to the standard Elasticsearch log file.
For example, you could add an action to write a message to the Elasticsearch log when an error is
detected.
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/log_error_watch' -d '{
  "trigger" : { "schedule" : { "interval" : "10s" } },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logs" ],
        "body" : {
          "query" : {
            "match" : { "message": "error" }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 }}
  },
  "actions" : {
    "log_error" : {
      "logging" : {
        "text" : "Found {{ctx.payload.hits.total}} errors in the logs"
      }
    }
  }
}'
--------------------------------------------------
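
The placeholder in the `text` field is filled in from the execution context when the action runs. The following Python sketch shows the idea with a deliberately simplified stand-in for the template engine. It handles only plain dotted `ctx` lookups, and the `render` helper is hypothetical.

```python
import re


def render(template: str, ctx: dict) -> str:
    """Replace each {{ctx.a.b}} placeholder with the value found in ctx."""
    def lookup(match):
        node = ctx
        for key in match.group(1).split(".")[1:]:  # skip the 'ctx' prefix
            node = node[key]
        return str(node)

    return re.sub(r"\{\{\s*(ctx(?:\.\w+)+)\s*\}\}", lookup, template)


# Hypothetical context in which the search input returned three hits.
ctx = {"payload": {"hits": {"total": 3}}}
print(render("Found {{ctx.payload.hits.total}} errors in the logs", ctx))
# prints: Found 3 errors in the logs
```
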
[float]
[[log-delete]]
==== Delete the Watch
Since the `log_error_watch` is configured to run every 10 seconds, make sure you delete it when
you're done experimenting. Otherwise, the noise from this sample watch will make it hard to see
what else is going on in your watch history and log file.
To remove the watch, use the <<api-rest-delete-watch, DELETE watch>> API:
[source,js]
--------------------------------------------------
curl -XDELETE 'http://localhost:9200/_watcher/watch/log_error_watch'
--------------------------------------------------
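
If you clean up from a script instead of the command line, the same request can be built with Python's standard library. This is a minimal sketch that assumes the node listens on `localhost:9200`, as in the examples above. The helper name `delete_watch_request` is hypothetical.

```python
import urllib.request


def delete_watch_request(watch_id: str,
                         base_url: str = "http://localhost:9200"):
    """Build (but do not send) the DELETE request for a watch."""
    url = f"{base_url}/_watcher/watch/{watch_id}"
    return urllib.request.Request(url, method="DELETE")


req = delete_watch_request("log_error_watch")
print(req.get_method(), req.full_url)

# Sending is left commented out so the sketch runs without a cluster:
# urllib.request.urlopen(req)
```
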
[[watch-cluster-status]]
=== Watch Your Cluster Health
You can easily configure a basic watch to monitor the health of your Elasticsearch cluster:
* <<health-add-input, Schedule the watch and define an input>> that gets the cluster health status.
* <<health-add-condition, Add a condition>> that evaluates the health status to determine if action
is required.
* <<health-take-action, Take action>> if the cluster is RED.
[float]
[[health-add-input]]
==== Schedule the Watch and Add an Input
A watch <<trigger-schedule, schedule>> controls how often a watch is triggered. The watch
<<input, input>> gets the data that you want to evaluate.
The simplest way to define a schedule is to specify an interval. For example, the following
schedule runs every 10 seconds:
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/cluster_health_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" } <1>
  }
}'
--------------------------------------------------
<1> Schedules are typically configured to run less frequently. This example sets the interval to
10 seconds so you can easily see the watches being triggered. Since this watch runs so frequently,
don't forget to <<health-delete, delete the watch>> when you're done experimenting.
To get the status of your cluster, you can call the Elasticsearch
{ref}/cluster-health.html[cluster health] API:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/_cluster/health?pretty'
--------------------------------------------------
To load the health status into your watch, you simply add an <<input-http, HTTP input>> that calls
the cluster health API:
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/cluster_health_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  }
}'
--------------------------------------------------
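
The `host`, `port`, and `path` fields of the HTTP input together describe the same endpoint you called with `curl` earlier. The following Python sketch makes that mapping explicit. It assumes plain `http` as the scheme, and the helper name `http_input_url` is hypothetical.

```python
def http_input_url(http_input: dict, scheme: str = "http") -> str:
    """Assemble the URL described by an HTTP input's request section."""
    request = http_input["request"]
    return f"{scheme}://{request['host']}:{request['port']}{request['path']}"


# The same request section as in the watch definition above.
health_input = {
    "request": {"host": "localhost", "port": 9200, "path": "/_cluster/health"}
}

print(http_input_url(health_input))
# prints: http://localhost:9200/_cluster/health
```
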
If you check the watch history, you'll see that the cluster status is recorded as part of the
`watch_record` each time the watch executes.
For example, the following snippet gets the last ten watch records from the watch history:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search' -d '{
  "sort" : [
    { "result.execution_time" : "desc" }
  ]
}'
--------------------------------------------------------------------------------
[float]
[[health-add-condition]]
==== Add a Condition
A <<condition, condition>> evaluates the data you've loaded into the watch and determines if any
action is required. Since you've defined an input that loads the cluster status into the watch,
you can define a condition that checks that status.
For example, you could add a condition to check to see if the status is RED.
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/cluster_health_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" } <1>
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  }
}'
--------------------------------------------------
<1> Schedules are typically configured to run less frequently. This example sets the interval to
10 seconds so you can easily see the watches being triggered.
If you check the watch history, you'll see that the condition result is recorded as part of the
`watch_record` each time the watch executes.
To see whether the condition was met, run the following query:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search?pretty' -d '{
  "query" : {
    "match" : { "result.condition.met" : true }
  }
}'
--------------------------------------------------------------------------------
[float]
[[health-take-action]]
==== Take Action
Recording `watch_records` in the watch history is nice, but the real power of Watcher is being able
to do something in response to an alert. A watch's <<actions, actions>> define what to do when the
watch condition is met: you can send emails, call third-party webhooks, write documents to an
Elasticsearch index, or log messages to the Elasticsearch log file.
For example, you could add an action to send an email notification when the status is RED.
[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/_watcher/watch/cluster_health_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  },
  "actions" : {
    "send_email" : {
      "email" : {
        "to" : "<username>@<domainname>",
        "subject" : "Cluster Status Warning",
        "body" : "Cluster status is RED"
      }
    }
  }
}'
--------------------------------------------------
For Watcher to send email, you must configure an email account in your `elasticsearch.yml`
configuration file and restart Elasticsearch. To add an email account, set the
`watcher.actions.email.service.account` property.
For example, the following snippet configures a single Gmail account named `work`.
[source,yaml]
----------------------------------------------------------
watcher.actions.email.service.account:
  work:
    profile: gmail
    email_defaults:
      from: <email> <1>
    smtp:
      auth: true
      starttls.enable: true
      host: smtp.gmail.com
      port: 587
      user: <username> <2>
      password: <password> <3>
----------------------------------------------------------
<1> Replace `<email>` with the email address from which you want to send notifications.
<2> Replace `<username>` with your Gmail user name (typically your Gmail address).
<3> Replace `<password>` with your Gmail password.
NOTE: If you have advanced security options enabled for your email account, you need to take
additional steps to send email from Watcher. For more information, see
<<email-services, Working with Various Email Services>>.
You can check the watch history to see that the action was performed:
[source,js]
--------------------------------------------------------------------------------
curl -XGET 'http://localhost:9200/.watch_history*/_search?pretty' -d '{
  "query" : {
    "match" : { "result.condition.met" : true }
  }
}'
--------------------------------------------------------------------------------
[float]
[[health-delete]]
==== Delete the Watch
Since the `cluster_health_watch` is configured to run every 10 seconds, make sure you delete it
when you're done experimenting. Otherwise, you'll spam yourself indefinitely.
To remove the watch, use the <<api-rest-delete-watch, DELETE watch>> API:
[source,js]
--------------------------------------------------------------------------------
curl -XDELETE 'http://localhost:9200/_watcher/watch/cluster_health_watch'
--------------------------------------------------------------------------------