[role="xpack"]
[[watch-cluster-status]]
=== Watching the status of an Elasticsearch cluster

You can easily configure a basic watch to monitor the health of your
Elasticsearch cluster:

* <<health-add-input,Schedule the watch and define an input>> that gets the
cluster health status.

* <<health-add-condition,Add a condition>> that evaluates the health status to
determine if action is required.

* <<health-take-action,Take action>> if the cluster is RED.

[discrete]
[[health-add-input]]
==== Schedule the watch and add an input

A watch <<trigger-schedule,schedule>> controls how often a watch is triggered.
The watch <<input,input>> gets the data that you want to evaluate.

The simplest way to define a schedule is to specify an interval. For example,
the following schedule runs every 10 seconds:

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" } <1>
  }
}
--------------------------------------------------

<1> Schedules are typically configured to run less frequently. This example sets
    the interval to 10 seconds so you can easily see the watches being triggered.
    Since this watch runs so frequently, don't forget to <<health-delete,delete the watch>>
    when you're done experimenting.

To get the status of your cluster, you can call the Elasticsearch
{ref}/cluster-health.html[cluster health] API:

[source,console]
--------------------------------------------------
GET _cluster/health?pretty
--------------------------------------------------
// TEST[continued]

To load the health status into your watch, you simply add an
<<input-http,HTTP input>> that calls the cluster health API:

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  }
}
--------------------------------------------------

If you're using Security, then you'll also need to supply some authentication
credentials as part of the watch configuration:

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health",
        "auth": {
          "basic": {
            "username": "elastic",
            "password": "x-pack-test-password"
          }
        }
      }
    }
  }
}
--------------------------------------------------

It would be a good idea to create a user with the minimum privileges required
for use with such a watch configuration.
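
As a rough sketch of that advice, the following requests create a role that
grants only the `monitor` cluster privilege, which covers read-only cluster
operations such as cluster health, and a user that holds only that role. The
role name `cluster_health_reader`, the user name `watch_health_user`, and the
`<password>` placeholder are illustrative assumptions rather than names defined
elsewhere in this guide. Adjust them to your own security setup:

[source,console]
--------------------------------------------------
POST _security/role/cluster_health_reader
{
  "cluster" : [ "monitor" ]
}

POST _security/user/watch_health_user
{
  "password" : "<password>",
  "roles" : [ "cluster_health_reader" ]
}
--------------------------------------------------

You could then supply this user's credentials in the `auth.basic` section of
the watch instead of those of the `elastic` superuser.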

Depending on how your cluster is configured, there may be additional settings
required before the watch can access your cluster such as keystores, truststores,
or certificates. For more information, see <<notification-settings>>.

If you check the watch history, you'll see that the cluster status is recorded
as part of the `watch_record` each time the watch executes.

For example, the following request retrieves the last ten watch records from
the watch history:

[source,console]
--------------------------------------------------
GET .watcher-history*/_search
{
  "sort" : [
    { "result.execution_time" : "desc" }
  ]
}
--------------------------------------------------
// TEST[continued]
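
If other watches also write to the history, a narrower variant of the request
above may be easier to read. The following sketch limits the results to this
watch and returns only the execution time and the recorded cluster status. It
assumes that watch records carry the watch ID in a `watch_id` field and store
the HTTP input response under `result.input.payload`. Verify both against a
record from your own cluster:

[source,console]
--------------------------------------------------
GET .watcher-history*/_search
{
  "size" : 10,
  "query" : {
    "term" : { "watch_id" : "cluster_health_watch" }
  },
  "sort" : [
    { "result.execution_time" : "desc" }
  ],
  "_source" : [ "result.execution_time", "result.input.payload.status" ]
}
--------------------------------------------------

The explicit `size` only restates the default of ten hits, so you can raise or
lower it as needed.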

[discrete]
[[health-add-condition]]
==== Add a condition

A <<condition,condition>> evaluates the data you've loaded into the watch and
determines if any action is required. Since you've defined an input that loads
the cluster status into the watch, you can define a condition that checks that
status.

For example, you could add a condition that checks whether the status is RED:

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" } <1>
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  }
}
--------------------------------------------------

<1> Schedules are typically configured to run less frequently. This example sets
    the interval to 10 seconds so you can easily see the watches being triggered.

If you check the watch history, you'll see that the condition result is recorded
as part of the `watch_record` each time the watch executes.

To check whether the condition was met, you can run the following query:

[source,console]
------------------------------------------------------
GET .watcher-history*/_search?pretty
{
  "query" : {
    "match" : { "result.condition.met" : true }
  }
}
------------------------------------------------------
// TEST[continued]
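
On a healthy cluster the status is not `red`, so this query may return no hits
while you experiment. One way to exercise the condition anyway is the execute
watch API, which can run a stored watch on demand with a substitute payload.
The following is a minimal sketch that assumes the watch above is already
stored. The `alternative_input` object is a made-up payload that stands in for
the cluster health response, and `record_execution` asks {watcher} to write
the result to the watch history:

[source,console]
--------------------------------------------------
POST _watcher/watch/cluster_health_watch/_execute
{
  "alternative_input" : {
    "status" : "red"
  },
  "record_execution" : true
}
--------------------------------------------------

The response contains the resulting watch record, so you can check
`result.condition.met` there or rerun the history query.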

[discrete]
[[health-take-action]]
==== Take action

Recording `watch_records` in the watch history is nice, but the real power of
{watcher} is being able to do something in response to an alert. A watch's
<<actions,actions>> define what to do when the watch condition is met. For
example, you can send emails, call third-party webhooks, write documents to an
Elasticsearch index, or log a message.

For example, you could add an action that sends an email notification
when the status is RED.

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  },
  "actions" : {
    "send_email" : {
      "email" : {
        "to" : "username@example.org",
        "subject" : "Cluster Status Warning",
        "body" : "Cluster status is RED"
      }
    }
  }
}
--------------------------------------------------

For {watcher} to send email, you must configure an email account in your
`elasticsearch.yml` configuration file and restart Elasticsearch. To add an email
account, set the `xpack.notification.email.account` property.

For example, the following snippet configures a single Gmail account named `work`:

[source,yaml]
----------------------------------------------------------
xpack.notification.email.account:
  work:
    profile: gmail
    email_defaults:
      from: <email> <1>
    smtp:
      auth: true
      starttls.enable: true
      host: smtp.gmail.com
      port: 587
      user: <username> <2>
      password: <password> <3>
----------------------------------------------------------
<1> Replace `<email>` with the email address from which you want to send
    notifications.
<2> Replace `<username>` with your Gmail user name (typically your Gmail address).
<3> Replace `<password>` with your Gmail password.

NOTE: If you have advanced security options enabled for your email account,
you need to take additional steps to send email from {watcher}. For more
information, see <<configuring-email>>.
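
If you don't have an email account configured yet, an index action is one way
to try out actions without one. The following variant of the watch is a sketch
only: it replaces the email action with an action that writes the watch payload
(the cluster health response) to an index. The action name `index_status` and
the index name `cluster-health-alerts` are hypothetical and not used elsewhere
in this guide:

[source,console]
--------------------------------------------------
PUT _watcher/watch/cluster_health_watch
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  },
  "actions" : {
    "index_status" : {
      "index" : {
        "index" : "cluster-health-alerts"
      }
    }
  }
}
--------------------------------------------------

Because this request uses the same watch ID, it overwrites the email version of
the watch. Use a different ID if you want to keep both.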

You can check the watch history or the `status_index` to see that the action was
performed.

[source,console]
-------------------------------------------------------
GET .watcher-history*/_search?pretty
{
  "query" : {
    "match" : { "result.condition.met" : true }
  }
}
-------------------------------------------------------
// TEST[continued]
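
Another place to look is the status that {watcher} keeps with the watch itself.
As a hedged sketch, the get watch API returns the watch definition together
with a `status` object that tracks each action by name. The `filter_path`
parameter used here is the generic response-filtering option and simply trims
the output to that part:

[source,console]
--------------------------------------------------
GET _watcher/watch/cluster_health_watch?filter_path=status.actions
--------------------------------------------------

If the `send_email` action has run, its entry should include details of the
last execution. The exact fields can vary between versions.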

[discrete]
[[health-delete]]
==== Delete the watch

Since the `cluster_health_watch` is configured to run every 10 seconds, make
sure you delete it when you're done experimenting. Otherwise, you'll spam yourself
indefinitely.

To remove the watch, use the <<watcher-api-delete-watch,delete watch API>>:

[source,console]
-------------------------------------------------------
DELETE _watcher/watch/cluster_health_watch
-------------------------------------------------------
// TEST[continued]