Commit Graph

149 Commits

Author SHA1 Message Date
Sandeep Kanabar 7e0fc8a112 [Docs] Correct spelling in update-settings.asciidoc (#27808) 2017-12-14 10:16:50 +01:00
Clinton Gormley e1aa6e2cda Fix cluster usage docs test
#27611 broke the docs tests because $node_name in the URL doesn't (#27616)seem to be replaced.

Changing this to a * to match all nodes seems to fix the test
2017-12-01 16:55:10 +01:00
佛陀.RML 756e170674 [Docs] Fix order of nodes usage example (#27611) 2017-12-01 10:42:42 +01:00
Nhat Nguyen 46b508d6c9
Add wait_for_no_initializing_shards to cluster health API (#27489)
This adds a new option to the cluster health request allowing to wait
until there is no initializing shards.

Closes #25623
2017-11-23 15:09:58 -05:00
Luca Cavanna 29450de7b5
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.

This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.

Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.

The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:

"_clusters" : {
    "total" : 3,
    "successful" : 2,
    "skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.

The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 11:41:47 +01:00
David Roberts a292740b9e Add cgroup memory usage/limit to OS stats on Linux (#26166)
This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux.  This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.

The original idea was to store these values as Long, truncating any values
outside the range of long.  However, this meant that in the relatively common
case of no limit being applied, users would not see the same value in the OS
stats as they see by querying Linux directly.  So instead the values are stored
as String.  This change places a burden on consumers of the strings to
convert the strings to numbers and decide what to do about extremely large
values, but there will be very few consumers and they would need to have a
policy for dealing with "no limit" in any case.
2017-10-03 12:08:36 +01:00
Peter Dyson 1f9e0fd0dd [Docs] improved description for fs.total.available_in_bytes (#26657) 2017-09-18 16:56:19 +10:00
Tanguy Leroux 643eb286dc [Docs] Convert remaining code snippets in docs (#26422)
This commit converts the last remaining code snippets so that they are
now testable.
2017-08-30 12:11:10 +02:00
Tanguy Leroux db54c4dc7c [Docs] Convert more doc snippets (#26404)
This commit converts some remaining doc snippets so that they are now
testable.
2017-08-30 09:30:36 +02:00
Tanguy Leroux f95dec797d [Docs] Convert more doc snippets (#26359)
This commit converts some remaining doc snippets so that they are now
testable.
2017-08-28 11:23:09 +02:00
Jason Tedor fd18e3239a Remove mention of http_address in nodes info docs
This commit removes an outdated reference to http_address in the nodes
info docs. This information is available in the http object for each
node in the nodes info API response.

Relates #25980
2017-07-31 22:04:16 +09:00
Daniel Mitterdorfer db90455afd Update plugin-related output in reference docs (#25897)
The example output for node info and cluster stats was outdated w.r.t.
to the information that is shown for plugins. With this commit we
updated the example output and update the explanation of the respective
fields.
2017-07-28 11:27:54 +02:00
Clinton Gormley ff4a2519f2 Update experimental labels in the docs (#25727)
Relates https://github.com/elastic/elasticsearch/issues/19798

Removed experimental label from:
* Painless
* Diversified Sampler Agg
* Sampler Agg
* Significant Terms Agg
* Terms Agg document count error and execution_hint
* Cardinality Agg precision_threshold
* Pipeline Aggregations
* index.shard.check_on_startup
* index.store.type (added warning)
* Preloading data into the file system cache
* foreach ingest processor
* Field caps API
* Profile API

Added experimental label to:
* Moving Average Agg Prediction


Changed experimental to beta for:
* Adjacency matrix agg
* Normalizers
* Tasks API
* Index sorting

Labelled experimental in Lucene:
* ICU plugin custom rules file
* Flatten graph token filter
* Synonym graph token filter
* Word delimiter graph token filter
* Simple pattern tokenizer
* Simple pattern split tokenizer

Replaced experimental label with warning that details may change in the future:
* Analysis explain output format
* Segments verbose output format
* Percentile Agg compression and HDR Histogram
* Percentile Rank Agg HDR Histogram
2017-07-18 14:06:22 +02:00
Colin Goodheart-Smithe 779fb9a1c0 Adds nodes usage API to monitor usages of actions (#24169)
* Adds nodes usage API to monitor usages of actions

The nodes usage API has 2 main endpoints

/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.

At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.

Still to do:

* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation

* Rafactors UsageService so counts are done by the handlers

* Fixing up docs tests

* Adds a name to all rest actions

* Addresses review comments
2017-06-02 08:46:38 +01:00
Clinton Gormley 1b0c93b07c Documented the level parameter to nodes stats
Closes #24999
2017-06-01 12:11:21 +02:00
propulkit 25516868fe TCorrecting api name (#24924)
As per REST request signature for reroute, API has no underscore.
2017-05-29 13:58:31 +02:00
Simon Willnauer f22e0dc30b Add cross-cluster search remote cluster info API (#23969)
This commit adds an API to discover information like seed nodes,
http addresses and connection status of a configured remote cluster.

Closes #23925
2017-04-11 09:24:40 +02:00
Christoph Wurm 2720fc0b43 Clarify task cancellation command (#23667)
Makes it explicit that the node_id has to be included when canceling a task.
2017-03-30 20:21:21 +02:00
Florian Hopf 09753d6a86 Fix typo in allocation explain API docs
This commit addresses a simple typo in the application explain API docs.

Relates #23669
2017-03-21 08:41:54 -04:00
Ali Beyad 577d2a6a1d Adds cluster state size to /_cluster/state response (#23440)
This commit adds the size of the cluster state to the response for the
get cluster state API call (GET /_cluster/state).  The size that is
returned is the size of the full cluster state in bytes when compressed.
This is the same size of the full cluster state when serialized to
transmit over the network.  Specifying the ?human flag displays the
compressed size in a more human friendly manner.  Note that even if the
cluster state request filters items from the cluster state (so a subset
of the cluster state is returned), the size that is returned is the
compressed size of the entire cluster state.

Closes #3415
2017-03-02 14:20:29 -05:00
Lee Hinman 6c9b89b882 [TEST] Fix incorrect test cluster name in cluster health doc tests 2017-02-22 17:18:11 -07:00
Lee Hinman 5443f7d625 Console-ify curl statements for allocation explain API docs (#23190)
* Console-ify curl statements for allocation explain API docs

Relates to #23001

* Fix tests

* Remove exclusion from build.gradle

* Call out index creation in prose

* Add console back and skip test
2017-02-15 17:18:07 -07:00
Ryan Ernst c91848e6a7 Docs: Consoleify cluster and indices settings docs (#23030)
relates #23001
2017-02-10 14:57:43 -08:00
Nik Everett 0e98c9107a Docs: CONSOLEify some more docs
These need to be CONSOLEified *now* because we're starting to
require Content-Type headers and they didn't have any.

* cluster/reroute: Marked as CONSOLE but skipped because the docs
build runs with a single node.
* docs/bulk: Marked as NOTCONSOLE because the snippets describe
either examples or `curl` commands. Fixed the `curl` command to
include the `Content-Type` header.
* query-dsl/terms-query: Marked as CONSOLE.
* search/request/rescore: Marked as CONSOLE. Fixed deprecated
syntax.

Relates #23001
Relates #18160
2017-02-07 16:49:01 -05:00
Clinton Gormley f5a0d18c4c Docs: Cluster allocation explain should be on one page 2017-01-26 11:38:19 +01:00
Ali Beyad 26f92f8482 Cluster allocation explain API documentation (#22436)
This commit updates the cluster allocation explain API documentation to
explain the new request parameters and response formats, and gives
examples of the explain API responses under various scenarios.
2017-01-10 08:55:39 -06:00
Nik Everett 923820c6c9 Document the `detailed` parameter of tasks API (#22425)
Provides an example of using is and an example return description
and explains that we've added descriptions for some tasks but not
even close to all of them. And that we expect to change the
descriptions as we learn more.

Closes #22407

* Fix example

Getting a single task is always detailed, no need to specify.

* Rewrite like imotov wants it
2017-01-06 10:24:52 -05:00
Jason Tedor 41ffb008ad Fix doc bug for cgroup cpuacct usage metric
This commit fixes a silly doc bug where the field that represents the
total CPU time consumed by all tasks in the same cgroup was mistakenly
reported as "usage" instead of "usage_nanos".

Relates #21029
2016-12-15 23:22:54 -05:00
Clinton Gormley cfabc95f59 Fixed bad asciidoc ID in node stats 2016-11-15 17:39:15 +00:00
Jason Tedor f5ac0e5076 Remove lenient stats parsing
Today when parsing a stats request, Elasticsearch silently ignores
incorrect metrics. This commit removes lenient parsing of stats requests
for the nodes stats and indices stats APIs.

Relates #21417
2016-11-15 12:17:26 -05:00
Jason Tedor aec09a76d6 Clarify requesting all stats in node stats docs
This commit clarifies how to explicitly obtain all stats from the node
stats API.
2016-11-08 13:47:15 -05:00
Igor Motov 17ad88d539 Makes search action cancelable by task management API
Long running searches now can be cancelled using standard task cancellation mechanism.
2016-10-25 12:27:34 -10:00
Jason Tedor 900ee0536e Strengthen handling of unavailable cgroup stats
On some systems, cgroups will be available but not configured. And in
some cases, cgroups will be configured, but not for the subsystems that
we are expecting (e.g., cpu and cpuacct). This commit strengthens the
handling of cgroup stats on such systems.

Relates #21094
2016-10-24 16:36:51 -04:00
Jason Tedor 3d642ab0eb Add basic cgroup CPU metrics
This commit adds basic cgroup CPU metrics to the node stats API.

Relates #21029
2016-10-24 08:26:56 -04:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Tanguy Leroux 656596c2a9 [DOC] Remove obsolete node names from documentation
Funny node names have been removed in #19456 and replaced by UUID. This commit removes these obsolete node names and replace them by real UUIDs in the documentation.

closes #20065
2016-09-19 11:56:28 +02:00
Lee Hinman fd3392aef8 [DOCS] Mark cluster allocation explain API as experimental in docs 2016-09-06 11:29:33 -06:00
Nik Everett 5cff2a046d Remove most of the need for `// NOTCONSOLE`
and be much more stingy about what we consider a console candidate.

* Add `// CONSOLE` to check-running
* Fix version in some snippets
* Mark groovy snippets as groovy
* Fix versions in plugins
* Fix language marker errors
* Fix language parsing in snippets

  This adds support for snippets who's language is written like
  `[source, txt]` and `["source","js",subs="attributes,callouts"]`.

  This also makes language required for snippets which is nice because
  then we can be sure we can grep for snippets in a particular language.
2016-09-06 10:32:54 -04:00
javanna 5f299ff46f add mem section back to cluster stats
The mem section was buggy in cluster stats and removed. It is now added back with the same structure as in node stats, containing total memory, available memory, used memory and percentages. All the values are the sum of all the nodes across the cluster (or at least the ones that we were able to get the values from).
2016-09-01 11:26:03 +02:00
Ali Beyad 4641254ea6 Parameter improvements to Cluster Health API wait for shards (#20223)
* Params improvements to Cluster Health API wait for shards

Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.

This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.

* Addresses code review comments

* Don't be lenient if `wait_for_relocating_shards` is set
2016-08-31 11:58:19 -04:00
Nik Everett 777ea124c7 Fix health docs test
It failed inconsistently when there were pending tasks.
2016-07-16 07:18:11 -04:00
Nik Everett 9f78f8cc91 Convert snippets in health docs to CONSOLE
This should make them easier to read and adds them to the test suite
I changed the example from a two node cluster to a single node cluster
because that is what we have running in the integration tests. It is also
what a user just starting out is likely to see so I think that is ok.
2016-07-15 16:31:37 -04:00
Lee Hinman 58db63b610 Expose the ClusterInfo object in the allocation explain output
This adds an optional parameter to the cluster allocation explain API
that will return the cluster info object, `include_disk_info`, the
output looks like:

GET /_cluster/allocation/explain?include_disk_info -d'
{"index": "i", "shard": 0, "primary": false}'

{
  ... other info ...

  "cluster_info" : {
    "nodes" : {
      "7Uws-vL7R6WVm3ZwQA1n5A" : {
        "node_name" : "Kraven the Hunter",
        "least_available" : {
          "path" : "/path/to/data1",
          "total_bytes" : 165999570944,
          "used_bytes" : 118180614144,
          "free_bytes" : 47818956800,
          "free_disk_percent" : 28.80667493781158,
          "used_disk_percent" : 71.19332506218842
        },
        "most_available" : {
          "path" : "/path/to/data2",
          "total_bytes" : 165999570944,
          "used_bytes" : 118180614144,
          "free_bytes" : 47818956800,
          "free_disk_percent" : 28.80667493781158,
          "used_disk_percent" : 71.19332506218842
        }
      }
    },
    "shard_sizes" : {
      "[i][2][p]_bytes" : 0,
      "[i][4][p]_bytes" : 130,
      "[i][1][p]_bytes" : 0,
      "[i][3][p]_bytes" : 0,
      "[i][0][p]_bytes" : 130
    },
    "shard_paths" : {
      "[i][3], node[7Uws-vL7R6WVm3ZwQA1n5A], [P], s[STARTED], a[id=LegZLDniTVaw0Y1urv7s3g]" : "/path/to/data1/nodes/0",
      "[i][1], node[7Uws-vL7R6WVm3ZwQA1n5A], [P], s[STARTED], a[id=lAU_4vf_SKmoRdtg0ACnjQ]" : "/path/to/data1/nodes/0",
      "[i][2], node[7Uws-vL7R6WVm3ZwQA1n5A], [P], s[STARTED], a[id=Aurpeuj7SeGeyPDDpCtRgg]" : "/path/to/data1/nodes/0",
      "[i][0], node[7Uws-vL7R6WVm3ZwQA1n5A], [P], s[STARTED], a[id=Vgg8GlQTQ82C2j6HYBq8DQ]" : "/path/to/data1/nodes/0",
      "[i][4], node[7Uws-vL7R6WVm3ZwQA1n5A], [P], s[STARTED], a[id=t8hQlVSxQe-58fSeaXcAqg]" : "/path/to/data1/nodes/0"
    }
  }
}

Resolves #14405
2016-07-12 15:52:20 -06:00
Mike McCandless eecf094ac1 add indices nodes info flag to docs 2016-06-20 14:23:32 -04:00
Mike McCandless 3f221bf7cb Add total_indexing_buffer/_in_bytes to nodes info API 2016-06-16 04:39:34 -04:00
Nik Everett e392e0b1df Create get task API that falls back to the .tasks index
This adds a get task API that supports GET /_tasks/${taskId} and
removes that responsibility from the list tasks API. The get task
API supports wait_for_complation just as the list tasks API does
but doesn't support any of the list task API's filters. In exchange,
it supports falling back to the .results index when the task isn't
running any more. Like any good GET API it 404s when it doesn't
find the task.

Then we change reindex, update-by-query, and delete-by-query to
persist the task result when wait_for_completion=false. The leads
to the neat behavior that, once you start a reindex with
wait_for_completion=false, you can fetch the result of the task by
using the get task API and see the result when it has finished.

Also rename the .results index to .tasks.
2016-06-14 13:37:34 -04:00
Mike McCandless 5c525e6606 Remove index_writer_max_memory stat from segment stats 2016-05-31 06:29:29 -04:00
Lee Hinman bfce901edf Merge remote-tracking branch 'dakrone/explain-add-fetch-in-progress' 2016-05-23 09:43:16 -06:00
Lee Hinman 8040ed0c16 Add whether the shard state fetch is pending to the allocation explain API
If the shard state fetch is still pending, this will now return a
message like:

```json
{
  "shard" : {
    "index" : "i",
    "index_uuid" : "de1W1374T4qgvUP4a9Ieaw",
    "id" : 0,
    "primary" : false
  },
  "assigned" : false,
  "shard_state_fetch_pending": true,
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2016-04-26T16:34:53.227Z"
  },
  "allocation_delay_ms" : 0,
  "remaining_delay_ms" : 0,
  "nodes" : {
    "z-CbkiELT-SoWT91HIszLA" : {
      "node_name" : "Brain Cell",
      "node_attributes" : {
        "testattr" : "test"
      },
      "store" : {
        "shard_copy" : "NONE"
      },
      "final_decision" : "NO",
      "final_explanation" : "the shard state fetch is pending",
      "weight" : 5.0,
      "decisions" : [ ]
    }
  }
}
```

Adds the `shard_state_fetch_pending` field and uses the state to
influence the final decision and final explanation.

Relates to #17372
2016-05-23 09:42:57 -06:00
Simon Willnauer 35e705877b Limit retries of failed allocations per index (#18467)
Today if a shard fails during initialization phase due to misconfiguration, broken disks,
missing analyzers, not installed plugins etc. elasticsaerch keeps on trying to initialize
or rather allocate that shard. Yet, in the worst case scenario this ends in an endless
allocation loop. To prevent this loop and all it's sideeffects like spamming log files over
and over again this commit adds an allocation decider that stops allocating a shard that
failed more than N times in a row to allocate. The number or retries can be configured via
`index.allocation.max_retry` and it's default is set to `5`. Once the setting is updated
shards with less failures than the number set per index will be allowed to allocate again.

Internally we maintain a counter on the UnassignedInfo that is reset to `0` once the shards
has been started.

Relates to #18417
2016-05-20 20:37:45 +02:00