Documentation: Consoleify cat shards/recovery API docs (#23116)

Relates #23001
Alexander Reelsen 2017-02-22 09:18:10 +01:00 committed by GitHub
parent d31e41547a
commit 6781c4320c
3 changed files with 134 additions and 95 deletions


@@ -81,8 +81,6 @@ buildRestTests.expectedUnconvertedCandidates = [
'reference/analysis/tokenfilters/synonym-tokenfilter.asciidoc',
'reference/analysis/tokenfilters/synonym-graph-tokenfilter.asciidoc',
'reference/analysis/tokenfilters/word-delimiter-tokenfilter.asciidoc',
'reference/cat/recovery.asciidoc',
'reference/cat/shards.asciidoc',
'reference/cat/snapshots.asciidoc',
'reference/cat/templates.asciidoc',
'reference/cat/thread_pool.asciidoc',


@@ -12,17 +12,24 @@ way for shards to be loaded from disk when a node starts up.
As an example, here is what the recovery state of a cluster may look like when there
are no shards in transit from one node to another:
[source,sh]
[source,js]
----------------------------------------------------------------------------
> curl -XGET 'localhost:9200/_cat/recovery?v'
index shard time type stage source_host source_node target_host target_node repository snapshot files files_percent bytes bytes_percent
total_files total_bytes translog translog_percent total_translog
index 0 87ms store done 127.0.0.1 I8hydUG 127.0.0.1 I8hydUG n/a n/a 0 0.0% 0 0.0% 0 0 0 100.0% 0
index 1 97ms store done 127.0.0.1 I8hydUG 127.0.0.1 I8hydUG n/a n/a 0 0.0% 0 0.0% 0 0 0 100.0% 0
index 2 93ms store done 127.0.0.1 I8hydUG 127.0.0.1 I8hydUG n/a n/a 0 0.0% 0 0.0% 0 0 0 100.0% 0
index 3 90ms store done 127.0.0.1 I8hydUG 127.0.0.1 I8hydUG n/a n/a 0 0.0% 0 0.0% 0 0 0 100.0% 0
index 4 9ms store done 127.0.0.1 I8hydUG 127.0.0.1 I8hydUG n/a n/a 0 0.0% 0 0.0% 0 0 0 100.0% 0
GET _cat/recovery?v
---------------------------------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The response of this request will be something like:
[source,js]
---------------------------------------------------------------------------
index shard time type stage source_host source_node target_host target_node repository snapshot files files_recovered files_percent files_total bytes bytes_recovered bytes_percent bytes_total translog_ops translog_ops_recovered translog_ops_percent
twitter 0 13ms store done n/a n/a 127.0.0.1 node-0 n/a n/a 0 0 100% 13 0 0 100% 9928 0 0 100.0%
---------------------------------------------------------------------------
// TESTRESPONSE[s/store/empty_store/]
// TESTRESPONSE[s/100%/0.0%/]
// TESTRESPONSE[s/9928/0/]
// TESTRESPONSE[s/13/\\d+/ _cat]
In the above case, the source and target nodes are the same because the recovery
type was store, i.e. they were read from local storage on node start.
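To work with this output programmatically, the verbose (`?v`) table can be split into one record per shard. The following is a minimal sketch, not part of the commit: `parse_cat_table` is a hypothetical helper, and it assumes that no cell contains whitespace, which holds for the sample response shown above.

```python
def parse_cat_table(text):
    """Turn verbose (`?v`) `_cat` output into a list of dicts keyed by column name.

    Assumption: no cell contains whitespace, so a plain split() is enough.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Sample copied from the response shown above.
SAMPLE = (
    "index shard time type stage source_host source_node target_host target_node "
    "repository snapshot files files_recovered files_percent files_total bytes "
    "bytes_recovered bytes_percent bytes_total translog_ops translog_ops_recovered "
    "translog_ops_percent\n"
    "twitter 0 13ms store done n/a n/a 127.0.0.1 node-0 n/a n/a 0 0 100% 13 0 0 "
    "100% 9928 0 0 100.0%\n"
)

rows = parse_cat_table(SAMPLE)
for row in rows:
    # Print a few of the columns discussed in the text.
    print(row["index"], row["shard"], row["type"], row["stage"], row["target_node"])
```

Keying each row by the header names keeps the code independent of column order, which matters because the `h=` parameter can change which columns are returned.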
@@ -31,43 +38,46 @@ Now let's see what a live recovery looks like. By increasing the replica count
of our index and bringing another node online to host the replicas, we can see
what a live shard recovery looks like.
[source,sh]
[source,js]
----------------------------------------------------------------------------
> curl -XPUT 'localhost:9200/wiki/_settings' -d'{"number_of_replicas":1}'
{"acknowledged":true}
GET _cat/recovery?v&h=i,s,t,ty,st,shost,thost,f,fp,b,bp
---------------------------------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
> curl -XGET 'localhost:9200/_cat/recovery?v&h=i,s,t,ty,st,shost,thost,f,fp,b,bp'
i s t ty st shost thost f fp b bp
wiki 0 1252ms store done hostA hostA 4 100.0% 23638870 100.0%
wiki 0 1672ms replica index hostA hostB 4 75.0% 23638870 48.8%
wiki 1 1698ms replica index hostA hostB 4 75.0% 23348540 49.4%
wiki 1 4812ms store done hostA hostA 33 100.0% 24501912 100.0%
wiki 2 1689ms replica index hostA hostB 4 75.0% 28681851 40.2%
wiki 2 5317ms store done hostA hostA 36 100.0% 30267222 100.0%
This will return a line like:
[source,js]
---------------------------------------------------------------------------
i s t ty st shost thost f fp b bp
twitter 0 1252ms peer done 192.168.1.1 192.168.1.2 0 100.0% 0 100.0%
----------------------------------------------------------------------------
// TESTRESPONSE[s/peer/empty_store/]
// TESTRESPONSE[s/192.168.1.2/127.0.0.1/]
// TESTRESPONSE[s/192.168.1.1/n\/a/]
// TESTRESPONSE[s/100.0%/0.0%/]
// TESTRESPONSE[s/1252/\\d+/ _cat]
We can see in the above listing that our 3 initial shards are in various stages
of being replicated from one node to another. Notice that the recovery type is
shown as `replica`. The files and bytes copied are real-time measurements.
We can see in the above listing that our twitter shard was recovered from another node.
Notice that the recovery type is shown as `peer`. The files and bytes copied are
real-time measurements.
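Because the file and byte percentages change while a recovery runs, a client may want to pick out the recoveries that have not finished yet. The sketch below is illustrative only: `unfinished_recoveries` and `percent_to_float` are hypothetical helpers, the short column names match the `h=` list used in the request above, and the sample rows are taken from the earlier curl-based listing.

```python
def percent_to_float(value):
    """Convert a `_cat` percentage string such as '48.8%' to a float."""
    return float(value.rstrip("%"))


def unfinished_recoveries(text):
    """Return the rows of verbose `_cat/recovery` output whose stage is not `done`.

    Assumption: the output was requested with `v` and includes the `st` column.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [row for row in rows if row["st"] != "done"]


SAMPLE = """\
i s t ty st shost thost f fp b bp
wiki 0 1252ms store done hostA hostA 4 100.0% 23638870 100.0%
wiki 0 1672ms replica index hostA hostB 4 75.0% 23638870 48.8%
wiki 1 1698ms replica index hostA hostB 4 75.0% 23348540 49.4%
"""

pending = unfinished_recoveries(SAMPLE)
for row in pending:
    print(row["i"], row["s"], row["st"], percent_to_float(row["bp"]))
```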
Finally, let's see what a snapshot recovery looks like. Assuming I have previously
made a backup of my index, I can restore it using the <<modules-snapshots,snapshot and restore>>
API.
[source,sh]
[source,js]
--------------------------------------------------------------------------------
> curl -XPOST 'localhost:9200/_snapshot/imdb/snapshot_2/_restore'
{"acknowledged":true}
> curl -XGET 'localhost:9200/_cat/recovery?v&h=i,s,t,ty,st,rep,snap,f,fp,b,bp'
i s t ty st rep snap f fp b bp
imdb 0 1978ms snapshot done imdb snap_1 79 8.0% 12086 9.0%
imdb 1 2790ms snapshot index imdb snap_1 88 7.7% 11025 8.1%
imdb 2 2790ms snapshot index imdb snap_1 85 0.0% 12072 0.0%
imdb 3 2796ms snapshot index imdb snap_1 85 2.4% 12048 7.2%
imdb 4 819ms snapshot init imdb snap_1 0 0.0% 0 0.0%
GET _cat/recovery?v&h=i,s,t,ty,st,rep,snap,f,fp,b,bp
---------------------------------------------------------------------------
// CONSOLE
// TEST[skip:no need to execute snapshot/restore here]
This will show a recovery of type snapshot in the response:
[source,js]
---------------------------------------------------------------------------
i s t ty st rep snap f fp b bp
twitter 0 1978ms snapshot done twitter snap_1 79 8.0% 12086 9.0%
--------------------------------------------------------------------------------
// TESTRESPONSE[_cat]


@@ -5,15 +5,26 @@ The `shards` command is the detailed view of what nodes contain which
shards. It will tell you if it's a primary or replica, the number of
docs, the bytes it takes on disk, and the node where it's located.
Here we see a single index, with three primary shards and no replicas:
Here we see a single index, with one primary shard and no replicas:
[source,sh]
--------------------------------------------------
% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
wiki1 1 p STARTED 3013 29.6mb 192.168.56.30 bGG90GE
wiki1 2 p STARTED 3973 38.1mb 192.168.56.20 I8hydUG
--------------------------------------------------
[source,js]
---------------------------------------------------------------------------
GET _cat/shards
---------------------------------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
This will return:
[source,js]
---------------------------------------------------------------------------
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
---------------------------------------------------------------------------
// TESTRESPONSE[s/3014/\\d+/]
// TESTRESPONSE[s/31.1/\\d+\.\\d+/]
// TESTRESPONSE[s/mb/.*/]
// TESTRESPONSE[s/192.168.56.10/.*/]
// TESTRESPONSE[s/H5dfFeA/node-0/ _cat]
[float]
[[index-pattern]]
@@ -23,30 +34,47 @@ If you have many shards, you may wish to limit which indices show up
in the output. You can always do this with `grep`, but you can save
some bandwidth by supplying an index pattern to the end.
[source,sh]
--------------------------------------------------
% curl 192.168.56.20:9200/_cat/shards/wiki*
wiki2 0 p STARTED 197 3.2mb 192.168.56.10 H5dfFeA
wiki2 1 p STARTED 205 5.9mb 192.168.56.30 bGG90GE
wiki2 2 p STARTED 275 7.8mb 192.168.56.20 I8hydUG
--------------------------------------------------
[source,js]
---------------------------------------------------------------------------
GET _cat/shards/twitt*
---------------------------------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
Which will return the following:
[source,js]
---------------------------------------------------------------------------
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
---------------------------------------------------------------------------
// TESTRESPONSE[s/3014/\\d+/]
// TESTRESPONSE[s/31.1/\\d+\.\\d+/]
// TESTRESPONSE[s/mb/.*/]
// TESTRESPONSE[s/192.168.56.10/.*/]
// TESTRESPONSE[s/H5dfFeA/node-0/ _cat]
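The two approaches mentioned above, filtering on the client versus passing an index pattern in the request path, can be contrasted in a short sketch. Both helpers are hypothetical, and Python's `fnmatchcase` is only an approximation of Elasticsearch's index wildcard matching (it also treats `?` and `[...]` as special), so take this as an illustration rather than an exact equivalent.

```python
from fnmatch import fnmatchcase


def shards_path(index_pattern=None):
    """Build the request path, letting the server filter when a pattern is given."""
    if index_pattern:
        return "_cat/shards/" + index_pattern
    return "_cat/shards"


def filter_by_index(lines, index_pattern):
    """Client-side alternative, similar to piping the full output through grep.

    Assumption: the index name is the first whitespace-separated column.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if fields and fnmatchcase(fields[0], index_pattern):
            kept.append(line)
    return kept


# Rows taken from the listings in this document.
ALL_SHARDS = [
    "twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA",
    "wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA",
]

print(shards_path("twitt*"))
print(filter_by_index(ALL_SHARDS, "twitt*"))
```

The server-side pattern avoids transferring rows that the client would discard anyway, which is the bandwidth saving the text refers to.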
[float]
[[relocation]]
=== Relocation
Let's say you've checked your health and you see two relocating
Let's say you've checked your health and you see relocating
shards. Where are they from and where are they going?
[source,sh]
--------------------------------------------------
% curl 192.168.56.10:9200/_cat/health
1384315316 20:01:56 foo green 3 3 12 6 2 0 0
% curl 192.168.56.10:9200/_cat/shards | fgrep RELO
wiki1 0 r RELOCATING 3014 31.1mb 192.168.56.20 I8hydUG -> 192.168.56.30 bGG90GE
wiki1 1 r RELOCATING 3013 29.6mb 192.168.56.10 H5dfFeA -> 192.168.56.30 bGG90GE
--------------------------------------------------
[source,js]
---------------------------------------------------------------------------
GET _cat/shards
---------------------------------------------------------------------------
// CONSOLE
// TEST[skip:for now, relocation cannot be recreated]
A relocating shard will be shown as follows:
[source,js]
---------------------------------------------------------------------------
twitter 0 p RELOCATING 3014 31.1mb 192.168.56.10 H5dfFeA -> 192.168.56.30 bGG90GE
---------------------------------------------------------------------------
// TESTRESPONSE[_cat]
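To answer "where from and where to" in code, the part before the `->` arrow can be read as the source node and the part after it as the target. This is a minimal sketch under assumptions: `parse_relocation` is a hypothetical helper, it expects the default column order (state in the fourth column), and it assumes node names contain no whitespace. The sample row is taken from the earlier curl-based listing.

```python
def parse_relocation(line):
    """Extract source and target of a RELOCATING row from `_cat/shards` output.

    Returns None for rows that are not relocating.
    """
    fields = line.split()
    if len(fields) < 4 or fields[3] != "RELOCATING" or "->" not in fields:
        return None
    arrow = fields.index("->")
    return {
        "index": fields[0],
        "shard": int(fields[1]),
        # The two columns just before the arrow describe the current node.
        "source_ip": fields[arrow - 2],
        "source_node": fields[arrow - 1],
        # The last two columns describe the node the shard is moving to.
        "target_ip": fields[-2],
        "target_node": fields[-1],
    }


ROW = "wiki1 0 r RELOCATING 3014 31.1mb 192.168.56.20 I8hydUG -> 192.168.56.30 bGG90GE"
move = parse_relocation(ROW)
print(move["source_node"], "->", move["target_node"])
```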
[float]
[[states]]
@@ -55,42 +83,45 @@ wiki1 1 r RELOCATING 3013 29.6mb 192.168.56.10 H5dfFeA -> 192.168.56.30 bGG90GE
Before a shard can be used, it goes through an `INITIALIZING` state.
`shards` can show you which ones.
[source,sh]
--------------------------------------------------
% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":1}'
{"acknowledged":true}
% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
wiki1 0 r INITIALIZING 0 14.3mb 192.168.56.30 bGG90GE
wiki1 1 p STARTED 3013 29.6mb 192.168.56.30 bGG90GE
wiki1 1 r INITIALIZING 0 13.1mb 192.168.56.20 I8hydUG
wiki1 2 r INITIALIZING 0 14mb 192.168.56.10 H5dfFeA
wiki1 2 p STARTED 3973 38.1mb 192.168.56.20 I8hydUG
--------------------------------------------------
[source,js]
---------------------------------------------------------------------------
GET _cat/shards
---------------------------------------------------------------------------
// CONSOLE
// TEST[skip:there is no guarantee to test for shards in initializing state]
You can see the initializing state in the response like this:
[source,js]
---------------------------------------------------------------------------
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r INITIALIZING 0 14.3mb 192.168.56.30 bGG90GE
---------------------------------------------------------------------------
// TESTRESPONSE[_cat]
If a shard cannot be assigned, for example you've overallocated the
number of replicas for the number of nodes in the cluster, the shard
will remain `UNASSIGNED` with the <<reason-unassigned,reason code>> `ALLOCATION_FAILED`.
[source,sh]
--------------------------------------------------
% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":3}'
% curl 192.168.56.20:9200/_cat/health
1384316325 20:18:45 foo yellow 3 3 9 3 0 0 3
% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
wiki1 0 r STARTED 3014 31.1mb 192.168.56.30 bGG90GE
wiki1 0 r STARTED 3014 31.1mb 192.168.56.20 I8hydUG
wiki1 0 r UNASSIGNED ALLOCATION_FAILED
wiki1 1 r STARTED 3013 29.6mb 192.168.56.10 H5dfFeA
wiki1 1 p STARTED 3013 29.6mb 192.168.56.30 bGG90GE
wiki1 1 r STARTED 3013 29.6mb 192.168.56.20 I8hydUG
wiki1 1 r UNASSIGNED ALLOCATION_FAILED
wiki1 2 r STARTED 3973 38.1mb 192.168.56.10 H5dfFeA
wiki1 2 r STARTED 3973 38.1mb 192.168.56.30 bGG90GE
wiki1 2 p STARTED 3973 38.1mb 192.168.56.20 I8hydUG
wiki1 2 r UNASSIGNED ALLOCATION_FAILED
--------------------------------------------------
You can use the shards API to find out that reason.
[source,js]
---------------------------------------------------------------------------
GET _cat/shards?h=index,shard,prirep,state,unassigned.reason
---------------------------------------------------------------------------
// CONSOLE
// TEST[skip:for now]
The reason for an unassigned shard will be listed as the last field:
[source,js]
---------------------------------------------------------------------------
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r STARTED 3014 31.1mb 192.168.56.30 bGG90GE
twitter 0 r STARTED 3014 31.1mb 192.168.56.20 I8hydUG
twitter 0 r UNASSIGNED ALLOCATION_FAILED
---------------------------------------------------------------------------
// TESTRESPONSE[_cat]
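Since the reason is the last field of an `UNASSIGNED` row, it can be collected with a few lines of code. The helper below is a hypothetical sketch, not part of the commit; it assumes the state is the fourth column, as in the request and response above, and reports no reason when a row stops after the state column.

```python
def unassigned_shards(text):
    """Collect UNASSIGNED rows from `_cat/shards` output along with their reason."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            # The reason, when present, is the last column of the row.
            reason = fields[-1] if len(fields) > 4 else None
            found.append({
                "index": fields[0],
                "shard": int(fields[1]),
                "prirep": fields[2],
                "reason": reason,
            })
    return found


# Sample copied from the response shown above.
SAMPLE = """\
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r STARTED 3014 31.1mb 192.168.56.30 bGG90GE
twitter 0 r STARTED 3014 31.1mb 192.168.56.20 I8hydUG
twitter 0 r UNASSIGNED ALLOCATION_FAILED
"""

problems = unassigned_shards(SAMPLE)
for shard in problems:
    print(shard["index"], shard["shard"], shard["prirep"], shard["reason"])
```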
[float]
[[reason-unassigned]]