Docs: Included Nodes Task API and tidied reindex/update-by-query

2025-03-25 01:19:02 +00:00 · 2016-03-29 13:51:11 +02:00 · 978b24327e
commit 978b24327e
parent b87beeb05f
5 changed files with 83 additions and 73 deletions
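
Note on the response format documented in the new `nodes-task.asciidoc`: child tasks point back to their parent through `parent_node` and `parent_id`. The following is a minimal Python sketch, not part of the patch, of how a client might group child tasks under their parents. It assumes the response body has already been fetched and uses only the fields visible in the sample response from this change; the helper name `children_by_parent` is hypothetical.

```python
import json
from collections import defaultdict

# Sample response copied from the nodes-task.asciidoc example in this change.
SAMPLE = """
{
  "nodes": {
    "fDlEl7PrQi6F-awHZ3aaDw": {
      "name": "Gazer",
      "transport_address": "127.0.0.1:9300",
      "host": "127.0.0.1",
      "ip": "127.0.0.1:9300",
      "tasks": [
        {
          "node": "fDlEl7PrQi6F-awHZ3aaDw",
          "id": 105,
          "type": "transport",
          "action": "cluster:monitor/nodes/tasks"
        },
        {
          "node": "fDlEl7PrQi6F-awHZ3aaDw",
          "id": 106,
          "type": "direct",
          "action": "cluster:monitor/nodes/tasks[n]",
          "parent_node": "fDlEl7PrQi6F-awHZ3aaDw",
          "parent_id": 105
        }
      ]
    }
  }
}
"""


def children_by_parent(response):
    """Map (parent_node, parent_id) to the actions of its child tasks.

    Hypothetical helper: it assumes only the fields shown in the sample
    response (`action`, `parent_node`, `parent_id`).
    """
    children = defaultdict(list)
    for node in response["nodes"].values():
        for task in node["tasks"]:
            # Top-level tasks carry no parent fields, so skip them.
            if "parent_id" in task:
                key = (task["parent_node"], task["parent_id"])
                children[key].append(task["action"])
    return dict(children)


if __name__ == "__main__":
    print(children_by_parent(json.loads(SAMPLE)))
```

Keying on the `(parent_node, parent_id)` pair rather than on `parent_id` alone is a deliberate choice in this sketch: the sample shows task ids alongside a node id, so the sketch does not assume ids are unique across nodes.
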
--- a/docs/reference/cluster.asciidoc
+++ b/docs/reference/cluster.asciidoc
@@ -45,6 +45,8 @@ include::cluster/nodes-stats.asciidoc[]

 include::cluster/nodes-info.asciidoc[]

+include::cluster/nodes-task.asciidoc[]
+
 include::cluster/nodes-hot-threads.asciidoc[]

 include::cluster/allocation-explain.asciidoc[]
--- a/docs/reference/cluster/nodes-task.asciidoc
+++ b/docs/reference/cluster/nodes-task.asciidoc
@@ -0,0 +1,49 @@
+[[nodes-task]]
+== Nodes Task API
+
+The nodes task management API retrieves information about the tasks currently
+executing on one or more nodes in the cluster.
+
+[source,js]
+--------------------------------------------------
+GET /_tasks <1>
+GET /_tasks/nodeId1,nodeId2 <2>
+GET /_tasks/nodeId1,nodeId2/cluster:* <3>
+--------------------------------------------------
+// AUTOSENSE
+
+<1> Retrieves all tasks currently running on all nodes in the cluster.
+<2> Retrieves all tasks running on nodes `nodeId1` and `nodeId2`.  See <<cluster-nodes>> for more info about how to select individual nodes.
+<3> Retrieves all cluster-related tasks running on nodes `nodeId1` and `nodeId2`.
+
+The result will look similar to the following:
+
+[source,js]
+--------------------------------------------------
+{
+  "nodes": {
+    "fDlEl7PrQi6F-awHZ3aaDw": {
+      "name": "Gazer",
+      "transport_address": "127.0.0.1:9300",
+      "host": "127.0.0.1",
+      "ip": "127.0.0.1:9300",
+      "tasks": [
+        {
+          "node": "fDlEl7PrQi6F-awHZ3aaDw",
+          "id": 105,
+          "type": "transport",
+          "action": "cluster:monitor/nodes/tasks"
+        },
+        {
+          "node": "fDlEl7PrQi6F-awHZ3aaDw",
+          "id": 106,
+          "type": "direct",
+          "action": "cluster:monitor/nodes/tasks[n]",
+          "parent_node": "fDlEl7PrQi6F-awHZ3aaDw",
+          "parent_id": 105
+        }
+      ]
+    }
+  }
+}
+--------------------------------------------------
--- a/docs/reference/docs/reindex.asciidoc
+++ b/docs/reference/docs/reindex.asciidoc
@@ -1,8 +1,8 @@
 [[docs-reindex]]
 == Reindex API

-`_reindex`'s most basic form just copies documents from one index to another.
-This will copy documents from `twitter` into `new_twitter`:
+The most basic form of `_reindex` just copies documents from one index to another.
+This will copy documents from the `twitter` index into the `new_twitter` index:

 [source,js]
 --------------------------------------------------
@@ -32,12 +32,13 @@ That will return something like this:
 }
 --------------------------------------------------

-Just like `_update_by_query`, `_reindex` gets a snapshot of the source index
-but its target must be a **different** index so version conflicts are unlikely.
-The `dest` element can be configured like the index API to control optimistic
-concurrency control. Just leaving out `version_type` (as above) or setting it
-to `internal` will cause Elasticsearch to blindly dump documents into the
-target, overwriting any that happen to have the same type and id:
+Just like <<docs-update-by-query,`_update_by_query`>>, `_reindex` gets a
+snapshot of the source index but its target must be a **different** index so
+version conflicts are unlikely. The `dest` element can be configured like the
+index API to control optimistic concurrency control. Just leaving out
+`version_type` (as above) or setting it to `internal` will cause Elasticsearch
+to blindly dump documents into the target, overwriting any that happen to have
+the same type and id:

 [source,js]
 --------------------------------------------------
@@ -113,7 +114,7 @@ POST /_reindex
 // AUTOSENSE

 You can limit the documents by adding a type to the `source` or by adding a
-query. This will only copy `tweet`s made by `kimchy` into `new_twitter`:
+query. This will only copy ++tweet++&apos;s made by `kimchy` into `new_twitter`:

 [source,js]
 --------------------------------------------------
@@ -140,9 +141,9 @@ lots of sources in one request. This will copy documents from the `tweet` and
 `post` types in the `twitter` and `blog` index. It'd include the `post` type in
 the `twitter` index and the `tweet` type in the `blog` index. If you want to be
 more specific you'll need to use the `query`. It also makes no effort to handle
-id collisions. The target index will remain valid but it's not easy to predict
+ID collisions. The target index will remain valid but it's not easy to predict
 which document will survive because the iteration order isn't well defined.
-Just avoid that situation, ok?
+
 [source,js]
 --------------------------------------------------
 POST /_reindex
@@ -222,14 +223,15 @@ POST /_reindex

 Think of the possibilities! Just be careful! With great power.... You can
 change:
- * "_id"
- * "_type"
- * "_index"
- * "_version"
- * "_routing"
- * "_parent"
- * "_timestamp"
- * "_ttl"
+
+ * `_id`
+ * `_type`
+ * `_index`
+ * `_version`
+ * `_routing`
+ * `_parent`
+ * `_timestamp`
+ * `_ttl`

 Setting `_version` to `null` or clearing it from the `ctx` map is just like not
 sending the version in an indexing request. It will cause that document to be
@@ -257,6 +259,7 @@ the `=`.
 For example, you can use the following request to copy all documents from
 the `source` index with the company name `cat` into the `dest` index with
 routing set to `cat`.
+
 [source,js]
 --------------------------------------------------
 POST /_reindex
@@ -316,7 +319,7 @@ Elasticsearch log file. This will be fixed soon.
 `consistency` controls how many copies of a shard must respond to each write
 request. `timeout` controls how long each write request waits for unavailable
 shards to become available. Both work exactly how they work in the
-{ref}/docs-bulk.html[Bulk API].
+<<docs-bulk,Bulk API>>.

 `requests_per_second` can be set to any decimal number (1.4, 6, 1000, etc) and
 throttle the number of requests per second that the reindex issues. The
@@ -385,7 +388,7 @@ from aborting the operation.
 === Works with the Task API

 While Reindex is running you can fetch their status using the
-{ref}/task/list.html[Task List APIs]:
+<<nodes-task,Nodes Task API>>:

 [source,js]
 --------------------------------------------------
--- a/docs/reference/docs/update-by-query.asciidoc
+++ b/docs/reference/docs/update-by-query.asciidoc
@@ -56,7 +56,7 @@ POST /twitter/tweet/_update_by_query?conflicts=proceed
 // AUTOSENSE

 You can also limit `_update_by_query` using the
-{ref}/query-dsl.html[Query DSL]. This will update all documents from the
+<<query-dsl,Query DSL>>. This will update all documents from the
 `twitter` index for the user `kimchy`:

 [source,js]
@@ -73,7 +73,7 @@ POST /twitter/_update_by_query?conflicts=proceed
 // AUTOSENSE

 <1> The query must be passed as a value to the `query` key, in the same
-way as the {ref}/search-search.html[Search API]. You can also use the `q`
+way as the <<search-search,Search API>>. You can also use the `q`
 parameter in the same way as the search api.

 So far we've only been updating documents without changing their source. That
@@ -81,6 +81,7 @@ is genuinely useful for things like
 <<picking-up-a-new-property,picking up new properties>> but it's only half the
 fun. `_update_by_query` supports a `script` object to update the document. This
 will increment the `likes` field on all of kimchy's tweets:
+
 [source,js]
 --------------------------------------------------
 POST /twitter/_update_by_query
@@ -97,7 +98,7 @@ POST /twitter/_update_by_query
 --------------------------------------------------
 // AUTOSENSE

-Just as in {ref}/docs-update.html[Update API] you can set `ctx.op = "noop"` if
+Just as in <<docs-update,Update API>> you can set `ctx.op = "noop"` if
 your script decides that it doesn't have to make any changes. That will cause
 `_update_by_query` to omit that document from its updates. Setting `ctx.op` to
 anything else is an error. If you want to delete by a query you can use the
@@ -167,7 +168,7 @@ the Elasticsearch log file. This will be fixed soon.
 `consistency` controls how many copies of a shard must respond to each write
 request. `timeout` controls how long each write request waits for unavailable
 shards to become available. Both work exactly how they work in the
-{ref}/docs-bulk.html[Bulk API].
+<<docs-bulk,Bulk API>>.

 `requests_per_second` can be set to any decimal number (1.4, 6, 1000, etc) and
 throttle the number of requests per second that the update by query issues. The
@@ -232,7 +233,7 @@ from aborting the operation.
 === Works with the Task API

 While Update By Query is running you can fetch their status using the
-{ref}/task/list.html[Task List APIs]:
+<<nodes-task,Nodes Task API>>:

 [source,js]
 --------------------------------------------------
@@ -285,6 +286,7 @@ progress by adding the `updated`, `created`, and `deleted` fields. The request
 will finish when their sum is equal to the `total` field.


+[float]
 [[picking-up-a-new-property]]
 === Pick up a new property

@@ -379,4 +381,4 @@ POST test/_search?filter_path=hits.total
 }
 --------------------------------------------------

-Hurray! You can do the exact same thing when adding a field to a multifield.
+You can do the exact same thing when adding a field to a multifield.
--- a/docs/reference/tasks/list.asciidoc
+++ b/docs/reference/tasks/list.asciidoc
@@ -1,46 +0,0 @@
-[[tasks-list]]
-== Tasks List
-
-The task management API allows to retrieve information about currently running tasks.
-
-[source,js]
--------------------------------------------------
-curl -XGET 'http://localhost:9200/_tasks'
-curl -XGET 'http://localhost:9200/_tasks/nodeId1,nodeId2'
-curl -XGET 'http://localhost:9200/_tasks/nodeId1,nodeId2/cluster:*'
--------------------------------------------------
-
-The first command retrieves all tasks currently running on all nodes.
-The second command selectively retrieves tasks from nodes
-`nodeId1` and `nodeId2`. All the nodes selective options are explained
-<<cluster-nodes,here>>.
-The third command retrieves all cluster-related tasks running on nodes `nodeId1` and `nodeId2`.
-
-The result will look similar to:
-
-[source,js]
--------------------------------------------------
-{
-  "nodes" : {
-    "fDlEl7PrQi6F-awHZ3aaDw" : {
-      "name" : "Gazer",
-      "transport_address" : "127.0.0.1:9300",
-      "host" : "127.0.0.1",
-      "ip" : "127.0.0.1:9300",
-      "tasks" : [ {
-        "node" : "fDlEl7PrQi6F-awHZ3aaDw",
-        "id" : 105,
-        "type" : "transport",
-        "action" : "cluster:monitor/nodes/tasks"
-      }, {
-        "node" : "fDlEl7PrQi6F-awHZ3aaDw",
-        "id" : 106,
-        "type" : "direct",
-        "action" : "cluster:monitor/nodes/tasks[n]",
-        "parent_node" : "fDlEl7PrQi6F-awHZ3aaDw",
-        "parent_id" : 105
-      } ]
-    }
-  }
-}
--------------------------------------------------