[DOCS] Reformat index API. (#45415)

* [DOCS] Reformat index API.

* Incorporated review feedback.
debadair 2019-08-12 14:50:15 -07:00 committed by Deb Adair
parent 3393f9599e
commit e9e9526192
2 changed files with 278 additions and 121 deletions


@ -1,77 +1,144 @@
[[docs-index_]] [[docs-index_]]
=== Index API === Index API
++++
<titleabbrev>Index</titleabbrev>
++++
IMPORTANT: See <<removal-of-types>>. IMPORTANT: See <<removal-of-types>>.
The index API adds or updates a JSON document in a specific index, Adds a JSON document to the specified index and makes
making it searchable. The following example inserts the JSON document it searchable. If the document already exists,
into the "twitter" index with an id of 1: updates the document and increments its version.
[source,js] [[docs-index-api-request]]
-------------------------------------------------- ==== {api-request-title}
PUT twitter/_doc/1
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE
The result of the above index operation is: `PUT /<index>/_doc/<_id>`
[source,js] `POST /<index>/_doc/`
--------------------------------------------------
{
"_shards" : {
"total" : 2,
"failed" : 0,
"successful" : 2
},
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"result" : "created"
}
--------------------------------------------------
// TESTRESPONSE[s/"successful" : 2/"successful" : 1/]
The `_shards` header provides information about the replication process of the index operation: `PUT /<index>/_create/<_id>`
`total`:: Indicates how many shard copies (primary and replica shards) the index operation should be executed on. `POST /<index>/_create/<_id>`
`successful`:: Indicates the number of shard copies the index operation succeeded on.
`failed`:: An array that contains replication-related errors in the case an index operation failed on a replica shard.
The index operation is successful in the case `successful` is at least 1. [[docs-index-api-path-params]]
==== {api-path-parms-title}
NOTE: Replica shards may not all be started when an indexing operation successfully returns (by default, only the `<index>`::
primary is required, but this behavior can be <<index-wait-for-active-shards,changed>>). In that case, (Required, string) Name of the target index. By default, the index is created
`total` will be equal to the total shards based on the `number_of_replicas` setting and `successful` will be automatically if it doesn't exist. For more information, see <<index-creation>>.
equal to the number of shards started (primary plus replicas). If there were no failures, the `failed` will be 0.
`<_id>`::
(Optional, string) Unique identifier for the document. Required if you are
using a PUT request. Omit to automatically generate an ID when using a
POST request.
[[docs--api-query-params]]
==== {api-query-parms-title}
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-seq-no]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-primary-term]
`op_type`::
(Optional, enum) Set to `create` to only index the document
if it does not already exist (_put if absent_). If a document with the specified
`_id` already exists, the indexing operation will fail. Same as using the
`<index>/_create` endpoint. Valid values: `index`, `create`. Default: `index`.
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-pipeline]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-refresh]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-routing]
include::{docdir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-version]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-version-type]
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-wait-for-active-shards]
[[docs-index-api-request-body]]
==== {api-request-body-title}
`<field>`::
(Required, string) Request body contains the JSON source for the document
data.
[[docs-index-api-response-body]]
==== {api-response-body-title}
`_shards`::
Provides information about the replication process of the index operation.
`_shards.total`::
Indicates how many shard copies (primary and replica shards) the index operation
should be executed on.
`_shards.successful`::
Indicates the number of shard copies the index operation succeeded on.
When the index operation is successful, `successful` is at least 1.
+
NOTE: Replica shards might not all be started when an indexing operation
returns successfully--by default, only the primary is required. Set
`wait_for_active_shards` to change this default behavior. See
<<index-wait-for-active-shards>>.
`_shards.failed`::
An array that contains replication-related errors in the case an index operation
failed on a replica shard. 0 indicates there were no failures.
`_index`::
The name of the index the document was added to.
`_type`::
The document type. {es} indices now support a single document type, `_doc`.
`_id`::
The unique identifier for the added document.
`_version`::
The document version. Incremented each time the document is updated.
`_seq_no`::
The sequence number assigned to the document for the indexing operation.
Sequence numbers are used to ensure an older version of a document
doesn't overwrite a newer version. See <<optimistic-concurrency-control-index>>.
`_primary_term`::
The primary term assigned to the document for the indexing operation.
See <<optimistic-concurrency-control-index>>.
`result`::
The result of the indexing operation, `created` or `updated`.
[[docs-index-api-desc]]
==== {api-description-title}
You can index a new JSON document with the `_doc` or `_create` resource. Using
`_create` guarantees that the document is only indexed if it does not already
exist. To update an existing document, you must use the `_doc` resource.
[float]
[[index-creation]] [[index-creation]]
==== Automatic Index Creation ===== Create indices automatically
The index operation automatically creates an index if it does not already If the specified index does not already exist, by default the index operation
exist, and applies any <<indices-templates,index templates>> that are automatically creates it and applies any configured
configured. The index operation also creates a dynamic mapping if one does not <<indices-templates,index templates>>. If no mapping exists, the index operation
already exist. By default, new fields and objects will automatically be added creates a dynamic mapping. By default, new fields and objects are
to the mapping definition if needed. Check out the <<mapping,mapping>> section automatically added to the mapping if needed. For more information about field
for more information on mapping definitions, and the mapping, see <<mapping,mapping>> and the <<indices-put-mapping,put mapping>> API.
<<indices-put-mapping,put mapping>> API for information about updating mappings
manually.
Automatic index creation is controlled by the `action.auto_create_index` Automatic index creation is controlled by the `action.auto_create_index`
setting. This setting defaults to `true`, meaning that indices are always setting. This setting defaults to `true`, which allows any index to be created
automatically created. Automatic index creation can be permitted only for automatically. You can modify this setting to explicitly allow or block
indices matching certain patterns by changing the value of this setting to a automatic creation of indices that match specified patterns, or set it to
comma-separated list of these patterns. It can also be explicitly permitted and `false` to disable automatic index creation entirely. Specify a
forbidden by prefixing patterns in the list with a `+` or `-`. Finally it can comma-separated list of patterns you want to allow, or prefix each pattern with
be completely disabled by changing this setting to `false`. `+` or `-` to indicate whether it should be allowed or blocked.
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
@ -98,56 +165,30 @@ PUT _cluster/settings
-------------------------------------------------- --------------------------------------------------
// CONSOLE // CONSOLE
<1> Permit only the auto-creation of indices called `twitter`, `index10`, no <1> Allow auto-creation of indices called `twitter` or `index10`, block the
other index matching `index1*`, and any other index matching `ind*`. The creation of indices that match the pattern `index1*`, and allow creation of
patterns are matched in the order in which they are given. any other indices that match the `ind*` pattern. Patterns are matched in
the order specified.
<2> Completely disable the auto-creation of indices. <2> Disable automatic index creation entirely.
<3> Permit the auto-creation of indices with any name. This is the default. <3> Allow automatic creation of any index. This is the default.
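The matching rules described in the callouts can be sketched in Python. This is an illustrative model only, not the server implementation: the function name `auto_create_allowed` is hypothetical, `fnmatchcase` is used as a stand-in for wildcard matching, the setting string is reconstructed from the callout <1> description, and treating an unmatched name as blocked is an assumption taken from the "permit only" wording.

```python
from fnmatch import fnmatchcase


def auto_create_allowed(setting: str, index_name: str) -> bool:
    """Decide whether `index_name` may be auto-created under `setting`.

    Hypothetical helper that models `action.auto_create_index`.
    """
    value = setting.strip()
    if value.lower() in ("true", "false"):
        # `true` allows every index, `false` disables auto-creation entirely.
        return value.lower() == "true"

    for raw in value.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        # "-" blocks; "+" or no prefix allows.
        allow = not pattern.startswith("-")
        if pattern[0] in "+-":
            pattern = pattern[1:]
        if fnmatchcase(index_name, pattern):
            # Patterns are checked in order, so the first match decides.
            return allow
    # Assumption: a name that matches no pattern is not auto-created.
    return False


setting = "twitter,index10,-index1*,+ind*"
for name in ("twitter", "index10", "index11", "indigo", "logs"):
    print(name, auto_create_allowed(setting, name))
```

Because the first matching pattern wins, `index10` is allowed even though it also matches the later `-index1*` pattern.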
[float] [float]
[[operation-type]] [[operation-type]]
==== Operation Type ===== Put if absent
The index operation also accepts an `op_type` that can be used to force You can force a create operation by using the `_create` resource or
a `create` operation, allowing for "put-if-absent" behavior. When setting the `op_type` parameter to _create_. In this case,
`create` is used, the index operation will fail if a document by that id the index operation fails if a document with the specified ID
already exists in the index. already exists in the index.
Here is an example of using the `op_type` parameter:
[source,js]
--------------------------------------------------
PUT twitter/_doc/1?op_type=create
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE
Another option to specify `create` is to use the following uri:
[source,js]
--------------------------------------------------
PUT twitter/_create/1
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE
[float] [float]
==== Automatic ID Generation ===== Create document IDs automatically
The index operation can be executed without specifying the id. In such a If you don't specify a document ID when using POST, the `op_type` is
case, an id will be generated automatically. In addition, the `op_type` automatically set to `create` and the index operation generates a unique ID
will automatically be set to `create`. Here is an example (note the for the document.
*POST* used instead of *PUT*):
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
@ -160,7 +201,7 @@ POST twitter/_doc/
-------------------------------------------------- --------------------------------------------------
// CONSOLE // CONSOLE
The result of the above index operation is: The API returns the following result:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
@ -183,17 +224,17 @@ The result of the above index operation is:
[float] [float]
[[optimistic-concurrency-control-index]] [[optimistic-concurrency-control-index]]
==== Optimistic concurrency control ===== Optimistic concurrency control
Index operations can be made conditional and only be performed if the last Index operations can be made conditional and only be performed if the last
modification to the document was assigned the sequence number and primary modification to the document was assigned the sequence number and primary
term specified by the `if_seq_no` and `if_primary_term` parameters. If a term specified by the `if_seq_no` and `if_primary_term` parameters. If a
mismatch is detected, the operation will result in a `VersionConflictException` mismatch is detected, the operation will result in a `VersionConflictException`
and a status code of 409. See <<optimistic-concurrency-control>> for more details. and a status code of 409. See <<optimistic-concurrency-control>> for more details.
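As a rough illustration of the conditional check, the following Python sketch models a single stored document in memory. The names `StoredDoc`, `VersionConflict`, and `index_if_unchanged` are hypothetical, and rejecting the request when no document exists is an assumption of this sketch rather than something stated above.

```python
from dataclasses import dataclass
from typing import Optional


class VersionConflict(Exception):
    """Stand-in for the conflict reported with a 409 status code."""


@dataclass
class StoredDoc:
    source: dict
    seq_no: int
    primary_term: int


def index_if_unchanged(
    stored: Optional[StoredDoc],
    new_source: dict,
    if_seq_no: int,
    if_primary_term: int,
    next_seq_no: int,
    primary_term: int,
) -> StoredDoc:
    """Index `new_source` only if the last modification matches both values."""
    if (
        stored is None  # assumption: a missing document counts as a mismatch
        or stored.seq_no != if_seq_no
        or stored.primary_term != if_primary_term
    ):
        raise VersionConflict("sequence number or primary term mismatch")
    # The write succeeds and is assigned a new sequence number.
    return StoredDoc(new_source, next_seq_no, primary_term)


doc = StoredDoc({"message": "v1"}, seq_no=5, primary_term=1)
doc = index_if_unchanged(doc, {"message": "v2"}, 5, 1, next_seq_no=6, primary_term=1)
print(doc.seq_no)
```

A second writer that still holds `if_seq_no=5` would now hit `VersionConflict`, which is the behavior the 409 response signals to the client.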
[float] [float]
[[index-routing]] [[index-routing]]
==== Routing ===== Routing
By default, shard placement -- or `routing` -- is controlled by using a By default, shard placement -- or `routing` -- is controlled by using a
hash of the document's id value. For more explicit control, the value hash of the document's id value. For more explicit control, the value
@ -211,11 +252,11 @@ POST twitter/_doc?routing=kimchy
-------------------------------------------------- --------------------------------------------------
// CONSOLE // CONSOLE
In the example above, the "_doc" document is routed to a shard based on In this example, the document is routed to a shard based on
the `routing` parameter provided: "kimchy". the `routing` parameter provided: "kimchy".
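The idea of hash-based shard selection can be sketched as follows. This is a simplified model under stated assumptions: `pick_shard` is a hypothetical helper, `zlib.crc32` stands in for whatever hash function the server actually uses, and reducing the hash with a modulo over the number of primary shards is an assumption of this sketch.

```python
import zlib
from typing import Optional


def pick_shard(doc_id: str, num_primary_shards: int, routing: Optional[str] = None) -> int:
    """Pick a shard from the routing value, falling back to the document ID."""
    key = routing if routing is not None else doc_id
    # crc32 is only a placeholder hash for illustration.
    return zlib.crc32(key.encode("utf-8")) % num_primary_shards


# Documents that share a routing value land on the same shard,
# regardless of their IDs.
print(pick_shard("1", 5, routing="kimchy") == pick_shard("2", 5, routing="kimchy"))
```

The point of the sketch is the property, not the exact shard number: a shared `routing` value keeps related documents together on one shard.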
When setting up explicit mapping, the `_routing` field can be optionally When setting up explicit mapping, you can also use the `_routing` field
used to direct the index operation to extract the routing value from the to direct the index operation to extract the routing value from the
document itself. This does come at the (very minimal) cost of an document itself. This does come at the (very minimal) cost of an
additional document parsing pass. If the `_routing` mapping is defined additional document parsing pass. If the `_routing` mapping is defined
and set to be `required`, the index operation will fail if no routing and set to be `required`, the index operation will fail if no routing
@ -223,7 +264,7 @@ value is provided or extracted.
[float] [float]
[[index-distributed]] [[index-distributed]]
==== Distributed ===== Distributed
The index operation is directed to the primary shard based on its route The index operation is directed to the primary shard based on its route
(see the Routing section above) and performed on the actual node (see the Routing section above) and performed on the actual node
@ -232,7 +273,7 @@ if needed, the update is distributed to applicable replicas.
[float] [float]
[[index-wait-for-active-shards]] [[index-wait-for-active-shards]]
==== Wait For Active Shards ===== Active shards
To improve the resiliency of writes to the system, indexing operations To improve the resiliency of writes to the system, indexing operations
can be configured to wait for a certain number of active shard copies can be configured to wait for a certain number of active shard copies
@ -290,14 +331,14 @@ replication succeeded/failed.
[float] [float]
[[index-refresh]] [[index-refresh]]
==== Refresh ===== Refresh
Control when the changes made by this request are visible to search. See Control when the changes made by this request are visible to search. See
<<docs-refresh,refresh>>. <<docs-refresh,refresh>>.
[float] [float]
[[index-noop]] [[index-noop]]
==== Noop Updates ===== Noop updates
When updating a document using the index API a new version of the document is When updating a document using the index API a new version of the document is
always created even if the document hasn't changed. If this isn't acceptable always created even if the document hasn't changed. If this isn't acceptable
@ -312,7 +353,7 @@ Elasticsearch runs on the shard receiving the updates.
[float] [float]
[[timeout]] [[timeout]]
==== Timeout ===== Timeout
The primary shard assigned to perform the index operation might not be The primary shard assigned to perform the index operation might not be
available when the index operation is executed. Some reasons for this available when the index operation is executed. Some reasons for this
@ -336,15 +377,15 @@ PUT twitter/_doc/1?timeout=5m
[float] [float]
[[index-versioning]] [[index-versioning]]
==== Versioning ===== Versioning
Each indexed document is given a version number. By default, Each indexed document is given a version number. By default,
internal versioning is used that starts at 1 and increments internal versioning is used that starts at 1 and increments
with each update, deletes included. Optionally, the version number can be with each update, deletes included. Optionally, the version number can be
set to an external value (for example, if maintained in a set to an external value (for example, if maintained in a
database). To enable this functionality, `version_type` should be set to database). To enable this functionality, `version_type` should be set to
`external`. The value provided must be a numeric, long value greater than or equal to 0, `external`. The value provided must be a numeric, long value greater than or equal to 0,
and less than around 9.2e+18. and less than around 9.2e+18.
When using the external version type, the system checks to see if When using the external version type, the system checks to see if
the version number passed to the index request is greater than the the version number passed to the index request is greater than the
@ -363,11 +404,12 @@ PUT twitter/_doc/1?version=2&version_type=external
// CONSOLE // CONSOLE
// TEST[continued] // TEST[continued]
*NOTE:* Versioning is completely real time, and is not affected by the NOTE: Versioning is completely real time, and is not affected by the
near real time aspects of search operations. If no version is provided, near real time aspects of search operations. If no version is provided,
then the operation is executed without any version checks. then the operation is executed without any version checks.
The above will succeed since the supplied version of 2 is higher than In the previous example, the operation will succeed since the supplied
version of 2 is higher than
the current document version of 1. If the document was already updated the current document version of 1. If the document was already updated
and its version was set to 2 or higher, the indexing command will fail and its version was set to 2 or higher, the indexing command will fail
and result in a conflict (409 HTTP status code). and result in a conflict (409 HTTP status code).
@ -381,11 +423,11 @@ latest version will be used if the index operations arrive out of order for
whatever reason. whatever reason.
[float] [float]
[[index-version-types]]
===== Version types ===== Version types
Next to the `external` version type explained above, Elasticsearch In addition to the `external` version type, Elasticsearch
also supports other types for specific use cases. Here is an overview of also supports other types for specific use cases:
the different version types and their semantics.
`internal`:: Only index the document if the given version is identical to the version `internal`:: Only index the document if the given version is identical to the version
of the stored document. of the stored document.
@ -400,8 +442,72 @@ than the version of the stored document. If there is no existing document
the operation will succeed as well. The given version will be used as the new version the operation will succeed as well. The given version will be used as the new version
and will be stored with the new document. The supplied version must be a non-negative long number. and will be stored with the new document. The supplied version must be a non-negative long number.
*NOTE*: The `external_gte` version type is meant for special use cases and NOTE: The `external_gte` version type is meant for special use cases and
should be used with care. If used incorrectly, it can result in loss of data. should be used with care. If used incorrectly, it can result in loss of data.
There is another option, `force`, which is deprecated because it can cause There is another option, `force`, which is deprecated because it can cause
primary and replica shards to diverge. primary and replica shards to diverge.
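The comparison rules for the version types can be summarized in a short Python sketch. The function `version_check_passes` is hypothetical and covers only `internal`, `external`, and `external_gte`. Treating a missing document as a failed `internal` check is an assumption of this sketch.

```python
from typing import Optional


def version_check_passes(version_type: str, given: int, stored: Optional[int]) -> bool:
    """Return True if a request carrying `given` may overwrite version `stored`.

    `stored` is None when no document exists yet.
    """
    if version_type == "internal":
        # The given version must be identical to the stored version.
        return stored is not None and given == stored
    if version_type == "external":
        # The given version must be strictly higher, or the document is new.
        return stored is None or given > stored
    if version_type == "external_gte":
        # The given version must be equal or higher, or the document is new.
        return stored is None or given >= stored
    raise ValueError(f"unsupported version_type: {version_type}")


print(version_check_passes("external", 2, 1))      # version 2 over stored 1
print(version_check_passes("external", 2, 2))      # same version is rejected
print(version_check_passes("external_gte", 2, 2))  # same version is accepted
```

The only difference between `external` and `external_gte` is whether an equal version is accepted, which is why the latter can silently overwrite data if used carelessly.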
[[docs-index-api-example]]
==== {api-examples-title}
Insert a JSON document into the `twitter` index with an `_id` of 1:
[source,js]
--------------------------------------------------
PUT twitter/_doc/1
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE
The API returns the following result:
[source,js]
--------------------------------------------------
{
"_shards" : {
"total" : 2,
"failed" : 0,
"successful" : 2
},
"_index" : "twitter",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"result" : "created"
}
--------------------------------------------------
// TESTRESPONSE[s/"successful" : 2/"successful" : 1/]
Use the `_create` resource to index a document into the `twitter` index if
no document with that ID exists:
[source,js]
--------------------------------------------------
PUT twitter/_create/1
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE
Set the `op_type` parameter to _create_ to index a document into the `twitter`
index if no document with that ID exists:
[source,js]
--------------------------------------------------
PUT twitter/_doc/1?op_type=create
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE


@ -40,7 +40,7 @@ end::cat-h[]
tag::flat-settings[] tag::flat-settings[]
`flat_settings`:: `flat_settings`::
(Optional, boolean) If `true`, returns settings in flat format. Defaults to (Optional, boolean) If `true`, returns settings in flat format. Defaults to
`false`. `false`.
end::flat-settings[] end::flat-settings[]
@ -112,6 +112,57 @@ tag::cat-v[]
to `false`. to `false`.
end::cat-v[] end::cat-v[]
tag::doc-pipeline[]
`pipeline`::
(Optional, string) ID of the pipeline to use to preprocess incoming documents.
end::doc-pipeline[]
tag::doc-refresh[]
`refresh`::
(Optional, enum) If `true`, {es} refreshes the affected shards to make this
operation visible to search. If `wait_for`, {es} waits for a refresh to make
this operation visible to search. If `false`, {es} does nothing with refreshes.
Valid values: `true`, `false`, `wait_for`. Default: `false`.
end::doc-refresh[]
tag::doc-seq-no[]
`if_seq_no`::
(Optional, integer) Only perform the operation if the document has this
sequence number. See <<optimistic-concurrency-control-index>>.
end::doc-seq-no[]
tag::doc-primary-term[]
`if_primary_term`::
(Optional, integer) Only perform the operation if the document has
this primary term. See <<optimistic-concurrency-control-index>>.
end::doc-primary-term[]
tag::doc-routing[]
`routing`::
(Optional, string) Custom value used to route the operation to a specific primary shard.
end::doc-routing[]
tag::doc-version[]
`version`::
(Optional, integer) Explicit version number for concurrency control.
The specified version must match the current version of the document for the
request to succeed.
end::doc-version[]
tag::doc-version-type[]
`version_type`::
(Optional, enum) Specific version type: `internal`, `external`,
`external_gte`, `force`.
end::doc-version-type[]
tag::doc-wait-for-active-shards[]
`wait_for_active_shards`::
(Optional, string) The number of shard copies that must be active before
proceeding with the operation. Set to `all` or any positive integer up
to the total number of copies of each shard (`number_of_replicas+1`).
Default: 1, the primary shard.
end::doc-wait-for-active-shards[]
tag::timeoutparms[] tag::timeoutparms[]
`timeout`:: `timeout`::
@ -126,4 +177,4 @@ a connection to the master node. If no response is received before the timeout
expires, the request fails and returns an error. Defaults to `30s`. expires, the request fails and returns an error. Defaults to `30s`.
end::master-timeout[] end::master-timeout[]
end::timeoutparms[] end::timeoutparms[]