Documentation changes for wait_for_active_shards (#19581)

Documentation changes and migration doc changes for introducing 
wait_for_active_shards and removing write consistency level.

Closes #19581
Ali Beyad 2016-08-02 09:15:01 -04:00 committed by GitHub
parent 3d2a105825
commit a21dd80f1b
13 changed files with 170 additions and 67 deletions

View File

@ -128,27 +128,21 @@ field. It automatically follows the behavior of the index / delete
operation based on the `_parent` / `_routing` mapping.
[float]
[[bulk-consistency]]
=== Write Consistency
[[bulk-wait-for-active-shards]]
=== Wait For Active Shards
When making bulk calls, you can require a minimum number of active
shards in the partition through the `consistency` parameter. The values
allowed are `one`, `quorum`, and `all`. It defaults to the node level
setting of `action.write_consistency`, which in turn defaults to
`quorum`.
For example, in an index with N shards and 2 replicas, there will have to be
at least 2 active shards within the relevant partition (`quorum`) for
the operation to succeed. In an index with N shards and 1 replica, there
will need to be a single active shard (in this case, `one` and `quorum`
are the same).
When making bulk calls, you can set the `wait_for_active_shards`
parameter to require a minimum number of shard copies to be active
before starting to process the bulk request. See
<<index-wait-for-active-shards,here>> for further details and a usage
example.
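As a minimal sketch, a bulk request that should only execute once at least two
copies of each affected shard are active might look as follows (the `twitter`
index, the document, and the value `2` are illustrative only):
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/_bulk?wait_for_active_shards=2' --data-binary '
{ "index" : { "_index" : "twitter", "_type" : "tweet", "_id" : "1" } }
{ "message" : "testing wait_for_active_shards on a bulk request" }
'
--------------------------------------------------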
[float]
[[bulk-refresh]]
=== Refresh
Control when the changes made by this request are visible to search. See
<<docs-refresh>>.
<<docs-refresh,refresh>>.
[float]
[[bulk-update]]

View File

@ -144,7 +144,7 @@ POST twitter/_delete_by_query?scroll_size=5000
=== URL Parameters
In addition to the standard parameters like `pretty`, the Delete By Query API
also supports `refresh`, `wait_for_completion`, `consistency`, and `timeout`.
also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, and `timeout`.
Sending the `refresh` will refresh all shards involved in the delete by query
once the request completes. This is different than the Delete API's `refresh`
@ -159,8 +159,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
to keep or remove as you see fit. When you are done with it, delete it so
Elasticsearch can reclaim the space it uses.
`consistency` controls how many copies of a shard must respond to each write
request. `timeout` controls how long each write request waits for unavailable
`wait_for_active_shards` controls how many copies of a shard must be active
before proceeding with the request. See <<index-wait-for-active-shards,here>>
for details. `timeout` controls how long each write request waits for unavailable
shards to become available. Both work exactly how they work in the
<<docs-bulk,Bulk API>>.
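For example, a sketch of a delete by query that requires two active copies of
each shard and gives each write a one minute timeout (the `twitter` index and
the `user` field are illustrative only):
[source,js]
--------------------------------------------------
POST twitter/_delete_by_query?wait_for_active_shards=2&timeout=1m
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------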

View File

@ -95,20 +95,14 @@ redirected into the primary shard within that id group, and replicated
(if needed) to shard replicas within that id group.
[float]
[[delete-consistency]]
=== Write Consistency
[[delete-wait-for-active-shards]]
=== Wait For Active Shards
Control if the operation will be allowed to execute based on the number
of active shards within that partition (replication group). The values
allowed are `one`, `quorum`, and `all`. The parameter to set it is
`consistency`, and it defaults to the node level setting of
`action.write_consistency` which in turn defaults to `quorum`.
For example, in an index with N shards and 2 replicas, there will have to be
at least 2 active shards within the relevant partition (`quorum`) for
the operation to succeed. In an index with N shards and 1 replica, there
will need to be a single active shard (in this case, `one` and `quorum`
are the same).
When making delete requests, you can set the `wait_for_active_shards`
parameter to require a minimum number of shard copies to be active
before starting to process the delete request. See
<<index-wait-for-active-shards,here>> for further details and a usage
example.
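As a minimal sketch, a delete request that should only proceed once at least
two copies of the relevant shard are active (index, type, and id are
illustrative only):
[source,js]
--------------------------------------------------
curl -XDELETE 'localhost:9200/twitter/tweet/1?wait_for_active_shards=2'
--------------------------------------------------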
[float]
[[delete-refresh]]

View File

@ -44,10 +44,10 @@ The `_shards` header provides information about the replication process of the i
The index operation is successful if `successful` is at least 1.
NOTE: Replica shards may not all be started when an indexing operation successfully returns (by default, a quorum is
required). In that case, `total` will be equal to the total shards based on the index replica settings and
`successful` will be equal to the number of shards started (primary plus replicas). As there were no failures,
the `failed` will be 0.
NOTE: Replica shards may not all be started when an indexing operation successfully returns (by default, only the
primary is required, but this behavior can be <<index-wait-for-active-shards,changed>>). In that case,
`total` will be equal to the total shards based on the `number_of_replicas` setting and `successful` will be
equal to the number of shards started (primary plus replicas). If there were no failures, the `failed` will be 0.
[float]
[[index-creation]]
@ -308,31 +308,68 @@ containing this shard. After the primary shard completes the operation,
if needed, the update is distributed to applicable replicas.
[float]
[[index-consistency]]
=== Write Consistency
[[index-wait-for-active-shards]]
=== Wait For Active Shards
To prevent writes from taking place on the "wrong" side of a network
partition, by default, index operations only succeed if a quorum
(>replicas/2+1) of active shards are available. This default can be
overridden on a node-by-node basis using the `action.write_consistency`
setting. To alter this behavior per-operation, the `consistency` request
parameter can be used.
To improve the resiliency of writes to the system, indexing operations
can be configured to wait for a certain number of active shard copies
before proceeding with the operation. If the requisite number of active
shard copies are not available, then the write operation must wait and
retry, until either the requisite shard copies have started or a timeout
occurs. By default, write operations only wait for the primary shards
to be active before proceeding (i.e. `wait_for_active_shards=1`).
This default can be overridden in the index settings dynamically
by setting `index.write.wait_for_active_shards`. To alter this behavior
per operation, the `wait_for_active_shards` request parameter can be used.
Valid write consistency values are `one`, `quorum`, and `all`.
Valid values are `all` or any positive integer up to the total number
of configured copies per shard in the index (which is `number_of_replicas+1`).
Specifying a negative value or a number greater than the number of
shard copies will throw an error.
Note that when the number of replicas is 1 (a total of 2 copies
of the data), the default behavior is to succeed if 1 copy (the primary)
can perform the write.
For example, suppose we have a cluster of three nodes, `A`, `B`, and `C`, and
we create an index `index` with the number of replicas set to 3 (resulting in
4 shard copies, one more copy than there are nodes). If we
attempt an indexing operation, by default the operation will only ensure
the primary copy of each shard is available before proceeding. This means
that even if `B` and `C` went down, and `A` hosted the primary shard copies,
the indexing operation would still proceed with only one copy of the data.
If `wait_for_active_shards` is set on the request to `3` (and all 3 nodes
are up), then the indexing operation will require 3 active shard copies
before proceeding, a requirement which should be met because there are 3
active nodes in the cluster, each one holding a copy of the shard. However,
if we set `wait_for_active_shards` to `all` (or to `4`, which is the same),
the indexing operation will not proceed as we do not have all 4 copies of
each shard active in the index. The operation will time out
unless a new node is brought up in the cluster to host the fourth copy of
the shard.
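As a sketch of the example above, an indexing request against `index` that
waits for three active copies of the target shard, bounded by a one minute
timeout (the type name and document body are illustrative only):
[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/index/type/1?wait_for_active_shards=3&timeout=1m' -d '{
    "message" : "testing wait_for_active_shards"
}'
--------------------------------------------------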
The index operation only returns after all *active* shards within the
replication group have indexed the document (sync replication).
It is important to note that this setting greatly reduces the chances of
the write operation not writing to the requisite number of shard copies,
but it does not completely eliminate the possibility, because this check
occurs before the write operation commences. Once the write operation
is underway, it is still possible for replication to fail on any number of
shard copies but still succeed on the primary. The `_shards` section of the
write operation's response reveals the number of shard copies on which
replication succeeded/failed.
[source,js]
--------------------------------------------------
{
"_shards" : {
"total" : 2,
"failed" : 0,
"successful" : 2
}
}
--------------------------------------------------
[float]
[[index-refresh]]
=== Refresh
Control when the changes made by this request are visible to search. See
<<docs-refresh>>.
<<docs-refresh,refresh>>.
[float]
[[index-noop]]

View File

@ -419,7 +419,7 @@ is sent directly to the remote host without validation or modification.
=== URL Parameters
In addition to the standard parameters like `pretty`, the Reindex API also
supports `refresh`, `wait_for_completion`, `consistency`, `timeout`, and
supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, `timeout`, and
`requests_per_second`.
Sending the `refresh` url parameter will cause all indexes to which the request
@ -434,8 +434,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
to keep or remove as you see fit. When you are done with it, delete it so
Elasticsearch can reclaim the space it uses.
`consistency` controls how many copies of a shard must respond to each write
request. `timeout` controls how long each write request waits for unavailable
`wait_for_active_shards` controls how many copies of a shard must be active
before proceeding with the reindexing. See <<index-wait-for-active-shards,here>>
for details. `timeout` controls how long each write request waits for unavailable
shards to become available. Both work exactly how they work in the
<<docs-bulk,Bulk API>>.
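For example, a sketch of a reindex that requires two active copies of each
destination shard and gives each write a one minute timeout (the `source` and
`dest` index names are illustrative only):
[source,js]
--------------------------------------------------
POST _reindex?wait_for_active_shards=2&timeout=1m
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest"
  }
}
--------------------------------------------------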

View File

@ -202,7 +202,7 @@ POST twitter/_update_by_query?pipeline=set-foo
=== URL Parameters
In addition to the standard parameters like `pretty`, the Update By Query API
also supports `refresh`, `wait_for_completion`, `consistency`, and `timeout`.
also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, and `timeout`.
Sending the `refresh` will update all shards in the index being updated when
the request completes. This is different than the Index API's `refresh`
@ -216,8 +216,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
to keep or remove as you see fit. When you are done with it, delete it so
Elasticsearch can reclaim the space it uses.
`consistency` controls how many copies of a shard must respond to each write
request. `timeout` controls how long each write request waits for unavailable
`wait_for_active_shards` controls how many copies of a shard must be active
before proceeding with the request. See <<index-wait-for-active-shards,here>>
for details. `timeout` controls how long each write request waits for unavailable
shards to become available. Both work exactly how they work in the
<<docs-bulk,Bulk API>>.
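For example, a sketch of an update by query over the whole index that requires
two active copies of each shard and a one minute timeout per write (the
`twitter` index is illustrative only):
[source,js]
--------------------------------------------------
POST twitter/_update_by_query?wait_for_active_shards=2&timeout=1m
--------------------------------------------------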

View File

@ -245,9 +245,10 @@ If an alias index routing is specified then it overrides the parent routing and
Timeout waiting for a shard to become available.
`consistency`::
`wait_for_active_shards`::
The write consistency of the index/delete operation.
The number of shard copies required to be active before proceeding with the update operation.
See <<index-wait-for-active-shards,here>> for details.
`refresh`::

View File

@ -260,7 +260,8 @@ And the response:
--------------------------------------------------
curl -XPUT 'localhost:9200/customer?pretty'
{
"acknowledged" : true
"acknowledged" : true,
"shards_acknowledged": true
}
curl 'localhost:9200/_cat/indices?v'

View File

@ -122,3 +122,53 @@ curl -XPUT localhost:9200/test -d '{
--------------------------------------------------
<1> `creation_date` is set using epoch time in milliseconds.
[float]
[[create-index-wait-for-active-shards]]
=== Wait For Active Shards
By default, index creation will only return a response to the client when the primary copies of
each shard have been started, or the request times out. The index creation response will indicate
what happened:
[source,js]
--------------------------------------------------
{
"acknowledged": true,
"shards_acknowledged": true
}
--------------------------------------------------
`acknowledged` indicates whether the index was successfully created in the cluster, while
`shards_acknowledged` indicates whether the requisite number of shard copies were started for
each shard in the index before timing out. Note that it is still possible for either
`acknowledged` or `shards_acknowledged` to be `false` even though the index creation was successful.
These values simply indicate whether the operation completed before the timeout. If
`acknowledged` is `false`, then we timed out before the cluster state was updated with the
newly created index, but it probably will be created sometime soon. If `shards_acknowledged`
is `false`, then we timed out before the requisite number of shards were started (by default
just the primaries), even if the cluster state was successfully updated to reflect the newly
created index (i.e. `acknowledged=true`).
We can change the default of only waiting for the primary shards to start through the index
setting `index.write.wait_for_active_shards` (note that changing this setting will also affect
the `wait_for_active_shards` value on all subsequent write operations):
[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test -d '{
"settings": {
"index.write.wait_for_active_shards": "2"
}
}
--------------------------------------------------
or through the request parameter `wait_for_active_shards`:
[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test?wait_for_active_shards=2
--------------------------------------------------
A detailed explanation of `wait_for_active_shards` and its possible values can be found
<<index-wait-for-active-shards,here>>.

View File

@ -126,3 +126,9 @@ POST logs_write/_rollover?dry_run
--------------------------------------------------
// CONSOLE
[float]
=== Wait For Active Shards
Because the rollover operation creates a new index to rollover to, the
<<create-index-wait-for-active-shards,wait for active shards>> setting on
index creation applies to the rollover action as well.
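For example, assuming the rollover request accepts the same
`wait_for_active_shards` request parameter, requiring two active copies of
each shard of the newly created index could be sketched as (the condition
shown is illustrative only):
[source,js]
--------------------------------------------------
POST logs_write/_rollover?wait_for_active_shards=2
{
  "conditions": {
    "max_age": "7d"
  }
}
--------------------------------------------------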

View File

@ -136,3 +136,9 @@ shrink process begins. When the shrink operation completes, the shard will
become `active`. At that point, Elasticsearch will try to allocate any
replicas and may decide to relocate the primary shard to another node.
[float]
=== Wait For Active Shards
Because the shrink operation creates a new index to shrink the shards to,
the <<create-index-wait-for-active-shards,wait for active shards>> setting
on index creation applies to the shrink index action as well.
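For example, assuming the shrink request accepts the same
`wait_for_active_shards` request parameter, a shrink that requires two active
copies of each shard of the new index could be sketched as (index names are
illustrative only):
[source,js]
--------------------------------------------------
POST my_source_index/_shrink/my_target_index?wait_for_active_shards=2
--------------------------------------------------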

View File

@ -91,6 +91,26 @@ language `org.codehaus.groovy:groovy` artifact.
error description). This will influence code that uses the `IndexRequest.opType()` or `IndexRequest.create()`
to index a document only if it doesn't already exist.
==== writeConsistencyLevel removed on write requests
In previous versions of Elasticsearch, the various write requests had a
`setWriteConsistencyLevel` method to set the shard consistency level for
write operations. However, the semantics of write consistency were ambiguous,
as this was just a pre-operation check to ensure the specified number of
shards were available before the operation commenced. The write consistency
level did not guarantee that the data would be replicated to that number
of copies by the time the operation finished. The `setWriteConsistencyLevel`
method on these write requests has been changed to `setWaitForActiveShards`,
which can take a numerical value up to the total number of shard copies or
`ActiveShardCount.ALL` for all shard copies. The default is to just wait
for the primary shard to be active before proceeding with the operation.
See the section on <<index-wait-for-active-shards,wait for active shards>>
for more details.
This change affects `IndexRequest`, `IndexRequestBuilder`, `BulkRequest`,
`BulkRequestBuilder`, `UpdateRequest`, `UpdateRequestBuilder`, `DeleteRequest`,
and `DeleteRequestBuilder`.
==== Changes to Query Builders
===== BoostingQueryBuilder

View File

@ -361,15 +361,6 @@ Certain versions of the JVM are known to have bugs which can cause index corrupt
When a node is experiencing network issues, the master detects it and removes the node from the cluster. That causes all ongoing recoveries from and to that node to be stopped, and a new location is found for the relevant shards. However, in the case of a partial network partition, where there are connectivity issues between the source and target nodes of a recovery but not between those nodes and the current master, things may go wrong. While the nodes successfully restore the connection, the ongoing recoveries may have encountered issues. In {GIT}8720[#8720], we added test simulations for these scenarios and solved several issues that they flagged.
[float]
=== Validate quorum before accepting a write request (STATUS: DONE, v1.4.0)
Today, when a node holding a primary shard receives an index request, it checks the local cluster state to see whether a quorum of shards is available before it accepts the request. However, it can take some time before an unresponsive node is removed from the cluster state. We are adding an optional live check, where the primary node tries to contact its replicas to confirm that they are still responding before accepting any changes. See {GIT}6937[#6937].
While the work is going on, we tightened the current checks by bringing them closer to the index code. See {GIT}7873[#7873] (STATUS: DONE, fixed in v1.4.0)
[float]
=== Improving Zen Discovery (STATUS: DONE, v1.4.0.Beta1)