Documentation changes for wait_for_active_shards (#19581)
Documentation and migration-guide changes for introducing `wait_for_active_shards` and removing the write consistency level. Closes #19581
Commit a21dd80f1b (parent 3d2a105825)
@@ -128,27 +128,21 @@ field. It automatically follows the behavior of the index / delete
 operation based on the `_parent` / `_routing` mapping.
 
 [float]
-[[bulk-consistency]]
-=== Write Consistency
+[[bulk-wait-for-active-shards]]
+=== Wait For Active Shards
 
-When making bulk calls, you can require a minimum number of active
-shards in the partition through the `consistency` parameter. The values
-allowed are `one`, `quorum`, and `all`. It defaults to the node level
-setting of `action.write_consistency`, which in turn defaults to
-`quorum`.
+When making bulk calls, you can set the `wait_for_active_shards`
+parameter to require a minimum number of shard copies to be active
+before starting to process the bulk request. See
+<<index-wait-for-active-shards,here>> for further details and a usage
+example.
 
-For example, in a N shards with 2 replicas index, there will have to be
-at least 2 active shards within the relevant partition (`quorum`) for
-the operation to succeed. In a N shards with 1 replica scenario, there
-will need to be a single shard active (in this case, `one` and `quorum`
-are the same).
-
 [float]
 [[bulk-refresh]]
 === Refresh
 
 Control when the changes made by this request are visible to search. See
-<<docs-refresh>>.
+<<docs-refresh,refresh>>.
 
 [float]
 [[bulk-update]]
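For illustration, a bulk call using the new parameter might look like the following. This is a sketch based on the new text; the index name, document, and the value `2` are invented for the example:

[source,js]
--------------------------------------------------
POST /_bulk?wait_for_active_shards=2
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
--------------------------------------------------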
@@ -144,7 +144,7 @@ POST twitter/_delete_by_query?scroll_size=5000
 === URL Parameters
 
 In addition to the standard parameters like `pretty`, the Delete By Query API
-also supports `refresh`, `wait_for_completion`, `consistency`, and `timeout`.
+also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, and `timeout`.
 
 Sending the `refresh` will refresh all shards involved in the delete by query
 once the request completes. This is different than the Delete API's `refresh`
@@ -159,8 +159,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
 to keep or remove as you see fit. When you are done with it, delete it so
 Elasticsearch can reclaim the space it uses.
 
-`consistency` controls how many copies of a shard must respond to each write
-request. `timeout` controls how long each write request waits for unavailable
+`wait_for_active_shards` controls how many copies of a shard must be active
+before proceeding with the request. See <<index-wait-for-active-shards,here>>
+for details. `timeout` controls how long each write request waits for unavailable
 shards to become available. Both work exactly how they work in the
 <<docs-bulk,Bulk API>>.
 
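As a sketch of the renamed parameter in context (the query and the values `2` and `5m` are illustrative, not taken from the change):

[source,js]
--------------------------------------------------
POST twitter/_delete_by_query?wait_for_active_shards=2&timeout=5m
{
    "query": {
        "match_all": {}
    }
}
--------------------------------------------------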
@@ -95,20 +95,14 @@ redirected into the primary shard within that id group, and replicated
 (if needed) to shard replicas within that id group.
 
 [float]
-[[delete-consistency]]
-=== Write Consistency
+[[delete-wait-for-active-shards]]
+=== Wait For Active Shards
 
-Control if the operation will be allowed to execute based on the number
-of active shards within that partition (replication group). The values
-allowed are `one`, `quorum`, and `all`. The parameter to set it is
-`consistency`, and it defaults to the node level setting of
-`action.write_consistency` which in turn defaults to `quorum`.
+When making delete requests, you can set the `wait_for_active_shards`
+parameter to require a minimum number of shard copies to be active
+before starting to process the delete request. See
+<<index-wait-for-active-shards,here>> for further details and a usage
+example.
 
-For example, in a N shards with 2 replicas index, there will have to be
-at least 2 active shards within the relevant partition (`quorum`) for
-the operation to succeed. In a N shards with 1 replica scenario, there
-will need to be a single shard active (in this case, `one` and `quorum`
-is the same).
-
 [float]
 [[delete-refresh]]
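A minimal sketch of a delete request using the parameter (the index, type, id, and the value `2` are invented for illustration):

[source,js]
--------------------------------------------------
DELETE /twitter/tweet/1?wait_for_active_shards=2
--------------------------------------------------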
@@ -44,10 +44,10 @@ The `_shards` header provides information about the replication process of the i
 
 The index operation is successful in the case `successful` is at least 1.
 
-NOTE: Replica shards may not all be started when an indexing operation successfully returns (by default, a quorum is
-required). In that case, `total` will be equal to the total shards based on the index replica settings and
-`successful` will be equal to the number of shards started (primary plus replicas). As there were no failures,
-the `failed` will be 0.
+NOTE: Replica shards may not all be started when an indexing operation successfully returns (by default, only the
+primary is required, but this behavior can be <<index-wait-for-active-shards,changed>>). In that case,
+`total` will be equal to the total shards based on the `number_of_replicas` setting and `successful` will be
+equal to the number of shards started (primary plus replicas). If there were no failures, the `failed` will be 0.
 
 [float]
 [[index-creation]]
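To make the NOTE concrete, here is a sketch of the `_shards` header for a single-replica index where only the primary had started when the call returned; the values are illustrative, not from the change:

[source,js]
--------------------------------------------------
{
    "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
    }
}
--------------------------------------------------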
@@ -308,31 +308,68 @@ containing this shard. After the primary shard completes the operation,
 if needed, the update is distributed to applicable replicas.
 
 [float]
-[[index-consistency]]
-=== Write Consistency
+[[index-wait-for-active-shards]]
+=== Wait For Active Shards
 
-To prevent writes from taking place on the "wrong" side of a network
-partition, by default, index operations only succeed if a quorum
-(>replicas/2+1) of active shards are available. This default can be
-overridden on a node-by-node basis using the `action.write_consistency`
-setting. To alter this behavior per-operation, the `consistency` request
-parameter can be used.
+To improve the resiliency of writes to the system, indexing operations
+can be configured to wait for a certain number of active shard copies
+before proceeding with the operation. If the requisite number of active
+shard copies are not available, then the write operation must wait and
+retry, until either the requisite shard copies have started or a timeout
+occurs. By default, write operations only wait for the primary shards
+to be active before proceeding (i.e. `wait_for_active_shards=1`).
+This default can be overridden in the index settings dynamically
+by setting `index.write.wait_for_active_shards`. To alter this behavior
+per operation, the `wait_for_active_shards` request parameter can be used.
 
-Valid write consistency values are `one`, `quorum`, and `all`.
+Valid values are `all` or any positive integer up to the total number
+of configured copies per shard in the index (which is `number_of_replicas+1`).
+Specifying a negative value or a number greater than the number of
+shard copies will throw an error.
 
-Note, for the case where the number of replicas is 1 (total of 2 copies
-of the data), then the default behavior is to succeed if 1 copy (the primary)
-can perform the write.
+For example, suppose we have a cluster of three nodes, `A`, `B`, and `C`, and
+we create an index `index` with the number of replicas set to 3 (resulting in
+4 shard copies, one more copy than there are nodes). If we
+attempt an indexing operation, by default the operation will only ensure
+the primary copy of each shard is available before proceeding. This means
+that even if `B` and `C` went down, and `A` hosted the primary shard copies,
+the indexing operation would still proceed with only one copy of the data.
+If `wait_for_active_shards` is set on the request to `3` (and all 3 nodes
+are up), then the indexing operation will require 3 active shard copies
+before proceeding, a requirement which should be met because there are 3
+active nodes in the cluster, each one holding a copy of the shard. However,
+if we set `wait_for_active_shards` to `all` (or to `4`, which is the same),
+the indexing operation will not proceed as we do not have all 4 copies of
+each shard active in the index. The operation will time out
+unless a new node is brought up in the cluster to host the fourth copy of
+the shard.
 
-The index operation only returns after all *active* shards within the
-replication group have indexed the document (sync replication).
+It is important to note that this setting greatly reduces the chances of
+the write operation not writing to the requisite number of shard copies,
+but it does not completely eliminate the possibility, because this check
+occurs before the write operation commences. Once the write operation
+is underway, it is still possible for replication to fail on any number of
+shard copies but still succeed on the primary. The `_shards` section of the
+write operation's response reveals the number of shard copies on which
+replication succeeded/failed.
+
+[source,js]
+--------------------------------------------------
+{
+    "_shards" : {
+        "total" : 2,
+        "failed" : 0,
+        "successful" : 2
+    }
+}
+--------------------------------------------------
 
 [float]
 [[index-refresh]]
 === Refresh
 
 Control when the changes made by this request are visible to search. See
-<<docs-refresh>>.
+<<docs-refresh,refresh>>.
 
 [float]
 [[index-noop]]
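For illustration, overriding the default on a single index operation might look like this. This is a sketch; the index, type, document, and the value `3` are invented for the example:

[source,js]
--------------------------------------------------
PUT /index/my_type/1?wait_for_active_shards=3
{
    "field" : "value"
}
--------------------------------------------------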
@@ -419,7 +419,7 @@ is sent directly to the remote host without validation or modification.
 === URL Parameters
 
 In addition to the standard parameters like `pretty`, the Reindex API also
-supports `refresh`, `wait_for_completion`, `consistency`, `timeout`, and
+supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, `timeout`, and
 `requests_per_second`.
 
 Sending the `refresh` url parameter will cause all indexes to which the request
@@ -434,8 +434,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
 to keep or remove as you see fit. When you are done with it, delete it so
 Elasticsearch can reclaim the space it uses.
 
-`consistency` controls how many copies of a shard must respond to each write
-request. `timeout` controls how long each write request waits for unavailable
+`wait_for_active_shards` controls how many copies of a shard must be active
+before proceeding with the reindexing. See <<index-wait-for-active-shards,here>>
+for details. `timeout` controls how long each write request waits for unavailable
 shards to become available. Both work exactly how they work in the
 <<docs-bulk,Bulk API>>.
 
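A sketch of the parameter on a reindex call; the source and destination index names and the values are invented for illustration:

[source,js]
--------------------------------------------------
POST /_reindex?wait_for_active_shards=2&timeout=5m
{
    "source": {
        "index": "twitter"
    },
    "dest": {
        "index": "new_twitter"
    }
}
--------------------------------------------------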
@@ -202,7 +202,7 @@ POST twitter/_update_by_query?pipeline=set-foo
 === URL Parameters
 
 In addition to the standard parameters like `pretty`, the Update By Query API
-also supports `refresh`, `wait_for_completion`, `consistency`, and `timeout`.
+also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, and `timeout`.
 
 Sending the `refresh` will update all shards in the index being updated when
 the request completes. This is different than the Index API's `refresh`
@@ -216,8 +216,9 @@ record of this task as a document at `.tasks/task/${taskId}`. This is yours
 to keep or remove as you see fit. When you are done with it, delete it so
 Elasticsearch can reclaim the space it uses.
 
-`consistency` controls how many copies of a shard must respond to each write
-request. `timeout` controls how long each write request waits for unavailable
+`wait_for_active_shards` controls how many copies of a shard must be active
+before proceeding with the request. See <<index-wait-for-active-shards,here>>
+for details. `timeout` controls how long each write request waits for unavailable
 shards to become available. Both work exactly how they work in the
 <<docs-bulk,Bulk API>>.
 
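Illustrative usage on the update-by-query endpoint named in the hunk header (the values `2` and `5m` are invented):

[source,js]
--------------------------------------------------
POST twitter/_update_by_query?wait_for_active_shards=2&timeout=5m
--------------------------------------------------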
@@ -245,9 +245,10 @@ If an alias index routing is specified then it overrides the parent routing and
 
 Timeout waiting for a shard to become available.
 
-`consistency`::
+`wait_for_active_shards`::
 
-The write consistency of the index/delete operation.
+The number of shard copies required to be active before proceeding with the update operation.
+See <<index-wait-for-active-shards,here>> for details.
 
 `refresh`::
 
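A sketch of an update request carrying the parameter (the index, type, id, and document body are invented for illustration):

[source,js]
--------------------------------------------------
POST /test/type1/1/_update?wait_for_active_shards=2
{
    "doc" : {
        "name" : "new_name"
    }
}
--------------------------------------------------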
@@ -260,7 +260,8 @@ And the response:
 --------------------------------------------------
 curl -XPUT 'localhost:9200/customer?pretty'
 {
-  "acknowledged" : true
+  "acknowledged" : true,
+  "shards_acknowledged": true
 }
 
 curl 'localhost:9200/_cat/indices?v'
@@ -122,3 +122,53 @@ curl -XPUT localhost:9200/test -d '{
 --------------------------------------------------
 
 <1> `creation_date` is set using epoch time in milliseconds.
+
+[float]
+[[create-index-wait-for-active-shards]]
+=== Wait For Active Shards
+
+By default, index creation will only return a response to the client when the primary copies of
+each shard have been started, or the request times out. The index creation response will indicate
+what happened:
+
+[source,js]
+--------------------------------------------------
+{
+    "acknowledged": true,
+    "shards_acknowledged": true
+}
+--------------------------------------------------
+
+`acknowledged` indicates whether the index was successfully created in the cluster, while
+`shards_acknowledged` indicates whether the requisite number of shard copies were started for
+each shard in the index before timing out. Note that it is still possible for either
+`acknowledged` or `shards_acknowledged` to be `false` even though the index creation was successful.
+These values simply indicate whether the operation completed before the timeout. If
+`acknowledged` is `false`, then we timed out before the cluster state was updated with the
+newly created index, but it probably will be created sometime soon. If `shards_acknowledged`
+is `false`, then we timed out before the requisite number of shards were started (by default
+just the primaries), even if the cluster state was successfully updated to reflect the newly
+created index (i.e. `acknowledged=true`).
+
+We can change the default of only waiting for the primary shards to start through the index
+setting `index.write.wait_for_active_shards` (note that changing this setting will also affect
+the `wait_for_active_shards` value on all subsequent write operations):
+
+[source,js]
+--------------------------------------------------
+curl -XPUT localhost:9200/test -d '{
+    "settings": {
+        "index.write.wait_for_active_shards": "2"
+    }
+}'
+--------------------------------------------------
+
+or through the request parameter `wait_for_active_shards`:
+
+[source,js]
+--------------------------------------------------
+curl -XPUT localhost:9200/test?wait_for_active_shards=2
+--------------------------------------------------
+
+A detailed explanation of `wait_for_active_shards` and its possible values can be found
+<<index-wait-for-active-shards,here>>.
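As a complementary sketch, requesting more copies than the cluster can start within a short timeout should yield `shards_acknowledged: false`. The index name, count, timeout, and response values below are invented for illustration:

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/test2?wait_for_active_shards=4&timeout=1s'
{
    "acknowledged": true,
    "shards_acknowledged": false
}
--------------------------------------------------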
@@ -126,3 +126,9 @@ POST logs_write/_rollover?dry_run
 --------------------------------------------------
 // CONSOLE
 
+[float]
+=== Wait For Active Shards
+
+Because the rollover operation creates a new index to rollover to, the
+<<create-index-wait-for-active-shards,wait for active shards>> setting on
+index creation applies to the rollover action as well.
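A sketch of passing the setting on a rollover call; the alias name follows the surrounding docs, the value `2` is invented, and it is an assumption that the endpoint accepts the same request parameter as index creation:

[source,js]
--------------------------------------------------
POST /logs_write/_rollover?wait_for_active_shards=2
--------------------------------------------------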
@@ -136,3 +136,9 @@ shrink process begins. When the shrink operation completes, the shard will
 become `active`. At that point, Elasticsearch will try to allocate any
 replicas and may decide to relocate the primary shard to another node.
 
+[float]
+=== Wait For Active Shards
+
+Because the shrink operation creates a new index to shrink the shards to,
+the <<create-index-wait-for-active-shards,wait for active shards>> setting
+on index creation applies to the shrink index action as well.
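Similarly, a hedged sketch for shrink; the index names and value are invented, and it is an assumption that the shrink endpoint accepts the parameter as index creation does:

[source,js]
--------------------------------------------------
POST /my_source_index/_shrink/my_target_index?wait_for_active_shards=2
--------------------------------------------------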
@@ -91,6 +91,26 @@ language `org.codehaus.groovy:groovy` artifact.
 error description). This will influence code that uses the `IndexRequest.opType()` or `IndexRequest.create()`
 to index a document only if it doesn't already exist.
 
+==== writeConsistencyLevel removed on write requests
+
+In previous versions of Elasticsearch, the various write requests had a
+`setWriteConsistencyLevel` method to set the shard consistency level for
+write operations. However, the semantics of write consistency were ambiguous
+as this is just a pre-operation check to ensure the specified number of
+shards were available before the operation commenced. The write consistency
+level did not guarantee that the data would be replicated to that number
+of copies by the time the operation finished. The `setWriteConsistencyLevel`
+method on these write requests has been changed to `setWaitForActiveShards`,
+which can take a numerical value up to the total number of shard copies or
+`ActiveShardCount.ALL` for all shard copies. The default is to just wait
+for the primary shard to be active before proceeding with the operation.
+See the section on <<index-wait-for-active-shards,wait for active shards>>
+for more details.
+
+This change affects `IndexRequest`, `IndexRequestBuilder`, `BulkRequest`,
+`BulkRequestBuilder`, `UpdateRequest`, `UpdateRequestBuilder`, `DeleteRequest`,
+and `DeleteRequestBuilder`.
+
 ==== Changes to Query Builders
 
 ===== BoostingQueryBuilder
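For illustration, the migration might look like this in client code. This is a sketch under the assumption that the request builders expose `setWaitForActiveShards` as the text describes; the index name, field, and count are invented:

[source,java]
--------------------------------------------------
// Previously: builder.setWriteConsistencyLevel(WriteConsistencyLevel.QUORUM);
// Now: state how many shard copies must be active before the operation proceeds.
IndexResponse response = client.prepareIndex("index", "type", "1")
        .setSource("field", "value")
        .setWaitForActiveShards(2)   // require two active copies
        .get();

// Or wait for all shard copies using the ActiveShardCount helper named above:
BulkRequestBuilder bulk = client.prepareBulk()
        .setWaitForActiveShards(ActiveShardCount.ALL);
--------------------------------------------------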
@@ -361,15 +361,6 @@ Certain versions of the JVM are known to have bugs which can cause index corrupt
 
 When a node is experiencing network issues, the master detects it and removes the node from the cluster. That causes all ongoing recoveries from and to that node to be stopped and a new location is found for the relevant shards. However, in the case of a partial network partition, where there are connectivity issues between the source and target nodes of a recovery but not between those nodes and the current master, things may go wrong. While the nodes successfully restore the connection, the ongoing recoveries may have encountered issues. In {GIT}8720[#8720], we added test simulations for these and solved several issues that were flagged by them.
 
-
-[float]
-=== Validate quorum before accepting a write request (STATUS: DONE, v1.4.0)
-
-Today, when a node holding a primary shard receives an index request, it checks the local cluster state to see whether a quorum of shards is available before it accepts the request. However, it can take some time before an unresponsive node is removed from the cluster state. We are adding an optional live check, where the primary node tries to contact its replicas to confirm that they are still responding before accepting any changes. See {GIT}6937[#6937].
-
-While the work is going on, we tightened the current checks by bringing them closer to the index code. See {GIT}7873[#7873] (STATUS: DONE, fixed in v1.4.0)
-
-
 [float]
 === Improving Zen Discovery (STATUS: DONE, v1.4.0.Beta1)
 