mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-25 17:38:44 +00:00
Document Seq No powered optimistic concurrency control (#37284)
Add documentation to describe the new sequence number powered optimistic concurrency control Relates #36148 Relates #10708
This commit is contained in:
parent
1eba1d1df9
commit
cae71cddfe
@@ -50,3 +50,5 @@ include::docs/termvectors.asciidoc[]
 include::docs/multi-termvectors.asciidoc[]
 
 include::docs/refresh.asciidoc[]
+
+include::docs/concurrency-control.asciidoc[]
@@ -197,6 +197,17 @@ size for your particular workload.
 If using the HTTP API, make sure that the client does not send HTTP
 chunks, as this will slow things down.
 
+[float]
+[[bulk-optimistic-concurrency-control]]
+=== Optimistic Concurrency Control
+
+Each `index` and `delete` action within a bulk API call may include the
+`if_seq_no` and `if_primary_term` parameters in their respective action
+and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
+how operations are executed, based on the last modification to existing
+documents. See <<optimistic-concurrency-control>> for more details.
+
+
 [float]
 [[bulk-versioning]]
 === Versioning
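As a reading aid for the bulk hunk above, here is a minimal Python sketch of how the `if_seq_no` and `if_primary_term` parameters might sit in the action and meta data lines of a bulk body. The helper name `bulk_body`, the second document id `1568`, and its sequence number `363` are hypothetical and chosen only for illustration. Depending on the Elasticsearch version, a `_type` entry may also be required in each meta data line.

```python
import json


def bulk_body(actions):
    """Serialize (action, metadata, source) triples into newline-delimited JSON."""
    lines = []
    for action, meta, source in actions:
        # Action and meta data line, e.g. {"index": {"_id": ..., "if_seq_no": ...}}
        lines.append(json.dumps({action: meta}))
        if source is not None:  # delete actions carry no source line
            lines.append(json.dumps(source))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("index",
     {"_index": "products", "_id": "1567", "if_seq_no": 362, "if_primary_term": 2},
     {"product": "r2d2", "tags": ["droid"]}),
    ("delete",
     {"_index": "products", "_id": "1568", "if_seq_no": 363, "if_primary_term": 2},
     None),
])
print(body)
```

Each action carries its own pair of values, so one stale entry can conflict without affecting how the other entries are checked.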
114
docs/reference/docs/concurrency-control.asciidoc
Normal file
@@ -0,0 +1,114 @@
+[[optimistic-concurrency-control]]
+== Optimistic concurrency control
+
+Elasticsearch is distributed. When documents are created, updated, or deleted,
+the new version of the document has to be replicated to other nodes in the cluster.
+Elasticsearch is also asynchronous and concurrent, meaning that these replication
+requests are sent in parallel, and may arrive at their destination out of sequence.
+Elasticsearch needs a way of ensuring that an older version of a document never
+overwrites a newer version.
+
+
+To ensure an older version of a document doesn't overwrite a newer version, every
+operation performed on a document is assigned a sequence number by the primary
+shard that coordinates that change. The sequence number is increased with each
+operation and thus newer operations are guaranteed to have a higher sequence
+number than older operations. Elasticsearch can then use the sequence number of
+operations to make sure a newer document version is never
+overridden by a change that has a smaller sequence number assigned to it.
+
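The ordering rule described above can be pictured with a small Python sketch. It is a toy model under stated assumptions, not Elasticsearch's implementation: the function `apply_if_newer` and the dictionary layout are hypothetical, and primary terms are ignored to keep the example short.

```python
def apply_if_newer(stored, incoming):
    """Keep whichever copy of the document has the higher sequence number."""
    if stored is None or incoming["seq_no"] > stored["seq_no"]:
        return incoming
    return stored


# Two replication requests for the same document arrive out of order:
# the newer change (seq_no 8) is delivered before the older one (seq_no 7).
ops = [
    {"seq_no": 8, "source": {"product": "r2d2", "tags": ["droid"]}},
    {"seq_no": 7, "source": {"product": "r2d2"}},
]

replica_copy = None
for op in ops:
    replica_copy = apply_if_newer(replica_copy, op)

print(replica_copy["seq_no"])
```

The late-arriving older operation is discarded, so the copy with the tag survives.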
+For example, the following indexing command will create a document and assign it
+an initial sequence number and primary term:
+
+[source,js]
+--------------------------------------------------
+PUT products/_doc/1567
+{
+    "product" : "r2d2",
+    "details" : "A resourceful astromech droid"
+}
+--------------------------------------------------
+// CONSOLE
+
+You can see the assigned sequence number and primary term in the
+`_seq_no` and `_primary_term` fields of the response:
+
+[source,js]
+--------------------------------------------------
+{
+    "_shards" : {
+        "total" : 2,
+        "failed" : 0,
+        "successful" : 1
+    },
+    "_index" : "products",
+    "_type" : "_doc",
+    "_id" : "1567",
+    "_version" : 1,
+    "_seq_no" : 362,
+    "_primary_term" : 2,
+    "result" : "created"
+}
+--------------------------------------------------
+// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
+
+
+Elasticsearch keeps track of the sequence number and primary term of the last
+operation to have changed each of the documents it stores. The sequence number
+and primary term are returned in the `_seq_no` and `_primary_term` fields in
+the response of the <<docs-get,GET API>>:
+
+[source,js]
+--------------------------------------------------
+GET products/_doc/1567
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+returns:
+
+[source,js]
+--------------------------------------------------
+{
+    "_index" : "products",
+    "_type" : "_doc",
+    "_id" : "1567",
+    "_version" : 1,
+    "_seq_no" : 362,
+    "_primary_term" : 2,
+    "found": true,
+    "_source" : {
+        "product" : "r2d2",
+        "details" : "A resourceful astromech droid"
+    }
+}
+--------------------------------------------------
+// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
+
+
+Note: The <<search-search,Search API>> can return the `_seq_no` and `_primary_term`
+for each search hit by requesting the `_seq_no` and `_primary_term` <<search-request-docvalue-fields,Doc Value Fields>>.
+
+The sequence number and the primary term uniquely identify a change. By noting down
+the sequence number and primary term returned, you can make sure to only change the
+document if no other change was made to it since you retrieved it. This
+is done by setting the `if_seq_no` and `if_primary_term` parameters of either the
+<<docs-index_,Index API>> or the <<docs-delete,Delete API>>.
+
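To make the compare-and-set behaviour concrete, the following Python sketch models a single shard in memory. Everything in it is an assumption made for illustration: `ToyShard` and `VersionConflict` are hypothetical names, the exception merely stands in for the 409 conflict response, and real shards do far more than this.

```python
class VersionConflict(Exception):
    """Stands in for the conflict (HTTP 409) response."""


class ToyShard:
    """In-memory stand-in that remembers the last seq_no/primary term per document."""

    def __init__(self):
        self.docs = {}
        self.next_seq_no = 0
        self.primary_term = 1

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        current = self.docs.get(doc_id)
        conditional = if_seq_no is not None or if_primary_term is not None
        if conditional and (
            current is None
            or current["_seq_no"] != if_seq_no
            or current["_primary_term"] != if_primary_term
        ):
            # The document changed (or vanished) since the caller read it.
            raise VersionConflict(doc_id)
        entry = {
            "_seq_no": self.next_seq_no,
            "_primary_term": self.primary_term,
            "_source": source,
        }
        self.next_seq_no += 1
        self.docs[doc_id] = entry
        return entry


shard = ToyShard()
created = shard.index("1567", {"product": "r2d2"})

# The first writer still holds the current numbers, so its change is applied.
shard.index("1567", {"product": "r2d2", "tags": ["droid"]},
            if_seq_no=created["_seq_no"],
            if_primary_term=created["_primary_term"])

# A second writer reuses the now stale numbers and is rejected.
try:
    shard.index("1567", {"product": "r2d2", "details": "stale write"},
                if_seq_no=created["_seq_no"],
                if_primary_term=created["_primary_term"])
    conflict = False
except VersionConflict:
    conflict = True

print(conflict)
```

A caller that hits the conflict would typically re-read the document, reapply its change, and retry with the fresh numbers.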
+For example, the following indexing call will make sure to add a tag to the
+document without losing any potential change to the description or an addition
+of another tag by another API call:
+
+[source,js]
+--------------------------------------------------
+PUT products/_doc/1567?if_seq_no=362&if_primary_term=2
+{
+    "product" : "r2d2",
+    "details" : "A resourceful astromech droid",
+    "tags": ["droid"]
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+// TEST[catch: conflict]
+
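A client normally takes the two values from an earlier GET response instead of hard-coding them. The Python sketch below shows one way to derive the request path used in the example above. The helper name `conditional_index_path` is hypothetical, and the dictionary simply mirrors the GET response shown earlier in this file.

```python
from urllib.parse import urlencode


def conditional_index_path(get_response):
    """Build the path and query string for a conditional index request."""
    params = urlencode({
        "if_seq_no": get_response["_seq_no"],
        "if_primary_term": get_response["_primary_term"],
    })
    return "{}/{}/{}?{}".format(
        get_response["_index"],
        get_response["_type"],
        get_response["_id"],
        params,
    )


get_response = {
    "_index": "products",
    "_type": "_doc",
    "_id": "1567",
    "_version": 1,
    "_seq_no": 362,
    "_primary_term": 2,
    "found": True,
}

path = conditional_index_path(get_response)
print("PUT " + path)
```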
@@ -35,6 +35,16 @@ The result of the above delete operation is:
 // TESTRESPONSE[s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
 // TESTRESPONSE[s/"_seq_no" : 5/"_seq_no" : $body._seq_no/]
 
+[float]
+[[optimistic-concurrency-control-delete]]
+=== Optimistic concurrency control
+
+Delete operations can be made conditional and only be performed if the last
+modification to the document was assigned the sequence number and primary
+term specified by the `if_seq_no` and `if_primary_term` parameters. If a
+mismatch is detected, the operation will result in a `VersionConflictException`
+and a status code of 409. See <<optimistic-concurrency-control>> for more details.
+
 [float]
 [[delete-versioning]]
 === Versioning
@@ -79,89 +79,6 @@ Automatic index creation can include a pattern based white/black list,
 for example, set `action.auto_create_index` to `+aaa*,-bbb*,+ccc*,-*` (+
 meaning allowed, and - meaning disallowed).
 
-[float]
-[[index-versioning]]
-=== Versioning
-
-Each indexed document is given a version number. The associated
-`version` number is returned as part of the response to the index API
-request. The index API optionally allows for
-http://en.wikipedia.org/wiki/Optimistic_concurrency_control[optimistic
-concurrency control] when the `version` parameter is specified. This
-will control the version of the document the operation is intended to be
-executed against. A good example of a use case for versioning is
-performing a transactional read-then-update. Specifying a `version` from
-the document initially read ensures no changes have happened in the
-meantime. For example:
-
-[source,js]
---------------------------------------------------
-PUT twitter/_doc/1?version=2
-{
-    "message" : "elasticsearch now has versioning support, double cool!"
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-// TEST[catch: conflict]
-
-*NOTE:* versioning is completely real time, and is not affected by the
-near real time aspects of search operations. If no version is provided,
-then the operation is executed without any version checks.
-
-By default, internal versioning is used that starts at 1 and increments
-with each update, deletes included. Optionally, the version number can be
-supplemented with an external value (for example, if maintained in a
-database). To enable this functionality, `version_type` should be set to
-`external`. The value provided must be a numeric, long value greater than or equal to 0,
-and less than around 9.2e+18. When using the external version type, instead
-of checking for a matching version number, the system checks to see if
-the version number passed to the index request is greater than the
-version of the currently stored document. If true, the document will be
-indexed and the new version number used. If the value provided is less
-than or equal to the stored document's version number, a version
-conflict will occur and the index operation will fail.
-
-WARNING: External versioning supports the value 0 as a valid version number.
-This allows the version to be in sync with an external versioning system
-where version numbers start from zero instead of one. It has the side effect
-that documents with version number equal to zero can neither be updated
-using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
-using the <<docs-delete-by-query,Delete By Query API>> as long as their
-version number is equal to zero.
-
-A nice side effect is that there is no need to maintain strict ordering
-of async indexing operations executed as a result of changes to a source
-database, as long as version numbers from the source database are used.
-Even the simple case of updating the Elasticsearch index using data from
-a database is simplified if external versioning is used, as only the
-latest version will be used if the index operations are out of order for
-whatever reason.
-
-[float]
-==== Version types
-
-Next to the `internal` & `external` version types explained above, Elasticsearch
-also supports other types for specific use cases. Here is an overview of
-the different version types and their semantics.
-
-`internal`:: only index the document if the given version is identical to the version
-of the stored document.
-
-`external` or `external_gt`:: only index the document if the given version is strictly higher
-than the version of the stored document *or* if there is no existing document. The given
-version will be used as the new version and will be stored with the new document. The supplied
-version must be a non-negative long number.
-
-`external_gte`:: only index the document if the given version is *equal* or higher
-than the version of the stored document. If there is no existing document
-the operation will succeed as well. The given version will be used as the new version
-and will be stored with the new document. The supplied version must be a non-negative long number.
-
-*NOTE*: The `external_gte` version type is meant for special use cases and
-should be used with care. If used incorrectly, it can result in loss of data.
-There is another option, `force`, which is deprecated because it can cause
-primary and replica shards to diverge.
 
 [float]
 [[operation-type]]
@@ -238,6 +155,16 @@ The result of the above index operation is:
 --------------------------------------------------
 // TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/ s/"successful" : 2/"successful" : 1/]
 
+[float]
+[[optimistic-concurrency-control-index]]
+=== Optimistic concurrency control
+
+Index operations can be made conditional and only be performed if the last
+modification to the document was assigned the sequence number and primary
+term specified by the `if_seq_no` and `if_primary_term` parameters. If a
+mismatch is detected, the operation will result in a `VersionConflictException`
+and a status code of 409. See <<optimistic-concurrency-control>> for more details.
+
 [float]
 [[index-routing]]
 === Routing
@@ -380,3 +307,83 @@ PUT twitter/_doc/1?timeout=5m
 }
 --------------------------------------------------
 // CONSOLE
+
+[float]
+[[index-versioning]]
+=== Versioning
+
+Each indexed document is given a version number. By default,
+internal versioning is used that starts at 1 and increments
+with each update, deletes included. Optionally, the version number can be
+set to an external value (for example, if maintained in a
+database). To enable this functionality, `version_type` should be set to
+`external`. The value provided must be a numeric, long value greater than or equal to 0,
+and less than around 9.2e+18.
+
+When using the external version type, the system checks to see if
+the version number passed to the index request is greater than the
+version of the currently stored document. If true, the document will be
+indexed and the new version number used. If the value provided is less
+than or equal to the stored document's version number, a version
+conflict will occur and the index operation will fail. For example:
+
+[source,js]
+--------------------------------------------------
+PUT twitter/_doc/1?version=2&version_type=external
+{
+    "message" : "elasticsearch now has versioning support, double cool!"
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+*NOTE:* versioning is completely real time, and is not affected by the
+near real time aspects of search operations. If no version is provided,
+then the operation is executed without any version checks.
+
+The above will succeed since the supplied version of 2 is higher than
+the current document version of 1. If the document was already updated
+and its version was set to 2 or higher, the indexing command will fail
+and result in a conflict (HTTP 409 status code).
+
+WARNING: External versioning supports the value 0 as a valid version number.
+This allows the version to be in sync with an external versioning system
+where version numbers start from zero instead of one. It has the side effect
+that documents with version number equal to zero can neither be updated
+using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
+using the <<docs-delete-by-query,Delete By Query API>> as long as their
+version number is equal to zero.
+
+A nice side effect is that there is no need to maintain strict ordering
+of async indexing operations executed as a result of changes to a source
+database, as long as version numbers from the source database are used.
+Even the simple case of updating the Elasticsearch index using data from
+a database is simplified if external versioning is used, as only the
+latest version will be used if the index operations are out of order for
+whatever reason.
+
+[float]
+==== Version types
+
+Next to the `external` version type explained above, Elasticsearch
+also supports other types for specific use cases. Here is an overview of
+the different version types and their semantics.
+
+`internal`:: only index the document if the given version is identical to the version
+of the stored document.
+
+`external` or `external_gt`:: only index the document if the given version is strictly higher
+than the version of the stored document *or* if there is no existing document. The given
+version will be used as the new version and will be stored with the new document. The supplied
+version must be a non-negative long number.
+
+`external_gte`:: only index the document if the given version is *equal* or higher
+than the version of the stored document. If there is no existing document
+the operation will succeed as well. The given version will be used as the new version
+and will be stored with the new document. The supplied version must be a non-negative long number.
+
+*NOTE*: The `external_gte` version type is meant for special use cases and
+should be used with care. If used incorrectly, it can result in loss of data.
+There is another option, `force`, which is deprecated because it can cause
+primary and replica shards to diverge.
+
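The version type semantics listed above can be summarised as a small predicate. This is a Python sketch under stated assumptions, not the server's code: the function name `version_check_passes` is hypothetical, `stored=None` is used here to mean "no existing document", and treating `internal` as failing when no document exists is an assumption read from the wording above.

```python
def version_check_passes(version_type, given, stored):
    """Return True if a write with `given` version may replace `stored`.

    `stored` is None when there is no existing document (an assumption of
    this sketch, not an Elasticsearch convention).
    """
    if version_type == "internal":
        # Assumed: with no stored document there is nothing to match against.
        return stored is not None and given == stored
    if version_type in ("external", "external_gt"):
        return stored is None or given > stored
    if version_type == "external_gte":
        return stored is None or given >= stored
    raise ValueError("unknown version type: " + version_type)


# Out-of-order updates from a source database: versions 3, 1, 2 arrive in
# that order, and only writes that pass the external check are applied.
stored = None
for given in [3, 1, 2]:
    if version_check_passes("external", given, stored):
        stored = given

print(stored)
```

The loop also illustrates the side effect mentioned earlier: with external versions taken from the source database, delivery order does not matter because only the highest version is kept.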