mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-25 09:28:27 +00:00
Document Seq No powered optimistic concurrency control (#37284)
Add documentation to describe the new sequence number powered optimistic concurrency control Relates #36148 Relates #10708
This commit is contained in:
parent
1eba1d1df9
commit
cae71cddfe
@ -50,3 +50,5 @@ include::docs/termvectors.asciidoc[]
|
||||
include::docs/multi-termvectors.asciidoc[]
|
||||
|
||||
include::docs/refresh.asciidoc[]
|
||||
|
||||
include::docs/concurrency-control.asciidoc[]
|
||||
|
@ -197,6 +197,17 @@ size for your particular workload.
|
||||
If using the HTTP API, make sure that the client does not send HTTP
|
||||
chunks, as this will slow things down.
|
||||
|
||||
[float]
|
||||
[[bulk-optimistic-concurrency-control]]
|
||||
=== Optimistic Concurrency Control
|
||||
|
||||
Each `index` and `delete` action within a bulk API call may include the
|
||||
`if_seq_no` and `if_primary_term` parameters in their respective action
|
||||
and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
|
||||
how operations are executed, based on the last modification to existing
|
||||
documents. See <<optimistic-concurrency-control>> for more details.
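
As a rough illustration of where these parameters go, the sketch below assembles a bulk request body in Python. The helper names (`bulk_action`, `bulk_body`), the second document id `1568`, and its sequence number `17` are hypothetical and exist only for this example. It assumes the conditions are placed in the action and meta data line, as described above, and it builds the body without sending it.

```python
import json

def bulk_action(op, index, doc_id, seq_no, primary_term, source=None):
    """Return the NDJSON lines for one conditional bulk action (hypothetical helper)."""
    # The conditions travel in the action/meta data line, next to _index and _id.
    meta = {op: {"_index": index, "_id": doc_id,
                 "if_seq_no": seq_no, "if_primary_term": primary_term}}
    lines = [json.dumps(meta)]
    if op == "index":
        # An index action is followed by the document source; a delete is not.
        lines.append(json.dumps(source))
    return lines

def bulk_body(actions):
    """Join all action lines into a newline-terminated bulk body."""
    return "\n".join(line for action in actions for line in action) + "\n"

body = bulk_body([
    bulk_action("index", "products", "1567", 362, 2,
                {"product": "r2d2", "tags": ["droid"]}),
    bulk_action("delete", "products", "1568", 17, 2),  # illustrative values
])
print(body)
```

Each action in the body carries its own condition, so one action's conflict is independent of the others in this sketch.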
|
||||
|
||||
|
||||
[float]
|
||||
[[bulk-versioning]]
|
||||
=== Versioning
|
||||
|
114
docs/reference/docs/concurrency-control.asciidoc
Normal file
@ -0,0 +1,114 @@
|
||||
[[optimistic-concurrency-control]]
|
||||
== Optimistic concurrency control
|
||||
|
||||
Elasticsearch is distributed. When documents are created, updated, or deleted,
|
||||
the new version of the document has to be replicated to other nodes in the cluster.
|
||||
Elasticsearch is also asynchronous and concurrent, meaning that these replication
|
||||
requests are sent in parallel, and may arrive at their destination out of sequence.
|
||||
Elasticsearch needs a way of ensuring that an older version of a document never
|
||||
overwrites a newer version.
|
||||
|
||||
|
||||
To ensure an older version of a document doesn't overwrite a newer version, every
|
||||
operation performed on a document is assigned a sequence number by the primary
|
||||
shard that coordinates that change. The sequence number is increased with each
|
||||
operation and thus newer operations are guaranteed to have a higher sequence
|
||||
number than older operations. Elasticsearch can then use the sequence number of
|
||||
operations to make sure a newer document version is never
|
||||
overridden by a change that has a smaller sequence number assigned to it.
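
The rule described above can be sketched with a toy model in Python. This is not how Elasticsearch is implemented internally; the `ReplicaCopy` class and its `apply` method are made up for illustration, and the model ignores primary terms for brevity. It only shows the idea that a change with a smaller sequence number must not replace a newer one when operations arrive out of order.

```python
class ReplicaCopy:
    """Toy model of one shard copy (illustrative only)."""

    def __init__(self):
        self.docs = {}  # doc id -> (seq_no, source)

    def apply(self, doc_id, seq_no, source):
        """Apply a replicated operation unless a newer one was already applied."""
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= seq_no:
            return False  # stale operation: keep the newer version
        self.docs[doc_id] = (seq_no, source)
        return True

replica = ReplicaCopy()
# The operation with the higher sequence number happens to arrive first.
newer = replica.apply("1567", 363, {"product": "r2d2", "tags": ["droid"]})
older = replica.apply("1567", 362, {"product": "r2d2"})
print(newer, older, replica.docs["1567"][0])
```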
|
||||
|
||||
For example, the following indexing command will create a document and assign it
|
||||
an initial sequence number and primary term:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT products/_doc/1567
|
||||
{
|
||||
"product" : "r2d2",
|
||||
"details" : "A resourceful astromech droid"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
You can see the assigned sequence number and primary term in the
|
||||
`_seq_no` and `_primary_term` fields of the response:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
|
||||
"_shards" : {
|
||||
"total" : 2,
|
||||
"failed" : 0,
|
||||
"successful" : 1
|
||||
},
|
||||
"_index" : "products",
|
||||
"_type" : "_doc",
|
||||
"_id" : "1567",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 362,
|
||||
"_primary_term" : 2,
|
||||
"result" : "created"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
|
||||
|
||||
|
||||
Elasticsearch keeps track of the sequence number and primary term of the last
|
||||
operation to have changed each of the documents it stores. The sequence number
|
||||
and primary term are returned in the `_seq_no` and `_primary_term` fields in
|
||||
the response of the <<docs-get,GET API>>:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
GET products/_doc/1567
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
returns:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
|
||||
"_index" : "products",
|
||||
"_type" : "_doc",
|
||||
"_id" : "1567",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 362,
|
||||
"_primary_term" : 2,
|
||||
"found": true,
|
||||
"_source" : {
|
||||
"product" : "r2d2",
|
||||
"details" : "A resourceful astromech droid"
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
|
||||
|
||||
|
||||
Note: The <<search-search,Search API>> can return the `_seq_no` and `_primary_term`
|
||||
for each search hit by requesting the `_seq_no` and `_primary_term` <<search-request-docvalue-fields,Doc Value Fields>>.
|
||||
|
||||
The sequence number and the primary term uniquely identify a change. By noting down
|
||||
the sequence number and primary term returned, you can make sure to only change the
|
||||
document if no other change was made to it since you retrieved it. This
|
||||
is done by setting the `if_seq_no` and `if_primary_term` parameters of either the
|
||||
<<docs-index_,Index API>> or the <<docs-delete,Delete API>>.
|
||||
|
||||
For example, the following indexing call will make sure to add a tag to the
|
||||
document without losing any potential change to the description or an addition
|
||||
of another tag by another API call:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT products/_doc/1567?if_seq_no=362&if_primary_term=2
|
||||
{
|
||||
"product" : "r2d2",
|
||||
"details" : "A resourceful astromech droid",
|
||||
"tags": ["droid"]
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
// TEST[catch: conflict]
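
A client typically pairs such a conditional write with a read-modify-write loop that retries after a conflict. The following Python sketch models that pattern against an in-memory stand-in. `ToyIndex`, `VersionConflict`, and `add_tag` are hypothetical names, and the stand-in only imitates the compare step described above. It is not a client for a real cluster.

```python
class VersionConflict(Exception):
    """Stands in for the conflict (409) response in this toy model."""

class ToyIndex:
    """In-memory stand-in that tracks _seq_no and _primary_term per document."""

    def __init__(self):
        self.docs = {}
        self.next_seq_no = 0
        self.primary_term = 1

    def get(self, doc_id):
        return dict(self.docs[doc_id])

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        current = self.docs.get(doc_id)
        if if_seq_no is not None:
            # Reject the write if the last change is not the one the caller saw.
            if (current is None
                    or current["_seq_no"] != if_seq_no
                    or current["_primary_term"] != if_primary_term):
                raise VersionConflict(doc_id)
        entry = {"_seq_no": self.next_seq_no,
                 "_primary_term": self.primary_term,
                 "_source": dict(source)}
        self.next_seq_no += 1
        self.docs[doc_id] = entry
        return dict(entry)

def add_tag(index, doc_id, tag, max_attempts=3):
    """Read the document, add a tag, and write it back only if it is unchanged."""
    for _ in range(max_attempts):
        doc = index.get(doc_id)
        source = dict(doc["_source"])
        source["tags"] = sorted(set(source.get("tags", [])) | {tag})
        try:
            return index.index(doc_id, source,
                               if_seq_no=doc["_seq_no"],
                               if_primary_term=doc["_primary_term"])
        except VersionConflict:
            continue  # someone else changed the document: re-read and retry
    raise VersionConflict(doc_id)

idx = ToyIndex()
first = idx.index("1567", {"product": "r2d2"})
idx.index("1567", {"product": "r2d2", "details": "updated"})  # concurrent change
try:
    # This write still carries the values from the first response, so it is stale.
    idx.index("1567", {"product": "r2d2", "tags": ["droid"]},
              if_seq_no=first["_seq_no"], if_primary_term=first["_primary_term"])
    stale_write_rejected = False
except VersionConflict:
    stale_write_rejected = True
result = add_tag(idx, "1567", "droid")
print(stale_write_rejected, result["_source"])
```

Retrying from a fresh read is what keeps the concurrent change to `details` in place: the tag is added on top of the latest version instead of overwriting it.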
|
||||
|
@ -35,6 +35,16 @@ The result of the above delete operation is:
|
||||
// TESTRESPONSE[s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
|
||||
// TESTRESPONSE[s/"_seq_no" : 5/"_seq_no" : $body._seq_no/]
|
||||
|
||||
[float]
|
||||
[[optimistic-concurrency-control-delete]]
|
||||
=== Optimistic concurrency control
|
||||
|
||||
Delete operations can be made conditional and only be performed if the last
|
||||
modification to the document was assigned the sequence number and primary
|
||||
term specified by the `if_seq_no` and `if_primary_term` parameters. If a
|
||||
mismatch is detected, the operation will result in a `VersionConflictException`
|
||||
and a status code of 409. See <<optimistic-concurrency-control>> for more details.
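
As a small sketch of how a client might form such a request and interpret the status code, consider the Python helpers below. `conditional_delete_path` and `describe_delete_status` are hypothetical names, the `_doc` type in the path is assumed from the examples in this documentation, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

def conditional_delete_path(index, doc_id, seq_no, primary_term, doc_type="_doc"):
    """Build the path and query string for a conditional delete (hypothetical helper)."""
    query = urlencode({"if_seq_no": seq_no, "if_primary_term": primary_term})
    return f"/{quote(index, safe='')}/{doc_type}/{quote(doc_id, safe='')}?{query}"

def describe_delete_status(status_code):
    """Map an HTTP status code to a coarse outcome for the caller."""
    if status_code == 409:
        return "conflict"  # the document changed since it was read
    if 200 <= status_code < 300:
        return "deleted"
    return "other"

path = conditional_delete_path("products", "1567", 362, 2)
print(path)
print(describe_delete_status(409))
```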
|
||||
|
||||
[float]
|
||||
[[delete-versioning]]
|
||||
=== Versioning
|
||||
|
@ -79,89 +79,6 @@ Automatic index creation can include a pattern based white/black list,
|
||||
for example, set `action.auto_create_index` to `+aaa*,-bbb*,+ccc*,-*` (+
|
||||
meaning allowed, and - meaning disallowed).
|
||||
|
||||
[float]
|
||||
[[index-versioning]]
|
||||
=== Versioning
|
||||
|
||||
Each indexed document is given a version number. The associated
|
||||
`version` number is returned as part of the response to the index API
|
||||
request. The index API optionally allows for
|
||||
http://en.wikipedia.org/wiki/Optimistic_concurrency_control[optimistic
|
||||
concurrency control] when the `version` parameter is specified. This
|
||||
will control the version of the document the operation is intended to be
|
||||
executed against. A good example of a use case for versioning is
|
||||
performing a transactional read-then-update. Specifying a `version` from
|
||||
the document initially read ensures no changes have happened in the
|
||||
meantime. For example:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT twitter/_doc/1?version=2
|
||||
{
|
||||
"message" : "elasticsearch now has versioning support, double cool!"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
// TEST[catch: conflict]
|
||||
|
||||
*NOTE:* versioning is completely real time, and is not affected by the
|
||||
near real time aspects of search operations. If no version is provided,
|
||||
then the operation is executed without any version checks.
|
||||
|
||||
By default, internal versioning is used that starts at 1 and increments
|
||||
with each update, deletes included. Optionally, the version number can be
|
||||
supplemented with an external value (for example, if maintained in a
|
||||
database). To enable this functionality, `version_type` should be set to
|
||||
`external`. The value provided must be a numeric, long value greater than or equal to 0,
|
||||
and less than around 9.2e+18. When using the external version type, instead
|
||||
of checking for a matching version number, the system checks to see if
|
||||
the version number passed to the index request is greater than the
|
||||
version of the currently stored document. If true, the document will be
|
||||
indexed and the new version number used. If the value provided is less
|
||||
than or equal to the stored document's version number, a version
|
||||
conflict will occur and the index operation will fail.
|
||||
|
||||
WARNING: External versioning supports the value 0 as a valid version number.
|
||||
This allows the version to be in sync with an external versioning system
|
||||
where version numbers start from zero instead of one. It has the side effect
|
||||
that documents with version number equal to zero can neither be updated
|
||||
using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
|
||||
using the <<docs-delete-by-query,Delete By Query API>> as long as their
|
||||
version number is equal to zero.
|
||||
|
||||
A nice side effect is that there is no need to maintain strict ordering
|
||||
of async indexing operations executed as a result of changes to a source
|
||||
database, as long as version numbers from the source database are used.
|
||||
Even the simple case of updating the Elasticsearch index using data from
|
||||
a database is simplified if external versioning is used, as only the
|
||||
latest version will be used if the index operations are out of order for
|
||||
whatever reason.
|
||||
|
||||
[float]
|
||||
==== Version types
|
||||
|
||||
Next to the `internal` & `external` version types explained above, Elasticsearch
|
||||
also supports other types for specific use cases. Here is an overview of
|
||||
the different version types and their semantics.
|
||||
|
||||
`internal`:: only index the document if the given version is identical to the version
|
||||
of the stored document.
|
||||
|
||||
`external` or `external_gt`:: only index the document if the given version is strictly higher
|
||||
than the version of the stored document *or* if there is no existing document. The given
|
||||
version will be used as the new version and will be stored with the new document. The supplied
|
||||
version must be a non-negative long number.
|
||||
|
||||
`external_gte`:: only index the document if the given version is *equal* to or higher
|
||||
than the version of the stored document. If there is no existing document
|
||||
the operation will succeed as well. The given version will be used as the new version
|
||||
and will be stored with the new document. The supplied version must be a non-negative long number.
|
||||
|
||||
*NOTE*: The `external_gte` version type is meant for special use cases and
|
||||
should be used with care. If used incorrectly, it can result in loss of data.
|
||||
There is another option, `force`, which is deprecated because it can cause
|
||||
primary and replica shards to diverge.
|
||||
|
||||
[float]
|
||||
[[operation-type]]
|
||||
@ -238,6 +155,16 @@ The result of the above index operation is:
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/ s/"successful" : 2/"successful" : 1/]
|
||||
|
||||
[float]
|
||||
[[optimistic-concurrency-control-index]]
|
||||
=== Optimistic concurrency control
|
||||
|
||||
Index operations can be made conditional and only be performed if the last
|
||||
modification to the document was assigned the sequence number and primary
|
||||
term specified by the `if_seq_no` and `if_primary_term` parameters. If a
|
||||
mismatch is detected, the operation will result in a `VersionConflictException`
|
||||
and a status code of 409. See <<optimistic-concurrency-control>> for more details.
|
||||
|
||||
[float]
|
||||
[[index-routing]]
|
||||
=== Routing
|
||||
@ -380,3 +307,83 @@ PUT twitter/_doc/1?timeout=5m
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
[float]
|
||||
[[index-versioning]]
|
||||
=== Versioning
|
||||
|
||||
Each indexed document is given a version number. By default,
|
||||
internal versioning is used that starts at 1 and increments
|
||||
with each update, deletes included. Optionally, the version number can be
|
||||
set to an external value (for example, if maintained in a
|
||||
database). To enable this functionality, `version_type` should be set to
|
||||
`external`. The value provided must be a numeric, long value greater than or equal to 0,
|
||||
and less than around 9.2e+18.
|
||||
|
||||
When using the external version type, the system checks to see if
|
||||
the version number passed to the index request is greater than the
|
||||
version of the currently stored document. If true, the document will be
|
||||
indexed and the new version number used. If the value provided is less
|
||||
than or equal to the stored document's version number, a version
|
||||
conflict will occur and the index operation will fail. For example:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT twitter/_doc/1?version=2&version_type=external
|
||||
{
|
||||
"message" : "elasticsearch now has versioning support, double cool!"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
*NOTE:* versioning is completely real time, and is not affected by the
|
||||
near real time aspects of search operations. If no version is provided,
|
||||
then the operation is executed without any version checks.
|
||||
|
||||
The above will succeed since the supplied version of 2 is higher than
|
||||
the current document version of 1. If the document was already updated
|
||||
and its version was set to 2 or higher, the indexing command will fail
|
||||
and result in a conflict (HTTP 409 status code).
|
||||
|
||||
WARNING: External versioning supports the value 0 as a valid version number.
|
||||
This allows the version to be in sync with an external versioning system
|
||||
where version numbers start from zero instead of one. It has the side effect
|
||||
that documents with version number equal to zero can neither be updated
|
||||
using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
|
||||
using the <<docs-delete-by-query,Delete By Query API>> as long as their
|
||||
version number is equal to zero.
|
||||
|
||||
A nice side effect is that there is no need to maintain strict ordering
|
||||
of async indexing operations executed as a result of changes to a source
|
||||
database, as long as version numbers from the source database are used.
|
||||
Even the simple case of updating the Elasticsearch index using data from
|
||||
a database is simplified if external versioning is used, as only the
|
||||
latest version will be used if the index operations are out of order for
|
||||
whatever reason.
|
||||
|
||||
[float]
|
||||
==== Version types
|
||||
|
||||
In addition to the `external` version type explained above, Elasticsearch
|
||||
also supports other types for specific use cases. Here is an overview of
|
||||
the different version types and their semantics.
|
||||
|
||||
`internal`:: only index the document if the given version is identical to the version
|
||||
of the stored document.
|
||||
|
||||
`external` or `external_gt`:: only index the document if the given version is strictly higher
|
||||
than the version of the stored document *or* if there is no existing document. The given
|
||||
version will be used as the new version and will be stored with the new document. The supplied
|
||||
version must be a non-negative long number.
|
||||
|
||||
`external_gte`:: only index the document if the given version is *equal* to or higher
|
||||
than the version of the stored document. If there is no existing document
|
||||
the operation will succeed as well. The given version will be used as the new version
|
||||
and will be stored with the new document. The supplied version must be a non-negative long number.
|
||||
|
||||
*NOTE*: The `external_gte` version type is meant for special use cases and
|
||||
should be used with care. If used incorrectly, it can result in loss of data.
|
||||
There is another option, `force`, which is deprecated because it can cause
|
||||
primary and replica shards to diverge.
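
The comparison rules for the version types listed above can be summarized in a few lines of Python. This is a sketch of the documented semantics only: the function name `version_check_passes` is hypothetical, and treating a missing document as a failed `internal` check is an assumption of this sketch rather than something stated above.

```python
def version_check_passes(version_type, given, stored):
    """Return True if a write with `given` version would be accepted.

    `stored` is the version of the existing document, or None if there is none.
    """
    if version_type == "internal":
        # Assumption of this sketch: no stored document means no match.
        return stored is not None and given == stored
    if given < 0:
        raise ValueError("external versions must be non-negative")
    if version_type in ("external", "external_gt"):
        return stored is None or given > stored
    if version_type == "external_gte":
        return stored is None or given >= stored
    raise ValueError(f"unknown version type: {version_type}")

checks = [
    version_check_passes("external", 2, 1),      # strictly higher: accepted
    version_check_passes("external", 2, 2),      # equal: conflict
    version_check_passes("external_gte", 2, 2),  # equal is allowed here
    version_check_passes("internal", 1, 1),      # identical: accepted
    version_check_passes("external", 0, None),   # no existing document
]
print(checks)
```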
|
||||
|
||||
|