mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-17 10:25:15 +00:00
Minor improvements to translog docs (#28237)
The use of the phrase "translog" vs "transaction log" was inconsistent, and it was apparently unclear that the translog was stored on every shard copy.
This commit is contained in:
parent b7e1d6fe3e
commit 0a4a4c8a0e
@@ -1,41 +1,44 @@
 [[index-modules-translog]]
 == Translog
 
-Changes to Lucene are only persisted to disk during a Lucene commit,
-which is a relatively heavy operation and so cannot be performed after every
-index or delete operation. Changes that happen after one commit and before another
-will be lost in the event of process exit or HW failure.
+Changes to Lucene are only persisted to disk during a Lucene commit, which is a
+relatively expensive operation and so cannot be performed after every index or
+delete operation. Changes that happen after one commit and before another will
+be removed from the index by Lucene in the event of process exit or hardware
+failure.
 
-To prevent this data loss, each shard has a _transaction log_ or write ahead
-log associated with it. Any index or delete operation is written to the
-translog after being processed by the internal Lucene index.
-In the event of a crash, recent transactions can be replayed from the
-transaction log when the shard recovers.
+Because Lucene commits are too expensive to perform on every individual change,
+each shard copy also has a _transaction log_ known as its _translog_ associated
+with it. All index and delete operations are written to the translog after
+being processed by the internal Lucene index but before they are acknowledged.
+In the event of a crash, recent transactions that have been acknowledged but
+not yet included in the last Lucene commit can instead be recovered from the
+translog when the shard recovers.
 
 An Elasticsearch flush is the process of performing a Lucene commit and
-starting a new translog. It is done automatically in the background in order
-to make sure the transaction log doesn't grow too large, which would make
+starting a new translog. Flushes are performed automatically in the background
+in order to make sure the translog doesn't grow too large, which would make
 replaying its operations take a considerable amount of time during recovery.
-It is also exposed through an API, though its rarely needed to be performed
-manually.
+The ability to perform a flush manually is also exposed through an API,
+although this is rarely needed.
 
 [float]
 === Translog settings
 
-The data in the transaction log is only persisted to disk when the translog is
+The data in the translog is only persisted to disk when the translog is
 ++fsync++ed and committed. In the event of hardware failure, any data written
 since the previous translog commit will be lost.
 
-By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds if `index.translog.durability` is set
-to `async` or if set to `request` (default) at the end of every <<docs-index_,index>>, <<docs-delete,delete>>,
-<<docs-update,update>>, or <<docs-bulk,bulk>> request. In fact, Elasticsearch
-will only report success of an index, delete, update, or bulk request to the
-client after the transaction log has been successfully ++fsync++ed and committed
-on the primary and on every allocated replica.
+By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds
+if `index.translog.durability` is set to `async` or if set to `request`
+(default) at the end of every <<docs-index_,index>>, <<docs-delete,delete>>,
+<<docs-update,update>>, or <<docs-bulk,bulk>> request. More precisely, if set
+to `request`, Elasticsearch will only report success of an index, delete,
+update, or bulk request to the client after the translog has been successfully
+++fsync++ed and committed on the primary and on every allocated replica.
 
-The following <<indices-update-settings,dynamically updatable>> per-index settings
-control the behaviour of the transaction log:
+The following <<indices-update-settings,dynamically updatable>> per-index
+settings control the behaviour of the translog:
 
 `index.translog.sync_interval`::
 
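The acknowledgement semantics of `index.translog.durability` described above can be summarized with a small sketch. This is illustrative pseudologic, not Elasticsearch source code; the function name is invented, but the behaviour follows the documented semantics of the two modes.

```python
def acked_only_after_fsync(durability: str = "request") -> bool:
    """Return True if a write is acknowledged only after the translog
    has been fsynced and committed on the primary and on every allocated
    replica (the "request" mode, which is the default).

    In "async" mode the translog is instead fsynced in the background
    every index.translog.sync_interval (5 seconds by default), so writes
    acknowledged shortly before a crash may be lost.
    """
    if durability not in ("request", "async"):
        raise ValueError("unknown index.translog.durability: " + durability)
    return durability == "request"


print(acked_only_after_fsync("request"))  # True
print(acked_only_after_fsync("async"))    # False
```

The trade-off the sketch encodes: `request` buys durability of every acknowledged write at the cost of an fsync per request, while `async` batches fsyncs at the risk of losing the last few seconds of acknowledged operations.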
@@ -64,17 +67,20 @@ update, or bulk request. This setting accepts the following parameters:
 
 `index.translog.flush_threshold_size`::
 
-The translog stores all operations that are not yet safely persisted in Lucene (i.e., are
-not part of a lucene commit point). Although these operations are available for reads, they will
-need to be reindexed if the shard was to shutdown and has to be recovered. This settings controls
-the maximum total size of these operations, to prevent recoveries from taking too long. Once the
-maximum size has been reached a flush will happen, generating a new Lucene commit. Defaults to `512mb`.
+The translog stores all operations that are not yet safely persisted in Lucene
+(i.e., are not part of a Lucene commit point). Although these operations are
+available for reads, they would need to be reindexed if the shard were to shut
+down and had to be recovered. This setting controls the maximum total size of
+these operations, to prevent recoveries from taking too long. Once the maximum
+size has been reached a flush will happen, generating a new Lucene commit
+point. Defaults to `512mb`.
 
 `index.translog.retention.size`::
 
-The total size of translog files to keep. Keeping more translog files increases the chance of performing
-an operation based sync when recovering replicas. If the translog files are not sufficient, replica recovery
-will fall back to a file based sync. Defaults to `512mb`
+The total size of translog files to keep. Keeping more translog files increases
+the chance of performing an operation-based sync when recovering replicas. If
+the translog files are not sufficient, replica recovery will fall back to a
+file-based sync. Defaults to `512mb`.
 
 
 `index.translog.retention.age`::
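The `index.translog.flush_threshold_size` behaviour above amounts to a simple size check. The following is a minimal sketch under invented names (it is not Elasticsearch code) of when the automatic flush is triggered:

```python
# 512mb, the documented default for index.translog.flush_threshold_size.
DEFAULT_FLUSH_THRESHOLD = 512 * 1024 * 1024


def should_flush(translog_size_bytes: int,
                 threshold_bytes: int = DEFAULT_FLUSH_THRESHOLD) -> bool:
    """Once the translog reaches the flush threshold, a flush happens,
    generating a new Lucene commit point and starting a new translog."""
    return translog_size_bytes >= threshold_bytes


print(should_flush(600 * 1024 * 1024))  # True: over the 512mb default
print(should_flush(100 * 1024 * 1024))  # False: well under the threshold
```

Keeping the threshold bounded keeps recovery times bounded too, since everything in the translog but not in the last commit point must be replayed when the shard recovers.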
@@ -86,10 +92,14 @@ The maximum duration for which translog files will be kept. Defaults to `12h`.
 [[corrupt-translog-truncation]]
 === What to do if the translog becomes corrupted?
 
-In some cases (a bad drive, user error) the translog can become corrupted. When
-this corruption is detected by Elasticsearch due to mismatching checksums,
-Elasticsearch will fail the shard and refuse to allocate that copy of the data
-to the node, recovering from a replica if available.
+In some cases (a bad drive, user error) the translog on a shard copy can become
+corrupted. When this corruption is detected by Elasticsearch due to mismatching
+checksums, Elasticsearch will fail that shard copy and refuse to use that copy
+of the data. If there are other copies of the shard available then
+Elasticsearch will automatically recover from one of them using the normal
+shard allocation and recovery mechanism. In particular, if the corrupt shard
+copy was the primary when the corruption was detected then one of its replicas
+will be promoted in its place.
 
 If there is no copy of the data from which Elasticsearch can recover
 successfully, a user may want to recover the data that is part of the shard at