[[index-modules-translog]]
== Translog

Changes to Lucene are only persisted to disk during a Lucene commit, which is a
relatively expensive operation and so cannot be performed after every index or
delete operation. Changes that happen after one commit and before another will
be removed from the index by Lucene in the event of process exit or hardware
failure.

Lucene commits are too expensive to perform on every individual change, so each
shard copy also writes operations into its _transaction log_, known as the
_translog_. All index and delete operations are written to the translog after
being processed by the internal Lucene index but before they are acknowledged.
In the event of a crash, recent operations that have been acknowledged but not
yet included in the last Lucene commit are instead recovered from the translog
when the shard recovers.

An {es} <<indices-flush,flush>> is the process of performing a Lucene commit and
starting a new translog generation. Flushes are performed automatically in the
background in order to make sure the translog does not grow too large, which
would make replaying its operations take a considerable amount of time during
recovery. The ability to perform a flush manually is also exposed through an
API, although this is rarely needed.
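
As a minimal illustration of the manual flush mentioned above, the request
below flushes a single index. The index name `my-index-000001` is a placeholder
chosen for this sketch and is not defined anywhere in this section.

[source,console]
----
POST /my-index-000001/_flush
----

Because flushes already run automatically in the background, a request like
this is normally unnecessary.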

[float]
=== Translog settings

The data in the translog is only persisted to disk when the translog is
++fsync++ed and committed. In the event of a hardware failure, an operating
system crash, a JVM crash, or a shard failure, any data written since the
previous translog commit will be lost.

By default, `index.translog.durability` is set to `request`, meaning that
Elasticsearch will only report success of an index, delete, update, or bulk
request to the client after the translog has been successfully ++fsync++ed and
committed on the primary and on every allocated replica. If
`index.translog.durability` is set to `async`, then Elasticsearch ++fsync++s and
commits the translog only every `index.translog.sync_interval`, which means that
any operations that were performed just before a crash may be lost when the node
recovers.
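
As a sketch of the trade-off described above, the following request creates an
index that uses `async` durability with a longer sync interval. The index name
`my-index-000001` and the `10s` value are illustrative assumptions. With
settings like these, acknowledged writes from roughly the last ten seconds
before a crash could be lost.

[source,console]
----
PUT /my-index-000001
{
  "settings": {
    "index.translog.durability": "async",
    "index.translog.sync_interval": "10s"
  }
}
----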

The following <<indices-update-settings,dynamically updatable>> per-index
settings control the behaviour of the translog:

`index.translog.sync_interval`::

How often the translog is ++fsync++ed to disk and committed, regardless of
write operations. Defaults to `5s`. Values less than `100ms` are not allowed.

`index.translog.durability`::
+
--

Whether or not to `fsync` and commit the translog after every index, delete,
update, or bulk request. This setting accepts the following parameters:

`request`::

(default) `fsync` and commit after every request. In the event of hardware
failure, all acknowledged writes will already have been committed to disk.

`async`::

`fsync` and commit in the background every `sync_interval`. In the event of a
failure, all acknowledged writes since the last automatic commit will be
discarded.
--

`index.translog.flush_threshold_size`::

The translog stores all operations that are not yet safely persisted in Lucene
(i.e., are not part of a Lucene commit point). Although these operations are
available for reads, they will need to be replayed if the shard is stopped
and has to be recovered. This setting controls the maximum total size of these
operations, to prevent recoveries from taking too long. Once the maximum size
has been reached, a flush will happen, generating a new Lucene commit point.
Defaults to `512mb`.
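
Because the flush threshold is listed among the dynamically updatable
settings, it can in principle be changed on a live index through the update
index settings API. The sketch below lowers the threshold so that flushes
happen more often and less translog has to be replayed during recovery. The
index name `my-index-000001` and the `256mb` value are illustrative
assumptions, not recommendations.

[source,console]
----
PUT /my-index-000001/_settings
{
  "index.translog.flush_threshold_size": "256mb"
}
----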

[float]
[[index-modules-translog-retention]]
==== Translog retention

If an index is not using <<index-modules-history-retention,soft deletes>> to
retain historical operations then {es} recovers each replica shard by replaying
operations from the primary's translog. This means it is important for the
primary to preserve extra operations in its translog in case it needs to
rebuild a replica. Moreover, it is important for each replica to preserve extra
operations in its translog in case it is promoted to primary and then needs to
rebuild its own replicas in turn. The following settings control how much
translog is retained for peer recoveries.
|
2019-08-22 16:40:06 -04:00
|
|
|
|
2019-09-04 11:37:00 -04:00
|
|
|
`index.translog.retention.size`::
|
2017-06-22 11:08:14 -04:00
|
|
|
|
2019-09-04 11:37:00 -04:00
|
|
|
This controls the total size of translog files to keep for each shard.
|
|
|
|
Keeping more translog files increases the chance of performing an operation
|
|
|
|
based sync when recovering a replica. If the translog files are not
|
|
|
|
sufficient, replica recovery will fall back to a file based sync. Defaults to
|
|
|
|
`512mb`. This setting is ignored, and should not be set, if soft deletes are
|
|
|
|
enabled. Soft deletes are enabled by default in indices created in {es}
|
|
|
|
versions 7.0.0 and later.

`index.translog.retention.age`::

This controls the maximum duration for which translog files are kept by each
shard. Keeping more translog files increases the chance of performing an
operation-based sync when recovering replicas. If the translog files are not
sufficient, replica recovery will fall back to a file-based sync. Defaults to
`12h`. This setting is ignored, and should not be set, if soft deletes are
enabled. Soft deletes are enabled by default in indices created in {es}
versions 7.0.0 and later.
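
For an index that does not use soft deletes, both retention settings could be
adjusted together through the update index settings API, as sketched below.
The index name `my-index-000001` and the `1gb` and `24h` values are
illustrative assumptions. The request only has an effect when soft deletes are
disabled for the index, since the settings are otherwise ignored.

[source,console]
----
PUT /my-index-000001/_settings
{
  "index.translog.retention.size": "1gb",
  "index.translog.retention.age": "24h"
}
----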