2013-08-28 19:24:34 -04:00
|
|
|
[[index-modules-translog]]
|
|
|
|
== Translog
|
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
Changes to Lucene are only persisted to disk during a Lucene commit, which is a
|
|
|
|
relatively expensive operation and so cannot be performed after every index or
|
|
|
|
delete operation. Changes that happen after one commit and before another will
|
|
|
|
be removed from the index by Lucene in the event of process exit or hardware
|
|
|
|
failure.
|
|
|
|
|
|
|
|
Because Lucene commits are too expensive to perform on every individual change,
|
|
|
|
each shard copy also has a _transaction log_ known as its _translog_ associated
|
|
|
|
with it. All index and delete operations are written to the translog after
|
|
|
|
being processed by the internal Lucene index but before they are acknowledged.
|
|
|
|
In the event of a crash, recent transactions that have been acknowledged but
|
|
|
|
not yet included in the last Lucene commit can instead be recovered from the
|
|
|
|
translog when the shard recovers.
|
2015-05-05 15:32:41 -04:00
|
|
|
|
2015-05-05 16:01:58 -04:00
|
|
|
An Elasticsearch flush is the process of performing a Lucene commit and
|
2018-01-19 05:17:22 -05:00
|
|
|
starting a new translog. Flushes are performed automatically in the background
|
|
|
|
in order to make sure the translog doesn't grow too large, which would make
|
2015-05-05 16:01:58 -04:00
|
|
|
replaying its operations take a considerable amount of time during recovery.
|
2018-01-19 05:17:22 -05:00
|
|
|
The ability to perform a flush manually is also exposed through an API,
|
|
|
|
although this is rarely needed.
|
2015-05-05 16:01:58 -04:00
|
|
|
|
2015-05-05 15:32:41 -04:00
|
|
|
[float]
|
|
|
|
=== Translog settings
|
2014-07-31 08:06:06 -04:00
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
The data in the translog is only persisted to disk when the translog is
|
2015-06-30 13:08:31 -04:00
|
|
|
++fsync++ed and committed. In the event of hardware failure, any data written
|
|
|
|
since the previous translog commit will be lost.
|
2015-05-05 15:32:41 -04:00
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds
|
|
|
|
if `index.translog.durability` is set to `async` or if set to `request`
|
|
|
|
(default) at the end of every <<docs-index_,index>>, <<docs-delete,delete>>,
|
|
|
|
<<docs-update,update>>, or <<docs-bulk,bulk>> request. More precisely, if set
|
|
|
|
to `request`, Elasticsearch will only report success of an index, delete,
|
|
|
|
update, or bulk request to the client after the translog has been successfully
|
|
|
|
++fsync++ed and committed on the primary and on every allocated replica.
|
2015-06-30 13:08:31 -04:00
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
The following <<indices-update-settings,dynamically updatable>> per-index
|
|
|
|
settings control the behaviour of the translog:
|
2014-07-31 08:06:06 -04:00
|
|
|
|
2015-03-27 05:18:09 -04:00
|
|
|
`index.translog.sync_interval`::
|
2014-07-31 08:06:06 -04:00
|
|
|
|
2015-06-30 13:08:31 -04:00
|
|
|
How often the translog is ++fsync++ed to disk and committed, regardless of
|
2016-01-27 06:28:38 -05:00
|
|
|
write operations. Defaults to `5s`. Values less than `100ms` are not allowed.
|
2014-07-31 08:06:06 -04:00
|
|
|
|
2015-06-30 13:08:31 -04:00
|
|
|
`index.translog.durability`::
|
|
|
|
+
|
|
|
|
--
|
|
|
|
|
|
|
|
Whether or not to `fsync` and commit the translog after every index, delete,
|
|
|
|
update, or bulk request. This setting accepts the following parameters:
|
2015-05-05 15:32:41 -04:00
|
|
|
|
2015-06-30 13:08:31 -04:00
|
|
|
`request`::
|
2015-05-05 15:32:41 -04:00
|
|
|
|
2015-06-30 13:08:31 -04:00
|
|
|
(default) `fsync` and commit after every request. In the event
|
|
|
|
of hardware failure, all acknowledged writes will already have been
|
2015-10-26 16:43:25 -04:00
|
|
|
committed to disk.
|
2015-06-30 13:08:31 -04:00
|
|
|
|
|
|
|
`async`::
|
|
|
|
|
|
|
|
`fsync` and commit in the background every `sync_interval`. In
|
|
|
|
the event of hardware failure, all acknowledged writes since the last
|
|
|
|
automatic commit will be discarded.
|
2016-08-02 17:43:14 -04:00
|
|
|
--
|
|
|
|
|
2017-06-22 11:08:14 -04:00
|
|
|
`index.translog.flush_threshold_size`::
|
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
The translog stores all operations that are not yet safely persisted in Lucene
|
|
|
|
(i.e., are not part of a Lucene commit point). Although these operations are
|
|
|
|
available for reads, they will need to be reindexed if the shard was to
|
|
|
|
shutdown and has to be recovered. This settings controls the maximum total size
|
|
|
|
of these operations, to prevent recoveries from taking too long. Once the
|
|
|
|
maximum size has been reached a flush will happen, generating a new Lucene
|
|
|
|
commit point. Defaults to `512mb`.
|
2017-06-22 11:08:14 -04:00
|
|
|
|
|
|
|
`index.translog.retention.size`::
|
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
The total size of translog files to keep. Keeping more translog files increases
|
|
|
|
the chance of performing an operation based sync when recovering replicas. If
|
|
|
|
the translog files are not sufficient, replica recovery will fall back to a
|
|
|
|
file based sync. Defaults to `512mb`
|
2017-06-22 11:08:14 -04:00
|
|
|
|
|
|
|
|
|
|
|
`index.translog.retention.age`::
|
|
|
|
|
|
|
|
The maximum duration for which translog files will be kept. Defaults to `12h`.
|
|
|
|
|
|
|
|
|
2016-08-02 17:43:14 -04:00
|
|
|
[float]
|
|
|
|
[[corrupt-translog-truncation]]
|
|
|
|
=== What to do if the translog becomes corrupted?
|
|
|
|
|
2018-01-19 05:17:22 -05:00
|
|
|
In some cases (a bad drive, user error) the translog on a shard copy can become
|
|
|
|
corrupted. When this corruption is detected by Elasticsearch due to mismatching
|
|
|
|
checksums, Elasticsearch will fail that shard copy and refuse to use that copy
|
|
|
|
of the data. If there are other copies of the shard available then
|
|
|
|
Elasticsearch will automatically recover from one of them using the normal
|
|
|
|
shard allocation and recovery mechanism. In particular, if the corrupt shard
|
|
|
|
copy was the primary when the corruption was detected then one of its replicas
|
|
|
|
will be promoted in its place.
|
2016-08-02 17:43:14 -04:00
|
|
|
|
|
|
|
If there is no copy of the data from which Elasticsearch can recover
|
|
|
|
successfully, a user may want to recover the data that is part of the shard at
|
|
|
|
the cost of losing the data that is currently contained in the translog. We
|
|
|
|
provide a command-line tool for this, `elasticsearch-translog`.
|
|
|
|
|
|
|
|
[WARNING]
|
|
|
|
The `elasticsearch-translog` tool should *not* be run while Elasticsearch is
|
2018-05-15 04:42:30 -04:00
|
|
|
running. If you attempt to run this tool while Elasticsearch is running, you
|
|
|
|
will permanently lose the documents that were contained only in the translog!
|
2016-08-02 17:43:14 -04:00
|
|
|
|
|
|
|
In order to run the `elasticsearch-translog` tool, specify the `truncate`
|
|
|
|
subcommand as well as the directory for the corrupted translog with the `-d`
|
|
|
|
option:
|
|
|
|
|
2017-08-30 03:30:36 -04:00
|
|
|
[source,txt]
|
2016-08-02 17:43:14 -04:00
|
|
|
--------------------------------------------------
|
|
|
|
$ bin/elasticsearch-translog truncate -d /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/
|
|
|
|
Checking existing translog files
|
|
|
|
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
|
|
|
! WARNING: Elasticsearch MUST be stopped before running this tool !
|
|
|
|
! !
|
|
|
|
! WARNING: Documents inside of translog files will be lost !
|
|
|
|
! !
|
|
|
|
! WARNING: The following files will be DELETED! !
|
|
|
|
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
|
|
|
--> data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-41.ckp
|
|
|
|
--> data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-6.ckp
|
|
|
|
--> data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-37.ckp
|
|
|
|
--> data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-24.ckp
|
|
|
|
--> data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-11.ckp
|
|
|
|
|
|
|
|
Continue and DELETE files? [y/N] y
|
|
|
|
Reading translog UUID information from Lucene commit from shard at [data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/index]
|
|
|
|
Translog Generation: 3
|
|
|
|
Translog UUID : AxqC4rocTC6e0fwsljAh-Q
|
|
|
|
Removing existing translog files
|
|
|
|
Creating new empty checkpoint at [data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog.ckp]
|
|
|
|
Creating new empty translog at [data/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/translog-3.tlog]
|
|
|
|
Done.
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
You can also use the `-h` option to get a list of all options and parameters
|
|
|
|
that the `elasticsearch-translog` tool supports.
|