Merge pull request #11955 from clintongormley/translog_docs

Docs: Updated the translog docs to reflect the new behaviour/settings
2015-07-07 15:37:38 +02:00 · 2015-07-07 15:37:38 +02:00 · 3ffb50828b
parent 21b4f9b6f8 93fe8f8910
commit 3ffb50828b
1 changed files with 47 additions and 32 deletions
--- a/docs/reference/index-modules/translog.asciidoc
+++ b/docs/reference/index-modules/translog.asciidoc
@@ -44,51 +44,66 @@ How long to wait before triggering a flush regardless of translog size. Defaults
 How often to check if a flush is needed, randomized between the interval value
 and 2x the interval value. Defaults to `5s`.
 [float]
 === Translog settings
-The translog itself is only persisted to disk when it is ++fsync++ed.  Until
+The data in the transaction log is only persisted to disk when the translog is
-then, data recently written to the translog may only exist in the file system
++fsync++ed and committed.  In the event of hardware failure, any data written
-cache and could potentially be lost in the event of hardware failure.
+since the previous translog commit will be lost.
-The following <<indices-update-settings,dynamically updatable>> settings
+By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds
 and at the end of every <<docs-index_,index>>, <<doc-delete,delete>>,
 <<doc-update,update>>, or  <<docs-bulk,bulk>> request.  In fact, Elasticsearch
 will only report success of an index, delete, update, or bulk request to the
 client after the transaction log has been successfully ++fsync++ed and committed
 on the primary and on every allocated replica.
 The following <<indices-update-settings,dynamically updatable>> per-index settings
 control the behaviour of the transaction log:
 `index.translog.sync_interval`::
-How often the translog is ++fsync++ed to disk. Defaults to `5s`. Can be set to
+How often the translog is ++fsync++ed to disk and committed, regardless of
-`0` to sync after each operation.
+write operations. Defaults to `5s`.
 `index.translog.durability`::
 +
 --
 Whether or not to `fsync` and commit the translog after every index, delete,
 update, or bulk request.  This setting accepts the following parameters:
 `request`::
    (default) `fsync` and commit after every request. In the event
    of hardware failure, all acknowledged writes will already have been
    committed to disk.
 `async`::
    `fsync` and commit in the background every `sync_interval`. In
    the event of hardware failure, all acknowledged writes since the last
    automatic commit will be discarded.
 --
 `index.translog.fs.type`::
 +
 --
-Either a `buffered` translog (default) which buffers 64kB in memory before
+Whether to buffer writes to the transaction log in memory or not.  This
-writing to disk, or a `simple` translog which writes every entry to disk
+setting accepts the following parameters:
 immediately.  Whichever is used, these writes are only ++fsync++ed according
 to the `sync_interval`.
-The `buffered` translog is written to disk when it reaches 64kB in size, or
+`buffered`::
 whenever a `sync` is triggered by the `sync_interval`.
-.Why don't we `fsync` the translog after every write?
+    (default) Translog writes first go to a 64kB buffer in memory,
-******************************************************
+    and are only written to the disk when the buffer is full, or when an
    `fsync` is triggered by a write request or the `sync_interval`.
-The disk is the slowest part of any server. An `fsync` ensures that data in
+`simple`::
 the file system buffer has been physically written to disk, but this
 persistence comes with a performance cost.
-However, the translog is not the only persistence mechanism in Elasticsearch.
+    Translog writes are written to the file system immediately, without
-Any index or update request is first written to the primary shard, then
+    buffering.  However, these writes will only be persisted to disk when an
-forwarded in parallel to any replica shards. The primary waits for the action
+    `fsync` and commit is triggered by a write request or the `sync_interval`.
 to be completed on the replicas before returning success to the client.
-If the node holding the primary shard dies for some reason, its transaction
+--
 log could be missing the last 5 seconds of data. However, that data should
 already be available on a replica shard on a different node.  Of course, if
 the whole data centre loses power at the same time, then it is possible that
 you could lose the last 5 seconds (or `sync_interval`) of data.
 We are constantly monitoring the performance implications of better default
 translog sync semantics, so the default might change as time passes and HW,
 virtualization, and other aspects improve.
 ******************************************************
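
Reviewer note: the difference between the two `index.translog.durability` values described in this diff can be illustrated with a toy model. This is only a sketch of the documented semantics, not Elasticsearch code: the `TranslogModel` class, its methods, and the `demo` helper are hypothetical names, and the background `sync_interval` timer is replaced by an explicit `sync()` call.

```python
from dataclasses import dataclass, field


@dataclass
class TranslogModel:
    """Toy model of the documented durability modes (hypothetical, not Elasticsearch code)."""

    durability: str = "request"  # "request" (default) or "async"
    pending: list = field(default_factory=list)  # acknowledged, not yet fsynced
    durable: list = field(default_factory=list)  # fsynced and committed

    def __post_init__(self):
        if self.durability not in ("request", "async"):
            raise ValueError(f"unknown durability: {self.durability!r}")

    def write(self, op):
        """Record an operation and acknowledge it to the caller."""
        self.pending.append(op)
        if self.durability == "request":
            # "request": fsync and commit before acknowledging the write.
            self.sync()
        return True

    def sync(self):
        """Stand-in for an fsync and commit (e.g. a sync_interval tick)."""
        self.durable.extend(self.pending)
        self.pending.clear()

    def crash(self):
        """Simulate hardware failure: everything not yet synced is lost."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


def demo(durability):
    log = TranslogModel(durability)
    log.write("index doc 1")
    log.write("index doc 2")
    log.sync()  # background sync fires here
    log.write("delete doc 1")
    lost = log.crash()
    return lost, list(log.durable)


if __name__ == "__main__":
    for mode in ("request", "async"):
        lost, durable = demo(mode)
        print(mode, "lost:", lost, "durable:", durable)
```

In this model, `request` loses no acknowledged writes on a crash, while `async` loses whatever was acknowledged after the last sync, which matches the trade-off the new text describes.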