[[index-modules-store]]
== Store

The store module allows you to control how index data is stored.

The index can either be stored in-memory (no persistence) or on-disk
(the default). In-memory indices provide better performance at the cost
of limiting the index size to the amount of available physical memory.

When using a local gateway (the default), file system storage with *no*
in memory storage is required to maintain index consistency. This is
required since the local gateway constructs its state from the local
index state of each node.

Memory based storage does not have to live on the JVM heap:
Elasticsearch supports storing the index in memory *outside of the JVM
heap space* using the "Memory" storage type (see below). As a result,
keeping an index in memory does not require an extra large JVM heap
(with the consequences such heaps bring).


[float]
[[store-throttling]]
=== Store Level Throttling

Lucene, the IR library Elasticsearch uses under the covers, works by
creating immutable segments (up to deletes) and constantly merging them
(the merge policy settings control how those merges happen). Merging
runs asynchronously, so it does not block indexing or search. It does
compete with them for disk IO, however: especially on systems with low
IO capacity, an expensive merge can slow down search and index
operations simply because the box is taxed with more IO.

The store module allows throttling to be configured for merges (or for
all), either on the node level or on the index level. Node level
throttling makes sure that, across all the shards allocated on that
node, the merge process does not exceed the configured number of bytes
per second. It is enabled by setting `indices.store.throttle.type` to
`merge` and `indices.store.throttle.max_bytes_per_sec` to something
like `5mb`. The node level settings can be changed dynamically using
the cluster update settings API. The default is `20mb` with type
`merge`.
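
As a rough illustration of the node level settings above, the following
Python sketch builds a request body for the cluster update settings API
(`PUT /_cluster/settings`). The helper name `node_throttle_settings` is
hypothetical, and the choice of the `transient` scope by default is an
assumption made for the example, not a recommendation.

```python
import json


def node_throttle_settings(max_bytes_per_sec, throttle_type="merge",
                           persistent=False):
    """Build a body for the cluster update settings API.

    Hypothetical helper: it only assembles the JSON document and does
    not send it to a cluster.
    """
    # "transient" settings are lost on a full cluster restart,
    # "persistent" settings survive it.
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            "indices.store.throttle.type": throttle_type,
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec,
        }
    }


body = node_throttle_settings("5mb")
print(json.dumps(body, indent=2))
```

The printed document can then be sent with any HTTP client as the body
of the settings request.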

If a specific index needs its own configuration, regardless of the node
level settings, set `index.store.throttle.type` and
`index.store.throttle.max_bytes_per_sec` on that index. The default
value for the type is `node`, meaning the index is throttled according
to the node level settings and participates in the node-wide
throttling. Both settings can be changed dynamically using the index
update settings API.
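
The fallback from index level to node level settings can be sketched as
follows. This is a minimal model of the rule described above, not the
actual implementation: the function names are hypothetical, and
`NODE_DEFAULTS` simply restates the node level defaults given in the
text.

```python
# Node level defaults as stated in the text (type `merge`, `20mb`).
NODE_DEFAULTS = {"type": "merge", "max_bytes_per_sec": "20mb"}


def index_throttle_settings(throttle_type="node", max_bytes_per_sec=None):
    """Build a flat settings body for the index update settings API."""
    settings = {"index.store.throttle.type": throttle_type}
    if max_bytes_per_sec is not None:
        settings["index.store.throttle.max_bytes_per_sec"] = max_bytes_per_sec
    return settings


def effective_throttle(index_settings, node_settings=NODE_DEFAULTS):
    """Return the throttle that applies to an index (illustrative only)."""
    index_type = index_settings.get("index.store.throttle.type", "node")
    if index_type == "node":
        # The index defers to the node level settings.
        return dict(node_settings)
    return {
        "type": index_type,
        "max_bytes_per_sec": index_settings.get(
            "index.store.throttle.max_bytes_per_sec"),
    }


print(effective_throttle(index_throttle_settings()))
print(effective_throttle(index_throttle_settings("merge", "5mb")))
```

The first call shows an index with no overrides inheriting the node
level values; the second shows an index with its own limit.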

The following sections list all the supported storage types.

[float]
[[file-system]]
=== File System

File system based storage is the default storage used. There are
different implementations or storage types. The best one for the
operating environment will be automatically chosen: `mmapfs` on
Solaris/Windows 64bit, `simplefs` on Windows 32bit, and `niofs` for the
rest.
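
The selection rule above can be written down as a small function. This
is only a sketch of the rule as stated in the text: the function name is
hypothetical, and it assumes that "64bit" applies to both Solaris and
Windows, which the text leaves ambiguous.

```python
def default_store_type(os_name, is_64bit):
    """Return the store type the text describes as the automatic default.

    Assumption: the 64bit requirement applies to Solaris as well as
    Windows; a 32bit Solaris system then falls through to `niofs`.
    """
    name = os_name.lower()
    is_windows = name.startswith("windows")
    is_solaris = name.startswith(("sunos", "solaris"))
    if (is_windows or is_solaris) and is_64bit:
        return "mmapfs"
    if is_windows:
        return "simplefs"
    return "niofs"


for os_name, is_64bit in [("Windows", True), ("Windows", False),
                          ("SunOS", True), ("Linux", True)]:
    print(os_name, is_64bit, default_store_type(os_name, is_64bit))
```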

The following are the different file system based storage types:

[float]
==== Simple FS

The `simplefs` type is a straightforward implementation of file system
storage (maps to Lucene `SimpleFSDirectory`) using a random access file.
This implementation has poor concurrent performance (multiple threads
will bottleneck). It is usually better to use `niofs` when you need
index persistence.

[float]
==== NIO FS

The `niofs` type stores the shard index on the file system (maps to
Lucene `NIOFSDirectory`) using NIO. It allows multiple threads to read
from the same file concurrently. It is not recommended on Windows
because of a bug in the Sun Java implementation.
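
To illustrate why concurrent reads from one file do not have to
bottleneck, the following Python sketch reads several chunks of a file
from different threads using positional reads, where each call carries
its own offset and no shared file position has to be guarded. It is an
analogy on a POSIX system (`os.pread` is not available on Windows), not
a description of Lucene's code, and the function names are hypothetical.

```python
import os
import tempfile
import threading


def read_chunks_concurrently(path, chunk_size, num_chunks):
    """Read `num_chunks` chunks of `path` in parallel threads."""
    fd = os.open(path, os.O_RDONLY)
    results = [None] * num_chunks

    def worker(i):
        # Positional read: the offset is an argument, so threads do not
        # race on a shared seek position.
        results[i] = os.pread(fd, chunk_size, i * chunk_size)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_chunks)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return results


def demo():
    # Write a small temporary file and read it back in four chunks.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"aaaabbbbccccdddd")
        path = f.name
    try:
        return read_chunks_concurrently(path, 4, 4)
    finally:
        os.remove(path)


print(demo())
```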

[float]
==== MMap FS

The `mmapfs` type stores the shard index on the file system (maps to
Lucene `MMapDirectory`) by mapping a file into memory (mmap). Memory
mapping uses up a portion of the virtual memory address space in your
process equal to the size of the file being mapped. Before using this
storage type, be sure you have plenty of virtual address space.
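
The relationship between file size and mapped address space can be seen
with a small Python sketch using the standard `mmap` module. It only
illustrates memory mapping in general; the file contents and the
function name are made up for the example.

```python
import mmap
import os
import tempfile


def mapped_length(path):
    """Map the whole file read-only and return the mapping length."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return len(m)


def demo():
    # Create an 8192 byte temporary file and map it.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * 8192)
        path = f.name
    try:
        return os.path.getsize(path), mapped_length(path)
    finally:
        os.remove(path)


file_size, mapping_size = demo()
print(file_size, mapping_size)
```

The mapping is as long as the file, so mapping many large files
consumes a correspondingly large range of virtual addresses, even
though physical memory is only used for the pages actually touched.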

[float]
[[store-memory]]
=== Memory

The `memory` type stores the index in main memory with the following
configuration options:

There are also *node* level settings that control the caching of buffers
(important when using direct buffers):

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`cache.memory.direct` |Should the memory be allocated outside of the
JVM heap. Defaults to `true`.

|`cache.memory.small_buffer_size` |The small buffer size, defaults to
`1kb`.

|`cache.memory.large_buffer_size` |The large buffer size, defaults to
`1mb`.

|`cache.memory.small_cache_size` |The small buffer cache size, defaults
to `10mb`.

|`cache.memory.large_cache_size` |The large buffer cache size, defaults
to `500mb`.
|=======================================================================
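
To get a feel for the defaults in the table, the following sketch
converts the size values to bytes and divides each cache size by its
buffer size. It assumes that sizes use multiples of 1024 and that each
cache holds whole buffers of a single size; the second assumption is a
reading of the setting names, not something the table states. The
helper names are hypothetical.

```python
# Size suffixes, assumed to be multiples of 1024.
UNITS = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(value):
    """Convert a size such as `10mb` to a number of bytes."""
    text = value.strip().lower()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[:-len(suffix)]) * factor)
    if text.endswith("b"):
        text = text[:-1]
    return int(text)


def buffers_in_cache(cache_size, buffer_size):
    """Number of whole buffers that fit in a cache (assumed model)."""
    return parse_size(cache_size) // parse_size(buffer_size)


print(buffers_in_cache("10mb", "1kb"))
print(buffers_in_cache("500mb", "1mb"))
```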