
[[index-modules-store]]
== Store

The store module allows you to control how index data is stored and accessed on disk.

[float]
[[file-system]]
=== File system storage types

There are different file system implementations or _storage types_. By default,
Elasticsearch will pick the best implementation based on the operating
environment.

This can be overridden for all indices by adding this to the
`config/elasticsearch.yml` file:

[source,yaml]
---------------------------------
index.store.type: niofs
---------------------------------

It is a _static_ setting that can be set on a per-index basis at index
creation time:

[source,js]
---------------------------------
PUT /my_index
{
  "settings": {
    "index.store.type": "niofs"
  }
}
---------------------------------

WARNING: This is an expert-only setting and may be removed in the future.
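As a minimal sketch, the request body shown above can be built and checked
programmatically before it is sent. The helper name `create_index_body` and the
`KNOWN_STORE_TYPES` set are illustrative assumptions, not part of any client
library; the set only mirrors the store types described on this page.

```python
import json

# Store types described on this page; other versions may support a different set.
KNOWN_STORE_TYPES = {"fs", "simplefs", "niofs", "mmapfs"}


def create_index_body(store_type):
    """Return a create-index request body that sets index.store.type."""
    if store_type not in KNOWN_STORE_TYPES:
        raise ValueError(f"unknown store type: {store_type!r}")
    # index.store.type is static, so it has to be part of the creation request.
    return {"settings": {"index.store.type": store_type}}


if __name__ == "__main__":
    # Print the JSON body that would accompany `PUT /my_index`.
    print(json.dumps(create_index_body("niofs"), indent=2))
```

Rejecting unknown values early is a design choice of this sketch: because the
setting is static, a typo would otherwise only surface when the index is created.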

The following section lists all the supported storage types.

`fs`::

Default file system implementation. This will pick the best implementation
depending on the operating environment, which is currently `mmapfs` on all
supported systems but is subject to change.

[[simplefs]]`simplefs`::

The Simple FS type is a straightforward implementation of file system
storage (maps to Lucene `SimpleFSDirectory`) using a random access file.
This implementation has poor concurrent performance (multiple threads
will bottleneck). It is usually better to use `niofs` when you need
index persistence.

[[niofs]]`niofs`::

The NIO FS type stores the shard index on the file system (maps to
Lucene `NIOFSDirectory`) using NIO. It allows multiple threads to read
from the same file concurrently. It is not recommended on Windows
because of a bug in the Sun Java implementation.

[[mmapfs]]`mmapfs`::

The MMap FS type stores the shard index on the file system (maps to
Lucene `MMapDirectory`) by mapping a file into memory (mmap). Memory
mapping uses up a portion of the virtual memory address space in your
process equal to the size of the file being mapped. Before using this
store type, be sure you have allowed plenty of
<<vm-max-map-count,virtual address space>>.

[[default_fs]]`default_fs` deprecated[5.0.0, The `default_fs` store type is deprecated - use `fs` instead]::

The `default_fs` type is deprecated and is aliased to `fs` for backward
compatibility.
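The `mmapfs` entry above asks you to allow enough room for memory maps. The
following is a minimal sketch of such a check, assuming a Linux host where the
limit is exposed through the `vm.max_map_count` sysctl at
`/proc/sys/vm/max_map_count`. The function names are hypothetical, and the
`REQUIRED` value of 262144 is an assumption based on the value commonly
recommended in the setup documentation; adjust it for your deployment.

```python
from pathlib import Path

# Assumption: Linux exposes the memory-map limit at this path.
MAX_MAP_COUNT_PATH = Path("/proc/sys/vm/max_map_count")

# Assumption: threshold taken from the commonly recommended setup value.
REQUIRED = 262144


def parse_max_map_count(text):
    """Parse the contents of the max_map_count file into an integer."""
    return int(text.strip())


def has_enough_map_areas(current, required=REQUIRED):
    """Return True if the current limit is at least the required limit."""
    return current >= required


if __name__ == "__main__":
    if MAX_MAP_COUNT_PATH.exists():
        current = parse_max_map_count(MAX_MAP_COUNT_PATH.read_text())
        print(current, "ok" if has_enough_map_areas(current) else "too low")
    else:
        print("vm.max_map_count is not exposed on this system")
```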

=== Pre-loading data into the file system cache

NOTE: This is an expert setting, the details of which may change in the future.

By default, Elasticsearch relies entirely on the operating system file system
cache for caching I/O operations. You can set `index.store.preload`
to tell the operating system to load the content of hot index
files into memory upon opening. This setting accepts a comma-separated list of
file extensions: all files whose extension is in the list will be pre-loaded
upon opening. This can improve the search performance of an index,
especially after the host operating system is restarted, since a restart
empties the file system cache. Note, however, that this may slow down the
opening of indices, as they only become available after the data has been
loaded into physical memory.

This setting is best-effort only and may not work at all depending on the store
type and host operating system.

`index.store.preload` is a static setting that can be set either in
`config/elasticsearch.yml`:

[source,yaml]
---------------------------------
index.store.preload: ["nvd", "dvd"]
---------------------------------

or in the index settings at index creation time:

[source,js]
---------------------------------
PUT /my_index
{
  "settings": {
    "index.store.preload": ["nvd", "dvd"]
  }
}
---------------------------------

The default value is an empty array, which means that nothing is loaded
into the file system cache eagerly. For indices that are actively searched,
you might want to set it to `["nvd", "dvd"]`, which causes norms and doc
values to be loaded eagerly into physical memory. These are the first two
extensions to consider, since Elasticsearch performs random access on them.

A wildcard can be used to indicate that all files should be preloaded:
`index.store.preload: ["*"]`. Note, however, that it is generally not useful to
load all files into memory, in particular those for stored fields and term
vectors, so a better option might be to set it to
`["nvd", "dvd", "tim", "doc", "dim"]`, which will preload norms, doc values,
terms dictionaries, postings lists and points, which are the most important
parts of the index for search and aggregations.
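The selection rule described above (match on file extension, with `*` selecting
everything) can be sketched as follows. This only illustrates the documented
behavior and is not the actual implementation; the function name
`should_preload` and the sample file names are hypothetical.

```python
def should_preload(file_name, preload):
    """Return True if the preload list selects this file."""
    # The wildcard selects every file, whatever its extension.
    if "*" in preload:
        return True
    # Otherwise compare the text after the last dot with the configured list.
    _, dot, ext = file_name.rpartition(".")
    return bool(dot) and ext in preload


if __name__ == "__main__":
    preload = ["nvd", "dvd", "tim", "doc", "dim"]
    for name in ["_0.dvd", "_0.fdt", "_0.tim", "segments_1"]:
        print(name, should_preload(name, preload))
```

With that list, files for stored fields (such as the hypothetical `_0.fdt`) are
left to be cached lazily by the operating system.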

Note that this setting can be dangerous on indices that are larger than the
main memory of the host, as it would cause the file system cache to be
trashed upon reopens after large merges, which would make indexing and searching
_slower_.
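Before enabling preloading on a large index, it can help to estimate how much
data the chosen extensions would pull into memory. The following is a rough
sketch, assuming you already have a mapping of file names to sizes in bytes
(for example from a directory listing of a shard). The function names and the
`budget_fraction` parameter are hypothetical, and the 0.5 default is an
arbitrary illustrative margin, not a recommendation from this documentation.

```python
def extension(file_name):
    """Return the text after the last dot, or "" if there is no dot."""
    _, dot, ext = file_name.rpartition(".")
    return ext if dot else ""


def preload_bytes(file_sizes, preload):
    """Sum the sizes of the files that the preload list would select."""
    if "*" in preload:
        return sum(file_sizes.values())
    return sum(size for name, size in file_sizes.items()
               if extension(name) in preload)


def fits_in_memory(needed_bytes, memory_bytes, budget_fraction=0.5):
    """Return True if needed_bytes stays within a fraction of memory_bytes."""
    return needed_bytes <= memory_bytes * budget_fraction


if __name__ == "__main__":
    # Hypothetical file sizes in bytes for one shard.
    sizes = {"_0.dvd": 100, "_0.nvd": 50, "_0.fdt": 1000, "segments_1": 10}
    needed = preload_bytes(sizes, ["nvd", "dvd"])
    print(needed, fits_in_memory(needed, memory_bytes=400))
```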