[[index-modules-store]]
== Store

The store module allows you to control how index data is stored.

The index can either be stored in-memory (no persistence) or on-disk
(the default). In-memory indices provide better performance at the cost
of limiting the index size to the amount of available physical memory.

When using a local gateway (the default), file system storage with *no*
in-memory storage is required to maintain index consistency. This is
because the local gateway constructs its state from the local index
state of each node.

Another important aspect of memory based storage is that Elasticsearch
supports storing the index in memory *outside of the JVM heap space*,
using the `memory` storage type (see below). This means that extra
large JVM heaps (with their own consequences) are not needed to hold
the index in memory.

[float]
[[store-throttling]]
=== Store Level Throttling

Lucene, the IR library that elasticsearch uses under the covers, works
by creating immutable segments (up to deletes) and constantly merging
them (the merge policy settings control how those merges happen). The
merge process runs asynchronously, so it does not block indexing or
search directly. The problem, though, especially on systems with low
IO, is that merging can be expensive and can still affect search and
indexing simply because the box is now taxed with more IO.

The store module allows throttling to be configured for merges (or for
all store activity), either at the node level or at the index level.
Node level throttling makes sure that, across all the shards allocated
on that node, the merge process won't exceed the configured number of
bytes per second. It can be enabled by setting
`indices.store.throttle.type` to `merge`, and setting
`indices.store.throttle.max_bytes_per_sec` to something like `5mb`. The
node level settings can be changed dynamically using the cluster update
settings API. The default is `20mb` with type `merge`.
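
For example, the node level merge throttle can be changed on a live
cluster with the cluster update settings API. The request below is a
sketch: the host and the rate are placeholders to adjust for your
environment.

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "indices.store.throttle.type" : "merge",
        "indices.store.throttle.max_bytes_per_sec" : "5mb"
    }
}'
--------------------------------------------------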

If specific index level configuration is needed, regardless of the node
level settings, it can be set as well using
`index.store.throttle.type` and
`index.store.throttle.max_bytes_per_sec`. The default value for the
type is `node`, meaning the index throttles based on the node level
settings and participates in the global throttling. Both settings can
be changed dynamically using the index update settings API.
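
As a sketch, merges for a single index could be throttled like this
(the index name `my_index` is a placeholder):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/my_index/_settings -d '{
    "index.store.throttle.type" : "merge",
    "index.store.throttle.max_bytes_per_sec" : "5mb"
}'
--------------------------------------------------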

The following sections list the different storage types supported.

[float]
[[file-system]]
=== File System

File system based storage is the default. There are several
implementations, or storage types, and the best one for the operating
environment is chosen automatically: `mmapfs` on Solaris/Windows 64bit,
`simplefs` on Windows 32bit, and `niofs` for the rest.
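
A specific storage type can also be requested explicitly through the
`index.store.type` setting. As a sketch, it could be set when creating
an index (the index name `my_index` is a placeholder):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/my_index -d '{
    "settings" : {
        "index.store.type" : "niofs"
    }
}'
--------------------------------------------------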

The following are the different file system based storage types:

[float]
==== Simple FS

The `simplefs` type is a straightforward implementation of file system
storage (maps to Lucene `SimpleFSDirectory`) using a random access
file. This implementation has poor concurrent performance (multiple
threads will bottleneck). It is usually better to use `niofs` when you
need index persistence.

[float]
==== NIO FS

The `niofs` type stores the shard index on the file system (maps to
Lucene `NIOFSDirectory`) using NIO. It allows multiple threads to read
from the same file concurrently. It is not recommended on Windows
because of a bug in the Sun Java implementation.

[float]
==== MMap FS

The `mmapfs` type stores the shard index on the file system (maps to
Lucene `MMapDirectory`) by mapping a file into memory (mmap). Memory
mapping uses up a portion of the virtual memory address space in your
process equal to the size of the file being mapped. Before using this
type, be sure you have plenty of virtual address space.

[float]
[[store-memory]]
=== Memory

The `memory` type stores the index in main memory.

There are also *node* level settings that control the caching of
buffers (important when using direct buffers):

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description

|`cache.memory.direct` |Should the memory be allocated outside of the
JVM heap. Defaults to `true`.

|`cache.memory.small_buffer_size` |The small buffer size, defaults to
`1kb`.

|`cache.memory.large_buffer_size` |The large buffer size, defaults to
`1mb`.

|`cache.memory.small_cache_size` |The small buffer cache size, defaults
to `10mb`.

|`cache.memory.large_cache_size` |The large buffer cache size, defaults
to `500mb`.
|=======================================================================
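
As a sketch, an index could be created with the memory store like this,
assuming the `index.store.type` setting described above (the index name
`my_index` is a placeholder):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/my_index -d '{
    "settings" : {
        "index.store.type" : "memory"
    }
}'
--------------------------------------------------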