109 lines
5.3 KiB
Plaintext
109 lines
5.3 KiB
Plaintext
[role="xpack"]
|
|
[testenv="basic"]
|
|
[[frozen-indices]]
|
|
= Frozen Indices
|
|
|
|
[partintro]
|
|
--
|
|
{es} indices keep some data structures in memory to allow you to search them
|
|
efficiently and to index into them. If you have a lot of indices then the
|
|
memory required for these data structures can add up to a significant amount.
|
|
For indices that are searched frequently it is better to keep these structures
|
|
in memory because it takes time to rebuild them. However, you might access some
|
|
of your indices so rarely that you would prefer to release the corresponding
|
|
memory and rebuild these data structures on each search.
|
|
|
|
For example, if you are using time-based indices to store log messages or time
|
|
series data then it is likely that older indices are searched much less often
|
|
than the more recent ones. Older indices also receive no indexing requests.
|
|
Furthermore, it is usually the case that searches of older indices are for
|
|
performing longer-term analyses for which a slower response is acceptable.
|
|
|
|
If you have such indices then they are good candidates for becoming _frozen
|
|
indices_. {es} builds the transient data structures of each shard of a frozen
|
|
index each time that shard is searched, and discards these data structures as
|
|
soon as the search is complete. Because {es} does not maintain these transient
|
|
data structures in memory, frozen indices consume much less heap than normal
|
|
indices. This allows for a much higher disk-to-heap ratio than would otherwise
|
|
be possible.
|
|
|
|
Searches performed on frozen indices use the small, dedicated,
|
|
<<search-throttled,`search_throttled` threadpool>> to control the number of
|
|
concurrent searches that hit frozen shards on each node. This limits the amount
|
|
of extra memory required for the transient data structures corresponding to
|
|
frozen shards, which consequently protects nodes against excessive memory
|
|
consumption.
|
|
|
|
Frozen indices are read-only: you cannot index into them.
|
|
|
|
Searches on frozen indices are expected to execute slowly. Frozen indices are
|
|
not intended for high search load. It is possible that a search of a frozen
|
|
index may take seconds or minutes to complete, even if the same searches
|
|
completed in milliseconds when the indices were not frozen.
|
|
--
|
|
|
|
== Best Practices
|
|
|
|
Since frozen indices provide a much higher disk to heap ratio at the expense of search latency, it is advisable to allocate frozen indices to
|
|
dedicated nodes to prevent searches on frozen indices influencing traffic on low latency nodes. There is significant overhead in loading
|
|
data structures on demand which can cause page faults and garbage collections, which further slow down query execution.
|
|
|
|
Since indices that are eligible for freezing are unlikely to change in the future, disk space can be optimized as described in <<tune-for-disk-usage>>.
|
|
|
|
It's highly recommended to <<indices-forcemerge,`_forcemerge`>> your indices prior to freezing to ensure that each shard has only a single
|
|
segment on disk. This not only provides much better compression but also simplifies the data structures needed to service aggregation
|
|
or sorted search requests.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
POST /twitter/_forcemerge?max_num_segments=1
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[setup:twitter]
|
|
|
|
== Searching a frozen index
|
|
|
|
Frozen indices are throttled in order to limit memory consumptions per node. The number of concurrently loaded frozen indices per node is
|
|
limited by the number of threads in the <<search-throttled>> threadpool, which is `1` by default.
|
|
Search requests will not be executed against frozen indices by default, even if a frozen index is named explicitly. This is
|
|
to prevent accidental slowdowns by targeting a frozen index by mistake. To include frozen indices a search request must be executed with
|
|
the query parameter `ignore_throttled=false`.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /twitter/_search?q=user:kimchy&ignore_throttled=false
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[setup:twitter]
|
|
|
|
[IMPORTANT]
|
|
================================
|
|
While frozen indices are slow to search, they can be pre-filtered efficiently. The request parameter `pre_filter_shard_size` specifies
|
|
a threshold that, when exceeded, will enforce a round-trip to pre-filter search shards that cannot possibly match.
|
|
This filter phase can limit the number of shards significantly. For instance, if a date range filter is applied, then all indices (frozen or unfrozen) that do not contain documents within the date range can be skipped efficiently.
|
|
The default value for `pre_filter_shard_size` is `128` but it's recommended to set it to `1` when searching frozen indices. There is no
|
|
significant overhead associated with this pre-filter phase.
|
|
================================
|
|
|
|
== Monitoring frozen indices
|
|
|
|
Frozen indices are ordinary indices that use search throttling and a memory efficient shard implementation. For API's like the
|
|
`<<cat-indices>>` frozen indicies may identified by an index's `search.throttled` property (`sth`).
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_cat/indices/twitter?v&h=i,sth
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[s/^/PUT twitter\nPOST twitter\/_freeze\n/]
|
|
|
|
The response looks like:
|
|
|
|
[source,txt]
|
|
--------------------------------------------------
|
|
i sth
|
|
twitter true
|
|
--------------------------------------------------
|
|
// TESTRESPONSE[_cat]
|
|
|