OpenSearch/docs/reference/index-modules.asciidoc

266 lines
8.6 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

[[index-modules]]
= Index Modules
[partintro]
--
Index Modules are modules created per index and control all aspects related to
an index.
[float]
[[index-modules-settings]]
== Index Settings
Index level settings can be set per-index. Settings may be:
_static_::
They can only be set at index creation time or on a
<<indices-open-close,closed index>>.
_dynamic_::
They can be changed on a live index using the
<<indices-update-settings,update-index-settings>> API.
WARNING: Changing static or dynamic index settings on a closed index could
result in incorrect settings that are impossible to rectify without deleting
and recreating the index.
[float]
=== Static index settings
Below is a list of all _static_ index settings that are not associated with any
specific index module:
`index.number_of_shards`::
The number of primary shards that an index should have. Defaults to 5.
This setting can only be set at index creation time. It cannot be
changed on a closed index. Note: the number of shards are limited to `1024` per
index. This limitation is a safety limit to prevent accidental creation of indices
that can destabilize a cluster due to resource allocation. The limit can be modified
by specifying `export ES_JAVA_OPTS="-Des.index.max_number_of_shards=128"` system property on every node that is
part of the cluster.
`index.shard.check_on_startup`::
+
--
Whether or not shards should be checked for corruption before opening. When
corruption is detected, it will prevent the shard from being opened. Accepts:
`false`::
(default) Don't check for corruption when opening a shard.
`checksum`::
Check for physical corruption.
`true`::
Check for both physical and logical corruption. This is much more
expensive in terms of CPU and memory usage.
`fix`::
Check for both physical and logical corruption. Segments that were reported
as corrupted will be automatically removed. This option *may result in data loss*.
Use with extreme caution!
WARNING: Expert only. Checking shards may take a lot of time on large indices.
--
[[index-codec]] `index.codec`::
The +default+ value compresses stored data with LZ4
compression, but this can be set to +best_compression+
which uses https://en.wikipedia.org/wiki/DEFLATE[DEFLATE] for a higher
compression ratio, at the expense of slower stored fields performance.
If you are updating the compression type, the new one will be applied
after segments are merged. Segment merging can be forced using
<<indices-forcemerge,force merge>>.
[[routing-partition-size]] `index.routing_partition_size`::
The number of shards a custom <<mapping-routing-field,routing>> value can go to.
Defaults to 1 and can only be set at index creation time. This value must be less
than the `index.number_of_shards` unless the `index.number_of_shards` value is also 1.
See <<routing-index-partition>> for more details about how this setting is used.
[float]
[[dynamic-index-settings]]
=== Dynamic index settings
Below is a list of all _dynamic_ index settings that are not associated with any
specific index module:
`index.number_of_replicas`::
The number of replicas each primary shard has. Defaults to 1.
`index.auto_expand_replicas`::
Auto-expand the number of replicas based on the number of available nodes.
Set to a dash delimited lower and upper bound (e.g. `0-5`) or use `all`
for the upper bound (e.g. `0-all`). Defaults to `false` (i.e. disabled).
`index.search.idle.after`::
How long a shard can not receive a search or get request until it's considered
search idle. (default is `30s`)
`index.refresh_interval`::
How often to perform a refresh operation, which makes recent changes to the
index visible to search. Defaults to `1s`. Can be set to `-1` to disable
refresh. If this setting is not explicitly set, shards that haven't seen
search traffic for at least `index.search.idle.after` seconds will not receive
background refreshes until they receive a search request. Searches that hit an
idle shard where a refresh is pending will wait for the next background
refresh (within `1s`). This behavior aims to automatically optimize bulk
indexing in the default case when no searches are performed. In order to opt
out of this behavior an explicit value of `1s` should set as the refresh
interval.
`index.max_result_window`::
The maximum value of `from + size` for searches to this index. Defaults to
`10000`. Search requests take heap memory and time proportional to
`from + size` and this limits that memory. See
<<search-request-scroll,Scroll>> or <<search-request-search-after,Search After>> for a more efficient alternative
to raising this.
`index.max_inner_result_window`::
The maximum value of `from + size` for inner hits definition and top hits aggregations to this index. Defaults to
`100`. Inner hits and top hits aggregation take heap memory and time proportional to `from + size` and this limits that memory.
`index.max_rescore_window`::
The maximum value of `window_size` for `rescore` requests in searches of this index.
Defaults to `index.max_result_window` which defaults to `10000`. Search
requests take heap memory and time proportional to
`max(window_size, from + size)` and this limits that memory.
`index.max_docvalue_fields_search`::
The maximum number of `docvalue_fields` that are allowed in a query.
Defaults to `100`. Doc-value fields are costly since they might incur
a per-field per-document seek.
`index.max_script_fields`::
The maximum number of `script_fields` that are allowed in a query.
Defaults to `32`.
`index.max_ngram_diff`::
The maximum allowed difference between min_gram and max_gram for NGramTokenizer and NGramTokenFilter.
Defaults to `1`.
`index.max_shingle_diff`::
The maximum allowed difference between max_shingle_size and min_shingle_size for ShingleTokenFilter.
Defaults to `3`.
`index.blocks.read_only`::
Set to `true` to make the index and index metadata read only, `false` to
allow writes and metadata changes.
`index.blocks.read_only_allow_delete`::
Identical to `index.blocks.read_only` but allows deleting the index to free
up resources.
`index.blocks.read`::
Set to `true` to disable read operations against the index.
`index.blocks.write`::
  Set to `true` to disable data write operations against the index. Unlike `read_only',
  this setting does not affect metadata. For instance, you can close an index with a `write`
  block, but not an index with a `read_only` block.
`index.blocks.metadata`::
Set to `true` to disable index metadata reads and writes.
`index.max_refresh_listeners`::
Maximum number of refresh listeners available on each shard of the index.
These listeners are used to implement <<docs-refresh,`refresh=wait_for`>>.
`index.analyze.max_token_count`::
The maximum number of tokens that can be produced using _analyze API.
Defaults to `10000`.
`index.highlight.max_analyzed_offset`::
The maximum number of characters that will be analyzed for a highlight request.
This setting is only applicable when highlighting is requested on a text that was indexed without offsets or term vectors.
Defaults to `10000`.
[float]
=== Settings in other index modules
Other index settings are available in index modules:
<<analysis,Analysis>>::
Settings to define analyzers, tokenizers, token filters and character
filters.
<<index-modules-allocation,Index shard allocation>>::
Control over where, when, and how shards are allocated to nodes.
<<index-modules-mapper,Mapping>>::
Enable or disable dynamic mapping for an index.
<<index-modules-merge,Merging>>::
Control over how shards are merged by the background merge process.
<<index-modules-similarity,Similarities>>::
Configure custom similarity settings to customize how search results are
scored.
<<index-modules-slowlog,Slowlog>>::
Control over how slow queries and fetch requests are logged.
<<index-modules-store,Store>>::
Configure the type of filesystem used to access shard data.
<<index-modules-translog,Translog>>::
Control over the transaction log and background flush operations.
--
include::index-modules/analysis.asciidoc[]
include::index-modules/allocation.asciidoc[]
include::index-modules/mapper.asciidoc[]
include::index-modules/merge.asciidoc[]
include::index-modules/similarity.asciidoc[]
include::index-modules/slowlog.asciidoc[]
include::index-modules/store.asciidoc[]
include::index-modules/translog.asciidoc[]
include::index-modules/index-sorting.asciidoc[]