Each shard repository consists of snapshot file for each snapshot - this file contains a map between original physical file that is snapshotted and its representation in repository. This data includes original filename, checksum and length. When a new snapshot is created, elasticsearch needs to read all these snapshot files to figure which file are already present in the repository and which files still have to be copied there. This change adds a new index file that contains all this information combined into a single file. So, if a repository has 1000 snapshots with 1000 shards elasticsearch will only need to read 1000 blobs (one per shard) instead of 1,000,000 to delete a snapshot. This change should also improve snapshot creation speed on repositories with large number of snapshot and high latency.
Fixes#8958
Some settings may be considered sensitive, such as passwords, and storing them
in the configuration file on disk is not good from a security perspective. This change
allows settings to have a special value, `${prompt::text}` or `${prompt::secret}`, to
indicate that elasticsearch should prompt the user for the actual value on startup.
This only works when started in the foreground. In cases where elasticsearch is started
as a service or in the background, an exception will be thrown.
Closes#10838
Currently, when trying to determine if a location is within one of the configured repository
paths, we compare the root path against a normalized path but the root path is never
normalized so the check may incorrectly fail. This change normalizes the root path and
compares it to the other normalized path.
Relates to #11426
documentaiton
Change log entries, add convenience methods, unit tests
Correct allocation decision messages to display used space
Missed a word
Correct displayed threshold value
A `_parent` field can only point to a type that doesn't exist yet.
The validation that checks this relied on have all mappings in the
MapperService. The issue is that this check is performed on the
elected master node and it may not have the IndexService at all
to perform this check. In that case it creates a temporary IndexService and
MapperService to perform mapping validation, but only the mappings
that are part of the put index call are created, not the already existing mappings.
Because of that the `_parent` field validation can't be performed.
By changing the validation to rely on the cluster state's IndexMetaData instead
we can get around the issue with the IndexService/MapperService on the elected
master node. The IndexMetaData always holds the MappingMetaData instances
for all types.
When the dfs phase runs a SearchContext is created on each node that has a shard
for this query. When the query phase (or query and fetch phase) failed that SearchContext
was released only if the query was actually executed on the node. If for example
the query was rejected because the thread pool queue was full then the search context
was not released.
This commit adds a dedicated call for releasing the SearchContext in this case.
In addition, we must set the docIdsToLoad to null in case the fetch phase failed, otherwise
search contexts might not be released in releaseIrrelevantSearchContexts.
closes#11400
Currently all settings that were specified during repository creation are displayed. This commit enables plugins such as cloud-aws to filter out sensitive settings from the repository.
Closes#11265
When using async fetch, we can end up with cluster updates and reroutes based on teh number of shards. While not disastrous we can optimize it, since a single reroute is enough to apply to all the async fetch results that arrived during that time.
* Cut the `has_child` and `has_parent` queries over to use Lucene's query time global ordinal join. The main benefit of this change is that parent/child queries can now efficiently execute if parent/child queries are wrapped in a bigger boolean query. If the rest of the query only hit a few documents both has_child and has_parent queries don't need to evaluate all parent or child documents any more.
* Cut the `_parent` field over to use doc values. This significantly reduces the on heap memory footprint of parent/child, because the parent id values are never loaded into memory.
Breaking changes:
* The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet, so this means that an existing type/mapping can't become a parent type any longer.
* The `has_child` and `has_parent` queries can no longer be use in alias filters.
All these changes, improvements and breaks in compatibility only apply for indices created with ES version 2.0 or higher. For indices creates with ES <= 2.0 the older implementation is used.
It is highly recommended to re-index all your indices with parent and child documents to benefit from all the improvements that come with this refactoring. The easiest way to achieve this is by using the scan and bulk apis using a simple script.
Closes#6107Closes#8134
This is because commit 35a58d874e causes the following REST tests to fail and reverting the commit causes conflicts:
update/15_script/Script
script/10_basic/Indexed script
This change unifies the way scripts and templates are specified for all instances in the codebase. It builds on the Script class added previously and adds request building and parsing support as well as the ability to transfer script objects between nodes. It also adds a Template class which aims to provide the same functionality for template APIs
Closes#11091
Mappers are currently used at both index and query time for deciding
how to "use" a field. For #8871, we need the index wide view of
mappings to have a unified set of settings for each field of a given
name within the index.
This change moves all the current settings (and methods defining
query time behavior) into subclasses of FieldType. In a future
PR, this will allow storing the field type at the index level,
instead of mappers (which can still have settings that differ
per document type).
The change is quite large (I'm sorry). I could not see a way to
migrate to this in a more piecemeal way. I did leave out cutting
over callers of the query methods to using the field type, as
that can be done in a follow up.
LZF only stays for backward-compatibility reasons and can only read, not write.
DEFLATE is configured to use level=3, which is a nice trade-off between speed
and compression ratio and is the same as we use for Lucene's high compression
codec.
IndexRequestBuilder#setSource as well as CreateIndexRequestBuilder#setSettings and
CreateIndexRequestBuilder#setSouce() will not work with Map<String, String> argument
although the API looks like it should. This PR fixes the problem introducing correct
wildcard parameters and adds tests.
Closes#10825
We have a lot of module classes that don't contain any actual logic,
only declarative bind actions. These classes are unnecessary and can
be consolidated into the already existings IndexShardModule