This change adds an index setting to define how the documents should be sorted inside each Segment.
It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk.
It is not allowed to use a `nested` fields inside an index that defines an index sorting since `nested` fields relies on the original sort of the index.
This change does not add early termination capabilities in the search layer. This will be added in a follow up.
Relates #6720
Elasticsearch runs as user elasticsearch with uid:gid 1000:1000 inside
the Docker container. Clarify that bind mounted local directories need
to be accessible by this user.
Relates #24092
This change simplifies how the rest test runner finds test files and
removes all leniency. Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.
closes#20240
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.
Some notes about the change:
- it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
- it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
- two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
The docs don't clearly explain that the deleted doc count also comes from lucene.
IMHO, it is worth highlighting this information separately, as a Note.
Apart from that, there should be an official recommended alternative as well.
Today Elasticsearch allows default settings to be used only if the
actual setting is not set. These settings are trappy, and the complexity
invites bugs. This commit removes support for default settings with the
exception of default.path.data, default.path.conf, and default.path.logs
which are maintainted to support packaging. A follow-up will remove
support for these as well.
Relates #24093
Drops any mention of non-sandboxed scripting languages other than a
brief "we don't support them and we shouldn't because A and B"
statement.
Relates to #23930
Now that we have incremental reduce functions for topN and aggregations
we can set the default for `action.search.shard_count.limit` to unlimited.
This still allows users to restrict these settings while by default we executed
across all shards matching the search requests index pattern.
_field_stats has evolved quite a lot to become a multi purpose API capable of retrieving the field capabilities and the min/max value for a field.
In the mean time a more focused API called `_field_caps` has been added, this enpoint is a good replacement for _field_stats since he can
retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures).
Also the recent improvement made to range queries makes the _field_stats API obsolete since this queries are now rewritten per shard based on the min/max found for the field.
This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently.
For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet which is why
this PR is made directly to 6.0.
The rest tests have also been adapted to not throw an error while this change is backported to 5.4.
This commit upgrades the Log4j dependencies from version 2.7 to version
2.8.2. This release includes a fix for a case where Log4j could lose
exceptions in the presence of a security manager.
Relates #23995
This commit removes some leniency from the plugin service which skips
hidden files in the plugins directory. We really want to ensure the
integrity of the plugin folder, so hasta la vista leniency.
Relates #23982
They needed to be updated now that Painless is the default and
the non-sandboxed scripting languages are going away or gone.
I dropped the entire section about customizing the classloader
whitelists. In master this barely does anything (exposes more
things to expressions).
Before now ranges where forbidden, because the percolator query itself could get cached and then the percolator queries with now ranges that should no longer match, incorrectly will continue to match.
By disabling caching when the `percolator` is being used, the percolator can now correctly support range queries with now based ranges.
I think this is the right tradeoff. The percolator query is likely to not be the same between search requests and disabling range queries with now ranges really disabled people using the percolator for their use cases.
Also fixed an issue that existed in the percolator fieldmapper, it was unable to find forbidden queries inside `dismax` queries.
Closes#23859
While there are use-cases where a single-node is in production, there
are also use-cases for starting a single-node that binds transport to an
external interface where the node is not in production (for example, for
testing the transport client against a node started in a Docker
container). It's tricky to balance the desire to always enforce the
bootstrap checks when a node might be in production with the need for
the community to perform testing in situations that would trip the
bootstrap checks. This commit enables some flexibility for these
users. By setting the discovery type to "single-node", we disable the
bootstrap checks independently of how transport is bound. While this
sounds like a hole in the bootstrap checks, the bootstrap checks can
already be avoided in the single-node use-case by binding only HTTP but
not transport. For users that are genuinely in production on a
single-node use-case with transport bound to an external use-case, they
can set the system property "es.enable.bootstrap.checks" to force
running the bootstrap checks. It would be a mistake for them not to do
this.
Relates #23598
Fielddata can no longer be configured to be loaded eagerly (it only accepts
`true` and `false`), so this line is a little misleading because it talks about
a procedure we can no longer do.
With this commit, Azure repositories are now using an Exponential Backoff policy before failing the backup.
It uses Azure SDK default values for this policy:
* `30s` delta backoff base with
* `3s` min
* `90s` max
* `3` retries max
Users can define the number of retries they wish by setting `cloud.azure.storage.xxx.max_retries` where `xxx` is the azure named account.
Closes#22728.
Converts the analysis docs to that were marked as json into `CONSOLE`
format. A few of them were in yaml but marked as json for historical
reasons. I added more complete examples for a few of the less obvious
sounding ones.
Relates to #18160
The pattern-analyzer docs contained a snippet that was an expanded
regex that was marked as `[source,js]`. This changes it to
`[source,regex]`.
The htmlstrip-charfilter and pattern-replace-charfilter docs had
examples that were actually a list of tokens but marked `[source,js]`.
This marks them as `[source,text]` so they don't count as unconverted
CONSOLE snippets.
The pattern-replace-charfilter also had a doc who's test was
skipped because of funny interaction with the test framework. This
fixes the test.
Three more down, eighty-two to go.
Relates to #18160
CONSOLEifies the lang-analyzer docs and replaces the (invalid)
empty `keyword_marker` setups that were on the page with one
that contains the word "example" translated into the appropriate
language.
Relates to #18160
This change introduces a new API called `_field_caps` that allows to retrieve the capabilities of specific fields.
Example:
````
GET t,s,v,w/_field_caps?fields=field1,field2
````
... returns:
````
{
"fields": {
"field1": {
"string": {
"searchable": true,
"aggregatable": true
}
},
"field2": {
"keyword": {
"searchable": false,
"aggregatable": true,
"non_searchable_indices": ["t"]
"indices": ["t", "s"]
},
"long": {
"searchable": true,
"aggregatable": false,
"non_aggregatable_indices": ["v"]
"indices": ["v", "w"]
}
}
}
}
````
In this example `field1` have the same type `text` across the requested indices `t`, `s`, `v`, `w`.
Conversely `field2` is defined with two conflicting types `keyword` and `long`.
Note that `_field_caps` does not treat this case as an error but rather return the list of unique types seen for this field.