This change rewrites search requests on the coordinating node before
we send requests to the individual shards. This will reduce the rewrite load
and object creation for each rewrite on the executing nodes and will fetch
resources only once instead of N times once per shard for queries like `terms`
query with index lookups. (among percolator and geo-shape)
Relates to #25791
The `QueryRewriteContext` used to provide a client object that can
be used to fetch geo-shapes, terms or documents for percolation. Unfortunately
all client calls used to be blocking calls which can have significant impact on the
rewrite phase since it occupies an entire search thread until the resource is
received. In the case that the index the resource is fetched from isn't on the local
node this can have significant impact on query throughput.
Note: this doesn't fix MLT since it fetches stuff in doQuery which is a different beast. Yet, it is a huge step in the right direction
Today we have duplicated code that is quite complicated to iterate
over rewriteable (`QueryBuilders` mainly) This change introduces a
`Rewriteable` interface that allow to share code to do the rewriting as
well as encapsulation and composition of queries.
We currently use fielddata on the `_id` field which is trappy, especially as we
do it implicitly. This changes the `random_score` function to use doc ids when
no seed is provided and to suggest a field when a seed is provided.
For now the change only emits a deprecation warning when no field is supplied
but this should be replaced by a strict check on 7.0.
Closes#25240
This commit does two things:
- bumps the version from 6.0.0-alpha3 to 6.0.0-beta1
- renames the 6.0.0-alpha3 version constant to 6.0.0-beta1
Relates #25621
QueryParseContext is currently only used as a wrapper for an XContentParser, so
this change removes it entirely and changes the appropriate APIs that use it so
far to only accept a parser instead.
Currently QueryParseContext is only a thin wrapper around an XContentParser that
adds little functionality of its own. I provides helpers for long deprecated
field names which can be removed and two helper methods that can be made static
and moved to other classes. This is a first step in helping to remove
QueryParseContext entirely.
Most notable changes:
- better update concurrency: LUCENE-7868
- TopDocs.totalHits is now a long: LUCENE-7872
- QueryBuilder does not remove the boolean query around multi-term synonyms:
LUCENE-7878
- removal of Fields: LUCENE-7500
For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
The `document_type` parameter is no longer required to be specified,
because by default from 6.0 only a single type is allowed. (`index.mapping.single_type` defaults to `true`)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
This commit modifies query_string, simple_query_string and multi_match queries to always use a DisjunctionMaxQuery when a disjunction over multiple fields is built. The tiebreaker is set to 1 in order to behave like the boolean query in terms of scoring.
The removal of the coord factor in Lucene 7 made this change mandatory to correctly handle minimum_should_match.
Closes#23966
The `scorerSupplier` API allows to give a hint to queries in order to let them
know that they will be consumed in a random-access fashion. We should use this
for aggregations, function_score and matched queries.
This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on).
If single type is off it uses the legacy parent join field mapper and switch to the new one otherwise (default in 6).
Relates #20257
Approaching the release of 6.0 we need to sort out the usage of
`Version#minimumCompatibilityVersion` which was still set to 5.0.0.
Now this change moves it to the latest released version of 5.x (5.4 at this point)
to ensure we are compatible with the latest minor of the previous major. This change
also removes all the `_UNRELEASED` from the versions that where released and drops versions
that were never released and are not expected to be released (bugfixes in minors that are not
the latest in the previous major).
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
Range queries with now based date ranges were previously not allowed,
but since #23921 these queries were allowed. This change should really
fix range queries with now based date ranges.
* Add parent-join module
This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.
Relates #20257
This adds the `index.mapping.single_type` setting, which enforces that indices
have at most one type when it is true. The default value is true for 6.0+ indices
and false for old indices.
Relates #15613
The percolator doesn't close the IndexReader of the memory index any more.
Prior to 2.x the percolator had its own SearchContext (PercolatorContext) that did this,
but that was removed when the percolator was refactored as part of the 5.0 release.
I think an alternative way to fix this is to let percolator not use the bitset and fielddata caches,
that way we prevent the memory leak.
Closes#24108
This change simplifies how the rest test runner finds test files and
removes all leniency. Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.
closes#20240
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.
Some notes about the change:
- it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
- it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
- two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
Before now ranges where forbidden, because the percolator query itself could get cached and then the percolator queries with now ranges that should no longer match, incorrectly will continue to match.
By disabling caching when the `percolator` is being used, the percolator can now correctly support range queries with now based ranges.
I think this is the right tradeoff. The percolator query is likely to not be the same between search requests and disabling range queries with now ranges really disabled people using the percolator for their use cases.
Also fixed an issue that existed in the percolator fieldmapper, it was unable to find forbidden queries inside `dismax` queries.
Closes#23859
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].
Relates #23886
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.
`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code so it was removed. All of the test files that used it was then modified to use `parse(SourceToParse source)` method that existing in DocumentMapper.java
We have many version constants in master that have already been
released, but are still marked (by naming convention) as unreleased.
This commit renames those version constants.
We have a bunch of interfaces that have only a single implementation
for 6 years now. These interfaces are pretty useless from a SW development
perspective and only add unnecessary abstractions. They also require
lots of casting in many places where we expect that there is only one
concrete implementation. This change removes the interfaces, makes
all of the classes final and removes the duplicate `foo` `getFoo` accessors
in favor of `getFoo` from these classes.
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.
Relates #22960
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.
The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.
As part of this, many transport requests and builders were updated to provide methods that
accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated.
In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.
See #19388
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.
This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.
I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.