* Limit the number of expanded fields in query_string and simple_query_string
This limits the number of automatically expanded fields for the "all fields"
mode (`"default_field": "*"`) for the `query_string` and `simple_query_string`
queries to 1024 fields (see the sketch after this list).
Resolves #25105
* Add blurb about limit to the docs
* Throw a better error message for empty field names
When a document is parsed with a `""` for a field name, we currently throw a
confusing error about `.` being present in the field. This changes the error
message to be clearer about what's causing the problem.
Resolves #23348
* Fix exception message in test
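As a hedged illustration of the "all fields" mode that the new limit applies to (the query text is a placeholder, not part of the change), a minimal Java API sketch:

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryStringQueryBuilder;

public class AllFieldsQueryExample {
    // "All fields" mode: no explicit field list, so the query expands to every
    // mapped field; the change above caps that expansion at 1024 fields.
    static QueryStringQueryBuilder allFieldsQuery() {
        return QueryBuilders.queryStringQuery("error AND timeout")
                .defaultField("*");
    }
}
```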
The percolator will add a `_percolator_document_slot` field to all percolator
hits to indicate which document it matched. This number corresponds to the
order in which the documents were specified in the percolate query.
Also improved the support for multiple percolate queries in a search request.
To protect against poisonous situations, ES will only try to allocate a shard 5 times (by default). After 5 consecutive failures, ES will stop assigning the shard and wait for an operator to fix the problem. Once the problem is fixed, the operator is expected to call `_reroute` with a `retry_failed` flag to force retrying of those shards. Currently that retry flag is only used for a single allocation run. However, if not all shards can be allocated at once (due to throttling) the operator has to keep on calling the API until all shards are assigned, which is cumbersome. This PR changes the behavior of the flag to reset the failed allocations counter, thus allowing shards to be assigned again.
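A hedged sketch (not part of the original change) of triggering the retry from the low-level Java REST client; host and port are placeholders:

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RetryFailedAllocations {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // retry_failed now also resets the failed-allocation counter, so shards
            // throttled out of this run stay eligible for later allocation rounds.
            Response response = client.performRequest("POST", "/_cluster/reroute?retry_failed=true");
            System.out.println(response.getStatusLine());
        }
    }
}
```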
This test should not rely on strict ordering for same score suggestions.
The Lucene completion suggester uses the doc id in case of a tie and documents are indexed randomly.
This commit removes a norelease from the codebase now that there is a CI
job that fails on the norelease pattern being present. Instead, a new
issue has been opened to track this one.
Relates #26544
The completion suggester has a `shard_size` option that sets the size of the suggestions to retrieve per shard but it is ignored
by the builder. This commit restores the handling of this option and fixes a test that can randomly fail without it.
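A hedged example of setting the option through the Java API (suggester name, field, and sizes are illustrative only):

```java
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.SuggestBuilder;
import org.elasticsearch.search.suggest.SuggestBuilders;

public class ShardSizeSuggestExample {
    // shard_size controls how many suggestions each shard returns before the
    // coordinating node merges and truncates them to the requested size.
    static SearchSourceBuilder suggestSource() {
        return new SearchSourceBuilder().suggest(
                new SuggestBuilder().addSuggestion("song-suggest",
                        SuggestBuilders.completionSuggestion("suggest").size(5).shardSize(20)));
    }
}
```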
This change exposes the duplicate removal option added in Lucene for the completion suggester
with a new option called `skip_duplicates` (defaults to false).
This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input.
Closes #23364
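A hedged sketch of the new option from the Java API, assuming `skipDuplicates` is the corresponding builder method (field name and prefix are placeholders):

```java
import org.elasticsearch.search.suggest.SuggestBuilder;
import org.elasticsearch.search.suggest.SuggestBuilders;

public class SkipDuplicatesSuggestExample {
    // With skip_duplicates enabled, identical suggestion outputs are returned
    // once instead of once per matching document or context.
    static SuggestBuilder titleSuggest() {
        return new SuggestBuilder().addSuggestion("title-suggest",
                SuggestBuilders.completionSuggestion("title.completion")
                        .prefix("elast")
                        .skipDuplicates(true));
    }
}
```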
This change fixes a regression introduced in 6 that removed the skipping of the rescore phase
when a sort other than _score is used.
We now fail the request when a sort is provided in conjunction with rescore instead of just skipping the rescore phase.
This commit also adds an assert that checks that the top docs are sorted by _score after rescoring.
It is the responsibility of the rescorer to make sure that the top docs are sorted after rescoring, so we
just check that this condition is met in the rescore phase.
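A hedged sketch of a request shape that is now rejected (field names and queries are placeholders):

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.rescore.QueryRescorerBuilder;

public class SortPlusRescoreExample {
    // Sorting on anything other than _score together with a rescorer now fails
    // the request instead of silently skipping the rescore phase.
    static SearchSourceBuilder rejectedSource() {
        return new SearchSourceBuilder()
                .query(QueryBuilders.matchQuery("title", "elasticsearch"))
                .sort("publish_date")
                .addRescorer(new QueryRescorerBuilder(
                        QueryBuilders.matchPhraseQuery("title", "elasticsearch guide")));
    }
}
```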
The three SortBuilders that can have inner NestedSortBuilders currently don't
rewrite any of the filters contained in them. This change adds a rewrite method
to NestedSortBuilder and changes rewriting in FieldSortBuilder,
ScriptSortBuilder and GeoDistanceSortBuilder to make sure inner nested sorts get
rewritten if they need to.
Improve testing around the ScriptSortBuilder#build method, adding checks for
correct transfers of the sort mode and nested sorts.
Also changing the behaviour around the nested_path, nested_filter vs. nested
parameter in a similar way as in #26490 and deprecating the setters/getters for
the old syntax.
Closes #17286
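A hedged example of the newer nested sort syntax whose inner filter now gets rewritten, assuming `setNestedSort` is the non-deprecated setter (field names are placeholders):

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.sort.FieldSortBuilder;
import org.elasticsearch.search.sort.NestedSortBuilder;
import org.elasticsearch.search.sort.SortOrder;

public class NestedSortExample {
    // The filter on the NestedSortBuilder is rewritten like any other query
    // before the SortField is built.
    static FieldSortBuilder cheapestBlueOffer() {
        return new FieldSortBuilder("offers.price")
                .order(SortOrder.ASC)
                .setNestedSort(new NestedSortBuilder("offers")
                        .setFilter(QueryBuilders.termQuery("offers.color", "blue")));
    }
}
```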
Security manager policy files contain grants for specific codebases,
where a codebase is a jar file. We use a system property containing the
name of the jar file to resolve the jar file location when parsing the
policy file. However, this means the version of the jars must be
modified when versions of dependencies change. This is particularly
messy for elasticsearch, where we now have a dependency on the rest
client, and need to support both a snapshot version for testing and a
non-snapshot version for release.
This commit adds an alias for the elasticsearch rest client without a
version to be used in policy files. That allows the policy files to not care whether
the rest client is a snapshot or release.
Resolves #26332, where too many tasks occurred while adjustment was happening, the
measurements were reset to 0, and then an assert failed due to tasks executing
in 0 nanoseconds.
When a cache entry expires, it remains in the cache (both the segment
that it belongs to, and the LRU list) until an eviction occurs. The
problem here is that the compute if absent implementation relies on
there not being an association to a key that we are trying to put
because it internally uses put if absent on the underlying segment. If
we try to put an association for a key that has expired but not been
evicted, then compute if absent will return as if there is nothing in
the cache for the given key, yet no call to compute if absent will
succeed in putting a new association for the key. To remedy this, we
modify the internal get method for the cache to let the caller take
action if the entry they are retrieving is expired. This allows the
compute if absent method to take the action of evicting the entry from
the cache, thus allowing the put if absent method used by compute if
absent to succeed for one of the callers trying to compute if absent a
new association.
Relates #26516
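A minimal, self-contained sketch of the idea (not the actual Cache class): an entry that has expired but not yet been evicted must be removed before the internal put-if-absent can succeed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class ExpiringCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtNanos;
        Entry(V value, long expiresAtNanos) {
            this.value = value;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> segment = new ConcurrentHashMap<>();
    private final long ttlNanos;

    ExpiringCacheSketch(long ttlNanos) {
        this.ttlNanos = ttlNanos;
    }

    V computeIfAbsent(K key, Function<K, V> loader) {
        Entry<V> existing = segment.get(key);
        if (existing != null && existing.expiresAtNanos <= System.nanoTime()) {
            // The entry has expired but has not been evicted yet; remove it so the
            // putIfAbsent below is not blocked by a stale association.
            segment.remove(key, existing);
            existing = null;
        }
        if (existing != null) {
            return existing.value;
        }
        Entry<V> created = new Entry<>(loader.apply(key), System.nanoTime() + ttlNanos);
        Entry<V> race = segment.putIfAbsent(key, created);
        return race == null ? created.value : race.value;
    }
}
```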
The current "Building Queries" and "Building Aggregations" pages are
located under the "Supported Apis" section because they are linked to
the "Search API" page.
It should instead be in a dedicated section: this commit adds a new
"Using Java Builders" section and renames a few files in favor of
more meaningful names.
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.
Relates #11511
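A hedged example of a scroll request whose keep-alive must now stay at or below `search.max_keep_alive` (index name and interval are placeholders):

```java
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ScrollKeepAliveExample {
    // A keep-alive above search.max_keep_alive (default 1h) is rejected.
    static SearchRequest scrollingSearch() {
        return new SearchRequest("logs")
                .source(new SearchSourceBuilder().size(500))
                .scroll(TimeValue.timeValueMinutes(30));
    }
}
```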
* check style
* add skip for bwc
* iter
* Add a maximum throttle wait time of 1h for reindex
* review
* remove empty line
It's easy to create a wrong Content-Type header when converting an
XContentType to an Apache HTTP ContentType instance.
This commit removes the direct usages of ContentType.create() methods in favor of a Request.createContentType(XContentType) method that does the right thing.
Closes #26438
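A hedged sketch of the kind of helper meant here, assuming `mediaTypeWithoutParameters()` is available on XContentType; this is not necessarily the exact implementation:

```java
import java.nio.charset.Charset;

import org.apache.http.entity.ContentType;
import org.elasticsearch.common.xcontent.XContentType;

final class ContentTypeHelper {
    // Build the Apache ContentType from the media type alone, instead of calling
    // ContentType.create() ad hoc (and inconsistently) at every call site.
    static ContentType createContentType(XContentType xContentType) {
        return ContentType.create(xContentType.mediaTypeWithoutParameters(), (Charset) null);
    }
}
```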
* If, in a range query, the upper bound is smaller than the lower bound, ignore the range query
* If two empty range extractions are compared, don't fail with a NoSuchElementException
This commit adds the Log4j to SLF4J binding JAR to the repository-hdfs
plugin so that SLF4J can detect Log4j at runtime and therefore use the
server Log4j implementation for logging (and the usual Elasticsearch
APIs can be used for setting logging levels).
Relates #26514
The `_type` and `types` versions of the current `type` parameter have been
deprecated since 5.0. We can remove support for them in 7.0 and also in 6.x and
6.0.
We currently have a weird relationship between Transport,
TransportService, and TransportServiceAdaptor. At some point I think
that we would like to collapse these all into one concept as we only
support TCP transports.
This commit moves in that direction by eliminating the adaptor and just
passing the transport service to the transport.
Improve testing around the GeoDistanceSortBuilder#build method, adding checks for correct
transfers of the sort order, mode, nested sorts and points validation and coercion.
Also changing the behaviour around the nested_path, nested_filter vs. nested parameter in
a similar way as in #26490 and deprecating the setters/getters for the old syntax.
Relates to #17286
Currently we allow both "old" and "new" way of setting nested sorts on the
FieldSortBuilder at the same time. This should throw an error, instead the user
should choose one of the two possible options.
Also adding testing for the now deprecated nestedPath/nestedFilter parameters,
including checks that they emit warnings on parsing and that the new
NestedSortBuilder overwrites the deprecated parameters when building the
SortField.
Relates to #17286
The `index.percolator.map_unmapped_fields_as_text` setting is a better name, because unmapped fields are mapped to a text field with default settings
and string is no longer a field type (it is either keyword or text).
The definition of development vs. production mode has evolved slightly
over time (with the introduction of single-node discovery). This commit
clarifies the documentation to better account for this adjustment.
Relates #26460
Adding a check to QueryStringQueryBuilderTests that checks the override
behaviour of `quote_analyzer`, also adding documentation explaining the use of
this parameter in `query_string` query.
Closes #25417
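A hedged example of the override being tested (analyzer names and query text are placeholders):

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryStringQueryBuilder;

public class QuoteAnalyzerExample {
    // quote_analyzer overrides the regular analyzer, but only for quoted phrases.
    static QueryStringQueryBuilder mixedQuery() {
        return QueryBuilders.queryStringQuery("\"exact phrase\" OR loose terms")
                .analyzer("standard")
                .quoteAnalyzer("keyword");
    }
}
```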
We are getting the default Object#toString implementation here, we need
more than this. This commit instead formats the primary response to JSON
so we can see into its soul.
There is a bug in Log4j on JDK 9 for walking the stack to find where a
log line is coming from. This bug is impacting some of our testing, so
this commit marks these tests as skippable only on JDK 9 until the bug
is fixed upstream.
Relates #26467
The new NestedSortBuilder currently is only tested via its use in the other
SortBuilder implementations it can be used in. This adds its own simple unit
test class that starts with our usual fromXContent parsing, serialization
and hashCode/equals checks. It also adds tests for cases where NestedSortBuilder
is nested in itself and reuses the code for creating randomized instances in the
other SortBuilder tests.
In addition to the tests, this changes the `path` parameter in NestedSortBuilder
to be mandatory and removes the `read` method since it is not really needed.
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting
that allows configuring a rate per time unit, so that the script service can deal with bursts better.
The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.
The default is `75/5m`, which is equivalent to the existing 15 per minute.
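A hedged illustration of the setting's `<count>/<time>` format via the Settings builder (the rate value here is only an example):

```java
import org.elasticsearch.common.settings.Settings;

public class CompilationRateExample {
    // 150 compilations allowed per 5-minute window in this example.
    static Settings compilationRate() {
        return Settings.builder()
                .put("script.max_compilations_rate", "150/5m")
                .build();
    }
}
```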