This PR removes the vInt that precedes every value to record its length.
Instead, the query takes an enum that tells it how to compute the length
of values: for fixed-length data (ip addresses, double, float) the length is a
constant while longs and integers use a variable-length representation that
allows the length to be computed from the encoded values.
Also the encoding of ints/longs was made a bit more efficient in order not to
waste 3 bits in the header. As a consequence, values between -8 and 7 can now
be encoded on 1 byte and values between -2048 and 2047 can now be encoded on 2
bytes or less.
Closes #26443
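To illustrate how a length can be recovered from the encoded value itself, here is a minimal sketch of one header layout that is consistent with the ranges quoted above: 1 sign bit, 4 bits for the number of extra bytes, and 3 bits of payload in the first byte. This layout and the function names are assumptions made for illustration, not necessarily the exact format used by this PR, and the sketch does not try to preserve sort order.

```python
def encode(value: int) -> bytes:
    """Encode a signed 64-bit integer; the first byte carries the total length."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit long")
    sign = 1 if value >= 0 else 0
    magnitude = value if value >= 0 else -1 - value  # maps -1 -> 0, -8 -> 7
    # Three payload bits live in the header; the rest go into extra bytes.
    extra = (magnitude.bit_length() + 4) // 8
    header = (sign << 7) | (extra << 3) | (magnitude >> (8 * extra))
    low = magnitude & ((1 << (8 * extra)) - 1)
    return bytes([header]) + low.to_bytes(extra, "big")


def encoded_length(first_byte: int) -> int:
    """Total encoded length, read from the header alone (no vInt prefix)."""
    return 1 + ((first_byte >> 3) & 0x0F)


def decode(data: bytes, offset: int = 0):
    """Decode one value starting at offset; return (value, next_offset)."""
    header = data[offset]
    length = encoded_length(header)
    extra = length - 1
    low = int.from_bytes(data[offset + 1:offset + length], "big")
    magnitude = ((header & 0x07) << (8 * extra)) | low
    value = magnitude if header & 0x80 else -1 - magnitude
    return value, offset + length


def decode_all(data: bytes) -> list:
    """Split a concatenation of encoded values without any length prefixes."""
    values, offset = [], 0
    while offset < len(data):
        value, offset = decode(data, offset)
        values.append(value)
    return values
```

With this layout, `encode(7)` and `encode(-8)` take one byte each, while `encode(2047)` and `encode(-2048)` take two.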
This commit adds a dependency to the install module task on the task
that builds the module. This is needed for standalone integration
tests that require other modules to be installed. Without this, we do
not have a guarantee that the module is bundled.
If the query coordinating node is also a data node that holds all the
shards for a search request, we can end up recursing through the can
match phase (because we send a local request and on response in the
listener move to the next shard and do this again, without ever having
returned from previous shards). This recursion can lead to stack
overflow for even a reasonable number of indices (daily indices over
sixty days with five shards per day is enough to trigger the stack
overflow). Moreover, all this execution would be happening on a network
thread (the thread that initially received the query). With this commit,
we allow search phases to override max concurrent requests. This allows
the can match phase to avoid recursing through the shards towards a
stack overflow.
Relates #26484
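The recursion described above can be modelled with a small sketch. It assumes (this is not stated in the commit message) that the phase reserves an initial batch of `max_concurrent` shards up front and that each response listener starts the next unreserved shard; the function name `run_phase` is hypothetical. Local requests are modelled as completing synchronously on the caller's stack.

```python
def run_phase(num_shards: int, max_concurrent: int) -> int:
    """Simulate a phase over local shards; return the deepest call nesting."""
    initial_batch = min(max_concurrent, num_shards)
    next_index = initial_batch  # shards after the initial batch are unreserved
    depth = 0
    max_depth = 0

    def perform(shard: int) -> None:
        nonlocal next_index, depth, max_depth
        depth += 1
        max_depth = max(max_depth, depth)
        # The local request "responds" immediately, so the listener runs here,
        # on the same stack, and moves on to the next unreserved shard.
        if next_index < num_shards:
            following = next_index
            next_index += 1
            perform(following)
        depth -= 1

    for shard in range(initial_batch):
        perform(shard)
    return max_depth
```

Under these assumptions, 300 shards with a cap of 5 nest 296 calls deep, whereas a phase that overrides the cap to the number of shards starts every shard from the flat loop and never nests beyond depth 1.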
Requesting too many docvalue_fields in a search request can potentially be costly
because it might incur a per-field per-document seek. This change introduces a
soft limit on the number of fields that can be retrieved. The setting can be
changed per index using the `index.max_docvalue_fields_search` setting.
Relates to #26390
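As a rough sketch of what such a soft limit amounts to, the check below rejects a request whose number of docvalue fields exceeds the configured value. The function name and error wording are hypothetical, and the limit is passed in explicitly because the default is not stated here.

```python
def check_docvalue_fields(requested_fields, max_docvalue_fields_search):
    """Reject a request that asks for more docvalue fields than the index allows."""
    count = len(requested_fields)
    if count > max_docvalue_fields_search:
        raise ValueError(
            f"Trying to retrieve {count} docvalue_fields, but the limit is "
            f"{max_docvalue_fields_search}; it can be changed with the "
            "index.max_docvalue_fields_search index setting"
        )
    return list(requested_fields)
```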
You can define a proxy using the following settings:
```yml
azure.client.default.proxy.host: proxy.host
azure.client.default.proxy.port: 8888
azure.client.default.proxy.type: http
```
Supported values for `proxy.type` are `direct`, `http` or `socks`. Defaults to `direct` (no proxy).
Closes #23506
BTW I changed a test `testGetSelectedClientBackoffPolicyNbRetries` as it was using an old setting name `cloud.azure.storage.azure.max_retries` instead of `azure.client.azure1.max_retries`.
Follow up for #23405.
We remove the deprecated azure settings in 7.0:
* The legacy azure settings starting with the `cloud.azure.storage.` prefix have been removed.
This includes `account`, `key`, `default` and `timeout`.
You need to use settings starting with the `azure.client.` prefix instead.
* The global timeout setting `cloud.azure.storage.timeout` has been removed.
You must set it per azure client instead, for example `azure.client.default.timeout: 10s`.
Today we don't have a pluggable way to validate whether the cluster state
is compatible with the node that joins. We already apply some checks for index
compatibility that prevent a node from joining a cluster with indices it doesn't
support, but for plugins this isn't possible. This change adds a cluster state
validator that allows plugins to prevent a join if the cluster state is incompatible.
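Conceptually, a pluggable join validator is a callback that inspects the joining node and the cluster state and raises to veto the join. The sketch below only illustrates that idea; every name in it (`validate_join`, `reject_unknown_customs`, the dictionary keys) is hypothetical and does not mirror the actual plugin API.

```python
from typing import Callable, Iterable

# A validator receives the joining node and the cluster state, and raises to veto.
JoinValidator = Callable[[dict, dict], None]


def validate_join(node: dict, cluster_state: dict,
                  validators: Iterable[JoinValidator]) -> None:
    """Run every registered validator; the first exception aborts the join."""
    for validator in validators:
        validator(node, cluster_state)


def reject_unknown_customs(node: dict, cluster_state: dict) -> None:
    """Example plugin check: the node must understand all custom metadata."""
    unknown = set(cluster_state["customs"]) - set(node["supported_customs"])
    if unknown:
        raise ValueError(f"node cannot read cluster state customs: {sorted(unknown)}")
```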
Remove the "index.mapper.dynamic" setting for 6.0 (and later) indices, but
keep it working for 5.x (and earlier) indices. Remove two test cases that
disable index.mapper.dynamic, as the ability to disable it has already been
removed for the current version. Add a new test class for the
version-dependent behavior.
This commit contains:
* update AWS SDK for ECS Task IAM support
* ignore dependencies not essential to `discovery-ec2`:
  * jmespath seems to be used for `waiters`
  * amazon ion is a protocol not used by EC2 or IAM
Javadoc linking between projects currently relies on
projectSubstitutions. However, that is an extension variable that is not
part of BuildPlugin. This commit moves the javadoc linking into the root
build.gradle, alongside where projectSubstitutions are defined.
This test case was leftover from the static bwc tests. There was still
one use for checking we do not load old indices, but this PR moves the
legacy code needed for that directly into the test. I also opened a
follow up issue to completely remove the unsupported test: #26583.
When determining if a build is a snapshot build, we look for a field in
the JAR manifest. However, when running tests, we are not running with a
compiled core Elasticsearch JAR, we are running with the compiled core
classes on the classpath. We have a fallback for this: we always assume
such a situation is a snapshot build. However, when running builds with
-Dbuild.snapshot=false, this is not the case. As such, we need to
fall back to the value of build.snapshot. However, there are cases where
we are not running with a compiled core Elasticsearch JAR (e.g., when
the transport client is embedded in a web container) so we should only
do this fallback if we are in tests. To verify we are in tests, we check
if randomized runner is on the classpath.
Relates #26554
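The decision described above can be summarised as a small function. This is a sketch of the logic only: the parameter names are hypothetical, and treating an unset `build.snapshot` property as a snapshot build is an assumption.

```python
from typing import Optional


def is_snapshot(manifest_value: Optional[str],
                in_tests: bool,
                build_snapshot_property: Optional[str]) -> bool:
    """Decide whether the running build should be treated as a snapshot."""
    if manifest_value is not None:
        # Running from a compiled JAR: the manifest field is authoritative.
        return manifest_value == "true"
    if in_tests:
        # No JAR, but the randomized runner is on the classpath: honour
        # -Dbuild.snapshot (assumed to default to a snapshot when unset).
        return build_snapshot_property != "false"
    # No JAR and not in tests (e.g. embedded transport client): keep the old
    # fallback and assume a snapshot build.
    return True
```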
RangeQueryBuilder performs too many `instanceof` checks to detect `date` or
`range` fields and decide what to do with the shape relation, time zone and
date format.
This commit adds those 3 parameters to the `rangeQuery` factory method so that
those instanceof checks are not necessary anymore.
* Limit the number of expanded fields in query_string and simple_query_string
This limits the number of automatically expanded fields for the "all fields"
mode (`"default_field": "*"`) for the `query_string` and `simple_query_string`
queries to 1024 fields.
Resolves #25105
* Add blurb about limit to the docs
* Throw a better error message for empty field names
When a document is parsed with a `""` for a field name, we currently throw a
confusing error about `.` being present in the field. This changes the error
message to be clearer about what's causing the problem.
Resolves #23348
* Fix exception message in test
The percolator will add a `_percolator_document_slot` field to all percolator
hits to indicate with what document it has matched. This number matches with
the order in which the documents have been specified in the percolate query.
Also improved the support for multiple percolate queries in a search request.
To protect against poisonous situations, ES will only try to allocate a shard 5 times (by default). After 5 consecutive failures, ES will stop assigning the shard and wait for an operator to fix the problem. Once the problem is fixed, the operator is expected to call `_reroute` with a `retry_failed` flag to force retrying of those shards. Currently that retry flag is only used for a single allocation run. However, if not all shards can be allocated at once (due to throttling), the operator has to keep calling the API until all shards are assigned, which is cumbersome. This PR changes the behavior of the flag to reset the failed allocations counter, thus allowing shards to be assigned again.
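A minimal model of the new behavior, assuming a per-shard counter of consecutive failures (the class and method names are hypothetical): resetting the counter means the shard stays eligible across later allocation runs, not just the one triggered by the API call.

```python
class ShardAllocationState:
    """Tracks consecutive allocation failures for one shard."""

    def __init__(self, max_retries: int = 5):
        self.max_retries = max_retries
        self.failed_attempts = 0

    def record_failure(self) -> None:
        self.failed_attempts += 1

    def can_allocate(self) -> bool:
        # Once the limit is reached, the shard waits for operator action.
        return self.failed_attempts < self.max_retries

    def retry_failed(self) -> None:
        # New behavior: reset the counter instead of bypassing the check
        # for a single allocation run.
        self.failed_attempts = 0
```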
This test should not rely on strict ordering for same score suggestions.
The Lucene completion suggester uses the doc id in case of a tie and documents are indexed randomly.
This commit removes a norelease from the codebase now that there is a CI
job that fails on the norelease pattern being present. Instead, a new
issue has been opened to track this one.
Relates #26544
The completion suggester has a `shard_size` option that sets the size of the suggestions to retrieve per shard but it is ignored
by the builder. This commit restores the handling of this option and fixes a test that can randomly fail without it.
This change exposes the duplicate removal option added in Lucene for the completion suggester
with a new option called `skip_duplicates` (defaults to false).
This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input.
Closes #23364
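The effect of `skip_duplicates` can be sketched as follows. This is only an illustration of deduplicating by suggestion text while keeping the best-scoring occurrence; the function name and the `(text, score)` tuple shape are assumptions, not the collector's actual data model.

```python
def top_suggestions(candidates, size, skip_duplicates=False):
    """Return the top `size` (text, score) pairs, best score first.

    With skip_duplicates=True, only the highest-scoring occurrence of each
    text is kept, e.g. when several contexts match the same input.
    """
    # Sort by descending score, breaking ties by text for a stable result.
    ranked = sorted(candidates, key=lambda c: (-c[1], c[0]))
    if not skip_duplicates:
        return ranked[:size]
    seen = set()
    result = []
    for text, score in ranked:
        if text in seen:
            continue
        seen.add(text)
        result.append((text, score))
        if len(result) == size:
            break
    return result
```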
This change fixes a regression introduced in 6 that removes the skipping of the rescore phase
when a sort other than _score is used.
We now fail the request when a sort is provided in conjunction with rescore instead of just skipping the rescore phase.
This commit also adds an assert that checks if the topdocs are sorted by _score after the rescoring.
It is the responsibility of the rescorer to make sure that topdocs are sorted after rescore, so we
just check that this condition is met in the rescore phase.
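The two checks can be sketched as below. The function names are hypothetical, and treating an explicit sort on `_score` alone as compatible with rescore is an assumption drawn from the "sort other than _score" wording.

```python
def validate_rescore_request(sort_fields, has_rescore):
    """Fail fast instead of silently skipping the rescore phase."""
    sorts_by_score_only = not sort_fields or list(sort_fields) == ["_score"]
    if has_rescore and not sorts_by_score_only:
        raise ValueError("Cannot use [sort] option in conjunction with [rescore].")


def is_sorted_by_score(scores):
    """Condition asserted after rescoring: scores must be non-increasing."""
    return all(a >= b for a, b in zip(scores, scores[1:]))
```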
The three SortBuilders that can have inner NestedSortBuilders currently don't
rewrite any of the filters contained in them. This change adds a rewrite method
to NestedSortBuilder and changes rewriting in FieldSortBuilder,
ScriptSortBuilder and GeoDistanceSortBuilder to make sure inner nested sorts get
rewritten if they need to.
Improve testing around the ScriptSortBuilder#build method, adding checks for
correct transfers of the sort mode and nested sorts.
Also changing the behaviour around the nested_path, nested_filter vs. nested
parameter in a similar way as in #26490 and deprecating the setters/getters for
the old syntax.
Closes #17286
Security manager policy files contain grants for specific codebases,
where a codebase is a jar file. We use a system property containing the
name of the jar file to resolve the jar file location when parsing the
policy file. However, this means the version of the jars must be
modified when versions of dependencies change. This is particularly
messy for elasticsearch, where we now have a dependency on the rest
client, and need to support both a snapshot version for testing and non
snapshot for release.
This commit adds an alias for the elasticsearch rest client without a
version to be used in policy files. That allows the policy files to not care whether
the rest client is a snapshot or release.
Resolves #26332, where too many tasks occurred while adjustment was happening, the
measurements were reset to 0, and then an assert failed due to tasks executing
in 0 nanoseconds.
When a cache entry expires, it remains in the cache (both the segment
that it belongs to, and the LRU list) until an eviction occurs. The
problem here is that the compute if absent implementation relies on
there not being an association to a key that we are trying to put
because it internally uses put if absent on the underlying segment. If
we try to put an association for a key that has expired but not been
evicted, then compute if absent will return as if there is nothing in
the cache for the given key, yet no call to compute if absent will
succeed in putting a new association for the key. To remedy this, we
modify the internal get method for the cache to let the caller take
action if the entry they are retrieving is expired. This allows the
compute if absent method to take the action of evicting the entry from
the cache, thus allowing the put if absent method used by compute if
absent to succeed for one of the callers trying to compute if absent a
new association.
Relates #26516
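The fix can be illustrated with a single-threaded sketch: the internal get accepts a callback for expired entries, and compute-if-absent uses it to evict, so the subsequent put-if-absent succeeds. The class and method names are hypothetical, and the sketch ignores segments, the LRU list and concurrency.

```python
class ExpiringCache:
    """Toy cache where expired entries linger until something evicts them."""

    def __init__(self, ttl, clock):
        self._ttl = ttl
        self._clock = clock      # callable returning the current time
        self._entries = {}       # key -> (value, written_at)

    def _get(self, key, on_expired):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if self._clock() - written_at >= self._ttl:
            on_expired(key)      # let the caller decide what to do
            return None
        return value

    def _evict(self, key):
        self._entries.pop(key, None)

    def _put_if_absent(self, key, value):
        if key in self._entries:
            return False         # an expired-but-present entry would block here
        self._entries[key] = (value, self._clock())
        return True

    def compute_if_absent(self, key, loader):
        value = self._get(key, on_expired=self._evict)
        if value is not None:
            return value
        self._put_if_absent(key, loader(key))
        return self._entries[key][0]
```

Without the eviction callback, the expired entry would stay in `_entries`, `_put_if_absent` would return `False`, and no caller could ever install a fresh value for that key.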
The current "Building Queries" and "Building Aggregations" pages are
located under the "Supported Apis" section because they are linked to
the "Search API" page.
They should instead be in a dedicated section: this commit adds a new
"Using Java Builders" section and renames a few files to more
meaningful names.
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.
Relates #11511
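The two checks described above can be sketched as follows, with durations in seconds. The function names are hypothetical, and allowing the default to equal the maximum is an assumption, since the message only says "smaller than".

```python
ONE_HOUR_SECONDS = 60 * 60  # stated default of search.max_keep_alive


def validate_keep_alive_settings(default_keep_alive, max_keep_alive=ONE_HOUR_SECONDS):
    """The default keep-alive must not exceed the configured maximum."""
    if default_keep_alive > max_keep_alive:
        raise ValueError(
            "search.default_keep_alive must not be larger than search.max_keep_alive"
        )


def check_scroll_keep_alive(requested, max_keep_alive=ONE_HOUR_SECONDS):
    """Reject a scroll whose requested keep-alive exceeds the maximum."""
    if requested > max_keep_alive:
        raise ValueError(
            f"Keep alive for scroll ({requested}s) is too large; it must not "
            f"exceed search.max_keep_alive ({max_keep_alive}s)"
        )
    return requested
```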
* check style
* add skip for bwc
* iter
* Add a maximum throttle wait time of 1h for reindex
* review
* remove empty line