This removes pipeline aggregators from the aggregation result tree
except for a single field used for backwards compatibility with pre-7.8
versions of Elasticsearch. That field isn't populated unless we are
serializing to pre-7.8 Elasticsearch. So, good news! We no longer build
pipeline aggregators on the data node. Most of the time.
This allows subclasses of `InternalAggregationTestCase` to make a `List`
of values to reduce so that it can make values that are realistic
*together*. The first use of this is with `InternalTTest` which uses it
to make results that don't cause their `sum` field to wrap. It'd likely
be useful for a ton of other aggs but just one for now.
Adds a detailed example to the "Avoid scripts" section of the "Tune
for search speed" docs. The detail outlines how a script used to
transform indexed data can be moved to ingest.
The update also removes an outdated reference to supported script
languages.
The usage of blank lines as separator between tests can be tricky to
deal with in case of merges where such lines can be added by accident.
Further more counting non-consecutive lines is non-intuitive.
The tests have been aligned to use ; at the end of the query and
exceptions so that the presence or absence of empty lines is irrelevant.
The parsing of the spec has been changed to perform validation to not
allow invalid/incomplete specs to cause exceptions.
(cherry picked from commit 192ad88d3a51e1e1f1f82830526518720ec88217)
If you didn't explictly set `global_ordinals` execution mode we were
never collecting the information that we needed to select `depth_first`
based on the request so we were always defaulting to `breadth_first`.
This fixes it so we collect the information.
#54803 introduces more QA tests for Azure storage service, but
they fail the build is one of the key or token is missing. It should i
nstead work like repository-azure:qa tests.
* Remove Unused Snapshot Status Values
This is a left-over from before #41940 when we used the same status enum for the shards
and the snapshots overall. The two removed values were never used on the shard level
so we can simply remove them here.
`testReduceRandom` was bumping up against the serialization that I added
in #54776. This makes it use random values that reduce in ways that
don't cause the randomized serialization to fail.
When a data frame analytics job is stopped, if the reindexing
task was still in progress we cancel it. Cancelling it should
be done from the same context as when we executed the reindexing
task. That means from a thread context with ML origin.
Backport of #54874
Asserting on the failed_step field from the explainAPI can produce flakiness
because the ILM state is moved back and forth between the (failing) step and
the ERROR step (as the workflow is retry, fail then move to ERROR step,
move back to the (failing) step, retry, fail, etc) and the failed_step
information is only available whilst in the ERROR state.
Unmute other tests as they were collateral failures
A read-only index could not be deleted in the wipeCluster phase and caused
these failures
(cherry picked from commit 99a6d57aeb3cf11abc38b514f38a96bb1612e357)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
* Only allow retrieving a single index or component template
This changes the Index Template v2 APIs to only allow retrieving a single "named" entity, where the
named entity can be nothing (return everything), a wildcard (return the ones that match), or the
name of a template.
Relates to #53101
* Throw exception when resource is not found
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
`scripted_metric` did not work with cross cluster search because it
assumed that you'd never perform a partial reduction, serialize the
results, and then perform a final reduction. That
serialized-after-partial-reduction step was broken.
This is also required to support #54758.
This replaces the last bit of validation that pipeline aggregations
performed on the data nodes with explicit checks in a few
`PipelineAggregationBuilders`. We were *already* catching these
validation errors for pipeline aggregations that require that their
parent be squentially ordered. This just adds validation for pipelines
that require *any* parent like `bucket_selector` and `bucket_sort`.
This commit adds a new point field that is able to index arbitrary pair of values (x/y)
in the cartesian space. It only supports filtering using shape queries at the moment.
This is a backport of #54803 for 7.x.
This pull request cherry picks the squashed commit from #54803 with the additional commits:
6f50c92 which adjusts master code to 7.x
a114549 to mute a failing ILM test (#54818)
48cbca1 and 50186b2 that cleans up and fixes the previous test
aae12bb that adds a missing feature flag (#54861)
6f330e3 that adds missing serialization bits (#54864)
bf72c02 that adjust the version in YAML tests
a51955f that adds some plumbing for the transport client used in integration tests
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This commit updates the link to the JDK 14 compiler bug that we have
found. At the time that we committed the workaround, we had a submission
ID, but not yet the public bug URL. This commit adds the public bug URL.
This change adds the response headers of the original search request
in the stored response in order to be able to restore them when retrieving a result
from the async-search index. It also ensures that response headers are preserved for
users that retrieve a final response on a running search task.
Partial response can eventually return response headers too but this change only ensures
that they are present when the response if final.
Relates #33936
This change ensures that the AsyncSearchUser is correctly (de)serialized when
an action executed by this user is sent to a remote node internally (via transport client).
Now that JDK 14 is available, and we are bundling it, this commit
enables us to run with helpful null pointer exceptions, which will be a
great aid in debugging.
Some functions act as shortcuts for more verbose declarations (sometimes
with certain constraints). This PR removes the boilerplate around
declaring such functions as well as a dedicated rule for the optimizer
to perform the actual substitution.
Fix#54334
(cherry picked from commit 3231d01b0c583deb89252fafe84db48878da3246)
There were some failures on 7.x of field collapse tests,
where total hits count was less then expected.
This adds an additional test to check total hits count
before field collapse queries to understand if the problem
is with field collapsing or with simply that writes have
not been finished yet
Relates to #52416
First step towards async search execution. At the moment we don't try to cancel
the underlying search requests, and just check if the task is canceled before
performing network operation (such as field caps and search)
Relates to #49638
Adds t_test metric aggregation that can perform paired and unpaired two-sample
t-tests. In this PR support for filters in unpaired is still missing. It will
be added in a follow-up PR.
Relates to #53692
Previously, the id of the `GetDataFrameAnalyticsStatsAction.Request`
could be `null` which caused NPE on serialization as `writeString`
is used (it doesn't accept null values).
This commit ensures the id is never null.
Closes#54807
Backport of #54808
Today when canceling a task we broadcast ban/unban requests to all nodes
in the cluster. This strategy does not scale well for hierarchical
cancellation. With this change, we will track outstanding child requests
and broadcast the cancellation to only nodes that have outstanding child
tasks. This change also prevents a parent task from sending child
requests once it got canceled.
Relates #50990
Supersedes #51157
Co-authored-by: Igor Motov <igor@motovs.org>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Looking into #50237 I realized that two of the examples given in the
documentation around date math rounding for range queries on date fields using
`gt` and `lt` is slightly off by a nanosecond. This PR changes this to the
bounds that are currently parsed using these parameters.
Added decompression of gzip when gzip value is return as an header from Elasticsearch
(cherry picked from commit 4a195b573ab85d4e756669c953419ebdb3003442)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Co-authored-by: Hakky54 <hakangoudberg@hotmail.com>
This commit addresses a long-standing `// TODO` in the coordinator tests to
ensure that the correct no-master block is applied when a node restarts while
disconnected from the cluster.
It also strengthens this test to check that the no-master block is applied
correctly on all nodes, not just the previous master.
It seems the 20 seconds timeout is occasionally not enough.
We still get sporadic failures where the logs reveal the job
wasn't opened within 20 seconds. I'm increasing the wait time
to 30 seconds.
Closes#54448
Backport of #54792
Test `testAckListenerReceivesNacksFromFollowerInHigherTerm` was suppressed as
when it was written it didn't work due the lack of proper term bumping. We
added term bumping but never got around to implementing this test. This commit
addresses this.