* ILM use Priority.IMMEDIATE for stop ILM cluster update (#54909)
This changes the priority of the cluster state update that stops ILM
altogether to `IMMEDIATE`. We've chosen to change this as it can be useful to
temporarily stop ILM if a cluster is overwhelmed, but a `NORMAL`
priority can see the "stop ILM update" not make it up the tasks queue.
On the same note, we're keeping the `start ILM` cluster update priority
to `NORMAL` on purpose such that we only start `ILM` if the cluster can
handle it.
(cherry picked from commit d67df3a7cd2a8619c2c9efac4dde3ba83271f2fa)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
This change makes sure that all internal client requests spawned by the
data frame analytics persistent task executor and that use the end user
security credentials, have the parent task id assigned. The objective here
is to permit auditing (as well as tracking for debugging purposes) of all
the end-user requests executed on its behalf by persistent tasks.
Because data frame analytics taks already implements graceful shutdown
of child tasks, this change does not interfere with it by opting out of
the persistent task cancellation of child tasks.
Relates #54943#52314
We added a fancy method to provide random realistic test data to the
reduction tests in #54910. This uses that to remove some of the more
esoteric machinations in the agg tests. This will marginally increase
the coverage of the serialiation tests and, more importantly, remove
some mysterious value generation code that only really made sense for
random reduction tests but was used all over the place. It doesn't, on
the other hand, make the tests shorter. Just *hopefully* more clear.
I only cleaned up a few tests this way. If we like this it'd probably be
worth grabbing others.
We currently create the .async-search index if necessary before performing any action (index, update or delete). Truth is that this is needed only before storing the initial response. The other operations are either update or delete, which will anyways not find the document to update/delete even if the index gets created when missing. This also caused `testCancellation` failures as we were trying to delete the document twice from the .async-search index, once from `TransportDeleteAsyncSearchAction` and once as a consequence of the search task being completed. The latter may be called after the test is completed, but before the cluster is shut down and causing problems to the after test checks, for instance if it happens after all the indices have been cleaned up. It is totally fine to try to delete a response that is no longer found, but not quite so if such call will also trigger an index creation.
With this commit we remove all the calls to createIndexIfNecessary from the update/delete operation, and we leave one call only from storeInitialResponse which is where the index is expected to be created.
Closes#54180
We recently cleaned up the use of the word "metadata" across the
codebase. Even more additional uses have trickled in, likely from
in-progress work. This commit cleans up these last few additional
instances.
Relates #54519
The use of available processors, the terminology, and the settings
around it have evolved over time. This commit cleans up some places in
the codes and in the docs to adjust to the current terminology.
Deprecate alternative sequence parameter declaration (with then by)
Disallow lack of time units inside maxspan
Fix#55023
Relate #54680
(cherry picked from commit 201adafba9def1de4bf843760defb9def3394f63)
Implement DATETIME_PARSE(<datetime_str>, <pattern_str>) function
which allows to parse a datetime string according to the specified
pattern into a datetime object. The patterns allowed are those of
java.time.format.DateTimeFormatter.
Relates to #53714
(cherry picked from commit 3febcd8f3cdf9fdda4faf01f23a5f139f38b57e0)
Today, we do not clear the recent errors in AutoFollowCoordinator when
we successfully auto-follow indices. This can lead to confusion for the
operators.
This change preserves the task id for internal requests for the `StartDatafeedPersistentTask`.
Task ids are a way to express a relationship between related internal requests.
In this particular case, the task ids are used for debugging and (soon) security auditing,
but not for task cancellation, because there is already a graceful-shutdown of child
internal requests (given a task id) in place.
This commit includes a number of changes to reduce overall build
configuration time. These optimizations include:
- Removing the usage of the 'nebula.info-scm' plugin. This plugin
leverages jgit to load read various pieces of VCS information. This
is mostly overkill and we have our own minimal implementation for
determining the current commit id.
- Removing unnecessary build dependencies such as perforce and jgit
now that we don't need them. This reduces our classpath considerably.
- Expanding the usage lazy task creation, particularly in our
distribution projects. The archives and packages projects create
lots of tasks with very complex configuration. Avoiding the creation
of these tasks at configuration time gives us a nice boost.
This change reintroduces the system index APIs for Kibana without the
changes made for marking what system indices could be accessed using
these APIs. In essence, this is a partial revert of #53912. The changes
for marking what system indices should be allowed access will be
handled in a separate change.
The APIs introduced here are wrapped versions of the existing REST
endpoints. A new setting is also introduced since the Kibana system
indices' names are allowed to be changed by a user in case multiple
instances of Kibana use the same instance of Elasticsearch.
Relates #52385
Backport of #54858
* Drop BASE TABLE type in favour for just TABLE
This commit drops the table type 'BASE TABLE' and replaces all
occurences with just 'TABLE', since his type is wider-used and
friendlier to the client applications that query for certain table types
in their discovery mode.
The 'TABLE' type is also explicitely mentioned by the JDBC and ODBC
standards and although other data source-specific types are permitted,
older apps will not work well with them.
* Refactor table type constants out of IndexType
Move SQL_TABLE/_ALIAS out of IndexType, so that they can also be used in
that Enum definition.
(cherry picked from commit 70241b52697ac2cf71004040042123c1ec050299)
Implement DATETIME_FORMAT(<date/datetime/time>, ) function
which allows for formatting a timestamp to the specified format. The
patterns allowed as those of java.time.format.DateTimeFormatter.
Related to #53714
(cherry picked from commit 72be0b54a9299e87e785469cdc9aafac2a48c046)
Today the shards of searchable snapshots are allocated with a naive
`ExistingShardsAllocator` which selects the first valid node for each shard.
Thanks to #54729 we can now allow these shards to fall through to the balanced
shards allocator so that they are allocated in a more balanced fashion.
Relates #50999
Guava was removed from Elasticsearch many years ago, but remnants of it
remain due to transitive dependencies. When a dependency pulls guava
into the compile classpath, devs can inadvertently begin using methods
from guava without realizing it. This commit moves guava to a runtime
dependency in the modules that it is needed.
Note that one special case is the html sanitizer in watcher. The third
party dep uses guava in the PolicyFactory class signature. However, only
calling a method on the PolicyFactory actually causes the class to be
loaded, a reference alone does not trigger compilation to look at the
class implementation. There we utilize a MethodHandle for invoking the
relevant method at runtime, where guava will continue to exist.
This commit introduces a new `geo` module that is intended
to be contain all the geo-spatial-specific features in server.
As a first step, the responsibility of registering the geo_shape
field mapper is moved to this module.
Co-authored-by: Nicholas Knize <nknize@gmail.com>
This removes pipeline aggregators from the aggregation result tree
except for a single field used for backwards compatibility with pre-7.8
versions of Elasticsearch. That field isn't populated unless we are
serializing to pre-7.8 Elasticsearch. So, good news! We no longer build
pipeline aggregators on the data node. Most of the time.
This allows subclasses of `InternalAggregationTestCase` to make a `List`
of values to reduce so that it can make values that are realistic
*together*. The first use of this is with `InternalTTest` which uses it
to make results that don't cause their `sum` field to wrap. It'd likely
be useful for a ton of other aggs but just one for now.
The usage of blank lines as separator between tests can be tricky to
deal with in case of merges where such lines can be added by accident.
Further more counting non-consecutive lines is non-intuitive.
The tests have been aligned to use ; at the end of the query and
exceptions so that the presence or absence of empty lines is irrelevant.
The parsing of the spec has been changed to perform validation to not
allow invalid/incomplete specs to cause exceptions.
(cherry picked from commit 192ad88d3a51e1e1f1f82830526518720ec88217)
#54803 introduces more QA tests for Azure storage service, but
they fail the build is one of the key or token is missing. It should i
nstead work like repository-azure:qa tests.
`testReduceRandom` was bumping up against the serialization that I added
in #54776. This makes it use random values that reduce in ways that
don't cause the randomized serialization to fail.
When a data frame analytics job is stopped, if the reindexing
task was still in progress we cancel it. Cancelling it should
be done from the same context as when we executed the reindexing
task. That means from a thread context with ML origin.
Backport of #54874
Asserting on the failed_step field from the explainAPI can produce flakiness
because the ILM state is moved back and forth between the (failing) step and
the ERROR step (as the workflow is retry, fail then move to ERROR step,
move back to the (failing) step, retry, fail, etc) and the failed_step
information is only available whilst in the ERROR state.
Unmute other tests as they were collateral failures
A read-only index could not be deleted in the wipeCluster phase and caused
these failures
(cherry picked from commit 99a6d57aeb3cf11abc38b514f38a96bb1612e357)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
`scripted_metric` did not work with cross cluster search because it
assumed that you'd never perform a partial reduction, serialize the
results, and then perform a final reduction. That
serialized-after-partial-reduction step was broken.
This is also required to support #54758.
This commit adds a new point field that is able to index arbitrary pair of values (x/y)
in the cartesian space. It only supports filtering using shape queries at the moment.
This is a backport of #54803 for 7.x.
This pull request cherry picks the squashed commit from #54803 with the additional commits:
6f50c92 which adjusts master code to 7.x
a114549 to mute a failing ILM test (#54818)
48cbca1 and 50186b2 that cleans up and fixes the previous test
aae12bb that adds a missing feature flag (#54861)
6f330e3 that adds missing serialization bits (#54864)
bf72c02 that adjust the version in YAML tests
a51955f that adds some plumbing for the transport client used in integration tests
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This change adds the response headers of the original search request
in the stored response in order to be able to restore them when retrieving a result
from the async-search index. It also ensures that response headers are preserved for
users that retrieve a final response on a running search task.
Partial response can eventually return response headers too but this change only ensures
that they are present when the response if final.
Relates #33936
This change ensures that the AsyncSearchUser is correctly (de)serialized when
an action executed by this user is sent to a remote node internally (via transport client).
Some functions act as shortcuts for more verbose declarations (sometimes
with certain constraints). This PR removes the boilerplate around
declaring such functions as well as a dedicated rule for the optimizer
to perform the actual substitution.
Fix#54334
(cherry picked from commit 3231d01b0c583deb89252fafe84db48878da3246)
First step towards async search execution. At the moment we don't try to cancel
the underlying search requests, and just check if the task is canceled before
performing network operation (such as field caps and search)
Relates to #49638
Adds t_test metric aggregation that can perform paired and unpaired two-sample
t-tests. In this PR support for filters in unpaired is still missing. It will
be added in a follow-up PR.
Relates to #53692
Previously, the id of the `GetDataFrameAnalyticsStatsAction.Request`
could be `null` which caused NPE on serialization as `writeString`
is used (it doesn't accept null values).
This commit ensures the id is never null.
Closes#54807
Backport of #54808
It seems the 20 seconds timeout is occasionally not enough.
We still get sporadic failures where the logs reveal the job
wasn't opened within 20 seconds. I'm increasing the wait time
to 30 seconds.
Closes#54448
Backport of #54792
This changes a SamlServiceProvider to have a function that maps
from an "action-name" to set of role-names instead of a Map that does
so.
The on-disk representation of this mapping is a set of Java Regexp
Patterns, for which the first matching group is the role name.
For example "sso:(\w+)" would map any action that started with "sso:"
to the corresponding role name (e.g. "sso:superuser" -> "superuser").
Backport of: #54440
The SamlIdentityProviderTests IntegTests would sometimes encounter a
service unavailable exception when registering a new service provider.
This change ensure that there is a data node, and that the cluster
state is recovered before registering providers
Backport of: #54622
A remote client can throw a NoSuchRemoteClusterException while fetching
the cluster state from the leader cluster. We also need to handle that
exception when retrying to add a retention lease to the leader shard.
Closes#53225
The autoscaling REST tests use policies named "hot" in their test
cases. Instead, this commit changes the name of these policies to
"my_autoscaling_policy".
Some field name constants were not updaten when we moved from "string" to "text"
and "keyword" fields. Renaming them makes it easier and faster to know which
field type is used in test subclassing this base test case.
The test results are affected by the off-by-one error that is
fixed by https://github.com/elastic/ml-cpp/pull/1122
This test can be unmuted once that fix is merged and has been
built into ml-cpp snapshots.
Force stopping a failed job used to work but it
now puts the job in `stopping` state and hangs.
In addition, force stopping a `stopping` job is
not handled.
This commit addresses those issues with force
stopping data frame analytics. It inlines the
approach with that followed for anomaly detection
jobs.
Backport of #54650
Backport of #54576.
This commit is part of issue #40366 to remove disabled Xlint warnings
from gradle files. Remove the Xlint exclusions from the following files:
- x-pack/plugin/rollup/build.gradle
- x-pack/plugin/monitoring/build.gradle
- x-pack/qa/rolling-upgrade-basic/build.gradle
Add type parameters to parameterized types. Add wildcard-type parameters
or bounded wildcard-type parameters. Suppress `unchecked` and `rawtypes`
warnings at method level.
In #33933 we disallowed changing the `enabled` parameter in object mappings.
However, the fix didn't cover the root object mapper. This PR adjusts the change
to also include the root mapper and clarifies the error message.
This adds training_percent parameter to the analytics process for Classification and Regression. This parameter is then used to give more accurate memory estimations.
See native side pr: elastic/ml-cpp#1111
Removes pipeline aggregations from the aggregation result tree as they
are no longer used. This stops us from building the pipeline aggregators
at all on data nodes except for backwards compatibility serialization.
This will save a tiny bit of space in the aggregation tree which is
lovely, but the biggest benefit is that it is a step towards simplifying
pipeline aggregators.
This only does about half of the work to remove the pipeline aggs from
the tree. Removing all of it would, well, double the size of the change
and make it harder to review.
* [ML] add new inference_config field to trained model config (#54421)
A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder.
The inference processor can still override whatever is set as the default in the trained model config.
* fixing for backport
This commit addresses an issue with the autoscaling feature flag not
being registered in release builds of the internal cluster tests. This
commit addresses this by enabling the system property that is needed,
but only in release builds.
* [ML] prefer secondary authorization header for data[feed|frame] authz (#54121)
Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds.
Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided).
closes https://github.com/elastic/elasticsearch/issues/53801
* fixing for backport
- Consolidates HDR/TDigest factories into a single factory
- Consolidates most HDR/TDigest builder into an abstract builder
- Deprecates method(), compression(), numSigFig() in favor of a new
unified PercentileConfig object
- Disallows setting algo options that don't apply to current algo
The unified config method carries both the method and algo-specific
setting. This provides a mechanism to reject settings that apply
to the wrong algorithm. For BWC the old methods are retained
but marked as deprecated, and can be removed in future versions.
Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
These methods are not needed, we were only following a pattern in the
rest of the codebase, but it's legacy from the HLRC sharing
request/response objects with the server.
When one of ML's normalize processes fails to connect to the JVM
quickly enough and another normalize process for the same job
starts shortly afterwards it is possible that their named pipes
can get mixed up.
This change avoids the risk of that by adding an incrementing
counter value into the named pipe names used for normalize
processes.
Backport of #54636
* [ML] add num_matches and preferred_to_categories to category defintion objects (#54214)
This adds two new fields to category definitions.
- `num_matches` indicating how many documents have been seen by this category
- `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized.
These fields are only guaranteed to be up to date after a `_flush` or `_close`
native change: https://github.com/elastic/ml-cpp/pull/1062
* adjusting for backport
* Rename cat.transform => cat.transforms
To match the url.
We typically prefer singular url nouns but _cat tends to use plural and
this API does in fact uses `/_cat/transforms`
* also rename the api in the spec and tests
(cherry picked from commit c495d220ac8fedba7f70f82387cd6d6a672b8b14)
SamlIdentityProviderTests is failing with 409 conflicts that have not
been reproducible outside of CI.
This change turn on additional logging in this test to determine why
these conflict occur.
Relates: #54423
Backport of: #54475
This commit updates the rest API specs to validate against a
JSON schema for the specifications. Most updates are to add
a description, whilst others fix typos and unify conventions
e.g. deprecations, descriptions, urls starting with /. The schema
conforms to draft-07 JSON schema.
(cherry picked from commit da37e01d32f9764c3937736ef0c7d3ab40af9a77)
* Refactor nodes stats request builders to match requests (#54363)
* Remove hard-coded setters from NodesInfoRequestBuilder
* Remove hard-coded setters from NodesStatsRequest
* Use static imports to reduce clutter
* Remove uses of old info APIs
The test name says it is testing the put autoscaling decision API, but
that is not right, since no such API exists (nor will exist). This
commit corrects the name of this test to reflect the fact that the test
is about the put autoscaling policy API.
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue #24422 to remove
dependency on MapperService to check if a field is metafield.
Relates to PR: #38373
Relates to issue #24422
Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>
We do not have access to JDK 9 collection convenience methods in 7.x
because we are compatible with JDK 8 there. Yet, we have recently added
a substitute for these convenience methods that even delegate to the
right places when running on JDK 9, to make backporting easier. This
commit utilizes these new methods in the autoscaling codebase.
Use the same ES cluster as both an SP and an IDP and perform
IDP initiated and SP initiated SSO. The REST client plays the role
of both the Cloud UI and Kibana in these flows
Backport of #54215
* fix compilation issues
This commit is the first in a series of commits that introduces
autoscaling policies, and APIs for working with them. For now, we
introduce the basic infrastructure, and a single API for putting an
autoscaling policy. We will follow in rapid succession with APIs for
getting, and deleting autoscaling policies.
When the SAML authentication is not successful, we return a SAML
Response with a status that indicates a failure. This commit adds
an error message in the REST API response along with the SAML
Response XML string so that the caller of the API can identify
that this is an unsuccessful response without needing to parse the
XML.
`AsyncSearchActionTests` currently fails quite often. That is since the introduction of `RestSubmitAsyncSearchActionTests` which indirectly manipulates the channels being tracked in `RestCancellableNodeClient`. There are channels left in the map after `RestSubmitAsyncSearchActionTests` is run, and later `AsyncSearchActionTests` checks that there are no channels in the map which makes each test method fail. This is particularily hard to reproduce as the order in which tests are run appears to be platform dependent.
The test cluster assertion that there are no channels in the map only makes sense in the context of internal cluster tests, while there may be collisions with unit tests that register http channels as part of their testing.
This can be solved by renaming `AsyncSearchActionTests` to `AsyncSearchActionIT`. This way it won't be run as part of unit tests but rather within another JVM where the number of channels is `0` and such assumption holds, because there are no expected manual manipulation of the channels.
Relates to #54180
This is a follow up to a previous commit that renamed MetaData to
Metadata in all of the places. In that commit in master, we renamed
META_DATA to METADATA, but lost this on the backport. This commit
addresses that.
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
* Comprehensively test supported/unsupported field type:agg combinations (#52493)
This adds a test to AggregatorTestCase that allows us to programmatically
verify that an aggregator supports or does not support a particular
field type. It fetches the list of registered field type parsers,
creates a MappedFieldType from the parser and then attempts to run
a basic agg against the field.
A supplied list of supported VSTypes are then compared against the
output (success or exception) and suceeds or fails the test accordingly.
Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com>
* Skip fields that are not aggregatable
* Use newIndexSearcher() to avoid incompatible readers (#52723)
Lucene's `newSearcher()` can generate readers like ParallelCompositeReader
which we can't use. We need to instead use our helper `newIndexSearcher`
This PR:
1. Fixes the bug where a cardinality estimate of zero could cause
a 500 status
2. Adds tests for that scenario and a few others
3. Adds sensible estimates for the cases that were previously TODO
Backport of #54462
This adds the ability for the IdP to define wildcard service
providers in a JSON file within the ES node's config directory.
If a request is made for a service provider that has not been
registered, then the set of wildcard services is consulted. If the
SP entity-id and ACS match one of the wildcard patterns, then a
dynamic service provider is defined from the associated mustache
template.
Backport of: #54148
This commit clarifies the autoscaling feature flag registration system
property. The intention is that this system property is:
- unset in snapshot builds
- unset, true, or false in release builds
- in release builds, unset behaves the same as false
- therefore, we only register the enabled flag if the build is a
snapshot build, or the build is a release build and the system
property is set to true
This commit clarifies that intention, and removed a confusion situation
where the AUTOSCALING_FEATURE_FLAG_REGISTERED field would be set to
false in a snapshot build, even though we were going to register the
setting.
* EQL: Use In from QL
* EQL: Add more In tests
* EQL: Test In duplicates
* EQL: Add test for In mixed types
* EQL: Copy In translation to QL
* SQL: Use InComparisons from QL
* EQL: Remove boost checks from QueryFolderOkTests
* QL: Add TranslatorHandler.convert
The allowTrial flag is always true, since trial licenses act as though
everything is licensed. This commit removes the allowTrial flag in
license checking helper methods.
Pipeline aggregations like `stats_bucket`, `sum_bucket`, and
`percentiles_bucket` only operate on buckets that have multiple buckets.
This adds support for those aggregations to `geo_distance`, `ip_range`,
`auto_date_histogram`, and `rare_terms`.
This all happened because we used a marker interface to mark compatible
aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget
to implement the interface.
This replaces the marker interface with an abstract method in
`AggregationBuilder`, `bucketCardinality` which makes you return `NONE`,
`ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At
this point `ONE` and `NONE` amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.
Closes#53215
Fixing the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call).
Also fixes the names in the _cat API as well.
closes#53946
Backport of #53982
In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams,
the abstraction needs to be made more flexible, because currently it really can be only
an alias or an index.
* Renamed `AliasOrIndex` to `IndexAbstraction`.
* Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is.
* Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum.
* Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface.
* Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`.
* Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method.
Relates to #53100
Today the security plugin stashes a copy of the environment in its
constructor, and uses the stashed copy to construct its components even
though it is provided with an environment to create these
components. What is more, the environment it creates in its constructor
is not fully initialized, as it does not have the final copy of the
settings, but the environment passed in while creating components
does. This commit removes that stashed copy of the environment.
Today the machine learning plugin stashes a copy of the environment in
its constructor, and uses the stashed copy to construct its components
even though it is provided with an environment to create these
components. What is more, the environment it creates in its constructor
is not fully initialized, as it does not have the final copy of the
settings, but the environment passed in while creating components
does. This commit removes that stashed copy of the environment.
Currently all of our transport protocol decoding and aggregation occurs
in the individual transport modules. This means that each implementation
(test, netty, nio) must implement this logic. Additionally, it means
that the entire message has been read from the network before the server
package receives it.
This commit creates a pipeline in server which can be passed arbitrary
bytes to handle. Internally, the pipeline will decode, decompress, and
aggregate the messages. Additionally, this allows us to run many
megabytes of bytes through the pipeline in tests to ensure that the
logic works.
This work will enable future work:
Circuit breaking or backoff logic based on message type and byte
in the content aggregator.
Sharing bytes with the application layer using the ref counted
releasable network bytes.
Improved network monitoring based specifically on channels.
Finally, this fixes the bug where we do not circuit break on the correct
message size when compression is enabled.
* Add REST APIs for IndexTemplateV2Metadata CRUD (#54039)
* Add REST APIs for IndexTemplateV2Metadata CRUD
This commit adds the get/put/delete APIs for interacting with the now v2 versions of index
templates.
These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag.
Relates to #53101
* Add exceptions for HLRC tests
* Add skips for 7.x versions
* Use index_template instead of template_v2 in action names
* Add test for MetaDataIndexTemplateService.addIndexTemplateV2
* Move removal to static method and add test
* Add unit tests for request classes (implement hashCode & equals)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Fix compilation
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Today we assign CCR persistent tasks to nodes with the data role. It
could be that the data node is not capable of connecting to remote
clusters, in which case the task will fail since it can not connect to
the remote cluster with the leader shard. Instead, we need to assign
such tasks to nodes that are capable of connecting to remote
clusters. This commit addresses this by enabling such persistent tasks
to only be assigned to nodes that have the data role, and also have the
remote cluster client role.
optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written.
fixes#54254
This commit causes negative TimeValues, other than -1 which is sometimes used as
a sentinel value, to be rejected during parsing.
Also introduces a hack to allow ILM to load policies which were written to the
cluster state with a negative min_age, treating those values as 0, which should
match the behavior of prior versions.
The NodesStatsRequest class uses a set of strings for its internal
serialization. This commit updates the class's interface so that we
no longer use hard-coded getters and setters, but rather
methods that add strings directly. For example, the old way of
adding "os" metrics to a request would be to call request.os(true).
The new way of doing this is to call request.addMetric("os").
For the time being, the canonical list of metrics is an enum in
NodesStatsRequest. This will eventually be replaced with something
pluggable.
This commit populates the _stats API response with sensible "empty"
`data_counts` and `memory_usage` objects when the job itself
has not started reporting them.
Backport of #54210
Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and
requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no
longer needed since we now only send the parameters along with the rest request
that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct
defaults are set on the client side. This change removes setting and sending
these defaults where possible, leaving only the overwrite of batchedReduceSize
with a default value of 5, since the default used in the vanilla SearchRequest
is 512. However, we don't need to send this value along as a request parameter
if its the default since the correct one will be set on the receiving end if no
value is specified.
Also adding tests for RestSubmitAsyncSearchAction that check the correct
defaults are set when parameters are missing on the server side.
Backport of #54200
When get filters is called without setting the `size`
paramter only up to 10 filters are returned. However,
100 filters should be returned. This commit fixes this
and adds an integ test to guard it.
It seems this was accidentally broken in #39976.
Closes#54206
Backport of #54207
This commit adjusts testWatcherRestart to vary the template version number it
checks for based on the ES version being upgraded from, because the v11 template
is only installed on clusters with all nodes >=7.7.0.
Today the `XPackRestTestHelper` makes some REST calls without the `error_trace`
parameter, so that if they fail due to an exception we do not see very much
detail. This commit adds the `error_trace` parameter to help identify why these
REST calls fail.
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.
This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.
Closes#17143
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search.
Also it renames clean_on_completion to keep_on_completion and turns around its behaviour.
Closes#54069
Xpack license state contains a helper method to determine whether
security is disabled due to license level defaults. Most code needs to
know whether security is enabled, not disabled, but this method exists
so that the security being explicitly disabled can be distinguished from
licence level defaulting to disabled. However, in the case that security
is explicitly disabled, the handlers in question are never registered,
so security is implicitly not disabled explicitly, and thus we can share
a single method to know whether licensing is enabled.
With the addition of a formal role for nodes indicating remote cluster connection, the transform specific attribute `node.attr.transform.remote_connect` is no longer necessary.
closes https://github.com/elastic/elasticsearch/issues/54179
This drop the "top level" pipeline aggregators from the aggregation
result tree which should save a little memory and a few serialization
bytes. Perhaps more imporantly, this provides a mechanism by which we
can remove *all* pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.
Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.7.0 *need* those pipelines. We provide them by setting passing
a `Supplier<PipelineTree>` into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.7.0.
This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. Its quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.7.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node *will* send the pipeline aggregations back
to the coordinating node. This is all managed with that
`Supplier<PipelineTree>`, but *how* it is managed is a bit tricky.
* Upgrade GCS Dependency to 1.106.0 (#54092)
Upgrading GCS Dep + related dependencies as it seems some more retry bugs were fixed between .104 and .106
* transform.cat should live in the cat namespace.
Similarly to to ml cat API's also living in the `cat` namespace.
Clients treat the `cat` namespace differently then other API's (return
types, content types). This introduces an exception to this rule.
* rename the specification file as well
(cherry picked from commit 0a98904b1a73a30bbaebc32bd16a238c8d03c329)
* Required for elastic/kibana#50757.
Allows the kibana user to collect APM telemetry in a background task.
* removed unnecessary priviledges on `.ml-anomalies-*` for the `kibana_system` reserved role
This change merges the "feature-internal-idp" branch into Elasticsearch.
This introduces a small identity-provider plugin as a child of the x-pack module.
This allows ES to act as a SAML IdP, for users who are authenticated against the
Elasticsearch cluster.
This feature is intended for internal use within Elastic Cloud environments
and is not supported for any other use case. It falls under an enterprise license tier.
The IdP is disabled by default.
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
This commit ensures that the hidden index settings are only applied to the
Transform index templates when the cluster can support those settings.
Also unmutes the tests which were failing due to the previous behavior.
Improve separation of scripting between EQL and SQL by delegating common
methods to QL. The context detection is determined based on the package
to avoid having repetitive class hierarchies.
The Painless whitelists have been improved so that the declaring class
is used instead of the inherited one.
Relates #53688
(cherry picked from commit 6d46033e736c64ac9255c5d6964600d2a931430a)
EQL: Add Substring function with Python semantics (#53688)
Does not reuse substring from SQL due to the difference in semantics and
the accepted arguments.
Currently it is missing full integration tests as, due to the usage of
scripting, requires an actual integration test against a proper cluster
(and likely its own QA project).
(cherry picked from commit f58680bad33d5ce4139157a69a4d9f5f286bc3c4)
As classification now works for multiple classes, randomly
picking training/test data frame rows is not good enough.
This commit introduces a stratified cross validation splitter
that maintains the proportion of the each class in the dataset
in the sample that is used for training the model.
Backport of #54087
Fixes an issue where the elasticsearch-node command-line tools would not work correctly
because PersistentTasksCustomMetaData contains named XContent from plugins. This PR
makes it so that the parsing for all custom metadata is skipped, even if the core system would
know how to handle it.
Closes#53549
Submit async search forces pre_filter_shard_size for the underlying search that it creates.
With this commit we also prevent users from overriding such default as part of request validation.
This commit adds an explicit cancellation of the search task if
the initial async search submit task is cancelled (connection closed by the user).
This was previously done through the cancellation of the parent task but we don't
handle grand-children cancellation yet so we have to manually cancel the search task
in order to ensure that shard actions are cancelled too.
This change can be considered as a workaround until #50990 is fixed.
* add flush always output option that will flush the output printer
after each debug message when enabled (disabled by default)
* at debug output initializationtime, log debug output
information about OS, JVM and default JVM timezone
(cherry picked from commit b5db9657d1eadce9902041e5b128bf32c02d302a)
It is possible for ML jobs to open lazily if the "allow_lazy_open"
option in the job config is set to true. Such jobs wait in the
"opening" state until a node has sufficient capacity to run them.
This commit fixes the bug that prevented datafeeds for jobs lazily
waiting assignment from being started. The state of such datafeeds
is "starting", and they can be stopped by the stop datafeed API
while in this state with or without force.
Backport of #53918
Role names are now compiled from role templates before role mapping is saved.
This serves as validation for role templates to prevent malformed and invalid scripts
to be persisted, which could later break authentication.
Resolves: #48773
This commit instruments data frame analytics
with stats for the data that are being analyzed.
In particular, we count training docs, test docs,
and skipped docs.
In order to account docs with missing values as skipped
docs for analyses that do not support missing values,
this commit changes the extractor so that it only ignores
docs with missing values when it collects the data summary,
which is used to estimate memory usage.
Backport of #53998
add 2 additional stats: processing time and processing total which capture the
time spent for processing results and how often it ran. The 2 new stats
correspond to the existing indexing and search stats. Together with indexing
and search this now allows the user to see the full picture, all 3 stages.
This is the first in a series of commits that will introduce the
autoscaling deciders framework. This commit introduces the basic
framework for representing autoscaling decisions.
Adds multi-class feature importance calculation.
Feature importance objects are now mapped as follows
(logistic) Regression:
```
{
"feature_name": "feature_0",
"importance": -1.3
}
```
Multi-class [class names are `foo`, `bar`, `baz`]
```
{
“feature_name”: “feature_0”,
“importance”: 2.0, // sum(abs()) of class importances
“foo”: 1.0,
“bar”: 0.5,
“baz”: -0.5
},
```
For users to get the full benefit of aggregating and searching for feature importance, they should update their index mapping as follows (before turning this option on in their pipelines)
```
"ml.inference.feature_importance": {
"type": "nested",
"dynamic": true,
"properties": {
"feature_name": {
"type": "keyword"
},
"importance": {
"type": "double"
}
}
}
```
The mapping field name is as follows
`ml.<inference.target_field>.<inference.tag>.feature_importance`
if `inference.tag` is not provided in the processor definition, it is not part of the field path.
`inference.target_field` is defaulted to `ml.inference`.
//cc @lcawl ^ Where should we document this?
If this makes it in for 7.7, there shouldn't be any feature_importance at inference BWC worries as 7.7 is the first version to have it.
This commit removes the configuration time vs execution time distinction
with regards to certain BuildParms properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.
We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations vs our jrunscript implementation. This, combined with some
optimizations to avoid probing the current JVM as well as deferring
some evaluation via Providers when probing installations for BWC builds
we can maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.
(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
This moves the pipeline aggregation validation from the data node to the
coordinating node so that we, eventually, can stop sending pipeline
aggregations to the data nodes entirely. In fact, it moves it into the
"request validation" stage so multiple errors can be accumulated and
sent back to the requester for the entire request. We can't always take
advantage of that, but it'll be nice for folks not to have to play
whack-a-mole with validation.
This is implemented by replacing `PipelineAggretionBuilder#validate`
with:
```
protected abstract void validate(ValidationContext context);
```
The `ValidationContext` handles the accumulation of validation failures,
provides access to the aggregation's siblings, and implements a few
validation utility methods.
Benchmarking showed that the effect of the ExitableDirectoryReader
is reduced considerably when checking every 8191 docs. Moreover,
set the cancellable task before calling QueryPhase#preProcess()
and make sure we don't wrap with an ExitableDirectoryReader at all
when lowLevelCancellation is set to false to avoid completely any
performance impact.
Follows: #52822
Follows: #53166
Follows: #53496
(cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)
Feature importance storage format is changing to encompass multi-class.
Feature importance objects are now mapped as follows
(logistic) Regression:
```
{
"feature_name": "feature_0",
"importance": -1.3
}
```
Multi-class [class names are `foo`, `bar`, `baz`]
```
{
“feature_name”: “feature_0”,
“importance”: 2.0, // sum(abs()) of class importances
“foo”: 1.0,
“bar”: 0.5,
“baz”: -0.5
},
```
This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type.
Native side change: https://github.com/elastic/ml-cpp/pull/1071
* Get Async Search: omit _clusters section when empty (#53907)
The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content.
This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`.
* Async search: remove version from response (#53960)
The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).
That said this commit clarifies this in the docs and removes the version field from the async search response
* Async Search: replicas to auto expand from 0 to 1 (#53964)
This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.
* [DOCS] address timing issue in async search docs tests (#53910)
The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.
With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.
Closes#53887Closes#53891
Since a data frame analytics job may have associated docs
in the .ml-stats-* indices, when the job is deleted we
should delete those docs too.
Backport of #53933
Use sequence numbers and force merge UUID to determine whether a shard has changed or not instead before falling back to comparing files to get incremental snapshots on primary fail-over.
While `CustomProcessor` is generic and allows for flexibility, there
are new requirements that make cross validation a concept it's hard
to abstract behind custom processor. In particular, we would like to
add data_counts to the DFA jobs stats. Counting training VS. test
docs would be a useful statistic. We would also want to add a
different cross validation strategy for multiclass classification.
This commit renames custom processors to cross validation splitters
which allows for those enhancements without cryptically doing
things as a side effect of the abstract custom processing.
Backport of #53915
Upgrading AWS SDK to v1.11.749.
Required building clients inside privileged contexts because some class loading that requires privileges now happens there and working around a new SDK bug in the S3 client builder.
Closes#53191
This commits adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to thest the data stream API stubs.
This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.
The data stream transport and rest action are behind the data stream feature flag and
are only intialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is build as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.
The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.
Relates to #53100
Source-only snapshots currently create a second full source-only copy of the shard on disk to
support incrementality during upload. Given that stored fields are occupying a substantial part
of a shard's storage, this means that clusters with source-only snapshots can require up to
50% more local storage. Ideally we would only generate source-only parts of the shard for the
things that need to be uploaded (i.e. do incrementality checks on original file instead of
trimmed-down source-only versions), but that requires much bigger changes to the snapshot
infrastructure. This here is an attempt to dramatically cut down on the storage used by the
source-only copy of the shard by soft-linking the stored-fields files (fd*) instead of copying
them.
Relates #50231
This change adds a "grant API key action"
POST /_security/api_key/grant
that creates a new API key using the privileges of one user ("the
system user") to execute the action, but creates the API key with
the roles of the second user ("the end user").
This allows a system (such as Kibana) to create API keys representing
the identity and access of an authenticated user without requiring
that user to have permission to create API keys on their own.
This also creates a new QA project for security on trial licenses and runs
the API key tests there
Backport of: #52886
This change adds a new exception with consistent metadata for when
security features are not enabled. This allows clients to be able to
tell that an API failed due to a configuration option, and respond
accordingly.
Relates: kibana#55255
Resolves: #52311, #47759
Backport of: #52811
This commit introduces aarch64 packaging, including bundling an aarch64
JDK distribution. We had to make some interesting choices here:
- ML binaries are not compiled for aarch64, so for now we disable ML on
aarch64
- depending on underlying page sizes, we have to disable class data
sharing
This commit changes the Transforms notifications index to be hidden
index, with a hidden alias.
This commit also removes the temporary hack in
MetaDataCreateIndexService that prevents deprecation warnings for known
dot-prefixed index names which are not hidden/system indices, as this
was the last index pattern to need that hack.
In xpack the license state contains methods to determine whether a
particular feature is allowed to be used. The one exception is
allowsRealmTypes() which returns an enum of the types of realms allowed.
This change converts the enum values to boolean methods. There are 2
notable changes: NONE is removed as we always fall back to basic license
behavior, and NATIVE is not needed because it would always return true
since we should always have a basic license.
Fixes up the "forbidden" warnings that you get when you import
Elasticsearch using "import gradle projects".
With this, and the manual step of switching circular project definitions
to warnings this gets most thing *compiling*.
This commit adds a new AsyncSearchClient to the High Level Rest Client which
initially supporst the submitAsyncSearch in its blocking and non-blocking
flavour. Also adding client side request and response objects and parsing code
to parse the xContent output of the client side AsyncSearchResponse together
with parsing roundtrip tests and a simple roundtrip integration test.
Relates to #49091
Backport of #53592
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
Adds parsing and indexing of analysis instrumentation stats.
The latest one is also returned from the get-stats API.
Note that we chose to duplicate objects even where they are currently
similar. There are already ideas on how these will diverge in the future
and while the duplication looks ugly at the moment, it is the option
that offers the highest flexibility.
Backport of #53788
The AuditTrailService has historically been an AuditTrail itself, acting
as a composite of the configured audit trails. This commit removes that
interface from the service and instead builds a composite delegating
implementation internally. The service now has a single get() method to
get an AuditTrail implementation which may be called. If auditing is not
allowed by the license, an empty noop version is returned.
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.
Backport of #53663
* [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725)
This fixes two issues:
- Results persister would retry actions even if they are not intermittent. An example of an persistent failure is a doc mapping problem.
- Data frame analytics would continue to retry to persist results even after the job is stopped.
closes https://github.com/elastic/elasticsearch/issues/53687
Prior to this commit Watcher explicitly copied test between two
projects with a copy task. This commit removes the explicit copy in favor
of adding the Watcher tests to the available restResources that may be
copied between projects.
This is how inter-project dependencies should be modeled. However, only
Watcher is included here since it is (currently) the only project with
inter-project test dependencies.
* Fix feature flag setting for ComponentTemplate APIs (#53758)
The feature flag was set for *most* of the builds, but there are a couple where it was missing.
Resolves#53708
* Add skip for older versions of ES
Backport to 7x
Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By: @iverase
* Adds per context settings:
`script.context.${CONTEXT}.cache_max_size` ~
`script.cache.max_size`
`script.context.${CONTEXT}.cache_expire` ~
`script.cache.expire`
`script.context.${CONTEXT}.max_compilations_rate` ~
`script.max_compilations_rate`
* Context cache is used if:
`script.max_compilations_rate=use-context`. This
value is dynamically updatable, so users can
switch back to the general cache if desired.
* Settings for context caches take the first value
that applies:
1) Context specific settings if set, eg
`script.context.ingest.cache_max_size`
2) Correlated general setting is set to the non-default
value, eg `script.cache.max_size`
3) Context default
The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.
Using the general cache settings for the context caches
results in higher effective settings, since they are
multiplied across the number of contexts. So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior it will avoid users snapping to a
value that is too low for them.
Backport of: #52855
Refs: #50152
Some clients have problems running this test as a numeric key is treated like an array index by default.
We can work around this by renaming the aggregation key to not be a numeric.
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.
Closes#53443
changes the output format of preview regarding deduced mappings and enhances
it to return all the details about auto-index creation. This allows the user
to customize the index creation. Using HLRC you can create a index request
from the output of the response.
backport #53572
Re-applies the change from #53523 along with test fixes.
closes#53626closes#53624closes#53622closes#53625
Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
Password changes are only allowed when the user is currently
authenticated by a realm (that permits the password to be changed)
and not when authenticated by a bearer token or an API key.
The current implicit behaviour is that when an API keys is used to create another API key,
the child key is created without any privilege. This implicit behaviour is surprising and is
a source of confusion for users.
This change makes that behaviour explicit.
If security was disabled (explicitly), then the SecurityContext would
be null, but the set_security_user processor was still registered.
Attempting to define a pipeline that used that processor would fail
with an (intentional) NPE. This behaviour, introduced in #52032, is a
regression from previous releases where the pipeline was allowed, but
was no usable.
This change restores the previous behaviour (with a new warning).
Backport of: #52691
This commit adds internalClusterTest in xpack core to run as part of
check. This was accidentally removed in a refactoring. Other xpack
modules already do this, but core was left out. This commit also mutes 2
tests that currently fail.
closes#53407
* Submit async search to work only with POST (#53368)
Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.
* Refine SearchProgressListener internal API (#53373)
The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final
* Move async search yaml tests to x-pack yaml test folder (#53537)
The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.
* [DOCS] Add temporary redirect for async-search (#53454)
The following API spec files contain a link to a not-yet-created
async search docs page:
* [async_search.delete.json][0]
* [async_search.get.json][1]
* [async_search.submit.json][2]
The Elaticsearch-js client uses these spec files to create their docs.
This created a broken link in the Elaticsearch-js docs, which has broken
the docs build.
This PR adds a temporary redirect for the docs page. This redirect
should be removed when the actual API docs are added.
[0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json
[1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json
[2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
This begins to clean up how `PipelineAggregator`s and executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
`PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.
This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregtion tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.
Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
This commit adjusts the aliases used for the ILM and SLM history indices
to be hidden aliases.
Also tweaks the configuration of the `IndexTemplateRegistry`s used by
these history system to only upgrade the template from the master node,
as documents are indexed from the master node, so the template version
should only be upgraded from the master node.
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.
Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
The watcher TextTemplateEngine uses a fast path mechanism where it
checks for the existence of `{{` to decide if a mustache script
required compilation. This does not work for stored script, as the field
that is checked contains the id of the script, which means, the name of
the script is returned as its value.
This commit checks for the script type and does not involve this fast
path check if a stored script is used.
Closes#40212
This change removes the need to always get a new version when iterating
on an async search. This is needed since we cannot guarantee that shards will
be queried exactly in order.
Relates #53360
* New wildcard field optimised for wildcard queries (#49993)
Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values. Also supports aggregations and sorting.
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.
````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
"aggs": {
"date_histogram": {
"field": "@timestamp",
"fixed_interval": "1h"
}
}
}
````
If after 100ms the final response is not available, a `partial_response` is included in the body:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 1,
"is_running": true,
"is_partial": true,
"response": {
"_shards": {
"total": 100,
"successful": 5,
"failed": 0
},
"total_hits": {
"value": 1653433,
"relation": "eq"
},
"aggs": {
...
}
}
}
````
The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:
````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````
That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:
````
{
"id": "9N3J1m4BgyzUDzqgC15b",
"version": 10,
"is_running": false,
"response": {
"is_partial": false,
...
}
}
````
Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.
Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed.
Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.
````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````
The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.
The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.
Relates #49091
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Fix NPE when `null` is passed as a parameter for a parameterized
pattern of LIKE/RLIKE. e.g.: `field LIKE ?` params=[null]`
Check for null pattern in LIKE/RLIKE as for RLIKE (RegexpQuery) we
get an IllegalArgumentExpression from Lucence but for LIKE
(WildcardQuery) we get an NPE.
Fixes: #53557
(cherry picked from commit ec3481ed13254ecdec32acf7a0fafd536ec77aff)
Prepares classification analysis to support more than just
two classes. It introduces a new parameter to the process config
which dictates the `num_classes` to the process. It also
changes the max classes limit to `30` provisionally.
Backport of #53539
Add missing asScript() implementation for LIKE/RLIKE expressions.
When LIKE/RLIKE are used for example in GROUP BY or are wrapped with
scalar functions in a WHERE clause, the translation must produce a
painless script which will be executed to implement the correct
behaviour and previously this was completely missing, and as a
consquence wrong results were silently (no error) returned.
Fixes: #53486
(cherry picked from commit eaa8ead6742a8e7dcf343bcbaff8de031550fd77)
the ML portion of the x-pack info API was erroneously counting configuration documents and definition documents. The underlying implementation of our storage separates the two out.
This PR filters the query so that only trained model config documents are counted.
Adds a new parameter for classification that enables choosing whether to assign labels to
maximise accuracy or to maximise the minimum class recall.
Fixes#52427.
This change makes it possible to send secondary authentication
credentials to select endpoints that need to perform a single action
in the context of two users.
Typically this need arises when a server process needs to call an
endpoint that users should not (or might not) have direct access to,
but some part of that action must be performed using the logged-in
user's identity.
Backport of: #52093
This change adds a new parameter to the authenticate methods in the
AuthenticationService to optionally exclude support for the anonymous
user (if an anonymous user exists).
Backport of: #52094
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
The setting, `xpack.logstash.enabled`, exists to enable or disable the
logstash extensions found within x-pack. In practice, this setting had
no effect on the functionality of the extension. Given this, the
setting is now deprecated in preparation for removal.
Backport of #53367
The tests in this class had been failing for a while, but went unnoticed as not tested by CI (see #53442).
The reason the tests fail is that the can-match phase is smarter now, and filters out access to a non-existing field.
Closes#53442
Today we do not have any infrastructure for adding a deprecation check
for settings that are removed. This commit enables this by adding such
infrastructure. Note that this infrastructure is unused in this commit,
which is deliberate. However, the primary target for this commit is 7.x
where this infrastructue will be used, in a follow-up.
Adds a new `default_field_map` field to trained model config objects.
This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.
The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:
1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index
This commit avoids these issues by adding a UUID to SearchContexId.
This is a partial implementation of an endpoint for anomaly
detector model memory estimation.
It is not complete, lacking docs, HLRC and sensible numbers
for many anomaly detector configurations. These will be
added in a followup PR in time for 7.7 feature freeze.
A skeleton endpoint is useful now because it allows work on
the UI side of the change to commence. The skeleton endpoint
handles the same cases that the old UI code used to handle,
and produces very similar estimates for these cases.
Backport of #53333
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters out them. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.
This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.
Closes#51303
This avoids NPE when executing SLM policy when no config was provided.
Related to #44465Closes#53171
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
A previous change (#53029) is causing analysis jobs to wait for certain indices to be made available. While this it is good for jobs to wait, they could fail early on _start.
This change will cause the persistent task to continually retry node assignment when the failure is due to shards not being available.
If the shards are not available by the time `timeout` is reached by the predicate, it is treated as a _start failure and the task is canceled.
For tasks seeking a new assignment after a node failure, that behavior is unchanged.
closes#53188
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
A freeze operation can partially fail in multiple places, including the
close verification step. This left the index in an unfrozen but
partially closed state. Now throw an exception to retry the freeze step
instead.
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
* Avoid race condition in ILMHistorySotre (#53039)
* Avoid race condition in ILMHistorySotre
This change modifies ILMHistoryStore to always apply correct settings and mappings,
even if template is deleted and not yet recreated. This ensures that ILM history index
is correctly managed by ILM and also fixes flaky history tests that were prone to
triggenring this race.
This commit also refactors and simplifies ILM history tests.
Closes#50353 and #52853
* Review comment
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* fixed tests
* backport #53306
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit adds a new request object field, "version", containing the version of the requesting client. This parameter is now accepted - and for certain clients required - by the server and the request is validated against it. Currently server's and client's versions still need to be equal in order for the request to be accepted. Relaxing this check is going to be part of future work.
On the clients' side, the only check remaining is to ensure that the peer server is supporting version backwards compatibility (i.e. is on, or newer than a certain release).
(cherry picked from commit a8f413a20fb023bec83af0de1211a2936a7f558c)
Update CommonEqlRestTestCase code to simplify making changes as requested.
Update EqlActionIT to simplify the test code as requested.
Replace Jackson parser with XContent in EqlActionIT.
Whitelist more EQL tests specs that are now supported.
This commit moves the global checkpoint listeners used in CCR to the CCR
thread pool. This removes the last use of the listener thread pool in
the codebase.
This check was added as part of: 0f2d26bdca
Checking this before the test starts makes more sense, because
the watches index has then also be removed.
Relates to #53177
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.
The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.
Relates #50251
Backport of #52962
Datafeed bwc tests have been muted for some time in the 7.x. This is because of date_histogram interval deprecation warnings.
This commit fixes the tests as must as possible while still handling deprecation warnings.
When we test backwards compatibility we often end up in a situation
where we *sometimes* get a warning, and sometimes don't. Like, we won't
get the warning if we're testing against an older version, but we will
in a newer one. Or we won't get the warning if the request randomly
lands on a node with an old version of the code. But we wouldn't if it
randomed into a node with newer code.
This adds `allowed_warnings` to our yaml test runner for those cases:
warnings declared this way are "allowed" but not "required".
Blocks #52959
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Tests have been periodically failing due to a race condition on checking a recently `STOPPED` task's state. The `.ml-state` index is not created until the task has already been transitioned to `STARTED`. This allows the `_start` API call to return. But, if a user (or test) immediately attempts to `_stop` that job, the job could stop and the task removed BEFORE the `.ml-state|stats` indices are created/updated.
This change moves towards the task cleaning up itself in its main execution thread. `stop` flips the flag of the task to `isStopping` and now we check `isStopping` at every necessary method. Allowing the task to gracefully stop.
closes#53007
For analytics, we need a consistent way of indicating when a value is missing. Inheriting from anomaly detection, analysis sent `""` when a field is missing. This works fine with numbers, but the underlying analytics process actually treats `""` as a category in categorical values.
Consequently, you end up with this situation in the resulting model
```
{
"frequency_encoding" : {
"field" : "RainToday",
"feature_name" : "RainToday_frequency",
"frequency_map" : {
"" : 0.009844409027270245,
"No" : 0.6472019970785184,
"Yes" : 0.6472019970785184
}
}
}
```
For inference this is a problem, because inference will treat missing values as `null`. And thus not include them on the infer call against the model.
This PR takes advantage of our new `missing_field_value` option and supplies `\0` as the value.
The assumption added in #52631 skips a problematic test
if it fails to create the required conditions for the
scenario it is supposed to be testing. (This happens
very rarely.)
However, before skipping the test it needs to remove the
failed job it has created because the standard test
cleanup code treats failed jobs as fatal errors.
Closes#52608
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
This moves the usage statistics gathering from the `AnalyticsPlugin`
into an `AnalyicsUsage`, removing the static state. It also checks the
license level when parsing all analytics aggregations. This is how we
were checking them before but we did it in an easy to forget way. This
way is slightly simpler, I think.
Add a unit test to verify that the optimization of expression
(e.g. COALESCE) is applied to all instances of the expression:
SELECT, WHERE, GROUP BY and HAVING.
Relates to #35270
(cherry picked from commit 2ceedc7f2019fad92cd86679af1a9c6fa594aa8d)
Set size/displaySize to 45 which is the maximum string for
an IP (v6), since IPs are returned as strings.
Fixes: #52762
(cherry picked from commit 815f01747a4d54a274ca248af6fc08e5ea0728c1)
This commit introduces a module for Kibana that exposes REST APIs that
will be used by Kibana for access to its system indices. These APIs are wrapped
versions of the existing REST endpoints. A new setting is also introduced since
the Kibana system indices' names are allowed to be changed by a user in case
multiple instances of Kibana use the same instance of Elasticsearch.
Additionally, the ThreadContext has been extended to indicate that the use of
system indices may be allowed in a request. This will be built upon in the future
for the protection of system indices.
Backport of #52385
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.
The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?
For this field there is a choice between
1. accepting values in `_source` when they are equal to the value configured
in mappings, but rejecting mapping updates
2. rejecting values in `_source` but then allowing updates to the value that
is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.
Backport of #49713
Currently _rollup_search requires manage privilege to access. It should really be
a read only operation. This PR changes the requirement to be read indices privilege.
Resolves: #50245
Backport: #52627
Add watcher to trigger server after index operation has succeeded,
instead of adding a watch to trigger service before
the actual index operation has performed on the shard level.
This logic is simpler to reason about in the case that a failure
does occur during the execution of an index operation on
the shard level.
Relates to #52453, but I think doesn't fix it, but makes it easier
to debug.
implement transform node attributes to disable transform on certain nodes and
test which nodes are allowed to do remote connections
closes#52200closes#50033closes#48734
backport #52712
Fix NPE when closing a webserver that hasn't started correctly.
This can happen when ssl context isn't initialized. The server instance is then never set,
which causes an NPE that masks the actual failure.
Example stacktrace that would mask an actual failure:
```
java.lang.NullPointerException
at org.elasticsearch.test.http.MockWebServer.close(MockWebServer.java:271)
at org.elasticsearch.xpack.watcher.test.integration.HttpSecretsIntegrationTests.cleanup(HttpSecretsIntegrationTests.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
```
Adds reporting of memory usage for data frame analytics jobs.
This commit introduces a new index pattern `.ml-stats-*` whose
first concrete index will be `.ml-stats-000001`. This index serves
to store instrumentation information for those jobs.
Backport of #52778 and #52958
* Move In, InPipe and InProcessor out of SQL to the common QL project.
* Move tests classes to the QL project.
* Create SQL dedicated In class to handle SQL specific data types.
* Update SQL classes to use the InPipe and InProcessor QL classes.
* Extract common Foldables methods in QL project.
* Be more explicit when folding and converting a foldable value, by
removing most of the code inside Foldables class.
(cherry picked from commit 7425042f86f66df8c207c5e96f9b9848bda2b4c3)
When user A runs as user B and performs any API key related operations,
user B's realm should always be used to associate with the API key.
Currently user A's realm is used when getting or invalidating API keys
and owner=true. The PR is to fix this bug.
resolves: #51975
* [ML][Inference] Add support for multi-value leaves to the tree model (#52531)
This adds support for multi-value leaves. This is a prerequisite for multi-class boosted tree classification.
This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index.
This is necessary for the following use cases:
- Reading from frozen indices
- Allowing certain indices in multiple index patterns to not exist yet
These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object.
closes https://github.com/elastic/elasticsearch/issues/48056
Add query execution and return actual results returned from
Elasticsearch inside the tests
(cherry picked from commit 3e039282bf991af87604a6d4f8eada19d5e33842)
This change removes TrainedModelConfig#isAvailableWithLicense method with calls to
XPackLicenseState#isAllowedByLicense.
Please note there are subtle changes to the code logic. But they are the right changes:
* Instead of Platinum license, Enterprise license nows guarantees availability.
* No explicit check when the license requirement is basic. Since basic license is always available, this check is unnecessary.
* Trial license is always allowed.
This PR moves the majority of the Watcher REST tests under
the Watcher x-pack plugin.
Specifically, moves the Watcher tests from:
x-pack/plugin/test
x-pack/qa/smoke-test-watcher
x-pack/qa/smoke-test-watcher-with-security
x-pack/qa/smoke-test-monitoring-with-watcher
to:
x-pack/plugin/watcher/qa/rest (/test and /qa/smoke-test-watcher)
x-pack/plugin/watcher/qa/with-security
x-pack/plugin/watcher/qa/with-monitoring
Additionally, this disables Watcher from the main
x-pack test cluster and consolidates the stop/start logic
for the tests listed.
No changes to the tests (beyond moving them) are included.
3rd party tests and doc tests (which also touch Watcher)
are not included in the changes here.
This commit changes the `index.hidden` setting from being final to a
dynamic setting. While the setting being final allows for easier
reasoning about an index, making this setting update-able has more
benefits in that we can upgrade existing indices to be hidden and it
will enable future features that would dynamically make indices hidden.
Backport of #52772
We've pretty well settled on `ContextParser` for a generic interface to
`ObjectParser`-like-things. This switches the interface used for
building parsing pipeline aggregations to `ContextParser` which saves a
couple of little wrappers around `ObjectParser`.
Generalize how queries on `_index` are handled at rewrite time (#52486)
Since this change refactors rewrites, I also took it as an opportunity to adrress #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder.
This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed.
Closes#49254
* Smarter copying of the rest specs and tests (#52114)
This PR addresses the unnecessary copying of the rest specs and allows
for better semantics for which specs and tests are copied. By default
the rest specs will get copied if the project applies
`elasticsearch.standalone-rest-test` or `esplugin` and the project
has rest tests or you configure the custom extension `restResources`.
This PR also removes the need for dozens of places where the x-pack
specs were copied by supporting copying of the x-pack rest specs too.
The plugin/task introduced here can also copy the rest tests to the
local project through a similar configuration.
The new plugin/task allows a user to minimize the surface area of
which rest specs are copied. Per project can be configured to include
only a subset of the specs (or tests). Configuring a project to only
copy the specs when actually needed should help with build cache hit
rates since we can better define what is actually in use.
However, project level optimizations for build cache hit rates are
not included with this PR.
Also, with this PR you can no longer use the includePackaged flag on
integTest task.
The following items are included in this PR:
* new plugin: `elasticsearch.rest-resources`
* new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy
* new extension 'restResources'
```
restResources {
restApi {
includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar
includeXpack 'baz' //will include x-pack specs that start with baz
}
restTests {
includeCore 'foo', 'bar' //will include the core tests that start with foo and bar
includeXpack 'baz' //will include the x-pack tests that start with baz
}
}
```
XPackLicenseState reads to necessary to validate a number of cluster
operations. This reads occasionally occur on transport threads which
should not be blocked. Currently we sychronize when reading. However,
this is unecessary as only a single piece of state is updateable. This
commit makes this state volatile and removes the locking.
The existing wording in the file realm docs proved confusing
for users as it seemed to indicate that it should _only_ be
used as a fallback/recovery realm and that it is not a
first class realm.
This change attempts to clarify this and point out that recovery
is _a_ use case for the file realm but not the only intended one.
These tests didn't work properly when run against multi-shard indices.
The `_score` based sorting test expects fairly specific scores which
isn't going to happen with multiple shards so this disables multiple
shards for that test. The other tests were failing due to a fairly
sneaky race condition around `_bulk` and type inference. This fixes them
by always sending metric values as floating point numbers so
Elasticsearch always infers them to be doubles.
The sql-cli script sources x-pack-env, but it does so assuming the
current directory is ES_HOME. This commit alters the source command to
use ES_HOME which is available after running elasticsearch-env.
closes#47803