* Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAllocated
These tests were using randomly generated includes/excludes/requires for
routing, however, it was possible to generate mutually exclusive
allocation settings (about 1 out of 50,000 times for my runs).
This splits the test into three different tests, and removes the
randomization (it doesn't add anything to the testing here) to fix the
issue.
Resolves#47142
While it seemed like the PUT data frame analytics action did not
have to be a master node action as the config is stored in an index
rather than the cluster state, there are other subtle nuances which
make it worthwhile to convert it. In particular, it helps maintain
order of execution for put actions which are anyhow user driven and
are expected to have low volume.
This commit converts `TransportPutDataFrameAnalyticsAction` from
a handled transport action to a master node action.
Note this means that the action might fail in a mixed cluster
but as the API is still experimental and not widely used there will
be few moments more suitable to make this change than now.
Bulk requests currently do not allow adding "create" actions with auto-generated IDs.
This commit allows using the optype CREATE for append-only indexing operations. This is
mainly the user facing aspect of it.
Due to #47003 many clusters will have built up a
large backlog of expired results. On upgrading to
a version where that bug is fixed users could find
that the first ML daily maintenance task deletes
a very large amount of documents.
This change introduces throttling to the
delete-by-query that the ML daily maintenance uses
to delete expired results to limit it to deleting an
average 200 documents per second. (There is no
throttling for state/forecast documents as these
are expected to be lower volume.)
Additionally a rough time limit of 8 hours is applied
to the whole delete expired data action. (This is only
rough as it won't stop part way through a single
operation - it only checks the timeout between
operations.)
Relates #47103
This commit restores the model state if available in data
frame analytics jobs.
In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.
Note that in order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.
Due to a regression bug the metadata Active Directory realm
setting is ignored (it works correctly for the LDAP realm type).
This commit redresses it.
Closes#45848
* [ML][Inference] adding .ml-inference* index and storage (#47267)
* [ML][Inference] adding .ml-inference* index and storage
* Addressing PR comments
* Allowing null definition, adding validation tests for model config
* fixing line length
* adjusting for backport
As a result of #45689 snapshot finalization started to
take significantly longer than before. This may be a
little unfortunate since it increases the likelihood
of failing to finalize after having written out all
the segment blobs.
This change parallelizes all the metadata writes that
can safely run in parallel in the finalization step to
speed the finalization step up again. Also, this will
generally speed up the snapshot process overall in case
of large number of indices.
This is also a nice to have for #46250 since we add yet
another step (deleting of old index- blobs in the shards
to the finalization.
Currently the policy config is placed directly in the json object
of the toplevel `policies` array field. For example:
```
{
"policies": [
{
"match": {
"name" : "my-policy",
"indices" : ["users"],
"match_field" : "email",
"enrich_fields" : [
"first_name",
"last_name",
"city",
"zip",
"state"
]
}
}
]
}
```
This change adds a `config` field in each policy json object:
```
{
"policies": [
{
"config": {
"match": {
"name" : "my-policy",
"indices" : ["users"],
"match_field" : "email",
"enrich_fields" : [
"first_name",
"last_name",
"city",
"zip",
"state"
]
}
}
}
]
}
```
This allows us in the future to add other information about policies
in the get policy api response.
The UI will consume this API to build an overview of all policies.
The UI may in the future include additional information about a policy
and the plan is to include that in the get policy api, so that this
information can be gathered in a single api call.
An example of the information that is likely to be added is:
* Last policy execution time
* The status of a policy (executing, executed, unexecuted)
* Information about the last failure if exists
These settings were using get raw to fallback to whether or not SSL is
enabled. Yet, we have a formal mechanism for falling back to a
setting. This commit cuts over to that formal mechanism.
Backport of #45794 to 7.x. Convert most `awaitBusy` calls to
`assertBusy`, and use asserts where possible. Follows on from #28548 by
@liketic.
There were a small number of places where it didn't make sense to me to
call `assertBusy`, so I kept the existing calls but renamed the method to
`waitUntil`. This was partly to better reflect its usage, and partly so
that anyone trying to add a new call to awaitBusy wouldn't be able to find
it.
I also didn't change the usage in `TransportStopRollupAction` as the
comments state that the local awaitBusy method is a temporary
copy-and-paste.
Other changes:
* Rework `waitForDocs` to scale its timeout. Instead of calling
`assertBusy` in a loop, work out a reasonable overall timeout and await
just once.
* Some tests failed after switching to `assertBusy` and had to be fixed.
* Correct the expect templates in AbstractUpgradeTestCase. The ES
Security team confirmed that they don't use templates any more, so
remove this from the expected templates. Also rewrite how the setup
code checks for templates, in order to give more information.
* Remove an expected ML template from XPackRestTestConstants The ML team
advised that the ML tests shouldn't be waiting for any
`.ml-notifications*` templates, since such checks should happen in the
production code instead.
* Also rework the template checking code in `XPackRestTestHelper` to give
more helpful failure messages.
* Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4
Drop the usage of `SimpleDateFormat` and use the `DateFormatter` instead
(cherry picked from commit 7cf509a7a11ecf6c40c44c18e8f03b8e81fcd1c2)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
This change also slightly modifies the stats response,
so that is can easier consumer by monitoring and other
users. (coordinators stats are now in a list instead of
a map and has an additional field for the node id)
Relates to #32789
In the current implementation, the validation of the role query
occurs at runtime when the query is being executed.
This commit adds validation for the role query when creating a role
but not for the template query as we do not have the runtime
information required for evaluating the template query (eg. authenticated user's
information). This is similar to the scripts that we
store but do not evaluate or parse if they are valid queries or not.
For validation, the query is evaluated (if not a template), parsed to build the
QueryBuilder and verify if the query type is allowed.
Closes#34252
* ILM: parse origination date from index name (#46755)
Introduce the `index.lifecycle.parse_origination_date` setting that
indicates if the origination date should be parsed from the index name.
If set to true an index which doesn't match the expected format (namely
`indexName-{dateFormat}-optional_digits` will fail before being created.
The origination date will be parsed when initialising a lifecycle for an
index and it will be set as the `index.lifecycle.origination_date` for
that index.
A user set value for `index.lifecycle.origination_date` will always
override a possible parsable date from the index name.
(cherry picked from commit c363d27f0210733dad0c307d54fa224a92ddb569)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
* Drop usage of Map.of to be java 8 compliant
* Wait for snapshot completion in SLM snapshot invocation
This changes the snapshots internally invoked by SLM to wait for
completion. This allows us to capture more snapshotting failure
scenarios.
For example, previously a snapshot would be created and then registered
as a "success", however, the snapshot may have been aborted, or it may
have had a subset of its shards fail. These cases are now handled by
inspecting the response to the `CreateSnapshotRequest` and ensuring that
there are no failures. If any failures are present, the history store
now stores the action as a failure instead of a success.
Relates to #38461 and #43663
Using arrays of objects with embedded IDs is preferred for new APIs over
using entity IDs as JSON keys. This commit changes the SLM stats API to
use the preferred format.
* [ML][Inference] Feature pre-processing objects and functions (#46777)
To support inference on pre-trained machine learning models, some basic feature encoding will be necessary. I am using a named object serialization approach so new encodings/pre-processing steps could be added in the future.
This PR lays down the ground work for 3 basic encodings:
* HotOne
* Target Mean
* Frequency
More feature encodings or pre-processings could be added in the future:
* Handling missing columns
* Standardization
* Label encoding
* etc....
* fixing compilation for namedxcontent tests
This change allows for the caller of the `saml/prepare` API to pass
a `relay_state` parameter that will then be part of the redirect
URL in the response as the `RelayState` query parameter.
The SAML IdP is required to reflect back the value of that relay
state when sending a SAML Response. The caller of the APIs can
then, when receiving the SAML Response, read and consume the value
as it see fit.
Previously, queries on the _index field were not able to specify index aliases.
This was a regression in functionality compared to the 'indices' query that was
deprecated and removed in 6.0.
Now queries on _index can specify an alias, which is resolved to the concrete
index names when we check whether an index matches. To match a remote shard
target, the pattern needs to be of the form 'cluster:index' to match the
fully-qualified index name. Index aliases can be specified in the following query
types: term, terms, prefix, and wildcard.
This commit changes the GET REST api so it will accept an optional comma
separated list of enrich policy ids. This change also modifies the
behavior of the GET API in that it will not error if it is passed a bad
enrich id anymore, but will instead just return an empty list.
This commit reuses the same state processor that is used for autodetect
to parse state output from data frame analytics jobs. We then index the
state document into the state index.
Backport of #46804
* [ML][Transforms] remove `force` flag from _start (#46414)
* [ML][Transforms] remove `force` flag from _start
* fixing expected error message
* adjusting bwc version
* Give kibana user reserved role privileges on .apm-* to create APM agent configuration index.
* fixed test to include checking all .apm-* permissions
* changed pattern from ".apm-*" to the more specific ".apm-agent-configuration"
* Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581
* Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons
* Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization
* Fixes#41581
* [ILM] Add date setting to calculate index age
Add the `index.lifecycle.origination_date` to allow users to configure a
custom date that'll be used to calculate the index age for the phase
transmissions (as opposed to the default index creation date).
This could be useful for users to create an index with an "older"
origination date when indexing old data.
Relates to #42449.
* [ILM] Don't override creation date on policy init
The initial approach we took was to override the lifecycle creation date
if the `index.lifecycle.origination_date` setting was set. This had the
disadvantage of the user not being able to update the `origination_date`
anymore once set.
This commit changes the way we makes use of the
`index.lifecycle.origination_date` setting by checking its value when
we calculate the index age (ie. at "read time") and, in case it's not
set, default to the index creation date.
* Make origination date setting index scope dynamic
* Document orignation date setting in ilm settings
(cherry picked from commit d5bd2bb77ee28c1978ab6679f941d7c02e389d32)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Since the `IndicesSegmentsRequest` scatters to all shards for the index,
it's possible that some of the shards may fail. This adds failure
handling and logging (since this is a best-effort step in the first
place) for this case.
This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them.
The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`.
Relates #46523
rename data frame transform plugin to transform:
- rename plugin data-frame to transform
- change all package names from o.e.*.dataframe.* to o.e.*.transform.*
- necessary changes to fix loading/testing
* More Efficient Ordering of Shard Upload Execution (#42791)
* Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard
* Inspired by #39657
* Cleanup BlobStoreRepository Abort and Failure Handling (#46208)
The enrich api returns enrich coordinator stats and
information about currently executing enrich policies.
The coordinator stats include per ingest node:
* The current number of search requests in the queue.
* The total number of outstanding remote requests that
have been executed since node startup. Each remote
request is likely to include multiple search requests.
This depends on how much search requests are in the
queue at the time when the remote request is performed.
* The number of current outstanding remote requests.
* The total number of search requests that `enrich`
processors have executed since node startup.
The current execution policies stats include:
* The name of policy that is executing
* A full blow task info object that is executing the policy.
Relates to #32789
This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext.
It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the
query shard context.
Relates #46523