Commit Graph

5108 Commits

Author SHA1 Message Date
Christoph Büscher 0d17295601 [Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (#54325)
The semantics and the default value for this parameter have changed, adapting
the javadoc accordingly.
2020-03-27 16:02:03 +01:00
Przemysław Witek 2eb079b67f
Add version guards around ML hidden indices settings (#54322) 2020-03-27 14:50:57 +01:00
Ioannis Kakavas 5983f6aceb
Mute testSpInitiatedSsoFailsForMalformedRequest (#54328) (#54339)
see #54285
2020-03-27 15:46:08 +02:00
Yannick Welsch 8126ad0ab1 Increase timeout on testUpdateAnalysisLeaderIndexSettings
Closes #54204
2020-03-27 13:41:47 +01:00
Przemysław Witek d40afc7871
[7.x] Do not fail Evaluate API when the actual and predicted fields' types differ (#54255) (#54319) 2020-03-27 10:05:19 +01:00
Jason Tedor c547fabb2b
Put CCR tasks on (data && remote cluster clients) (#54146)
Today we assign CCR persistent tasks to nodes with the data role. It
could be that the data node is not capable of connecting to remote
clusters, in which case the task will fail since it can not connect to
the remote cluster with the leader shard. Instead, we need to assign
such tasks to nodes that are capable of connecting to remote
clusters. This commit addresses this by enabling such persistent tasks
to only be assigned to nodes that have the data role, and also have the
remote cluster client role.
2020-03-26 23:50:16 -04:00
Hendrik Muhs 4ecf9904d5 [Transform] Transform optmize date histogram (#54068)
optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written.

fixes #54254
2020-03-26 21:39:50 +01:00
Gordon Brown 0d30b48613
Disallow negative TimeValues (#53913)
This commit causes negative TimeValues, other than -1 which is sometimes used as
a sentinel value, to be rejected during parsing.

Also introduces a hack to allow ILM to load policies which were written to the
cluster state with a negative min_age, treating those values as 0, which should
match the behavior of prior versions.
2020-03-26 13:30:35 -06:00
William Brafford 14204f8381
Use set-based interface for NodesStatsRequest (#53637) (#54141)
The NodesStatsRequest class uses a set of strings for its internal
serialization. This commit updates the class's interface so that we
no longer use hard-coded getters and setters, but rather
methods that add strings directly. For example, the old way of
adding "os" metrics to a request would be to call request.os(true).
The new way of doing this is to call request.addMetric("os").

For the time being, the canonical list of metrics is an enum in
NodesStatsRequest. This will eventually be replaced with something
pluggable.
2020-03-26 14:41:49 -04:00
Dimitris Athanasiou 13368aae37
[7.x][ML] DF Analytics should always display operational stats (#54210) (#54290)
This commit populates the _stats API response with sensible "empty"
`data_counts` and `memory_usage` objects when the job itself
has not started reporting them.

Backport of #54210
2020-03-26 20:03:14 +02:00
Christoph Büscher da404bbce2
HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200) (#54266)
Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and
requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no
longer needed since we now only send the parameters along with the rest request
that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct
defaults are set on the client side. This change removes setting and sending
these defaults where possible, leaving only the overwrite of batchedReduceSize
with a default value of 5, since the default used in the vanilla SearchRequest
is 512. However, we don't need to send this value along as a request parameter
if its the default since the correct one will be set on the receiving end if no
value is specified.
Also adding tests for RestSubmitAsyncSearchAction that check the correct
defaults are set when parameters are missing on the server side.

Backport of #54200
2020-03-26 19:01:17 +01:00
David Turner fc92bf4208 assertBusy in XPackRestIT#awaitCallApi (#54264)
Retries in this method were lost in #45794. This commit reinstates them.
2020-03-26 16:16:05 +00:00
Dimitris Athanasiou cc981fa377
[7.x][ML] Get ML filters size should default to 100 (#54207) (#54278)
When get filters is called without setting the `size`
paramter only up to 10 filters are returned. However,
100 filters should be returned. This commit fixes this
and adds an integ test to guard it.

It seems this was accidentally broken in #39976.

Closes #54206

Backport of #54207
2020-03-26 17:51:43 +02:00
David Turner f48e8f31b9 AwaitsFix for #54180 2020-03-26 15:35:36 +00:00
Gordon Brown bbc6bc0299
Fix WatcherRestartIT.testWatcherRestart (#54237)
This commit adjusts testWatcherRestart to vary the template version number it
checks for based on the ES version being upgraded from, because the v11 template
is only installed on clusters with all nodes >=7.7.0.
2020-03-26 08:12:15 -06:00
David Turner ffe1ba3754 Add error_trace parameter to REST test helper (#54259)
Today the `XPackRestTestHelper` makes some REST calls without the `error_trace`
parameter, so that if they fail due to an exception we do not see very much
detail. This commit adds the `error_trace` parameter to help identify why these
REST calls fail.
2020-03-26 14:04:52 +00:00
David Turner ad3c96e250 AwaitsFix for #54093 2020-03-26 13:24:33 +00:00
David Turner 53e2fec93d AwaitsFix for #53612 2020-03-26 10:41:37 +00:00
Yannick Welsch 1ba6783780 Schedule commands in current thread context (#54187)
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.

This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.

Closes #17143
2020-03-26 10:07:59 +01:00
Luca Cavanna ff269160af Async search: rename REST parameters (#54198)
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search.
Also it renames clean_on_completion to keep_on_completion and turns around its behaviour.

Closes #54069
2020-03-26 09:40:50 +01:00
Yang Wang 1afd510721
Check authentication type using enum instead of string (#54145) (#54246)
Avoid string comparison when we can use safer enums.
This refactor is a follow up for #52178.

Resolves: #52511
2020-03-26 15:45:10 +11:00
Tim Vernum 1fc518c25e
Improve stability of SamlServiceProviderIndexTests (#54241)
This test assumed cluster events would be processed quickly which is
not always true

Backport of: #54166
2020-03-26 13:07:42 +10:00
Ryan Ernst 5a5d6e9ef2
Invert license security disabled helper method (#54043) (#54239)
Xpack license state contains a helper method to determine whether
security is disabled due to license level defaults. Most code needs to
know whether security is enabled, not disabled, but this method exists
so that the security being explicitly disabled can be distinguished from
licence level defaulting to disabled. However, in the case that security
is explicitly disabled, the handlers in question are never registered,
so security is implicitly not disabled explicitly, and thus we can share
a single method to know whether licensing is enabled.
2020-03-25 19:20:10 -07:00
Benjamin Trent 6d68cf809c
[Transform] Remove node.attr.transform.remote_connect and use new remote cluster client node role (#54217) (#54224)
With the addition of a formal role for nodes indicating remote cluster connection, the transform specific attribute `node.attr.transform.remote_connect` is no longer necessary.

closes https://github.com/elastic/elasticsearch/issues/54179
2020-03-25 16:29:02 -04:00
Nik Everett 8f40f1435a
Save a little space in agg tree (backport of #53730) (#54213)
This drop the "top level" pipeline aggregators from the aggregation
result tree which should save a little memory and a few serialization
bytes. Perhaps more imporantly, this provides a mechanism by which we
can remove *all* pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.

Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.7.0 *need* those pipelines. We provide them by setting passing
a `Supplier<PipelineTree>` into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.7.0.

This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. Its quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.7.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node *will* send the pipeline aggregations back
to the coordinating node. This is all managed with that
`Supplier<PipelineTree>`, but *how* it is managed is a bit tricky.
2020-03-25 15:51:16 -04:00
Jason Tedor d14f170093
Add cluster.remote.connect to deprecation info API (#54142)
This setting was recently deprecated in favor of
node.remote_cluster_client. This commit adds this setting to the
deprecation info API.
2020-03-25 15:11:59 -04:00
Nik Everett b8b7516790 Disable WatcherRestartIT from 7.7.0
It is failing. Tracked in #54220.
2020-03-25 14:51:33 -04:00
Hendrik Muhs cb0ecafdd8 [Transform] fix transform failure case for percentiles and spa… (#54202)
index null if percentiles could not be calculated due to sparse data

fixes #54201
2020-03-25 19:28:51 +01:00
Armin Braun 70b378cd1b
Upgrade GCS Dependency to 1.106.0 (#54092) (#54112)
* Upgrade GCS Dependency to 1.106.0 (#54092)

Upgrading GCS Dep + related dependencies as it seems some more retry bugs were fixed between .104 and .106
2020-03-25 19:05:01 +01:00
Martijn Laarman 077bf52acc transform.cat should live in the cat namespace. (#54196)
* transform.cat should live in the cat namespace.

Similarly to to ml cat API's also living in the `cat` namespace.

Clients treat the `cat` namespace differently then other API's (return
types, content types). This introduces an exception to this rule.

* rename the specification file as well

(cherry picked from commit 0a98904b1a73a30bbaebc32bd16a238c8d03c329)
2020-03-25 18:16:01 +01:00
Mark Vieira 7728ccd920
Encore consistent compile options across all projects (#54120)
(cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)
2020-03-25 08:24:21 -07:00
Dimitris Athanasiou ba09a778dc
[7.x][ML] Unmute classification cardinality integ test (#54165) (#54173)
Adjusts test to work for new cardinality limit.

Backport of #54165
2020-03-25 15:00:34 +02:00
Benjamin Trent ef05a4f416
[ML] relaxing parameters on stratified split test (#54127) (#54168)
Relaxing the error rate a bit on two of the tests.
Ran 1000s of times locally and never had a failure after these changes. 

closes https://github.com/elastic/elasticsearch/issues/54122
2020-03-25 08:06:15 -04:00
Tanguy Leroux 3a3930c7ec
Mute TooManyJobsIT.testCloseFailedJob on 7.x (#54163)
Relates #54162
2020-03-25 12:44:41 +01:00
Tanguy Leroux 4a2db4651e
Mute ReadActionsTests (#54153)
Relates #53340
2020-03-25 10:35:58 +01:00
Jason Tedor 381d7586e4
Introduce formal role for remote cluster client (#54138)
This commit introduce a formal role for identifying nodes that are
capable of making connections to remote clusters.

Relates #53924
2020-03-24 21:59:43 -04:00
Oliver Gupte 96f0c668a8
[APM] Allow kibana to collect APM telemetry in background task (#52917) (#54106)
* Required for elastic/kibana#50757.
Allows the kibana user to collect APM telemetry in a background task.

* removed unnecessary priviledges on `.ml-anomalies-*` for the `kibana_system` reserved role
2020-03-24 18:11:19 -07:00
David Roberts 7667004b20
[ML] Add a model memory estimation endpoint for anomaly detection (#54129)
A new endpoint for estimating anomaly detection job
model memory requirements:

POST _ml/anomaly_detectors/estimate_model_memory

Backport of #53507
2020-03-24 22:55:11 +00:00
Ioannis Kakavas 7c0123d6f3
Add SAML IdP plugin for internal use (#54046) (#54124)
This change merges the "feature-internal-idp" branch into Elasticsearch.

This introduces a small identity-provider plugin as a child of the x-pack module.
This allows ES to act as a SAML IdP, for users who are authenticated against the
Elasticsearch cluster.

This feature is intended for internal use within Elastic Cloud environments
and is not supported for any other use case. It falls under an enterprise license tier.

The IdP is disabled by default.

Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
2020-03-25 09:45:13 +11:00
Gordon Brown 82e041442e
Add version guards around Transform hidden index settings (#54036)
This commit ensures that the hidden index settings are only applied to the
Transform index templates when the cluster can support those settings.

Also unmutes the tests which were failing due to the previous behavior.
2020-03-24 15:52:56 -06:00
Ross Wolf 627ca03c72
EQL: Remove parser handling for functions (#54028)
* EQL: Remove parser handling for functions
* EQL: Comment out array functions in queries-unsupported.eql
2020-03-24 14:03:02 -06:00
Costin Leau 68f74cf593
EQL: Fix custom scripting for functions (#53935) (#54114)
Improve separation of scripting between EQL and SQL by delegating common
methods to QL. The context detection is determined based on the package
to avoid having repetitive class hierarchies.
The Painless whitelists have been improved so that the declaring class
is used instead of the inherited one.

Relates #53688

(cherry picked from commit 6d46033e736c64ac9255c5d6964600d2a931430a)

EQL: Add Substring function with Python semantics (#53688)

Does not reuse substring from SQL due to the difference in semantics and
the accepted arguments.
Currently it is missing full integration tests as, due to the usage of
scripting, requires an actual integration test against a proper cluster
(and likely its own QA project).

(cherry picked from commit f58680bad33d5ce4139157a69a4d9f5f286bc3c4)
2020-03-24 20:54:19 +02:00
markharwood 6a60f85bba
Wildcard field - add normalizer support (#53851) (#54109)
Backport support for normalisation to wildcard field

Closes #53603
2020-03-24 17:37:47 +00:00
Dimitris Athanasiou c141c1dd89
[7.x][ML] Stratified cross validation split for classification (#54087) (#54104)
As classification now works for multiple classes, randomly
picking training/test data frame rows is not good enough.
This commit introduces a stratified cross validation splitter
that maintains the proportion of the each class in the dataset
in the sample that is used for training the model.

Backport of #54087
2020-03-24 18:47:36 +02:00
Yannick Welsch e006d1f6cf Use special XContent registry for node tool (#54050)
Fixes an issue where the elasticsearch-node command-line tools would not work correctly
because PersistentTasksCustomMetaData contains named XContent from plugins. This PR
makes it so that the parsing for all custom metadata is skipped, even if the core system would
know how to handle it.

Closes #53549
2020-03-24 17:40:51 +01:00
Luca Cavanna 6b457abbd3 Async search: prevent users from overriding pre_filter_shard_size (#54088)
Submit async search forces pre_filter_shard_size for the underlying search that it creates.
With this commit we also prevent users from overriding such default as part of request validation.
2020-03-24 17:06:04 +01:00
Luca Cavanna 3c67762f1b Async search response: output start and expiration time as time fields (#54084)
This commits makes start_time and expiration_time time fields, so that their date variant will be printed out when human readable output is requested.
2020-03-24 17:05:56 +01:00
Jim Ferenczi 0330bef409 Improve async search's tasks cancellation (#53799)
This commit adds an explicit cancellation of the search task if
the initial async search submit task is cancelled (connection closed by the user).
This was previously done through the cancellation of the parent task but we don't
handle grand-children cancellation yet so we have to manually cancel the search task
in order to ensure that shard actions are cancelled too.
This change can be considered as a workaround until #50990 is fixed.
2020-03-24 15:51:10 +01:00
Andrei Stefan 3234b50e95
SQL: jdbc debugging enhancement (#53880) (#54081)
* add flush always output option that will flush the output printer
after each debug message when enabled (disabled by default)
* at debug output initializationtime, log debug output
information about OS, JVM and default JVM timezone

(cherry picked from commit b5db9657d1eadce9902041e5b128bf32c02d302a)
2020-03-24 16:09:53 +02:00
Alan Woodward 39d7d0dc10 Upgrade to lucene 8.5.0 release (#54077)
Upgrades our lucene dependency to the released 8.5.0 version.
2020-03-24 13:45:50 +00:00
David Roberts 1421471556
[ML] Introduce a "starting" datafeed state for lazy jobs (#54065)
It is possible for ML jobs to open lazily if the "allow_lazy_open"
option in the job config is set to true.  Such jobs wait in the
"opening" state until a node has sufficient capacity to run them.

This commit fixes the bug that prevented datafeeds for jobs lazily
waiting assignment from being started.  The state of such datafeeds
is "starting", and they can be stopped by the stop datafeed API
while in this state with or without force.

Backport of #53918
2020-03-24 13:00:04 +00:00
Peter Schretlen 92acb2859b
Allow kibana_system to create and invalidate API keys on behalf of other users 2020-03-24 08:38:12 -04:00
Dimitris Athanasiou be20bb5755
[7.x][ML] No refresh on indexing DFA stats (#53977) (#54064)
When we index data frame analytics stats docs we do not
need to refresh immediately.

Backport of #53977
2020-03-24 13:13:03 +02:00
Yang Wang d33d20bfdc
Validate role templates before saving role mapping (#52636) (#54059)
Role names are now compiled from role templates before role mapping is saved.
This serves as validation for role templates to prevent malformed and invalid scripts
to be persisted, which could later break authentication.

Resolves: #48773
2020-03-24 20:43:59 +11:00
Dimitris Athanasiou 5ce7c99e74
[7.x][ML] Data frame analytics data counts (#53998) (#54031)
This commit instruments data frame analytics
with stats for the data that are being analyzed.
In particular, we count training docs, test docs,
and skipped docs.

In order to account docs with missing values as skipped
docs for analyses that do not support missing values,
this commit changes the extractor so that it only ignores
docs with missing values when it collects the data summary,
which is used to estimate memory usage.

Backport of #53998
2020-03-24 11:30:43 +02:00
Hendrik Muhs 7dcacf531f
[7.x][Transform][Rollup] add processing stats to record the ti… (#54027)
add 2 additional stats: processing time and processing total which capture the
time spent for processing results and how often it ran. The 2 new stats
correspond to the existing indexing and search stats. Together with indexing
and search this now allows the user to see the full picture, all 3 stages.
2020-03-24 09:22:02 +01:00
Jason Tedor e3ca124537
Introduce autoscaling decisions (#53934)
This is the first in a series of commits that will introduce the
autoscaling deciders framework. This commit introduces the basic
framework for representing autoscaling decisions.
2020-03-23 23:08:06 -04:00
Tim Vernum 4bd853a6f2
Add "grant_api_key" cluster privilege (#54042)
This change adds a new cluster privilege "grant_api_key" that allows
the use of the new /_security/api_key/grant endpoint

Backport of: #53527
2020-03-24 13:17:45 +11:00
Benjamin Trent 19af869243
[ML] adds multi-class feature importance support (#53803) (#54024)
Adds multi-class feature importance calculation. 

Feature importance objects are now mapped as follows
(logistic) Regression:
```
{
   "feature_name": "feature_0",
   "importance": -1.3
}
```
Multi-class [class names are `foo`, `bar`, `baz`]
```
{ 
   “feature_name”: “feature_0”, 
   “importance”: 2.0, // sum(abs()) of class importances
   “foo”: 1.0, 
   “bar”: 0.5, 
   “baz”: -0.5 
},
```

For users to get the full benefit of aggregating and searching for feature importance, they should update their index mapping as follows (before turning this option on in their pipelines)
```
 "ml.inference.feature_importance": {
          "type": "nested",
          "dynamic": true,
          "properties": {
            "feature_name": {
              "type": "keyword"
            },
            "importance": {
              "type": "double"
            }
          }
        }
```
The mapping field name is as follows
`ml.<inference.target_field>.<inference.tag>.feature_importance`
if `inference.tag` is not provided in the processor definition, it is not part of the field path.
`inference.target_field` is defaulted to `ml.inference`.
//cc @lcawl ^ Where should we document this?

If this makes it in for 7.7, there shouldn't be any feature_importance at inference BWC worries as 7.7 is the first version to have it.
2020-03-23 18:49:07 -04:00
Gordon Brown e225f08613
Mute TransformSurvivesUpgradeIT.testTransformRollingUpgrade (#54037) 2020-03-23 16:38:04 -06:00
Mark Vieira 70cfedf542
Refactor global build info plugin to leverage JavaInstallationRegistry (#54026)
This commit removes the configuration time vs execution time distinction
with regards to certain BuildParms properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.

We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations vs our jrunscript implementation. This, combined with some
optimizations to avoid probing the current JVM as well as deferring
some evaluation via Providers when probing installations for BWC builds
we can maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.

(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
2020-03-23 15:30:10 -07:00
Nik Everett b9bfba2c8b
Move pipeline agg validation to coordinating node (backport of #53669) (#54019)
This moves the pipeline aggregation validation from the data node to the
coordinating node so that we, eventually, can stop sending pipeline
aggregations to the data nodes entirely. In fact, it moves it into the
"request validation" stage so multiple errors can be accumulated and
sent back to the requester for the entire request. We can't always take
advantage of that, but it'll be nice for folks not to have to play
whack-a-mole with validation.

This is implemented by replacing `PipelineAggretionBuilder#validate`
with:
```
protected abstract void validate(ValidationContext context);
```

The `ValidationContext` handles the accumulation of validation failures,
provides access to the aggregation's siblings, and implements a few
validation utility methods.
2020-03-23 17:22:56 -04:00
Marios Trivyzas 3a3e964956
Reduce performance impact of ExitableDirectoryReader (#53978) (#54014)
Benchmarking showed that the effect of the ExitableDirectoryReader
is reduced considerably when checking every 8191 docs. Moreover,
set the cancellable task before calling QueryPhase#preProcess()
and make sure we don't wrap with an ExitableDirectoryReader at all
when lowLevelCancellation is set to false to avoid completely any
performance impact.

Follows: #52822
Follows: #53166
Follows: #53496

(cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)
2020-03-23 21:30:34 +01:00
Christoph Büscher 286c3660bd
Add async_search get and delete APIs to HLRC (#53828) (#53980)
This commit adds the "_async_searhc" get and delete APIs to the
AsyncSearchClient in the High Level Rest Client.

Relates to #49091
Backport of #53828
2020-03-23 21:21:36 +01:00
Benjamin Trent d276058c6c
[ML] adjusting feature importance mapping for multi-class support (#53821) (#54013)
Feature importance storage format is changing to encompass multi-class.

Feature importance objects are now mapped as follows
(logistic) Regression:
```
{
   "feature_name": "feature_0",
   "importance": -1.3
}
```
Multi-class [class names are `foo`, `bar`, `baz`]
```
{
   “feature_name”: “feature_0”,
   “importance”: 2.0, // sum(abs()) of class importances
   “foo”: 1.0,
   “bar”: 0.5,
   “baz”: -0.5
},
```

This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type.

Native side change: https://github.com/elastic/ml-cpp/pull/1071
2020-03-23 15:50:12 -04:00
Przemysław Witek 88c5d520b3
[7.x] Verify that the field is aggregatable before attempting cardinality aggregation (#53874) (#54004) 2020-03-23 19:36:33 +01:00
Luca Cavanna 932a7e3112
Backport of async search changes (#53976)
* Get Async Search: omit _clusters section when empty (#53907)

The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content.

This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`.

* Async search: remove version from response (#53960)

The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).

That said this commit clarifies this in the docs and removes the version field from the async search response

* Async Search: replicas to auto expand from 0 to 1 (#53964)

This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.

* [DOCS] address timing issue in async search docs tests (#53910)

The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.

With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.

Closes #53887
Closes #53891
2020-03-23 19:13:31 +01:00
Dimitris Athanasiou 965af3a68b
[7.x][ML] Delete DF analytics stats upon job deletion (#53933) (#53997)
Since a data frame analytics job may have associated docs
in the .ml-stats-* indices, when the job is deleted we
should delete those docs too.

Backport of #53933
2020-03-23 19:55:36 +02:00
Dimitris Athanasiou 08a8345269
[7.x][ML] Fix typo in outlier detection timing stats (#53988) (#53995)
The field holding the timing stats was mistakenly called
`timings_stats`.

Backport of #53988
2020-03-23 19:46:39 +02:00
Ryan Ernst 960d1fb578
Revert "Introduce system index APIs for Kibana (#53035)" (#53992)
This reverts commit c610e0893d.

backport of #53912
2020-03-23 10:29:35 -07:00
Armin Braun 5b9864db2c
Better Incrementality for Snapshots of Unchanged Shards (#52182) (#53984)
Use sequence numbers and force merge UUID to determine whether a shard has changed or not instead before falling back to comparing files to get incremental snapshots on primary fail-over.
2020-03-23 16:43:41 +01:00
Dimitris Athanasiou 3873510332
[7.x][ML] Refactor DFA custom processor to cross validation splitter (#53915) (#53956)
While `CustomProcessor` is generic and allows for flexibility, there
are new requirements that make cross validation a concept it's hard
to abstract behind custom processor. In particular, we would like to
add data_counts to the DFA jobs stats. Counting training VS. test
docs would be a useful statistic. We would also want to add a
different cross validation strategy for multiclass classification.

This commit renames custom processors to cross validation splitters
which allows for those enhancements without cryptically doing
things as a side effect of the abstract custom processing.

Backport of #53915
2020-03-23 17:15:14 +02:00
Armin Braun 754d071c4e
Upgrade to AWS SDK 1.11.749 (#53962) (#53974)
Upgrading AWS SDK to v1.11.749.
Required building clients inside privileged contexts because some class loading that requires privileges now happens there and working around a new SDK bug in the S3 client builder.

Closes #53191
2020-03-23 15:31:29 +01:00
Marios Trivyzas af03200ad6
SQL: Extend DATE_TRUNC to also operate on intervals(elastic - #46632 ) (#47720) (#53972)
The function is extended to operate on intervals according to the PostgreSQL: https://www.postgresql.org/docs/9.1/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC

Closes : #46632
(cherry picked from commit 2dc79505825fa75e0711dcfa8e9c69e8028fc979)

Co-authored-by: musteaf <gs_mustea@hotmail.com>
2020-03-23 15:05:16 +01:00
Martijn van Groningen aef7b89219
Backport: initial data stream commit (#53959)
This commits adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to thest the data stream API stubs.

This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.

The data stream transport and rest action are behind the data stream feature flag and
are only intialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is build as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.

The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.

Relates to #53100
2020-03-23 12:58:09 +01:00
Yannick Welsch 060c72c799 Only link fd* files during source-only snapshot (#53463)
Source-only snapshots currently create a second full source-only copy of the shard on disk to
support incrementality during upload. Given that stored fields are occupying a substantial part
of a shard's storage, this means that clusters with source-only snapshots can require up to
50% more local storage. Ideally we would only generate source-only parts of the shard for the
things that need to be uploaded (i.e. do incrementality checks on original file instead of
trimmed-down source-only versions), but that requires much bigger changes to the snapshot
infrastructure. This here is an attempt to dramatically cut down on the storage used by the
source-only copy of the shard by soft-linking the stored-fields files (fd*) instead of copying
them.

Relates #50231
2020-03-23 11:04:53 +01:00
Tim Vernum cde8725e3c
Create API Key on behalf of other user (#53943)
This change adds a "grant API key action"

   POST /_security/api_key/grant

that creates a new API key using the privileges of one user ("the
system user") to execute the action, but creates the API key with
the roles of the second user ("the end user").

This allows a system (such as Kibana) to create API keys representing
the identity and access of an authenticated user without requiring
that user to have permission to create API keys on their own.

This also creates a new QA project for security on trial licenses and runs
the API key tests there

Backport of: #52886
2020-03-23 18:50:07 +11:00
Tim Vernum f003a419a5
Add exception metadata for disabled features (#53941)
This change adds a new exception with consistent metadata for when
security features are not enabled. This allows clients to be able to
tell that an API failed due to a configuration option, and respond
accordingly.

Relates: kibana#55255
Resolves: #52311, #47759

Backport of: #52811
2020-03-23 14:13:15 +11:00
Jason Tedor 27c8bcbbd1
Introduce aarch64 packaging (#53914) (#53926)
This commit introduces aarch64 packaging, including bundling an aarch64
JDK distribution. We had to make some interesting choices here:
 - ML binaries are not compiled for aarch64, so for now we disable ML on
   aarch64
 - depending on underlying page sizes, we have to disable class data
   sharing
2020-03-22 11:58:11 -04:00
David Roberts 076ba02e9c
[TEST] Mute transforms rolling upgrade tests (#53932)
Due to https://github.com/elastic/elasticsearch/issues/53931
2020-03-22 15:17:07 +00:00
Gordon Brown 10cabbbade
Transition Transforms to using hidden indices for notifcations index (#53773)
This commit changes the Transforms notifications index to be hidden
index, with a hidden alias.

This commit also removes the temporary hack in
MetaDataCreateIndexService that prevents deprecation warnings for known
dot-prefixed index names which are not hidden/system indices, as this
was the last index pattern to need that hack.
2020-03-20 15:40:58 -06:00
Ryan Ernst caa4e0dc18
Use boolean methods for allowed realm types in license state (#53456) (#53834)
In xpack the license state contains methods to determine whether a
particular feature is allowed to be used. The one exception is
allowsRealmTypes() which returns an enum of the types of realms allowed.
This change converts the enum values to boolean methods. There are 2
notable changes: NONE is removed as we always fall back to basic license
behavior, and NATIVE is not needed because it would always return true
since we should always have a basic license.
2020-03-20 14:30:31 -07:00
Aleksandr Maus fd0cdde38c
EQL: EqlActionIT improvements (#53780) (#53888)
Related to https://github.com/elastic/elasticsearch/issues/53598
2020-03-20 17:28:15 -04:00
Lee Hinman 1f3de2fa7e
Set feature flags for IndexTemplatesV2 in top-level gradle file (#53898)
Resolves #53892
2020-03-20 14:52:22 -06:00
Nik Everett c2a2fcb5a1
Clean up eclipse build (backport of #53831) (#53870)
Fixes up the "forbidden" warnings that you get when you import
Elasticsearch using "import gradle projects".

With this, and the manual step of switching circular project definitions
to warnings this gets most thing *compiling*.
2020-03-20 12:12:05 -04:00
Aleksandr Maus 83bef862e0
EQL: Extract query folder tests definitions into resources (#53802) (#53869) 2020-03-20 10:39:35 -04:00
Luca Cavanna 03fca61fcb [DOCS] add docs for async search (#53675)
Relates to #49091

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-03-20 14:46:38 +01:00
Christoph Büscher 8eacb153df
Add async_search.submit to HLRC #53592 (#53852)
This commit adds a new AsyncSearchClient to the High Level Rest Client which
initially supporst the submitAsyncSearch in its blocking and non-blocking
flavour. Also adding client side request and response objects and parsing code
to parse the xContent output of the client side AsyncSearchResponse together
with parsing roundtrip tests and a simple roundtrip integration test.

Relates to #49091
Backport of #53592
2020-03-20 13:15:58 +01:00
Przemysław Witek a68071dbba
[7.x] Delete empty .ml-state* indices during nightly maintenance task. (#53587) (#53849) 2020-03-20 13:08:36 +01:00
Alan Woodward d23112f441 Report parser name and location in XContent deprecation warnings (#53805)
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.

Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
2020-03-20 11:52:55 +00:00
Dimitris Athanasiou 60153c5433
[7.x][ML] Data frame analytics analysis stats (#53788) (#53844)
Adds parsing and indexing of analysis instrumentation stats.
The latest one is also returned from the get-stats API.

Note that we chose to duplicate objects even where they are currently
similar. There are already ideas on how these will diverge in the future
and while the duplication looks ugly at the moment, it is the option
that offers the highest flexibility.

Backport of #53788
2020-03-20 12:11:53 +02:00
Ryan Ernst b8ef830c0a
Decouple AuditTrailService from AuditTrail (#53450) (#53760)
The AuditTrailService has historically been an AuditTrail itself, acting
as a composite of the configured audit trails. This commit removes that
interface from the service and instead builds a composite delegating
implementation internally. The service now has a single get() method to
get an AuditTrail implementation which may be called. If auditing is not
allowed by the license, an empty noop version is returned.
2020-03-19 14:39:01 -07:00
Christoph Büscher d846ea43f4
Fix ReloadSynonymAnalyzerIT failure (#53663) (#53806)
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.

Backport of #53663
2020-03-19 19:00:14 +01:00
Benjamin Trent 433952b595
[7.x] [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725) (#53808)
* [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725)

This fixes two issues:


- Results persister would retry actions even if they are not intermittent. An example of an persistent failure is a doc mapping problem.
- Data frame analytics would continue to retry to persist results even after the job is stopped.

closes https://github.com/elastic/elasticsearch/issues/53687
2020-03-19 13:56:41 -04:00
Jake Landis cce60215d8
[7.x] Add Watcher to available rest resources (#53620) (#53764)
Prior to this commit Watcher explicitly copied test between two
projects with a copy task. This commit removes the explicit copy in favor
of adding the Watcher tests to the available restResources that may be
copied between projects.

This is how inter-project dependencies should be modeled. However, only
Watcher is included here since it is (currently) the only project with
inter-project test dependencies.
2020-03-19 12:29:36 -05:00
Jake Landis db3420d757
[7.x] Optimize which Rest resources are used by the Rest tests… (#53766)
This should help with Gradle's incremental compile such that projects
only depend upon the resources they use.

related #52114
2020-03-19 12:28:59 -05:00
Lee Hinman 40181eb200
[7.x] Fix feature flag setting for ComponentTemplate APIs (#53… (#53800)
* Fix feature flag setting for ComponentTemplate APIs (#53758)

The feature flag was set for *most* of the builds, but there are a couple where it was missing.

Resolves #53708

* Add skip for older versions of ES
2020-03-19 09:35:07 -06:00
Ignacio Vera dfc1d79ddf
Add support for distance queries on shape queries (#53468) (#53796)
With the upgrade to Lucene 8.5, XYShape field has support for distance queries. This change implements this new feature and removes the limitation.
2020-03-19 15:32:09 +01:00
Dominic Page b0884baf46
Geo shape query vs geo point backport (#53774)
Backport to 7x

Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By:  @iverase
2020-03-19 13:00:36 +01:00
Ioannis Kakavas 4a36894a48
Mute failing tests (#53781)
See #53738
2020-03-19 08:16:23 +02:00