5094 Commits

Author SHA1 Message Date
Benjamin Trent d276058c6c
[ML] adjusting feature importance mapping for multi-class support (#53821) (#54013)
Feature importance storage format is changing to encompass multi-class.

Feature importance objects are now mapped as follows
(logistic) Regression:
```
{
   "feature_name": "feature_0",
   "importance": -1.3
}
```
Multi-class [class names are `foo`, `bar`, `baz`]
```
{
   "feature_name": "feature_0",
   "importance": 2.0, // sum(abs()) of class importances
   "foo": 1.0,
   "bar": 0.5,
   "baz": -0.5
}
```
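
As an illustration only (not the analytics implementation), the multi-class object above could be assembled like this in Python. The helper names `total_importance` and `multiclass_entry` are hypothetical; the only rule taken from the commit message is that `importance` is the sum of the absolute per-class values.

```python
def total_importance(class_importances):
    """Sum of absolute per-class importances, per the sum(abs()) note above."""
    return sum(abs(value) for value in class_importances.values())


def multiclass_entry(feature_name, class_importances):
    """Build one feature importance object (hypothetical helper).

    Assumes no class is literally named "feature_name" or "importance",
    since such a class would overwrite the fixed keys.
    """
    entry = {
        "feature_name": feature_name,
        "importance": total_importance(class_importances),
    }
    entry.update(class_importances)
    return entry


example = multiclass_entry("feature_0", {"foo": 1.0, "bar": 0.5, "baz": -0.5})
```

With the class values from the example, `example["importance"]` is `1.0 + 0.5 + 0.5`.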

This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type.

Native side change: https://github.com/elastic/ml-cpp/pull/1071
2020-03-23 15:50:12 -04:00
Przemysław Witek 88c5d520b3
[7.x] Verify that the field is aggregatable before attempting cardinality aggregation (#53874) (#54004) 2020-03-23 19:36:33 +01:00
Luca Cavanna 932a7e3112
Backport of async search changes (#53976)
* Get Async Search: omit _clusters section when empty (#53907)

The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content.

This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`.
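
A minimal Python sketch of the idea, assuming a dict-based response. `render_response` is a hypothetical stand-in for `toXContent`, and the `total`, `successful` and `skipped` keys are assumed field names. The point is that the check looks at the content (total of `0`) rather than at object identity with an `EMPTY` constant.

```python
def render_response(body, clusters):
    """Return the response dict, adding `_clusters` only when clusters were searched."""
    out = dict(body)
    # Compare on the total count, so a deserialized copy of the empty
    # clusters object is treated the same as the original constant.
    if clusters.get("total", 0) > 0:
        out["_clusters"] = clusters
    return out


empty = {"total": 0, "successful": 0, "skipped": 0}
remote = {"total": 2, "successful": 2, "skipped": 0}
```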

* Async search: remove version from response (#53960)

The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).

That said this commit clarifies this in the docs and removes the version field from the async search response

* Async Search: replicas to auto expand from 0 to 1 (#53964)

This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.

* [DOCS] address timing issue in async search docs tests (#53910)

The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.

With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.

Closes #53887
Closes #53891
2020-03-23 19:13:31 +01:00
Dimitris Athanasiou 965af3a68b
[7.x][ML] Delete DF analytics stats upon job deletion (#53933) (#53997)
Since a data frame analytics job may have associated docs
in the .ml-stats-* indices, when the job is deleted we
should delete those docs too.

Backport of #53933
2020-03-23 19:55:36 +02:00
Dimitris Athanasiou 08a8345269
[7.x][ML] Fix typo in outlier detection timing stats (#53988) (#53995)
The field holding the timing stats was mistakenly called
`timings_stats`.

Backport of #53988
2020-03-23 19:46:39 +02:00
Ryan Ernst 960d1fb578
Revert "Introduce system index APIs for Kibana (#53035)" (#53992)
This reverts commit c610e0893d.

backport of #53912
2020-03-23 10:29:35 -07:00
Armin Braun 5b9864db2c
Better Incrementality for Snapshots of Unchanged Shards (#52182) (#53984)
Use sequence numbers and force merge UUID to determine whether a shard has changed or not before falling back to comparing files to get incremental snapshots on primary fail-over.
2020-03-23 16:43:41 +01:00
Dimitris Athanasiou 3873510332
[7.x][ML] Refactor DFA custom processor to cross validation splitter (#53915) (#53956)
While `CustomProcessor` is generic and allows for flexibility, there
are new requirements that make cross validation a concept it's hard
to abstract behind custom processor. In particular, we would like to
add data_counts to the DFA jobs stats. Counting training VS. test
docs would be a useful statistic. We would also want to add a
different cross validation strategy for multiclass classification.

This commit renames custom processors to cross validation splitters
which allows for those enhancements without cryptically doing
things as a side effect of the abstract custom processing.

Backport of #53915
2020-03-23 17:15:14 +02:00
Armin Braun 754d071c4e
Upgrade to AWS SDK 1.11.749 (#53962) (#53974)
Upgrading AWS SDK to v1.11.749.
Required building clients inside privileged contexts because some class loading that requires privileges now happens there and working around a new SDK bug in the S3 client builder.

Closes #53191
2020-03-23 15:31:29 +01:00
Marios Trivyzas af03200ad6
SQL: Extend DATE_TRUNC to also operate on intervals(elastic - #46632 ) (#47720) (#53972)
The function is extended to operate on intervals according to the PostgreSQL: https://www.postgresql.org/docs/9.1/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC

Closes : #46632
(cherry picked from commit 2dc79505825fa75e0711dcfa8e9c69e8028fc979)

Co-authored-by: musteaf <gs_mustea@hotmail.com>
2020-03-23 15:05:16 +01:00
Martijn van Groningen aef7b89219
Backport: initial data stream commit (#53959)
This commit adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to test the data stream API stubs.

This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.

The data stream transport and rest action are behind the data stream feature flag and
are only initialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is built as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.

The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.

Relates to #53100
2020-03-23 12:58:09 +01:00
Yannick Welsch 060c72c799 Only link fd* files during source-only snapshot (#53463)
Source-only snapshots currently create a second full source-only copy of the shard on disk to
support incrementality during upload. Given that stored fields are occupying a substantial part
of a shard's storage, this means that clusters with source-only snapshots can require up to
50% more local storage. Ideally we would only generate source-only parts of the shard for the
things that need to be uploaded (i.e. do incrementality checks on original file instead of
trimmed-down source-only versions), but that requires much bigger changes to the snapshot
infrastructure. This here is an attempt to dramatically cut down on the storage used by the
source-only copy of the shard by soft-linking the stored-fields files (fd*) instead of copying
them.

Relates #50231
2020-03-23 11:04:53 +01:00
Tim Vernum cde8725e3c
Create API Key on behalf of other user (#53943)
This change adds a "grant API key action"

   POST /_security/api_key/grant

that creates a new API key using the privileges of one user ("the
system user") to execute the action, but creates the API key with
the roles of the second user ("the end user").

This allows a system (such as Kibana) to create API keys representing
the identity and access of an authenticated user without requiring
that user to have permission to create API keys on their own.

This also creates a new QA project for security on trial licenses and runs
the API key tests there

Backport of: #52886
2020-03-23 18:50:07 +11:00
Tim Vernum f003a419a5
Add exception metadata for disabled features (#53941)
This change adds a new exception with consistent metadata for when
security features are not enabled. This allows clients to be able to
tell that an API failed due to a configuration option, and respond
accordingly.

Relates: kibana#55255
Resolves: #52311, #47759

Backport of: #52811
2020-03-23 14:13:15 +11:00
Jason Tedor 27c8bcbbd1
Introduce aarch64 packaging (#53914) (#53926)
This commit introduces aarch64 packaging, including bundling an aarch64
JDK distribution. We had to make some interesting choices here:
 - ML binaries are not compiled for aarch64, so for now we disable ML on
   aarch64
 - depending on underlying page sizes, we have to disable class data
   sharing
2020-03-22 11:58:11 -04:00
David Roberts 076ba02e9c
[TEST] Mute transforms rolling upgrade tests (#53932)
Due to https://github.com/elastic/elasticsearch/issues/53931
2020-03-22 15:17:07 +00:00
Gordon Brown 10cabbbade
Transition Transforms to using hidden indices for notifications index (#53773)
This commit changes the Transforms notifications index to be a hidden
index, with a hidden alias.

This commit also removes the temporary hack in
MetaDataCreateIndexService that prevents deprecation warnings for known
dot-prefixed index names which are not hidden/system indices, as this
was the last index pattern to need that hack.
2020-03-20 15:40:58 -06:00
Ryan Ernst caa4e0dc18
Use boolean methods for allowed realm types in license state (#53456) (#53834)
In xpack the license state contains methods to determine whether a
particular feature is allowed to be used. The one exception is
allowsRealmTypes() which returns an enum of the types of realms allowed.
This change converts the enum values to boolean methods. There are 2
notable changes: NONE is removed as we always fall back to basic license
behavior, and NATIVE is not needed because it would always return true
since we should always have a basic license.
2020-03-20 14:30:31 -07:00
Aleksandr Maus fd0cdde38c
EQL: EqlActionIT improvements (#53780) (#53888)
Related to https://github.com/elastic/elasticsearch/issues/53598
2020-03-20 17:28:15 -04:00
Lee Hinman 1f3de2fa7e
Set feature flags for IndexTemplatesV2 in top-level gradle file (#53898)
Resolves #53892
2020-03-20 14:52:22 -06:00
Nik Everett c2a2fcb5a1
Clean up eclipse build (backport of #53831) (#53870)
Fixes up the "forbidden" warnings that you get when you import
Elasticsearch using "import gradle projects".

With this, and the manual step of switching circular project definitions
to warnings, this gets most things *compiling*.
2020-03-20 12:12:05 -04:00
Aleksandr Maus 83bef862e0
EQL: Extract query folder tests definitions into resources (#53802) (#53869) 2020-03-20 10:39:35 -04:00
Luca Cavanna 03fca61fcb [DOCS] add docs for async search (#53675)
Relates to #49091

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-03-20 14:46:38 +01:00
Christoph Büscher 8eacb153df
Add async_search.submit to HLRC #53592 (#53852)
This commit adds a new AsyncSearchClient to the High Level Rest Client which
initially supports the submitAsyncSearch in its blocking and non-blocking
flavour. Also adding client side request and response objects and parsing code
to parse the xContent output of the client side AsyncSearchResponse together
with parsing roundtrip tests and a simple roundtrip integration test.

Relates to #49091
Backport of #53592
2020-03-20 13:15:58 +01:00
Przemysław Witek a68071dbba
[7.x] Delete empty .ml-state* indices during nightly maintenance task. (#53587) (#53849) 2020-03-20 13:08:36 +01:00
Alan Woodward d23112f441 Report parser name and location in XContent deprecation warnings (#53805)
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.

Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
2020-03-20 11:52:55 +00:00
Dimitris Athanasiou 60153c5433
[7.x][ML] Data frame analytics analysis stats (#53788) (#53844)
Adds parsing and indexing of analysis instrumentation stats.
The latest one is also returned from the get-stats API.

Note that we chose to duplicate objects even where they are currently
similar. There are already ideas on how these will diverge in the future
and while the duplication looks ugly at the moment, it is the option
that offers the highest flexibility.

Backport of #53788
2020-03-20 12:11:53 +02:00
Ryan Ernst b8ef830c0a
Decouple AuditTrailService from AuditTrail (#53450) (#53760)
The AuditTrailService has historically been an AuditTrail itself, acting
as a composite of the configured audit trails. This commit removes that
interface from the service and instead builds a composite delegating
implementation internally. The service now has a single get() method to
get an AuditTrail implementation which may be called. If auditing is not
allowed by the license, an empty noop version is returned.
2020-03-19 14:39:01 -07:00
Christoph Büscher d846ea43f4
Fix ReloadSynonymAnalyzerIT failure (#53663) (#53806)
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.

Backport of #53663
2020-03-19 19:00:14 +01:00
Benjamin Trent 433952b595
[7.x] [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725) (#53808)
* [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725)

This fixes two issues:


- Results persister would retry actions even if they are not intermittent. An example of a persistent failure is a doc mapping problem.
- Data frame analytics would continue to retry to persist results even after the job is stopped.
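
The two fixes can be sketched in Python as a retry loop. This is an illustration under assumptions, not the results persister code: `IntermittentError`, `persist_with_retry`, `is_stopping` and `max_retries` are hypothetical names, and backoff between attempts is omitted for brevity.

```python
class IntermittentError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a temporary outage)."""


def persist_with_retry(action, is_stopping, max_retries=3):
    """Run `action`, retrying only intermittent failures.

    Returns True on success and False if the job is stopping.
    Any other exception (e.g. a mapping problem) propagates immediately.
    """
    attempts = 0
    while True:
        if is_stopping():
            # The job is stopping: give up instead of retrying further.
            return False
        try:
            action()
            return True
        except IntermittentError:
            attempts += 1
            if attempts > max_retries:
                raise
```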

closes https://github.com/elastic/elasticsearch/issues/53687
2020-03-19 13:56:41 -04:00
Jake Landis cce60215d8
[7.x] Add Watcher to available rest resources (#53620) (#53764)
Prior to this commit Watcher explicitly copied tests between two
projects with a copy task. This commit removes the explicit copy in favor
of adding the Watcher tests to the available restResources that may be
copied between projects.

This is how inter-project dependencies should be modeled. However, only
Watcher is included here since it is (currently) the only project with
inter-project test dependencies.
2020-03-19 12:29:36 -05:00
Jake Landis db3420d757
[7.x] Optimize which Rest resources are used by the Rest tests… (#53766)
This should help with Gradle's incremental compile such that projects
only depend upon the resources they use.

related #52114
2020-03-19 12:28:59 -05:00
Lee Hinman 40181eb200
[7.x] Fix feature flag setting for ComponentTemplate APIs (#53… (#53800)
* Fix feature flag setting for ComponentTemplate APIs (#53758)

The feature flag was set for *most* of the builds, but there are a couple where it was missing.

Resolves #53708

* Add skip for older versions of ES
2020-03-19 09:35:07 -06:00
Ignacio Vera dfc1d79ddf
Add support for distance queries on shape queries (#53468) (#53796)
With the upgrade to Lucene 8.5, XYShape field has support for distance queries. This change implements this new feature and removes the limitation.
2020-03-19 15:32:09 +01:00
Dominic Page b0884baf46
Geo shape query vs geo point backport (#53774)
Backport to 7x

Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By:  @iverase
2020-03-19 13:00:36 +01:00
Ioannis Kakavas 4a36894a48
Mute failing tests (#53781)
See #53738
2020-03-19 08:16:23 +02:00
Benjamin Trent 415d73c27d
[Transform] renamed _cat/transform to _cat/transforms (#53743) (#53771)
renaming _cat/transform to  _cat/transforms for uniformity with the other _cat apis.
2020-03-18 19:54:03 -04:00
Stuart Tettemer cdbee32f55
Scripting: Per-context script cache, default off (#52855) (#53756)
* Adds per context settings:
  `script.context.${CONTEXT}.cache_max_size` ~
  `script.cache.max_size`

  `script.context.${CONTEXT}.cache_expire` ~
  `script.cache.expire`

  `script.context.${CONTEXT}.max_compilations_rate` ~
  `script.max_compilations_rate`

* Context cache is used if:
  `script.max_compilations_rate=use-context`.  This
  value is dynamically updatable, so users can
  switch back to the general cache if desired.

* Settings for context caches take the first value
  that applies:
  1) Context specific settings if set, eg
     `script.context.ingest.cache_max_size`
  2) Correlated general setting is set to the non-default
     value, eg `script.cache.max_size`
  3) Context default

The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.

Using the general cache settings for the context caches
results in higher effective settings, since they are
multiplied across the number of contexts.  So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior will avoid users snapping to a
value that is too low for them.
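
The resolution order and the multiplication effect described above can be sketched in Python. The function names and the default values in the example are hypothetical and chosen only for illustration; they are not the real defaults.

```python
def resolve_context_setting(context_value, general_value, general_default, context_default):
    """Pick the first value that applies, in the order listed above."""
    # 1) Context specific setting, if set.
    if context_value is not None:
        return context_value
    # 2) Correlated general setting, if set to a non-default value.
    if general_value is not None and general_value != general_default:
        return general_value
    # 3) Context default.
    return context_default


def effective_total(per_context_value, num_contexts):
    """A general value reused by every context is multiplied across contexts."""
    return per_context_value * num_contexts


# Hypothetical defaults: general default 100, context default 50.
from_context = resolve_context_setting(500, 200, 100, 50)
from_general = resolve_context_setting(None, 200, 100, 50)
from_default = resolve_context_setting(None, 100, 100, 50)
```

For instance, a general max size of 200 reused across 3 contexts gives `effective_total(200, 3)`, i.e. 200 * # of contexts.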

Backport of: #52855
Refs: #50152
2020-03-18 14:44:04 -06:00
Ioannis Kakavas af519cccff Revert "Mute TimeSeriesLifecycleActionsIT (#53741)"
This reverts commit df0ad7569b.
2020-03-18 18:51:06 +02:00
markharwood ae19802e29
Fix highlighter support in PinnedQuery and added test (#53716) (#53729)
CappedScoreQuery was not delegating queryVisitor calls

Closes #53699
2020-03-18 15:39:17 +00:00
Ioannis Kakavas df0ad7569b
Mute TimeSeriesLifecycleActionsIT (#53741)
see #53738
2020-03-18 17:38:24 +02:00
Luca Cavanna 75c367de13 [TEST] Replace agg key in async search yaml test (#53727)
Some clients have problems running this test as a numeric key is treated like an array index by default.
We can work around this by renaming the aggregation key to not be a numeric.
2020-03-18 16:16:15 +01:00
Benjamin Trent 2ccb963f1d
Create GET _cat/transforms API Issue (#53643) (#53726)
Adds new `_cat/transform` and `_cat/transform/{transform_id}` endpoints.
2020-03-18 10:45:28 -04:00
Alan Woodward 580bc40c0c Make it possible to deprecate all variants of a ParseField with no replacement (#53722)
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
2020-03-18 14:16:19 +00:00
Ioannis Kakavas e5aa0906f7
Mute testHistoryIsWrittenWithDeletion (#53721)
see #53718
2020-03-18 14:49:57 +02:00
Christoph Büscher 2384c1359d Revert "Fix ReloadSynonymAnalyzerIT failure (#53663)"
This reverts commit 2c32173fce.
2020-03-18 12:44:23 +01:00
Christoph Büscher 2c32173fce Fix ReloadSynonymAnalyzerIT failure (#53663)
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.

Closes #53443
2020-03-18 11:55:37 +01:00
Przemysław Witek ec13c093df
Make ML index aliases hidden (#53160) (#53710) 2020-03-18 10:28:45 +01:00
Ioannis Kakavas 873d0ecd09
Fix potential bug in concurrent token refresh support (#53668) (#53705)
Ensure that we do not proceed execution after calling the
listener's onFailure
2020-03-18 09:43:26 +02:00
Hendrik Muhs 7a12300ce6
[7.x][Transform] enhance the output of preview to return full… (#53695)
changes the output format of preview regarding deduced mappings and enhances
it to return all the details about auto-index creation. This allows the user
to customize the index creation. Using HLRC you can create an index request
from the output of the response.

backport #53572
2020-03-18 08:37:56 +01:00
Hendrik Muhs a6dca577e5 [Transform] date nanos/date histogram IT (#53654)
add an integration test for date nanos in combination with date_histogram
2020-03-17 20:58:57 +01:00
Ryan Ernst 169308656c Actually add licenses for jackson
Missed in 1d9f57b
2020-03-17 11:13:20 -07:00
Ryan Ernst 1d9f57bfc1 Fix databind version reference
This fixes fallout from a bad backport of #53642
2020-03-17 10:40:56 -07:00
Ryan Ernst 5c472fcb47 Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642)
Re-applies the change from #53523 along with test fixes.

closes #53626
closes #53624
closes #53622
closes #53625

Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-03-17 10:28:51 -07:00
David Kyle 2b635737e1
[ML] Parse single named object in config classes (#53472) (#53542) 2020-03-17 13:59:52 +00:00
Alan Woodward 71b703edd1 Rename AtomicFieldData to LeafFieldData (#53554)
This conforms with lucene's LeafReader naming convention, and
matches other per-segment structures in elasticsearch.
2020-03-17 12:30:12 +00:00
Andrei Stefan 79600eb38b
SQL: add support for index aliases for SYS COLUMNS command (#53525) (#53653)
(cherry picked from commit f65e4d6ff7b2e00eb6f9c985fbe7cb24de00f045)
2020-03-17 12:49:08 +02:00
Hendrik Muhs a0314ad015 [Transform] add transform discovery node role (#53616)
Enhancement of #52712: Add a discovery node role using the letter t for transform.

Fixes #53156
2020-03-17 11:39:20 +01:00
Ioannis Kakavas 23af171cf8
Disallow Password Change when authenticated by Token (#49694) (#53614)
Password changes are only allowed when the user is currently
authenticated by a realm (that permits the password to be changed)
and not when authenticated by a bearer token or an API key.
2020-03-17 09:45:35 +02:00
Yang Wang 7f21ade924
Explicitly require that derived API keys have no privileges (#53647) (#53648)
The current implicit behaviour is that when an API key is used to create another API key,
the child key is created without any privilege. This implicit behaviour is surprising and is
a source of confusion for users.

This change makes that behaviour explicit.
2020-03-17 17:56:37 +11:00
Tim Vernum 74dbdb991c
Avoid NPE in set_security_user without security (#53543)
If security was disabled (explicitly), then the SecurityContext would
be null, but the set_security_user processor was still registered.

Attempting to define a pipeline that used that processor would fail
with an (intentional) NPE. This behaviour, introduced in #52032, is a
regression from previous releases where the pipeline was allowed, but
was not usable.

This change restores the previous behaviour (with a new warning).

Backport of: #52691
2020-03-17 13:30:07 +11:00
Ryan Ernst e7f38674ed Add internalClusterTest to check task (#53444)
This commit adds internalClusterTest in xpack core to run as part of
check. This was accidentally removed in a refactoring. Other xpack
modules already do this, but core was left out. This commit also mutes 2
tests that currently fail.

closes #53407
2020-03-16 18:55:01 -07:00
Luca Cavanna c3d2417448
Cumulative backport of async search changes (#53635)
* Submit async search to work only with POST (#53368)

Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.

* Refine SearchProgressListener internal API (#53373)

The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final

* Move async search yaml tests to x-pack yaml test folder (#53537)

The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.

* [DOCS] Add temporary redirect for async-search (#53454)

The following API spec files contain a link to a not-yet-created
async search docs page:

* [async_search.delete.json][0]
* [async_search.get.json][1]
* [async_search.submit.json][2]

The Elasticsearch-js client uses these spec files to create their docs.
This created a broken link in the Elasticsearch-js docs, which has broken
the docs build.

This PR adds a temporary redirect for the docs page. This redirect
should be removed when the actual API docs are added.

[0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json
[1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json
[2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-03-17 00:08:17 +01:00
Nik Everett f0beab4041
Stop using round-tripped PipelineAggregators (backport of #53423) (#53629)
This begins to clean up how `PipelineAggregator`s are executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
   serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
   `PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
   really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.

This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregation tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.

Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
2020-03-16 16:15:23 -04:00
Gordon Brown 880cc3ca7e
Hide I/SLM history aliases (#53564)
This commit adjusts the aliases used for the ILM and SLM history indices
to be hidden aliases.

Also tweaks the configuration of the `IndexTemplateRegistry`s used by
these history systems to only upgrade the template from the master node,
as documents are indexed from the master node, so the template version
should only be upgraded from the master node.
2020-03-16 13:07:26 -06:00
Gordon Brown 031932b32f
Allow _cat indices & aliases to use indices options (#53248)
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.

Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
2020-03-16 11:25:05 -06:00
Alexander Reelsen 7571ca437a Disable Watcher script optimization for stored scripts (#53497)
The watcher TextTemplateEngine uses a fast path mechanism where it
checks for the existence of `{{` to decide if a mustache script
requires compilation. This does not work for stored scripts, as the field
that is checked contains the id of the script, which means, the name of
the script is returned as its value.

This commit checks for the script type and does not involve this fast
path check if a stored script is used.
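
A rough Python sketch of the check described above. `needs_compilation` and the `"stored"` / `"inline"` type strings are hypothetical stand-ins for the Watcher code: for a stored script the text is only an id, so searching it for `{{` says nothing about the template body.

```python
def needs_compilation(script_type, text):
    """Decide whether a template must go through the mustache engine."""
    if script_type == "stored":
        # `text` holds the stored script id, not the template source,
        # so the fast path must be skipped.
        return True
    # Fast path: inline text without `{{` can be returned as-is.
    return "{{" in text


plain = needs_compilation("inline", "disk usage is high")
templated = needs_compilation("inline", "host {{ctx.payload.host}} is down")
stored = needs_compilation("stored", "my-stored-template")
```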

Closes #40212
2020-03-16 18:07:54 +01:00
Andrei Stefan 91ca9c5c33
QL: constant_keyword support (#53241) (#53602)
(cherry picked from commit d6cd4ce7849ba215407c8c5fa815c9b373fb8480)
2020-03-16 18:06:31 +02:00
jimczi dc2edc97f0 Fix sporadic failures in AsyncSearchActionTests (take 2)
This change removes the need to always get a new version when iterating
on an async search. This is needed since we cannot guarantee that shards will
be queried exactly in order.

Relates #53360
2020-03-16 16:52:23 +01:00
markharwood 2c74f3e22c
Backport of new wildcard field type (#53590)
* New wildcard field optimised for wildcard queries (#49993)

Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values.  Also supports aggregations and sorting.
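
The two-phase approach can be illustrated with a small Python sketch. This is a toy model under assumptions, not the field type's implementation: `WildcardIndex` is a hypothetical class, the ngram set stands in for the approximation query, and a regular expression over the stored original stands in for the automaton verification. Only `*` and `?` are treated as wildcards.

```python
import re
from collections import defaultdict


def ngrams(text, n=3):
    """All size-n substrings of `text` (empty when text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class WildcardIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # ngram -> doc ids
        self.values = {}                  # doc id -> full original value

    def add(self, doc_id, value):
        self.values[doc_id] = value
        for gram in ngrams(value):
            self.postings[gram].add(doc_id)

    def search(self, pattern):
        # Phase 1 (cheap approximation): docs containing every ngram
        # found in the literal parts of the pattern.
        candidates = set(self.values)
        for literal in re.split(r"[*?]", pattern):
            for gram in ngrams(literal):
                candidates &= self.postings.get(gram, set())
        # Phase 2 (expensive verification): match the full original value.
        regex = re.compile(
            "".join(".*" if c == "*" else "." if c == "?" else re.escape(c)
                    for c in pattern),
            re.DOTALL,
        )
        return sorted(d for d in candidates if regex.fullmatch(self.values[d]))


index = WildcardIndex()
index.add(1, "elasticsearch")
index.add(2, "elastic")
index.add(3, "search elastic")
```

Here `index.search("elastic*")` keeps document 3 as a candidate in the first phase, because it contains all the required ngrams, and rejects it only in the verification phase.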
2020-03-16 15:07:13 +00:00
Przemysław Witek 376b2ae735
[7.x] Make classification evaluation metrics work when there is field mapping type mismatch (#53458) (#53601) 2020-03-16 15:38:56 +01:00
Jim Ferenczi e6680be0b1
Add new x-pack endpoints to track the progress of a search asynchronously (#49931) (#53591)
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.

````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
  "aggs": {
    "date_histogram": {
      "field": "@timestamp",
      "fixed_interval": "1h"
    }
  }
}
````

If after 100ms the final response is not available, a `partial_response` is included in the body:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 1,
  "is_running": true,
  "is_partial": true,
  "response": {
   "_shards": {
       "total": 100,
       "successful": 5,
       "failed": 0
    },
    "total_hits": {
      "value": 1653433,
      "relation": "eq"
    },
    "aggs": {
      ...
    }
  }
}
````

The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue monitoring the progress of the search, users can call the GET `_async_search` API as follows:

````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````

That returns a new response that can contain the same partial response as the previous call if the search didn't progress; in that case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally, if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual `_search` response:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 10,
  "is_running": false,
  "response": {
     "is_partial": false,
     ...
  }
}
````
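
A client-side polling loop over the fields described in this commit message (`version`, `is_running`) might look like the sketch below. It assumes a hypothetical `fetch` callable that stands in for `GET _async_search/<id>?wait_for_completion=...` and returns the parsed JSON body; no real HTTP client or Elasticsearch client API is implied.

```python
import time


def poll_async_search(fetch, search_id, interval_s=0.1, max_polls=50, on_progress=None):
    """Poll until `is_running` is false and return the final body.

    `fetch` is a hypothetical callable: fetch(search_id) -> dict (parsed JSON).
    `on_progress` is called only when the returned `version` changes, which the
    commit message describes as the signal that new partial results exist.
    """
    last_version = None
    for _ in range(max_polls):
        body = fetch(search_id)
        if body["version"] != last_version:
            last_version = body["version"]
            if on_progress is not None:
                on_progress(body)
        if not body["is_running"]:
            return body
        time.sleep(interval_s)
    raise TimeoutError(f"search {search_id} still running after {max_polls} polls")
```

An unchanged `version` between two polls means the search made no visible progress, so the caller can skip re-rendering partial results.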

Asynchronous searches are stored in a restricted index called `.async-search` if they survive (are still running) after the initial submit. Each request has a keep alive that defaults to 5 days, but this value can be changed or updated at any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search; the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `.async-search` index is responsible for deleting the expired results. It runs every hour, but the expiration is also checked by running queries (if they take longer than the `keep_alive`) and when getting a result.

Like a normal `_search`, if the HTTP channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior applies only to the submit API; subsequent GET requests will not cancel the search if they are closed.

Asynchronous searches are not persistent: if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know whether the search is still running, the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However, final responses and failures are persisted in a system index, which allows
retrieving a response even after the task has finished.

````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````

The response is also not stored if the initial submit action returns a final response. This avoids adding any overhead to queries that complete within the initial `wait_for_completion`.

The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.

Relates #49091

Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
2020-03-16 15:31:27 +01:00
Marios Trivyzas 723034001c SQL: Fix NPE for parameterized LIKE/RLIKE (#53573)
Fix NPE when `null` is passed as a parameter for a parameterized
pattern of LIKE/RLIKE, e.g.: `field LIKE ?` with `params=[null]`.
Check for a null pattern in LIKE/RLIKE, as for RLIKE (RegexpQuery) we
get an IllegalArgumentException from Lucene but for LIKE
(WildcardQuery) we get an NPE.

Fixes: #53557
(cherry picked from commit ec3481ed13254ecdec32acf7a0fafd536ec77aff)
2020-03-16 14:44:48 +01:00
Dimitris Athanasiou 94da4ca3fc
[7.x][ML] Extend classification to support multiple classes (#53539) (#53597)
Prepares classification analysis to support more than just
two classes. It introduces a new parameter to the process config
which dictates the `num_classes` to the process. It also
changes the max classes limit to `30` provisionally.

Backport of #53539
2020-03-16 15:00:54 +02:00
David Kyle a38e5ca8e7
Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure (#53595)
Failure tracked in #50353
2020-03-16 12:30:56 +00:00
Marios Trivyzas 1272ae411e SQL: Fix issue with LIKE/RLIKE as painless script (#53495)
Add missing asScript() implementation for LIKE/RLIKE expressions.

When LIKE/RLIKE are used for example in GROUP BY or are wrapped with
scalar functions in a WHERE clause, the translation must produce a
painless script which will be executed to implement the correct
behaviour. Previously this was completely missing and, as a
consequence, wrong results were silently (no error) returned.

Fixes: #53486
(cherry picked from commit eaa8ead6742a8e7dcf343bcbaff8de031550fd77)
2020-03-16 12:27:45 +01:00
Martijn van Groningen 3b9545848f
Reenable watcher rest tests (#53532)
Also log a message instead of failing if there are active watches at a beginning of a test.

Relates to #53177
2020-03-16 10:24:14 +01:00
Mark Vieira 2f0aca992b
Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)"
This reverts commit b7dbadeea0.
2020-03-15 18:10:40 -07:00
Jason Tedor b7dbadeea0
Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)
This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2
dependency to 2.13.1.

Relates #53523
2020-03-14 13:28:06 -04:00
Benjamin Trent 1262ab2762
[ML] [Inference] fix number inference models returned in x-pack info call (#53540) (#53560)
The ML portion of the x-pack info API was erroneously counting both configuration documents and definition documents. The underlying implementation of our storage separates the two out.

This PR filters the query so that only trained model config documents are counted.
2020-03-13 16:53:34 -04:00
Benjamin Trent 4e43ede735
[ML] renaming inference processor field field_mappings to new name field_map (#53433) (#53502)
This renames the `inference` processor configuration field `field_mappings` to `field_map`.

`field_mappings` is now deprecated.
2020-03-13 15:40:57 -04:00
Tom Veasey 690099553c
[7.x][ML] Adds the class_assignment_objective parameter to classification (#53552)
Adds a new parameter for classification that enables choosing whether to assign labels to
maximise accuracy or to maximise the minimum class recall.
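
To illustrate why the two objectives can assign labels differently, here is a small sketch for the binary case. It is an assumption-laden illustration, not the ML process's algorithm: the function names and the objective strings `maximize_accuracy` and `maximize_minimum_recall` are chosen here for readability, and the "assignment" is modelled as picking a decision threshold on a predicted score.

```python
def accuracy(scores, labels, threshold):
    """Fraction of rows labelled correctly when predicting 1 for score >= threshold."""
    correct = sum(1 for s, y in zip(scores, labels) if (s >= threshold) == (y == 1))
    return correct / len(labels)


def minimum_recall(scores, labels, threshold):
    """Smaller of the two per-class recalls at the given threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return min(true_pos / positives, true_neg / negatives)


def pick_threshold(scores, labels, objective):
    """Choose the candidate threshold that maximises the requested objective."""
    metric = {
        "maximize_accuracy": accuracy,              # illustrative name
        "maximize_minimum_recall": minimum_recall,  # illustrative name
    }[objective]
    # Every observed score, plus +inf for "never predict the positive class".
    candidates = sorted(set(scores)) + [float("inf")]
    return max(candidates, key=lambda t: metric(scores, labels, t))
```

On imbalanced data the accuracy objective tends to favour the majority class, while the minimum-recall objective accepts some extra majority-class errors to avoid missing the minority class.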

Fixes #52427.
2020-03-13 17:35:51 +00:00
Tim Vernum a8677499d7
[Backport] Add support for secondary authentication (#53530)
This change makes it possible to send secondary authentication
credentials to select endpoints that need to perform a single action
in the context of two users.

Typically this need arises when a server process needs to call an
endpoint that users should not (or might not) have direct access to,
but some part of that action must be performed using the logged-in
user's identity.

Backport of: #52093
2020-03-13 16:30:20 +11:00
Tim Vernum bac1740d44
Support authentication without anonymous user (#53528)
This change adds a new parameter to the authenticate methods in the
AuthenticationService to optionally exclude support for the anonymous
user (if an anonymous user exists).

Backport of: #52094
2020-03-13 14:27:29 +11:00
Jason Tedor f696360517
Fix SHAs for :x-pack:snapshot-tool
This commit fixes the SHA for jackson-databind in :x-pack:snapshot-tool.
2020-03-12 20:24:29 -04:00
Nik Everett 9dcd64c110
Preserve metric types in top_metrics (backport of #53288) (#53440)
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
2020-03-12 17:17:09 -04:00
Jason Tedor 5b08ea84c9
Add deprecation check for listener thread pool (#53438)
This commit adds a deprecation check for the listener thread pool
settings as these will be removed in 8.0.0.
2020-03-12 14:32:41 -04:00
Jay Modi af36665b08
Deprecate the logstash enabled setting (#53487)
The setting, `xpack.logstash.enabled`, exists to enable or disable the
logstash extensions found within x-pack. In practice, this setting had
no effect on the functionality of the extension. Given this, the
setting is now deprecated in preparation for removal.

Backport of #53367
2020-03-12 10:18:39 -06:00
Dan Hermann 34adfd9611
Validate SSL settings at parse time (#49196) (#53473) 2020-03-12 10:14:51 -05:00
Aleksandr Maus 31d45b3c95
EQL: Improve query folder test suite (#53187) (#53476)
Related to https://github.com/elastic/elasticsearch/issues/52775
2020-03-12 10:58:07 -04:00
Yannick Welsch 48124807d5 Fix SourceOnlySnapshotIT (#53462)
The tests in this class had been failing for a while, but went unnoticed as not tested by CI (see #53442).

The reason the tests fail is that the can-match phase is smarter now, and filters out access to a non-existing field.

Closes #53442
2020-03-12 14:15:03 +01:00
Jason Tedor d8e70d4688
Enable deprecation checks for removed settings (#53317)
Today we do not have any infrastructure for adding a deprecation check
for settings that are removed. This commit enables this by adding such
infrastructure. Note that this infrastructure is unused in this commit,
which is deliberate. However, the primary target for this commit is 7.x
where this infrastructure will be used, in a follow-up.
2020-03-11 16:49:16 -04:00
Benjamin Trent 89668c5ea0
[ML][Inference] adds new default_field_map field to trained models (#53294) (#53419)
Adds a new `default_field_map` field to trained model config objects.

This allows the model creator to supply a field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
2020-03-11 13:49:39 -04:00
Lisa Cawley c408a34a21 [DOCS] Fixes link to custom realm examples (#53205) 2020-03-11 09:15:48 -07:00
Jay Modi 9a21a8abf2
Opt-in logstash plugin to formatting (#53413)
This change opts-in the logstash plugin for enforced formatting.

Backport of #53370
2020-03-11 09:58:37 -06:00
Nhat Nguyen 6665ebe7ab Harden search context id (#53143)
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:

1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index

This commit avoids these issues by adding a UUID to SearchContextId.
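
The collision described above, and how pairing the counter with a per-start UUID avoids it, can be sketched like this. The `Node` and `ContextId` classes are hypothetical stand-ins for illustration only; they do not mirror the Elasticsearch classes.

```python
import itertools
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextId:
    session: str  # random value generated each time the node starts
    counter: int  # plain counter, which restarts from 1 after a restart


class Node:
    """Hypothetical data node that hands out search context ids."""

    def __init__(self):
        self.session = uuid.uuid4().hex
        self._counter = itertools.count(1)
        self._contexts = {}

    def open_context(self, index_name):
        context_id = ContextId(self.session, next(self._counter))
        self._contexts[context_id] = index_name
        return context_id

    def lookup(self, context_id):
        # A stale id from before the restart carries the old session value,
        # so it can no longer resolve to a context for a different index.
        return self._contexts.get(context_id)
```

With the counter alone, the id `1` issued before a restart would silently resolve to whichever context received `1` after the restart.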
2020-03-11 11:48:11 -04:00
Przemysław Witek 8c4c19d310
Perform evaluation in multiple steps when necessary (#53295) (#53409) 2020-03-11 15:36:38 +01:00
Przemysław Witek 063957b7d8
Simplify "refresh" calls. (#53385) (#53393) 2020-03-11 12:26:11 +01:00
Dimitris Athanasiou cc7751eb16
[7.x][ML] Add ILM policy to ml stats indices (#53349) (#53392)
Adds a size based ILM policy to automatically
rollover ml stats indices.

Backport of #53349
2020-03-11 13:01:34 +02:00
Dimitris Athanasiou 0fd0516d0d
[7.x][ML] Rename data frame analytics maximum_number_trees to max_trees (#53300) (#53390)
Deprecates `maximum_number_trees` parameter of classification and
regression and replaces it with `max_trees`.

Backport of #53300
2020-03-11 12:45:27 +02:00
David Roberts 532a720e1b
[ML] Skeleton estimate_model_memory endpoint for anomaly detection (#53386)
This is a partial implementation of an endpoint for anomaly
detector model memory estimation.

It is not complete, lacking docs, HLRC and sensible numbers
for many anomaly detector configurations.  These will be
added in a followup PR in time for 7.7 feature freeze.

A skeleton endpoint is useful now because it allows work on
the UI side of the change to commence.  The skeleton endpoint
handles the same cases that the old UI code used to handle,
and produces very similar estimates for these cases.

Backport of #53333
2020-03-11 10:20:00 +00:00
Jake Landis 2ab502afc4
[7.x] Remove dead 'beats' code (#53312) (#53376) 2020-03-10 20:57:29 -05:00
Nhat Nguyen 24f114766f Fix doc_stats and segment_stats of ReadOnlyEngine (#53345)
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters out them. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.

This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.

Closes #51303
2020-03-10 21:51:33 -04:00
Nhat Nguyen cad02d4a31 Increase timeout testFollowIndexMaxOperationSizeInBytes (#53014)
Replicating 1000 documents one by one (as we cap the request size at 
1 byte) can take more than 10 seconds on a slow CI.

Closes #52812
2020-03-10 21:51:33 -04:00
William Brafford 3494c73c8d
Mute failing tests (#53362) (#53363) 2020-03-10 16:01:31 -04:00
Przemko Robakowski 847ac9c7d7
Fix null config in SnapshotLifecyclePolicy.toRequest (#53328) (#53355)
This avoids NPE when executing SLM policy when no config was provided.

Related to #44465

Closes #53171

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-10 20:44:30 +01:00
Przemysław Witek d54d7f2be0
[7.x] Implement ILM policy for .ml-state* indices (#52356) (#53327) 2020-03-10 14:24:18 +01:00
Benjamin Trent 856d9bfbc1
[ML] fixing data frame analysis test when two jobs are started in succession quickly (#53192) (#53332)
A previous change (#53029) is causing analysis jobs to wait for certain indices to be made available. While it is good for jobs to wait, they could fail early on _start.

This change will cause the persistent task to continually retry node assignment when the failure is due to shards not being available.

If the shards are not available by the time `timeout` is reached by the predicate, it is treated as a _start failure and the task is canceled. 

For tasks seeking a new assignment after a node failure, that behavior is unchanged.


closes #53188
2020-03-10 08:30:47 -04:00
Hendrik Muhs 5912895838 [Transform] wait for transform templates in Rest integration t… (#53330)
add transform templates to the list of templates to be installed before
executing tests
2020-03-10 13:22:12 +01:00
Hendrik Muhs 696aa4ddaf
[7.x][Transform] add support for script in group_by (#53167) (#53324)
add the possibility to base the group_by on the output of a script.

closes #43152
backport #53167
2020-03-10 11:12:58 +01:00
Alan Woodward 5c861cfe6e Upgrade to final lucene 8.5.0 snapshot (#53293)
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
2020-03-10 09:32:59 +00:00
Cauê Marcondes b68d7b1c33
giving kibana user privileges to create custom link index (#53221) (#53278) 2020-03-10 09:50:38 +01:00
Henning Andersen a4d481f2bb ILM Freeze step retry when not acknowledged (#53287)
A freeze operation can partially fail in multiple places, including the
close verification step. This left the index in an unfrozen but
partially closed state. Now throw an exception to retry the freeze step
instead.
2020-03-10 08:03:39 +01:00
Gordon Brown 1cb0a4399d
Fix Get Alias API handling of hidden indices with visible aliases (#53147)
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
2020-03-09 16:16:29 -06:00
Przemko Robakowski f075d70cf8
[7.x] Avoid race condition in ILMHistoryStore (#53039) (#53094)
* Avoid race condition in ILMHistoryStore (#53039)

* Avoid race condition in ILMHistoryStore

This change modifies ILMHistoryStore to always apply correct settings and mappings,
even if template is deleted and not yet recreated. This ensures that ILM history index
is correctly managed by ILM and also fixes flaky history tests that were prone to
triggering this race.

This commit also refactors and simplifies ILM history tests.

Closes #50353 and #52853

* Review comment

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* fixed tests

* backport #53306

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-09 22:24:15 +01:00
Bogdan Pintea 62c8ac9993
SQL: transfer version compatibility decision to the server (#53082) (#53302)
This commit adds a new request object field, "version", containing the version of the requesting client. This parameter is now accepted - and for certain clients required - by the server and the request is validated against it. Currently server's and client's versions still need to be equal in order for the request to be accepted. Relaxing this check is going to be part of future work. 

On the clients' side, the only check remaining is to ensure that the peer server is supporting version backwards compatibility (i.e. is on, or newer than a certain release).

(cherry picked from commit a8f413a20fb023bec83af0de1211a2936a7f558c)
2020-03-09 21:16:57 +01:00
Aleksandr Maus d064846416
EQL: Test infrastructure improvements (#53253) (#53297)
Update CommonEqlRestTestCase code to simplify making changes as requested.
Update EqlActionIT to simplify the test code as requested.
Replace Jackson parser with XContent in EqlActionIT.
Whitelist more EQL tests specs that are now supported.
2020-03-09 14:11:54 -04:00
Ross Wolf f5f922c6f6
EQL: Add IsNull/IsNotNull checks (#52791)
* EQL: Add IsNull/IsNotNull checks
* EQL: Simplify IsNull/IsNotNull optimization
* EQL: Split string tests over multiple lines
2020-03-09 10:41:04 -06:00
Jason Tedor 8ad0080a59
Fork CCR checkpoint listeners on CCR thread pool (#53265)
This commit moves the global checkpoint listeners used in CCR to the CCR
thread pool. This removes the last use of the listener thread pool in
the codebase.
2020-03-09 08:56:30 -04:00
Martijn van Groningen 7775ddbc9c
Verify watch_count before a test starts and not after a test.
This check was added as part of: 0f2d26bdca

Checking this before the test starts makes more sense, because
the watches index has then also been removed.

Relates to #53177
2020-03-09 07:45:44 +01:00
Jason Tedor 5e96d3e59a
Use given executor for global checkpoint listener (#53260)
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
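
The "inside out" arrangement, where each listener brings its own executor instead of relying on a shared listener pool, can be sketched as below. `GlobalCheckpointListeners` here is a hypothetical Python stand-in built on `concurrent.futures`; it is not the Java class touched by this commit.

```python
from concurrent.futures import ThreadPoolExecutor


class GlobalCheckpointListeners:
    """Hypothetical registry: every listener is registered with its own executor."""

    def __init__(self):
        self._listeners = []

    def add(self, executor, callback):
        # The caller decides where its callback runs by supplying the executor.
        self._listeners.append((executor, callback))

    def notify(self, checkpoint):
        # Each callback is submitted to the executor it was registered with,
        # so no shared "listener" pool is involved.
        return [executor.submit(callback, checkpoint)
                for executor, callback in self._listeners]


if __name__ == "__main__":
    listeners = GlobalCheckpointListeners()
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="ccr") as ccr_pool:
        listeners.add(ccr_pool, lambda cp: print("checkpoint advanced to", cp))
        for future in listeners.notify(42):
            future.result()
```

A design like this lets a component such as CCR keep its callbacks on its own thread pool, which is what makes a dedicated listener pool unnecessary.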
2020-03-08 13:51:05 -04:00
Lisa Cawley 341417613e
[7.x][DOCS] Adds common definitions for security settings (#51017) (#53242)
Co-Authored-By: Tim Vernum <tim@adjective.org>
2020-03-06 16:28:54 -08:00
Gordon Brown ff9b8bda63
Implement hidden aliases (#52547)
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.

The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
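
The consistency rule can be expressed as a small validation step. The function below is a hypothetical sketch of that rule only (an omitted `is_hidden`, modelled as `None`, counts as non-hidden); it is not the validation code in Elasticsearch.

```python
def resolve_alias_hidden(alias, is_hidden_by_index):
    """Return whether `alias` is hidden, or raise if the indices disagree.

    `is_hidden_by_index` maps index name -> True, False, or None (property omitted).
    """
    # Omitting the property is equivalent to specifying non-hidden.
    states = {bool(value) for value in is_hidden_by_index.values()}
    if len(states) > 1:
        hidden = sorted(i for i, v in is_hidden_by_index.items() if v)
        visible = sorted(i for i, v in is_hidden_by_index.items() if not v)
        raise ValueError(
            f"alias [{alias}] is hidden on {hidden} but not on {visible}"
        )
    return states.pop() if states else False
```

For example, `resolve_alias_hidden("logs", {"logs-1": False, "logs-2": None})` is accepted as non-hidden, while mixing `True` with an omitted value is rejected.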
2020-03-06 16:02:38 -07:00
Ross Wolf d6813cb348
EQL: Convert wildcards to LIKE in analyzer (#51901)
* EQL: Convert wildcard comparisons to Like
* EQL: Simplify wildcard handling, update tests
* EQL: Lint fixes for Optimizer.java
2020-03-06 13:13:07 -07:00
Mayya Sharipova f96ad5c32d Mute testSingleNumericFeatureAndMixedTrainingAndNonTrainingRows 2020-03-06 12:48:05 -05:00
Jay Modi a81460dbf5
Make watch history indices hidden (#52974)
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.

Relates #50251
Backport of #52962
2020-03-06 09:47:03 -07:00
Mark Vieira 09a3f45880
Mute ClassificationIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-03-06 07:38:04 -08:00
James Baiera 01f00df5cd
Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet 2020-03-06 07:37:57 -08:00
Benjamin Trent 85d7112e78
[ML] Fixing datafeed bwc tests (#52959)
Datafeed bwc tests have been muted for some time in the 7.x. This is because of date_histogram interval deprecation warnings.

This commit fixes the tests as much as possible while still handling deprecation warnings.
2020-03-06 10:27:21 -05:00
Dimitris Athanasiou 9abf537527
[7.x][ML] Improve DF analytics audits and logging (#53179) (#53218)
Adds audits for when the job starts reindexing, loading data,
analyzing, writing results. Also adds some info logging.

Backport of #53179
2020-03-06 13:47:27 +02:00
Nhat Nguyen 5476a49833 Revert "upgrade to lucene-snapshot-fa75139efea (#53150) (#53151)"
This reverts commit 058113aa42.
2020-03-05 17:33:00 -05:00
Nik Everett f32e4583d1
Add `allowed_warnings` to yaml tests (backport of #53139) (#53173)
When we test backwards compatibility we often end up in a situation
where we *sometimes* get a warning, and sometimes don't. Like, we won't
get the warning if we're testing against an older version, but we will
in a newer one. Or we won't get the warning if the request randomly
lands on a node with an old version of the code. But we would if it
landed on a node with newer code.

This adds `allowed_warnings` to our yaml test runner for those cases:
warnings declared this way are "allowed" but not "required".
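
The difference between required and allowed warnings can be sketched as a set comparison. `check_warnings` is a hypothetical helper written for illustration; it does not reproduce the yaml test runner's code.

```python
def check_warnings(actual, required=(), allowed=()):
    """Return (missing, unexpected) warning lists; both empty means the check passes.

    required: warnings that must be present in the response.
    allowed:  warnings that may be present but are not demanded.
    """
    actual, required, allowed = set(actual), set(required), set(allowed)
    missing = required - actual                # a required warning did not appear
    unexpected = actual - required - allowed   # a warning nobody declared
    return sorted(missing), sorted(unexpected)
```

In a mixed-version cluster the same request can pass with or without the deprecation warning, as long as the warning is declared as allowed.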

Blocks #52959

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-03-05 17:11:54 -05:00
Benjamin Trent af0b1c2860
[ML] Fix minor race condition in dataframe analytics _stop (#53029) (#53164)
Tests have been periodically failing due to a race condition on checking a recently `STOPPED` task's state. The `.ml-state` index is not created until the task has already been transitioned to `STARTED`. This allows the `_start` API call to return. But, if a user (or test) immediately attempts to `_stop` that job, the job could stop and the task removed BEFORE the `.ml-state|stats` indices are created/updated.

This change moves towards the task cleaning up after itself in its main execution thread. `stop` flips the flag of the task to `isStopping`, and we now check `isStopping` at every necessary method, allowing the task to stop gracefully.

closes #53007
2020-03-05 09:59:18 -05:00
Benjamin Trent 181ee3ae0b
[ML] specifying missing_field_value value and using it instead of empty_string (#53108) (#53165)
For analytics, we need a consistent way of indicating when a value is missing. Inheriting from anomaly detection, analysis sent `""` when a field is missing. This works fine with numbers, but the underlying analytics process actually treats `""` as a category in categorical values. 

Consequently, you end up with this situation in the resulting model
```
{
              "frequency_encoding" : {
                "field" : "RainToday",
                "feature_name" : "RainToday_frequency",
                "frequency_map" : {
                  "" : 0.009844409027270245,
                  "No" : 0.6472019970785184,
                  "Yes" : 0.6472019970785184
                }
              }
            }
```
For inference this is a problem, because inference will treat missing values as `null` and thus not include them on the infer call against the model.

This PR takes advantage of our new `missing_field_value` option and supplies `\0` as the value.
2020-03-05 09:50:52 -05:00
Aleksandr Maus 2dc872f052
EQL: Add HLRC for EQL stats (#53043) (#53148) 2020-03-05 09:20:38 -05:00
Adrien Grand 360ac1997f Fix test failures with the new `constant_keyword` field. (#53153)
This test failed because YAML tests randomly install an index template
that updates the default number of shards to 2.

Closes #53131
2020-03-05 14:29:13 +01:00
Nik Everett 28df7ae5ed
Support multiple metrics in `top_metrics` agg (backport of #52965) (#53163)
This adds support for returning multiple metrics to the `top_metrics`
agg. It looks like:
```
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "v"},
          {"field": "m"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
```
2020-03-05 08:12:01 -05:00
David Roberts 01504df876 [TEST] Force close failed job before skipping test (#53128)
The assumption added in #52631 skips a problematic test
if it fails to create the required conditions for the
scenario it is supposed to be testing.  (This happens
very rarely.)

However, before skipping the test it needs to remove the
failed job it has created because the standard test
cleanup code treats failed jobs as fatal errors.

Closes #52608
2020-03-05 10:52:41 +00:00
Armin Braun 204c366a4e
Upgrade GCS SDK to 1.104.0 (#52839) (#53152)
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
2020-03-05 11:18:18 +01:00
Ignacio Vera 058113aa42
upgrade to lucene-snapshot-fa75139efea (#53150) (#53151) 2020-03-05 10:04:05 +01:00
Lisa Cawley 859c6441b3 [DOCS] Adds PKI delegation.enabled example (#53030) 2020-03-04 14:59:45 -08:00
Ross Wolf a5e82d7fd6
EQL: Add explicit 'any where ...' handling (#52526) 2020-03-04 10:11:03 -07:00
Nik Everett 609c61f75c
Formalize usage stats for analytics (backport of #52966) (#53077)
This moves the usage statistics gathering from the `AnalyticsPlugin`
into an `AnalyticsUsage`, removing the static state. It also checks the
license level when parsing all analytics aggregations. This is how we
were checking them before but we did it in an easy to forget way. This
way is slightly simpler, I think.
2020-03-04 10:29:11 -05:00
Martijn van Groningen 3fa5395ac8
Use correct issue number: #52453 2020-03-04 16:17:55 +01:00
Martijn van Groningen 2e325e24cb
Mute testMonitorClusterHealth test (#53109)
Relates to #36782
2020-03-04 16:08:19 +01:00
Martijn van Groningen b77f6746d1
unmute watcher single node test case
relates to #36782
2020-03-04 15:25:17 +01:00
Aleksandr Maus b47bffba24
EQL: consistent naming for event type vs event category (#53073) (#53090)
Related to https://github.com/elastic/elasticsearch/issues/52941
2020-03-04 08:02:38 -05:00
Marios Trivyzas e180e2738a
SQL: [Tests] Add tests for optimization of aliased expressions (#53048)
Add a unit test to verify that the optimization of expression
(e.g. COALESCE) is applied to all instances of the expression:
SELECT, WHERE, GROUP BY and HAVING.

Relates to #35270

(cherry picked from commit 2ceedc7f2019fad92cd86679af1a9c6fa594aa8d)
2020-03-04 11:48:06 +01:00
Marios Trivyzas 1d5c842700 SQL: Fix column size for IP data type (#53056)
Set size/displaySize to 45 which is the maximum string for
an IP (v6), since IPs are returned as strings.
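
The figure of 45 can be checked with a short calculation: the longest textual form of an IPv6 address is the one that embeds an IPv4 address in dotted notation in its last 32 bits. The constant names below are chosen for this illustration only.

```python
import ipaddress

# Six 4-digit hex groups with trailing colons (6 * 5 = 30 characters)
# followed by a dotted IPv4 suffix of up to 15 characters.
LONGEST_IPV6_TEXT = "ffff:ffff:ffff:ffff:ffff:ffff:255.255.255.255"

# Parsing succeeds, so this is a syntactically valid IPv6 address.
ipaddress.IPv6Address(LONGEST_IPV6_TEXT)

IP_DISPLAY_SIZE = len(LONGEST_IPV6_TEXT)

# For comparison: the longest pure-hex IPv6 form and the longest IPv4 form.
FULL_HEX_IPV6_SIZE = len("ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff")
LONGEST_IPV4_SIZE = len("255.255.255.255")
```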

Fixes: #52762

(cherry picked from commit 815f01747a4d54a274ca248af6fc08e5ea0728c1)
2020-03-04 10:36:44 +01:00
Mark Vieira 4b528d97ad
Consolidate duplication of BWC testing task setup in script plugin (#53079)
(cherry picked from commit 33fc8e7ebfac8d47a5f9f026b3836bb47bea141a)
2020-03-03 14:43:02 -08:00