Commit Graph

4650 Commits

Author SHA1 Message Date
Dimitris Athanasiou 60153c5433
[7.x][ML] Data frame analytics analysis stats (#53788) (#53844)
Adds parsing and indexing of analysis instrumentation stats.
The latest one is also returned from the get-stats API.

Note that we chose to duplicate objects even where they are currently
similar. There are already ideas on how these will diverge in the future
and while the duplication looks ugly at the moment, it is the option
that offers the highest flexibility.

Backport of #53788
2020-03-20 12:11:53 +02:00
Ryan Ernst b8ef830c0a
Decouple AuditTrailService from AuditTrail (#53450) (#53760)
The AuditTrailService has historically been an AuditTrail itself, acting
as a composite of the configured audit trails. This commit removes that
interface from the service and instead builds a composite delegating
implementation internally. The service now has a single get() method to
get an AuditTrail implementation which may be called. If auditing is not
allowed by the license, an empty noop version is returned.
2020-03-19 14:39:01 -07:00
Christoph Büscher d846ea43f4
Fix ReloadSynonymAnalyzerIT failure (#53663) (#53806)
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.

Backport of #53663
2020-03-19 19:00:14 +01:00
Benjamin Trent 433952b595
[7.x] [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725) (#53808)
* [ML] only retry persistence failures when the failure is intermittent and stop retrying when analytics job is stopping (#53725)

This fixes two issues:


- Results persister would retry actions even if they are not intermittent. An example of an persistent failure is a doc mapping problem.
- Data frame analytics would continue to retry to persist results even after the job is stopped.

closes https://github.com/elastic/elasticsearch/issues/53687
2020-03-19 13:56:41 -04:00
Jake Landis cce60215d8
[7.x] Add Watcher to available rest resources (#53620) (#53764)
Prior to this commit Watcher explicitly copied test between two
projects with a copy task. This commit removes the explicit copy in favor
of adding the Watcher tests to the available restResources that may be
copied between projects.

This is how inter-project dependencies should be modeled. However, only
Watcher is included here since it is (currently) the only project with
inter-project test dependencies.
2020-03-19 12:29:36 -05:00
Jake Landis db3420d757
[7.x] Optimize which Rest resources are used by the Rest tests… (#53766)
This should help with Gradle's incremental compile such that projects
only depend upon the resources they use.

related #52114
2020-03-19 12:28:59 -05:00
Ignacio Vera dfc1d79ddf
Add support for distance queries on shape queries (#53468) (#53796)
With the upgrade to Lucene 8.5, XYShape field has support for distance queries. This change implements this new feature and removes the limitation.
2020-03-19 15:32:09 +01:00
Dominic Page b0884baf46
Geo shape query vs geo point backport (#53774)
Backport to 7x

Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By:  @iverase
2020-03-19 13:00:36 +01:00
Ioannis Kakavas 4a36894a48
Mute failing tests (#53781)
See #53738
2020-03-19 08:16:23 +02:00
Benjamin Trent 415d73c27d
[Transform] renamed _cat/transform to _cat/transforms (#53743) (#53771)
renaming _cat/transform to  _cat/transforms for uniformity with the other _cat apis.
2020-03-18 19:54:03 -04:00
Stuart Tettemer cdbee32f55
Scripting: Per-context script cache, default off (#52855) (#53756)
* Adds per context settings:
  `script.context.${CONTEXT}.cache_max_size` ~
  `script.cache.max_size`

  `script.context.${CONTEXT}.cache_expire` ~
  `script.cache.expire`

  `script.context.${CONTEXT}.max_compilations_rate` ~
  `script.max_compilations_rate`

* Context cache is used if:
  `script.max_compilations_rate=use-context`.  This
  value is dynamically updatable, so users can
  switch back to the general cache if desired.

* Settings for context caches take the first value
  that applies:
  1) Context specific settings if set, eg
     `script.context.ingest.cache_max_size`
  2) Correlated general setting is set to the non-default
     value, eg `script.cache.max_size`
  3) Context default

The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.

Using the general cache settings for the context caches
results in higher effective settings, since they are
multiplied across the number of contexts.  So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior it will avoid users snapping to a
value that is too low for them.

Backport of: #52855
Refs: #50152
2020-03-18 14:44:04 -06:00
Ioannis Kakavas af519cccff Revert "Mute TimeSeriesLifecycleActionsIT (#53741)"
This reverts commit df0ad7569b.
2020-03-18 18:51:06 +02:00
markharwood ae19802e29
Fix highlighter support in PinnedQuery and added test (#53716) (#53729)
CappedScoreQuery was not delegating queryVisitor calls

Closes #53699
2020-03-18 15:39:17 +00:00
Ioannis Kakavas df0ad7569b
Mute TimeSeriesLifecycleActionsIT (#53741)
see #53738
2020-03-18 17:38:24 +02:00
Luca Cavanna 75c367de13 [TEST] Replace agg key in async search yaml test (#53727)
Some clients have problems running this test as a numeric key is treated like an array index by default.
We can work around this by renaming the aggregation key to not be a numeric.
2020-03-18 16:16:15 +01:00
Benjamin Trent 2ccb963f1d
Create GET _cat/transforms API Issue (#53643) (#53726)
Adds new` _cat/transform` and `_cat/transform/{transform_id}` endpoints.
2020-03-18 10:45:28 -04:00
Alan Woodward 580bc40c0c Make it possible to deprecate all variants of a ParseField with no replacement (#53722)
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
2020-03-18 14:16:19 +00:00
Ioannis Kakavas e5aa0906f7
Mute testHistoryIsWrittenWithDeletion (#53721)
see #53718
2020-03-18 14:49:57 +02:00
Christoph Büscher 2384c1359d Revert "Fix ReloadSynonymAnalyzerIT failure (#53663)"
This reverts commit 2c32173fce.
2020-03-18 12:44:23 +01:00
Christoph Büscher 2c32173fce Fix ReloadSynonymAnalyzerIT failure (#53663)
There is an assertion in ReloadAnalyzersResponse.merge that compares index names
of merged responses that was falsely using object equality instead of
String.equals(). In the past this didn't seem to matter but with changes in the
test setup we started to see failures. Correcting this and also simplifying test
a bit to be able to run it repeatedly if needed.

Closes #53443
2020-03-18 11:55:37 +01:00
Przemysław Witek ec13c093df
Make ML index aliases hidden (#53160) (#53710) 2020-03-18 10:28:45 +01:00
Ioannis Kakavas 873d0ecd09
Fix potential bug in concurrent token refresh support (#53668) (#53705)
Ensure that we do not proceed execution after calling the
listerer's onFailure
2020-03-18 09:43:26 +02:00
Hendrik Muhs 7a12300ce6
[7.x][Transform] enhance the output of preview to return full… (#53695)
changes the output format of preview regarding deduced mappings and enhances
it to return all the details about auto-index creation. This allows the user
to customize the index creation. Using HLRC you can create a index request
from the output of the response.

backport #53572
2020-03-18 08:37:56 +01:00
Hendrik Muhs a6dca577e5 [Transform] data nanos/date histogram IT (#53654)
add an integration test for date nanos in combination with date_histogram
2020-03-17 20:58:57 +01:00
Ryan Ernst 5c472fcb47 Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642)
Re-applies the change from #53523 along with test fixes.

closes #53626
closes #53624
closes #53622
closes #53625

Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-03-17 10:28:51 -07:00
David Kyle 2b635737e1
[ML] Parse single named object in config classes (#53472) (#53542) 2020-03-17 13:59:52 +00:00
Alan Woodward 71b703edd1 Rename AtomicFieldData to LeafFieldData (#53554)
This conforms with lucene's LeafReader naming convention, and
matches other per-segment structures in elasticsearch.
2020-03-17 12:30:12 +00:00
Andrei Stefan 79600eb38b
SQL: add support for index aliases for SYS COLUMNS command (#53525) (#53653)
(cherry picked from commit f65e4d6ff7b2e00eb6f9c985fbe7cb24de00f045)
2020-03-17 12:49:08 +02:00
Hendrik Muhs a0314ad015 [Transform] add transform discovery node role (#53616)
Enhancement of #52712: Add a discovery node role using the letter t for transform.

Fixes #53156
2020-03-17 11:39:20 +01:00
Ioannis Kakavas 23af171cf8
Disallow Password Change when authenticated by Token (#49694) (#53614)
Password changes are only allowed when the user is currently
authenticated by a realm (that permits the password to be changed)
and not when authenticated by a bearer token or an API key.
2020-03-17 09:45:35 +02:00
Yang Wang 7f21ade924
Explicitly require that derived API keys have no privileges (#53647) (#53648)
The current implicit behaviour is that when an API keys is used to create another API key,
the child key is created without any privilege. This implicit behaviour is surprising and is
a source of confusion for users.

This change makes that behaviour explicit.
2020-03-17 17:56:37 +11:00
Tim Vernum 74dbdb991c
Avoid NPE in set_security_user without security (#53543)
If security was disabled (explicitly), then the SecurityContext would
be null, but the set_security_user processor was still registered.

Attempting to define a pipeline that used that processor would fail
with an (intentional) NPE. This behaviour, introduced in #52032, is a
regression from previous releases where the pipeline was allowed, but
was no usable.

This change restores the previous behaviour (with a new warning).

Backport of: #52691
2020-03-17 13:30:07 +11:00
Ryan Ernst e7f38674ed Add internalClusterTest to check task (#53444)
This commit adds internalClusterTest in xpack core to run as part of
check. This was accidentally removed in a refactoring. Other xpack
modules already do this, but core was left out. This commit also mutes 2
tests that currently fail.

closes #53407
2020-03-16 18:55:01 -07:00
Luca Cavanna c3d2417448
Cumulative backport of async search changes (#53635)
* Submit async search to work only with POST (#53368)

Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.

* Refine SearchProgressListener internal API (#53373)

The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final

* Move async search yaml tests to x-pack yaml test folder (#53537)

The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.

* [DOCS] Add temporary redirect for async-search (#53454)

The following API spec files contain a link to a not-yet-created
async search docs page:

* [async_search.delete.json][0]
* [async_search.get.json][1]
* [async_search.submit.json][2]

The Elaticsearch-js client uses these spec files to create their docs.
This created a broken link in the Elaticsearch-js docs, which has broken
the docs build.

This PR adds a temporary redirect for the docs page. This redirect
should be removed when the actual API docs are added.

[0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json
[1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json
[2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-03-17 00:08:17 +01:00
Nik Everett f0beab4041
Stop using round-tripped PipelineAggregators (backport of #53423) (#53629)
This begins to clean up how `PipelineAggregator`s and executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
   serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
   `PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
   really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.

This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregtion tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.

Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
2020-03-16 16:15:23 -04:00
Gordon Brown 880cc3ca7e
Hide I/SLM history aliases (#53564)
This commit adjusts the aliases used for the ILM and SLM history indices
to be hidden aliases.

Also tweaks the configuration of the `IndexTemplateRegistry`s used by
these history system to only upgrade the template from the master node,
as documents are indexed from the master node, so the template version
should only be upgraded from the master node.
2020-03-16 13:07:26 -06:00
Gordon Brown 031932b32f
Allow _cat indices & aliases to use indices options (#53248)
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.

Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
2020-03-16 11:25:05 -06:00
Alexander Reelsen 7571ca437a Disable Watcher script optimization for stored scripts (#53497)
The watcher TextTemplateEngine uses a fast path mechanism where it
checks for the existence of `{{` to decide if a mustache script
required compilation. This does not work for stored script, as the field
that is checked contains the id of the script, which means, the name of
the script is returned as its value.

This commit checks for the script type and does not involve this fast
path check if a stored script is used.

Closes #40212
2020-03-16 18:07:54 +01:00
Andrei Stefan 91ca9c5c33
QL: constant_keyword support (#53241) (#53602)
(cherry picked from commit d6cd4ce7849ba215407c8c5fa815c9b373fb8480)
2020-03-16 18:06:31 +02:00
jimczi dc2edc97f0 Fix sporadic failures in AsyncSearchActionTests (take 2)
This change removes the need to always get a new version when iterating
on an async search. This is needed since we cannot guarantee that shards will
be queried exactly in order.

Relates #53360
2020-03-16 16:52:23 +01:00
markharwood 2c74f3e22c
Backport of new wildcard field type (#53590)
* New wildcard field optimised for wildcard queries (#49993)

Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values.  Also supports aggregations and sorting.
2020-03-16 15:07:13 +00:00
Przemysław Witek 376b2ae735
[7.x] Make classification evaluation metrics work when there is field mapping type mismatch (#53458) (#53601) 2020-03-16 15:38:56 +01:00
Jim Ferenczi e6680be0b1
Add new x-pack endpoints to track the progress of a search asynchronously (#49931) (#53591)
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.

````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
  "aggs": {
    "date_histogram": {
      "field": "@timestamp",
      "fixed_interval": "1h"
    }
  }
}
````

If after 100ms the final response is not available, a `partial_response` is included in the body:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 1,
  "is_running": true,
  "is_partial": true,
  "response": {
   "_shards": {
       "total": 100,
       "successful": 5,
       "failed": 0
    },
    "total_hits": {
      "value": 1653433,
      "relation": "eq"
    },
    "aggs": {
      ...
    }
  }
}
````

The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:

````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````

That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 10,
  "is_running": false,
  "response": {
     "is_partial": false,
     ...
  }
}
````

Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.

Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed.

Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.

````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````

The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.

The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.

Relates #49091

Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
2020-03-16 15:31:27 +01:00
Marios Trivyzas 723034001c SQL: Fix NPE for parameterized LIKE/RLIKE (#53573)
Fix NPE when `null` is passed as a parameter for a parameterized
pattern of LIKE/RLIKE. e.g.: `field LIKE ?` params=[null]`
Check for null pattern in LIKE/RLIKE as for RLIKE (RegexpQuery) we
get an IllegalArgumentExpression from Lucence but for LIKE
(WildcardQuery) we get an NPE.

Fixes: #53557
(cherry picked from commit ec3481ed13254ecdec32acf7a0fafd536ec77aff)
2020-03-16 14:44:48 +01:00
Dimitris Athanasiou 94da4ca3fc
[7.x][ML] Extend classification to support multiple classes (#53539) (#53597)
Prepares classification analysis to support more than just
two classes. It introduces a new parameter to the process config
which dictates the `num_classes` to the process. It also
changes the max classes limit to `30` provisionally.

Backport of #53539
2020-03-16 15:00:54 +02:00
David Kyle a38e5ca8e7
Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure (#53595)
Failure tracked in #50353
2020-03-16 12:30:56 +00:00
Marios Trivyzas 1272ae411e SQL: Fix issue with LIKE/RLIKE as painless script (#53495)
Add missing asScript() implementation for LIKE/RLIKE expressions.

When LIKE/RLIKE are used for example in GROUP BY or are wrapped with
scalar functions in a WHERE clause, the translation must produce a
painless script which will be executed to implement the correct
behaviour and previously this was completely missing, and as a
consquence wrong results were silently (no error) returned.

Fixes: #53486
(cherry picked from commit eaa8ead6742a8e7dcf343bcbaff8de031550fd77)
2020-03-16 12:27:45 +01:00
Martijn van Groningen 3b9545848f
Reenable watcher rest tests (#53532)
Also log a message instead of failing if there are active watches at a beginning of a test.

Relates to #53177
2020-03-16 10:24:14 +01:00
Mark Vieira 2f0aca992b
Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)"
This reverts commit b7dbadeea0.
2020-03-15 18:10:40 -07:00
Jason Tedor b7dbadeea0
Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)
This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2
dependency to 2.13.1.

Relates #53523
2020-03-14 13:28:06 -04:00
Benjamin Trent 1262ab2762
[ML] [Inference] fix number inference models returned in x-pack info call (#53540) (#53560)
the ML portion of the x-pack info API was erroneously counting configuration documents and definition documents. The underlying implementation of our storage separates the two out.

This PR filters the query so that only trained model config documents are counted.
2020-03-13 16:53:34 -04:00
Benjamin Trent 4e43ede735
[ML] renaming inference processor field field_mappings to new name field_map (#53433) (#53502)
This renames the `inference` processor configuration field `field_mappings` to `field_map`.

`field_mappings` is now deprecated.
2020-03-13 15:40:57 -04:00
Tom Veasey 690099553c
[7.x][ML] Adds the class_assignment_objective parameter to classification (#53552)
Adds a new parameter for classification that enables choosing whether to assign labels to
maximise accuracy or to maximise the minimum class recall.

Fixes #52427.
2020-03-13 17:35:51 +00:00
Tim Vernum a8677499d7
[Backport] Add support for secondary authentication (#53530)
This change makes it possible to send secondary authentication
credentials to select endpoints that need to perform a single action
in the context of two users.

Typically this need arises when a server process needs to call an
endpoint that users should not (or might not) have direct access to,
but some part of that action must be performed using the logged-in
user's identity.

Backport of: #52093
2020-03-13 16:30:20 +11:00
Tim Vernum bac1740d44
Support authentication without anonymous user (#53528)
This change adds a new parameter to the authenticate methods in the
AuthenticationService to optionally exclude support for the anonymous
user (if an anonymous user exists).

Backport of: #52094
2020-03-13 14:27:29 +11:00
Nik Everett 9dcd64c110
Preserve metric types in top_metrics (backport of #53288) (#53440)
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
2020-03-12 17:17:09 -04:00
Jason Tedor 5b08ea84c9
Add deprecation check for listener thread pool (#53438)
This commit adds a deprecation check for the listener thread pool
settings as these will be removed in 8.0.0.
2020-03-12 14:32:41 -04:00
Jay Modi af36665b08
Deprecate the logstash enabled setting (#53487)
The setting, `xpack.logstash.enabled`, exists to enable or disable the
logstash extensions found within x-pack. In practice, this setting had
no effect on the functionality of the extension. Given this, the
setting is now deprecated in preparation for removal.

Backport of #53367
2020-03-12 10:18:39 -06:00
Dan Hermann 34adfd9611
Validate SSL settings at parse time (#49196) (#53473) 2020-03-12 10:14:51 -05:00
Aleksandr Maus 31d45b3c95
EQL: Improve query folder test suite (#53187) (#53476)
Related to https://github.com/elastic/elasticsearch/issues/52775
2020-03-12 10:58:07 -04:00
Yannick Welsch 48124807d5 Fix SourceOnlySnapshotIT (#53462)
The tests in this class had been failing for a while, but went unnoticed as not tested by CI (see #53442).

The reason the tests fail is that the can-match phase is smarter now, and filters out access to a non-existing field.

Closes #53442
2020-03-12 14:15:03 +01:00
Jason Tedor d8e70d4688
Enable deprecation checks for removed settings (#53317)
Today we do not have any infrastructure for adding a deprecation check
for settings that are removed. This commit enables this by adding such
infrastructure. Note that this infrastructure is unused in this commit,
which is deliberate. However, the primary target for this commit is 7.x
where this infrastructue will be used, in a follow-up.
2020-03-11 16:49:16 -04:00
Benjamin Trent 89668c5ea0
[ML][Inference] adds new default_field_map field to trained models (#53294) (#53419)
Adds a new `default_field_map` field to trained model config objects.

This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
2020-03-11 13:49:39 -04:00
Jay Modi 9a21a8abf2
Opt-in logstash plugin to formatting (#53413)
This change opts-in the logstash plugin for enforced formatting.

Backport of #53370
2020-03-11 09:58:37 -06:00
Nhat Nguyen 6665ebe7ab Harden search context id (#53143)
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:

1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index

This commit avoids these issues by adding a UUID to SearchContexId.
2020-03-11 11:48:11 -04:00
Przemysław Witek 8c4c19d310
Perform evaluation in multiple steps when necessary (#53295) (#53409) 2020-03-11 15:36:38 +01:00
Przemysław Witek 063957b7d8
Simplify "refresh" calls. (#53385) (#53393) 2020-03-11 12:26:11 +01:00
Dimitris Athanasiou cc7751eb16
[7.x][ML] Add ILM policy to ml stats indices (#53349) (#53392)
Adds a size based ILM policy to automatically
rollover ml stats indices.

Backport of #53349
2020-03-11 13:01:34 +02:00
Dimitris Athanasiou 0fd0516d0d
[7.x][ML] Rename data frame analytics maximum_number_trees to max_trees (#53300) (#53390)
Deprecates `maximum_number_trees` parameter of classification and
regression and replaces it with `max_trees`.

Backport of #53300
2020-03-11 12:45:27 +02:00
David Roberts 532a720e1b
[ML] Skeleton estimate_model_memory endpoint for anomaly detection (#53386)
This is a partial implementation of an endpoint for anomaly
detector model memory estimation.

It is not complete, lacking docs, HLRC and sensible numbers
for many anomaly detector configurations.  These will be
added in a followup PR in time for 7.7 feature freeze.

A skeleton endpoint is useful now because it allows work on
the UI side of the change to commence.  The skeleton endpoint
handles the same cases that the old UI code used to handle,
and produces very similar estimates for these cases.

Backport of #53333
2020-03-11 10:20:00 +00:00
Jake Landis 2ab502afc4
[7.x] Remove dead 'beats' code (#53312) (#53376) 2020-03-10 20:57:29 -05:00
Nhat Nguyen 24f114766f Fix doc_stats and segment_stats of ReadOnlyEngine (#53345)
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters out them. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.

This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.

Closes #51303
2020-03-10 21:51:33 -04:00
Nhat Nguyen cad02d4a31 Increase timeout testFollowIndexMaxOperationSizeInBytes (#53014)
Replicating 1000 documents one by one (as we cap the request size at 
1 byte) can take more than 10 seconds on a slow CI.

Closes #52812
2020-03-10 21:51:33 -04:00
William Brafford 3494c73c8d
Mute failing tests (#53362) (#53363) 2020-03-10 16:01:31 -04:00
Przemko Robakowski 847ac9c7d7
Fix null config in SnapshotLifecyclePolicy.toRequest (#53328) (#53355)
This avoids NPE when executing SLM policy when no config was provided.

Related to #44465

Closes #53171

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-10 20:44:30 +01:00
Przemysław Witek d54d7f2be0
[7.x] Implement ILM policy for .ml-state* indices (#52356) (#53327) 2020-03-10 14:24:18 +01:00
Benjamin Trent 856d9bfbc1
[ML] fixing data frame analysis test when two jobs are started in succession quickly (#53192) (#53332)
A previous change (#53029) is causing analysis jobs to wait for certain indices to be made available. While this it is good for jobs to wait, they could fail early on _start. 

This change will cause the persistent task to continually retry node assignment when the failure is due to shards not being available.

If the shards are not available by the time `timeout` is reached by the predicate, it is treated as a _start failure and the task is canceled. 

For tasks seeking a new assignment after a node failure, that behavior is unchanged.


closes #53188
2020-03-10 08:30:47 -04:00
Hendrik Muhs 5912895838 [Transform] wait for transform templates in Rest integration t… (#53330)
add transform templates to the list of templates to be installed before
executing tests
2020-03-10 13:22:12 +01:00
Hendrik Muhs 696aa4ddaf
[7.x][Transform] add support for script in group_by (#53167) (#53324)
add the possibility to base the group_by on the output of a script.

closes #43152
backport #53167
2020-03-10 11:12:58 +01:00
Alan Woodward 5c861cfe6e Upgrade to final lucene 8.5.0 snapshot (#53293)
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
2020-03-10 09:32:59 +00:00
Cauê Marcondes b68d7b1c33
giving kibana user privileges to create custom link index (#53221) (#53278) 2020-03-10 09:50:38 +01:00
Henning Andersen a4d481f2bb ILM Freeze step retry when not acknowledged (#53287)
A freeze operation can partially fail in multiple places, including the
close verification step. This left the index in an unfrozen but
partially closed state. Now throw an exception to retry the freeze step
instead.
2020-03-10 08:03:39 +01:00
Gordon Brown 1cb0a4399d
Fix Get Alias API handling of hidden indices with visible aliases (#53147)
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
2020-03-09 16:16:29 -06:00
Przemko Robakowski f075d70cf8
[7.x] Avoid race condition in ILMHistorySotre (#53039) (#53094)
* Avoid race condition in ILMHistorySotre (#53039)

* Avoid race condition in ILMHistorySotre

This change modifies ILMHistoryStore to always apply correct settings and mappings,
even if template is deleted and not yet recreated. This ensures that ILM history index
is correctly managed by ILM and also fixes flaky history tests that were prone to
triggenring this race.

This commit also refactors and simplifies ILM history tests.

Closes #50353 and #52853

* Review comment

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* fixed tests

* backport #53306

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-09 22:24:15 +01:00
Bogdan Pintea 62c8ac9993
SQL: transfer version compatibility decision to the server (#53082) (#53302)
This commit adds a new request object field, "version", containing the version of the requesting client. This parameter is now accepted - and for certain clients required - by the server and the request is validated against it. Currently server's and client's versions still need to be equal in order for the request to be accepted. Relaxing this check is going to be part of future work. 

On the clients' side, the only check remaining is to ensure that the peer server is supporting version backwards compatibility (i.e. is on, or newer than a certain release).

(cherry picked from commit a8f413a20fb023bec83af0de1211a2936a7f558c)
2020-03-09 21:16:57 +01:00
Aleksandr Maus d064846416
EQL: Test infrastructure improvements (#53253) (#53297)
Update CommonEqlRestTestCase code to simplify making changes as requested.
Update EqlActionIT to simplify the test code as requested.
Replace Jackson parser with XContent in EqlActionIT.
Whitelist more EQL tests specs that are now supported.
2020-03-09 14:11:54 -04:00
Ross Wolf f5f922c6f6
EQL: Add IsNull/IsNotNull checks (#52791)
* EQL: Add IsNull/IsNotNull checks
* EQL: Simplify IsNull/IsNotNull optimization
* EQL: Split string tests over multiple lines
2020-03-09 10:41:04 -06:00
Jason Tedor 8ad0080a59
Fork CCR checkpoint listeners on CCR thread pool (#53265)
This commit moves the global checkpoint listeners used in CCR to the CCR
thread pool. This removes the last use of the listener thread pool in
the codebase.
2020-03-09 08:56:30 -04:00
Martijn van Groningen 7775ddbc9c
Verify watch_count before a test starts and not after a test.
This check was added as part of: 0f2d26bdca

Checking this before the test starts makes more sense, because
the watches index has then also be removed.

Relates to #53177
2020-03-09 07:45:44 +01:00
Jason Tedor 5e96d3e59a
Use given executor for global checkpoint listener (#53260)
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
2020-03-08 13:51:05 -04:00
Gordon Brown ff9b8bda63
Implement hidden aliases (#52547)
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.

The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
2020-03-06 16:02:38 -07:00
Ross Wolf d6813cb348
EQL: Convert wildcards to LIKE in analyzer (#51901)
* EQL: Convert wildcard comparisons to Like
* EQL: Simplify wildcard handling, update tests
* EQL: Lint fixes for Optimizer.java
2020-03-06 13:13:07 -07:00
Mayya Sharipova f96ad5c32d Mute testSingleNumericFeatureAndMixedTrainingAndNonTrainingRows 2020-03-06 12:48:05 -05:00
Jay Modi a81460dbf5
Make watch history indices hidden (#52974)
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.

Relates #50251
Backport of #52962
2020-03-06 09:47:03 -07:00
Mark Vieira 09a3f45880
Mute ClassificationIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-03-06 07:38:04 -08:00
James Baiera 01f00df5cd
Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet 2020-03-06 07:37:57 -08:00
Dimitris Athanasiou 9abf537527
[7.x][ML] Improve DF analytics audits and logging (#53179) (#53218)
Adds audits for when the job starts reindexing, loading data,
analyzing, writing results. Also adds some info logging.

Backport of #53179
2020-03-06 13:47:27 +02:00
Nhat Nguyen 5476a49833 Revert "upgrade to lucene-snapshot-fa75139efea (#53150) (#53151)"
This reverts commit 058113aa42.
2020-03-05 17:33:00 -05:00
Benjamin Trent af0b1c2860
[ML] Fix minor race condition in dataframe analytics _stop (#53029) (#53164)
Tests have been periodically failing due to a race condition on checking a recently `STOPPED` task's state. The `.ml-state` index is not created until the task has already been transitioned to `STARTED`. This allows the `_start` API call to return. But, if a user (or test) immediately attempts to `_stop` that job, the job could stop and the task removed BEFORE the `.ml-state|stats` indices are created/updated.

This change moves towards the task cleaning up itself in its main execution thread. `stop` flips the flag of the task to `isStopping` and now we check `isStopping` at every necessary method. Allowing the task to gracefully stop.

closes #53007
2020-03-05 09:59:18 -05:00
Benjamin Trent 181ee3ae0b
[ML] specifying missing_field_value value and using it instead of empty_string (#53108) (#53165)
For analytics, we need a consistent way of indicating when a value is missing. Inheriting from anomaly detection, analysis sent `""` when a field is missing. This works fine with numbers, but the underlying analytics process actually treats `""` as a category in categorical values. 

Consequently, you end up with this situation in the resulting model
```
{
              "frequency_encoding" : {
                "field" : "RainToday",
                "feature_name" : "RainToday_frequency",
                "frequency_map" : {
                  "" : 0.009844409027270245,
                  "No" : 0.6472019970785184,
                  "Yes" : 0.6472019970785184
                }
              }
            }
```
For inference this is a problem, because inference will treat missing values as `null`. And thus not include them on the infer call against the model.

This PR takes advantage of our new `missing_field_value` option and supplies `\0` as the value.
2020-03-05 09:50:52 -05:00
Aleksandr Maus 2dc872f052
EQL: Add HLRC for EQL stats (#53043) (#53148) 2020-03-05 09:20:38 -05:00
Adrien Grand 360ac1997f Fix test failures with the new `constant_keyword` field. (#53153)
This test failed because YAML tests randomly install an index template
that updates the default number of shards to 2.

Closes #53131
2020-03-05 14:29:13 +01:00
Nik Everett 28df7ae5ed
Support multiple metrics in `top_metrics` agg (backport of #52965) (#53163)
This adds support for returning multiple metrics to the `top_metrics`
agg. It looks like:
```
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "v"},
          {"field": "m"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
```
2020-03-05 08:12:01 -05:00
David Roberts 01504df876 [TEST] Force close failed job before skipping test (#53128)
The assumption added in #52631 skips a problematic test
if it fails to create the required conditions for the
scenario it is supposed to be testing.  (This happens
very rarely.)

However, before skipping the test it needs to remove the
failed job it has created because the standard test
cleanup code treats failed jobs as fatal errors.

Closes #52608
2020-03-05 10:52:41 +00:00
Ignacio Vera 058113aa42
upgrade to lucene-snapshot-fa75139efea (#53150) (#53151) 2020-03-05 10:04:05 +01:00
Ross Wolf a5e82d7fd6
EQL: Add explicit 'any where ...' handling (#52526) 2020-03-04 10:11:03 -07:00
Nik Everett 609c61f75c
Formalize usage stats for analytics (backport of #52966) (#53077)
This moves the usage statistics gathering from the `AnalyticsPlugin`
into an `AnalyicsUsage`, removing the static state. It also checks the
license level when parsing all analytics aggregations. This is how we
were checking them before but we did it in an easy to forget way. This
way is slightly simpler, I think.
2020-03-04 10:29:11 -05:00
Martijn van Groningen 3fa5395ac8
Use correct issue number: #52453 2020-03-04 16:17:55 +01:00
Martijn van Groningen 2e325e24cb
Mute testMonitorClusterHealth test (#53109)
Relates to #36782
2020-03-04 16:08:19 +01:00
Martijn van Groningen b77f6746d1
unmute watcher single node test case
relates to #36782
2020-03-04 15:25:17 +01:00
Aleksandr Maus b47bffba24
EQL: consistent naming for event type vs event category (#53073) (#53090)
Related to https://github.com/elastic/elasticsearch/issues/52941
2020-03-04 08:02:38 -05:00
Marios Trivyzas e180e2738a
SQL: [Tests] Add tests for optimization of aliased expressions (#53048)
Add a unit test to verify that the optimization of expression
(e.g. COALESCE) is applied to all instances of the expression:
SELECT, WHERE, GROUP BY and HAVING.

Relates to #35270

(cherry picked from commit 2ceedc7f2019fad92cd86679af1a9c6fa594aa8d)
2020-03-04 11:48:06 +01:00
Marios Trivyzas 1d5c842700 SQL: Fix column size for IP data type (#53056)
Set size/displaySize to 45 which is the maximum string for
an IP (v6), since IPs are returned as strings.

Fixes: #52762

(cherry picked from commit 815f01747a4d54a274ca248af6fc08e5ea0728c1)
2020-03-04 10:36:44 +01:00
Jay Modi c610e0893d
Introduce system index APIs for Kibana (#53035)
This commit introduces a module for Kibana that exposes REST APIs that
will be used by Kibana for access to its system indices. These APIs are wrapped
versions of the existing REST endpoints. A new setting is also introduced since
the Kibana system indices' names are allowed to be changed by a user in case
multiple instances of Kibana use the same instance of Elasticsearch.

Additionally, the ThreadContext has been extended to indicate that the use of
system indices may be allowed in a request. This will be built upon in the future
for the protection of system indices.

Backport of #52385
2020-03-03 14:11:36 -07:00
Andrei Stefan 9ad9ad7a6b
SQL: update SqlNodeSubclassTests list of min-two-parameters functions list (#53045) (#53058)
(cherry picked from commit c741e49d9f5e7b78c1a78e1af97eb19354fe6864)
2020-03-03 19:37:37 +02:00
Adrien Grand cb868d2f5e
Introduce a `constant_keyword` field. (#49713) (#53024)
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.

The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?

For this field there is a choice between
 1. accepting values in `_source` when they are equal to the value configured
    in mappings, but rejecting mapping updates
 2. rejecting values in `_source` but then allowing updates to the value that
    is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.

Backport of #49713
2020-03-03 16:01:47 +01:00
Yang Wang 70814daa86
Allow _rollup_search with read privilege (#52043) (#53047)
Currently _rollup_search requires manage privilege to access. It should really be
a read only operation. This PR changes the requirement to be read indices privilege.

Resolves: #50245
2020-03-03 22:29:54 +11:00
Martijn van Groningen 510db25dd0
Simplify watcher indexing listener.(#53046)
Backport: #52627

Add watcher to trigger server after index operation has succeeded,
instead of adding a watch to trigger service before
the actual index operation has performed on the shard level.

This logic is simpler to reason about in the case that a failure
does occur during the execution of an index operation on
the shard level.

Relates to #52453, but I think doesn't fix it, but makes it easier
to debug.
2020-03-03 11:01:57 +01:00
Hendrik Muhs 844f350774 [Transform] restructure transform yaml tests (#52956)
restructure transform yaml tests to run cleanup in teardown phase

relates #52428
2020-03-03 10:31:22 +01:00
Hendrik Muhs d9258e210e [Transform] fix sporadic race condition in TransformUsageIT (#52946)
relax the test for trigger count

fixes #52931
2020-03-03 10:27:36 +01:00
Costin Leau 712e0c05cd EQL: Add implicit ordering on timestamp (#53004)
QL: Move Sort base class from SQL to QL
(cherry picked from commit 798015b7bbd565e9c4222724614baeb432c7c2b3)
2020-03-02 22:41:36 +02:00
Mark Vieira f8396e8d15
Mute RunDataFrameAnalyticsIT.testStopOutlierDetectionWithEnoughDocumentsToScroll
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-03-02 09:21:55 -08:00
Mark Vieira 5b5e92c71d
Mute NodeSubclassTests.testReplaceChildren
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-03-02 09:21:54 -08:00
Lisa Cawley 4fbe1b0550
[DOCS] Adds cat anomaly detectors API (#52866) (#52970) 2020-03-02 07:28:55 -08:00
Hendrik Muhs a328a8eaf1
[7.x][Transform] implement node.transform to control where to… (#52998)
implement transform node attributes to disable transform on certain nodes and
test which nodes are allowed to do remote connections

closes #52200
closes #50033
closes #48734

backport #52712
2020-03-02 16:10:57 +01:00
Aleksandr Maus 89ed857c79
EQL: Change request parameter query to filter and rule to query (#52971) (#53006)
Related to https://github.com/elastic/elasticsearch/issues/52911
2020-03-02 09:26:23 -05:00
Andrei Stefan 6fecc1db84
Issue a different error message in case an index doesn't have a mapping (#52967) (#53003)
(cherry picked from commit a0bd83a0579cf196a1d727de2a46b3b101d5a73b)
2020-03-02 14:04:49 +02:00
Andrei Stefan 69383acecf
Define list of Nodes that have minimum two children in tests (#52957) (#52994)
(cherry picked from commit c1e43e694f02edf3e197abbab7c21008c022b516)
2020-03-02 11:26:50 +02:00
Hendrik Muhs 49f41d127b [Transform] fix NPE in derive stats if shouldStopAtNextCheckpo… (#52940)
fixes a NPE in _stats in case shouldStopAtNextCheckpoint is set.
2020-03-02 08:11:01 +01:00
Martijn van Groningen d102158e6f
Improve closing mock webserver when failed to start (#52943)
Fix NPE when closing a webserver that hasn't started correctly.

This can happen when ssl context isn't initialized. The server instance is then never set,
which causes an NPE that masks the actual failure.

Example stacktrace that would mask an actual failure:

```
java.lang.NullPointerException
	at org.elasticsearch.test.http.MockWebServer.close(MockWebServer.java:271)
	at org.elasticsearch.xpack.watcher.test.integration.HttpSecretsIntegrationTests.cleanup(HttpSecretsIntegrationTests.java:70)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
```
2020-03-02 07:19:08 +01:00
Nhat Nguyen e6755afeeb
Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950) (#52977)
To give LUCENE-9228 more CI cycles
2020-02-29 09:29:16 -05:00
Dimitris Athanasiou 85b4e45093
[7.x]ML] Parse and report memory usage for DF Analytics (#52778) (#52980)
Adds reporting of memory usage for data frame analytics jobs.
This commit introduces a new index pattern `.ml-stats-*` whose
first concrete index will be `.ml-stats-000001`. This index serves
to store instrumentation information for those jobs.

Backport of #52778 and #52958
2020-02-29 13:03:40 +02:00
Luca Cavanna 090bdf69c0
Mute NodeSubclassTests#testReplaceChildren (#52952)
Relates #52951
2020-02-28 16:13:17 +01:00
Andrei Stefan c3a167830f
SQL: refactor In predicate moving it to QL project (#52870) (#52938)
* Move In, InPipe and InProcessor out of SQL to the common QL project.
* Move tests classes to the QL project.
* Create SQL dedicated In class to handle SQL specific data types.
* Update SQL classes to use the InPipe and InProcessor QL classes.
* Extract common Foldables methods in QL project.
* Be more explicit when folding and converting a foldable value, by
removing most of the code inside Foldables class.

(cherry picked from commit 7425042f86f66df8c207c5e96f9b9848bda2b4c3)
2020-02-28 14:04:10 +02:00
Costin Leau a674085903 EQL: Disable field extraction for returned events (#52884)
Return the whole source of matching events

(cherry picked from commit 79ca586ab1d89d645fb58142b82202f14ce5d361)
2020-02-28 13:48:15 +02:00
Yang Wang 82553524af
Respect runas realm for ApiKey security operations (#52178) (#52932)
When user A runs as user B and performs any API key related operations,
user B's realm should always be used to associate with the API key.
Currently user A's realm is used when getting or invalidating API keys
and owner=true. The PR is to fix this bug.

resolves: #51975
2020-02-28 10:53:52 +11:00
Nik Everett 866b08716c
Fix test for top_metrics (#52927)
I added the wrong skips and the wrong error message. Ooops.
2020-02-27 18:30:37 -05:00
Nik Everett 1d1956ee93
Add size support to `top_metrics` (backport of #52662) (#52914)
This adds support for returning the top "n" metrics instead of just the
very top.

Relates to #51813
2020-02-27 16:12:52 -05:00
Benjamin Trent 19a6c5d980
[7.x] [ML][Inference] Add support for multi-value leaves to the tree model (#52531) (#52901)
* [ML][Inference] Add support for multi-value leaves to the tree model (#52531)

This adds support for multi-value leaves. This is a prerequisite for multi-class boosted tree classification.
2020-02-27 14:05:28 -05:00
Benjamin Trent eac38e9847
[ML] Add indices_options to datafeed config and update (#52793) (#52905)
This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index.

This is necessary for the following use cases:
 - Reading from frozen indices
 - Allowing certain indices in multiple index patterns to not exist yet

These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object.

closes https://github.com/elastic/elasticsearch/issues/48056
2020-02-27 13:43:25 -05:00
Henning Andersen 09fe4b42db Disable ILM history in x-pack rest tests (#52868)
The ILM history index can be delayed created from one test into the
next, which can cause issues for tests using `_all`.

Closes #52209
2020-02-27 17:20:33 +01:00
David Kyle d8bdf31110 Revert "Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart"
This reverts commit ad3a3b1af9.
2020-02-27 12:38:13 +00:00
David Kyle 6e5e64559a
Unwrap cause from remote ActionTransportExceptions (#52842) (#52878)
And log the cause
2020-02-27 11:58:28 +00:00
István Zoltán Szabó 4a33352a94 [DOCS] Adds cat trained model API documentation (#52824) 2020-02-27 12:54:11 +01:00
Costin Leau 40bc06f6ad EQL: Hook engine to Elasticsearch (#52828)
Add query execution and return actual results returned from
Elasticsearch inside the tests

(cherry picked from commit 3e039282bf991af87604a6d4f8eada19d5e33842)
2020-02-27 11:22:22 +02:00
Yang Wang 14c21aedd2
Simplify ml license checking with XpackLicenseState internals (#52684) (#52863)
This change removes TrainedModelConfig#isAvailableWithLicense method with calls to
XPackLicenseState#isAllowedByLicense.

Please note there are subtle changes to the code logic. But they are the right changes:
* Instead of Platinum license, Enterprise license nows guarantees availability.
* No explicit check when the license requirement is basic. Since basic license is always available, this check is unnecessary.
* Trial license is always allowed.
2020-02-27 14:14:16 +11:00
Yang Wang f5c4e92558
Refactor license checking (#52118) (#52859)
Improve code resuse and readility. Add convenience checking method which
covers most use cases without having to pass many boolean arguments.
2020-02-27 13:04:19 +11:00
Jake Landis b4179a8814
[7.x] Refactor watcher tests (#52799) (#52844)
This PR moves the majority of the Watcher REST tests under
the Watcher x-pack plugin.

Specifically, moves the Watcher tests from:
x-pack/plugin/test
x-pack/qa/smoke-test-watcher
x-pack/qa/smoke-test-watcher-with-security
x-pack/qa/smoke-test-monitoring-with-watcher

to:
x-pack/plugin/watcher/qa/rest (/test and /qa/smoke-test-watcher)
x-pack/plugin/watcher/qa/with-security
x-pack/plugin/watcher/qa/with-monitoring

Additionally, this disables Watcher from the main
x-pack test cluster and consolidates the stop/start logic
for the tests listed.

No changes to the tests (beyond moving them) are included.

3rd party tests and doc tests (which also touch Watcher)
are not included in the changes here.
2020-02-26 15:57:10 -06:00
Jay Modi 07ef8ccff4
Allow dynamic updates for index.hidden setting (#52837)
This commit changes the `index.hidden` setting from being final to a
dynamic setting. While the setting being final allows for easier
reasoning about an index, making this setting update-able has more
benefits in that we can upgrade existing indices to be hidden and it
will enable future features that would dynamically make indices hidden.

Backport of #52772
2020-02-26 11:46:29 -07:00
Nik Everett bfaa487757
Switch pipeline agg parsing to ContextParser (#52776) (#52832)
We've pretty well settled on `ContextParser` for a generic interface to
`ObjectParser`-like-things. This switches the interface used for
building parsing pipeline aggregations to `ContextParser` which saves a
couple of little wrappers around `ObjectParser`.
2020-02-26 12:57:20 -05:00
Lisa Cawley b788ec7157 [DOCS] Adds cat datafeeds API (#52738) 2020-02-26 09:28:57 -08:00
Ioannis Kakavas 2d01c005ba
Update commons-collections test dependency to 3.2.2 (#52808) (#52817)
This is only a test dependency but it trips scanners so upgrade to
3.2.2 which doesn't suffer from the issues mentioned in i.e.
https://snyk.io/vuln/SNYK-JAVA-COMMONSCOLLECTIONS-472711
2020-02-26 17:03:45 +02:00
Adrien Grand 1807f86751
Generalize how queries on `_index` are handled at rewrite time (#52815)
Generalize how queries on `_index` are handled at rewrite time (#52486)

Since this change refactors rewrites, I also took it as an opportunity to adrress #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder.

This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed.

Closes #49254
2020-02-26 15:37:43 +01:00
David Kyle ad3a3b1af9 Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart 2020-02-26 14:31:00 +00:00
Jake Landis 8d311297ca
[7.x] Smarter copying of the rest specs and tests (#52114) (#52798)
* Smarter copying of the rest specs and tests (#52114)

This PR addresses the unnecessary copying of the rest specs and allows
for better semantics for which specs and tests are copied. By default
the rest specs will get copied if the project applies
`elasticsearch.standalone-rest-test` or `esplugin` and the project
has rest tests or you configure the custom extension `restResources`.

This PR also removes the need for dozens of places where the x-pack
specs were copied by supporting copying of the x-pack rest specs too.

The plugin/task introduced here can also copy the rest tests to the
local project through a similar configuration.

The new plugin/task allows a user to minimize the surface area of
which rest specs are copied. Per project can be configured to include
only a subset of the specs (or tests). Configuring a project to only
copy the specs when actually needed should help with build cache hit
rates since we can better define what is actually in use.
However, project level optimizations for build cache hit rates are
not included with this PR.

Also, with this PR you can no longer use the includePackaged flag on
integTest task.

The following items are included in this PR:
* new plugin: `elasticsearch.rest-resources`
* new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy
* new extension 'restResources'
```
restResources {
  restApi {
    includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar
    includeXpack 'baz' //will include x-pack specs that start with baz
  }
  restTests {
    includeCore 'foo', 'bar' //will include the core tests that start with foo and bar
    includeXpack 'baz' //will include the x-pack tests that start with baz
  }
}

```
2020-02-26 08:13:41 -06:00
Ioannis Kakavas 2a6c3bea3f
Update oauth2-oidc-sdk to 7.0 (#52489) (#52806)
Resolves: #48409
Other changes:
https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect
-extensions/src/7.0.2/CHANGELOG.txt
2020-02-26 16:02:10 +02:00
István Zoltán Szabó f57422bbfd [DOCS] Adds cat data frame analytics API (#52764)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-02-26 11:10:42 +01:00
David Kyle 37be695d5c
[ML] Handle failed datafeed in MlDistributedFailureIT (#52631) (#52789) 2020-02-26 08:18:37 +00:00
Lisa Cawley 05f1cd74a6 [DOCS] Fixes monitoring links (#52790) 2020-02-25 18:08:23 -08:00
Tim Brooks 6669e53f08
Do not lock on reads of XPackLicenseState (#52492)
XPackLicenseState reads to necessary to validate a number of cluster
operations. This reads occasionally occur on transport threads which
should not be blocked. Currently we sychronize when reading. However,
this is unecessary as only a single piece of state is updateable. This
commit makes this state volatile and removes the locking.
2020-02-25 15:38:35 -07:00
Andrei Stefan 51c6aefa55
SQL: Use calendar_interval of 1d for HISTOGRAMs with 1 DAY intervals (#52749) (#52771)
(cherry picked from commit 556f5fa33be88570c4f8550cb8f784323d26a707)
2020-02-25 18:44:02 +02:00
Costin Leau a8911802d3 EQL: transform query AST into queryDSL (#52432)
(cherry picked from commit 94cef29df259319dfe2a3bf92d3f1a42d7e45781)
2020-02-25 17:53:59 +02:00
Nik Everett 02b23c37d1 Another test fix
Another attempt to fix a test that fails rarely and randomly. This time
try locking the query to just a single index.
2020-02-25 10:22:12 -05:00
Aleksandr Maus a6f5b4bb78
Unmute EqlActionIT (#52757)
Related to https://github.com/elastic/elasticsearch/issues/52737
2020-02-25 10:22:07 -05:00
David Roberts cf122d13b8 [ML] Use event.timezone in file_structure_finder ingest pipeline (#52720)
This is because beat.timezone was renamed to event.timezone in
elastic/beats#9458
2020-02-25 12:33:53 +00:00
Aleksandr Maus b2cb38ccf5
EQL: Expand verification tests (#52664) (#52725)
* EQL: Expand verification tests (#52664)

Expand verification tests
Fix some error messaging consistency in EqlParser

Related to https://github.com/elastic/elasticsearch/issues/51873

* Adjust for 7.x compatibility
2020-02-25 07:19:33 -05:00
Mark Vieira 025352f0a4
Mute EqlActionIT 2020-02-24 16:06:30 -08:00
Andrei Stefan ed6b10bc03
SQL: use a calendar interval for histograms over 1 month intervals (#52586) (#52715)
(cherry picked from commit 928b11a34ec92d90d082abdf4fa09f7ce1d7c0c4)
2020-02-25 01:41:51 +02:00
Nik Everett d48870ef94 Try to fix test another way.....
Explictly create the index rather than skip adding the default
template....
2020-02-24 17:17:41 -05:00
Nik Everett a7fe3329cb
Fix some top_metrics tests (#52575) (#52726)
These tests didn't work properly when run against multi-shard indices.
The `_score` based sorting test expects fairly specific scores which
isn't going to happen with multiple shards so this disables multiple
shards for that test. The other tests were failing due to a fairly
sneaky race condition around `_bulk` and type inference. This fixes them
by always sending metric values as floating point numbers so
Elasticsearch always infers them to be doubles.
2020-02-24 14:30:37 -05:00
Ryan Ernst 8c295cdc87 Fix sql cli sourcing of x-pack-env (#52613)
The sql-cli script sources x-pack-env, but it does so assuming the
current directory is ES_HOME. This commit alters the source command to
use ES_HOME which is available after running elasticsearch-env.

closes #47803
2020-02-24 11:13:31 -08:00
Aleksandr Maus a7bdb0b456
EQL: Add integration tests harness to test EQL feature parity with original implementation (#52248) (#52675)
The tests use the original test queries from
https://github.com/endgameinc/eql/blob/master/eql/etc/test_queries.toml
for EQL implementation correctness validation.
The file test_queries_unsupported.toml serves as a "blacklist" for the
queries that we do not support. Currently all of the queries are
blacklisted. Over the time the expectation is to eventually have an
empty "blacklist" when all of the queries are fully supported.

The tests use the original test vector from
https://raw.githubusercontent.com/endgameinc/eql/master/eql/etc/test_data.json.

Only one EQL and the response is stubbed for now to match the expected
output from that query. This part would need some tweaking after EQL is
fully wired.

Related to https://github.com/elastic/elasticsearch/issues/49581
2020-02-24 12:46:59 -05:00
Adrien Grand f993ef80f8
Move the terms index of `_id` off-heap. (#52518)
In #42838 we moved the terms index of all fields off-heap except the
`_id` field because we were worried it might make indexing slower. In
general, the indexing rate is only affected if explicit IDs are used, as
otherwise Elasticsearch almost never performs lookups in the terms
dictionary for the purpose of indexing. So it's quite wasteful to
require the terms index of `_id` to be loaded on-heap for users who have
append-only workloads. Furthermore I've been conducting benchmarks when
indexing with explicit ids on the http_logs dataset that suggest that
the slowdown is low enough that it's probably not worth forcing the terms
index to be kept on-heap. Here are some numbers for the median indexing
rate in docs/s:

| Run | Master  | Patch   |
| --- | ------- | ------- |
| 1   | 45851.2 | 46401.4 |
| 2   | 45192.6 | 44561.0 |
| 3   | 45635.2 | 44137.0 |
| 4   | 46435.0 | 44692.8 |
| 5   | 45829.0 | 44949.0 |

And now heap usage in MB for segments:

| Run | Master  | Patch    |
| --- | ------- | -------- |
| 1   | 41.1720 | 0.352083 |
| 2   | 45.1545 | 0.382534 |
| 3   | 41.7746 | 0.381285 |
| 4   | 45.3673 | 0.412737 |
| 5   | 45.4616 | 0.375063 |

Indexing rate decreased by 1.8% on average, while memory usage decreased
by more than 100x.

The `http_logs` dataset contains small documents and has a simple
indexing chain. More complex indexing chains, e.g. with more fields,
ingest pipelines, etc. would see an even lower decrease of indexing rate.
2020-02-24 18:14:12 +01:00
David Kyle de3d674bb7 Revert "Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart"
This reverts commit c4d91143ac.
2020-02-24 15:22:49 +00:00
David Kyle 044a4e127a
[ML] Add reason to DataFrameAnalyticsTask setFailed log message (#52659) (#52707) 2020-02-24 15:21:51 +00:00
Albert Zaharovits 33131e2dcd
Logfile audit settings validation (#52537)
Add validation for the following logfile audit settings:

    xpack.security.audit.logfile.events.include
    xpack.security.audit.logfile.events.exclude
    xpack.security.audit.logfile.events.ignore_filters.*.users
    xpack.security.audit.logfile.events.ignore_filters.*.realms
    xpack.security.audit.logfile.events.ignore_filters.*.roles
    xpack.security.audit.logfile.events.ignore_filters.*.indices

Closes #52357
Relates #47711 #47038
Follows the example from #47246
2020-02-24 16:38:16 +02:00
Ignacio Vera ba9d3c6389
Add support for multipoint shape queries (#52564) (#52705) 2020-02-24 13:46:51 +01:00
Martijn van Groningen 225d841212
Improve watcher test by preventing a npe when closing the http client. 2020-02-24 10:23:45 +01:00
Yang Wang 7cefba78c5
License removal leads back to a basic license (#52407) (#52683)
A new basic license will be generated when existing license is deleted.
In addition, deleting an existing basic license is a no-op.

Resolves: #45022
2020-02-24 11:02:40 +11:00
Jason Tedor 1685cbe504
Add messages for CCR on license state changes (#52470)
When a license expires, or license state changes, functionality might be
disabled. This commit adds messages for CCR to inform users that CCR
functionality will be disabled when a license expires, or when license
state changes to a license level lower than trial/platinum/enterprise.
2020-02-22 09:09:42 -05:00
Benjamin Trent afd90647c9
[ML] Adds feature importance to option to inference processor (#52218) (#52666)
This adds machine learning model feature importance calculations to the inference processor.

The new flag in the configuration matches the analytics parameter name: `num_top_feature_importance_values`
Example:
```
"inference": {
   "field_mappings": {},
   "model_id": "my_model",
   "inference_config": {
      "regression": {
         "num_top_feature_importance_values": 3
      }
   }
}
```

This will write to the document as follows:
```
"inference" : {
   "feature_importance" : {
      "FlightTimeMin" : -76.90955548511226,
      "FlightDelayType" : 114.13514762158526,
      "DistanceMiles" : 13.731580450792187
   },
   "predicted_value" : 108.33165831875137,
   "model_id" : "my_model"
}
```

This is done through calculating the [SHAP values](https://arxiv.org/abs/1802.03888).

It requires that models have populated `number_samples` for each tree node. This is not available to models that were created before 7.7.

Additionally, if the inference config is requesting feature_importance, and not all nodes have been upgraded yet, it will not allow the pipeline to be created. This is to safe-guard in a mixed-version environment where only some ingest nodes have been upgraded.

NOTE: the algorithm is a Java port of the one laid out in ml-cpp: https://github.com/elastic/ml-cpp/blob/master/lib/maths/CTreeShapFeatureImportance.cc

usability blocked by: https://github.com/elastic/ml-cpp/pull/991
2020-02-21 18:42:31 -05:00
Jay Modi 8abfda0b59
Rename assertThrows to prevent naming clash (#52651)
This commit renames ElasticsearchAssertions#assertThrows to
assertRequestBuilderThrows and assertFutureThrows to avoid a
naming clash with JUnit 4.13+ and static imports of these methods.
Additionally, these methods have been updated to make use of
expectThrows internally to avoid duplicating the logic there.

Relates #51787
Backport of #52582
2020-02-21 13:30:11 -07:00
Jack Conradson c4d91143ac Mute RunDataFrameAnalyticsIT.testOutlierDetectionStopAndRestart
Relates: #52654
2020-02-21 09:32:19 -08:00
Lisa Cawley 4ff78e8a00
[7.x][DOCS] Adds X-Pack usage API (#52592) 2020-02-21 06:57:11 -08:00
Jay Modi f3f6ff97ee
Single instance of the IndexNameExpressionResolver (#52604)
This commit modifies the codebase so that our production code uses a
single instance of the IndexNameExpressionResolver class. This change
is being made in preparation for allowing name expression resolution
to be augmented by a plugin.

In order to remove some instances of IndexNameExpressionResolver, the
single instance is added as a parameter of Plugin#createComponents and
PersistentTaskPlugin#getPersistentTasksExecutor.

Backport of #52596
2020-02-21 07:50:02 -07:00
Nik Everett ed957f35a9
Cover missing case in top_metrics test (#52517)
The top_metrics test assumed that it'd never end up *only* reducing
unmapped results. But, rarely, it does. This handles that case in the
test.

Closes #52462
2020-02-21 09:49:17 -05:00
Igor Motov e5b21a3fc6
Add HLRC for EQL search (#52550)
Adds EQL HLRC client with the search method.

Relates to #51961
2020-02-21 08:44:08 -05:00
Hendrik Muhs 288ccae23b [Transform] add support for filter aggregation (#52483)
add support for filter aggregations, refactor code for sub-aggregation support in mapping
deduction

fixes #52151
2020-02-21 14:05:11 +01:00
markharwood 96d603979b
Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584)
Upgrading 7x to same Lucene 8.5 version used in master
2020-02-21 10:25:03 +00:00
Przemko Robakowski aff693bc9f
Make FreezeStep retryable (#52540) (#52559)
* Make FreezeStep retryable

This change marks `FreezeStep` as retryable and adds test to make sure we can really run it again.

* refactor tests

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-21 10:11:35 +01:00
Armin Braun 4bb780bc37
Refactor Inflexible Snapshot Repository BwC (#52365) (#52557)
* Refactor Inflexible Snapshot Repository BwC (#52365)

Transport the version to use for  a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
2020-02-21 09:14:34 +01:00
Przemysław Witek b84e8db7b5
[7.x] Rename .ml-state index to .ml-state-000001 to support rollover (#52510) (#52595) 2020-02-21 08:55:59 +01:00
Andrei Stefan c9b7bb282a
Move IsNull/IsNotNull predicates to QL project (#52502) (#52546)
(cherry picked from commit b7d534e20c005f1c3565e52c0d0e0273f4a4cece)
2020-02-21 09:21:44 +02:00
Yang Wang 4bc7545e43
Add enterprise mode and refactor license check (#51864) (#52115)
Add enterprise operation mode to properly map enterprise license.

Aslo refactor XPackLicenstate class to consolidate license status and mode checks.
This class has many sychronised methods to check basically three things:
* Minimum operation mode required
* Whether security is enabled
* Whether current license needs to be active

Depends on the actual feature, either 1, 2 or all of above checks are performed.
These are now consolidated in to 3 helper methods (2 of them are new).
The synchronization is pushed down to the helper methods so actual checking
methods no longer need to worry about it.

resolves: #51081
2020-02-21 14:18:18 +11:00
Benjamin Trent 2a5c181dda
[ML][Inference] don't return inflated definition when storing trained models (#52573) (#52580)
When `PUT` is called to store a trained model, it is useful to return the newly create model config. But, it is NOT useful to return the inflated definition.

These definitions can be large and returning the inflated definition causes undo work on the server and client side.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-20 19:47:29 -05:00
Benjamin Trent 013d5c2d24
[ML] Adds support for a global calendar via `_all` (#50372) (#52578)
This adds `_all` to Calendar searches. This enables users to supply the `_all` string in the `job_ids` array when creating a Calendar. That calendar will now be applied to all jobs (existing and newly created).

Closes #45013

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-20 17:22:59 -05:00
Russ Cam 62da077beb Specify name on enrich.get_policy as list type (#50217)
This commit updates the enrich.get_policy API to specify name
as a list, in line with other URL parts that accept a comma-separated
list of values.

In addition, update the get enrich policy API docs
to align the URL part name in the documentation with
the name used in the REST API specs.

(cherry picked from commit 94f6f946ef283dc93040e052b4676c5bc37f4bde)
2020-02-20 11:39:28 +10:00
Ryan Ernst 3c3a0b2f37 Mute additional failing top_metrics test (#52545)
Most top_metrics tests were muted in #52468, but the scaled float can
also fail. This commit mutes that test as well.

relates #52418
2020-02-19 16:14:26 -08:00
Przemko Robakowski 88bb06f055
Make DeleteStep retryable (#52494) (#52532)
* Make DeleteStep retryable

This change marks `DeleteStep` as retryable and adds test to make sure we really can invoke it again.

* Fix unused import

* revert unneeded changes

* test reworked
2020-02-19 21:16:59 +01:00
Lee Hinman 22cf1140eb
[7.x] Add additional logging to SLM retention task (#52343) (#52535)
This commit adds more logging to the actions that the SLM retention task does. It will help in the
event that we need to diagnose any additional issues or problems while running retention.
2020-02-19 13:15:01 -07:00
David Kyle 7bbe5c8464
[Ml] Validate tree feature index is within range (#52514)
This changes the tree validation code to ensure no node in the tree has a
feature index that is beyond the bounds of the feature_names array.
Specifically this handles the situation where the C++ emits a tree containing
a single node and an empty feature_names list. This is valid tree used to
centre the data in the ensemble but the validation code would reject this
as feature_names is empty. This meant a broken workflow as you cannot GET
the model and PUT it back
2020-02-19 14:41:43 +00:00
Nik Everett 8796cdce4b
Modernize boxplot's parser (backport of #52361) (#52372)
Uses a newer way to build `ObjectParser` for in `boxplot` that allows us
to drop a mostly ceremonial method.
2020-02-19 09:20:49 -05:00
Przemysław Witek 7cd997df84
[ML] Make ml internal indices hidden (#52423) (#52509) 2020-02-19 14:02:32 +01:00
Hendrik Muhs 4d006f09d2 [Transform] fix XPackRestIT continuous transform stats test failure
do not match explicit number but only test existence for duration test (#52504)

fixes #52429
2020-02-19 12:32:54 +01:00
Przemysław Witek 5acee761eb
Implement unit tests for AnomalyDetectorsIndex class (#52417) (#52508) 2020-02-19 12:24:59 +01:00
Tim Brooks b5e191fa57
Use thread local random for request id generation (#52344)
Currently we used the secure random number generate when generating http
request ids in the security AuditUtil. We do not need to be using this
level of randomness for this use case. Additionally, this random number
generator involves locking that blocks the http worker threads at high
concurrency loads.

This commit modifies this randomness generator to use our reproducible
randomness generator for Elasticsearch. This generator will fall back to
thread local random when used in production.
2020-02-18 09:32:14 -07:00
Ioannis Kakavas 09773efb41
[7.x] Return realm name in SAML Authenticate API (#52188) (#52465)
This is useful in cases where the caller of the API needs to know
the name of the realm that consumed the SAML Response and
authenticated the user and this is not self evident (i.e. because
there are many saml realms defined in ES).
Currently, the way to learn the realm name would be to make a
subsequent request to the `_authenticate` API.
2020-02-18 17:16:24 +02:00
Henning Andersen 84de601551 Mute failing top_metrics tests (#52468)
These tests fails when the global template is added, which changes
number_of_shards to 2.

Relates #52409 and #52418
2020-02-18 13:29:28 +01:00
Ioannis Kakavas d9ce0e6733
Update BouncyCastle to 1.64 (#52185) (#52464)
This commit upgrades the bouncycastle dependency from 1.61 to 1.64.
2020-02-18 14:11:34 +02:00
David Roberts 9c49868bc5 [TEST] Use busy asserts in ML distributed failure test (#52461)
When changing a job state using a mechanism that doesn't
wait for the desired state to be reached within the production
code the test code needs to loop until the cluster state has
been updated.

Closes #52451
2020-02-18 11:17:37 +00:00
Przemysław Witek 6fa067a2a0
Relax assertions on memory_estimation.* fields (#52452) (#52458) 2020-02-18 11:57:03 +01:00
Przemko Robakowski d467c50e90
Make TimeSeriesLifecycleActionsIT.testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore wait for snaphost (#51892) (#52419)
* waitForSnapshot tests rework

* Refactor assertBusy

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-18 11:01:42 +01:00
Martijn van Groningen d17ecb5936
Change the delete policy api to not pass wildcard expressions to the delete index api (#52448)
Backport from #52179

Don't rely on the delete index api to resolve all the enrich indices for a particular enrich policy using a '[policy_name]-*' wildcard expression. With this change, the delete policy api will resolve the indices to remove and pass that directly to the delete index api.

This resolves a bug, that if `action.destructive_requires_name` setting has been set to true then the delete policy api is unable to remove the enrich indices related to the policy being deleted.

Closes #51228 

Co-authored-by: bellengao <gbl_long@163.com>
2020-02-18 10:53:39 +01:00
Hendrik Muhs 2071f85e1a forward audits to logs (#52394)
audit messages are stored in the notifications index, so audit information is lost for integration
tests. This change forwards audit messages to logs, so they can help to debug issues.

relates: #51627
2020-02-18 08:47:27 +01:00
Nhat Nguyen bdb2e72ea4
Fix timeout in testDowngradeRemoteClusterToBasic (#52322)
- ESCCRRestTestCase#ensureYellow does not work well with assertBusy
- Increases timeout to 60s

Closes #52036
2020-02-17 15:05:42 -05:00
David Roberts 48ccf36db9 [ML] Increase assertBusy timeout in ML node failure tests (#52425)
Following the change to store cluster state in Lucene indices
(#50907) it can take longer for all the cluster state updates
associated with node failure scenarios to be processed during
internal cluster tests where several nodes all run in the same
JVM.
2020-02-17 17:04:18 +00:00
Costin Leau 20862fe64f Break QueryTranslator into QL and SQL (#52397)
Refactor the code to allow contextual parameterization of dateFormat and
name.
Separate aggs/query implementation though there's room for improvement
in the future

(cherry picked from commit e086f81b688875b33d01e4504ce7377031c8cf28)
2020-02-17 17:30:15 +02:00
Martijn van Groningen d3db6cbf50
Fix NPE in cluster state collector for monitoring. (#52371)
Take into account a null license may be returned by the license service.

Closes #52317
2020-02-17 09:04:44 +01:00
Jason Tedor c9f72a0116
Fix shard follow task cleaner under security (#52347)
The shard follow task cleaner executes on behalf of the user to clean up
a shard follow task after the follower index has been
deleted. Otherwise, these persistent tasks are left laying around, and
they fail to execute because the follower index has been deleted. In the
face of security, attempts to complete these persistent tasks would
fail.  This is because these cleanups are executed under the system
context (this makes sense, they are happening on behalf of the user
after the user has executed an action) but the system role was never
granted the permission for persistent task completion. This commit
addresses this by adding this cluster privilege to the system role.
2020-02-16 17:26:14 -05:00
Hendrik Muhs f0747e607d delete the transform to delete any docs which might have been written by the (#52360)
delete the transform to delete any docs which might have been written by the task after deleting
the index

fixes #51347
2020-02-16 11:23:06 +01:00
Andrei Dan bd3a70db4e
ILM fix the init step to actually be retryable (#52076) (#52375)
We marked the `init` ILM step as retryable but our test used `waitUntil`
without an assert so we didn’t catch the fact that we were not actually
able to retry this step as our ILM state didn’t contain any information
about the policy execution (as we were in the process of initialising
it).

This commit manually sets the current step to `init` when we’re moving
the ilm policy into the ERROR step (this enables us to successfully
move to the error step and later retry the step)

* ShrunkenIndexCheckStep: Use correct logger

(cherry picked from commit f78d4b3d91345a2a8fc0f48b90dd66c9959bd7ff)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-02-15 18:42:05 +00:00
Hicham Mallah 5b32d112e1
SQL: Fix issues with GROUP BY queries (#41964)
Translate to an agg query even if only literals are selected,
so that the correct number of rows is returned (number of buckets).

Fix issue with key only in GROUP BY (not in select) and WHERE clause:
Resolve aggregates and groupings based on the child plan which holds
the info info for all the fields of the underlying table.

Fixes: #41951
Fixes: #41413
(cherry picked from commit 45b85809678b34a448639a420b97e25436ae851f)
2020-02-15 10:38:24 +01:00
Andrei Dan da2d441d50
ILM make the set-single-node-allocation retryable (#52077) (#52138)
(cherry picked from commit 0e473115958f691fc8dc87293642aea6a07fe3da)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-02-14 17:31:24 +00:00
Nik Everett 146def8caa
Implement top_metrics agg (#51155) (#52366)
The `top_metrics` agg is kind of like `top_hits` but it only works on
doc values so it *should* be faster.

At this point it is fairly limited in that it only supports a single,
numeric sort and a single, numeric metric. And it only fetches the "very
topest" document worth of metric. We plan to support returning a
configurable number of top metrics, requesting more than one metric and
more than one sort. And, eventually, non-numeric sorts and metrics. The
trick is doing those things fairly efficiently.

Co-Authored by: Zachary Tong <zach@elastic.co>
2020-02-14 11:19:11 -05:00
Dimitris Athanasiou ad56802ac6
[7.x][ML] Refactor ML mappings and templates into JSON resources (#51… (#52353)
ML mappings and index templates have so far been created
programmatically. While this had its merits due to static typing,
there is consensus it would be clear to maintain those in json files.
In addition, we are going to adding ILM policies to these indices
and the component for a plugin to register ILM policies is
`IndexTemplateRegistry`. It expects the templates to be in resource
json files.

For the above reasons this commit refactors ML mappings and index
templates into json resource files that are registered via
`MlIndexTemplateRegistry`.

Backport of #51765
2020-02-14 17:16:06 +02:00
Marios Trivyzas 51e74be1bb
SQL: [Tests] Add tests for fixed issues (#52335)
Add tests to verify behaviour for
fixed issues: #33724 & #38306

(cherry picked from commit 89fb6753a9db9484a5622417cd4ffea9af0347ad)
2020-02-14 11:23:30 +01:00
Ioannis Kakavas 6cd42923d5
Update cryptacular to 1.2.4 (#52331) (#52349)
Cryptacular is a dependency of opensaml
2020-02-14 10:24:45 +02:00
Hendrik Muhs efd7542b2a
[7.x][Transform] provide exponential_avg* stats for batch transforms (#52041) (#52323)
provide exponential_avg* stats for batch transforms, avoids confusion
why those values are all 0 otherwise
2020-02-14 07:48:23 +01:00
Igor Motov a66988281f
Add histogram field type support to boxplot aggs (#52265)
Add support for the histogram field type to boxplot aggs.

Closes #52233
Relates to #33112
2020-02-13 18:09:26 -05:00
Julie Tibshirani 0d7165a40b Standardize naming of fetch subphases. (#52171)
This commit makes the names of fetch subphases more consistent:
* Now the names end in just 'Phase', whereas before some ended in
  'FetchSubPhase'. This matches the query subphases like AggregationPhase.
* Some names include 'fetch' like FetchScorePhase to avoid ambiguity about what
  they do.
2020-02-13 13:00:46 -08:00
Przemysław Witek 0da3af7581
[7.x] [ML] Add _cat/ml/data_frame/analytics API (#52260) (#52312) 2020-02-13 16:55:47 +01:00
Marios Trivyzas ea6f0e39bc
[Tests] Update skip version for YAML tests (#52310)
Update skip versions upper boundary to match the release
or intented release version of the feature/fix.
2020-02-13 15:36:31 +01:00
Costin Leau 5373a77fb9 QL: Extract common Failure class (#52281)
Shared across SQL and EQL

(cherry picked from commit 1aeda20d3ec3d6c885de03c6043dd1e8eab9f230)
2020-02-13 14:35:15 +02:00
David Roberts 3ea49557fe Add cluster:admin/analyze permission to Kibana system role (#52259)
This is to support the ML categorization wizard.

Currently cluster:admin/analyze is only provided with the
"manage" cluster privilege, which is an excessive privilege
level to provide access to this single feature.  It means
that the ML categorization wizard only works for extremely
highly privileged users.

Following this change the Kibana system user will be
permitted to run the _analyze endpoint on supplied strings
(not on an index).  The ML UI will then call the _analyze
endpoint as the Kibana system user after first checking
that the logged-in user is permitted to create an ML job.
This will mean that users with the more reasonable
"manage_ml" cluster privilege will be permitted to use
the ML categorization wizard.

(This is also consistent with the way the ML UI will access
_all_ Elasticsearch functionality when the "ML in Spaces"
project is completed.)

Closes #51391
Relates elastic/kibana#57375
2020-02-13 11:01:27 +00:00
Nik Everett 2dac36de4d
HLRC support for string_stats (#52163) (#52297)
This adds a builder and parsed results for the `string_stats`
aggregation directly to the high level rest client. Without this the
HLRC can't access the `string_stats` API without the elastic licensed
`analytics` module.

While I'm in there this adds a few of our usual unit tests and
modernizes the parsing.
2020-02-12 19:25:05 -05:00
Julie Tibshirani f0668cabbc Adjust the 'skip' version in flattened REST tests. (#52293)
I forgot to adjust it after backporting the flattened fields feature.
2020-02-12 15:17:44 -08:00
Jay Modi 5bcc6fce5c
Remove DeprecationLogger from route objects (#52285)
This commit removes the need for DeprecatedRoute and ReplacedRoute to
have an instance of a DeprecationLogger. Instead the RestController now
has a DeprecationLogger that will be used for all deprecated and
replaced route messages.

Relates #51950
Backport of #52278
2020-02-12 15:05:41 -07:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
Bogdan Pintea 5dfe27601e
SQL: supplement input checks on received request parameters (#52229) (#52277)
* Add more checks around parameter conversions

This commit adds two necessary verifications on received parameters:
- it checks the validity of the parameter's data type: if the declared
data type is resolved to an ES or Java type;
- it checks if the returned converter is non-null (i.e. a conversion is
possible) and generates an appropriate exception otherwise.

(cherry picked from commit eda30ac9c69383165324328c599ace39ac064342)
2020-02-12 19:45:12 +01:00
Costin Leau 26900bfb05 EQL: Add infra for planning and query folding (#52065)
Actual folding not yet in place (TBD)

(cherry picked from commit d52b96f273a94c90e475a5035cd57baa086fb0c0)
2020-02-12 18:51:42 +02:00
Hendrik Muhs 5d35eaa1cb [Transform] improve irrecoverable error detection - part 2 (#52003)
base error handling on rest status instead of listing individual exception types

relates to #51820
2020-02-12 14:38:42 +01:00
James Rodewig 3f151d1d75 [DOCS] Add redirects, update JSON spec to fix docs build (#51747)
Docs build [#11556][0] broke due to several outdated or incorrect links
in the JSON REST spec.

This fixes those links where possible and adds redirects.

[0]: https://elasticsearch-ci.elastic.co/job/elastic+docs+master+build/11556/
2020-02-12 08:30:59 -05:00
Andrei Stefan a3ebacfcf3
52169 & 52172 7x backport (#52256)
* Extract common optimizer tests (#52169)

(cherry picked from commit e5ad72bc22e9ec0686ab582195f0032efcb880bf)

* Hook in the optimizer rules (#52172)

(cherry picked from commit 1f90d8cc56052fbf2af604e72f9f5ca73f5e75d5)
2020-02-12 11:20:03 +02:00
Marios Trivyzas daab242c75
SQL: Fix ORDER BY on aggregates and GROUPed BY fields (#51894)
Previously, in the in-memory sorting module
`LocalAggregationSorterListener` only the aggregate functions where used
(grabbed by the `sortingColumns`). As a consequence, if the ORDER BY
was also using columns of the GROUP BY clause, (especially in the case
of higher priority - before the aggregate functions) wrong results were
produced. E.g.:
```
SELECT gender, MAX(salary) AS max FROM test_emp
GROUP BY gender
ORDER BY gender, max
```

Add all columns of the ORDER BY to the `sortingColumns` so that the
`LocalAggregationSorterListener` can use the correct comparators in
the underlying PriorityQueue used to implement the in-memory sorting.

Fixes: #50355
(cherry picked from commit be680af11c823292c2d115bff01658f7b75abd76)
2020-02-12 09:38:47 +01:00
Hendrik Muhs edaf6d1f79
[Transform] maintain a list of unsupported aggregations in transforms (#52190) (#52222)
add a list of unsupported aggs in transforms and create a test that fails if a new aggregation is
added. Limitation: works only if a new agg is added to either the core or a known plugin
(Analytics, MatrixAggregation).
2020-02-12 07:48:04 +01:00
Benjamin Trent 2a968f4f2b
[ML] job results provider refactoring (#52012) (#52238)
During a bug hunt, I caught a handful of things (unrelated to the bug) that could be potential issues:

1. Needlessly wrapping in exception handling (minor cleanup)
2. Potential of notifying listeners of a failure multiple times + even trying to notify of a success after a failure notification
2020-02-11 17:54:44 -05:00
Gordon Brown d48ce12920
Convert ILM and SLM histories into hidden indices (#51456)
Modifies SLM's and ILM's history indices to be hidden indices for added
protection against accidental querying and deletion, and improves
IndexTemplateRegistry to handle upgrading index templates.

Also modifies the REST test cleanup to delete hidden indices.
2020-02-11 14:18:55 -07:00
Albert Zaharovits cc1fce96ba
Add a new async search security origin (#52141)
This commit adds a new security origin, and an associated reserved user
and role, named `_async_search`, which can be used by internal clients to
manage the `.async-search-*` restricted index namespace.
2020-02-11 19:58:06 +02:00
James Rodewig d68a4ec82e
[7.x] Permit EQL feature flag in release builds (#52201) (#52214)
7.x backport of #52201

Provides a path to set register the EQL feature flag in release builds.
This enables EQL in release builds so that release docs tests pass.

Release docs tests do not have infrastructure in place to only register
snippets from included portions of the docs, they instead include all
docs snippets.

Since EQL can not be enabled in release builds, this meant that the EQL
snippets fail in the release docs tests.

This adds the ability to enable EQL in the release docs tests. This
system property will be removed when EQL is ready for release.
2020-02-11 11:49:49 -05:00
Hendrik Muhs 098380e483 Percentiles aggregation validation checks for range (#51871)
disallow to specify percentile out of range [0,100]. This also fixes a problem in transform by failing
validation if an invalid percentile configuration is used.
2020-02-11 17:25:39 +01:00
David Roberts d1d9c40e71 [ML] Switch poor categorization audit warning to use status field (#52195)
In #51146 a rudimentary check for poor categorization was added to
7.6.

This change replaces that warning based on a Java-side check with
a new one based on the categorization_status field that the ML C++
sets.  categorization_status was added in 7.7 and above by #51879,
so this new warning based on more advanced conditions will also be
in 7.7 and above.

Closes #50749
2020-02-11 15:33:27 +00:00
David Roberts 473468d763 [ML] Better error when persistent task assignment disabled (#52014)
Changes the misleading error message when attempting to open
a job while the "cluster.persistent_tasks.allocation.enable"
setting is set to "none" to a clearer message that names the
setting.

Closes #51956
2020-02-11 15:23:21 +00:00
Igor Motov 667e1a5225
Add Boxplot Aggregation (#52174)
Adds a `boxplot` aggregation that calculates min, max, medium and the first
and the third quartiles of the given data set.

Closes #33112
2020-02-11 09:38:17 -05:00
Marios Trivyzas 204d086266 SQL: Fix issue with timezone when paginating (#52101)
Previously, when the specified (or default) fetchSize led to
subsequent HTTP requests and the usage of cursors, those subsequent
were no longer using the client timezone specified in the initial
SQL query. As a consequence, Even though the query is executed once
(with the correct timezone) the processing of the query results by
the HitExtractors in the next pages was done using the default
timezone Z. This could lead to incorrect results.

Fix the issue by correctly using the initially specified timezone,
which is found in the deserialisation of the cursor string.

Fixes: #51258
(cherry picked from commit 8f7afbdeb9295999b48a6c36db5b31cbe0cee432)
2020-02-11 15:27:56 +01:00
Yang Wang 16ba59e9d1
Expose more authentication info to ingest pipeline (#51305) (#52119)
The changes add more granularity for identiying the data ingestion user.
The ingest pipeline can now be configure to record authentication realm and
type. It can also record API key name and ID when one is in use. 
This improves traceability when data are being ingested from multiple agents
and will become more relevant with the incoming support of required
pipelines (#46847)

Resolves: #49106
2020-02-11 23:05:01 +11:00
Tim Vernum b0b1b13311
Extract class to store Authentication in context (#52183)
This change extracts the code that previously existed in the
"Authentication" class that was responsible for reading and writing
authentication objects to/from the ThreadContext.

This is needed to support multiple authentication objects under
separate keys.

This refactoring highlighted that there were a large number of places
where we extracted the Authentication/User objects from the thread
context, in a variety of ways. These have been consolidated to rely on
the SecurityContext object.

Backport of: #52032
2020-02-11 20:59:06 +11:00
Dimitris Athanasiou 6086fadf00
[7.x][ML] Prepare to hold additional stats in DF Analytics task (#52134) (#52187)
Refactors `DataFrameAnalyticsTask` to hold a `StatsHolder` object.
That just has a `ProgressTracker` for now but this is paving the
way to add additional stats like memory usage, analysis stats, etc.

Backport #52134
2020-02-11 11:18:45 +02:00
Dimitris Athanasiou cbebc26f50
[7.x][ML] Retry persisting DF Analytics results (#52048) (#52160)
Employs `ResultsPersisterService` from `DataFrameRowsJoiner` in order
to add retries when a data frame analytics job is persisting the results
to the destination data frame.

Backport of #52048
2020-02-11 09:55:00 +02:00
Andrei Stefan 2f1631d9d0
Telemetry data initial implementation (#51715) (#52175)
(cherry picked from commit f1d1cceacaacf226fcd2459f34689843b822fe4b)
2020-02-11 09:15:47 +02:00
Marios Trivyzas 6b600855a9
SQL: Make parsing of date more lenient (#52137)
Make the parsing of date more lenient

- as an escaped literal: `{d '2020-02-10[[T| ]10:20[:30][.123456789][tz]]'}`
- cast a string to a date: `CAST(2020-02-10[[T| ]10:20[:30][.123456789][tz]]' AS DATE)`

Closes: #49379
(cherry picked from commit 5863b27500d5e7f6cdd8c6c62b09b84e53ca724a)
2020-02-10 21:47:00 +01:00
Julie Tibshirani 28a8db730f In FieldTypeLookup, factor out flat object field logic. (#52091)
Currently, the logic for looking up `flattened` field types lives in the
top-level `FieldTypeLookup`. This PR moves it into a dedicated class
`DynamicKeyFieldTypeLookup`.
2020-02-10 10:44:02 -08:00
Bogdan Pintea 7b58ed0dd7
Fix milliseconds handling in intervals (#51675) (#52156)
This fixes:

- the parsing of milliseconds in intervals: everything past the . used to be converted as-is to milliseconds, with no normalisation of the unit; thus, a value of .23 ended up as 23 millis in the interval, instead of 230.
- the printing of a trailing .0, in case the interval lacks the fractional part;
- tests generating a random millisecond value used to simply print it in the string about to be evaluated without a necessary front-filling of 0[s], where the amount was below 100/10.

(The combination of first and last issues above, plus statistical "luck" made the incorrect handling pass the tests.)

(cherry picked from commit 4de8c64f63ee37c1bcfdb9b9d3a07d09be243222)
2020-02-10 19:24:26 +01:00
Lee Hinman 37a2e9bac6
[7.x] Allow forcemerge in the hot phase for ILM policies (#520… (#52083)
* Allow forcemerge in the hot phase for ILM policies

This commit changes the `forcemerge` action to also be allowed in the `hot` phase for policies. The
forcemerge will occur after a rollover, and allows users to take advantage of higher disk speeds for
performing the force merge (on a separate node type, for example).

On caveat with this is that a `forcemerge` in the `hot` phase *MUST* be accompanied by a `rollover`
action. ILM validates policies to ensure this is the case.

Resolves #43165

* Use anyMatch instead of findAny in validation

* Make randomTimeseriesLifecyclePolicy single-pass
2020-02-10 08:54:49 -07:00
Przemysław Witek c7cc383d33
[7.x] Update persistent state document in the index the document belongs to (#51751) (#52145) 2020-02-10 16:32:34 +01:00
Nhat Nguyen 864e9d875d Bubble up exception in follow task in ccr tests (#52085)
It's perfectly fine if a bulk request on the follower hits 
IndexShardClosedException in some CCR tests because we sometimes 
close some follower shards while the follow-task is replicating operations.
Instead of failing the test immediately, this commit bubbles up that
failure to the shard follow task.

Closes #52052
2020-02-10 08:27:04 -05:00
Marios Trivyzas 27265f032a SQL: Enhance timestamp escaped literal parsing (#52097)
Allow also whitespace ` ` (together with `T`) as a separator between
date and time parts of the timestamp string. E.g.:
```
{ts '2020-02-08 12.10.45'}
```
or
```
{ts '2020-02-08T12.10.45'}
```

Fixes: #46069
(cherry picked from commit 07c977023fb8ceab5991c359a6cbfe07beaad9bb)
2020-02-10 11:24:55 +01:00
Tim Vernum 4e4815355a Mute DocumentSubsetBitsetCacheTests.testCacheUnderConcurrentAccess (#52135)
Test does not always complete in expected time.

Relates: #51914
Backport of: #52122
2020-02-10 21:19:18 +11:00
Andrei Stefan fa4dcd50d9 Extract common optimization rules for QL (#52054) (#52132)
(cherry picked from commit ee43115531234c2d955193ce0c9c268e1f02ab43)
2020-02-10 11:48:45 +02:00
Ignacio Vera 80e3c97210 Upgrade to lucene-8.5.0-snapshot-d62f6307658 (#52039) (#52130) 2020-02-10 10:13:22 +01:00
David Roberts 1cefafdd14 [ML] Add new categorization stats to model_size_stats (#52009)
This change adds support for the following new model_size_stats
fields:

- categorized_doc_count
- total_category_count
- frequent_category_count
- rare_category_count
- dead_category_count
- categorization_status

Backport of #51879
2020-02-10 09:10:50 +00:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Ioannis Kakavas 8c0b49cd32 Adjust jarHell and 3rd party audit exclusions (#51733) (#51766)
Now that the FIPS 140 security provider is simply a test dependency
we don't need the thirdPartyAudit exceptions, but plugin-cli and
transport-netty4 do need jarHell disabled as they use the non fips
BouncyCastle security provider as a test dependency too.
2020-02-10 07:38:59 +02:00
Tim Vernum d5c015062d
Don't allow null User.principal (#52049)
Some parts of the User class (e.g. equals/hashCode) assumed that
principal could never be null, but the constructor didn't enforce
that.

This adds a null check into the constructor and fixes a few tests that
relied on being able to pass in null usernames.

Backport of: #51988
2020-02-10 12:23:55 +11:00
Jason Tedor 2b99291187
Add autoscaling feature flag in release REST tests (#52096)
The REST tests for autoscaling either need to be skipped in a
non-snapshot build, or alternatively, the feature flag registered so
that autoscaling can be enabled. We prefer the latter approach, as it
allows us to also test autoscaling in non-snapshot builds incrementally,
instead of at the end of development as autoscaling prepares for
release. This commit registers the autoscaling feature flag in REST
tests for non-snapshot builds.
2020-02-09 15:49:01 -05:00
Armin Braun 90eb6a020d Remove Redundant Loading of RepositoryData during Restore (#51977) (#52108)
We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
2020-02-09 21:44:18 +01:00
Marios Trivyzas 3e7f939f63
SQL: [Tests] Add more tests for aggs and literals (#52086)
Add some more tests where more than one literal is selected,
unaliased and aliased.

Follows: #42121
(cherry picked from commit 405271d408a233e697eb2e9ded3005a71f4df5e7)
2020-02-09 18:01:05 +01:00
Costin Leau 214beed90f QL: move query AST from SQL to QL (#52069)
(cherry picked from commit 59368968b698652352be1bb2a60d5a357a01b978)
2020-02-08 23:10:51 +02:00
Jason Tedor 8b1d2c5b95
Permit autoscaling feature flag in release builds (#52088)
This commit provides a path to set register the autoscaling feature flag
in release builds, and therefore enabling autoscaling in release
builds. The primary reason that we add this is so that our release docs
tests can pass. Our release docs tests do not have infrastructure in
place to only register snippets from included portions of the docs, they
instead include all docs snippets. Since autoscaling can not be enabled
in release builds, this meant that the autoscaling snippets would fail
in the release docs tests. To address then, we need the ability to
enable autoscaling in the release docs tests which we can now do with
the system property added here. This system property will be removed
when autoscaling is ready for release.
2020-02-07 21:40:51 -05:00
Benjamin Trent dffcd021df
[7.x] [ML] Add bwc serialization unit test scaffold (#51889) (#52061)
* [ML] Add bwc serialization unit test scaffold (#51889)

Adds new `AbstractBWCSerializationTestCase` which provides easy scaffolding for BWC serialization unit tests.

These are no replacement for true BWC tests (which execute actual old code). These tests do provide some good coverage for the current code when serializing to/from old versions.

* removing unnecessary override for 7.series branch

* adding necessary import

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-07 17:17:11 -05:00
Benjamin Trent c6111eb90e
[ML][Inference] adding number_samples to TreeNode (#51937) (#52060)
in preparation for feature importance and split information gain, adding `number_samples` field to `TreeNode` definition.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-07 17:04:58 -05:00
Julie Tibshirani 337d73a7c6 Rename MapperService#fullName to fieldType.
The new name more accurately describes what the method returns.
2020-02-07 10:35:53 -08:00
Emanuele Sabellico 282e919607 SQL: [Tests] Add integ tests for selecting a literal and an aggregate (#42121)
The related issue regarding aggregation queries where some literals
are also selected together with aggregate function has been fixed
with #49570. Add integration tests to verify the behavior.

Relates to: #41411

(cherry picked from commit 9f414a8d05c75e1a9f8250084f6dcd634d5d78d8)
2020-02-07 19:00:15 +01:00
David Kyle 8f10a7c6ca [ML] Make Ensemble feature names optional (#51996)
The featureNames field is requisite in individual models but is not required by the Ensemble.
2020-02-07 10:08:37 +00:00
Armin Braun 91e938ead8
Add Trace Logging of REST Requests (#51684) (#52015)
Being able to trace log all REST requests to a node would make debugging
a number of issues a lot easier.
2020-02-07 09:03:20 +01:00
Jason Tedor 25daf5f1e1
Add autoscaling API skelton (#51564)
The main purpose of this commit is to add a single autoscaling REST
endpoint skeleton, for the purpose of starting to build out the build
and testing infrastructure that will surround it. For example, rather
than commiting a fully-functioning autoscaling API, we introduce here
the skeleton so that we can start wiring up the build and testing
infrastructure, establish security roles/permissions, an so on. This
way, in a forthcoming PR that introduces actual functionality, that PR
will be smaller and have less distractions around that sort of
infrastructure.
2020-02-06 21:55:01 -05:00
Andrei Stefan 488944f4a1
SQL: Handle uberjar scenario where the ES jdbc driver file is bundled in another jar (#51856) (#52024)
(cherry picked from commit 6247b0793c9db19a8a9fa6f0164cc14d0debed6e)
2020-02-07 04:15:59 +02:00
Benjamin Trent 846f87a26e
[ML] allow close/stop for jobs/datafeeds with missing configs (#51888) (#51997)
If the configs are removed (by some horrific means), we should still allow tasks to be cleaned up easily.

Datafeeds and jobs with missing configs are now visible in their respective _stats calls and can be stopped/closed.
2020-02-06 12:10:18 -05:00
Hendrik Muhs 03fb5cdaae fallback to float if source type is scaled_float for mapping deduction (#51990)
fallback to float if source type is scaled_float for mapping deduction of min/max aggregation

fixes #51780
2020-02-06 17:27:26 +01:00
Martijn Laarman 898dd0b9cc Cat.ml.* introduces an additional depths to namespace API's (#51981)
Not all clients support this e.g if the java high level rest client were
to map this it would look like `client.cat().ml().api()` which hinders
discoverability.

(cherry picked from commit 21cdabf09dc8305ce2f5e3b6cb193f67137d8bdb)
2020-02-06 13:16:59 +01:00
Jim Ferenczi 0f333c89b9
Always rewrite search shard request outside of the search thread pool (#51708) (#51979)
This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change.
This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
2020-02-06 10:53:11 +01:00
Jason Tedor 12473c2bcb
Log failure when cleaning shard follow task (#51971)
When clenaing a shard follow task after an index has been deleted, an
exception can occur submitting the complete persistent task
action. However, this exception message is not logged. This commit
addresses this by including the exception that led to the failure in the
log message.
2020-02-05 20:48:00 -05:00
Tanguy Leroux d86a7ad6d2 Give more time to AutoFollowIT tests (#51938)
AutoFollowIT tests are regularly failing on CI because they rely 
on how cluster state updates are processed within the integration 
clusters. We tried to limit this in #49141 by moving to latches 
instead of waiting for assertions to pass but there are still some 
places were it still need to wait for the cluster state updates to 
be processed and auto-follow stats to be updated.

This commit gives more time to assertBusy() that verifies the 
AutoFollowStats (up to 60 seconds) and also always log the 
auto-follow stats in case the assertions failed.

Closes #48982
2020-02-05 15:57:27 +01:00
Costin Leau bd6d9e063c EQL: Add missing commit messages for #51940
* EQL: Plug query params into the AstBuilder (#51886)

As the eventType is customizable, plug that into the parser based on the
given request.

(cherry picked from commit 5b4a3a3c07eacbc339cbd4c05a3621d056cc8d60)

* EQL: Add field resolution and verification (#51872)

Add basic field resolution inside the Analyzer and a basic Verifier to
check for any unresolved fields.

(cherry picked from commit 7087358ae2fb212811d480ec8641a46167946c82)

* EQL: Introduce basic execution pipeline (#51809)

Add main classes that form the 'execution' pipeline are added - most of
them have no functionality; the purpose of this PR is to add flesh out
the contract between the various moving parts so that work can start on
them independently.

(cherry picked from commit 9a1bae50a49af7fe8467b74b154c0d82c6bb9a19)

* EQL: Add AstBuilder to convert to QL tree (#51558)

* EQL: Add AstBuilder visitors
* EQL: Add tests for wildcards and sets
* EQL: Fix licensing
* EQL: Fix ExpressionTests.java license
* EQL: Cleanup imports
* EQL: PR feedback and remove LiteralBuilder
* EQL: Split off logical plan from expressions
* EQL: Remove stray import
* EQL: Add predicate handling for set checks
* EQL: Remove commented out dead code
* EQL: Remove wildcard test, wait until analyzer

(cherry picked from commit a462700f9c8e1fb977d62d42eb0077403b8fa98b)

* EQL grammar updates and tests (#49658)

* EQL: Additional tests and grammar updates
* EQL: Add backtick escaped identifiers
* EQL: Adding keywords to language
* EQL: Add checks for unsupported syntax
* EQL: Testing updates and PR feedback
* EQL: Add string escapes
* EQL: Cleanup grammar for identifier
* EQL: Remove tabs from .eql tests

(cherry picked from commit 6f1890bf2d52cabdfd1e7848fb481cf54b895f25)
2020-02-05 16:53:42 +02:00
Costin Leau 6ff0e411a8
EQL: backport updates to 7.x (#51940) 2020-02-05 16:45:58 +02:00
Benjamin Trent 79f143907a
[7.x] [ML] add _cat/ml/trained_models API (#51529) (#51936)
* [ML] add _cat/ml/trained_models API (#51529)

This adds _cat/ml/trained_models.
2020-02-05 08:26:44 -05:00
Marios Trivyzas 64f9a2089b SQL: [Tests] add tests for literals and GROUP BY (#51878)
Add unit and integration tests where literals are SELECTed
in combination with GROUP BY and possibly aggregate functions.

Relates to #41411 and #34583
which have been fixed.

(cherry picked from commit b97f1ca12675d6ea4772c60578922fe1cc2409ee)
2020-02-05 12:55:56 +01:00
Ignacio Vera ababd730f6
Histogram field: Use #name() instead of #simpleName() when generating doc values (#51920) (#51927) 2020-02-05 12:35:49 +01:00
Adrien Grand ad9d2f1922
Move analysis/mappings stats to cluster-stats. (#51875)
Closes #51138
2020-02-05 11:02:25 +01:00
debadair c0156cbb5d
Backporting updates to ILM org, overview, & GS (#51898)
* [DOCS] Align with ILM API docs (#48705)

* [DOCS] Reconciled with Snapshot/Restore reorg

* [DOCS] Split off ILM overview to a separate topic. (#51287)

* [DOCS} Split off overview to a separate topic.

* [DOCS] Incorporated feedback from @jrodewig.

* [DOCS] Edit ILM GS tutorial (#51513)

* [DOCS] Edit ILM GS tutorial

* [DOCS] Incorporated review feedback from @andreidan.

* [DOCS] Removed test link & fixed anchor & title.

* Update docs/reference/ilm/getting-started-ilm.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Fixed glossary merge error.

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-02-04 16:45:18 -08:00
Lee Hinman 0be61a3662
[7.x] Adding best_compression (#49974) (763480ee) (#51819)
* Adding best_compression (#49974)

This commit adds a `codec` parameter to the ILM `forcemerge` action. When setting the codec to `best_compression` ILM will close the index, then update the codec setting, re-open the index, and finally perform a force merge.

* Fix ForceMergeAction toSteps construction (#51825)

There was a duplicate force merge step and the test continued to fail. This commit clarifies the
`toStep` method and changes the `assertBestCompression` method for better readability.

Resolves #51822

* Update version constants

Co-authored-by: Sivagurunathan Velayutham <sivadeva.93@gmail.com>
2020-02-04 14:15:43 -07:00
Julie Tibshirani 38ce428831
Create a class to hold field capabilities for one index. (#51844)
Currently, the same class `FieldCapabilities` is used both to represent the
capabilities for one index, and also the merged capabilities across indices. To
help clarify the logic, this PR proposes to create a separate class
`IndexFieldCapabilities` for the capabilities in one index. The refactor will
also help when adding `source_path` information in #49264, since the merged
source path field will have a different structure from the field for a single index.

Individual changes:
* Add a new class IndexFieldCapabilities.
* Remove extra constructor from FieldCapabilities.
* Combine the add and merge methods in FieldCapabilities.Builder.
2020-02-04 11:24:57 -08:00
Hendrik Muhs b7aace44f3 mark transform API's stable (#51862)
mark transform API's stable, meaning making transform GA for the next minor release
2020-02-04 16:13:47 +01:00
David Roberts 9d55c45b5a [ML] Improve multiline_start_pattern for CSV in find_file_structure (#51737)
The work to switch file upload over to treating delimited files
like semi-structured text and using the ingest pipeline for CSV
parsing makes the multi-line start pattern used for delimited
files much more critical than it used to be.

Previously it was always based on the time field, even if that
was towards the end of the columns, and no multi-line pattern
was created if no timestamp was detected.

This change improves the multi-line start pattern by:

1. Never creating a multi-line pattern if the sample contained
   only single line records.  This improves the import
   efficiency in a common case.
2. Choosing the leftmost field that has a well-defined pattern,
   whether that be the time field or a boolean/numeric field.
   This reduces the risk of a field with newlines occurring
   earlier, and also means the algorithm doesn't automatically
   fail for data without a timestamp.
2020-02-04 12:37:48 +00:00
Hendrik Muhs c2b08bb72f [Transform] add support for percentile aggs (#51808)
make transform ready for multi value aggregations and add support for percentile

fixes #51663
2020-02-04 12:02:20 +01:00
Hendrik Muhs 5d5f3ce256 [Transform] improve irrecoverable error detection
treat resource not found and illegal argument exceptions as irrecoverable error

relates #50135
2020-02-04 10:36:35 +01:00
Benjamin Trent d293980a09
[7.x] [ML] add GET _cat/ml/datafeeds (#51500) (#51829)
* [ML] add GET _cat/ml/datafeeds (#51500)

This adds GET _cat/ml/datafeeds && _cat/ml/datafeeds/{datafeed_id}

* fixing for java8 compilation
2020-02-03 17:16:33 -05:00
Jonathan Budzenski 8fa4a40bdf [rest spec] fill in documentation links for security.{put,delete}_privileges (#48482)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-03 10:53:50 -06:00
James Rodewig 4ea7297e1e
[DOCS] Change http://elastic.co -> https (#48479) (#51812)
Co-authored-by: Jonathan Budzenski <jon@budzenski.me>
2020-02-03 09:50:11 -05:00
Dan Hermann 4083eae0b7
[7.x] Secure password for monitoring HTTP exporter (#51775)
Adds a secure and reloadable SECURE_AUTH_PASSWORD setting to allow keystore entries in the form "xpack.monitoring.exporters.*.auth.secure_password" to securely supply passwords for monitoring HTTP exporters. Also deprecates the insecure `AUTH_PASSWORD` setting.
2020-02-03 07:42:30 -06:00
Andrei Dan 81388051d8
Reenable testWhenUserLimitedByOnlyAliasOfIndexCanWriteToIndexWhichWasRolledoverByILMPolicy (#51768) (#51801)
We suspect the flakiness could’ve come from the fact that the rollover
step used to create the new index and roll the write alias to the new
index in separate cluster state updates. So the assertion that the
rolled index exists could’ve passed in the test but, before the
alias was rolled over to the new index, the subsequent write we execute
in the test (namely
`indexDocs("test_user", "x-pack-test-password", "foo_alias", 1)`)
would’ve sent the new document to the source index (ie. foo-logs-000001)

This would see the source index containing 3 documents and the rolled
index (foo-logs-000002) 0 documents.

However, we fixed this and the rollover step executes the “create index
and roll alias” in one single cluster update, so this situation should
not occur anymore.

(cherry picked from commit 834261c4fe7dd93f437eeec43c00d01ff2279f86)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-02-03 11:54:00 +00:00
David Roberts d5d8fb26fa [TEST] Remove obsolete test trace logging from NetworkDisruptionIT (#51746)
The issue this logging was added to fix (#49908) was closed in
December and the problem has not recurred so this logging is no
longer needed.
2020-02-03 11:25:53 +00:00
Karel Minarik 050c4d4c89
Fixes for the REST specification (#51791)
* REST: Test: Fix the `accept_enterprise` parameter for Get License API (#51527)

The Get License API specifies the `accept_enterprise` parameter as a `boolean`:

0ca5cb8cb6/x-pack/plugin/src/test/resources/rest-api-spec/api/license.get.json (L22-L27)

In the test, a `string` is passed however, which makes the test compilation fail in the Go client.

(cherry picked from commit e2a2169b3d44592057c143253bb56375ed3e4268)

* Fix the SQL API documentation in REST specification (#51534)

This patch fixes the SQL REST API documentation to conform to the current schema.

(cherry picked from commit c8b6a849852699883086a6ada42279f2f68d7e07)

* Fix the "slices" parameter for the Delete By Query API in the REST specification (#51535)

This patch updates the `type` parameter in the Delete By Query API: according to
[the documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-slice),
it can be set to "auto", but the type in the documentation allows only numerical values.

This prevents people from setting the parameter to "auto" eg. in the Go client,
which generates source from the specification, and sets the corresponding Go
type as number.

The patch uses the `|` notation, which we have discussed previously for encoding
a "polymorphic" parameter like this.

Related: https://github.com/elastic/go-elasticsearch/issues/77

* Fix the Enrich API documentation in REST specification (#51528)

This patch fixes the REST API documentation for the Enrich APIs to conform to the current schema.

(cherry picked from commit 59f28f4f2feeba3f6d2f0b632410577eacb28121)
2020-02-02 15:28:08 +01:00
Hendrik Muhs ed170cc548
[Transform] Fix stats can return old state information if security is enabled (#51732) (#51738)
do index refresh of the internal transform index with the system user
instead of using the calling user which does not have sufficient rights
if security is enabled

fixes #51728
2020-02-01 19:34:58 +01:00
Ryan Ernst 21224caeaf Remove comparison to true for booleans (#51723)
While we use `== false` as a more visible form of boolean negation
(instead of `!`), the true case is implied and the true value does not
need to explicitly checked. This commit converts cases that have slipped
into the code checking for `== true`.
2020-01-31 16:35:43 -08:00
Lee Hinman 4594a210bf
[7.x] Fix SnapshotLifecycleRestIT.testFullPolicySnapshot (#517… (#51778)
* Fix SnapshotLifecycleRestIT.testFullPolicySnapshot

This previously was missing some key information in the output of the failure. This captures that
information and adds logging at each step so we can determine the cause *if* it fails again.

Resolves #50358
2020-01-31 15:38:28 -07:00
Aleksandr Maus d4f6f38150
EQL: Fix #51541: [CI] unknown setting [xpack.eql.enabled] in release-tests (#51699) (#51770)
Fixes #51541
Co-authored-by: Igor Motov <igor@motovs.org>
2020-01-31 15:14:27 -05:00
Dimitris Athanasiou 55b5c8f703
[7.x][ML] Remove index.unassigned.node_left.delayed_timeout setting from M… (#51740) (#51764)
This setting was introduced with the purpose of reducing the time took by
tests that shut nodes down. Tests like `MlDistributedFailureIT` and
`NetworkDisruptionIT`. However, it is unfortunate to have to set the value
to an explicit value in production. In addition, and most important, the dynamically
choosing the value for this setting makes it impossible to adopt static index template configs
that we register via `IndexTemplateRegistry`, which we need to use in order to start
registering ILM policies for the ML indices.

This commit removes this setting from our templates. I run the tests a few times and could
not see execution time differing significantly.

Backport of #51740
2020-01-31 20:28:29 +02:00
Andrei Dan 5ca51562ec
Fix testThatNonExistingTemplatesAreAddedImmediately (#51668) (#51752)
This addresses another race condition that could yield this test flaky.

(cherry picked from commit d20d90aceb2b687239654d6f013f61f7f4cc1512)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-31 18:18:00 +00:00
Andrei Dan 20f47b14b0
Fix SnapshotLifecycleServiceTests.testPolicyCRUD (#51653) (#51755)
(cherry picked from commit 8f9a87fa576a8a1c6ea3efb29bf1296d50d89ace)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-31 18:17:38 +00:00
Przemko Robakowski 227621dd13
Change index.lifecycle.step.master_timeout to indices.lifecycle.step.master_timeout (#51744) (#51761)
* Change index.lifecycle.step.master_timeout to indices.lifecycle.step.master_timeout

This changes setting name from `index.lifecycle.step.master_timeout` to
`indices.lifecycle.step.master_timeout` to avoid confusion about its scope.
`index.*` settings are recognized as index level settings, this one is node level.

Reletes to #51698
2020-01-31 18:56:50 +01:00
Lee Hinman deefc85d60
[7.x] Stop policy on last PhaseCompleteStep instead of Termina… (#51758)
Currently when an ILM policy finishes its execution, the index moves into the `TerminalPolicyStep`,
denoted by a completed/completed/completed phase/action/step lifecycle execution state.

This commit changes the behavior so that the index lifecycle execution state halts at the last
configured phase's `PhaseCompleteStep`, so for instance, if an index were configured with a policy
containing a `hot` and `cold` phase, the index would stop at the `cold/complete/complete`
`PhaseCompleteStep`. This allows an ILM user to update the policy to add any later phases and have
indices configured to use that policy pick up execution at the newly added "later" phase. For
example, if a `delete` phase were added to the policy specified about, the index would then move
from `cold/complete/complete` into the `delete` phase.

Relates to #48431
2020-01-31 10:36:41 -07:00
Mayya Sharipova 42b885f050
Upgrade to lucene-8.5.0-snapshot-3333ce7da6d (#51749)
Backport for #51327
2020-01-31 11:20:15 -05:00
Benjamin Trent e372854d43
[ML][Inference] Fix model pagination with models as resources (#51573) (#51736)
This adds logic to handle paging problems when the ID pattern + tags reference models stored as resources. 

Most of the complexity comes from the issue where a model stored as a resource could be at the start, or the end of a page or when we are on the last page.
2020-01-31 07:52:19 -05:00
Yang Wang 77b00fc0c0
Add warnings for invalid realm order config (#51195) (#51515)
The changes are to help users prepare for migration to next major
release (v8.0.0) regarding to the break change of realm order config.

Warnings are added for when:
* A realm does not have an order config
* Multiple realms have the same order config

The warning messages are added to both deprecation API and loggings.
The main reasons for doing this are: 1) there is currently no automatic relay
between the two; 2) deprecation API is under basic and we need logging
for OSS.
2020-01-31 12:32:37 +11:00
Gordon Brown 10c8179351
Use exclusions list instead of fake system indices (#51586)
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
2020-01-30 16:31:27 -07:00
Bogdan Pintea f1173aaa48
SQL: Add optimisations for not-equalities (#51088) (#51700)
* Optimize not-equalities in con-/disjunctions

This commit adds optimisations of not-equalities in conjunctions and
disjunctions:
* for conjunctions, the not-equality can be optimized away when applied
together with a range or inequality, in case the not-equality point
falls outside the domain of the later condition; if its on the boarder,
it will modify the bound, to simply exclude the equality, if present;
otherwise no optimisation can be applied;
* for disjunctions, the not-equals could filter away the ranges and
inequalities, unless these include an equality on the bound, in which
case the entire condition becomes always true, but this would influence
the score() function, so it's been omitted;

* fix aggregations of inequalities in ranges

This commit fixes the loop that aggregates inequalities into ranges:
- it won't advance the outer loop index in case of a merge, since the
current element is removed;
- it will break the inner loop, since comparision against the element
selected in the outer loop can't continue, as it had been removed.



(cherry picked from commit 789724ac2cc726de603849b4eeb8194da7528bcc)
2020-01-30 23:29:39 +01:00
Lee Hinman b9faa0733d
[7.x] Rename ILM history index enablement setting (#51698) (#51705)
* Rename ILM history index enablement setting

The previous setting was `index.lifecycle.history_index_enabled`, this commit changes it to
`indices.lifecycle.history_index_enabled` to indicate this is not an index-level setting (it's node
level).
2020-01-30 15:27:44 -07:00
Benjamin Trent 1380dd439a
[7.x] [ML][Inference] Fix weighted mode definition (#51648) (#51695)
* [ML][Inference] Fix weighted mode definition (#51648)

Weighted mode inaccurately assumed that the "max value" of the input values would be the maximum class value. This does not make sense. 

Weighted Mode should know how many classes there are. Hence the new parameter `num_classes`. This indicates what the maximum class value to be expected.
2020-01-30 15:33:25 -05:00
Nhat Nguyen 1cba5d7c4b Force flush in FrozenEngine#testSearchers (#51635)
We need to force flush to make the last commit safe; otherwise, we might 
fail to open FrozenEngine. Note that we force flush before closing a
shard.

Closes #51620
2020-01-30 14:48:45 -05:00
Benjamin Trent 2a2a0941af
[ML][Inference] stream inflate to parser + throw when byte limit is reached (#51644) (#51679)
Three fixes for when the `compressed_definition` is utilized on PUT

* Update the inflate byte limit to be the minimum of 10% the max heap, or 1GB (what it was previously)
* Stream data directly to the JSON parser, so if it is invalid, we don't have to inflate the whole stream to find out
* Throw when the maximum bytes are reach indicating that is why the request was rejected
2020-01-30 10:16:14 -05:00
Marios Trivyzas f373020349 SQL: Fix ORDER BY YEAR() function (#51562)
Previously, if YEAR() was used as and ORDER BY argument without being
wrapped with another scalar (e.g. YEAR(birth_date) + 10), no script
ordering was used but instead the underlying field (e.g. birth_date)
was used instead as a performance optimisation. This works correctly if
YEAR() is the only ORDER BY arg but if further args are used as tie
breakers for the ordering wrong results are produced. This is because
2 rows with the different birth_date but on the same year are not tied
as the underlying ordering is on birth_date and not on the
YEAR(birth_date), and the following ORDER BY args are ignored.

Remove this optimisation for YEAR() to avoid incorrect results in
such cases.

As a consequence another bug is revealed: scalar functions on top
of nested fields produce scripted sorting/filtering which is not yet
supported. In such cases no error was thrown but instead all values for
such nested fields were null and were passed to the script implementing
the sorting/filtering, producing incorrect results.

Detect such cases and throw a validation exception.

Fixes: #51224
(cherry picked from commit f41efd6753dc3650a7eabb3e07b02b3b32c5704c)
2020-01-30 15:29:36 +01:00
Marios Trivyzas 285a167c34 SQL: Verify Full-Text Search functions not allowed in SELECT (#51568)
Add a verification that full-text search functions `MATCH()` and `QUERY()`
are not allowed in the SELECT clause, so that a nice error message is
returned to the user early instead of an "ugly" exception.

Fixes: #47446
2020-01-30 13:14:38 +01:00
Albert Zaharovits f25b6cc2eb
Add new 'maintenance' index privilege #50643
This commit creates a new index privilege named `maintenance`.
The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`),
and `force-merge`. Previously the actions were only under the `manage` privilege
which in some situations was too permissive.

Co-authored-by: Amir H Movahed <arhd83@gmail.com>
2020-01-30 11:59:11 +02:00
Henning Andersen 149b68d850 [ML] Fix possible race condition starting datafeed (#51646)
Datafeeds being closed while starting could result in and NPE. This was
handled as any other failure, masking out the NPE. However, this
conflicts with the changes in #50886.

Related to #50886 and #51302
2020-01-30 08:23:45 +01:00
Aleksandr Maus 0d21d9e2c5
EQL: Enable QA/rest integration tests for snapshot builds only (#51624) (#51645)
* Related to #51541: [CI] unknown setting [xpack.eql.enabled] in release-tests
2020-01-29 16:38:52 -05:00
Julie Tibshirani 9dcc3ef7e6
Always use one shard in vector REST tests. (#51643)
This PR tries to address the intermittent vector test failures on 7.x by making
sure we create indices with one shard.

The fix is based on this theory as to what's happening:
* On 7.x, the default number of shards is 1, but in REST tests we randomly use
2 in order to cover the multiple shards case. In the failing test run, we use 2
shards and all documents end up on only one shard.
* During a search, the response from the empty shard doesn't produce
deprecation warnings because  we never try to execute the script. If not all
shard responses contain the warning headers, then certain deprecation warnings
can be lost (due to the bug described in #33936).

Addresses #50716.
Relates to #50061.
2020-01-29 12:24:41 -08:00
Przemysław Witek 683170b007
Increase the number of indexed documents to increase a chance that there are at least 2 training rows. (#51607) (#51615) 2020-01-29 17:17:19 +01:00
David Roberts e0e35b7feb [TEST] Mute TimeSeriesLifecycleActionsIT.testWaitForSnapshotSlmExecutedBefore
Due to https://github.com/elastic/elasticsearch/issues/50781
2020-01-29 13:08:55 +01:00
Martijn van Groningen b253af36f3
The watcher indexing listener didn't handle document level exceptions. (#51466)
Prior to the change the watcher index listener didn't implement the
`postIndex(ShardId, Engine.Index, Engine.IndexResult)` method. This
caused document level exceptions like VersionConflictEngineException
to be ignored. This commit fixes this.

The watcher indexing listener did implement the `postIndex(ShardId, Engine.Index, Exception)`
method, but that only handles engine level exceptions.

This change also unmutes the SmokeTestWatcherTestSuiteIT#testMonitorClusterHealth test again.

Relates to #32299
2020-01-29 12:55:02 +01:00
Rory Hunter d8bd736f8a
Formatting: keep simple if / else on the same line (#51544)
Backport of #51526.

Previous the formatter was breaking simple if/else statements (i.e.
without braces) onto separate lines, which could be fragile because the
formatter cannot also introduce braces. Instead, keep such expressions
on the same line.
2020-01-29 10:42:04 +00:00
Albert Zaharovits 90285ee907
Deprecate timeout.tcp_read AD/LDAP realm setting (#47305)
The timeout.tcp_read AD/LDAP realm setting, despite the low-level
allusion, controls the time interval the realms wait for a response for
a query (search or bind). If the connection to the server is synchronous
(un-pooled) the response timeout is analogous to the tcp read timeout.
But the tcp read timeout is irrelevant in the common case of a pooled
connection (when a Bind DN is specified).

The timeout.tcp_read qualifier is hereby deprecated in favor of
timeout.response.

In addition, the default value for both timeout.tcp_read and
timeout.response is that of timeout.ldap_search, instead of the 5s (but
the default for timeout.ldap_search is still 5s). The
timeout.ldap_search defines the server-controlled timeout of a search
request. There is no practical use case to have a smaller tcp_read
timeout compared to ldap_search (in this case the request would time-out
on the client but continue to be processed on the server). The proposed
change aims to simplify configuration so that the more common
configuration change, adjusting timeout.ldap_search up, has the expected
result (no timeout during searches) without any additional
modifications.

Closes #46028
2020-01-29 10:48:26 +02:00
Jason Tedor 3a7192966a
Check if interface is up for loopback devices only (#51583)
In the SQL with SSL tests, we need to find the interfaces that are up,
are loopback devices, or have a loopback address. If we check if the
device is up first, we can run into situations where the device is a
virtual ethernet device that might have disappeared between us seeing
the device, and checking if it is up. By first checking if the device is
a loopback device or it has a loopback address, then we can avoid
checking if the device is up except for loopback devices and therefore
we can avoid the disappearing virtual ethernet device problem.
2020-01-28 18:38:46 -05:00
Armin Braun aae93a7578
Allow Repository Plugins to Filter Metadata on Create (#51472) (#51542)
* Allow Repository Plugins to Filter Metadata on Create

Add a hook that allows repository plugins to filter the repository metadata
before it gets written to the cluster state.
2020-01-28 18:33:26 +01:00
Gordon Brown 89c2834b24
Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959)
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.

This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
2020-01-28 10:01:16 -07:00
Yannick Welsch f6686345c9 Avoid unnecessary setup and teardown in docs tests (#51430)
The docs tests have recently been running much slower than before (see #49753).

The gist here is that with ILM/SLM we do a lot of unnecessary setup / teardown work on each
test. Compounded with the slightly slower cluster state storage mechanism, this causes the
tests to run much slower.

In particular, on RAMDisk, docs:check is taking

ES 7.4: 6:55 minutes
ES master: 16:09 minutes
ES with this commit: 6:52 minutes

on SSD, docs:check is taking

ES 7.4: ??? minutes
ES master: 32:20 minutes
ES with this commit: 11:21 minutes
2020-01-28 16:52:23 +01:00
David Roberts 550254ec7f [ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492)
Changes the find_file_structure response to include a CSV
ingest processor in the ingest pipeline it suggests.

Previously the Kibana file upload functionality parsed CSV
in the browser, but by parsing CSV in the ingest pipeline
it makes the Kibana file upload functionality more easily
interchangable with Filebeat such that the configurations
it creates can more easily be used to import data with the
same structure repeatedly in production.
2020-01-28 14:38:43 +00:00
Aleksandr Maus a8bd4d08e3 Merge branch 'feature/eql_backport' into 7.x 2020-01-28 09:19:39 -05:00
Hendrik Muhs 53e4d1ef07 [Transform] fix TransformRobustnessIT intermittent test failures part 2 (#51523)
add wait for completion in transform robustness test to avoid occasional test failures during cleanup

fixes #51347
2020-01-28 13:37:01 +01:00
William Brafford 9efa5be60e
Password-protected Keystore Feature Branch PR (#51123) (#51510)
* Reload secure settings with password (#43197)

If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.

* Add passphrase support to elasticsearch-keystore (#38498)

This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.

Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores

When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.

When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase

Passphrase can be set to be empty when changing/setting it for an
existing keystore

Relates to: #32691
Supersedes: #37472

* Restore behavior for force parameter (#44847)

Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.

*  Handle pwd protected keystores in all CLI tools  (#45289)

This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.

* Modify docs for setup passwords and saml metadata cli (#45797)

Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Elasticsearch keystore passphrase for startup scripts (#44775)

This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.

The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)

In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.

In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.

A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.

* Adjust docs for password protected keystore (#45054)

This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.

* Fix failing Keystore Passphrase test for feature branch (#50154)

One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.

It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.

* Run keystore management tests on docker distros (#50610)

* Add Docker handling to PackagingTestCase

Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.

* Improve ES startup check for docker

Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.

* Test password-protected keystore with Docker (#50803)

This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.

We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.

* Add documentation for keystore startup prompting (#50821)

When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.

Co-authored-by: Lisa Cawley <lcawley@elastic.co>

* Warn when unable to upgrade keystore on debian (#51011)

For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.

* Restore handling of string input

Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null

* Apply spotless reformatting

* Use '--since' flag to get recent journal messages

When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.

Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.

It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.

Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.

* Use new journald wrapper pattern

* Update version added in secure settings request

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
2020-01-28 05:32:32 -05:00
Hendrik Muhs 2239ba8c6e
[Transform] avoid mapping problems with index templates (#51368) (#51519)
insert explict mappings for objects in nested output to avoid clashes with index templates

fixes #51321
2020-01-28 11:31:07 +01:00