1522 Commits

Author SHA1 Message Date
Lee Hinman
013d87d716 Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAlloc… (#47313)
* Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAllocated

These tests were using randomly generated includes/excludes/requires for
routing, however, it was possible to generate mutually exclusive
allocation settings (about 1 out of 50,000 times for my runs).

This splits the test into three different tests, and removes the
randomization (it doesn't add anything to the testing here) to fix the
issue.

Resolves #47142
2019-10-02 10:01:23 -06:00
Benjamin Trent
2228a7dd8d
[ML][Inference] adding ensemble model objects (#47241) (#47438)
* [ML][Inference] adding ensemble model objects

* addressing PR comments

* Update TreeTests.java

* addressing PR comments

* fixing test
2019-10-02 09:49:46 -04:00
Dimitris Athanasiou
b9541eb3af
[7.x][ML] Make PUT data frame analytics action a master node action (… (#47433)
While it seemed like the PUT data frame analytics action did not
have to be a master node action as the config is stored in an index
rather than the cluster state, there are other subtle nuances which
make it worthwhile to convert it. In particular, it helps maintain
order of execution for put actions which are anyhow user driven and
are expected to have low volume.

This commit converts `TransportPutDataFrameAnalyticsAction` from
a handled transport action to a master node action.

Note this means that the action might fail in a mixed cluster
but as the API is still experimental and not widely used there will
be few moments more suitable to make this change than now.
2019-10-02 16:24:21 +03:00
Yannick Welsch
7b2613db55 Allow optype CREATE for append-only indexing operations (#47169)
Bulk requests currently do not allow adding "create" actions with auto-generated IDs.
This commit allows using the optype CREATE for append-only indexing operations. This is
mainly the user facing aspect of it.
2019-10-02 14:16:52 +02:00
David Roberts
4379a3c52b [ML] Throttle the delete-by-query of expired results (#47177)
Due to #47003 many clusters will have built up a
large backlog of expired results. On upgrading to
a version where that bug is fixed users could find
that the first ML daily maintenance task deletes
a very large amount of documents.

This change introduces throttling to the
delete-by-query that the ML daily maintenance uses
to delete expired results to limit it to deleting an
average 200 documents per second. (There is no
throttling for state/forecast documents as these
are expected to be lower volume.)

Additionally a rough time limit of 8 hours is applied
to the whole delete expired data action. (This is only
rough as it won't stop part way through a single
operation - it only checks the timeout between
operations.)

Relates #47103
2019-10-02 11:16:34 +01:00
Dimitris Athanasiou
36884a3c32
[7.x][ML] Restore analytics state if available (#47128) (#47393)
This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.

Note that in order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.
2019-10-02 10:24:05 +03:00
Benjamin Trent
f5fe5e7cd6
[7.x] [ML][Inference] Adding preprocessors to definition object (#47320) (#47370)
* [ML][Inference] Adding preprocessors to definition object (#47320)

* [ML][Inference] Adding preprocessors to definition object

* Update TrainedModelConfig.java

* adjusting for backport
2019-10-01 13:31:25 -04:00
Albert Zaharovits
78558a7b2f
Fix AD realm additional metadata (#47179)
Due to a regression bug the metadata Active Directory realm
setting is ignored (it works correctly for the LDAP realm type).
This commit redresses it.

Closes #45848
2019-10-01 17:05:25 +03:00
Benjamin Trent
4335e07716
[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267) (#47310)
* [ML][Inference] adding .ml-inference* index and storage (#47267)

* [ML][Inference] adding .ml-inference* index and storage

* Addressing PR comments

* Allowing null definition, adding validation tests for model config

* fixing line length

* adjusting for backport
2019-10-01 08:20:33 -04:00
Armin Braun
3d23cb44a3
Speed up Snapshot Finalization (#47283) (#47309)
As a result of #45689 snapshot finalization started to
take significantly longer than before. This may be a
little unfortunate since it increases the likelihood
of failing to finalize after having written out all
the segment blobs.
This change parallelizes all the metadata writes that
can safely run in parallel in the finalization step to
speed the finalization step up again. Also, this will
generally speed up the snapshot process overall in case
of large number of indices.

This is also a nice to have for #46250 since we add yet
another step (deleting of old index- blobs in the shards
to the finalization.
2019-09-30 23:28:59 +02:00
Martijn van Groningen
fe937ea4b8
Add config namespace in get policy api response (#47162)
Currently the policy config is placed directly in the json object
of the toplevel `policies` array field. For example:

```
{
    "policies": [
        {
            "match": {
                "name" : "my-policy",
                "indices" : ["users"],
                "match_field" : "email",
                "enrich_fields" : [
                    "first_name",
                    "last_name",
                    "city",
                    "zip",
                    "state"
                ]
            }
        }
    ]
}
```

This change adds a `config` field in each policy json object:

```
{
    "policies": [
        {
            "config": {
                "match": {
                    "name" : "my-policy",
                    "indices" : ["users"],
                    "match_field" : "email",
                    "enrich_fields" : [
                        "first_name",
                        "last_name",
                        "city",
                        "zip",
                        "state"
                    ]
                }
            }
        }
    ]
}
```

This allows us in the future to add other information about policies
in the get policy api response.

The UI will consume this API to build an overview of all policies.
The UI may in the future include additional information about a policy
and the plan is to include that in the get policy api, so that this
information can be gathered in a single api call.

An example of the information that is likely to be added is:
* Last policy execution time
* The status of a policy (executing, executed, unexecuted)
* Information about the last failure if exists
2019-09-30 14:37:23 +02:00
Jason Tedor
2cba323b4e
Remove use of get raw in token/API key settings (#47260)
These settings were using get raw to fallback to whether or not SSL is
enabled. Yet, we have a formal mechanism for falling back to a
setting. This commit cuts over to that formal mechanism.
2019-09-30 06:35:58 -04:00
Martijn van Groningen
66f72bcdbc
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-30 08:12:28 +02:00
Rory Hunter
53a4d2176f
Convert most awaitBusy calls to assertBusy (#45794) (#47112)
Backport of #45794 to 7.x. Convert most `awaitBusy` calls to
`assertBusy`, and use asserts where possible. Follows on from #28548 by
@liketic.

There were a small number of places where it didn't make sense to me to
call `assertBusy`, so I kept the existing calls but renamed the method to
`waitUntil`. This was partly to better reflect its usage, and partly so
that anyone trying to add a new call to awaitBusy wouldn't be able to find
it.

I also didn't change the usage in `TransportStopRollupAction` as the
comments state that the local awaitBusy method is a temporary
copy-and-paste.

Other changes:

  * Rework `waitForDocs` to scale its timeout. Instead of calling
    `assertBusy` in a loop, work out a reasonable overall timeout and await
    just once.
  * Some tests failed after switching to `assertBusy` and had to be fixed.
  * Correct the expect templates in AbstractUpgradeTestCase.  The ES
    Security team confirmed that they don't use templates any more, so
    remove this from the expected templates. Also rewrite how the setup
    code checks for templates, in order to give more information.
  * Remove an expected ML template from XPackRestTestConstants The ML team
    advised that the ML tests shouldn't be waiting for any
    `.ml-notifications*` templates, since such checks should happen in the
    production code instead.
  * Also rework the template checking code in `XPackRestTestHelper` to give
    more helpful failure messages.
  * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4
2019-09-29 12:21:46 +01:00
Martijn van Groningen
7ffe2e7e63
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-27 14:42:11 +02:00
Andrei Dan
4c909438dd
Fix OriginationDate parsing tests. (#47170) (#47200)
Drop the usage of `SimpleDateFormat` and use the `DateFormatter` instead

(cherry picked from commit 7cf509a7a11ecf6c40c44c18e8f03b8e81fcd1c2)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-09-27 13:16:45 +01:00
Przemysław Witek
3fbd58d156
[7.x] Allow evaluation to consist of multiple steps. (#46653) (#47194) 2019-09-27 13:01:51 +02:00
Martijn van Groningen
8a4eefdd83
Expose enrich stats api to monitoring. (#46708)
This change also slightly modifies the stats response,
so that is can easier consumer by monitoring and other
users. (coordinators stats are now in a list instead of
a map and has an additional field for the node id)

Relates to #32789
2019-09-26 11:04:33 +02:00
Yogesh Gaikwad
9a64b7a888
[Backport] Validate query field when creating roles (#46275) (#47094)
In the current implementation, the validation of the role query
occurs at runtime when the query is being executed.

This commit adds validation for the role query when creating a role
but not for the template query as we do not have the runtime
information required for evaluating the template query (eg. authenticated user's
information). This is similar to the scripts that we
store but do not evaluate or parse if they are valid queries or not.

For validation, the query is evaluated (if not a template), parsed to build the
QueryBuilder and verify if the query type is allowed.

Closes #34252
2019-09-26 17:57:36 +10:00
Benjamin Trent
fcddaa90de
[7.x] [ML][Inference] adding tree model (#47044) (#47141)
* [ML][Inference] adding tree model (#47044)

* [ML][Inference] adding tree model

* renaming features for updated schema

* fixing 7.x compilation
2019-09-25 19:11:15 -04:00
Andrei Dan
27520cac3b
ILM: parse origination date from index name (#46755) (#47124)
* ILM: parse origination date from index name (#46755)

Introduce the `index.lifecycle.parse_origination_date` setting that
indicates if the origination date should be parsed from the index name.
If set to true an index which doesn't match the expected format (namely
`indexName-{dateFormat}-optional_digits` will fail before being created.
The origination date will be parsed when initialising a lifecycle for an
index and it will be set as the `index.lifecycle.origination_date` for
that index.

A user set value for `index.lifecycle.origination_date` will always
override a possible parsable date from the index name.

(cherry picked from commit c363d27f0210733dad0c307d54fa224a92ddb569)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>

* Drop usage of Map.of to be java 8 compliant
2019-09-25 21:44:16 +01:00
Lee Hinman
a267df30fa Wait for snapshot completion in SLM snapshot invocation (#47051)
* Wait for snapshot completion in SLM snapshot invocation

This changes the snapshots internally invoked by SLM to wait for
completion. This allows us to capture more snapshotting failure
scenarios.

For example, previously a snapshot would be created and then registered
as a "success", however, the snapshot may have been aborted, or it may
have had a subset of its shards fail. These cases are now handled by
inspecting the response to the `CreateSnapshotRequest` and ensuring that
there are no failures. If any failures are present, the history store
now stores the action as a failure instead of a success.

Relates to #38461 and #43663
2019-09-25 14:25:22 -06:00
Gordon Brown
a46eef9634
Change SLM stats format (#46991)
Using arrays of objects with embedded IDs is preferred for new APIs over
using entity IDs as JSON keys.  This commit changes the SLM stats API to
use the preferred format.
2019-09-25 11:32:08 -06:00
Benjamin Trent
05fb7be571
[7.x] [ML][Inference] Feature pre-processing objects and functions (#46777) (#47040)
* [ML][Inference] Feature pre-processing objects and functions (#46777)

To support inference on pre-trained machine learning models, some basic feature encoding will be necessary. I am using a named object serialization approach so new encodings/pre-processing steps could be added in the future. 

This PR lays down the ground work for 3 basic encodings:

* HotOne
* Target Mean
* Frequency

More feature encodings or pre-processings could be added in the future:

* Handling missing columns
* Standardization
* Label encoding
* etc....

* fixing compilation for namedxcontent tests
2019-09-25 08:16:24 -04:00
Ioannis Kakavas
23bceaadf8
Handle RelayState in preparing a SAMLAuthN Request (#46534) (#47092)
This change allows for the caller of the `saml/prepare` API to pass
a `relay_state` parameter that will then be part of the redirect
URL in the response as the `RelayState` query parameter.

The SAML IdP is required to reflect back the value of that relay
state when sending a SAML Response. The caller of the APIs can
then, when receiving the SAML Response, read and consume the value
as it see fit.
2019-09-25 13:23:46 +03:00
Yogesh Gaikwad
6f453aa6b2
Validate index and cluster privilege names when creating a role (#46361) (#47063)
This commit adds validation so a role cannot be created with
invalid index or cluster privilege name.

Closes #29703
2019-09-25 18:57:11 +10:00
Hendrik Muhs
7377ac4637 [Transform] Replace transforms with transform, index constants (#47023)
- replace "transforms" with "transform" for consistency
 - use constants for internal index naming wherever possible and document required changes
2019-09-25 08:31:43 +02:00
James Baiera
a349b22273 Add the cluster version to enrich policies (#45021)
Adds the Elasticsearch version as a field on the EnrichPolicy object
2019-09-23 18:44:45 -04:00
Julie Tibshirani
9124c94a6c
Add support for aliases in queries on _index. (#46944)
Previously, queries on the _index field were not able to specify index aliases.
This was a regression in functionality compared to the 'indices' query that was
deprecated and removed in 6.0.

Now queries on _index can specify an alias, which is resolved to the concrete
index names when we check whether an index matches. To match a remote shard
target, the pattern needs to be of the form 'cluster:index' to match the
fully-qualified index name. Index aliases can be specified in the following query
types: term, terms, prefix, and wildcard.
2019-09-23 13:21:37 -07:00
Martijn van Groningen
0cfddca61d
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-23 09:46:05 +02:00
Michael Basnight
f1c7ed647b Allow comma separated ids in get enrich policy API (#46351)
This commit changes the GET REST api so it will accept an optional comma
separated list of enrich policy ids. This change also modifies the
behavior of the GET API in that it will not error if it is passed a bad
enrich id anymore, but will instead just return an empty list.
2019-09-20 10:06:58 -05:00
Hendrik Muhs
4a2cb05162 add message about transform disabled if license is missing (#46901)
adds a message for transform about what happens if no license has been activated
2019-09-20 13:47:40 +02:00
Hendrik Muhs
abe889af75
[7.5][Transform] rename classes in transform plugin (#46867)
rename classes and settings in transform plugin, provide BWC for old settings
2019-09-20 10:43:00 +02:00
Dimitris Athanasiou
02a5e153dc
[7.x][ML] Parse and index data frame analytics state (#46804) (#46820)
This commit reuses the same state processor that is used for autodetect
to parse state output from data frame analytics jobs. We then index the
state document into the state index.

Backport of #46804
2019-09-18 20:37:40 +03:00
Benjamin Trent
9cf9c64ec2
[7.x] [ML][Transforms] remove force flag from _start (#46414) (#46748)
* [ML][Transforms] remove `force` flag from _start (#46414)

* [ML][Transforms] remove `force` flag from _start

* fixing expected error message

* adjusting bwc version
2019-09-18 10:06:05 -04:00
Lee Hinman
b85468d6ea
Add node setting for disabling SLM (#46794) (#46796)
This adds the `xpack.slm.enabled` setting to allow disabling of SLM
functionality as well as its HTTP API endpoints.

Relates to #38461
2019-09-17 17:39:41 -06:00
Oliver Gupte
cbd58d3b78
Give kibana user privileges to create APM agent config index (#46765) (#46792)
* Give kibana user reserved role privileges on .apm-* to create APM agent configuration index.

* fixed test to include checking all .apm-* permissions

* changed pattern from ".apm-*" to the more specific ".apm-agent-configuration"
2019-09-17 15:01:42 -07:00
Armin Braun
b0f09b279f
Make Snapshot Logic Write Metadata after Segments (#45689) (#46764)
* Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581
* Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons
   * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization
* Fixes #41581
2019-09-17 13:09:39 +02:00
Przemysław Witek
e49be611ad
[7.x] Add audit messages for Data Frame Analytics (#46521) (#46738) 2019-09-16 21:21:38 +02:00
Hendrik Muhs
c8f52ec4ff
[Transform] Rename data frame plugin to transform: classes in xpack.core (#46644) (#46734)
rename classes in xpack.core of transform plugin from "data frame transform" to "transform"
2019-09-16 13:39:22 +02:00
Andrei Dan
c57cca98b2
[ILM] Add date setting to calculate index age (#46561) (#46697)
* [ILM] Add date setting to calculate index age

Add the `index.lifecycle.origination_date` to allow users to configure a
custom date that'll be used to calculate the index age for the phase
transmissions (as opposed to the default index creation date).

This could be useful for users to create an index with an "older"
origination date when indexing old data.

Relates to #42449.

* [ILM] Don't override creation date on policy init

The initial approach we took was to override the lifecycle creation date
if the `index.lifecycle.origination_date` setting was set. This had the
disadvantage of the user not being able to update the `origination_date`
anymore once set.

This commit changes the way we makes use of the
`index.lifecycle.origination_date` setting by checking its value when
we calculate the index age (ie. at "read time") and, in case it's not
set, default to the index creation date.

* Make origination date setting index scope dynamic

* Document orignation date setting in ilm settings

(cherry picked from commit d5bd2bb77ee28c1978ab6679f941d7c02e389d32)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-09-16 08:50:28 +01:00
Luca Cavanna
e57756492a Update http-core and http-client dependencies (#46549)
Relates to #45808
Closes #45577
2019-09-12 09:45:29 +02:00
James Rodewig
f9bf10f2b6
[DOCS] Change "a SSL" to "an SSL" in the Java docs (#46524) (#46618) 2019-09-11 15:55:57 -04:00
Lee Hinman
09a9cefaa0 Handle partial failure retrieving segments in SegmentCountStep (#46556)
Since the `IndicesSegmentsRequest` scatters to all shards for the index,
it's possible that some of the shards may fail. This adds failure
handling and logging (since this is a best-effort step in the first
place) for this case.
2019-09-11 10:29:31 -06:00
Jim Ferenczi
23bf310c84 Replace the SearchContext with QueryShardContext when building aggregator factories (#46527)
This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them.
The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`.

Relates #46523
2019-09-11 16:43:30 +02:00
Hendrik Muhs
efea581dcc
[7.x][Transform]Rename data frame plugin to transform: plugin and package names (#46583)
rename data frame transform plugin to transform:

 - rename plugin data-frame to transform
 - change all package names from o.e.*.dataframe.* to o.e.*.transform.*
 - necessary changes to fix loading/testing
2019-09-11 14:50:08 +02:00
Armin Braun
41633cb9b5
More Efficient Ordering of Shard Upload Execution (#42791) (#46588)
* More Efficient Ordering of Shard Upload Execution (#42791)
* Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard
* Inspired by #39657
* Cleanup BlobStoreRepository Abort and Failure Handling (#46208)
2019-09-11 13:59:20 +02:00
Martijn van Groningen
a4b0f66919
Add enrich stats api (#46462)
The enrich api returns enrich coordinator stats and
information about currently executing enrich policies.

The coordinator stats include per ingest node:
* The current number of search requests in the queue.
* The total number of outstanding remote requests that
  have been executed since node startup. Each remote
  request is likely to include multiple search requests.
  This depends on how much search requests are in the
  queue at the time when the remote request is performed.
* The number of current outstanding remote requests.
* The total number of search requests that `enrich`
  processors have executed since node startup.

The current execution policies stats include:
* The name of policy that is executing
* A full blow task info object that is executing the policy.

Relates to #32789
2019-09-11 13:40:24 +02:00
Jim Ferenczi
425b1a77e8 Add more context to QueryShardContext (#46584)
This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext.
It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the
query shard context.

Relates #46523
2019-09-11 12:24:51 +02:00
Przemysław Witek
e38e631dac
[7.x] Implement DataFrameAnalyticsAuditMessage and DataFrameAnalyticsAuditor (#45967) (#46519) 2019-09-11 12:17:26 +02:00