Commit Graph

2083 Commits

Author SHA1 Message Date
Jason Tedor b46b6d5977
Fix compilation in DataTierTests.java
This commit fixes a compilation issue in DataTierTests.java that was
introduced due to language-level differences between 7.10/7.x and
master.
2020-10-27 13:04:55 -04:00
Jason Tedor 04a9845a49
Adjust defaults for tiered data roles (#64015)
This commit adjusts the defaults for the tiered data roles so that they
are enabled by default, or if the node has the legacy data role. This
ensures that the default experience is that the tiered data roles are
enabled.

To fully specifiy the behavior for the tiered data roles then:
 - starting a new node with the defaults: enabled
 - starting a new node with node.roles configured: enabled if and only
   if the tiered data roles are explicitly configured, independently
   of the node having the data role
 - starting a new node with node.data enabled: enabled unless the
   tiered data roles are explicitly disabled
 - starting a new node with node.data disabled: disabled unless the
   tiered data roles are explicitly enabled
2020-10-27 12:48:31 -04:00
Henning Andersen 0cba23e08f XPack Usage should run on MANAGEMENT threads (#64160)
XPack usage starts out on management threads, but depending on the
implementation of the usage plugin, they could end up running on
transport threads instead. Fixed to always reschedule on a management
thread.
2020-10-27 16:03:26 +01:00
Nhat Nguyen 566d1fd459 Return the same point in time in search response (#64188)
With this change, we will always return the same point in time in a
search response as its input until we implement the retry mechanism
for the point in times.
2020-10-27 10:17:44 -04:00
David Roberts adc5509eda
[ML] Support the unsigned_long type in data frame analytics (#64072)
Adds support for the unsigned_long type to data frame analytics.

This type is handled in the same way as the long type.  Values
sent to the ML native processes are converted to floats and
hence will lose accuracy when outside the range where a float
can uniquely represent long values.

Backport of #64066
2020-10-26 09:05:49 +00:00
Benjamin Trent eff7f06ca6
[ML] fix inference binary classification predication label and feature importance (#63688) (#63930)
When calculating feature importance, the leaf values directly correlate the value of the importance.

Consequently, positive leaf values -> positive feature importance

negative leaf values -> negative feature importance.

It follows that for binary classification, this is done such that the importance relates to the leaf values, which relate directly to the "probability of class 1".

So, the feature importance calculated is always for the importance as it relates to class 1.

The inverse is the importance as it relates to class 0.
2020-10-20 08:50:15 -04:00
Ioannis Kakavas 364511395d
[7.10] Move RestRequestFilter to core (#63507)
Move RestRequestFilter to core so that Rest requests outside xpack can use 
it to filter fields and expand its usage.

Backport of #63507
2020-10-16 13:57:52 +03:00
Jim Ferenczi 1d78bd0f72 Async search should retry updates on version conflict (#63652)
* Async search should retry updates on version conflict

The _async_search APIs can throw version conflict exception when the internal response
is updated concurrently. That can happen if the final response is written while the user
extends the expiration time. That scenario should be rare but it happened in Kibana for
several users so this change ensures that updates are retried at least 5 times. That
should resolve the transient errors for Kibana. This change also preserves the version
conflict exception in case the retry didn't work instead of returning a confusing 404.
This commit also ensures that we don't delete the response if the search was cancelled
internally and not deleted explicitly by the user.

Closes #63213
2020-10-16 08:49:02 +02:00
Albert Zaharovits f4e1e6893d Add view_index_metadata over metricbeat-* for monitoring agent (#63750)
The `remote_monitoring_agent` reserved role is extended to grant more privileges
over the metricbeat-* index pattern.
In addition to the index and create_index index privileges that it granted already,
it now also grants the view_index_metadata privilege.

Closes #63203
2020-10-16 02:13:55 +03:00
Jay Modi ebdaeb2f9a
Ensure cancelled jobs do not continue to run (#63771)
This commit ensures that jobs within the SchedulerEngine do not
continue to run after they are cancelled. There was no synchronization
between the cancel method of an ActiveSchedule and the run method, so
an actively running schedule would go ahead and reschedule itself even
if the cancel method had been called.

This commit adds synchronization between cancelling and the scheduling
of the next run to ensure that the job is cancelled. In real life
scenarios this could manifest as a job running multiple times for
SLM. This could happen if a job had been triggered and was cancelled
prior to completing its run such as if the node was no longer the
master node or if SLM was stopping/stopped.

Closes #63754
Backport of #63762
2020-10-15 14:01:14 -06:00
Albert Zaharovits 2b7fbe9957 Add the missing apikey.* fields to the logfile audit layout for docker builds (#63609)
The layout pattern for the security audit for docker builds was missing the apiKey.* fields.
2020-10-14 13:58:41 +03:00
Lee Hinman 7371e51583
[7.10] Add DiscoveryNodeRole compatibility role for bwc tier serialization (#63581) (#63613)
Backports the following commits to 7.10:

    Add DiscoveryNodeRole compatibility role for bwc tier serialization (#63581)
2020-10-13 09:17:15 -06:00
Przemysław Witek acbd48f834
[ML] Allow setting num_top_classes to a special value -1 (#63587) (#63602) 2020-10-13 13:57:50 +02:00
Dimitris Athanasiou e1c418aac7
[7.10][ML] Validate dest pipeline exists on transform update (#63494) (#63549)
Adds validation that the dest pipeline exists when a transform
is updated. Refactors the pipeline check into the `SourceDestValidator`.

Fixes #59587

Backport of #63494
2020-10-12 15:41:35 +03:00
Przemysław Witek bd761cce1d
[ML] Validate that AucRoc has the data necessary to be calculated (#63302) (#63454) 2020-10-08 09:52:15 +02:00
Alan Woodward 88b45dfa61
Convert TextFieldMapper to parametrized form (#63269) (#63392)
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.

Relates to #62988
2020-10-07 13:26:25 +01:00
Gordon Brown 15edc39d9b
Update logstash_admin role for system indices (#63368)
This PR updates the `logstash_admin` role to include the recently-added Logstash Pipeline Management APIs, as well as access to the `.logstash*` index pattern.

Co-authored-by: William Brafford <williamrandolphbrafford@gmail.com>
2020-10-06 20:43:36 -06:00
Gordon Brown 5c8b0662df
Deprecate REST access to System Indices (#63274) (Original #60945)
This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns.

Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail a few specific APIs which will continue to allow access to system indices by default:

- `GET _cluster/health`
- `GET {index}/_recovery`
- `GET _cluster/allocation/explain`
- `GET _cluster/state`
- `POST _cluster/reroute`
- `GET {index}/_stats`
- `GET {index}/_segments`
- `GET {index}/_shard_stores`
- `GET _cat/[indices,aliases,health,recovery,shards,segments]`

Deprecation warnings for accessing system indices take the form:
```
this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default
```
2020-10-06 13:41:40 -06:00
Tanguy Leroux 87076c32e2
Determine shard size before allocating shards recovering from snapshots (#61906) (#63337)
Determines the shard size of shards before allocating shards that are
recovering from snapshots. It ensures during shard allocation that the
target node that is selected as recovery target will have enough free
disk space for the recovery event. This applies to regular restores,
CCR bootstrap from remote, as well as mounting searchable snapshots.

The InternalSnapshotInfoService is responsible for fetching snapshot
shard sizes from repositories. It provides a getShardSize() method
to other components of the system that can be used to retrieve the
latest known shard size. If the latest snapshot shard size retrieval
failed, the getShardSize() returns
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While
we'd like a better way to handle such failures, returning this value
allows to keep the existing behavior for now.

Note that this PR does not address an issues (we already have today)
where a replica is being allocated without knowing how much disk
space is being used by the primary.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-10-06 18:37:05 +02:00
David Kyle ea32b4ab82
[ML] Audit message when nightly maintenance times out (#63252) (#63330)
During deletion of old ml data set the delete by query timeout to 8 hours and
audit a job message when the nightly maintenance task times out.
2020-10-06 16:19:37 +01:00
Hendrik Muhs 058c55da6a [Transform] disallow field and script being empty for group sources (#63313)
fail validation earlier when field and script are both missing in a group source
2020-10-06 16:59:02 +02:00
Yang Wang abf9b885b4
Bulk invalidate API keys using a list of IDs (#63224) (#63320)
Add a new ids field to the API of invalidating API keys so that it supports bulk
invalidation with a list of IDs.
Note the existing id field is kept as is and it is an error if both id and ids are specified.
2020-10-07 00:49:21 +11:00
Yang Wang bbfa2f1303 Fix test failure due to missing client action 2020-10-07 00:45:30 +11:00
Yang Wang 7969fbb4ab
Cache API key doc to reduce traffic to the security index (#59376) (#63319)
Getting the API key document form the security index is the most time consuing part
of the API Key authentication flow (>60% if index is local and >90% if index is remote).
This traffic is now avoided by caching added with this PR.

Additionally, we add a cache invalidator registry so that clearing of different caches will
be managed in a single place (requires follow-up PRs).
2020-10-06 23:49:23 +11:00
David Kyle 8f4ef40f78
[ML] Auditor ensures template is installed before writes (#63286)
The ML auditors should not write if the latest template is not present. 
Instead a PUT template request is made and the writes queued up
2020-10-06 11:20:37 +01:00
Armin Braun cf75abb021
Optimize XContentParserUtils.ensureExpectedToken (#62691) (#63253)
We only ever use this with `XContentParser` no need to make it inline
worse by forcing the lambda and hence dynamic callsite here.
=> Extraced the exception formatting code path that is likely very cold
to a separate method and removed the lambda usage in hot loops by simplifying
the signature here.
2020-10-05 19:08:32 +02:00
Benjamin Trent 1e63313c19
[ML] adds feature_importance_baseline object to model metadata (#63172) (#63237)
this adds the new field `feature_importance_baseline` and allows it to be optionally be included in the model's metadata.

Related to: https://github.com/elastic/ml-cpp/pull/1522
2020-10-05 09:33:38 -04:00
Benjamin Trent 752ee0288e
[7.x] [ML] optimize delete expired snapshots (#63134) (#63200)
* [ML] optimize delete expired snapshots (#63134)

When deleting expired snapshots, we do an individual delete action per snapshot per job.

We should instead gather the expired snapshots and delete them in a single call.

This commit achieves this and a side-effect is there is less audit log spam on nightly cleanup

closes https://github.com/elastic/elasticsearch/issues/62875
2020-10-02 13:24:36 -04:00
Przemysław Witek 5370f270d7
[7.x] [ML] Ensure data frame analytics jobs don't run on a node that's too new (#62749) (#63175) 2020-10-02 17:19:58 +02:00
Joe Gallo d172a18c95 Tidy up some ILM and SLM packages (#63146)
Very minor refactoring, just moving some ILM and SLM classes around to decrease
the total number of packages.
2020-10-02 09:30:24 -04:00
Benjamin Trent 535f8a434b
Revert "[ML] adding `baseline` field to total_feature_importance objects (#63098) (#63125)" (#63144)
This reverts commit 95242eccee.
2020-10-02 07:03:15 -04:00
Ioannis Kakavas e91f66e22f
Ensure domain_name setting for AD realm is present (#61983) (#63159)
We would only check for a null value and not for an empty string so
that meant that we were not actually enforcing this mandatory
setting. This commits ensures we check for both and fail 
accordingly if necessary, on startup
2020-10-02 12:16:08 +03:00
Lee Hinman f0f0da2188
[7.x] Add telemetry for data tiers (#63031) (#63140)
Backports the following commits to 7.x:

    Add telemetry for data tiers (#63031)
2020-10-01 12:37:32 -06:00
Benjamin Trent 95242eccee
[ML] adding `baseline` field to total_feature_importance objects (#63098) (#63125)
This adds a new `baseline` field to the feature importance values. 

This field contains the baseline importance for a given feature and class.
2020-10-01 09:48:07 -04:00
Yang Wang e31bef4032
Fix API key role descriptors rewrite bug for upgraded clusters (#62917) (#63042)
This PR ensures that API key role descriptors are always rewritten to a target node
compatible format before a request is sent.
2020-09-30 22:16:39 +10:00
Benjamin Trent 0860746bf2
[ML] changing ngram loop order for minor performance improvement (#63033) (#63059)
This is a very minor optimization but trivial to implement, so might as well. 

```
Benchmark                               (nGramStrs)  Mode  Cnt        Score        Error  Units
NGramProcessorBenchmark.ngramInnerLoop        1,2,3  avgt   20  4415092.443 ±  31302.115  ns/op
NGramProcessorBenchmark.ngramOuterLoop        1,2,3  avgt   20  4235550.340 ± 103393.465  ns/op
```

This measurement is in nanoseconds, consequently, the overall performance of inference is dominated by other factors (i.e. map#put). But, this optimization adds up overtime and is simple.
2020-09-30 07:51:31 -04:00
Przemysław Witek 4366d58564
[7.x] [ML] Implement AucRoc metric for classification (#60502) (#63051) 2020-09-30 12:55:52 +02:00
Benjamin Trent 0b3af242d4
[ML] fixing classification feature importance parsing (#63003) (#63015)
Classification feature importance supports various types in the class name:
- string
- boolean
- numerical

The xcontent parsing on the server side and the HLRC side should support and test these types.
2020-09-29 10:54:35 -04:00
Yang Wang 068f605040
Use compilation as validation for painless role template (#62845) (#63010)
* Use compilation as validation for painless role template (#62845)

Role template validation now performs only compilation if the script is painless.
It no longer attempts to execute the script with empty input which is problematic.
The compliation process will catch things like invalid syntax, undefined variables,
which still provide certain level of protection against ill-defined role templates.
Behaviour for Mustache script is unchanged.

* Checkstyle
2020-09-30 00:37:41 +10:00
Dimitris Athanasiou facf9ede0a
[ML] Fix binary classification importance in LegacyFeatureImportanceTests (#63000)
Fixes #62991
2020-09-29 15:53:34 +03:00
Dimitris Athanasiou 7f6c1ff5b4
[7.x][ML] Remove top level importance from classification inference results (#62486) (#62964)
As we have decided top level importance for classification is not useful,
it has been removed from the results from the training job. This commit
also removes them from inference.

Backport of #62486
2020-09-29 10:58:48 +03:00
Hendrik Muhs b1a8437d0b
[7.x][Transform] Improve robustness when saving state (#62927)
refactor how state is persisted, call doSaveState only from the indexer thread, except there is none.

fixes #60781
fixes #52931
fixes #51629
fixes #52035
2020-09-28 10:12:51 +02:00
Andrei Stefan a43f29cfc9
EQL: data streams tests for PIT and EQL sequences (#62850) (#62889)
* PIT should run well with data streams

(cherry picked from commit 0a89a7db848b015b797c7678874b5c9e33bbd650)
2020-09-24 23:37:46 +03:00
Andrei Dan e323c5245b
[7.x] ILM: migrate action configures the _tier_preference setting (#62829) (#62860)
The `migrate` action will now configure the
`index.routing.allocation.include._tier_preference` setting to the corresponding
tiers. For the HOT phase it will configure `data_hot`, for the WARM phase it will
configure `data_warm,data_hot` and for the COLD phase
`data_cold,data_warm,data_cold`.

(cherry picked from commit 9dbf0e6f0c267e40c5bcfb568bb2254da103ae40)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-09-24 13:37:09 +01:00
Hendrik Muhs a70389015d [Transform] Return parsed count for get transform stats (#62809)
In case of more than 500 transforms, get and stats return paged results which can be requested using
page parameters. For >500 transforms count wasn't parsed out of the server response but taken from
size of the list of transforms.

The change also adds client/server hlrc tests and fixes a wrong type for count in get.

fixes #56245
2020-09-24 08:38:07 +02:00
Dimitris Athanasiou 7de5201291
[7.x][ML] Handle data frame analytics state spreading over multiple docs (#62564) (#62824)
When state persistence was first implemented for data frame analytics
we had the assumption that state would always fit in a single document.
However this is not the case any more.

This commit adds handling of state that spreads over multiple documents.

Backport of #62564
2020-09-23 16:16:34 +03:00
Nhat Nguyen 663b85b98f Make keep alive optional in PointInTimeBuilder (#62720)
Remove the keepAlive parameter from the constructor of PointInTimeBuilder
as it's optional.
2020-09-22 18:52:54 -04:00
Benjamin Trent 77bfb32635
[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) (#62784)
* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694)

* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls
 global parameters, outside of the global index, are ignored for internal callers in certain cases.
If the interal caller is adding requests via the following methods:
```
- BulkRequest#add(IndexRequest)
- BulkRequest#add(UpdateRequest)
- BulkRequest#add(DocWriteRequest)
- BulkRequest#add(DocWriteRequest[])
```
It is better to specifically set the desired parameters on the requests before they are added
to the bulk request object.

This commit addresses this issue for the ML plugin

* unmuting test
2020-09-22 15:07:08 -04:00
Andrei Dan 79d0c4ed18
ILM: allow check-migration step to continue if tier setting unset (#62636) (#62724)
This allows the `check-migration` step to move past the allocation check
if the tier routing settings are manually unset.

This helps a user unblock ILM in case a tier is removed (ie. if the warm tier
is decommissioned this will allow users to resume the ILM policies stuck in
`check-migration` waiting for the warm nodes to become available and the managed
index to allocate. this allows the index to allocate on the other available tiers)

(cherry picked from commit d7a1eaa7f51d0972d10c0df1d3cd77d6b755dd41)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-09-21 20:40:01 +01:00
Lee Hinman 4a08928c47
[7.x] Add index.routing.allocation.include._tier_preference setting (#62589) (#62667)
This commit adds the `index.routing.allocation.prefer._tier` setting to the
`DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a
preference-based list of tiers for an index to be assigned to. For example, if the setting were set
to:

```
"index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content"
```

If the cluster contains any nodes with the `data_hot` role, the decider will only allow them to be
allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and
`data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes.

This allows us to specify an index's preference for tier(s) without causing the index to be
unassigned if no nodes of a preferred tier are available.

Subsequent work will change the ILM migration to make additional use of this setting.

Relates to #60848
2020-09-18 15:41:36 -06:00