Commit Graph

1849 Commits

Author SHA1 Message Date
Andrei Dan 9f280621ba
[7.x] ILM add data stream support to searchable snapshot action (#57873) (#57916)
(cherry picked from commit 34856a90532c6c62a53817bb395399c8a8c17c0f)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-10 10:16:57 +01:00
Yang Wang 72a6441a88
Revert "Resolve anonymous roles and deduplicate roles during authentication (#53453) (#55995)" (#57858)
This reverts commit 84a2f1adf2.
2020-06-10 10:42:52 +10:00
Andrei Dan 3945712c72
[7.x] ILM add data stream support to the Shrink action (#57616) (#57884)
The shrink action creates a shrunken index with the target number of shards.
This makes the shrink action data stream aware. If the ILM managed index is
part of a data stream the shrink action will make sure to swap the original
managed index with the shrunken one as part of the data stream's backing
indices and then delete the original index.

(cherry picked from commit 99aeed6acf4ae7cbdd97a3bcfe54c5d37ab7a574)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-09 19:45:22 +01:00
Nik Everett 44a79d1739
Deprecte Rounding#round (#57845) (#57893)
This deprecates `Rounding#round` and `Rounding#nextRoundingValue` in
favor of calling
```
Rounding.Prepared prepared = rounding.prepare(min, max);
...
prepared.round(val)

```

because it is always going to be faster to prepare once. There
are going to be some cases where we won't know what to prepare *for*
and in those cases you can call `prepareForUnknown` and stil be faster
than calling the deprecated method over and over and over again.

Ultimately, this is important because it doesn't look like there is an
easy way to cache `Rounding.Prepared` or any of its precursors like
`LocalTimeOffset.Lookup`. Instead, we can just build it at most once per
request.

Relates to #56124
2020-06-09 14:30:56 -04:00
Dan Hermann b501b282f8
Change default backing index naming scheme 2020-06-09 09:31:34 -05:00
Przemysław Witek 7a1300a09e
[7.x] Make ModelPlotConfig.annotations_enabled default to ModelPlotConfig.enabled if unset (#57808) (#57815) 2020-06-08 17:41:12 +02:00
Mayya Sharipova 70e63a365a
Refactor how to determine if a field is metafield (#57378) (#57771)
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related #38373, #41656
Closes #24422
2020-06-08 09:16:18 -04:00
Andrei Dan 1b84e93d83
[7.x] DataStream creation validation allows for prefixed indices (#57750) (#57799)
We want to validate the DataStreams on creation to make sure the future backing
indices would not clash with existing indices in the system (so we can
always rollover the data stream).
This changes the validation logic to allow for a DataStream to be created
with a backing index that has a prefix (eg. `shrink-foo-000001`) even if the
former backing index (`foo-000001`) exists in the system.
The new validation logic will look for potential index conflicts with indices
in the system that have the counter in the name greater than the data stream's
generation.

This ensures that the `DataStream`'s future rollovers are safe because for a
`DataStream` `foo` of generation 4, we will look for standalone indices in the
form of `foo-%06d` with the counter greater than 4 (ie. validation will fail if
`foo-000006` exists in the system), but will also allow replacing a
backing index with an index named by prefixing the backing index it replaces.

(cherry picked from commit 695b242d69f0dc017e732b63737625adb01fe595)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-08 13:31:52 +01:00
David Kyle 08d1286de7
[7.x] Delete expired data by job (#57337) (#57796)
Deleting expired data can take a long time leading to timeouts if there
are many jobs. Often the problem is due to a few large jobs which 
prevent the regular maintenance of the remaining jobs. This change adds
a job_id parameter to the delete expired data endpoint to help clean up
those problematic jobs.
2020-06-08 13:00:23 +01:00
Luca Cavanna 7a06a13d99 Add description to submit and get async search, as well as cancel tasks (#57745)
This makes it easier to debug where such tasks come from in case they are returned from the get tasks API.

Also renamed the last occurrence of waitForCompletion to waitForCompletionTimeout in get async search request.
2020-06-08 11:17:29 +02:00
David Roberts 1d64d55a86
[7.x][ML] Add per-partition categorization option (#57723)
This PR adds the initial Java side changes to enable
use of the per-partition categorization functionality
added in elastic/ml-cpp#1293.

There will be a followup change to complete the work,
as there cannot be any end-to-end integration tests
until elastic/ml-cpp#1293 is merged, and also
elastic/ml-cpp#1293 does not implement some of the
more peripheral functionality, like stop_on_warn and
per-partition stats documents.

The changes so far cover REST APIs, results object
formats, HLRC and docs.

Backport of #57683
2020-06-06 08:15:17 +01:00
Benjamin Trent 9666a895f7
[ML] inference performance optimizations and refactor (#57674) (#57753)
This is a major refactor of the underlying inference logic.

The main refactor is now we are separating the model configuration and
the inference interfaces.

This has the following benefits:
 - we can store extra things with the model that are not
   necessary for inference (i.e. treenode split information gain)
 - we can optimize inference separate from model serialization and storage.
 - The user is oblivious to the optimizations (other than seeing the benefits).

A major part of this commit is removing all inference related methods from the
trained model configurations (ensemble, tree, etc.) and moving them to a new class.

This new class satisfies a new interface that is ONLY for inference.

The optimizations applied currently are:
- feature maps are flattened once
- feature extraction only happens once at the highest level
  (improves inference + feature importance through put)
- Only storing what we need for inference + feature importance on heap
2020-06-05 14:20:58 -04:00
Dimitris Athanasiou f49a14ce6f
[7.x][ML] Fix race condition when force stopping DF analytics job (#57680) (#57717)
When we force delete a DF analytics job, we currently first force
stop it and then we proceed with deleting the job config.
This may result in logging errors if the job config is deleted
before it is retrieved while the job is starting.

Instead of force stopping the job, it would make more sense to
try to stop the job gracefully first. So we now try that out first.
If normal stop fails, then we resort to force stopping the job to
ensure we can go through with the delete.

In addition, this commit introduces `timeout` for the delete action
and makes use of it in the child requests.

Backport of #57680
2020-06-05 17:50:01 +03:00
Hendrik Muhs e91b975878 [Transform] mark old data frame transform roles deprecated (#57655)
mark old data frame transform roles deprecated

fixes #50087
2020-06-05 09:20:35 +02:00
Hendrik Muhs c1c8817eae
[7.x][Transform] improve update API (#57685)
rewrite config on update if either version is outdated, credentials change,
the update changes the config or deprecated settings are found. Deprecated
settings get migrated to the new format. The upgrade can be easily extended to
do any necessary re-writes.

fixes #56499
backport #57648
2020-06-05 08:48:47 +02:00
William Brafford dfb6def3da Revert "Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383)"
This reverts commit 7a67fb2d04.
2020-06-04 16:25:05 -04:00
Ioannis Kakavas af9f9d7f03
[7.x] Add http proxy support for OIDC realm (#57039) (#57584)
This change introduces support for using an http proxy for egress
communication of the OpenID Connect realm.
2020-06-04 20:51:00 +03:00
William Brafford 7a67fb2d04
Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383)
In #55592 and #55416, we deprecated the settings for enabling and disabling
basic license features and turned those settings into no-ops. Since doing so,
we've had feedback that this change may not give users enough time to cleanly
switch from non-ILM index management tools to ILM. If two index managers
operate simultaneously, results could be strange and difficult to
reconstruct. We don't know of any cases where SLM will cause a problem, but we
are restoring that setting as well, to be on the safe side.

This PR is not a strict commit reversion. First, we are keeping the new
xpack.watcher.use_ilm_index_management setting, introduced when
xpack.ilm.enabled was made a no-op, so that users can begin migrating to using
it. Second, the SLM setting was modified in the same commit as a group of other
settings, so I have taken just the changes relating to SLM.
2020-06-04 13:38:22 -04:00
Mark Vieira 9b0f5a1589
Include vendored code notices in distribution notice files (#57017) (#57569)
(cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)
2020-06-04 10:34:24 -07:00
Przemysław Witek 6b5f49d097
[7.x] Introduce ModelPlotConfig. annotations_enabled setting (#57539) (#57641) 2020-06-04 15:15:35 +02:00
Andrei Dan bd188f4a21
[7.x] ILM: add support for rolling over data streams (#57295) (#57515)
As the datastream information is stored in the `ClusterState.Metadata` we exposed
the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for
the steps to be able to identify when a managed index is part of a DataStream.

If a managed index is part of a DataStream the rollover target is the DataStream
name and the highest generation index is the write index (ie. the rolled index).

(cherry picked from commit 6b410dfb78f3676fce1b7401f1628c1ca6fbd45a)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-02 11:55:23 +01:00
Przemysław Witek ea6cfb7c3d
[7.x] Make Annotation a result type (#56342) (#57508) 2020-06-02 11:56:41 +02:00
Przemysław Witek ceb4b29b98
Introduce Annotation.event field (#57144) (#57453) 2020-06-01 20:42:25 +02:00
Przemysław Witek 72ad9a4548
[7.x] Make AnnotationPersister use bulk requests instead of indexing individual documents (#57278) (#57354) 2020-06-01 12:05:09 +02:00
Benjamin Trent 34f1e0b6bb
[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143) (#57374)
* [ML] mark forecasts for force closed/failed jobs as failed (#57143)

forecasts that are still running should be marked as failed/finished in the following scenarios:

- Job is force closed
- Job is re-assigned to another node.

Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted.

Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index.

relates to https://github.com/elastic/elasticsearch/issues/56419
2020-05-29 14:48:10 -04:00
Benjamin Trent 35d5126cea
[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351) (#57368)
* [ML] adds new for_export flag to GET _ml/inference API (#57351)

Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API.

This flag is useful for moving models between clusters.
2020-05-29 14:01:08 -04:00
Benjamin Trent c8374dc9f3
[ML] add max_model_memory parameter to forecast request (#57254) (#57355)
This adds a max_model_memory setting to forecast requests. 
This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb").

The default value is `20mb`.

There is a HARD limit at `500mb` which will throw an error if used.

If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited.

related native change: https://github.com/elastic/ml-cpp/pull/1238

closes: https://github.com/elastic/elasticsearch/issues/56420
2020-05-29 11:16:08 -04:00
Tim Vernum 408250dcc4
Fix smtp.ssl.trust setting for watcher email (#57268)
The ssl.trust setting for Watcher provides a list of hostnames that
should be automatically trusted for SSL hostname verification. It was
accidentally broken when we added the full ssl.* settings for email
notifications (see #45272)

This commit corrects this, so the setting is once again respected,
as long as none of the other ssl settings are configured for email
notifications.

Resolves: #52153
Backport of: #56090
2020-05-28 17:34:13 +10:00
David Kyle 571477d0ad
[7.x] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) (#57136)
Fix delete_expired_data/nightly maintenance when 
many model snapshots need deleting (#57041)

The queries performed by the expired data removers pull back entire 
documents when only a few fields are required. For ModelSnapshots in 
particular this is a problem as they contain quantiles which may be 
100s of KB and the search size is set to 10,000.

This change makes the search more efficient by only requesting the 
fields needed to work out which expired data should be deleted.
2020-05-26 10:56:42 +01:00
Benjamin Trent ee4ce8ecec
Fix geotile_grid group_by field mapping (#56939) (#56990)
The original implementation utilized `bbox` as the index mapping type. This would not work as it would have to be `envelope`. But, given that `envelope` and `polygon` are tessellated in the same way, we choose to use `polygon` as the geo_shape type. This is for easier support other places in the stack (a la kibana maps)
2020-05-20 08:22:13 -04:00
Benjamin Trent 297f864884
[ML] relax throttling on expired data cleanup (#56711) (#56895)
Throttling nightly cleanup as much as we do has been over cautious.

Night cleanup should be more lenient in its throttling. We still
keep the same batch size, but now the requests per second scale
with the number of data nodes. If we have more than 5 data nodes,
we don't throttle at all.

Additionally, the API now has `requests_per_second` and `timeout` set.
So users calling the API directly can set the throttling.

This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`.
This will allow users to adjust throttling of the nightly maintenance.
2020-05-18 08:46:42 -04:00
Jake Landis 813609b47c
Ensure that .watcher-history-11* template is in installed prior to use (#56734)
WatcherIndexTemplateRegistry as of https://github.com/elastic/elasticsearch/pull/52962 
requires all nodes to be on 7.7.0 before it allows the version 11 index template to be 
installed.

While in a mixed cluster, nothing prevents Watcher from running on the new
host before the all of the nodes are on 7.7.0. This will result in the
.watcher-history-11* index without the proper mappings. Without the proper
mapping a single document (for a large watch) can exceed the default 1000 field
limit and cause error to show in the logs.

This commit ensures the same logic for writing to the index is applied as for
installing the template. In a mixed cluster, the `10` index template will continue
to be written. Only once all of nodes are on 7.7.0+ will the `11` index template
be installed and used.

closes #56732
2020-05-15 16:29:04 -05:00
Ioannis Kakavas 239ada1669
Test adjustments for FIPS 140 (#56526)
This change aims to fix our setup in CI so that we can run 7.x in
FIPS 140 mode. The major issue that we have in 7.x and did not
have in master is that we can't use the diagnostic trust manager
in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it
explicitly disallows the wrapping of X509TrustManager.

Previous attempts like #56427 and #52211 focused on disabling the
setting in all of our tests when creating a Settings object or
on setting fips_mode.enabled accordingly (which implicitly disables
the diagnostic trust manager). The attempts weren't future proof
though as nothing would forbid someone to add new tests without
setting the necessary setting and forcing this would be very
inconvenient for any other case ( see
#56427 (comment) for the full argumentation).

This change introduces a runtime check in SSLService that overrides
the configuration value of xpack.security.ssl.diagnose.trust and
disables the diagnostic trust manager when we are running in Java 8
and the SunJSSE provider is set in FIPS mode.
2020-05-15 18:10:45 +03:00
Ryan Ernst 9fb80d3827
Move publishing configuration to a separate plugin (#56727)
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
2020-05-14 20:23:07 -07:00
Tal Levy 5e90ff32f7
Add Normalize Pipeline Aggregation (#56399) (#56792)
This aggregation will perform normalizations of metrics
for a given series of data in the form of bucket values.

The aggregations supports the following normalizations

- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization

To specify which normalization is to be used, it can be specified
in the normalize agg's `normalizer` field.

For example:

```
{
  "normalize": {
    "buckets_path": <>,
    "normalizer": "percent"
  }
}
```
2020-05-14 17:40:15 -07:00
Mark Vieira 0fd756d511
Enforce strict license distribution requirements (#56642) 2020-05-14 13:57:56 -07:00
Przemysław Witek 98fbd85290
[7.x] Add scope-related fields to Annotation (#56417) (#56681) 2020-05-14 10:23:13 +02:00
Nik Everett b98b260048
Merge significant_terms into the terms package (backport of #56699) (#56715)
This merges the code for the `significant_terms` agg into the package
for the code for the `terms` agg. They are *super* entangled already,
this mostly just admits that to ourselves.

Precondition for the terms work in #56487
2020-05-13 17:36:21 -04:00
Ignacio Vera b4521d5183
upgrade to Lucene 8.6.0 snapshot (#56661) 2020-05-13 14:25:16 +02:00
Luca Cavanna 30e9a1b8c7 Improve error handling when decoding async execution ids (#56285)
When decoding async execution ids, exceptions thrown from the decode method itself were not caught, leading to cryptic errors like "Input byte array has incorrect ending byte at 68" being returned. With this commit we return "invalid id: [abcdef]".

Added tests coverage for a couple of these scenarios and also added tests for equals/hashcode methods.
2020-05-13 12:26:17 +02:00
Ioannis Kakavas cc119c3853
Expose idp.metadata.http.refresh for SAML realm (#56354) (#56593)
This setting was not returned in the SamlRealmSettings#getSettings
so it was not possible for users to set this in the realm config
in our configuration.
2020-05-13 11:51:18 +03:00
Hendrik Muhs a9425a0240
[7.x][Transform] fix count when matching exact ids(#56544) (#56582)
fix count in get and get stats if explicit ids are given and ids might be
duplicated when configuration are stored in different index (versions).

fixes #56196
2020-05-12 14:23:13 +02:00
Ignacio Vera 222ee721ec
Add moving percentiles pipeline aggregation (#55441) (#56575)
Similar to what the moving function aggregation does, except merging windows of percentiles
sketches together instead of cumulatively merging final metrics
2020-05-12 11:35:23 +02:00
Ryan Ernst 902fc546bd
Migrate remaining ESIntegTestCases to internalClusterTest (#56479) (#56563)
This commit migrates the ESIntegTestCase tests in x-pack to the
internalClusterTest source set.
2020-05-11 21:06:04 -07:00
Tim Brooks 760ab726c2
Share netty event loops between transports (#56553)
Currently Elasticsearch creates independent event loop groups for each
transport (http and internal) transport type. This is unnecessary and
can lead to contention when different threads access shared resources
(ex: allocators). This commit moves to a model where, by default, the
event loops are shared between the transports. The previous behavior can
be attained by specifically setting the http worker count.
2020-05-11 15:43:43 -06:00
Benjamin Trent 1d6b2f074e
[Transform] adds geotile_grid support in group_by (#56514) (#56549)
This adds support for grouping by geo points. This uses the agg [geotile_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html).

I am opting to store the tile results of group_by as a `geo_shape` so that users can query the results. Additionally, the shapes could be visualized and filtered in the kibana maps app.

relates to https://github.com/elastic/elasticsearch/issues/56121
2020-05-11 17:02:40 -04:00
Jim Ferenczi 02ab9112a9 Fix spurious failures in AsyncSearchIntegTestCase (#56026)
Async search integration tests are subject to random failures when:
  * The test index has more than one replica.
  * The request cache is used.
  * Some shards are empty.
  * The maintenance service starts a garbage collection when node is closing.

They are also slow because the test index is created/populated on each
test method.

This change refactors these integration tests in order to:
  * Create the index once for the entire test suite.
  * Fix the usage of the request cache and replicas.
  * Ensures that all shards have at least one document.
  * Increase the delay of the maintenance service garbage collection.

Closes #55895
Closes #55988
2020-05-11 15:03:03 +02:00
Dimitris Athanasiou 60b1c67409
[7.x][ML] Allow stopping DF analytics whose config is missing (#56360) (#56408)
It is possible that the config document for a data frame
analytics job is deleted from the config index. If that is
the case the user is unable to stop a running job because
we attempt to retrieve the config and that will throw.

This commit changes that. When the request is forced,
we do not expand the requested ids based on the existing
configs but from the list of running tasks instead.

Backport of #56360
2020-05-08 13:54:44 +03:00
Dimitris Athanasiou d064eda2b0
[7.x][ML] Ensure phase progress may only increase (#56339) (#56357)
Due to multi-threading it is possible that phase progress
updates written from the c++ process arrive reordered.
We can address this by ensuring that progress may only increase.

Closes #56282

Backport of #56339
2020-05-07 19:46:58 +03:00
Przemysław Witek 0cd0ab276e
Introduce Annotation.Builder class and use it to create instances of Annotation class (#56276) (#56286) 2020-05-06 20:47:03 +02:00