OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-18 19:05:06 +00:00

Author	SHA1	Message	Date
Dimitris Athanasiou	f49a14ce6f	[7.x][ML] Fix race condition when force stopping DF analytics job (#57680 ) (#57717 ) When we force delete a DF analytics job, we currently first force stop it and then we proceed with deleting the job config. This may result in logging errors if the job config is deleted before it is retrieved while the job is starting. Instead of force stopping the job, it would make more sense to try to stop the job gracefully first. So we now try that out first. If normal stop fails, then we resort to force stopping the job to ensure we can go through with the delete. In addition, this commit introduces `timeout` for the delete action and makes use of it in the child requests. Backport of #57680	2020-06-05 17:50:01 +03:00
Hendrik Muhs	e91b975878	[Transform] mark old data frame transform roles deprecated (#57655 ) mark old data frame transform roles deprecated fixes #50087	2020-06-05 09:20:35 +02:00
Hendrik Muhs	c1c8817eae	[7.x][Transform] improve update API (#57685 ) rewrite config on update if either version is outdated, credentials change, the update changes the config or deprecated settings are found. Deprecated settings get migrated to the new format. The upgrade can be easily extended to do any necessary re-writes. fixes #56499 backport #57648	2020-06-05 08:48:47 +02:00
William Brafford	dfb6def3da	Revert "Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 )" This reverts commit 7a67fb2d04d46a10856271d634248dcf4050addb.	2020-06-04 16:25:05 -04:00
Ioannis Kakavas	af9f9d7f03	[7.x] Add http proxy support for OIDC realm (#57039 ) (#57584 ) This change introduces support for using an http proxy for egress communication of the OpenID Connect realm.	2020-06-04 20:51:00 +03:00
William Brafford	7a67fb2d04	Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 ) In #55592 and #55416, we deprecated the settings for enabling and disabling basic license features and turned those settings into no-ops. Since doing so, we've had feedback that this change may not give users enough time to cleanly switch from non-ILM index management tools to ILM. If two index managers operate simultaneously, results could be strange and difficult to reconstruct. We don't know of any cases where SLM will cause a problem, but we are restoring that setting as well, to be on the safe side. This PR is not a strict commit reversion. First, we are keeping the new xpack.watcher.use_ilm_index_management setting, introduced when xpack.ilm.enabled was made a no-op, so that users can begin migrating to using it. Second, the SLM setting was modified in the same commit as a group of other settings, so I have taken just the changes relating to SLM.	2020-06-04 13:38:22 -04:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
Przemysław Witek	6b5f49d097	[7.x] Introduce ModelPlotConfig. annotations_enabled setting (#57539 ) (#57641 )	2020-06-04 15:15:35 +02:00
Andrei Dan	bd188f4a21	[7.x] ILM: add support for rolling over data streams (#57295 ) (#57515 ) As the datastream information is stored in the `ClusterState.Metadata` we exposed the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for the steps to be able to identify when a managed index is part of a DataStream. If a managed index is part of a DataStream the rollover target is the DataStream name and the highest generation index is the write index (ie. the rolled index). (cherry picked from commit 6b410dfb78f3676fce1b7401f1628c1ca6fbd45a) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-02 11:55:23 +01:00
Przemysław Witek	ea6cfb7c3d	[7.x] Make Annotation a result type (#56342 ) (#57508 )	2020-06-02 11:56:41 +02:00
Przemysław Witek	ceb4b29b98	Introduce Annotation.event field (#57144 ) (#57453 )	2020-06-01 20:42:25 +02:00
Przemysław Witek	72ad9a4548	[7.x] Make AnnotationPersister use bulk requests instead of indexing individual documents (#57278 ) (#57354 )	2020-06-01 12:05:09 +02:00
Benjamin Trent	34f1e0b6bb	[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143 ) (#57374 ) * [ML] mark forecasts for force closed/failed jobs as failed (#57143) forecasts that are still running should be marked as failed/finished in the following scenarios: - Job is force closed - Job is re-assigned to another node. Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted. Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index. relates to https://github.com/elastic/elasticsearch/issues/56419	2020-05-29 14:48:10 -04:00
Benjamin Trent	35d5126cea	[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351 ) (#57368 ) * [ML] adds new for_export flag to GET _ml/inference API (#57351) Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API. This flag is useful for moving models between clusters.	2020-05-29 14:01:08 -04:00
Benjamin Trent	c8374dc9f3	[ML] add max_model_memory parameter to forecast request (#57254 ) (#57355 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 11:16:08 -04:00
Tim Vernum	408250dcc4	Fix smtp.ssl.trust setting for watcher email (#57268 ) The ssl.trust setting for Watcher provides a list of hostnames that should be automatically trusted for SSL hostname verification. It was accidentally broken when we added the full ssl.* settings for email notifications (see #45272) This commit corrects this, so the setting is once again respected, as long as none of the other ssl settings are configured for email notifications. Resolves: #52153 Backport of: #56090	2020-05-28 17:34:13 +10:00
David Kyle	571477d0ad	[7.x] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041 ) (#57136 ) Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) The queries performed by the expired data removers pull back entire documents when only a few fields are required. For ModelSnapshots in particular this is a problem as they contain quantiles which may be 100s of KB and the search size is set to 10,000. This change makes the search more efficient by only requesting the fields needed to work out which expired data should be deleted.	2020-05-26 10:56:42 +01:00
Benjamin Trent	ee4ce8ecec	Fix geotile_grid group_by field mapping (#56939 ) (#56990 ) The original implementation utilized `bbox` as the index mapping type. This would not work as it would have to be `envelope`. But, given that `envelope` and `polygon` are tessellated in the same way, we choose to use `polygon` as the geo_shape type. This is for easier support other places in the stack (a la kibana maps)	2020-05-20 08:22:13 -04:00
Benjamin Trent	297f864884	[ML] relax throttling on expired data cleanup (#56711 ) (#56895 ) Throttling nightly cleanup as much as we do has been over cautious. Night cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 08:46:42 -04:00
Jake Landis	813609b47c	Ensure that .watcher-history-11* template is in installed prior to use (#56734 ) WatcherIndexTemplateRegistry as of https://github.com/elastic/elasticsearch/pull/52962 requires all nodes to be on 7.7.0 before it allows the version 11 index template to be installed. While in a mixed cluster, nothing prevents Watcher from running on the new host before the all of the nodes are on 7.7.0. This will result in the .watcher-history-11* index without the proper mappings. Without the proper mapping a single document (for a large watch) can exceed the default 1000 field limit and cause error to show in the logs. This commit ensures the same logic for writing to the index is applied as for installing the template. In a mixed cluster, the `10` index template will continue to be written. Only once all of nodes are on 7.7.0+ will the `11` index template be installed and used. closes #56732	2020-05-15 16:29:04 -05:00
Ioannis Kakavas	239ada1669	Test adjustments for FIPS 140 (#56526 ) This change aims to fix our setup in CI so that we can run 7.x in FIPS 140 mode. The major issue that we have in 7.x and did not have in master is that we can't use the diagnostic trust manager in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it explicitly disallows the wrapping of X509TrustManager. Previous attempts like #56427 and #52211 focused on disabling the setting in all of our tests when creating a Settings object or on setting fips_mode.enabled accordingly (which implicitly disables the diagnostic trust manager). The attempts weren't future proof though as nothing would forbid someone to add new tests without setting the necessary setting and forcing this would be very inconvenient for any other case ( see #56427 (comment) for the full argumentation). This change introduces a runtime check in SSLService that overrides the configuration value of xpack.security.ssl.diagnose.trust and disables the diagnostic trust manager when we are running in Java 8 and the SunJSSE provider is set in FIPS mode.	2020-05-15 18:10:45 +03:00
Ryan Ernst	9fb80d3827	Move publishing configuration to a separate plugin (#56727 ) This is another part of the breakup of the massive BuildPlugin. This PR moves the code for configuring publications to a separate plugin. Most of the time these publications are jar files, but this also supports the zip publication we have for integ tests.	2020-05-14 20:23:07 -07:00
Tal Levy	5e90ff32f7	Add Normalize Pipeline Aggregation (#56399 ) (#56792 ) This aggregation will perform normalizations of metrics for a given series of data in the form of bucket values. The aggregations supports the following normalizations - rescale 0-1 - rescale 0-100 - percentage of sum - mean normalization - z-score normalization - softmax normalization To specify which normalization is to be used, it can be specified in the normalize agg's `normalizer` field. For example: ``` { "normalize": { "buckets_path": <>, "normalizer": "percent" } } ```	2020-05-14 17:40:15 -07:00
Mark Vieira	0fd756d511	Enforce strict license distribution requirements (#56642 )	2020-05-14 13:57:56 -07:00
Przemysław Witek	98fbd85290	[7.x] Add scope-related fields to Annotation (#56417 ) (#56681 )	2020-05-14 10:23:13 +02:00
Nik Everett	b98b260048	Merge significant_terms into the terms package (backport of #56699 ) (#56715 ) This merges the code for the `significant_terms` agg into the package for the code for the `terms` agg. They are super entangled already, this mostly just admits that to ourselves. Precondition for the terms work in #56487	2020-05-13 17:36:21 -04:00
Ignacio Vera	b4521d5183	upgrade to Lucene 8.6.0 snapshot (#56661 )	2020-05-13 14:25:16 +02:00
Luca Cavanna	30e9a1b8c7	Improve error handling when decoding async execution ids (#56285 ) When decoding async execution ids, exceptions thrown from the decode method itself were not caught, leading to cryptic errors like "Input byte array has incorrect ending byte at 68" being returned. With this commit we return "invalid id: [abcdef]". Added tests coverage for a couple of these scenarios and also added tests for equals/hashcode methods.	2020-05-13 12:26:17 +02:00
Ioannis Kakavas	cc119c3853	Expose idp.metadata.http.refresh for SAML realm (#56354 ) (#56593 ) This setting was not returned in the SamlRealmSettings#getSettings so it was not possible for users to set this in the realm config in our configuration.	2020-05-13 11:51:18 +03:00
Hendrik Muhs	a9425a0240	[7.x][Transform] fix count when matching exact ids(#56544 ) (#56582 ) fix count in get and get stats if explicit ids are given and ids might be duplicated when configuration are stored in different index (versions). fixes #56196	2020-05-12 14:23:13 +02:00
Ignacio Vera	222ee721ec	Add moving percentiles pipeline aggregation (#55441 ) (#56575 ) Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics	2020-05-12 11:35:23 +02:00
Ryan Ernst	902fc546bd	Migrate remaining ESIntegTestCases to internalClusterTest (#56479 ) (#56563 ) This commit migrates the ESIntegTestCase tests in x-pack to the internalClusterTest source set.	2020-05-11 21:06:04 -07:00
Tim Brooks	760ab726c2	Share netty event loops between transports (#56553 ) Currently Elasticsearch creates independent event loop groups for each transport (http and internal) transport type. This is unnecessary and can lead to contention when different threads access shared resources (ex: allocators). This commit moves to a model where, by default, the event loops are shared between the transports. The previous behavior can be attained by specifically setting the http worker count.	2020-05-11 15:43:43 -06:00
Benjamin Trent	1d6b2f074e	[Transform] adds geotile_grid support in group_by (#56514 ) (#56549 ) This adds support for grouping by geo points. This uses the agg [geotile_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html). I am opting to store the tile results of group_by as a `geo_shape` so that users can query the results. Additionally, the shapes could be visualized and filtered in the kibana maps app. relates to https://github.com/elastic/elasticsearch/issues/56121	2020-05-11 17:02:40 -04:00
Jim Ferenczi	02ab9112a9	Fix spurious failures in AsyncSearchIntegTestCase (#56026 ) Async search integration tests are subject to random failures when: * The test index has more than one replica. * The request cache is used. * Some shards are empty. * The maintenance service starts a garbage collection when node is closing. They are also slow because the test index is created/populated on each test method. This change refactors these integration tests in order to: * Create the index once for the entire test suite. * Fix the usage of the request cache and replicas. * Ensures that all shards have at least one document. * Increase the delay of the maintenance service garbage collection. Closes #55895 Closes #55988	2020-05-11 15:03:03 +02:00
Dimitris Athanasiou	60b1c67409	[7.x][ML] Allow stopping DF analytics whose config is missing (#56360 ) (#56408 ) It is possible that the config document for a data frame analytics job is deleted from the config index. If that is the case the user is unable to stop a running job because we attempt to retrieve the config and that will throw. This commit changes that. When the request is forced, we do not expand the requested ids based on the existing configs but from the list of running tasks instead. Backport of #56360	2020-05-08 13:54:44 +03:00
Dimitris Athanasiou	d064eda2b0	[7.x][ML] Ensure phase progress may only increase (#56339 ) (#56357 ) Due to multi-threading it is possible that phase progress updates written from the c++ process arrive reordered. We can address this by ensuring that progress may only increase. Closes #56282 Backport of #56339	2020-05-07 19:46:58 +03:00
Przemysław Witek	0cd0ab276e	Introduce Annotation.Builder class and use it to create instances of Annotation class (#56276 ) (#56286 )	2020-05-06 20:47:03 +02:00
Tal Levy	e4f2c3105d	Add geo_shape support for geotile_grid and geohash_grid (#55966 ) (#56228 ) this commit adds aggregation support for the geo_shape field type on geo*_grid aggregations. it introduces a Tiler for both tiles and hashes that enables a new type of ValuesSource to replace the GeoPoint's CellIdSource. This makes it possible for the existing Aggregator to be re-used, so no new implementations of the grid aggregators are added.	2020-05-05 09:54:14 -07:00
William Brafford	3499fa917c	Deprecated xpack "enable" settings should be no-ops (#55416 ) (#56167 ) The following settings are now no-ops: * xpack.flattened.enabled * xpack.logstash.enabled * xpack.rollup.enabled * xpack.slm.enabled * xpack.sql.enabled * xpack.transform.enabled * xpack.vectors.enabled Since these settings no longer need to be checked, we can remove settings parameters from a number of constructors and methods, and do so in this commit. We also update documentation to remove references to these settings.	2020-05-05 10:40:49 -04:00
David Roberts	7aa0daaabd	[7.x][ML] More advanced model snapshot retention options (#56194 ) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Backport of #56125	2020-05-05 14:31:58 +01:00
Dimitris Athanasiou	2d7899c83c	[7.x][ML] Adjust DF Analytics process phases (#56107 ) (#56177 ) As of elastic/ml-cpp#1179, the analytics process reports phases depending on the analysis type. This commit adjusts the phases of current analyses from `analyzing` to the following: - outlier_detection: [`computing_outlier`] - regression/classification: [`feature_selection`, `coarse_parameter_search`, `fine_tuning_parameters`, `final_training`] Backport of #56107	2020-05-05 15:00:07 +03:00
Dimitris Athanasiou	75dadb7a6d	[7.x][ML] Add loss_function to regression (#56118 ) (#56187 ) Adds parameters `loss_function` and `loss_function_parameter` to regression. Backport of #56118	2020-05-05 14:59:51 +03:00
Hendrik Muhs	e177a38504	[7.x][Transform] add throttling (#56007 ) (#56184 ) add throttling to transform, throttling will slow down search requests by delaying the execution based on a documents per second metric. fixes #54862	2020-05-05 13:09:02 +02:00
Martijn van Groningen	2ac32db607	Move includeDataStream flag from IndicesOptions to IndexNameExpressionResolver.Context (#56151 ) Backport of #56034. Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context as a dedicated field that callers to IndexNameExpressionResolver can set. Also alter indices stats api to support data streams. The rollover api uses this api and otherwise rolling over data stream does no longer work. Relates to #53100	2020-05-04 22:38:33 +02:00
Martijn van Groningen	6d03081560	Add auto create action (#56122 ) Backport of #55858 to 7.x branch. Currently the TransportBulkAction detects whether an index is missing and then decides whether it should be auto created. The coordination of the index creation also happens in the TransportBulkAction on the coordinating node. This change adds a new transport action that the TransportBulkAction delegates to if missing indices need to be created. The reasons for this change: * Auto creation of data streams can't occur on the coordinating node. Based on the index template (v2) either a regular index or a data stream should be created. However if the coordinating node is slow in processing cluster state updates then it may be unaware of the existence of certain index templates, which then can load to the TransportBulkAction creating an index instead of a data stream. Therefor the coordination of creating an index or data stream should occur on the master node. See #55377 * From a security perspective it is useful to know whether index creation originates from the create index api or from auto creating a new index via the bulk or index api. For example a user would be allowed to auto create an index, but not to use the create index api. The auto create action will allow security to distinguish these two different patterns of index creation. This change adds the following new transport actions: AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via #55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created. The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege. Relates to #53100	2020-05-04 19:10:09 +02:00
Dimitris Athanasiou	76fa5a2397	[7.x][ML] Improve cleanup for DF Analytics HLRC tests (#56101 ) (#56109 ) Adds the step of stopping all data frame analytics before deleting them to the cleanup of the corresponding HLRC tests. Closes #56097 Backport of #56101	2020-05-04 16:08:08 +03:00
Przemysław Witek	44f5a8ccd3	Use snapshot's latest result time rather than snapshot's creation time when creating an annotation (#56093 ) (#56103 )	2020-05-04 12:36:12 +02:00
Armin Braun	0860d1dc74	Remove Dead Code in SLM Delete Handling (#56081 ) (#56098 ) The delete response is always acknowledged. No need to handle anything else.	2020-05-04 12:22:06 +02:00
Armin Braun	3a64ecb6bf	Allow Deleting Multiple Snapshots at Once (#55474 ) (#56083 ) * Allow Deleting Multiple Snapshots at Once (#55474) Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise. This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.	2020-05-03 20:30:58 +02:00

1 2 3 4 5 ...

1887 Commits