OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-09 14:34:43 +00:00

Author	SHA1	Message	Date
Andrei Dan	1b84e93d83	[7.x] DataStream creation validation allows for prefixed indices (#57750 ) (#57799 ) We want to validate the DataStreams on creation to make sure the future backing indices would not clash with existing indices in the system (so we can always rollover the data stream). This changes the validation logic to allow for a DataStream to be created with a backing index that has a prefix (eg. `shrink-foo-000001`) even if the former backing index (`foo-000001`) exists in the system. The new validation logic will look for potential index conflicts with indices in the system that have the counter in the name greater than the data stream's generation. This ensures that the `DataStream`'s future rollovers are safe because for a `DataStream` `foo` of generation 4, we will look for standalone indices in the form of `foo-%06d` with the counter greater than 4 (ie. validation will fail if `foo-000006` exists in the system), but will also allow replacing a backing index with an index named by prefixing the backing index it replaces. (cherry picked from commit 695b242d69f0dc017e732b63737625adb01fe595) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-08 13:31:52 +01:00
David Kyle	08d1286de7	[7.x] Delete expired data by job (#57337 ) (#57796 ) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-08 13:00:23 +01:00
Armin Braun	004eb8bd7e	Fix Bug With RepositoryData Caching (#57785) (#57800) This fixes a really subtle bug with caching `RepositoryData` that can corrupt a repository. We were caching `RepositoryData` serialized in the newest metadata format. This led to a confusing situation where numeric shard generations would be cached in `ShardGenerations` that were not written to the repository because the repository or cluster did not yet support `ShardGenerations`. In the case where shard generations are not actually supported yet, these cached numeric generations are not safe and there are multiple scenarios where they would be incorrect, leading to the repository trying to read shard level metadata from index-N that don't exist. This commit makes it so that cached metadata is always in the same format as the metadata in the repository. Relates #57798	2020-06-08 13:16:45 +02:00
Luca Cavanna	7a06a13d99	Add description to submit and get async search, as well as cancel tasks (#57745 ) This makes it easier to debug where such tasks come from in case they are returned from the get tasks API. Also renamed the last occurrence of waitForCompletion to waitForCompletionTimeout in get async search request.	2020-06-08 11:17:29 +02:00
Luca Cavanna	06ef3042c1	Specify reason whenever async search gets cancelled (#57761) This makes it possible to trace where the cancel tasks request came from, given that it may be triggered for multiple reasons.	2020-06-08 10:25:31 +02:00
Armin Braun	619e4f8c02	Make BackgroundIndexer more Efficient (#57781) (#57789) Improve the efficiency of the background indexer by allowing an assertion on failures to be added as they are produced, which prevents queuing them up. Also, add a non-blocking stop to the background indexer so that when stopping multiple indexers we don't needlessly continue indexing on some indexers while stopping another one. Closes #57766	2020-06-08 10:18:47 +02:00
James Rodewig	6c93fed204	[DOCS] Document `doc_as_upsert` does not support ingest pipelines (#57649 ) (#57783 ) Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: Asfaloth <asfalots@users.noreply.github.com>	2020-06-06 16:10:42 -04:00
David Roberts	1d64d55a86	[7.x][ML] Add per-partition categorization option (#57723 ) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs. Backport of #57683	2020-06-06 08:15:17 +01:00
debadair	100d2bd063	[DOCS] Editorial ILM cleanup (#57565 ) (#57776 ) * [DOCS] Editorial cleanup * Moved example of applying a template to multiple indices. * Combine existing indices topics * Fixed test * Add skip rollover file. * Revert rename. * Update include. * Revert rename * Apply suggestions from code review Co-authored-by: Adam Locke <adam.locke@elastic.co> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> * Apply suggestions from code review * Fixed callout * Update docs/reference/ilm/ilm-with-existing-indices.asciidoc Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> * Update docs/reference/ilm/ilm-with-existing-indices.asciidoc Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> * Apply suggestions from code review * Restored policy to template example. * Fixed JSON parse error Co-authored-by: Adam Locke <adam.locke@elastic.co> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Adam Locke <adam.locke@elastic.co> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>	2020-06-05 18:55:51 -07:00
James Rodewig	26d2b4f871	[DOCS] Fix typo in CCR overview docs (#57709 ) (#57773 ) Co-authored-by: bellengao <gbl_long@163.com>	2020-06-05 17:42:49 -04:00
Nik Everett	3b1dfa3b5d	Remove deprecated wrapper from scripted_metric (backport of #57627) (#57763) This removes the deprecated `asMultiBucketAggregator` wrapper from `scripted_metric`. Unlike most other such removals, this isn't likely to save much memory. But it does make the internals of the aggregator slightly less twisted. Relates to #56487	2020-06-05 16:14:28 -04:00
Benjamin Trent	9666a895f7	[ML] inference performance optimizations and refactor (#57674) (#57753) This is a major refactor of the underlying inference logic. The main change is that we now separate the model configuration from the inference interfaces. This has the following benefits: - we can store extra things with the model that are not necessary for inference (i.e. treenode split information gain) - we can optimize inference separately from model serialization and storage. - The user is oblivious to the optimizations (other than seeing the benefits). A major part of this commit is removing all inference related methods from the trained model configurations (ensemble, tree, etc.) and moving them to a new class. This new class satisfies a new interface that is ONLY for inference. The optimizations applied currently are: - feature maps are flattened once - feature extraction only happens once at the highest level (improves inference + feature importance throughput) - Only storing what we need for inference + feature importance on heap	2020-06-05 14:20:58 -04:00
Martijn van Groningen	f170b52e64	Backing indices should use composable template matching with the corresponding data stream name (#57728) Backport of #57640 to 7.x branch. Composable templates with exact matches can match the data stream name, but not the backing index name. Also, if the backing index naming scheme changes, then a composable template may never match a backing index. In that case mappings and settings may not get applied.	2020-06-05 18:38:22 +02:00
Dan Hermann	3fe93e24a6	[7.x] Prohibit closing the write index for a data stream (#57740 )	2020-06-05 11:14:43 -05:00
James Rodewig	bc921ea17c	[DOCS] Add docs for designing resilient clusters (#47233 ) (#57743 ) Adds some guidance for designing clusters to be resilient to failures, including example architectures. Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: David Turner <david.turner@elastic.co>	2020-06-05 12:08:45 -04:00
Jake Landis	459ab9a0b2	[7.x] Ensure type exists for all monitoring configuration (#57399) (#57704) #47711 and #47246 helped to validate that invalid monitoring settings are rejected at the time they are set. Otherwise an invalid monitoring setting can find its way into the cluster state and result in an exception thrown [1] on cluster state application (thereby causing significant issues). Some additional monitoring settings have been identified that can result in an invalid cluster state and also cause exceptions to be thrown on cluster state application. All settings require a type of either http or local to be applicable. When a setting is changed, the exporters are automatically updated with the new settings. However, if the old or new settings lack a type setting, an exception will be thrown (since exporters are always of type 'http' or 'local'). Arguably we shouldn't blindly create and destroy new exporters on each monitoring setting update, but the lifecycle of the exporters is a bit outside the scope of what this PR is trying to address. This commit introduces a similar methodology to check for validity as #47711 and #47246 but this time for ALL (including non-http) settings. Monitoring settings are not useful unless there is an exporter with a type defined. The type is used as a dependent setting, such that it must exist to set the value. This ensures that when any monitoring settings change, they can only get added to cluster state if the type exists. If the type exists (and the other validations pass) then the exporters will get re-built and the cluster state remains valid. Tests have been included to ensure that all dynamic monitoring settings have the type as a dependent setting. [1] org.elasticsearch.common.settings.SettingsException: missing exporter type for [found-user-defined] exporter at org.elasticsearch.xpack.monitoring.exporter.Exporters.initExporters(Exporters.java:126) ~[?:?]	2020-06-05 10:47:11 -05:00
Adam Locke	fe558f6373	[DOCS] Clarifying env variable substitution (#57370 ) (#57737 ) * Clarifying environment variable substitution in the ES configuration YAML * Update code snippet * Remove extraneous quotes from string example * Incorporating review feedback	2020-06-05 11:08:03 -04:00
Nik Everett	94b3eed6be	Re-mute test Tracked in #57402	2020-06-05 10:52:24 -04:00
Dimitris Athanasiou	f49a14ce6f	[7.x][ML] Fix race condition when force stopping DF analytics job (#57680 ) (#57717 ) When we force delete a DF analytics job, we currently first force stop it and then we proceed with deleting the job config. This may result in logging errors if the job config is deleted before it is retrieved while the job is starting. Instead of force stopping the job, it would make more sense to try to stop the job gracefully first. So we now try that out first. If normal stop fails, then we resort to force stopping the job to ensure we can go through with the delete. In addition, this commit introduces `timeout` for the delete action and makes use of it in the child requests. Backport of #57680	2020-06-05 17:50:01 +03:00
Martijn van Groningen	c407b0f40d	[DOCS] Add data stream overview and intro (#57724) Backporting #57596 to 7.x branch. Added data streams overview page and an introduction to data streams. Relates to #53100 Co-authored-by: Dan Hermann <danhermann@users.noreply.github.com> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-05 16:10:03 +02:00
Armin Braun	8805c1f112	Manually Craft CreateSnapshotRequest to fix BwC Test (#57661 ) (#57715 ) We can't use the high level create snapshot request any longer since we changed some of its default parameters in `8` and those are not understood by older versions like `7.4`. Closes #57650	2020-06-05 15:49:44 +02:00
Jun Ohtani	c75c8b6e9d	Expose discard_compound_token option to kuromoji_tokenizer (#57421 ) This commit exposes the new Lucene option `discard_compound_token` to the Elasticsearch Kuromoji plugin.	2020-06-05 15:41:01 +02:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
James Rodewig	b03a83a69d	[DOCS] Fix source filtering xrefs (#57720 ) (#57725 )	2020-06-05 09:05:30 -04:00
DU-ds	925c01f1d7	add jvm clarification (#57460 ) Emphasise in the Docker documentation that although the default heap size is 1GB, the docker-compose.yml example specifies 512MB.	2020-06-05 11:48:15 +01:00
Hendrik Muhs	61c496d320	[Transform] use old roles only together with old endpoints (#57710) avoids a CI failure if new endpoints are used together with old roles and warnings are asserted.	2020-06-05 10:08:05 +02:00
Hendrik Muhs	e91b975878	[Transform] mark old data frame transform roles deprecated (#57655 ) mark old data frame transform roles deprecated fixes #50087	2020-06-05 09:20:35 +02:00
Hendrik Muhs	c1c8817eae	[7.x][Transform] improve update API (#57685 ) rewrite config on update if either version is outdated, credentials change, the update changes the config or deprecated settings are found. Deprecated settings get migrated to the new format. The upgrade can be easily extended to do any necessary re-writes. fixes #56499 backport #57648	2020-06-05 08:48:47 +02:00
Jake Landis	f4a3d969ad	[7.x] Ensure default watches are updated for rolling upgrades. (#57185) (#57563) For a rolling/mixed cluster upgrade (add new version to existing cluster then shutdown old instances), the watches that ship by default with monitoring may not get properly updated to the new version. Monitoring watches can only get published if the internal state is marked as dirty. If a node is not master, its state will also get marked as clean (i.e. not dirty). For a mixed cluster upgrade, it is possible for the new node to be added, not as master, and for its internal state to get marked as clean so that no more attempts can be made to publish the watches. This happens on all new nodes. Once the old nodes are de-commissioned, one of the new version nodes in the cluster gets promoted to master. However, that new master node (without intervention like restarting the node or removing/adding exporters) will never attempt to re-publish since the internal state was already marked as clean. This commit adds a cluster state listener to mark the resource dirty when a node is promoted to master. This will allow the new resource to be published without any intervention.	2020-06-04 16:44:36 -05:00
William Brafford	dfb6def3da	Revert "Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 )" This reverts commit 7a67fb2d04d46a10856271d634248dcf4050addb.	2020-06-04 16:25:05 -04:00
Nik Everett	de27253d87	Drop skip on test after backporting fix Fixed in 98c379c507a8cc93ae6015a5355fc5b6a213c0f6. Closes #57402	2020-06-04 16:04:18 -04:00
Gordon Brown	5a4e5a1e9d	Handle `cluster.max_shards_per_node` in YAML config (#57234 ) Prior to this commit, `cluster.max_shards_per_node` is not correctly handled when it is set via the YAML config file, only when it is set via the Cluster Settings API. This commit refactors how the limit is implemented, both to enable correctly handling the setting in the YAML and to more effectively centralize the logic used to enforce the limit. The logic used to apply the limit, as well as the setting value, has been moved to the new `ShardLimitValidator`.	2020-06-04 14:02:21 -06:00
James Rodewig	9c7a5c7b83	[DOCS] Move source filtering examples (#57689) (#57695) Moves the source filtering example snippets from the "Request body search" API docs page to the "Return fields in a search" section of the "Run a search" page.	2020-06-04 15:34:10 -04:00
Nik Everett	98c379c507	Merge remaining sig_terms into terms (#57397) (#57687) Merges the remaining implementation of `significant_terms` into `terms` so that we can more easily make them work properly without `asMultiBucketAggregator`, which should save memory and speed them up. Relates #56487	2020-06-04 14:32:32 -04:00
Howard	76ee1aad4b	Remove unused routing for ClusterState creation utils (#57679 ) Remove some unused routing definitions from cluster state creation utils.	2020-06-04 13:59:18 -04:00
Ioannis Kakavas	8afd55ebe6	Disable testing conventions for idp in fips (#57663 ) (#57676 ) Since we disable both integTest and test tasks. This should have been part of #57048 but we missed it.	2020-06-04 20:51:38 +03:00
Ioannis Kakavas	af9f9d7f03	[7.x] Add http proxy support for OIDC realm (#57039 ) (#57584 ) This change introduces support for using an http proxy for egress communication of the OpenID Connect realm.	2020-06-04 20:51:00 +03:00
William Brafford	7a67fb2d04	Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383 ) In #55592 and #55416, we deprecated the settings for enabling and disabling basic license features and turned those settings into no-ops. Since doing so, we've had feedback that this change may not give users enough time to cleanly switch from non-ILM index management tools to ILM. If two index managers operate simultaneously, results could be strange and difficult to reconstruct. We don't know of any cases where SLM will cause a problem, but we are restoring that setting as well, to be on the safe side. This PR is not a strict commit reversion. First, we are keeping the new xpack.watcher.use_ilm_index_management setting, introduced when xpack.ilm.enabled was made a no-op, so that users can begin migrating to using it. Second, the SLM setting was modified in the same commit as a group of other settings, so I have taken just the changes relating to SLM.	2020-06-04 13:38:22 -04:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
Armin Braun	24779c80f9	Serialize Outbound Message on Flush (#57084 ) (#57682 ) Follow up to #56961: We can be a little more efficient than just serializing at the IO loop by serializing only when we flush to a channel. This has the advantage that we don't serialize a long queue of messages for a channel that isn't writable for a longer period of time (unstable network, actually writing large volumes of data, etc.). Also, this further reduces the time for which we hold on to the write buffer for a message, making allocations because of an empty page cache recycler pool less likely.	2020-06-04 18:06:13 +02:00
Armin Braun	80d1b12fa3	Restore ThreadContext after Serializing OutboundMessage (#57659 ) (#57681 ) Stash the current context before restoring the stored context on the IO thread so that its thread context does not get polluted. Closes #57554	2020-06-04 17:55:26 +02:00
James Rodewig	2104b0503c	[DOCS] Reformat whitespace in search API docs (#57667 ) (#57669 ) Changes the search API docs to use: * Consistent indentation in param definitions * Two-space indentation in JSON snippets	2020-06-04 10:02:39 -04:00
Przemysław Witek	6b5f49d097	[7.x] Introduce ModelPlotConfig.annotations_enabled setting (#57539) (#57641)	2020-06-04 15:15:35 +02:00
Rene Groeschke	20aa4eec55	Set impliesSubProjects flag for root RunTask task (#57615 ) Fixes #57521	2020-06-04 14:57:35 +02:00
Benjamin Trent	ea9b8b9d41	[ML] fix setting forecasts to failed method (#57654 ) (#57656 )	2020-06-04 08:54:46 -04:00
Rene Groeschke	751f16858b	Remove duplicate ssl setup in sql/qa projects (#57319 ) (#57643 ) * Remove duplicate ssl setup in sql/qa projects * Fix enforcement of task instances * Use static data for cert generation * Move ssl testing logic into a plugin * Document test cert creation	2020-06-04 14:53:23 +02:00
David Turner	fc4dd6d681	Timeout health API on busy master (#57587 ) Today `GET _cluster/health?wait_for_events=...&timeout=...` will wait indefinitely for the master to process the pending cluster health task, ignoring the specified timeout. This could take a very long time if the master is overloaded. This commit fixes this by adding a timeout to the pending cluster health task.	2020-06-04 13:39:22 +01:00
Marios Trivyzas	5f8442d1f4	SQL: Improve performance of LTRIM/RTRIM (#57603) Change the custom implementation for stripping leading and trailing whitespace to substantially improve performance: ``` Benchmark Mode Cnt Score Error Units StringTrim.testWithStringBuilder avgt 25 82547.575 ± 66.244 ns/op (existing impl) StringTrim.testWithSubstring avgt 25 1398.762 ± 101.152 ns/op (new impl) StringTrim.testWithJavaStrip avgt 25 1186.120 ± 10.374 ns/op (for reference) ``` Java's String stripLeading()/stripTrailing() are not available in all supported JDKs. Enhanced LENGTH unit tests and combined a couple of LTRIM/RTRIM integ tests. Relates to: #57594 (partially cherry picked from commit ee7868d68733f195dc46926a7eab3d9dd7033ef4) Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>	2020-06-04 13:43:49 +02:00
Rene Groeschke	2c2d903277	Revert "Update Gradle wrapper to 6.5 (#57580 )" This reverts commit b7e39dd1c8f0ee1a5a856a8f4b96c42ec8e6357a.	2020-06-04 11:01:11 +02:00
Rene Groeschke	ddf01f89ef	Gradle Enterprise Plugin Update to 3.3.3 (#57583 ) This Updates the gradle enterprise plugin to the latest released version 3.3.3	2020-06-04 10:38:12 +02:00
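
The data stream creation check described in commit 1b84e93d83 above can be sketched roughly as follows. This is an illustrative Python sketch of the rule as the commit message states it, not the Java code from the commit: the function name `conflicting_indices` and its parameters are hypothetical, and only the `foo-%06d` naming pattern and the "counter greater than the generation" rule come from the message.

```python
import re


def conflicting_indices(data_stream_name, generation, existing_indices):
    """Return existing index names that could clash with future backing indices.

    Hypothetical helper: an index conflicts only if it is named exactly
    <data_stream_name>-<zero-padded counter> and its counter is greater
    than the data stream's current generation.
    """
    # %06d pads to at least six digits, so accept six or more.
    pattern = re.compile(re.escape(data_stream_name) + r"-(\d{6,})")
    conflicts = []
    for name in existing_indices:
        match = pattern.fullmatch(name)
        # Prefixed names such as "shrink-foo-000001" do not match the
        # pattern and are therefore allowed.
        if match and int(match.group(1)) > generation:
            conflicts.append(name)
    return conflicts
```

With a data stream `foo` of generation 4, a standalone `foo-000006` is reported as a conflict, while `foo-000001` and the prefixed `shrink-foo-000001` are not.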

51993 Commits