OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	0330bef409	Improve async search's tasks cancellation (#53799 ) This commit adds an explicit cancellation of the search task if the initial async search submit task is cancelled (connection closed by the user). This was previously done through the cancellation of the parent task but we don't handle grand-children cancellation yet so we have to manually cancel the search task in order to ensure that shard actions are cancelled too. This change can be considered as a workaround until #50990 is fixed.	2020-03-24 15:51:10 +01:00
Ahmet Arslan	52062565a9	[DOCS] Correct DFI docs regarding stop word removal (#53836 ) The documentation of DFI should recommend not to [remove stop words][1], since DFI is good at scoring queries that contain common terms: `the wall`, `the sun`, `the who`, etc. [1]:https://lucene.apache.org/core/8_1_1/core/org/apache/lucene/search/similarities/DFISimilarity.html	2020-03-24 10:48:42 -04:00
Przemysław Witek	7e25563303	Use the new ML state index name (.ml-state-000001) instead of the legacy one (.ml-state) (#54070 ) (#54085 )	2020-03-24 15:22:59 +01:00
Andrei Stefan	3234b50e95	SQL: jdbc debugging enhancement (#53880 ) (#54081 ) * add flush always output option that will flush the output printer after each debug message when enabled (disabled by default) * at debug output initialization time, log debug output information about OS, JVM and default JVM timezone (cherry picked from commit b5db9657d1eadce9902041e5b128bf32c02d302a)	2020-03-24 16:09:53 +02:00
Karen Metts	9da589c5fd	[DOCS] Replace outdated Logstash monitoring link (#54032 ) Replaces a link to Logstash OSS-only content with a link to the general Logstash monitoring topic.	2020-03-24 10:03:31 -04:00
Alan Woodward	39d7d0dc10	Upgrade to lucene 8.5.0 release (#54077 ) Upgrades our lucene dependency to the released 8.5.0 version.	2020-03-24 13:45:50 +00:00
Tanguy Leroux	dea8a31480	Wait for Active license before running CCR API tests (#53966 ) DocsClientYamlTestSuiteIT sometimes fails for CCR related tests because tests are started before the license is fully applied and active within the cluster. The first test to be executed then fails with the error noticed in #53430. This can be easily reproduced locally by only running CCR docs tests. This commit adds some @Before logic in DocsClientYamlTestSuiteIT so that it waits for the license to be active before running CCR tests. Closes #53430	2020-03-24 14:29:45 +01:00
David Roberts	1421471556	[ML] Introduce a "starting" datafeed state for lazy jobs (#54065 ) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Backport of #53918	2020-03-24 13:00:04 +00:00
Dan Hermann	30105a5ab5	[7.x] Cluster state and CRUD operations for data streams (#54073 )	2020-03-24 07:58:52 -05:00
Peter Schretlen	92acb2859b	Allow kibana_system to create and invalidate API keys on behalf of other users	2020-03-24 08:38:12 -04:00
Dimitris Athanasiou	be20bb5755	[7.x][ML] No refresh on indexing DFA stats (#53977 ) (#54064 ) When we index data frame analytics stats docs we do not need to refresh immediately. Backport of #53977	2020-03-24 13:13:03 +02:00
Armin Braun	4e462db2ed	Fix BlobStoreIncrementalityIT (#54055 ) (#54060 ) The snapshot stats response list of snapshot statuses is not ordered according to the given list of snapshot names so randomly we could mix up snapshot1 and snapshot2 when asserting on the stats. Fixed by getting each snapshot's stats individually. Closes #54034	2020-03-24 11:46:40 +01:00
Yang Wang	d33d20bfdc	Validate role templates before saving role mapping (#52636 ) (#54059 ) Role names are now compiled from role templates before role mapping is saved. This serves as validation for role templates to prevent malformed and invalid scripts from being persisted, which could later break authentication. Resolves: #48773	2020-03-24 20:43:59 +11:00
Dimitris Athanasiou	5ce7c99e74	[7.x][ML] Data frame analytics data counts (#53998 ) (#54031 ) This commit instruments data frame analytics with stats for the data that are being analyzed. In particular, we count training docs, test docs, and skipped docs. In order to account docs with missing values as skipped docs for analyses that do not support missing values, this commit changes the extractor so that it only ignores docs with missing values when it collects the data summary, which is used to estimate memory usage. Backport of #53998	2020-03-24 11:30:43 +02:00
Hendrik Muhs	7dcacf531f	[7.x][Transform][Rollup] add processing stats to record the ti… (#54027 ) add 2 additional stats: processing time and processing total which capture the time spent for processing results and how often it ran. The 2 new stats correspond to the existing indexing and search stats. Together with indexing and search this now allows the user to see the full picture, all 3 stages.	2020-03-24 09:22:02 +01:00
Mark Vieira	cff10368b8	Add remote debug run configuration for IntelliJ	2020-03-23 21:32:45 -07:00
Mark Vieira	ccceba4b74	Fix nasty errors when importing into IntelliJ	2020-03-23 21:32:37 -07:00
muachilin	b33fbe7026	Deprecate alternatives to the hot threads API (#52930 ) This commit deprecates various undocumented alternatives to the hot threads API.	2020-03-23 23:24:40 -04:00
Jason Tedor	e3ca124537	Introduce autoscaling decisions (#53934 ) This is the first in a series of commits that will introduce the autoscaling deciders framework. This commit introduces the basic framework for representing autoscaling decisions.	2020-03-23 23:08:06 -04:00
Jason Tedor	c1c9f7a735	Use onlyIf for build Docker image task execution (#54047 ) This commit switches to using an onlyIf to determine if a build Docker image task execution should occur. This is preferred since it means that the determination is performed at task execution time, rather than during configuration.	2020-03-23 22:53:08 -04:00
Tim Vernum	4bd853a6f2	Add "grant_api_key" cluster privilege (#54042 ) This change adds a new cluster privilege "grant_api_key" that allows the use of the new /_security/api_key/grant endpoint Backport of: #53527	2020-03-24 13:17:45 +11:00
Jim Ferenczi	9e3f7f4575	Add heuristics to compute pre_filter_shard_size when unspecified (#53873 ) (#54007 ) This commit changes the pre_filter_shard_size default from 128 to unspecified. This allows heuristics to be applied based on the request and the target indices when deciding whether the can match phase should run or not. When unspecified, this PR runs the can match phase automatically if one of these conditions is met: * The request targets more than 128 shards. * The request contains read-only indices. * The primary sort of the query targets an indexed field. Users can opt out of this behavior by setting `pre_filter_shard_size` to a static value. Closes #39835	2020-03-24 02:05:15 +01:00
Nik Everett	4734c645f1	Fix serialization bug for aggs (#54029 ) I created this bug today in #53793. When a `DelayableWriteable` that references an existing object serializes itself it wasn't taking the version of the node on the other side of the wire into account. This fixes that.	2020-03-23 19:00:47 -04:00
Benjamin Trent	19af869243	[ML] adds multi-class feature importance support (#53803 ) (#54024 ) Adds multi-class feature importance calculation. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` For users to get the full benefit of aggregating and searching for feature importance, they should update their index mapping as follows (before turning this option on in their pipelines) ``` "ml.inference.feature_importance": { "type": "nested", "dynamic": true, "properties": { "feature_name": { "type": "keyword" }, "importance": { "type": "double" } } } ``` The mapping field name is as follows `ml.<inference.target_field>.<inference.tag>.feature_importance` if `inference.tag` is not provided in the processor definition, it is not part of the field path. `inference.target_field` is defaulted to `ml.inference`. //cc @lcawl ^ Where should we document this? If this makes it in for 7.7, there shouldn't be any feature_importance at inference BWC worries as 7.7 is the first version to have it.	2020-03-23 18:49:07 -04:00
Jason Tedor	5c96a7e210	Fix compilation in RemoteClusterServiceTests This commit fixes an issue when a JDK collection convenience method not available in JDK 8 was backported to 7.x.	2020-03-23 18:41:17 -04:00
Gordon Brown	e225f08613	Mute TransformSurvivesUpgradeIT.testTransformRollingUpgrade (#54037 )	2020-03-23 16:38:04 -06:00
Julie Tibshirani	df7cfb3a5b	Remove the top-level 'mapping type' section. (#54035 ) It seemed confusing for users that our top-level mapping page still had a prominent section named 'Mapping Type'. This PR reworks the docs to remove this reference and adds a note about types removal (similar to the note we added to other APIs like put mapping).	2020-03-23 15:34:23 -07:00
Jason Tedor	d3cc5bff17	Give helpful message on remote connections disabled (#53690 ) Today, when cluster.remote.connect is set to false and some aspect of the codebase tries to get a remote client, we return a no such remote cluster exception. This can be quite perplexing to users, especially if the remote cluster is actually defined in their cluster state and it is only that the local node is not a remote cluster client. This commit addresses this by providing a dedicated error message when a remote cluster is not available because the local node is not a remote cluster client.	2020-03-23 18:32:38 -04:00
Mark Vieira	70cfedf542	Refactor global build info plugin to leverage JavaInstallationRegistry (#54026 ) This commit removes the configuration time vs execution time distinction with regards to certain BuildParams properties. Because of the cost of determining Java versions for configuration JDK locations we deferred this until execution time. This had two main downsides. First, we had to implement all this build logic in tasks, which required a bunch of additional plumbing and complexity. Second, because some information wasn't known during configuration time, we had to nest any build logic that depended on this in awkward callbacks. We now defer to the JavaInstallationRegistry recently added in Gradle. This utility uses a much more efficient method for probing Java installations vs our jrunscript implementation. This, combined with some optimizations to avoid probing the current JVM as well as deferring some evaluation via Providers when probing installations for BWC builds we can maintain effectively the same configuration time performance while removing a bunch of complexity and runtime cost (snapshotting inputs for the GenerateGlobalBuildInfoTask was very expensive). The end result should be a much more responsive build execution in almost all scenarios. (cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)	2020-03-23 15:30:10 -07:00
Jason Tedor	c97ee4e695	Fix classifier on OSS Linux aarch64 archive This commit fixes the classifier on the OSS Linux aarch64 archive.	2020-03-23 18:19:29 -04:00
Mark Vieira	be1b34c3f8	Mute BlobStoreIncrementalityIT.testIncrementalBehaviorOnPrimaryFailover	2020-03-23 15:15:30 -07:00
Nik Everett	b9bfba2c8b	Move pipeline agg validation to coordinating node (backport of #53669 ) (#54019 ) This moves the pipeline aggregation validation from the data node to the coordinating node so that we, eventually, can stop sending pipeline aggregations to the data nodes entirely. In fact, it moves it into the "request validation" stage so multiple errors can be accumulated and sent back to the requester for the entire request. We can't always take advantage of that, but it'll be nice for folks not to have to play whack-a-mole with validation. This is implemented by replacing `PipelineAggregationBuilder#validate` with: ``` protected abstract void validate(ValidationContext context); ``` The `ValidationContext` handles the accumulation of validation failures, provides access to the aggregation's siblings, and implements a few validation utility methods.	2020-03-23 17:22:56 -04:00
James Rodewig	43199a8c82	[DOCS] Remove double space in WDG docs	2020-03-23 17:18:04 -04:00
James Rodewig	553d8a9ca9	[DOCS] Fix "letter case" typo Changes "lettercase" to "letter case" in the `uppercase` token filter docs.	2020-03-23 17:11:59 -04:00
Jason Tedor	bc7b995523	Use deprecation logger holder in byte size value (#53928 ) If a setting is touched during bootstrap before logging is configured, and that setting uses a byte size value, the deprecation logger for ByteSizeValue will be initialized. However, this means a logger will be configured before log4j is initialized, which we reject at startup. This commit puts this deprecation logger in a holder pattern so that it is not initialized until first use, which will happen after logging is configured.	2020-03-23 17:06:12 -04:00
Mark Vieira	f0a015cae7	Clarify IntelliJ import instructions	2020-03-23 13:31:55 -07:00
Marios Trivyzas	3a3e964956	Reduce performance impact of ExitableDirectoryReader (#53978 ) (#54014 ) Benchmarking showed that the effect of the ExitableDirectoryReader is reduced considerably when checking every 8191 docs. Moreover, set the cancellable task before calling QueryPhase#preProcess() and make sure we don't wrap with an ExitableDirectoryReader at all when lowLevelCancellation is set to false to avoid completely any performance impact. Follows: #52822 Follows: #53166 Follows: #53496 (cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)	2020-03-23 21:30:34 +01:00
Christoph Büscher	286c3660bd	Add async_search get and delete APIs to HLRC (#53828 ) (#53980 ) This commit adds the "_async_search" get and delete APIs to the AsyncSearchClient in the High Level Rest Client. Relates to #49091 Backport of #53828	2020-03-23 21:21:36 +01:00
Benjamin Trent	d276058c6c	[ML] adjusting feature importance mapping for multi-class support (#53821 ) (#54013 ) Feature importance storage format is changing to encompass multi-class. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type. Native side change: https://github.com/elastic/ml-cpp/pull/1071	2020-03-23 15:50:12 -04:00
Nik Everett	181bc807be	Try to save memory on aggregations (backport of #53793 ) (#53996 ) This delays deserializing the aggregation response tree until right before we merge the objects.	2020-03-23 15:45:22 -04:00
Jason Tedor	80c24a0d62	Fix aarch64 OSS archive packaging This commit fixes the OSS aarch64 packaging to use the aarch64 JDK.	2020-03-23 15:07:25 -04:00
Jason Tedor	5a52ee3ef8	Fix typo in jdk-download testKit build.gradle This commit fixes a typo in the jdk-download testKit build.gradle file where "architecture" was not spelled correctly.	2020-03-23 15:05:30 -04:00
Jason Tedor	bf65bea6f4	Introduce aarch64 Docker image (#53936 ) This commit introduces the infrastructure needed to build a Docker image for aarch64.	2020-03-23 15:03:35 -04:00
Namgyu Kim	bc2289c258	Add nori_number token filter in analysis-nori (#53583 ) This change adds the `nori_number` token filter. It also adds a `discard_punctuation` option in nori_tokenizer that should be used in conjunction with the new filter.	2020-03-23 19:53:34 +01:00
Przemysław Witek	88c5d520b3	[7.x] Verify that the field is aggregatable before attempting cardinality aggregation (#53874 ) (#54004 )	2020-03-23 19:36:33 +01:00
Dan Hermann	ce31997ab2	disable check for non-snapshot builds for data streams feature flag (#54000 )	2020-03-23 13:29:51 -05:00
Paweł Krześniak	c0534f4157	[DOCS] link fix (#53973 ) Fix bad link in top_metrics.	2020-03-23 14:20:54 -04:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Dimitris Athanasiou	965af3a68b	[7.x][ML] Delete DF analytics stats upon job deletion (#53933 ) (#53997 ) Since a data frame analytics job may have associated docs in the .ml-stats-* indices, when the job is deleted we should delete those docs too. Backport of #53933	2020-03-23 19:55:36 +02:00
Dimitris Athanasiou	08a8345269	[7.x][ML] Fix typo in outlier detection timing stats (#53988 ) (#53995 ) The field holding the timing stats was mistakenly called `timings_stats`. Backport of #53988	2020-03-23 19:46:39 +02:00
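
Commit bc7b995523 in the table above describes putting the ByteSizeValue deprecation logger "in a holder pattern so that it is not initialized until first use". The following is a minimal Java sketch of that general idiom (the initialization-on-demand holder), not the actual Elasticsearch change. The class names, the stand-in logger, and the rule that unit-less values take the deprecated path are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ByteSizeSketch {
    // Counts how many times the stand-in logger has been constructed.
    static int loggersCreated = 0;

    // Hypothetical stand-in for a deprecation logger (not the real Elasticsearch class).
    static final class FakeDeprecationLogger {
        final List<String> messages = new ArrayList<>();

        FakeDeprecationLogger() {
            loggersCreated++;
        }

        void deprecated(String message) {
            messages.add(message);
        }
    }

    // Holder idiom: the JVM initializes this nested class, and therefore
    // constructs INSTANCE, only when INSTANCE is first accessed.
    private static final class DeprecationLoggerHolder {
        static final FakeDeprecationLogger INSTANCE = new FakeDeprecationLogger();
    }

    // Assumption for this sketch: values without a unit are the deprecated form.
    static long parseBytes(String value) {
        if (value.endsWith("kb")) {
            return Long.parseLong(value.substring(0, value.length() - 2)) * 1024L;
        }
        if (value.endsWith("b")) {
            return Long.parseLong(value.substring(0, value.length() - 1));
        }
        // Only this branch touches the holder, so only it can trigger logger creation.
        DeprecationLoggerHolder.INSTANCE.deprecated("unit-less byte size [" + value + "]");
        return Long.parseLong(value);
    }

    static List<String> deprecationMessages() {
        return DeprecationLoggerHolder.INSTANCE.messages;
    }

    public static void main(String[] args) {
        System.out.println(parseBytes("2kb"));
        System.out.println("loggers created so far: " + loggersCreated);
        System.out.println(parseBytes("512"));
        System.out.println("loggers created so far: " + loggersCreated);
    }
}
```

In this sketch, parsing a value with a unit never touches the holder, so nothing logger-related is constructed on that path. The first deprecated value creates the logger exactly once, and the JVM's class initialization rules make that creation thread-safe without explicit locking.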


50749 Commits