OpenSearch

Commit Graph

Author	SHA1	Message	Date
Ioannis Kakavas	74eeecf91b	Fix testGenerateAndSignMetadata in FIPS mode (#54115 ) (#54387 ) BC provider throws different error message on signature validation failure	2020-04-01 12:04:20 +03:00
Jason Tedor	63e5f2b765	Rename META_DATA to METADATA This is a follow up to a previous commit that renamed MetaData to Metadata in all of the places. In that commit in master, we renamed META_DATA to METADATA, but lost this on the backport. This commit addresses that.	2020-03-31 17:30:51 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Zachary Tong	c9db2de41d	[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451 ) * Comprehensively test supported/unsupported field type:agg combinations (#52493) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes is then compared against the output (success or exception) and the test succeeds or fails accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable * Use newIndexSearcher() to avoid incompatible readers (#52723) Lucene's `newSearcher()` can generate readers like ParallelCompositeReader which we can't use. We need to instead use our helper `newIndexSearcher`	2020-03-31 14:35:03 -04:00
David Roberts	b8f06df53f	[ML] Fix bug, add tests, improve estimates for estimate_model_memory (#54508 ) This PR: 1. Fixes the bug where a cardinality estimate of zero could cause a 500 status 2. Adds tests for that scenario and a few others 3. Adds sensible estimates for the cases that were previously TODO Backport of #54462	2020-03-31 17:59:38 +01:00
David Kyle	9150e77269	[7.x] Remove unused environment from anomaly detector classes (#54399 ) (#54456 )	2020-03-31 16:55:37 +01:00
Dimitris Athanasiou	e4230c533c	[7.x][ML] Move DFA MemoryUsage to stats.common pkg (#54492 ) (#54512 ) This belongs in stats.common Backport of #54492	2020-03-31 18:36:05 +03:00
Andrei Stefan	977302e46c	EQL: startsWith and endsWith functions implementation (#54504 ) * EQL: startsWith function implementation (#54400) (cherry picked from commit 666719fcfc40f6fc0535609577791369123320ab) * EQL: endsWith function implementation (#54442) (cherry picked from commit 554a4c8ef04b67eed107d29b57185e9af25d9d4f)	2020-03-31 18:06:03 +03:00
Dimitris Athanasiou	6d96ca9bc8	[7.x][ML] Reenable classification and regression integ tests (#54489 ) (#54494 ) Relates #54401 Backport of #54489	2020-03-31 17:50:08 +03:00
Andrei Stefan	364ea0a3c0	EQL: Length function implementation (#54209 ) (#54490 ) (cherry picked from commit 18493467e55e014be2c9e0ebdf734e9d7fc4beaa)	2020-03-31 16:49:18 +03:00
Ioannis Kakavas	349293da6d	Mute failing test (#54446 ) (#54487 ) see #54445	2020-03-31 15:56:10 +03:00
Tim Vernum	a0853628cd	Add wildcard service providers to IdP (#54477 ) This adds the ability for the IdP to define wildcard service providers in a JSON file within the ES node's config directory. If a request is made for a service provider that has not been registered, then the set of wildcard services is consulted. If the SP entity-id and ACS match one of the wildcard patterns, then a dynamic service provider is defined from the associated mustache template. Backport of: #54148	2020-03-31 16:53:13 +11:00
Jason Tedor	5d760051a9	Clarify autoscaling feature flag registration (#54427 ) This commit clarifies the autoscaling feature flag registration system property. The intention is that this system property is: - unset in snapshot builds - unset, true, or false in release builds - in release builds, unset behaves the same as false - therefore, we only register the enabled flag if the build is a snapshot build, or the build is a release build and the system property is set to true This commit clarifies that intention, and removes a confusing situation where the AUTOSCALING_FEATURE_FLAG_REGISTERED field would be set to false in a snapshot build, even though we were going to register the setting.	2020-03-30 21:37:25 -04:00
Ross Wolf	d11e977b1f	EQL: Use In from QL (#53244 ) * EQL: Use In from QL * EQL: Add more In tests * EQL: Test In duplicates * EQL: Add test for In mixed types * EQL: Copy In translation to QL * SQL: Use InComparisons from QL * EQL: Remove boost checks from QueryFolderOkTests * QL: Add TranslatorHandler.convert	2020-03-30 15:19:23 -06:00
Dimitris Athanasiou	b4b54efa73	[7.x][ML] Hyperparameter names should match config (#54401 ) (#54435 ) Java side of elastic/ml-cpp#1096 Backport of #54401	2020-03-30 23:32:40 +03:00
Ryan Ernst	c9421594bf	Remove allowTrial flag in license checking (#54293 ) The allowTrial flag is always true, since trial licenses act as though everything is licensed. This commit removes the allowTrial flag in license checking helper methods.	2020-03-30 12:22:38 -07:00
Nik Everett	e58ad9fed3	Clean up how pipeline aggs check for multi-bucket (backport of #54161 ) (#54379 ) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on aggregations that produce multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder`, and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality`, which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215	2020-03-30 10:44:55 -04:00
Jason Tedor	39b3010578	Add node local storage deprecation check (#54383 ) The node.local_storage setting has been deprecated and will be removed in 8.0.0. This commit adds a deprecation check to 7.x.	2020-03-30 10:23:43 -04:00
Christoph Büscher	67b9b68c66	[Docs] Add HLRC Async Search API documentation (#54353 ) Adds documentation and a corresponding test case containing typical API usage for the Async Search API to the High Level Rest Client.	2020-03-30 15:37:22 +02:00
Przemysław Witek	3c604da7f6	[7.x] Create an annotation when a model snapshot is stored (#53783 ) (#54405 )	2020-03-30 15:17:08 +02:00
Benjamin Trent	374e76d7cd	[Transform] fixing naming in HLRC and _cat to match API content (#54300 ) (#54408 ) Fixing the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call). Also fixes the names in the _cat API as well. closes #53946	2020-03-30 08:57:02 -04:00
Martijn van Groningen	4b4fbc160d	Refactor AliasOrIndex abstraction. (#54394 ) Backport of #53982 In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams, the abstraction needs to be made more flexible, because currently it really can be only an alias or an index. * Renamed `AliasOrIndex` to `IndexAbstraction`. * Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is. * Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum. * Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface. * Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`. * Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method. Relates to #53100	2020-03-30 10:12:16 +02:00
Jason Tedor	d2aced810d	Add assertion for get autoscaling decision API test This commit adds a match assertion to the get autoscaling decision REST test.	2020-03-29 14:36:38 -04:00
Jason Tedor	512a318b4b	Do not stash environment in security (#54372 ) Today the security plugin stashes a copy of the environment in its constructor, and uses the stashed copy to construct its components even though it is provided with an environment to create these components. What is more, the environment it creates in its constructor is not fully initialized, as it does not have the final copy of the settings, but the environment passed in while creating components does. This commit removes that stashed copy of the environment.	2020-03-28 12:47:16 -04:00
Jason Tedor	cf68ac8a2c	Do not stash environment in machine learning (#54371 ) Today the machine learning plugin stashes a copy of the environment in its constructor, and uses the stashed copy to construct its components even though it is provided with an environment to create these components. What is more, the environment it creates in its constructor is not fully initialized, as it does not have the final copy of the settings, but the environment passed in while creating components does. This commit removes that stashed copy of the environment.	2020-03-28 12:46:16 -04:00
Tim Brooks	2ccddbfa88	Move transport decoding and aggregation to server (#54360 ) Currently all of our transport protocol decoding and aggregation occurs in the individual transport modules. This means that each implementation (test, netty, nio) must implement this logic. Additionally, it means that the entire message has been read from the network before the server package receives it. This commit creates a pipeline in server which can be passed arbitrary bytes to handle. Internally, the pipeline will decode, decompress, and aggregate the messages. Additionally, this allows us to run many megabytes of bytes through the pipeline in tests to ensure that the logic works. This work will enable future work: Circuit breaking or backoff logic based on message type and byte in the content aggregator. Sharing bytes with the application layer using the ref counted releasable network bytes. Improved network monitoring based specifically on channels. Finally, this fixes the bug where we do not circuit break on the correct message size when compression is enabled.	2020-03-27 14:13:10 -06:00
Stuart Tettemer	1630de4a42	Scripting: stats per context in nodes stats (#54008 ) (#54357 ) Adds script cache stats to `_node/stats`. If using the general cache: ``` "script_cache": { "sum": { "compilations": 12, "cache_evictions": 9, "compilation_limit_triggered": 5 } } ``` If using context caches: ``` "script_cache": { "sum": { "compilations": 13, "cache_evictions": 9, "compilation_limit_triggered": 5 }, "contexts": [ { "context": "aggregation_selector", "compilations": 8, "cache_evictions": 6, "compilation_limit_triggered": 3 }, { "context": "aggs", "compilations": 5, "cache_evictions": 3, "compilation_limit_triggered": 2 }, ``` Backport of: 32f46f2 Refs: #50152	2020-03-27 12:26:00 -06:00
Lee Hinman	f2cc2b1127	[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039 ) (#54347 ) * Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) * Add REST APIs for IndexTemplateV2Metadata CRUD This commit adds the get/put/delete APIs for interacting with the now v2 versions of index templates. These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag. Relates to #53101 * Add exceptions for HLRC tests * Add skips for 7.x versions * Use index_template instead of template_v2 in action names * Add test for MetaDataIndexTemplateService.addIndexTemplateV2 * Move removal to static method and add test * Add unit tests for request classes (implement hashCode & equals) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Fix compilation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-27 10:47:22 -06:00
Christoph Büscher	0d17295601	[Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (#54325 ) The semantics and the default value for this parameter have changed, adapting the javadoc accordingly.	2020-03-27 16:02:03 +01:00
Przemysław Witek	2eb079b67f	Add version guards around ML hidden indices settings (#54322 )	2020-03-27 14:50:57 +01:00
Ioannis Kakavas	5983f6aceb	Mute testSpInitiatedSsoFailsForMalformedRequest (#54328 ) (#54339 ) see #54285	2020-03-27 15:46:08 +02:00
Yannick Welsch	8126ad0ab1	Increase timeout on testUpdateAnalysisLeaderIndexSettings Closes #54204	2020-03-27 13:41:47 +01:00
Przemysław Witek	d40afc7871	[7.x] Do not fail Evaluate API when the actual and predicted fields' types differ (#54255 ) (#54319 )	2020-03-27 10:05:19 +01:00
Jason Tedor	c547fabb2b	Put CCR tasks on (data && remote cluster clients) (#54146 ) Today we assign CCR persistent tasks to nodes with the data role. It could be that the data node is not capable of connecting to remote clusters, in which case the task will fail since it can not connect to the remote cluster with the leader shard. Instead, we need to assign such tasks to nodes that are capable of connecting to remote clusters. This commit addresses this by enabling such persistent tasks to only be assigned to nodes that have the data role, and also have the remote cluster client role.	2020-03-26 23:50:16 -04:00
Hendrik Muhs	4ecf9904d5	[Transform] Transform optimize date histogram (#54068 ) optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written. fixes #54254	2020-03-26 21:39:50 +01:00
Gordon Brown	0d30b48613	Disallow negative TimeValues (#53913 ) This commit causes negative TimeValues, other than -1 which is sometimes used as a sentinel value, to be rejected during parsing. Also introduces a hack to allow ILM to load policies which were written to the cluster state with a negative min_age, treating those values as 0, which should match the behavior of prior versions.	2020-03-26 13:30:35 -06:00
William Brafford	14204f8381	Use set-based interface for NodesStatsRequest (#53637 ) (#54141 ) The NodesStatsRequest class uses a set of strings for its internal serialization. This commit updates the class's interface so that we no longer use hard-coded getters and setters, but rather methods that add strings directly. For example, the old way of adding "os" metrics to a request would be to call request.os(true). The new way of doing this is to call request.addMetric("os"). For the time being, the canonical list of metrics is an enum in NodesStatsRequest. This will eventually be replaced with something pluggable.	2020-03-26 14:41:49 -04:00
Dimitris Athanasiou	13368aae37	[7.x][ML] DF Analytics should always display operational stats (#54210 ) (#54290 ) This commit populates the _stats API response with sensible "empty" `data_counts` and `memory_usage` objects when the job itself has not started reporting them. Backport of #54210	2020-03-26 20:03:14 +02:00
Christoph Büscher	da404bbce2	HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200 ) (#54266 ) Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no longer needed since we now only send the parameters along with the rest request that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct defaults are set on the client side. This change removes setting and sending these defaults where possible, leaving only the overwrite of batchedReduceSize with a default value of 5, since the default used in the vanilla SearchRequest is 512. However, we don't need to send this value along as a request parameter if it's the default since the correct one will be set on the receiving end if no value is specified. Also adding tests for RestSubmitAsyncSearchAction that check the correct defaults are set when parameters are missing on the server side. Backport of #54200	2020-03-26 19:01:17 +01:00
David Turner	fc92bf4208	assertBusy in XPackRestIT#awaitCallApi (#54264 ) Retries in this method were lost in #45794. This commit reinstates them.	2020-03-26 16:16:05 +00:00
Dimitris Athanasiou	cc981fa377	[7.x][ML] Get ML filters size should default to 100 (#54207 ) (#54278 ) When get filters is called without setting the `size` parameter only up to 10 filters are returned. However, 100 filters should be returned. This commit fixes this and adds an integ test to guard it. It seems this was accidentally broken in #39976. Closes #54206 Backport of #54207	2020-03-26 17:51:43 +02:00
David Turner	f48e8f31b9	AwaitsFix for #54180	2020-03-26 15:35:36 +00:00
Gordon Brown	bbc6bc0299	Fix WatcherRestartIT.testWatcherRestart (#54237 ) This commit adjusts testWatcherRestart to vary the template version number it checks for based on the ES version being upgraded from, because the v11 template is only installed on clusters with all nodes >=7.7.0.	2020-03-26 08:12:15 -06:00
David Turner	ffe1ba3754	Add error_trace parameter to REST test helper (#54259 ) Today the `XPackRestTestHelper` makes some REST calls without the `error_trace` parameter, so that if they fail due to an exception we do not see very much detail. This commit adds the `error_trace` parameter to help identify why these REST calls fail.	2020-03-26 14:04:52 +00:00
David Turner	ad3c96e250	AwaitsFix for #54093	2020-03-26 13:24:33 +00:00
David Turner	53e2fec93d	AwaitsFix for #53612	2020-03-26 10:41:37 +00:00
Yannick Welsch	1ba6783780	Schedule commands in current thread context (#54187 ) Changes ThreadPool's schedule method to run the schedule task in the context of the thread that scheduled the task. This is the more sensible default for this method, and eliminates a range of bugs where the current thread context is mistakenly dropped. Closes #17143	2020-03-26 10:07:59 +01:00
Luca Cavanna	ff269160af	Async search: rename REST parameters (#54198 ) This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069	2020-03-26 09:40:50 +01:00
Yang Wang	1afd510721	Check authentication type using enum instead of string (#54145 ) (#54246 ) Avoid string comparison when we can use safer enums. This refactor is a follow up for #52178. Resolves: #52511	2020-03-26 15:45:10 +11:00
Tim Vernum	1fc518c25e	Improve stability of SamlServiceProviderIndexTests (#54241 ) This test assumed cluster events would be processed quickly which is not always true Backport of: #54166	2020-03-26 13:07:42 +10:00
Ryan Ernst	5a5d6e9ef2	Invert license security disabled helper method (#54043 ) (#54239 ) Xpack license state contains a helper method to determine whether security is disabled due to license level defaults. Most code needs to know whether security is enabled, not disabled, but this method exists so that the security being explicitly disabled can be distinguished from licence level defaulting to disabled. However, in the case that security is explicitly disabled, the handlers in question are never registered, so security is implicitly not disabled explicitly, and thus we can share a single method to know whether licensing is enabled.	2020-03-25 19:20:10 -07:00
Benjamin Trent	6d68cf809c	[Transform] Remove node.attr.transform.remote_connect and use new remote cluster client node role (#54217 ) (#54224 ) With the addition of a formal role for nodes indicating remote cluster connection, the transform specific attribute `node.attr.transform.remote_connect` is no longer necessary. closes https://github.com/elastic/elasticsearch/issues/54179	2020-03-25 16:29:02 -04:00
Nik Everett	8f40f1435a	Save a little space in agg tree (backport of #53730 ) (#54213 ) This drops the "top level" pipeline aggregators from the aggregation result tree which should save a little memory and a few serialization bytes. Perhaps more importantly, this provides a mechanism by which we can remove all pipelines from the aggregation result tree. This will save quite a bit of space when pipelines are deep in the tree. Sadly, doing this isn't simple because of backwards compatibility. Nodes before 7.7.0 need those pipelines. We provide them by passing a `Supplier<PipelineTree>` into the root of the aggregation tree that we only call if we need to serialize to a version before 7.7.0. This solution works for cross cluster search because we always reduce the aggregations in each remote cluster and then forward them back to the coordinating node. It's quite possible that the coordinating node needs the pipeline (say it is version 7.1.0) and the gateway node in the remote cluster doesn't (version 7.7.0). In that case the data nodes won't send the pipeline aggregations back to the gateway node. Critically, the gateway node will send the pipeline aggregations back to the coordinating node. This is all managed with that `Supplier<PipelineTree>`, but how it is managed is a bit tricky.	2020-03-25 15:51:16 -04:00
Jason Tedor	d14f170093	Add cluster.remote.connect to deprecation info API (#54142 ) This setting was recently deprecated in favor of node.remote_cluster_client. This commit adds this setting to the deprecation info API.	2020-03-25 15:11:59 -04:00
Nik Everett	b8b7516790	Disable WatcherRestartIT from 7.7.0 It is failing. Tracked in #54220.	2020-03-25 14:51:33 -04:00
Hendrik Muhs	cb0ecafdd8	[Transform] fix transform failure case for percentiles and spa… (#54202 ) index null if percentiles could not be calculated due to sparse data fixes #54201	2020-03-25 19:28:51 +01:00
Armin Braun	70b378cd1b	Upgrade GCS Dependency to 1.106.0 (#54092 ) (#54112 ) * Upgrade GCS Dependency to 1.106.0 (#54092) Upgrading GCS Dep + related dependencies as it seems some more retry bugs were fixed between .104 and .106	2020-03-25 19:05:01 +01:00
Martijn Laarman	077bf52acc	transform.cat should live in the cat namespace. (#54196 ) * transform.cat should live in the cat namespace. Similar to the ml cat APIs also living in the `cat` namespace. Clients treat the `cat` namespace differently than other APIs (return types, content types). This introduces an exception to this rule. * rename the specification file as well (cherry picked from commit 0a98904b1a73a30bbaebc32bd16a238c8d03c329)	2020-03-25 18:16:01 +01:00
Mark Vieira	7728ccd920	Enforce consistent compile options across all projects (#54120 ) (cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)	2020-03-25 08:24:21 -07:00
Dimitris Athanasiou	ba09a778dc	[7.x][ML] Unmute classification cardinality integ test (#54165 ) (#54173 ) Adjusts test to work for new cardinality limit. Backport of #54165	2020-03-25 15:00:34 +02:00
Benjamin Trent	ef05a4f416	[ML] relaxing parameters on stratified split test (#54127 ) (#54168 ) Relaxing the error rate a bit on two of the tests. Ran 1000s of times locally and never had a failure after these changes. closes https://github.com/elastic/elasticsearch/issues/54122	2020-03-25 08:06:15 -04:00
Tanguy Leroux	3a3930c7ec	Mute TooManyJobsIT.testCloseFailedJob on 7.x (#54163 ) Relates #54162	2020-03-25 12:44:41 +01:00
Tanguy Leroux	4a2db4651e	Mute ReadActionsTests (#54153 ) Relates #53340	2020-03-25 10:35:58 +01:00
Jason Tedor	381d7586e4	Introduce formal role for remote cluster client (#54138 ) This commit introduces a formal role for identifying nodes that are capable of making connections to remote clusters. Relates #53924	2020-03-24 21:59:43 -04:00
Oliver Gupte	96f0c668a8	[APM] Allow kibana to collect APM telemetry in background task (#52917 ) (#54106 ) * Required for elastic/kibana#50757. Allows the kibana user to collect APM telemetry in a background task. * removed unnecessary privileges on `.ml-anomalies-*` for the `kibana_system` reserved role	2020-03-24 18:11:19 -07:00
David Roberts	7667004b20	[ML] Add a model memory estimation endpoint for anomaly detection (#54129 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Backport of #53507	2020-03-24 22:55:11 +00:00
Ioannis Kakavas	7c0123d6f3	Add SAML IdP plugin for internal use (#54046 ) (#54124 ) This change merges the "feature-internal-idp" branch into Elasticsearch. This introduces a small identity-provider plugin as a child of the x-pack module. This allows ES to act as a SAML IdP, for users who are authenticated against the Elasticsearch cluster. This feature is intended for internal use within Elastic Cloud environments and is not supported for any other use case. It falls under an enterprise license tier. The IdP is disabled by default. Co-authored-by: Ioannis Kakavas <ioannis@elastic.co> Co-authored-by: Tim Vernum <tim.vernum@elastic.co>	2020-03-25 09:45:13 +11:00
Gordon Brown	82e041442e	Add version guards around Transform hidden index settings (#54036 ) This commit ensures that the hidden index settings are only applied to the Transform index templates when the cluster can support those settings. Also unmutes the tests which were failing due to the previous behavior.	2020-03-24 15:52:56 -06:00
Ross Wolf	627ca03c72	EQL: Remove parser handling for functions (#54028 ) * EQL: Remove parser handling for functions * EQL: Comment out array functions in queries-unsupported.eql	2020-03-24 14:03:02 -06:00
Costin Leau	68f74cf593	EQL: Fix custom scripting for functions (#53935 ) (#54114 ) Improve separation of scripting between EQL and SQL by delegating common methods to QL. The context detection is determined based on the package to avoid having repetitive class hierarchies. The Painless whitelists have been improved so that the declaring class is used instead of the inherited one. Relates #53688 (cherry picked from commit 6d46033e736c64ac9255c5d6964600d2a931430a) EQL: Add Substring function with Python semantics (#53688) Does not reuse substring from SQL due to the difference in semantics and the accepted arguments. Currently it is missing full integration tests as, due to the usage of scripting, requires an actual integration test against a proper cluster (and likely its own QA project). (cherry picked from commit f58680bad33d5ce4139157a69a4d9f5f286bc3c4)	2020-03-24 20:54:19 +02:00
markharwood	6a60f85bba	Wildcard field - add normalizer support (#53851 ) (#54109 ) Backport support for normalisation to wildcard field Closes #53603	2020-03-24 17:37:47 +00:00
Dimitris Athanasiou	c141c1dd89	[7.x][ML] Stratified cross validation split for classification (#54087 ) (#54104 ) As classification now works for multiple classes, randomly picking training/test data frame rows is not good enough. This commit introduces a stratified cross validation splitter that maintains the proportion of each class in the dataset in the sample that is used for training the model. Backport of #54087	2020-03-24 18:47:36 +02:00
Yannick Welsch	e006d1f6cf	Use special XContent registry for node tool (#54050 ) Fixes an issue where the elasticsearch-node command-line tools would not work correctly because PersistentTasksCustomMetaData contains named XContent from plugins. This PR makes it so that the parsing for all custom metadata is skipped, even if the core system would know how to handle it. Closes #53549	2020-03-24 17:40:51 +01:00
Luca Cavanna	6b457abbd3	Async search: prevent users from overriding pre_filter_shard_size (#54088 ) Submit async search forces pre_filter_shard_size for the underlying search that it creates. With this commit we also prevent users from overriding such default as part of request validation.	2020-03-24 17:06:04 +01:00
Luca Cavanna	3c67762f1b	Async search response: output start and expiration time as time fields (#54084 ) This commit makes start_time and expiration_time time fields, so that their date variant will be printed out when human readable output is requested.	2020-03-24 17:05:56 +01:00
Jim Ferenczi	0330bef409	Improve async search's tasks cancellation (#53799 ) This commit adds an explicit cancellation of the search task if the initial async search submit task is cancelled (connection closed by the user). This was previously done through the cancellation of the parent task but we don't handle grand-children cancellation yet so we have to manually cancel the search task in order to ensure that shard actions are cancelled too. This change can be considered as a workaround until #50990 is fixed.	2020-03-24 15:51:10 +01:00
Andrei Stefan	3234b50e95	SQL: jdbc debugging enhancement (#53880 ) (#54081 ) * add flush always output option that will flush the output printer after each debug message when enabled (disabled by default) * at debug output initialization time, log debug output information about OS, JVM and default JVM timezone (cherry picked from commit b5db9657d1eadce9902041e5b128bf32c02d302a)	2020-03-24 16:09:53 +02:00
Alan Woodward	39d7d0dc10	Upgrade to lucene 8.5.0 release (#54077 ) Upgrades our lucene dependency to the released 8.5.0 version.	2020-03-24 13:45:50 +00:00
David Roberts	1421471556	[ML] Introduce a "starting" datafeed state for lazy jobs (#54065 ) It is possible for ML jobs to open lazily if the "allow_lazy_open" option in the job config is set to true. Such jobs wait in the "opening" state until a node has sufficient capacity to run them. This commit fixes the bug that prevented datafeeds for jobs lazily waiting assignment from being started. The state of such datafeeds is "starting", and they can be stopped by the stop datafeed API while in this state with or without force. Backport of #53918	2020-03-24 13:00:04 +00:00
Peter Schretlen	92acb2859b	Allow kibana_system to create and invalidate API keys on behalf of other users	2020-03-24 08:38:12 -04:00
Dimitris Athanasiou	be20bb5755	[7.x][ML] No refresh on indexing DFA stats (#53977 ) (#54064 ) When we index data frame analytics stats docs we do not need to refresh immediately. Backport of #53977	2020-03-24 13:13:03 +02:00
Yang Wang	d33d20bfdc	Validate role templates before saving role mapping (#52636 ) (#54059 ) Role names are now compiled from role templates before role mapping is saved. This serves as validation for role templates to prevent malformed and invalid scripts from being persisted, which could later break authentication. Resolves: #48773	2020-03-24 20:43:59 +11:00
Dimitris Athanasiou	5ce7c99e74	[7.x][ML] Data frame analytics data counts (#53998 ) (#54031 ) This commit instruments data frame analytics with stats for the data that are being analyzed. In particular, we count training docs, test docs, and skipped docs. In order to account docs with missing values as skipped docs for analyses that do not support missing values, this commit changes the extractor so that it only ignores docs with missing values when it collects the data summary, which is used to estimate memory usage. Backport of #53998	2020-03-24 11:30:43 +02:00
Hendrik Muhs	7dcacf531f	[7.x][Transform][Rollup] add processing stats to record the ti… (#54027 ) add 2 additional stats: processing time and processing total which capture the time spent for processing results and how often it ran. The 2 new stats correspond to the existing indexing and search stats. Together with indexing and search this now allows the user to see the full picture, all 3 stages.	2020-03-24 09:22:02 +01:00
Jason Tedor	e3ca124537	Introduce autoscaling decisions (#53934 ) This is the first in a series of commits that will introduce the autoscaling deciders framework. This commit introduces the basic framework for representing autoscaling decisions.	2020-03-23 23:08:06 -04:00
Tim Vernum	4bd853a6f2	Add "grant_api_key" cluster privilege (#54042 ) This change adds a new cluster privilege "grant_api_key" that allows the use of the new /_security/api_key/grant endpoint Backport of: #53527	2020-03-24 13:17:45 +11:00
Benjamin Trent	19af869243	[ML] adds multi-class feature importance support (#53803 ) (#54024 ) Adds multi-class feature importance calculation. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` For users to get the full benefit of aggregating and searching for feature importance, they should update their index mapping as follows (before turning this option on in their pipelines) ``` "ml.inference.feature_importance": { "type": "nested", "dynamic": true, "properties": { "feature_name": { "type": "keyword" }, "importance": { "type": "double" } } } ``` The mapping field name is as follows `ml.<inference.target_field>.<inference.tag>.feature_importance` if `inference.tag` is not provided in the processor definition, it is not part of the field path. `inference.target_field` is defaulted to `ml.inference`. //cc @lcawl ^ Where should we document this? If this makes it in for 7.7, there shouldn't be any feature_importance at inference BWC worries as 7.7 is the first version to have it.	2020-03-23 18:49:07 -04:00
Gordon Brown	e225f08613	Mute TransformSurvivesUpgradeIT.testTransformRollingUpgrade (#54037 )	2020-03-23 16:38:04 -06:00
Mark Vieira	70cfedf542	Refactor global build info plugin to leverage JavaInstallationRegistry (#54026 ) This commit removes the configuration time vs execution time distinction with regards to certain BuildParams properties. Because of the cost of determining Java versions for configuration JDK locations we deferred this until execution time. This had two main downsides. First, we had to implement all this build logic in tasks, which required a bunch of additional plumbing and complexity. Second, because some information wasn't known during configuration time, we had to nest any build logic that depended on this in awkward callbacks. We now defer to the JavaInstallationRegistry recently added in Gradle. This utility uses a much more efficient method for probing Java installations vs our jrunscript implementation. Combined with some optimizations to avoid probing the current JVM, as well as deferring some evaluation via Providers when probing installations for BWC builds, this lets us maintain effectively the same configuration time performance while removing a bunch of complexity and runtime cost (snapshotting inputs for the GenerateGlobalBuildInfoTask was very expensive). The end result should be a much more responsive build execution in almost all scenarios. (cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)	2020-03-23 15:30:10 -07:00
Nik Everett	b9bfba2c8b	Move pipeline agg validation to coordinating node (backport of #53669 ) (#54019 ) This moves the pipeline aggregation validation from the data node to the coordinating node so that we, eventually, can stop sending pipeline aggregations to the data nodes entirely. In fact, it moves it into the "request validation" stage so multiple errors can be accumulated and sent back to the requester for the entire request. We can't always take advantage of that, but it'll be nice for folks not to have to play whack-a-mole with validation. This is implemented by replacing `PipelineAggregationBuilder#validate` with: ``` protected abstract void validate(ValidationContext context); ``` The `ValidationContext` handles the accumulation of validation failures, provides access to the aggregation's siblings, and implements a few validation utility methods.	2020-03-23 17:22:56 -04:00
Marios Trivyzas	3a3e964956	Reduce performance impact of ExitableDirectoryReader (#53978 ) (#54014 ) Benchmarking showed that the effect of the ExitableDirectoryReader is reduced considerably when checking every 8191 docs. Moreover, set the cancellable task before calling QueryPhase#preProcess() and make sure we don't wrap with an ExitableDirectoryReader at all when lowLevelCancellation is set to false, to completely avoid any performance impact. Follows: #52822 Follows: #53166 Follows: #53496 (cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)	2020-03-23 21:30:34 +01:00
Christoph Büscher	286c3660bd	Add async_search get and delete APIs to HLRC (#53828 ) (#53980 ) This commit adds the "_async_search" get and delete APIs to the AsyncSearchClient in the High Level Rest Client. Relates to #49091 Backport of #53828	2020-03-23 21:21:36 +01:00
Benjamin Trent	d276058c6c	[ML] adjusting feature importance mapping for multi-class support (#53821 ) (#54013 ) Feature importance storage format is changing to encompass multi-class. Feature importance objects are now mapped as follows (logistic) Regression: ``` { "feature_name": "feature_0", "importance": -1.3 } ``` Multi-class [class names are `foo`, `bar`, `baz`] ``` { "feature_name": "feature_0", "importance": 2.0, // sum(abs()) of class importances "foo": 1.0, "bar": 0.5, "baz": -0.5 }, ``` This change adjusts the mapping creation for analytics so that the field is mapped as a `nested` type. Native side change: https://github.com/elastic/ml-cpp/pull/1071	2020-03-23 15:50:12 -04:00
Przemysław Witek	88c5d520b3	[7.x] Verify that the field is aggregatable before attempting cardinality aggregation (#53874 ) (#54004 )	2020-03-23 19:36:33 +01:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. 
In the docs, though, we want to show a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Dimitris Athanasiou	965af3a68b	[7.x][ML] Delete DF analytics stats upon job deletion (#53933 ) (#53997 ) Since a data frame analytics job may have associated docs in the .ml-stats-* indices, when the job is deleted we should delete those docs too. Backport of #53933	2020-03-23 19:55:36 +02:00
Dimitris Athanasiou	08a8345269	[7.x][ML] Fix typo in outlier detection timing stats (#53988 ) (#53995 ) The field holding the timing stats was mistakenly called `timings_stats`. Backport of #53988	2020-03-23 19:46:39 +02:00
Ryan Ernst	960d1fb578	Revert "Introduce system index APIs for Kibana (#53035 )" (#53992 ) This reverts commit `c610e0893d`. backport of #53912	2020-03-23 10:29:35 -07:00
Armin Braun	5b9864db2c	Better Incrementality for Snapshots of Unchanged Shards (#52182 ) (#53984 ) Use sequence numbers and force merge UUID to determine whether a shard has changed or not before falling back to comparing files to get incremental snapshots on primary fail-over.	2020-03-23 16:43:41 +01:00
Dimitris Athanasiou	3873510332	[7.x][ML] Refactor DFA custom processor to cross validation splitter (#53915 ) (#53956 ) While `CustomProcessor` is generic and allows for flexibility, there are new requirements that make cross validation a concept that is hard to abstract behind custom processor. In particular, we would like to add data_counts to the DFA jobs stats. Counting training vs. test docs would be a useful statistic. We would also want to add a different cross validation strategy for multiclass classification. This commit renames custom processors to cross validation splitters which allows for those enhancements without cryptically doing things as a side effect of the abstract custom processing. Backport of #53915	2020-03-23 17:15:14 +02:00

5136 Commits