OpenSearch

Commit Graph

Author	SHA1	Message	Date
Michael Basnight	eee4cfaa8b	Refactor HLRC transform stats test (#48708 ) This test uses a deprecated base class, and this commit moves it over to the new class. Ref #39745	2019-10-30 14:42:37 -05:00
Michael Basnight	d63e5772c0	Cleanup HLRC graph tests to use new test style (#48644 ) The old graph tests were duplicated a lot and used a deprecated parent class. This commit cleans that up and removes one of the duplicated tests. Ref #39745	2019-10-30 14:42:22 -05:00
Benjamin Trent	c9ead80c31	[7.x] [ML][Inference] separating definition and config object storage (#48651 ) (#48695 ) * [ML][Inference] separating definition and config object storage (#48651) This separates out the `definition` object from being stored within the configuration object in the index. This allows us to gather the config object without decompressing a potentially large definition. Additionally, `input` is moved to the TrainedModelConfig object and out of the definition. This is so the trained input fields are accessible outside the potentially large model definition.	2019-10-30 13:27:29 -04:00
Gordon Brown	4bd514715d	Increase timeout in ILM doc test slightly (#48606 ) This assertBusy can occasionally time out on systems under heavy load, such as CI, so this commit increases the timeout.	2019-10-29 11:12:41 -07:00
Christoph Büscher	09d68e7548	Support `search_type` in Rank Evaluation API (#48542 ) (#48631 ) Adding support for the `search_type` request parameter to the Ranking Evaluation API since this parameter can impact the ranking and the metric score and should be chosen in the same way when evaluating the search as later in the real search. Closes #48503	2019-10-29 14:54:33 +01:00
Rory Hunter	da4654527b	Improve resiliency to auto-formatting in client (#48617 ) Backport of #48447. Make a number of changes so that code in the `client` directory is more resilient to automatic formatting. This covers: * Literal JSON handling: * Reformatting multiline JSON to embed whitespace in the strings * Remove string concatenation where JSON fits on a single line * Use `String.format` for large documents with variable content * Remove some erroneous doc refs in `QueryDSLDocumentationTests` * Move some comments around so they aren't auto-formatted to a strange place	2019-10-29 10:40:54 +00:00
Mark Vieira	e5c6440a4f	Simplify usage of Gradle Shadow plugin (#48478 ) (#48597 ) This commit simplifies and standardizes our usage of the Gradle Shadow plugin to conform more to plugin conventions. The custom "bundle" plugin has been removed as it's not necessary and performs the same function as the Shadow plugin's default behavior with existing configurations. Additionally, this removes unnecessary creation of a "nodeps" artifact, which is unnecessary because by default project dependencies will in fact use the non-shadowed JAR unless explicitly depending on the "shadow" configuration. Finally, we've cleaned up the logic used for unit testing, so we are now correctly testing against the shadow JAR when the plugin is applied. This better represents a real-world scenario for consumers and provides better test coverage for incorrectly declared dependencies. (cherry picked from commit 3698131109c7e78bdd3a3340707e1c7b4740d310)	2019-10-28 12:11:55 -07:00
Benjamin Trent	6ea59dd428	[ML][Transforms] add wait_for_checkpoint flag to stop (#47935 ) (#48591 ) Adds `wait_for_checkpoint` for `_stop` API.	2019-10-28 13:02:57 -04:00
Gordon Brown	5021410165	Retry on RepositoryException in SLM tests (#48548 ) Due to a bug, GETing a snapshot can cause a RepositoryException to be thrown. This error is transient and should be retried, rather than causing the test to fail. This commit converts those RepositoryExceptions into AssertionErrors so that they will be retried in code wrapped in assertBusy.	2019-10-28 09:24:38 -07:00
Michael Basnight	1ba57dbe08	[Docs] add missing snapshot restore reference (#45256 )	2019-10-28 09:55:10 -05:00
David Turner	e821a22580	Mute SecurityDocumentationIT#testGetUsers - see #48440	2019-10-28 14:03:01 +01:00
Michael Basnight	5228956ecc	Add slices to delete and update by query in HLRC (#48420 ) The slices param was missing from both delete by query and update by query in the HLRC request converters. This commit fixes the omission.	2019-10-25 15:23:17 -05:00
Tanguy Leroux	2861088a59	Mute RestClientMultipleHostsIntegTests.testCancelAsyncRequests (#48535 ) Relates https://github.com/elastic/elasticsearch/issues/45577	2019-10-25 17:42:24 +02:00
Tim Brooks	c0b545f325	Make BytesReference an interface (#48486 ) BytesReference is currently an abstract class which is extended by various implementations. This makes it very difficult to use the delegation pattern. The implication of this is that our releasable BytesReference is a PagedBytesReference type and cannot be used as a generic releasable bytes reference that delegates to any reference type. This commit makes BytesReference an interface and introduces an AbstractBytesReference for common functionality.	2019-10-24 15:39:30 -06:00
Michael Basnight	d49958cef3	Remove deprecated test from the HLRC tests (#48424 ) The AbstractHlrcWriteableXContentTestCase was replaced by a better test case a while ago, and this is the last two instances using it. They have been converted and the test is now deleted. Ref #39745	2019-10-24 14:02:04 -05:00
Michael Basnight	c19379ef31	Remove random when using HLRC sync and async calls (#48211 ) This commit removes the randomization used by every execute call in the high level rest tests. Previously every execute call, which can be many calls per single test, would rely on a random boolean to determine if they should use the sync or async methods provided to the execute method. This commit runs the tests twice, using two different clusters, both of them providing the value one time via a sysprop. This ensures that the whole suite of tests is run using the sync and async code paths. Closes #39667	2019-10-24 09:06:17 -05:00
Przemysław Witek	aa29567e11	[7.x] Fix assignment Backport of https://github.com/elastic/elasticsearch/pull/48216	2019-10-22 11:34:09 +02:00
Martijn van Groningen	c09b62d5bf	Backport: also validate source index at put enrich policy time (#48311 ) Backport of: #48254 This changes tests to create a valid source index prior to creating the enrich policy.	2019-10-22 07:38:16 +02:00
Przemysław Witek	2db2b945ec	[7.x] Change format of MulticlassConfusionMatrix result to be more self-explanatory (#48174 ) (#48294 )	2019-10-21 22:07:19 +02:00
Lee Hinman	cc0c876a8d	fix incorrect comparison (#48208 ) (#48303 ) * remove comparison of identical values the comparison `tookInMillis == tookInMillis` is always true. * add comparison between tookInMillis	2019-10-21 09:14:44 -06:00
Przemysław Witek	1a42e37070	[7.x] Default "prediction_field_name" to (dependent_variable + "_prediction") (#48232 ) (#48279 )	2019-10-21 13:18:08 +02:00
Albert Zaharovits	69fc715bc3	Fix security origin for TokenService#findActiveTokensFor... (#47418 ) (#48280 ) All internal searches (triggered by APIs) across the .security index must be performed while "under the security origin". Otherwise, the search is performed in the context of the caller which most likely does not have privileges to search .security (hopefully). This commit fixes this in the case of two methods in the TokenService and corrects an overly done such context switch in the ApiKeyService. In addition, this makes all tests from the client/rest-high-level module execute as an all mighty administrator, but not a literal superuser. Closes #47151	2019-10-21 13:15:05 +03:00
Martijn van Groningen	5cb44a414c	Fixed links in java docs for EnrichClient (#48233 )	2019-10-18 16:26:49 +02:00
Benjamin Trent	876f4aafac	[ML] Add logistic_regression output aggregator (#48238 ) (#48244 )	2019-10-18 10:08:17 -04:00
Przemysław Witek	28f68fa221	Make num_top_classes parameter's default value equal to 2 (#48119 ) (#48201 )	2019-10-17 18:43:15 +02:00
Gordon Brown	eb7969e8cc	Fix ILM HLRC Javadoc->Documentation links (#48083 ) Several links from the ILM HLRC Javadoc to the online documentation were not updated when the ILM HLRC documentation was written. This commit fixes those links.	2019-10-17 09:59:40 -06:00
Stuart Tettemer	356eef00c8	Scripting: get context names REST API (#48026 ) (#48168 ) Adds `GET /_script_context`, returning a `contexts` object with each available context as a key whose value is an empty object. eg. ``` { "contexts": { "aggregation_selector": {}, "aggs": {}, "aggs_combine": {}, ... } } ``` refs: #47411	2019-10-17 09:08:55 -06:00
Benjamin Trent	0dddbb5b42	[ML] Parse and index inference model (#48016 ) (#48152 ) This adds parsing an inference model as a possible result of the analytics process. When we do parse such a model we persist a `TrainedModelConfig` into the inference index that contains additional metadata derived from the running job.	2019-10-16 15:46:20 -04:00
Martijn van Groningen	aff0c9babc	This commit merges (#48040 ) the enrich-7.x feature branch, which is a backport merge and adds a new ingest processor, named enrich processor, that allows documents being ingested to be enriched with data from other indices. Besides a new enrich processor, this PR adds several APIs to manage an enrich policy. An enrich policy is in charge of making the data from other indices available to the enrich processor in an efficient manner. Related to #32789	2019-10-15 17:31:45 +02:00
David Roberts	984323783e	[ML][7.x] Add lazy assignment job config option (#47993 ) This change adds: - A new option, allow_lazy_open, to anomaly detection jobs - A new option, allow_lazy_start, to data frame analytics jobs Both work in the same way: they allow a job to be opened/started even if no ML node exists that can accommodate the job immediately. In this situation the job waits in the opening/starting state until ML node capacity is available. (The starting state for data frame analytics jobs is new in this change.) Additionally, the ML nightly maintenance tasks now creates audit warnings for ML jobs that are unassigned. This means that jobs that cannot be assigned to an ML node for a very long time will show a yellow warning triangle in the UI. A final change is that it is now possible to close a job that is not assigned to a node without using force. This is because previously jobs that were open but not assigned to a node were an aberration, whereas after this change they'll be relatively common.	2019-10-15 06:55:11 +01:00
Martijn van Groningen	cc4b6c43b3	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-15 07:23:47 +02:00
Gordon Brown	300ddfa3c1	SLM Start/Stop HLRC and docs (#47966 ) This commit adds HLRC support and documentation for the SLM Start and Stop APIs, as well as updating existing documentation where appropriate. This commit also ensures that the SLM APIs are properly included in the HLRC documentation.	2019-10-14 16:56:31 -06:00
Martijn van Groningen	7cc73f6193	Add HLRC support for enrich execute policy API (#47991 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-14 19:55:48 +02:00
Michael Basnight	f6f5efe141	Add cloudId builder to the HLRC (#47868 ) Elastic cloud has a concept of a cloud Id. This Id is a base64 encoded url, split up into a few parts. This commit allows the user to pass in a cloud id now, which is translated to a HttpHost that is defined by the encoded parts therein.	2019-10-14 12:47:06 -05:00
Tanguy Leroux	e4ea8b46b6	Add Pause/Resume Auto-Follower APIs to High Level REST Client (#48004 ) This commit adds support for Pause/Resume Auto-Follower APIs to the HLRC, with the documentation. Relates #47510	2019-10-14 18:25:53 +02:00
David Roberts	1ca25bed38	[ML][7.x] Add option to stop datafeed that finds no data (#47995 ) Adds a new datafeed config option, max_empty_searches, that tells a datafeed that has never found any data to stop itself and close its associated job after a certain number of real-time searches have returned no data. Backport of #47922	2019-10-14 17:19:13 +01:00
Martijn van Groningen	d4901a71d7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-14 10:27:17 +02:00
Tanguy Leroux	742fa818b8	Add Pause/Resume Auto Follower APIs (#47510 ) (#47904 ) This commit adds two APIs that allow to pause and resume CCR auto-follower patterns: // pause auto-follower POST /_ccr/auto_follow/my_pattern/pause // resume auto-follower POST /_ccr/auto_follow/my_pattern/resume The ability to pause and resume auto-follow patterns can be useful in some situations, including the rolling upgrades of cluster using a bi-directional cross-cluster replication scheme (see #46665). This commit adds a new active flag to the AutoFollowPattern and adapts the AutoCoordinator and AutoFollower classes so that it stops to fetch remote's cluster state when all auto-follow patterns associate to the remote cluster are paused. When an auto-follower is paused, remote indices that match the pattern are just ignored: they are not added to the pattern's followed indices uids list that is maintained in the local cluster state. This way, when the auto-follow pattern is resumed the indices created in the remote cluster in the meantime will be picked up again and added as new following indices. Indices created and then deleted in the remote cluster will be ignored as they won't be seen at all by the auto-follower pattern at resume time. Backport of #47510 for 7.x	2019-10-13 09:22:51 +02:00
Yogesh Gaikwad	ac209c142c	Remove uniqueness constraint for API key name and make it optional (#47549 ) (#47959 ) Since we cannot guarantee the uniqueness of the API key `name` this commit removes the constraint and makes this field optional. Closes #46646	2019-10-12 22:22:16 +11:00
Przemysław Witek	d210bfa888	[7.x] Add MlClientDocumentationIT tests for classification. (#47569 ) (#47896 )	2019-10-11 10:19:55 +02:00
Martijn van Groningen	102016d571	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-10 14:44:05 +02:00
Hendrik Muhs	0e7869128a	[7.5][Transform] introduce new roles and deprecate old ones (#47780 ) (#47819 ) deprecate data_frame_transforms_{user,admin} roles and introduce transform_{user,admin} roles as replacement	2019-10-10 10:31:24 +02:00
Martijn van Groningen	aace42d38d	Add HLRC support for enrich stats API (#47306 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-10 09:08:29 +02:00
Tim Brooks	02622c1ef9	Fix issues with serializing BulkByScrollResponse (#45357 ) Currently there are two issues with serializing BulkByScrollResponse. First, when deserializing from XContent, indexing exceptions and search exceptions are switched. Additionally, search exceptions do not retain the appropriate RestStatus code, so you must evaluate the status code from the exception. However, the exception class is not always correctly retained when serialized. This commit adds tests in the failure case. Additionally, fixes the swapping of failure types and adds the rest status code to the search failure.	2019-10-09 10:12:14 -06:00
Martijn van Groningen	da1e2ea461	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-09 09:06:13 +02:00
Benjamin Trent	d33dbf82d4	[7.x] [ML][Inference] adjusting definition object schema and validation (#47447 ) (#47673 ) * [ML][Inference] adjusting definition object schema and validation (#47447) * [ML][Inference] adjusting definition object schema and validation * finalizing schema and fixing inference npe * addressing PR comments * fixing for backport	2019-10-08 07:11:05 -04:00
Hendrik Muhs	5e0e54f455	[Transform] move root endpoint to _transform with BWC layer (#47127 ) (#47682 ) move the main endpoint to /_transform/ from /_data_frame/transforms/ with providing backwards compatibility and deprecation warnings	2019-10-08 08:59:01 +02:00
Dimitris Athanasiou	7667ea5f6f	[7.x][ML] Additional outlier detection parameters (#47600 ) (#47669 ) Adds the following parameters to `outlier_detection`: - `compute_feature_influence` (boolean): whether to compute or not feature influence scores - `outlier_fraction` (double): the proportion of the data set assumed to be outlying prior to running outlier detection - `standardization_enabled` (boolean): whether to apply standardization to the feature values Backport of #47600	2019-10-07 18:21:33 +03:00
Yogesh Gaikwad	b6d1d2e6ec	Add 'create_doc' index privilege (#45806 ) (#47645 ) Use case: User with `create_doc` index privilege will be allowed to only index new documents either via Index API or Bulk API. There are two cases that we need to think: - User indexing a new document without specifying an Id. For this ES auto generates an Id and now ES version 7.5.0 onwards defaults to `op_type` `create` we just need to authorize on the `op_type`. - User indexing a new document with an Id. This is problematic as we do not know whether a document with Id exists or not. If the `op_type` is `create` then we can assume the user is trying to add a document, if it exists it is going to throw an error from the index engine. Given these both cases, we can safely authorize based on the `op_type` value. If the value is `create` then the user with `create_doc` privilege is authorized to index new documents. In the `AuthorizationService` when authorizing a bulk request, we check the implied action. This code changes that to append the `:op_type/index` or `:op_type/create` to indicate the implied index action.	2019-10-07 23:58:44 +11:00
Yogesh Gaikwad	7c862fe71f	Add support to retrieve all API keys if user has privilege (#47274 ) (#47641 ) This commit adds support to retrieve all API keys if the authenticated user is authorized to do so. This removes the restriction of specifying one of the parameters (like id, name, username and/or realm name) when the `owner` is set to `false`. Closes #46887	2019-10-07 23:58:21 +11:00
Martijn van Groningen	f2f2304c75	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-07 10:07:56 +02:00
Przemysław Witek	ee952da2e2	[7.x] Implement evaluation API for multiclass classification problem (#47126 ) (#47343 )	2019-10-04 17:54:51 +02:00
Przemysław Witek	ec9b77deaa	[7.x] Implement new analysis type: classification (#46537 ) (#47559 )	2019-10-04 13:47:19 +02:00
Alpar Torok	0a14bb174f	Remove eclipse conditionals (#44075 ) * Remove eclipse conditionals We used to have some meta projects with a `-test` prefix because historically eclipse could not distinguish between test and main source-sets and could only use a single classpath. This is no longer the case for the past few Eclipse versions. This PR adds the necessary configuration to correctly categorize source folders and libraries. With this change eclipse can import projects, and the visibility rules are correct, e.g. auto complete doesn't offer classes from test code or `testCompile` dependencies when editing classes in `main`. Unfortunately the cyclic dependency detection in Eclipse doesn't seem to take the difference between test and non test source sets into account, but since we are checking this in Gradle anyhow, it's safe to set to `warning` in the settings. Unfortunately there is no setting to ignore it. This might cause problems when building since Eclipse will probably not know the right order to build things in so more work might be necessary.	2019-10-03 11:55:00 +03:00
Lee Hinman	2e3eb4b24e	Add API to execute SLM retention on-demand (#47405 ) (#47463 ) * Add API to execute SLM retention on-demand (#47405) This is a backport of #47405 This commit adds the `/_slm/_execute_retention` API endpoint. This endpoint kicks off SLM retention and then returns immediately. This in particular allows us to run retention without scheduling it (for entirely manual invocation) or perform a one-off cleanup. This commit also includes HLRC for the new API, and fixes an issue in SLMSnapshotBlockingIntegTests where retention invoked prior to the test completing could resurrect an index the internal test cluster cleanup had already deleted. Resolves #46508 Relates to #43663	2019-10-02 12:29:04 -06:00
Benjamin Trent	2228a7dd8d	[ML][Inference] adding ensemble model objects (#47241 ) (#47438 ) * [ML][Inference] adding ensemble model objects * addressing PR comments * Update TreeTests.java * addressing PR comments * fixing test	2019-10-02 09:49:46 -04:00
Benjamin Trent	f5fe5e7cd6	[7.x] [ML][Inference] Adding preprocessors to definition object (#47320 ) (#47370 ) * [ML][Inference] Adding preprocessors to definition object (#47320) * [ML][Inference] Adding preprocessors to definition object * Update TrainedModelConfig.java * adjusting for backport	2019-10-01 13:31:25 -04:00
Benjamin Trent	4335e07716	[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267 ) (#47310 ) * [ML][Inference] adding .ml-inference* index and storage (#47267) * [ML][Inference] adding .ml-inference* index and storage * Addressing PR comments * Allowing null definition, adding validation tests for model config * fixing line length * adjusting for backport	2019-10-01 08:20:33 -04:00
Martijn van Groningen	fe937ea4b8	Add config namespace in get policy api response (#47162 ) Currently the policy config is placed directly in the json object of the toplevel `policies` array field. For example: ``` { "policies": [ { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } ] } ``` This change adds a `config` field in each policy json object: ``` { "policies": [ { "config": { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } } ] } ``` This allows us in the future to add other information about policies in the get policy api response. The UI will consume this API to build an overview of all policies. The UI may in the future include additional information about a policy and the plan is to include that in the get policy api, so that this information can be gathered in a single api call. An example of the information that is likely to be added is: * Last policy execution time * The status of a policy (executing, executed, unexecuted) * Information about the last failure if exists	2019-09-30 14:37:23 +02:00
Yannick Welsch	9dc90e41fc	Remove "force" version type (#47228 ) It's been deprecated long ago and can be removed. Relates to #20377 Closes #19769	2019-09-30 11:58:34 +02:00
Martijn van Groningen	66f72bcdbc	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-30 08:12:28 +02:00
Rory Hunter	53a4d2176f	Convert most awaitBusy calls to assertBusy (#45794 ) (#47112 ) Backport of #45794 to 7.x. Convert most `awaitBusy` calls to `assertBusy`, and use asserts where possible. Follows on from #28548 by @liketic. There were a small number of places where it didn't make sense to me to call `assertBusy`, so I kept the existing calls but renamed the method to `waitUntil`. This was partly to better reflect its usage, and partly so that anyone trying to add a new call to awaitBusy wouldn't be able to find it. I also didn't change the usage in `TransportStopRollupAction` as the comments state that the local awaitBusy method is a temporary copy-and-paste. Other changes: * Rework `waitForDocs` to scale its timeout. Instead of calling `assertBusy` in a loop, work out a reasonable overall timeout and await just once. * Some tests failed after switching to `assertBusy` and had to be fixed. * Correct the expect templates in AbstractUpgradeTestCase. The ES Security team confirmed that they don't use templates any more, so remove this from the expected templates. Also rewrite how the setup code checks for templates, in order to give more information. * Remove an expected ML template from XPackRestTestConstants The ML team advised that the ML tests shouldn't be waiting for any `.ml-notifications` templates, since such checks should happen in the production code instead. Also rework the template checking code in `XPackRestTestHelper` to give more helpful failure messages. * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4	2019-09-29 12:21:46 +01:00
Martijn van Groningen	7ffe2e7e63	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-27 14:42:11 +02:00
Tanguy Leroux	95e2ca741e	Remove unused private methods and fields (#47154 ) This commit removes a bunch of unused private fields and unused private methods from the code base. Backport of (#47115)	2019-09-26 12:49:21 +02:00
Yogesh Gaikwad	9a64b7a888	[Backport] Validate `query` field when creating roles (#46275 ) (#47094 ) In the current implementation, the validation of the role query occurs at runtime when the query is being executed. This commit adds validation for the role query when creating a role but not for the template query as we do not have the runtime information required for evaluating the template query (eg. authenticated user's information). This is similar to the scripts that we store but do not evaluate or parse if they are valid queries or not. For validation, the query is evaluated (if not a template), parsed to build the QueryBuilder and verify if the query type is allowed. Closes #34252	2019-09-26 17:57:36 +10:00
Benjamin Trent	fcddaa90de	[7.x] [ML][Inference] adding tree model (#47044 ) (#47141 ) * [ML][Inference] adding tree model (#47044) * [ML][Inference] adding tree model * renaming features for updated schema * fixing 7.x compilation	2019-09-25 19:11:15 -04:00
Gordon Brown	7ac647c365	Add support for POST requests to SLM Execute API (#47061 ) This commit adds support for POST requests to the SLM `_execute` API, because POST is a more appropriate HTTP verb for this action as it is not idempotent. The docs are also changed to favor POST over PUT, although PUT is not removed or officially deprecated.	2019-09-25 16:15:10 -06:00
Gordon Brown	a46eef9634	Change SLM stats format (#46991 ) Using arrays of objects with embedded IDs is preferred for new APIs over using entity IDs as JSON keys. This commit changes the SLM stats API to use the preferred format.	2019-09-25 11:32:08 -06:00
Benjamin Trent	05fb7be571	[7.x] [ML][Inference] Feature pre-processing objects and functions (#46777 ) (#47040 ) * [ML][Inference] Feature pre-processing objects and functions (#46777) To support inference on pre-trained machine learning models, some basic feature encoding will be necessary. I am using a named object serialization approach so new encodings/pre-processing steps could be added in the future. This PR lays down the ground work for 3 basic encodings: * HotOne * Target Mean * Frequency More feature encodings or pre-processings could be added in the future: * Handling missing columns * Standardization * Label encoding * etc.... * fixing compilation for namedxcontent tests	2019-09-25 08:16:24 -04:00
Hendrik Muhs	e974f178b5	[Transform] rename data frame transform to transform for hlrc client (#46933 ) rename data frame transform to transform for hlrc	2019-09-25 08:31:43 +02:00
maidoo	618efcfcf9	Add submitDeleteByQueryTask method to RestHighLevelClient (#46833 ) The HLRC has a method for reindex, that allows to trigger an async reindex by running RestHighLevelClient.submitReindexTask and RestHighLevelClient.reindex. The delete by query however only has an RestHighLevelClient.deleteByQuery method (and its async counterpart), but no RestHighLevelClient.submitDeleteByQueryTask. So add RestHighLevelClient.submitDeleteByQueryTask Closes #46395	2019-09-24 10:10:55 +02:00
Martijn van Groningen	084bda6d32	fixed hlrc method signatures	2019-09-23 11:14:07 +02:00
Martijn van Groningen	0cfddca61d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-23 09:46:05 +02:00
Martijn van Groningen	363beefbcd	Change HLRC count request to accept a QueryBuilder (#46904 ) Currently the CountRequest accepts a search source builder, while the RestCountAction only accepts a top level query object. This can lead to confusion if another element (e.g. aggregations) is specified, because that will be ignored on the server side in RestCountAction. By deprecating the current setter & constructor that accept a SearchSourceBuilder and adding replacement that accepts a QueryBuilder it is clear what the count api can handle from HLRC side. Follow up from #46829	2019-09-23 08:43:07 +02:00
Lisa Cawley	875d864be6	[DOCS] Update data frame transform URLs (#46940 ) (#46946 )	2019-09-20 15:57:43 -07:00
Andrei Dan	402fb5e882	Fix flaky test addSnapshotLifecyclePolicy (#46881 ) (#46912 ) * addSnapshotLifecyclePolicy drop version assertion This drops the assertion on the policy version (which was pinned to 1L) as we want to execute both put policy apis (sync and async) for documentation purposes. This will sometimes (depending on the async call) yield a version of 2L. Waiting for the async call to always complete could be an option but the test is already rather slow and it's a bit of an overkill as we're already verifying the policy was created. (cherry picked from commit af4864c39129bcdbf98d00223f445346a62075e4) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-20 15:07:54 +01:00
Martijn van Groningen	d82712417f	[HLRC] Send min_score as query string parameter to the count API (#46829 ) Prior to this commit min_score was sent as request body parameter (via SearchSourceBuilder), which is not possible in the count api. Similar to #46474	2019-09-20 10:23:45 +02:00
James Rodewig	b73a9604c1	[DOCS] Separate and reformat synced flush API docs (#46634 ) (#46839 )	2019-09-19 08:22:37 -04:00
Turaç Kangal	d1f47cf00e	[HLRC] Send terminate_after as query string parameter to the count API (#46474 ) Prior to this commit terminate_after was sent as request body parameter (via SearchSourceBuilder), which is not possible in the count api. Closes #46446	2019-09-18 15:54:43 +02:00
Przemysław Witek	e49be611ad	[7.x] Add audit messages for Data Frame Analytics (#46521 ) (#46738 )	2019-09-16 21:21:38 +02:00
Hendrik Muhs	c8f52ec4ff	[Transform] Rename data frame plugin to transform: classes in xpack.core (#46644 ) (#46734 ) rename classes in xpack.core of transform plugin from "data frame transform" to "transform"	2019-09-16 13:39:22 +02:00
Henning Andersen	c99adfb99a	Relax ReindexIT test timeouts (#46312 ) A couple of tests used 2 second timeout, which was prone to failure, increased to 10 seconds. Closes #46301	2019-09-16 09:30:10 +02:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
Jilles van Gurp	60f40d7638	Expose the ability to cancel async requests in REST high-level client (#45688 ) This commit makes all the async methods in the high level client return the `Cancellable` object that the low level client now exposes. Relates to #45379 Closes #44802	2019-09-12 09:45:29 +02:00
Luca Cavanna	cfb186afaf	Add support for cancelling async requests in low-level REST client (#45379 ) The low-level REST client exposes a `performRequestAsync` method that allows to send async requests, but today it does not expose the ability to cancel such requests. That is something that the underlying apache async http client supports, and it makes sense for us to expose. This commit adds a return value to the `performRequestAsync` method, which is backwards compatible. A `Cancellable` object gets returned, which exposes a `cancel` public method. When calling `cancel`, the on-going request associated with the returned `Cancellable` instance will be cancelled by calling its `abort` method. This works throughout multiple retries, though some special care was needed for the case where `cancel` is called between different attempts (when one attempt has failed and the consecutive one has not been sent yet). Note that cancelling a request on the client side does not automatically translate to cancelling the server side execution of it. That needs to be specifically implemented, which is on the work for the search API (see #43332). Relates to #44802	2019-09-12 09:45:28 +02:00
Hendrik Muhs	efea581dcc	[7.x][Transform]Rename data frame plugin to transform: plugin and package names (#46583 ) rename data frame transform plugin to transform: - rename plugin data-frame to transform - change all package names from o.e..dataframe. to o.e..transform. - necessary changes to fix loading/testing	2019-09-11 14:50:08 +02:00
Martijn van Groningen	60ad099178	Add HLRC support for enrich get policy API. (#45970 ) Changed the signature of AbstractResponseTestCase#createServerTestInstance(...) to include the randomly selected xcontent type. This is needed for creating a server response instance with a query which is represented as BytesReference. Maybe this should go into a different change? This PR also includes HLRC docs for the get policy api. Relates to #32789	2019-09-11 14:42:50 +02:00
Lee Hinman	cdc3a260af	Add retention to Snapshot Lifecycle Management (backport of #4… (#46506 ) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. 
Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. 
These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deleting snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. 
This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. 
Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From `2374316f0d`.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. 
Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy	2019-09-10 09:08:09 -06:00
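The three retention options introduced in the commit above (`expire_after`, `min_count`, `max_count`) can be sketched as a single selection function. This is an illustrative Python sketch, not the `SnapshotRetentionTask` implementation: the function name, the `(name, start_time)` tuple shape, and the exact way the three predicates combine are assumptions based only on the comments in the example policy (delete after 14 days, keep at most thirty, always keep the four most recent).

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, now, expire_after=None,
                        min_count=None, max_count=None):
    """Return names of snapshots eligible for deletion, oldest first.

    snapshots is a list of (name, start_time) tuples. Every option is
    optional, mirroring a policy that sets one, two, or all three.
    """
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    doomed = []
    for rank, (name, taken) in enumerate(newest_first):
        # Assumption: the min_count most recent snapshots are always kept.
        if min_count is not None and rank < min_count:
            continue
        too_many = max_count is not None and rank >= max_count
        expired = expire_after is not None and now - taken > expire_after
        if too_many or expired:
            doomed.append(name)
    return list(reversed(doomed))


now = datetime(2019, 9, 10)
daily = [("snap-%d" % age, now - timedelta(days=age)) for age in range(40)]
eligible = snapshots_to_delete(
    daily, now,
    expire_after=timedelta(days=14), min_count=4, max_count=30,
)
```

With one snapshot per day, the 14-day expiry is the binding rule here; `max_count` would only matter if more than thirty unexpired snapshots existed, and `min_count` only if fewer than four unexpired ones remained.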
Henning Andersen	7125c101e6	HLRC multisearchTemplate forgot params (#46492 ) Since 7.3, the request converter for multiSearchTemplate would silently not set the two request parameters `typed_keys` and `max_concurrent_searches`. Closes #46488	2019-09-10 08:47:08 +02:00
Martijn van Groningen	ded98e50b7	Change exact match processor to match processor. (#46041 ) Besides a rename, this change allows the processor to attach multiple enrich docs to the document being ingested. Also in order to control the maximum number of enrich docs to be included in the document being ingested, the `max_matches` setting is added to the enrich processor. Relates #32789	2019-09-04 18:05:12 +02:00
Martijn van Groningen	555b630160	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-02 09:16:55 +02:00
Martijn van Groningen	f50c7cf88b	Add XContentType as parameter to HLRC ART#createServerTestInstance (#46036 ) Add XContentType as parameter to the AbstractResponseTestCase#createServerTestInstance method. In the case a server side response class serializes xcontent as bytes then the test needs to know what xcontent type was randomly selected. This change is needed in #45970	2019-08-28 16:16:47 +02:00
Dimitris Athanasiou	bb8fcb3cac	[7.x][ML][HLRC] Add data frame analytics regression analysis (#46024 ) (#46053 )	2019-08-28 12:02:14 +03:00
Martijn van Groningen	1157224a6b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-28 10:14:07 +02:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys, with privileges like `manage_api_key`, `manage_security` etc., is too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously, API keys created would also need these privileges to get their own information. This commit adds support for the `manage_own_api_key` cluster privilege, which only allows API key cluster actions on API keys owned by the currently authenticated user. It also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes and documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Martijn van Groningen	b170b76670	Add HLRC support for delete policy api (#45833 ) This PR also adds HLRC docs. Relates to #32789	2019-08-26 10:22:07 +02:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reported a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish between a task that never ran and one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
Martijn van Groningen	cb42e19a32	Change how type is stored in an enrich policy. (#45789 ) A policy type controls how the enrich index is created and the query executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added and different policies may have different configuration options. For this reason type should be a json object instead of a string field: ``` { "exact_match": { ... } } ``` instead of: ``` { "type": "exact_match", ... } ``` This will make streaming parsing of enrich policies easier as in the new format, the parsing code can know ahead of time what configuration fields to expect. In the latter format that is not possible if the type field does not appear as the first field. Relates to #32789	2019-08-23 13:43:38 +02:00

1369 Commits