OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analyses: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
James Rodewig	359b103f87	[DOCS] Rewrite term-level queries overview (#43337 )	2019-06-21 11:55:02 -04:00
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Henning Andersen	dea935ac31	Reindex max_docs parameter name (#42942 ) Previously, a reindex request had two different size specifications in the body: * Outer level, determining the maximum documents to process * Inside the source element, determining the scroll/batch size. The outer level size has now been renamed to max_docs to avoid confusion and clarify its semantics, with backwards compatibility and deprecation warnings for using size. Similarly, the size parameter has been renamed to max_docs for update/delete-by-query to keep the 3 interfaces consistent. Finally, all 3 endpoints now support max_docs in both body and URL. Relates #24344	2019-06-07 12:16:36 +02:00
Alan Woodward	2129d06643	Create client-only AnalyzeRequest/AnalyzeResponse classes (#42197 ) This commit clones the existing AnalyzeRequest/AnalyzeResponse classes to the high-level rest client, and adjusts request converters to use these new classes. This is a prerequisite to removing the Streamable interface from the internal server version of these classes.	2019-06-03 09:46:36 +01:00
Luca Cavanna	e747326b04	Adapt low-level REST client to java 8 (#41537 ) As a follow-up to #38540 we can use lambda functions and method references where convenient in the low-level REST client. Also, we need to update the docs to state that the minimum java version required is 1.8.	2019-05-22 18:47:54 +02:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Benjamin Trent	febee07dcc	[ML] adding pivot.max_search_page_size option for setting paging size (#41920 ) (#42079 ) * [ML] adding pivot.size option for setting paging size * Changing field name to address PR comments * fixing ctor usage * adjust hlrc for field name change	2019-05-10 13:22:31 -05:00
Lisa Cawley	cf8a2be27b	[DOCS] Fix callouts for dataframe APIs (#41904 )	2019-05-07 10:07:04 -07:00
Jason Tedor	d7fd51a84e	Provide names for all artifact repositories (#41857 ) This commit adds a name for each Maven and Ivy repository used in the build.	2019-05-07 06:35:28 -04:00
Jason Tedor	8df13b474d	Update some more S3 artifact locations to use https This commit updates some additional S3 artifact locations to use https instead of http. Relates `241c4ef97a`	2019-05-04 08:30:12 -04:00
Benjamin Trent	a0990ca239	[ML] cleanup + adding description field to transforms (#41554 ) (#41605 ) * [ML] cleanup + adding description field to transforms * making description length have a max of 1k	2019-04-26 16:50:59 -05:00
Benjamin Trent	08843ba62b	[ML] Adds progress reporting for transforms (#41278 ) (#41529 ) * [ML] Adds progress reporting for transforms * fixing after master merge * Addressing PR comments * removing unused imports * Adjusting afterKey handling and percentage to be 100* * Making sure it is a linked hashmap for serialization * removing unused import * addressing PR comments * removing unused import * simplifying code, only storing total docs and decrementing * adjusting for rewrite * removing initial progress gathering from executor	2019-04-25 11:23:12 -05:00
David Kyle	2b539f8347	[ML DataFrame] Data Frame stop all (#41156 ) Wild card support for the data frame stop API	2019-04-15 15:04:28 +01:00
Michael Basnight	fb5a0652a8	HLRC: Convert xpack methods to client side objects (#40705 ) This commit fixes a problem with BWC that was brought up in #40511. A newer version of the code was emitting a new value for an enum to an older version, and the older version could not handle that. It caused the response to error. The MainResponse is now relaxed, and will accept whatever values the server exposes, and holds most of them as Strings instead of complex objects. Fixes #40511	2019-04-04 11:06:44 -05:00
Nik Everett	9e8499d20b	Docs: Pin two IDs in the rest client (#40785 ) We generate two pages with "funny" names: * _changing_the_client_8217_s_initialization_code.html * _changing_the_application_8217_s_code.html The leading `_` comes from us not specifying the name of the page. The `8217` comes about because of the single quote character. This is a funny name, but it is the name that we have so we shouldn't change it without putting in a redirect. We're looking at switching these docs from being built with the no-longer-maintained AsciiDoc project to being built with the actively-maintained Asciidoctor project. Asciidoctor doesn't include the `8217`s in the generated ids. That is better, but we don't really want to change the pages. Ultimately we'd prefer none of our pages start with `_`, but that is a problem for a different time. Anyway, this pins the ids to their "funny" id so it won't change when we switch to Asciidoctor. We'll remove it later, when we have more fine control of our redirects.	2019-04-04 12:03:36 -04:00
David Kyle	c990b30019	[ML] Data Frame HLRC Get API (#40509 )	2019-03-27 12:40:39 +00:00
Benjamin Trent	12943c5d2c	[ML] Add data frame task state object and field (#40169 ) (#40490 ) * [ML] Add data frame task state object and field * A new state item is added so that the overall task state can be accounted for * A new FAILED state and reason have been added as well so that failures can be shown to the user for optional correction * Addressing PR comments * adjusting after master merge * addressing pr comment * Adjusting auditor usage with failure state * Refactor, renamed state items to task_state and indexer_state * Adding todo and removing redundant auditor call * Address HLRC changes and PR comment * adjusting hlrc IT test	2019-03-27 06:53:58 -05:00
David Kyle	1354696db9	[ML] Data Frame HLRC Get Stats API (#40443 )	2019-03-26 11:17:13 +00:00
Benjamin Trent	7b4f964708	[ML] make source and dest objects in the transform config (#40337 ) (#40396 ) * [ML] make source and dest objects in the transform config * addressing PR comments * Fixing compilation post merge * adding comment for Arrays.hashCode * addressing changes for moving dest to object * fixing data_frame yml tests * fixing API test	2019-03-25 07:16:41 -05:00
David Kyle	a4cb92a300	[ML] Data Frame HLRC Preview API (#40258 )	2019-03-21 09:38:27 +00:00
David Kyle	387648065d	[ML] Data Frame HLRC start & stop APIs (#40197 )	2019-03-19 13:30:01 +00:00
Gordon Brown	c8a4a7fc9d	Remove Migration Upgrade and Assistance APIs (#40075 ) The Migration Assistance API has been functionally replaced by the Deprecation Info API, and the Migration Upgrade API is not used for the transition from ES 6.x to 7.x, and does not need to be kept around to repair indices that were not properly upgraded before upgrading the cluster, as was the case in 6.	2019-03-18 13:46:56 -06:00
David Kyle	c02f49e9d3	[ML-Dataframe] Add Data Frame client to the Java HLRC (#40040 ) Adds DataFrameClient to the Java HLRC and implements PUT and DELETE data frame transform.	2019-03-14 14:57:12 +00:00
Jason Tedor	b9586f62cc	Fix CCR HLRC docs This commit fixes the CCR HLRC docs by including the forget follower API docs in the HLRC docs.	2019-03-07 11:44:35 -05:00
Jason Tedor	0250d554b6	Introduce forget follower API (#39718 ) This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.	2019-03-07 11:08:45 -05:00
Tal Levy	f1c8aa816f	fix dangling tag in TasksClientDocumentationIT (#39157 ) fix dangling tag in TasksClientDocumentationIT and fix tag mentions from list-tasks to cancel-tasks	2019-02-20 08:48:58 -08:00
Martijn van Groningen	0594a467f2	Add support for ccr follow info api to HLRC. (#39115 ) This API was introduced after #33824 was closed.	2019-02-20 16:10:54 +01:00
David Pilato	78541dfb2a	Update Lucene snapshot repo for 7.0.0-beta1 (#38946 ) This commit updates the documentation for the Lucene snapshot repo.	2019-02-15 08:57:44 -06:00
Luca Cavanna	a7046e001c	Remove support for maxRetryTimeout from low-level REST client (#38085 ) We have had various reports of problems caused by the maxRetryTimeout setting in the low-level REST client. This setting was initially added in an attempt to keep requests from being retried if the request had already taken longer than the provided timeout. The implementation was problematic though, as the timeout would also expire in the first request attempt (see #31834), would leave the request executing after expiration causing memory leaks (see #33342), and would not take into account the http client internal queuing (see #25951). Given all these issues, it seems that this custom timeout mechanism provides little benefit while causing a lot of harm. We should rather rely on connect and socket timeout exposed by the underlying http client and accept that a request can overall take longer than the configured timeout, which is the case even with a single retry anyways. This commit removes the `maxRetryTimeout` setting and all of its usages.	2019-02-06 08:43:47 +01:00
Boaz Leskes	033ba725af	Remove support for internal versioning for concurrency control (#38254 ) Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a. optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometimes do the wrong thing. Here's the relevant excerpt from the resiliency status page: > When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document. We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach. Relates to #1078	2019-02-05 20:53:35 +01:00
Yogesh Gaikwad	fe36861ada	Add support for API keys to access Elasticsearch (#38291 ) X-Pack security supports built-in authentication service `token-service` that allows access tokens to be used to access Elasticsearch without using Basic authentication. The tokens are generated by `token-service` based on OAuth2 spec. The access token is a short-lived token (defaults to 20m) and refresh token with a lifetime of 24 hours, making them unsuitable for long-lived or recurring tasks where the system might go offline thereby failing refresh of tokens. This commit introduces a built-in authentication service `api-key-service` that adds support for long-lived tokens aka API keys to access Elasticsearch. The `api-key-service` is consulted after `token-service` in the authentication chain. By default, if TLS is enabled then `api-key-service` is also enabled. The service can be disabled using the configuration setting. The API keys:- - by default do not have an expiration but expiration can be configured where the API keys need to be expired after a certain amount of time. - when generated will keep authentication information of the user that generated them. - can be defined with a role describing the privileges for accessing Elasticsearch and will be limited by the role of the user that generated them - can be invalidated via invalidation API - information can be retrieved via a get API - that have been expired or invalidated will be retained for 1 week before being deleted. The expired API keys remover task handles this. Following are the API key management APIs:- 1. Create API Key - `PUT/POST /_security/api_key` 2. Get API key(s) - `GET /_security/api_key` 3. Invalidate API Key(s) - `DELETE /_security/api_key` The API keys can be used to access Elasticsearch using the `Authorization` header, where the auth scheme is `ApiKey` and the credentials are the base64 encoding of the API key Id and API key separated by a colon. Example:- ``` curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health ``` Closes #34383	2019-02-05 14:21:57 +11:00
Mayya Sharipova	641704464d	Deprecate types in rollover index API (#38039 ) Relates to #35190	2019-02-04 16:07:45 -05:00
Benjamin Trent	a70f54fc77	Adding ml_settings entry to HLRC and Docs for deprecation_info (#38118 )	2019-02-01 12:45:28 -06:00
David Pilato	3dd4c96e8e	Update Lucene repo for 7.0.0-alpha2 (#37985 ) Let's help the users by giving them the right version to use for 7.0.0-alpha2	2019-01-31 09:17:02 +01:00
Benjamin Trent	8280a20664	ML: Add upgrade mode docs, hlrc, and fix bug (#37942 ) * ML: Add upgrade mode docs, hlrc, and fix bug * [DOCS] Fixes build error and edits text * adjusting docs * Update docs/reference/ml/apis/set-upgrade-mode.asciidoc Co-Authored-By: benwtrent <ben.w.trent@gmail.com> * Update set-upgrade-mode.asciidoc * Update set-upgrade-mode.asciidoc	2019-01-30 06:51:11 -06:00
markharwood	b889221f75	Types removal - deprecate include_type_name with index templates (#37484 ) Added deprecation warnings for use of include_type_name in put/get index templates. HLRC changes: GetIndexTemplateRequest has a new client-side class which is a copy of server's GetIndexTemplateResponse but modified to be typeless. PutIndexTemplateRequest has a new client-side counterpart which doesn't use types in the mappings Relates to #35190	2019-01-29 20:52:41 +00:00
Tim Brooks	00ace369af	Use `CcrRepository` to init follower index (#35719 ) This commit modifies the put follow index action to use a CcrRepository when creating a follower index. It routes the logic through the snapshot/restore process. A wait_for_active_shards parameter can be used to configure how long to wait before returning the response.	2019-01-29 11:47:29 -07:00
Julie Tibshirani	b1735aa93b	Support both typed and typeless 'get mapping' requests in the HLRC. (#37796 ) From previous PRs, we've already added support for include_type_name to the get mapping API. We had also taken an approach to the HLRC where the server-side `GetMappingResponse#fromXContent` could only handle typeless input. This PR updates the HLRC for 'get mapping' to be in line with our new approach: * Add a typeless 'get mappings' method to the Java HLRC, that accepts new client-side request and response objects. This new response only handles typeless mapping definitions. * Switch the old version of `GetMappingResponse` back to expecting typed mappings, and deprecate the corresponding method on the HLRC. Finally, the PR also does some small, related clean-up around 'get field mappings'.	2019-01-27 16:02:22 -08:00
Julie Tibshirani	d473bcda8d	Remove outdated callouts from the 'create index' HLRC docs	2019-01-24 14:08:58 -08:00
Julie Tibshirani	e1d8df4ffa	Deprecate types in create index requests. (#37134 ) From #29453 and #37285, the include_type_name parameter was already present and defaulted to false. This PR makes the following updates: * Add deprecation warnings to RestCreateIndexAction, plus tests in RestCreateIndexActionTests. * Add a typeless 'create index' method to the Java HLRC, and deprecate the old typed version. To do this cleanly, I created new CreateIndexRequest and CreateIndexResponse objects that differ from the existing server ones.	2019-01-24 13:17:47 -08:00
Michael Basnight	944972a249	Deprecate HLRC EmptyResponse used by security (#37540 ) The EmptyResponse is essentially the same as returning a boolean, which is done in other places. This commit deprecates all the existing EmptyResponse methods and creates new boolean methods that have method params reordered so they can exist with the deprecated methods. A followup PR in master will remove the existing deprecated methods, fix the parameter ordering and deprecate the incorrectly ordered parameter methods. Relates #36938	2019-01-23 22:13:16 -06:00
Mayya Sharipova	c8565fe692	Deprecate types in get field mapping API (#37667 ) - Add deprecation warning to RestGetFieldMappingAction - Add two new java HLRC classes GetFieldMappingsRequest and GetFieldMappingsResponse. These classes use new typeless forms of a request and response, and differ in that from the server versions. Relates to #35190	2019-01-23 14:24:35 -05:00
Julie Tibshirani	8da7a27f3b	Deprecate types in the put mapping API. (#37280 ) From #29453 and #37285, the `include_type_name` parameter was already present and defaulted to false. This PR makes the following updates: - Add deprecation warnings to `RestPutMappingAction`, plus tests in `RestPutMappingActionTests`. - Add a typeless 'put mappings' method to the Java HLRC, and deprecate the old typed version. To do this cleanly, I opted to create a new `PutMappingRequest` object that differs from the existing server one.	2019-01-18 12:28:31 -08:00
Christoph Büscher	2f0e0b2426	Allow indices.get_mapping response parsing without types (#37492 ) This change adds a deprecation warning to the indices.get_mapping API in case the "include_type_name" parameter is set to "true" and changes the parsing code in GetMappingsResponse to parse the type-less response instead of the one containing types. As a consequence the HLRC client doesn't need to force "include_type_name=true" any more and the GetMappingsResponseTests can be adapted to the new format as well. Also removing some "include_type_name" parameters in yaml test and docs where not necessary.	2019-01-18 09:33:36 +01:00
Georgi Ivanov	87f9148580	Update the scroll example in the docs (#37394 ) Update the scroll example ascii and Java docs, so it is clearer when to consume the scroll documents. Before this change the user could lose the first results when copying and pasting the example.	2019-01-14 13:03:00 +01:00
markharwood	434430506b	Type removal - added deprecation warnings to _bulk apis (#36549 ) Added warnings checks to existing tests. Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice. Related to #35190	2019-01-10 21:35:19 +00:00
Josh Soref	edb48321ba	[DOCS] Various spelling corrections (#37046 )	2019-01-07 14:44:12 +01:00
lcawl	8f736c3ce9	[DOCS] Fixes broken link to profiling aggregations	2018-12-19 08:01:41 -08:00
lcawl	504cfb2fb1	[DOCS] Adds missing anchors for profile API	2018-12-18 15:20:19 -08:00
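
The API key commit above (fe36861ada) describes the `Authorization` header as the `ApiKey` scheme followed by the base64 encoding of the API key Id and the API key joined by a colon. A minimal Python sketch of that encoding follows. The helper name `api_key_header` is hypothetical and is not part of any Elasticsearch client. The sample values `api-key-id` and `api-key` are the ones that decode from the credentials in the commit's curl example.

```python
import base64


def api_key_header(key_id: str, api_key: str) -> dict:
    """Build the Authorization header described in the commit message.

    Hypothetical helper for illustration only: it joins the key id and the
    key with a colon, base64-encodes the result, and prefixes the scheme.
    """
    credentials = f"{key_id}:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


if __name__ == "__main__":
    # Reproduces the header value used in the commit's curl example.
    print(api_key_header("api-key-id", "api-key"))
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.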

381 Commits