OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	c8374dc9f3	[ML] add max_model_memory parameter to forecast request (#57254 ) (#57355 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (e.g. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 11:16:08 -04:00
Lee Hinman	c0f732b9f6	[7.x] Rename template V2 classes to ComposableTemplate (#57183 ) (#57232 ) Backports the following commits to 7.x: Rename template V2 classes to ComposableTemplate (#57183)	2020-05-27 11:01:59 -06:00
Samidh	fdd378e3fb	[Docs] Fix typo in start-watch-service.asciidoc (#57182 )	2020-05-27 15:50:31 +02:00
Benjamin Trent	297f864884	[ML] relax throttling on expired data cleanup (#56711 ) (#56895 ) Throttling nightly cleanup as much as we do has been overcautious. Nightly cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 08:46:42 -04:00
Kamyar Ghajar	d105e0ea9c	[Docs] Update multi-search.asciidoc (#56284 ) The documentation shows the wrong command for a multi-search async call.	2020-05-06 16:54:09 +02:00
Andrei Dan	a7968a1a5e	[7.x] HLRC: document index template v2 and component template APIs (#56136 ) (#56225 ) This documents the index template v2 and component template APIs in the high level rest client. (cherry picked from commit 9bcf89b1e27613ab8887ce611ec2b0d1356cba8b) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-05-05 19:51:54 +01:00
Dimitris Athanasiou	75dadb7a6d	[7.x][ML] Add loss_function to regression (#56118 ) (#56187 ) Adds parameters `loss_function` and `loss_function_parameter` to regression. Backport of #56118	2020-05-05 14:59:51 +03:00
Hendrik Muhs	e177a38504	[7.x][Transform] add throttling (#56007 ) (#56184 ) Add throttling to transform. Throttling slows down search requests by delaying their execution based on a documents per second metric. Fixes #54862	2020-05-05 13:09:02 +02:00
David Roberts	da5aeb8be7	[ML] Return assigned node in start/open job/datafeed response (#55570 ) Adds a "node" field to the response from the following endpoints: 1. Open anomaly detection job 2. Start datafeed 3. Start data frame analytics job If the job or datafeed is assigned to a node immediately then this field will return the ID of that node. In the case where a job or datafeed is opened or started lazily the node field will contain an empty string. Clients that want to test whether a job or datafeed was opened or started lazily can therefore check for this. Backport of #55473	2020-04-22 12:06:53 +01:00
Yang Wang	862799956c	Deprecate local parameter for get field mapping request (#55014 ) (#55099 ) The usage of the local parameter for GetFieldMappingRequest has been removed from the underlying transport action since v2.0. This PR deprecates the parameter from the REST layer. It will be removed in the next major version.	2020-04-12 13:48:47 +10:00
Nhat Nguyen	2fdbed7797	Broadcast cancellation to only nodes have outstanding child tasks (#54312 ) Today when canceling a task we broadcast ban/unban requests to all nodes in the cluster. This strategy does not scale well for hierarchical cancellation. With this change, we will track outstanding child requests and broadcast the cancellation to only nodes that have outstanding child tasks. This change also prevents a parent task from sending child requests once it got canceled. Relates #50990 Supersedes #51157 Co-authored-by: Igor Motov <igor@motovs.org> Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-04-06 11:11:29 -04:00
Benjamin Trent	4a1610265f	[7.x] [ML] add new inference_config field to trained model config (#54421 ) (#54647 ) * [ML] add new inference_config field to trained model config (#54421) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config. * fixing for backport	2020-04-02 12:25:10 -04:00
Christoph Büscher	67b9b68c66	[Docs] Add HLRC Async Search API documentation (#54353 ) Adds documentation and a corresponding test case containing typical API usage for the Async Search API to the High Level Rest Client.	2020-03-30 15:37:22 +02:00
David Roberts	7667004b20	[ML] Add a model memory estimation endpoint for anomaly detection (#54129 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Backport of #53507	2020-03-24 22:55:11 +00:00
Tom Veasey	690099553c	[7.x][ML] Adds the class_assignment_objective parameter to classification (#53552 ) Adds a new parameter for classification that enables choosing whether to assign labels to maximise accuracy or to maximise the minimum class recall. Fixes #52427.	2020-03-13 17:35:51 +00:00
Przemko Robakowski	e72cb79476	Add docs for errors in GetAlias API (#51850 ) (#52716 ) Closes #31499 Co-authored-by: Maxim <timonin.maksim@mail.ru>	2020-02-24 18:22:09 +01:00
Benjamin Trent	2a5c181dda	[ML][Inference] don't return inflated definition when storing trained models (#52573 ) (#52580 ) When `PUT` is called to store a trained model, it is useful to return the newly created model config. But, it is NOT useful to return the inflated definition. These definitions can be large and returning the inflated definition causes undue work on the server and client side. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-20 19:47:29 -05:00
OriGlassman	0da183339e	[DOCS] Fixed "SeachRequest" -> "SearchRequest" typo in HLRC docs (#52144 )	2020-02-14 13:43:06 -05:00
Nik Everett	146def8caa	Implement top_metrics agg (#51155 ) (#52366 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 11:19:11 -05:00
Nik Everett	2dac36de4d	HLRC support for string_stats (#52163 ) (#52297 ) This adds a builder and parsed results for the `string_stats` aggregation directly to the high level rest client. Without this the HLRC can't access the `string_stats` API without the elastic licensed `analytics` module. While I'm in there this adds a few of our usual unit tests and modernizes the parsing.	2020-02-12 19:25:05 -05:00
Raidok	e4936230a3	[DOCS] Fix "Asynchronous usage" title in HLRC docs (#52017 )	2020-02-07 09:40:16 -05:00
Ioannis Kakavas	81e7d926f6	Add HLRC docs for AuthN and TLS (#51355 ) (#51551 ) This commit adds examples in our documentation for - An HLRC instance authenticating to an elasticsearch cluster using an elasticsearch token service access token or an API key - An HLRC instance connecting to an elasticsearch cluster that is setup for TLS on the HTTP layer when the CA certificate of the cluster is available either as a PEM file or a keystore - An HLRC instance connecting to an elasticsearch cluster that requires client authentication where the client key and certificate are available in a keystore Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2020-01-29 08:14:38 +02:00
Benjamin Trent	76660a5a4f	[7.x] [ML][Inference] add tags url param to GET (#51330 ) (#51404 ) * [ML][Inference] add tags url param to GET (#51330) Adds a new URL parameter, `tags`, to the GET _ml/inference/<model_id> endpoint. This parameter allows the list of models to be further reduced to those that contain all the provided tags.	2020-01-24 08:26:58 -05:00
Martijn van Groningen	0a8d8d7ae3	Add Get Source API to the HLRC (#51342 ) Backport to 7.x branch of #50885. Relates to #47678 Co-authored-by: Maxim <timonin.maksim@mail.ru>	2020-01-23 13:16:20 +01:00
Dimitris Athanasiou	1d8cb3c741	[7.x][ML] Add num_top_feature_importance_values param to regression and classi… (#50914 ) (#50976 ) Adds a new parameter to regression and classification that enables computation of importance for the top most important features. The computation of the importance is based on SHAP (SHapley Additive exPlanations) method. Backport of #50914	2020-01-14 16:46:09 +02:00
Benjamin Trent	fa116a6d26	[7.x] [ML][Inference] PUT API (#50852 ) (#50887 ) * [ML][Inference] PUT API (#50852) This adds the `PUT` API for creating trained models that support our format. This includes * HLRC change for the API * API creation * Validations of model format and call * fixing backport	2020-01-12 10:59:11 -05:00
Dimitris Athanasiou	ca0828ba07	[7.x][ML] Implement force deleting a data frame analytics job (#50553 ) (#50589 ) Adds a `force` parameter to the delete data frame analytics request. When `force` is `true`, the action force-stops the jobs and then proceeds to the deletion. This can be used in order to delete a non-stopped job with a single request. Closes #48124 Backport of #50553	2020-01-03 13:46:02 +02:00
Jake	ecf295fa77	[DOCS] Bump copyright to 2019 for Java HLRC license (#50206 )	2019-12-30 15:39:53 -05:00
Martijn van Groningen	10ed1ae1d2	Add remote info to the HLRC (#50483 ) The additional change to the original PR (#49657) is that `org.elasticsearch.client.cluster.RemoteConnectionInfo` now parses the initial_connect_timeout field as a string instead of a TimeValue instance. This is needed because the initial_connect_timeout field in the remote connection API is serialized for human consumption, not for parsing purposes. Therefore the HLRC can't parse it correctly (which caused test failures in CI, but not in the PR CI :( ). The way this field is serialized needs to be changed in the remote connection API, but that is a breaking change. We should wait to make this change until REST API versioning is introduced. Co-authored-by: j-bean <anton.shuvaev91@gmail.com>	2019-12-24 15:11:58 +01:00
Przemysław Witek	cc4bc797f9	[7.x] Implement `precision` and `recall` metrics for classification evaluation (#49671 ) (#50378 )	2019-12-19 18:55:05 +01:00
Dimitris Athanasiou	8891f4db88	[7.x][ML] Introduce randomize_seed setting for regression and classification (#49990 ) (#50023 ) This adds a new `randomize_seed` for regression and classification. When not explicitly set, the seed is randomly generated. One can reuse the seed in a similar job in order to ensure the same docs are picked for training. Backport of #49990	2019-12-10 15:29:19 +02:00
Henning Andersen	5adb33ec17	Deprecate sorting in reindex (#49458 ) (#49738 ) Reindex sort never gave a guarantee about the order of documents being indexed into the destination, though it could give a sense of locality of source data. It prevents us from doing resilient reindex and other optimizations and it has therefore been deprecated. Related to #47567	2019-12-01 19:24:27 +01:00
Henning Andersen	1d745f1e5c	Revert "Deprecate sorting in reindex (#49458 )" This reverts commit `27d45c9f1f`.	2019-11-29 22:08:19 +01:00
Henning Andersen	27d45c9f1f	Deprecate sorting in reindex (#49458 ) Reindex sort never gave a guarantee about the order of documents being indexed into the destination, though it could give a sense of locality of source data. It prevents us from doing resilient reindex and other optimizations and it has therefore been deprecated. Related to #47567	2019-11-29 21:35:11 +01:00
Dimitris Athanasiou	4edb2e7bb6	[7.x][ML] Add optional source filtering during data frame reindexing (#49690 ) (#49718 ) This adds a `_source` setting under the `source` setting of a data frame analytics config. The new `_source` is reusing the structure of a `FetchSourceContext` like `analyzed_fields` does. Specifying includes and excludes for source allows selecting which fields will get reindexed and will be available in the destination index. Closes #49531 Backport of #49690	2019-11-29 16:10:44 +02:00
Benjamin Trent	b5d7c939f8	[7.x] [ML][Inference][HLRC] add GET _stats (#49562 ) (#49600 ) * [ML][Inference][HLRC] add GET _stats (#49562) * fixing for backport	2019-11-26 11:28:26 -05:00
Benjamin Trent	26a8ca00db	[7.x] [ML][Inference][HLRC] Delete trained model API (#49567 ) (#49585 ) * [ML][Inference][HLRC] Delete trained model API (#49567) * fixing for backport	2019-11-26 08:27:08 -05:00
Dimitris Athanasiou	8eaee7cbdc	[7.x][ML] Explain data frame analytics API (#49455 ) (#49504 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why. Backport of #49455	2019-11-22 22:06:10 +02:00
Lisa Cawley	ca895d3ad5	[DOCS] Merge rollup config details into API (#49412 )	2019-11-22 08:39:49 -08:00
Benjamin Trent	ed787d06e8	[7.x] [ML][Inference][HLRC] GET trained models (#49464 ) (#49488 ) * [ML][Inference][HLRC] GET trained models (#49464) * fixing for backport	2019-11-22 09:24:06 -05:00
Przemysław Witek	c7ac2011eb	[7.x] Implement accuracy metric for multiclass classification (#47772 ) (#49430 )	2019-11-21 15:01:18 +01:00
Michael Basnight	bc23bc5146	Add delete alias to the HLRC (#48819 ) The delete alias call is a rest only API call, but should still be added to the rest client. This commit adds it as well as relevant tests. Ref #47678	2019-11-12 11:02:53 -06:00
James Rodewig	7002ce1e9c	[DOCS] Replace `_uid` refs in reindex slicing docs (#48649 ) PR #25543 removed the `_uid` field in favor of the `_id` field, including for use in slicing. This removes an outdated reference to `_uid` in our reindex docs.	2019-10-29 16:41:53 -04:00
Michael Basnight	1ba57dbe08	[Docs] add missing snapshot restore reference (#45256 )	2019-10-28 09:55:10 -05:00
Alexandre Fonseca	c41951c6b3	[Docs] Fix opType options in IndexRequest API example. (#48290 )	2019-10-22 13:49:19 +02:00
Przemysław Witek	28f68fa221	Make num_top_classes parameter's default value equal to 2 (#48119 ) (#48201 )	2019-10-17 18:43:15 +02:00
Martijn van Groningen	31e41d4ac2	fixed invalid reference	2019-10-15 10:45:35 +02:00
Martijn van Groningen	cc4b6c43b3	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-15 07:23:47 +02:00
Gordon Brown	300ddfa3c1	SLM Start/Stop HLRC and docs (#47966 ) This commit adds HLRC support and documentation for the SLM Start and Stop APIs, as well as updating existing documentation where appropriate. This commit also ensures that the SLM APIs are properly included in the HLRC documentation.	2019-10-14 16:56:31 -06:00
Martijn van Groningen	7cc73f6193	Add HLRC support for enrich execute policy API (#47991 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-14 19:55:48 +02:00

503 Commits