OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Roberts	8c33cad2b2	[TEST] Lower ML model memory limit in HLRC datafeed tests (#55210 ) The MachineLearningIT.testStopDatafeed test was creating 3 jobs each with the default model memory limit of 1GB. This meant that the test would not run on a machine with less than 10GB of RAM (due to the default ML memory percentage of 30%). This change reduces the model memory limit for these jobs to 0.5GB, which means the test will run on a machine with only 5GB of RAM. Relates to https://discuss.elastic.co/t/failed-ml-tests-when-running-the-gradle-check-task-against-unchanged-repo-code/227829	2020-04-15 09:44:07 +01:00
Mark Vieira	ce85063653	[7.x] Re-add origin url information to publish POM files (#55173 )	2020-04-14 13:24:15 -07:00
Yang Wang	862799956c	Deprecate local parameter for get field mapping request (#55014 ) (#55099 ) The usage of local parameter for GetFieldMappingRequest has been removed from the underlying transport action since v2.0. This PR deprecates the parameter from rest layer. It will be removed in next major version.	2020-04-12 13:48:47 +10:00
Przemko Robakowski	afa3467957	[7.x] HLRC support for Index Templates V2 (#54838 ) (#54932 ) * HLRC support for Index Templates V2 (#54838) * HLRC support for Index Templates V2 This change adds High Level Rest Client support for Index Templates V2. Relates to #53101 * fixed compilation error Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-04-09 07:43:13 +02:00
Przemko Robakowski	7b1bb9952a	[7.x] HLRC support for Component Templates APIs (#54635 ) (#54828 ) * HLRC support for Component Templates APIs (#54635)	2020-04-06 20:24:23 +02:00
Nhat Nguyen	2fdbed7797	Broadcast cancellation to only nodes have outstanding child tasks (#54312 ) Today when canceling a task we broadcast ban/unban requests to all nodes in the cluster. This strategy does not scale well for hierarchical cancellation. With this change, we will track outstanding child requests and broadcast the cancellation to only nodes that have outstanding child tasks. This change also prevents a parent task from sending child requests once it got canceled. Relates #50990 Supersedes #51157 Co-authored-by: Igor Motov <igor@motovs.org> Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-04-06 11:11:29 -04:00
Andrei Dan	816fec7187	Enable support for decompression of compressed response within RestHighLevelClient (#53533 ) (#54811 ) Added decompression of gzip when gzip value is return as an header from Elasticsearch (cherry picked from commit 4a195b573ab85d4e756669c953419ebdb3003442) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> Co-authored-by: Hakky54 <hakangoudberg@hotmail.com>	2020-04-06 16:04:26 +01:00
Benjamin Trent	4a1610265f	[7.x] [ML] add new inference_config field to trained model config (#54421 ) (#54647 ) * [ML] add new inference_config field to trained model config (#54421) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config. * fixing for backport	2020-04-02 12:25:10 -04:00
Christoph Büscher	4882bb5cd8	[Tests] Remove unneeded test logging (#54634 ) This logging was added to get better insight into #35644 which was closed by #37302 and can be removed now.	2020-04-02 15:47:38 +02:00
Benjamin Trent	eb31be0e71	[7.x] [ML] add num_matches and preferred_to_categories to category defintion objects (#54214 ) (#54639 ) * [ML] add num_matches and preferred_to_categories to category defintion objects (#54214) This adds two new fields to category definitions. - `num_matches` indicating how many documents have been seen by this category - `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized. These fields are only guaranteed to be up to date after a `_flush` or `_close` native change: https://github.com/elastic/ml-cpp/pull/1062 * adjusting for backport	2020-04-02 09:09:19 -04:00
Mayya Sharipova	bf4857d9e0	Search hit refactoring (#41656 ) (#54584 ) Refactor SearchHit to have separate document and meta fields. This is a part of bigger refactoring of issue #24422 to remove dependency on MapperService to check if a field is metafield. Relates to PR: #38373 Relates to issue #24422 Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>	2020-04-01 15:19:00 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Dimitris Athanasiou	b4b54efa73	[7.x][ML] Hyperparameter names should match config (#54401 ) (#54435 ) Java side of elastic/ml-cpp#1096 Backport of #54401	2020-03-30 23:32:40 +03:00
Nik Everett	e58ad9fed3	Clean up how pipeline aggs check for multi-bucket (backport of #54161 ) (#54379 ) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on buckets that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215	2020-03-30 10:44:55 -04:00
Christoph Büscher	67b9b68c66	[Docs] Add HLRC Async Search API documentation (#54353 ) Adds documentation and a corresponding test case containing typical API usage for the Async Search API to the High Level Rest Client.	2020-03-30 15:37:22 +02:00
Benjamin Trent	374e76d7cd	[Transform] fixing naming in HLRC and _cat to match API content (#54300 ) (#54408 ) Fixing the naming of the HLRC values to match the ToXContent field names (i.e. the field names returned from an API call). Also fixes the names in the _cat API as well. closes #53946	2020-03-30 08:57:02 -04:00
Lee Hinman	f2cc2b1127	[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039 ) (#54347 ) * Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) * Add REST APIs for IndexTemplateV2Metadata CRUD This commit adds the get/put/delete APIs for interacting with the now v2 versions of index templates. These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag. Relates to #53101 * Add exceptions for HLRC tests * Add skips for 7.x versions * Use index_template instead of template_v2 in action names * Add test for MetaDataIndexTemplateService.addIndexTemplateV2 * Move removal to static method and add test * Add unit tests for request classes (implement hashCode & equals) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Fix compilation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-27 10:47:22 -06:00
Christoph Büscher	0d17295601	[Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (#54325 ) The semantics and the default value for this parameter have changed, adapting the javadoc accordingly.	2020-03-27 16:02:03 +01:00
Hendrik Muhs	5f007b7cb1	fix Zero or negative time interval not supported	2020-03-26 22:33:09 +01:00
Hendrik Muhs	4ecf9904d5	[Transform] Transform optmize date histogram (#54068 ) optimize transform for group_by on date_histogram by injecting an additional range query. This limits the number of search and index requests and avoids unnecessary updates. Only recent buckets get re-written. fixes #54254	2020-03-26 21:39:50 +01:00
Gordon Brown	0d30b48613	Disallow negative TimeValues (#53913 ) This commit causes negative TimeValues, other than -1 which is sometimes used as a sentinel value, to be rejected during parsing. Also introduces a hack to allow ILM to load policies which were written to the cluster state with a negative min_age, treating those values as 0, which should match the behavior of prior versions.	2020-03-26 13:30:35 -06:00
Dimitris Athanasiou	13368aae37	[7.x][ML] DF Analytics should always display operational stats (#54210 ) (#54290 ) This commit populates the _stats API response with sensible "empty" `data_counts` and `memory_usage` objects when the job itself has not started reporting them. Backport of #54210	2020-03-26 20:03:14 +02:00
Christoph Büscher	da404bbce2	HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200 ) (#54266 ) Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no longer needed since we now only send the parameters along with the rest request that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct defaults are set on the client side. This change removes setting and sending these defaults where possible, leaving only the overwrite of batchedReduceSize with a default value of 5, since the default used in the vanilla SearchRequest is 512. However, we don't need to send this value along as a request parameter if its the default since the correct one will be set on the receiving end if no value is specified. Also adding tests for RestSubmitAsyncSearchAction that check the correct defaults are set when parameters are missing on the server side. Backport of #54200	2020-03-26 19:01:17 +01:00
Armin Braun	32d0bb8754	Retry in SnapshotIT Snapshot Abort (#54195 ) (#54249 ) Retry here to work around the possible race between snapshot finalization and deletion. Closes #53509	2020-03-26 09:53:42 +01:00
Luca Cavanna	ff269160af	Async search: rename REST parameters (#54198 ) This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069	2020-03-26 09:40:50 +01:00
David Roberts	7667004b20	[ML] Add a model memory estimation endpoint for anomaly detection (#54129 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Backport of #53507	2020-03-24 22:55:11 +00:00
Tim Brooks	caefa78513	Align remote info api with new settings (#54102 ) Currently the remote info api has added a number of possible fields (proxy, num_socket_connections, etc) that are available in proxy mode. These fields are not aligned with what the settings are named. This commit modifies this API to align with the settings.	2020-03-24 10:27:24 -06:00
Przemysław Witek	7e25563303	Use the new ML state index name (.ml-state-000001) instead of the legacy one (.ml-state) (#54070 ) (#54085 )	2020-03-24 15:22:59 +01:00
Dimitris Athanasiou	5ce7c99e74	[7.x][ML] Data frame analytics data counts (#53998 ) (#54031 ) This commit instruments data frame analytics with stats for the data that are being analyzed. In particular, we count training docs, test docs, and skipped docs. In order to account docs with missing values as skipped docs for analyses that do not support missing values, this commit changes the extractor so that it only ignores docs with missing values when it collects the data summary, which is used to estimate memory usage. Backport of #53998	2020-03-24 11:30:43 +02:00
Hendrik Muhs	7dcacf531f	[7.x][Transform][Rollup] add processing stats to record the ti… (#54027 ) add 2 additional stats: processing time and processing total which capture the time spent for processing results and how often it ran. The 2 new stats correspond to the existing indexing and search stats. Together with indexing and search this now allows the user to see the full picture, all 3 stages.	2020-03-24 09:22:02 +01:00
Jim Ferenczi	9e3f7f4575	Add heuristics to compute pre_filter_shard_size when unspecified (#53873 ) (#54007 ) This commit changes the pre_filter_shard_size default from 128 to unspecified. This allows to apply heuristics based on the request and the target indices when deciding whether the can match phase should run or not. When unspecified, this pr runs the can match phase automatically if one of these conditions is met: * The request targets more than 128 shards. * The request contains read-only indices. * The primary sort of the query targets an indexed field. Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value. Closes #39835	2020-03-24 02:05:15 +01:00
Christoph Büscher	286c3660bd	Add async_search get and delete APIs to HLRC (#53828 ) (#53980 ) This commit adds the "_async_searhc" get and delete APIs to the AsyncSearchClient in the High Level Rest Client. Relates to #49091 Backport of #53828	2020-03-23 21:21:36 +01:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Dimitris Athanasiou	08a8345269	[7.x][ML] Fix typo in outlier detection timing stats (#53988 ) (#53995 ) The field holding the timing stats was mistakenly called `timings_stats`. Backport of #53988	2020-03-23 19:46:39 +02:00
Martijn van Groningen	aef7b89219	Backport: initial data stream commit (#53959 ) This commits adds a data stream feature flag, initial definition of a data stream and the stubs for the data stream create, delete and get APIs. Also simple serialization tests are added and a rest test to thest the data stream API stubs. This is a large amount of code and mainly mechanical, but this commit should be straightforward to review, because there isn't any real logic. The data stream transport and rest action are behind the data stream feature flag and are only intialized if the feature flag is enabled. The feature flag is enabled if elasticsearch is build as snapshot or a release build and the 'es.datastreams_feature_flag_registered' is enabled. The integ-test-zip sets the feature flag if building a release build, otherwise rest tests would fail. Relates to #53100	2020-03-23 12:58:09 +01:00
Christoph Büscher	8eacb153df	Add async_search.submit to HLRC #53592 (#53852 ) This commit adds a new AsyncSearchClient to the High Level Rest Client which initially supporst the submitAsyncSearch in its blocking and non-blocking flavour. Also adding client side request and response objects and parsing code to parse the xContent output of the client side AsyncSearchResponse together with parsing roundtrip tests and a simple roundtrip integration test. Relates to #49091 Backport of #53592	2020-03-20 13:15:58 +01:00
Dimitris Athanasiou	60153c5433	[7.x][ML] Data frame analytics analysis stats (#53788 ) (#53844 ) Adds parsing and indexing of analysis instrumentation stats. The latest one is also returned from the get-stats API. Note that we chose to duplicate objects even where they are currently similar. There are already ideas on how these will diverge in the future and while the duplication looks ugly at the moment, it is the option that offers the highest flexibility. Backport of #53788	2020-03-20 12:11:53 +02:00
Christoph Büscher	9a328c2b83	Add unsupported parameters to HLRC search request (#53745 ) Currently we don't send values for the `pre_filter_shard_size` and `max_concurrent_shard_requests` SearchRequest parameters over http when using the High Level Rest Client. This change adds these parameters to the RequestConverters and tests.	2020-03-18 20:00:31 +01:00
Alan Woodward	580bc40c0c	Make it possible to deprecate all variants of a ParseField with no replacement (#53722 ) Sometimes we want to deprecate and remove a ParseField entirely, without replacement; for example, the various places where we specify a _type field in 7x. Currently we can tell users only that a particular field name should not be used, and that another name should be used in its place. This commit adds the ability to say that a field should not be used at all.	2020-03-18 14:16:19 +00:00
Hendrik Muhs	7a12300ce6	[7.x][Transform] enhance the output of preview to return full… (#53695 ) changes the output format of preview regarding deduced mappings and enhances it to return all the details about auto-index creation. This allows the user to customize the index creation. Using HLRC you can create a index request from the output of the response. backport #53572	2020-03-18 08:37:56 +01:00
Lee Hinman	9c0e846db3	[7.x] Add REST API for ComponentTemplate CRUD (#53558 ) (#53681 ) * Add REST API for ComponentTemplate CRUD This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing ComponentTemplateMetadata into the cluster state metadata. These APIs are currently only available behind a feature flag system property - `es.itv2_feature_flag_registered`. Relates to #53101 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-17 13:23:28 -06:00
Ryan Ernst	5c472fcb47	Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642 ) Re-applies the change from #53523 along with test fixes. closes #53626 closes #53624 closes #53622 closes #53625 Co-authored-by: Nik Everett <nik9000@gmail.com> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Jake Landis <jake.landis@elastic.co>	2020-03-17 10:28:51 -07:00
David Kyle	2b635737e1	[ML] Parse single named object in config classes (#53472 ) (#53542 )	2020-03-17 13:59:52 +00:00
Jason Tedor	881d0bfa8a	Add server name to remote info API (#53634 ) This commit adds the configured server_name to the proxy mode info so that it can be exposed in the remote info API.	2020-03-16 21:20:42 -04:00
Mark Vieira	2f0aca992b	Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 )" This reverts commit `b7dbadeea0`.	2020-03-15 18:10:40 -07:00
Jason Tedor	b7dbadeea0	Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 ) This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2 dependency to 2.13.1. Relates #53523	2020-03-14 13:28:06 -04:00
Benjamin Trent	4e43ede735	[ML] renaming inference processor field field_mappings to new name field_map (#53433 ) (#53502 ) This renames the `inference` processor configuration field `field_mappings` to `field_map`. `field_mappings` is now deprecated.	2020-03-13 15:40:57 -04:00
Tom Veasey	690099553c	[7.x][ML] Adds the class_assignment_objective parameter to classification (#53552 ) Adds a new parameter for classification that enables choosing whether to assign labels to maximise accuracy or to maximise the minimum class recall. Fixes #52427.	2020-03-13 17:35:51 +00:00
Nik Everett	9dcd64c110	Preserve metric types in top_metrics (backport of #53288 ) (#53440 ) This changes the `top_metrics` aggregation to return metrics in their original type. Since it only supports numerics, that means that dates, longs, and doubles will come back as stored, with their appropriate formatter applied.	2020-03-12 17:17:09 -04:00
Aleksandr Maus	e3a1291adf	EQL: Add more rest client tests (#53422 ) (#53474 )	2020-03-12 09:44:46 -04:00
Benjamin Trent	89668c5ea0	[ML][Inference] adds new default_field_map field to trained models (#53294 ) (#53419 ) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 13:49:39 -04:00
Dimitris Athanasiou	0fd0516d0d	[7.x][ML] Rename data frame analytics maximum_number_trees to max_trees (#53300 ) (#53390 ) Deprecates `maximum_number_trees` parameter of classification and regression and replaces it with `max_trees`. Backport of #53300	2020-03-11 12:45:27 +02:00
Hendrik Muhs	696aa4ddaf	[7.x][Transform] add support for script in group_by (#53167 ) (#53324 ) add the possibility to base the group_by on the output of a script. closes #43152 backport #53167	2020-03-10 11:12:58 +01:00
Christoph Büscher	9e561c2921	Fix AbstractBulkByScrollRequest slices parameter via Rest (#53068 ) Currently the AbstractBulkByScrollRequest accepts slice values of 0 via its `setSlices` method, denoting the "auto" slicing behaviour that is usable by settting the "slices=auto" parameter on rest requests. When using the High Level Rest Client, however, we send the 0 value as an integer, which is then rejected as invalid by `AbstractBulkByScrollRequest#parseSlices`. Instead of making parsing of the rest request more lenient, this PR opts for changing the RequestConverter logic in the client to translate 0 values to "auto" on the rest requests. Closes #53044	2020-03-06 15:38:04 +01:00
Aleksandr Maus	2dc872f052	EQL: Add HLRC for EQL stats (#53043 ) (#53148 )	2020-03-05 09:20:38 -05:00
Nik Everett	28df7ae5ed	Support multiple metrics in `top_metrics` agg (backport of #52965 ) (#53163 ) This adds support for returning multiple metrics to the `top_metrics` agg. It looks like: ``` POST /test/_search?filter_path=aggregations { "aggs": { "tm": { "top_metrics": { "metrics": [ {"field": "v"}, {"field": "m"} ], "sort": {"s": "desc"} } } } } ```	2020-03-05 08:12:01 -05:00
Aleksandr Maus	b47bffba24	EQL: consistent naming for event type vs event category (#53073 ) (#53090 ) Related to https://github.com/elastic/elasticsearch/issues/52941	2020-03-04 08:02:38 -05:00
Costin Leau	712e0c05cd	EQL: Add implicit ordering on timestamp (#53004 ) QL: Move Sort base class from SQL to QL (cherry picked from commit 798015b7bbd565e9c4222724614baeb432c7c2b3)	2020-03-02 22:41:36 +02:00
Aleksandr Maus	89ed857c79	EQL: Change request parameter query to filter and rule to query (#52971 ) (#53006 ) Related to https://github.com/elastic/elasticsearch/issues/52911	2020-03-02 09:26:23 -05:00
Dimitris Athanasiou	85b4e45093	[7.x]ML] Parse and report memory usage for DF Analytics (#52778 ) (#52980 ) Adds reporting of memory usage for data frame analytics jobs. This commit introduces a new index pattern `.ml-stats-*` whose first concrete index will be `.ml-stats-000001`. This index serves to store instrumentation information for those jobs. Backport of #52778 and #52958	2020-02-29 13:03:40 +02:00
Dan Hermann	dd44376d27	[7.x] Send the fields param in body instead of URL params (#52948 )	2020-02-28 08:57:35 -06:00
Costin Leau	a674085903	EQL: Disable field extraction for returned events (#52884 ) Return the whole source of matching events (cherry picked from commit 79ca586ab1d89d645fb58142b82202f14ce5d361)	2020-02-28 13:48:15 +02:00
Nik Everett	1d1956ee93	Add size support to `top_metrics` (backport of #52662 ) (#52914 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 16:12:52 -05:00
Jason Tedor	d568345f1a	Fix roles parsing in client nodes sniffer (#52888 ) We mades roles pluggable, but never updated the client to account for this. This means that when speaking to a modern cluster, application logs are spammed with warning messages around unrecognized roles. This commit addresses this by accounting for the fact that roles can extend beyond master/data/ingest now.	2020-02-27 15:26:49 -05:00
Benjamin Trent	19a6c5d980	[7.x] [ML][Inference] Add support for multi-value leaves to the tree model (#52531 ) (#52901 ) * [ML][Inference] Add support for multi-value leaves to the tree model (#52531) This adds support for multi-value leaves. This is a prerequisite for multi-class boosted tree classification.	2020-02-27 14:05:28 -05:00
Benjamin Trent	eac38e9847	[ML] Add indices_options to datafeed config and update (#52793 ) (#52905 ) This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. This is necessary for the following use cases: - Reading from frozen indices - Allowing certain indices in multiple index patterns to not exist yet These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object. closes https://github.com/elastic/elasticsearch/issues/48056	2020-02-27 13:43:25 -05:00
Josh Devins	68ba571f70	Adds recall@k metric to rank eval API (#52889 ) This change adds the recall@k metric and refactors precision@k to match the new metric. Recall@k is an important metric to use for learning to rank (LTR) use-cases. Candidate generation or first ranking phase ranking functions are often optimized for high recall, in order to generate as many relevant candidates in the top-k as possible for a second phase of ranking. Adding this metric allows tuning that base query for LTR. See: https://github.com/elastic/elasticsearch/issues/51676 Backports: https://github.com/elastic/elasticsearch/pull/52577	2020-02-27 16:04:24 +01:00
Costin Leau	40bc06f6ad	EQL: Hook engine to Elasticsearch (#52828 ) Add query execution and return actual results returned from Elasticsearch inside the tests (cherry picked from commit 3e039282bf991af87604a6d4f8eada19d5e33842)	2020-02-27 11:22:22 +02:00
Jake Landis	8d311297ca	[7.x] Smarter copying of the rest specs and tests (#52114 ) (#52798 ) * Smarter copying of the rest specs and tests (#52114) This PR addresses the unnecessary copying of the rest specs and allows for better semantics for which specs and tests are copied. By default the rest specs will get copied if the project applies `elasticsearch.standalone-rest-test` or `esplugin` and the project has rest tests or you configure the custom extension `restResources`. This PR also removes the need for dozens of places where the x-pack specs were copied by supporting copying of the x-pack rest specs too. The plugin/task introduced here can also copy the rest tests to the local project through a similar configuration. The new plugin/task allows a user to minimize the surface area of which rest specs are copied. Per project can be configured to include only a subset of the specs (or tests). Configuring a project to only copy the specs when actually needed should help with build cache hit rates since we can better define what is actually in use. However, project level optimizations for build cache hit rates are not included with this PR. Also, with this PR you can no longer use the includePackaged flag on integTest task. The following items are included in this PR: * new plugin: `elasticsearch.rest-resources` * new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy * new extension 'restResources' ``` restResources { restApi { includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar includeXpack 'baz' //will include x-pack specs that start with baz } restTests { includeCore 'foo', 'bar' //will include the core tests that start with foo and bar includeXpack 'baz' //will include the x-pack tests that start with baz } } ```	2020-02-26 08:13:41 -06:00
Sachin Frayne	d3c0a2f013	Improve the error message when loading text fielddata. (#52753 ) Emphasize keyword over fielddata as the preferred way to use String fields for aggregations or sorting.	2020-02-25 15:45:44 -08:00
Aleksandr Maus	a7bdb0b456	EQL: Add integration tests harness to test EQL feature parity with original implementation (#52248 ) (#52675 ) The tests use the original test queries from https://github.com/endgameinc/eql/blob/master/eql/etc/test_queries.toml for EQL implementation correctness validation. The file test_queries_unsupported.toml serves as a "blacklist" for the queries that we do not support. Currently all of the queries are blacklisted. Over the time the expectation is to eventually have an empty "blacklist" when all of the queries are fully supported. The tests use the original test vector from https://raw.githubusercontent.com/endgameinc/eql/master/eql/etc/test_data.json. Only one EQL and the response is stubbed for now to match the expected output from that query. This part would need some tweaking after EQL is fully wired. Related to https://github.com/elastic/elasticsearch/issues/49581	2020-02-24 12:46:59 -05:00
Przemko Robakowski	e72cb79476	Add docs for errors in GetAlias API (#51850 ) (#52716 ) Closes #31499 Co-authored-by: Maxim <timonin.maksim@mail.ru>	2020-02-24 18:22:09 +01:00
Igor Motov	e5b21a3fc6	Add HLRC for EQL search (#52550 ) Adds EQL HLRC client with the search method. Relates to #51961	2020-02-21 08:44:08 -05:00
Maria Ralli	ba8d6d1fb5	Remove Xlint exclusions from gradle files Backport of #52542. This commit is part of issue #40366 to remove disabled Xlint warnings from gradle files. In particular, it removes the Xlint exclusions from the following files: - benchmarks/build.gradle - client/client-benchmark-noop-api-plugin/build.gradle - x-pack/qa/rolling-upgrade/build.gradle - x-pack/qa/third-party/active-directory/build.gradle - modules/transport-netty4/build.gradle For the first three files no code adjustments were needed. For x-pack/qa/third-party/active-directory move the suppression at the code level. For transport-netty4 replace the variable arguments with ArrayLists and remove any redundant casts.	2020-02-20 14:12:05 +00:00
David Kyle	7bbe5c8464	[Ml] Validate tree feature index is within range (#52514 ) This changes the tree validation code to ensure no node in the tree has a feature index that is beyond the bounds of the feature_names array. Specifically this handles the situation where the C++ emits a tree containing a single node and an empty feature_names list. This is valid tree used to centre the data in the ensemble but the validation code would reject this as feature_names is empty. This meant a broken workflow as you cannot GET the model and PUT it back	2020-02-19 14:41:43 +00:00
Nik Everett	146def8caa	Implement top_metrics agg (#51155 ) (#52366 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 11:19:11 -05:00
Nik Everett	2dac36de4d	HLRC support for string_stats (#52163 ) (#52297 ) This adds a builder and parsed results for the `string_stats` aggregation directly to the high level rest client. Without this the HLRC can't access the `string_stats` API without the elastic licensed `analytics` module. While I'm in there this adds a few of our usual unit tests and modernizes the parsing.	2020-02-12 19:25:05 -05:00
David Roberts	1cefafdd14	[ML] Add new categorization stats to model_size_stats (#52009 ) This change adds support for the following new model_size_stats fields: - categorized_doc_count - total_category_count - frequent_category_count - rare_category_count - dead_category_count - categorization_status Backport of #51879	2020-02-10 09:10:50 +00:00
Martijn van Groningen	44ea1efd26	Tidy up GetSourceRequest class: (#51916 ) * No need to implement ToXContentObject * Made index and id fields immutable.	2020-02-10 09:42:03 +01:00
Jay Modi	3edadfefd0	RestHandlers declare handled routes (#52123 ) This commit changes how RestHandlers are registered with the RestController so that a RestHandler no longer needs to register itself with the RestController. Instead the RestHandler interface has new methods which when called provide information about the routes (method and path combinations) that are handled by the handler including any deprecated and/or replaced combinations. This change also makes the publication of RestHandlers safe since they no longer publish a reference to themselves within their constructors. Closes #51622 Co-authored-by: Jason Tedor <jason@tedor.me> Backport of #51950	2020-02-09 22:48:32 -07:00
Benjamin Trent	c6111eb90e	[ML][Inference] adding number_samples to TreeNode (#51937 ) (#52060 ) in preparation for feature importance and split information gain, adding `number_samples` field to `TreeNode` definition. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-07 17:04:58 -05:00
David Kyle	8f10a7c6ca	[ML] Make Ensemble feature names optional (#51996 ) The featureNames field is requisite in individual models but is not required by the Ensemble.	2020-02-07 10:08:37 +00:00
Jason Tedor	25daf5f1e1	Add autoscaling API skelton (#51564 ) The main purpose of this commit is to add a single autoscaling REST endpoint skeleton, for the purpose of starting to build out the build and testing infrastructure that will surround it. For example, rather than commiting a fully-functioning autoscaling API, we introduce here the skeleton so that we can start wiring up the build and testing infrastructure, establish security roles/permissions, an so on. This way, in a forthcoming PR that introduces actual functionality, that PR will be smaller and have less distractions around that sort of infrastructure.	2020-02-06 21:55:01 -05:00
Ioannis Kakavas	5092d3098d	Cleanup test user in HLRC test (#49477 ) (#51942 ) SecurityIT.testGetUser creates a user for testing purposes, but did not delete the user at the end of the test. This could leave the cluster in an unexpected state for other tests. This commit: - Deletes the user at the end of `testGetUser` - Adds the test-name as metadata to the users that are created in `SecurityIT` so that their origin is clear if they do interfere with other tests - Enables SecurityDocumentationIT.testGetUsers on the expectation that the new cleanup step will resolve the unreliability of that test. Relates: #48440 Co-authored-by: Tim Vernum <tim@adjective.org>	2020-02-06 13:05:09 +02:00
Martijn van Groningen	0610eb51ef	Change HLRC SourceExists to use GetSourceRequest instead of GetRequest (#51789 ) (#51913 ) Originates from #50885 Co-authored-by: Maxim <timonin.maksim@mail.ru>	2020-02-05 13:27:31 +01:00
Julie Tibshirani	38ce428831	Create a class to hold field capabilities for one index. (#51844 ) Currently, the same class `FieldCapabilities` is used both to represent the capabilities for one index, and also the merged capabilities across indices. To help clarify the logic, this PR proposes to create a separate class `IndexFieldCapabilities` for the capabilities in one index. The refactor will also help when adding `source_path` information in #49264, since the merged source path field will have a different structure from the field for a single index. Individual changes: * Add a new class IndexFieldCapabilities. * Remove extra constructor from FieldCapabilities. * Combine the add and merge methods in FieldCapabilities.Builder.	2020-02-04 11:24:57 -08:00
James Rodewig	4ea7297e1e	[DOCS] Change http://elastic.co -> https (#48479 ) (#51812 ) Co-authored-by: Jonathan Budzenski <jon@budzenski.me>	2020-02-03 09:50:11 -05:00
Ryan Ernst	21224caeaf	Remove comparison to true for booleans (#51723 ) While we use `== false` as a more visible form of boolean negation (instead of `!`), the true case is implied and the true value does not need to explicitly checked. This commit converts cases that have slipped into the code checking for `== true`.	2020-01-31 16:35:43 -08:00
Lee Hinman	b9faa0733d	[7.x] Rename ILM history index enablement setting (#51698 ) (#51705 ) * Rename ILM history index enablement setting The previous setting was `index.lifecycle.history_index_enabled`, this commit changes it to `indices.lifecycle.history_index_enabled` to indicate this is not an index-level setting (it's node level).	2020-01-30 15:27:44 -07:00
Benjamin Trent	1380dd439a	[7.x] [ML][Inference] Fix weighted mode definition (#51648 ) (#51695 ) * [ML][Inference] Fix weighted mode definition (#51648) Weighted mode inaccurately assumed that the "max value" of the input values would be the maximum class value. This does not make sense. Weighted Mode should know how many classes there are. Hence the new parameter `num_classes`. This indicates what the maximum class value to be expected.	2020-01-30 15:33:25 -05:00
Hendrik Muhs	b5106aa59d	dump audit index to logs for better debugging (#51627 ) The audit index is re-created for every testrun and therefore potential useful debug information gets lost. This change reads out the audit index and logs the results, which makes them available for debugging CI issues. relates #51549	2020-01-30 11:14:56 +01:00
Albert Zaharovits	f25b6cc2eb	Add new 'maintenance' index privilege #50643 This commit creates a new index privilege named `maintenance`. The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`), and `force-merge`. Previously the actions were only under the `manage` privilege which in some situations was too permissive. Co-authored-by: Amir H Movahed <arhd83@gmail.com>	2020-01-30 11:59:11 +02:00
Ioannis Kakavas	81e7d926f6	Add HLRC docs for AuthN and TLS (#51355 ) (#51551 ) This commit adds examples in our documentation for - An HLRC instance authenticating to an elasticsearch cluster using an elasticsearch token service access token or an API key - An HLRC instance connecting to an elasticsearch cluster that is setup for TLS on the HTTP layer when the CA certificate of the cluster is available either as a PEM file or a keystore - An HLRC instance connecting to an elasticsearch cluster that requires client authentication where the client key and certificate are available in a keystore Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2020-01-29 08:14:38 +02:00
Ioannis Kakavas	ee202a642f	Enable tests in FIPS 140 in JDK 11 (#49485 ) This change changes the way to run our test suites in JVMs configured in FIPS 140 approved mode. It does so by: - Configuring any given runtime Java in FIPS mode with the bundled policy and security properties files, setting the system properties java.security.properties and java.security.policy with the == operator that overrides the default JVM properties and policy. - When runtime java is 11 and higher, using BouncyCastle FIPS Cryptographic provider and BCJSSE in FIPS mode. These are used as testRuntime dependencies for unit tests and internal clusters, and copied (relevant jars) explicitly to the lib directory for testclusters used in REST tests - When runtime java is 8, using BouncyCastle FIPS Cryptographic provider and SunJSSE in FIPS mode. Running the tests in FIPS 140 approved mode doesn't require an additional configuration either in CI workers or locally and is controlled by specifying -Dtests.fips.enabled=true	2020-01-27 11:14:52 +02:00
Benjamin Trent	fc994d9ce1	[ML][Inference] Adds validations for model PUT (#51376 ) (#51409 ) Adds validations making sure that * `input.field_names` is not empty * `ensemble.trained_models` is not empty * `tree.feature_names` is not empty closes https://github.com/elastic/elasticsearch/issues/51354	2020-01-24 09:29:12 -05:00
Benjamin Trent	76660a5a4f	[7.x] [ML][Inference] add tags url param to GET (#51330 ) (#51404 ) * [ML][Inference] add tags url param to GET (#51330) Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint. This parameter allows the list of models to be further reduced to those who contain all the provided tags.	2020-01-24 08:26:58 -05:00
Nhat Nguyen	072203cba8	Clean soft-deletes setting in ccr tests (#51113 ) (#51372 ) We no longer need to explicitly enable soft-deletes in CCR tests. Relates #50775 Backport of #51113	2020-01-23 16:31:47 -05:00
Martijn van Groningen	0a8d8d7ae3	Add Get Source API to the HLRC (#51342 ) Backport to 7.x branch of #50885. Relates to #47678 Co-authored-by: Maxim <timonin.maksim@mail.ru>	2020-01-23 13:16:20 +01:00
Tim Vernum	e41c0b1224	Deprecating kibana_user and kibana_dashboard_only_user roles (#50963 ) This change adds a new `kibana_admin` role, and deprecates the old `kibana_user` and`kibana_dashboard_only_user`roles. The deprecation is implemented via a new reserved metadata attribute, which can be consumed from the API and also triggers deprecation logging when used (by a user authenticating to Elasticsearch). Some docs have been updated to avoid references to these deprecated roles. Backport of: #46456 Co-authored-by: Larry Gregory <lgregorydev@gmail.com>	2020-01-15 11:07:19 +11:00
Benjamin Trent	72c270946f	[ML][Inference] Adding classification_weights to ensemble models (#50874 ) (#50994 ) * [ML][Inference] Adding classification_weights to ensemble models classification_weights are a way to allow models to prefer specific classification results over others this might be advantageous if classification value probabilities are a known quantity and can improve model error rates.	2020-01-14 12:40:25 -05:00

1 2 3 4 5 ...

1521 Commits