OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dimitris Athanasiou	ee4610c0ca	[7.x][ML] Rename cross validation splitter package (#59529 ) (#59544 ) Renames and moves the cross validation splitter package. First, the package and classes are renamed from using "cross validation splitter" to "train test splitter". Cross validation as a term is overloaded and encompasses more concepts than what we are trying to do here. Second, the package used to be under `process` but it does not make sense to be there, it can be a top level package under `dataframe`. Backport of #59529	2020-07-14 18:54:46 +03:00
Dimitris Athanasiou	37406487b9	[7.x][ML] Improve error for non-included field with unsupported type (#59424 ) (#59541 ) When a field is not included yet its type is unsupported, we currently state that the reason the field is excluded is that it is not in the includes list. However, this implies the user could include it but if the user tried to do so, they would get a failure as they would be including a field with unsupported type. This commit improves this by stating the reason a not included field with unsupported type is excluded is because of its type. Backport of #59424	2020-07-14 18:54:34 +03:00
Andrei Stefan	1fd16ffb70	Add license header to EqlStatsIT.java (#59537 )	2020-07-14 18:45:13 +03:00
Dan Hermann	e54b4a729f	[7.x] Adds write_index_only option to put mapping API (#59539 )	2020-07-14 10:34:08 -05:00
Nhat Nguyen	4d7c59bedb	Assign follower primary to nodes with remote cluster client role (#59375 ) The primary shards of follower indices during the bootstrap need to be on nodes with the remote cluster client role as those nodes reach out to the corresponding leader shards on the remote cluster to copy Lucene segment files and renew the retention leases. This commit introduces a new allocation decider that ensures bootstrapping follower primaries are allocated to nodes with the remote cluster client role. Co-authored-by: Jason Tedor <jason@tedor.me>	2020-07-14 11:23:55 -04:00
Dimitris Athanasiou	e302c66847	[7.x][ML] Fix NPE when starting classification with missing dependent_variable (#59524 ) (#59540 ) Since we have added checking the cardinality of the dependent_variable for classification, we have introduced a bug where an NPE is thrown if the dependent_variable is a missing field. This commit is fixing this issue. Backport of #59524	2020-07-14 17:56:55 +03:00
Andrei Dan	d477aa14ef	Data Streams: fix bwc test (#59528 ) (#59534 ) (cherry picked from commit ed1a5c00abed8c63ad395ea93df7a303da7b7a65) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 15:17:20 +01:00
Andrei Stefan	cf752992d6	Add telemetry metrics (#59526 )	2020-07-14 16:25:24 +03:00
Dan Hermann	59f639a279	Add auto_configure privilege	2020-07-14 08:23:49 -05:00
David Kyle	d86435938b	[7.x] Add ml licence check to the pipeline inference agg. (#59213 ) (#59412 ) Ensures the licence is sufficient for the model used in inference	2020-07-14 14:03:10 +01:00
Yang Wang	f651487d74	Support prefix search for API key names (#59113 ) (#59520 ) This PR adds minimal support for prefix search of API key names. It only touches the API key name and leaves all other query parameters, e.g. realm name and username, unchanged.	2020-07-14 22:06:20 +10:00
Andrei Dan	7dcdaeae49	Default to @timestamp in composable template datastream definition (#59317 ) (#59516 ) This makes the data_stream timestamp field specification optional when defining a composable template. When there isn't one specified it will default to `@timestamp`. (cherry picked from commit 5609353c5d164e15a636c22019c9c17fa98aac30) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 12:36:54 +01:00
Yang Wang	2e71d0aa91	Allow mixed usage of boolean and string when merging OIDC claims (#59112 ) (#59512 ) Certain OPs mix usage of boolean and string for boolean type OIDC claims. For example, the same "email_verified" field is presented as boolean in IdToken, but is a string of "true" in the response of user info. This inconsistency results in failures when we try to merge them during authorization. This PR introduces a small leniency so that it will merge a boolean with a string that has the value of the boolean's string representation. In other words, it will merge true with "true" and false with "false", but nothing else.	2020-07-14 20:41:16 +10:00
Andrei Dan	4180333bbc	[7.x] Composable templates: add a default mapping for @timestamp (#59244 ) (#59510 ) This adds a low precedence mapping for the `@timestamp` field with type `date`. This will aid with the bootstrapping of data streams as a timestamp mapping can be omitted when nanos precision is not needed. (cherry picked from commit 4e72f43d62edfe52a934367ce9809b5efbcdb531) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 11:29:33 +01:00
Costin Leau	5580eb61ed	EQL: Improve sequence limiting (#59439 ) Improve the way limit (in particular offset) is being applied to handle the case where the matches are less than the offset and absolute limit. Combine Matcher and SequenceStateMachine into one class since the two have evolved beyond their original name and structure. (cherry picked from commit 63d3c62cdfc33dea03f21d5565b9c8ea104003eb)	2020-07-14 13:19:09 +03:00
Hendrik Muhs	c8290167a0	[7.x][Transform] separate pivot and extract function interface (#59505 ) separate pivot from the indexer and introduce an abstraction layer, pivot becomes a function. Foundation to add more functions to transform. piggy backed fixes: - when running geo tile group_by it could fail due to query clause limit (unreleased) - new style page size using settings was not validating limit of 10k (7.8)	2020-07-14 11:27:16 +02:00
Martijn van Groningen	5f24be1bc1	Also set system property when running test task. (#59499 ) Closes #59488	2020-07-14 10:34:52 +02:00
Rene Groeschke	d5c11479da	Remove remaining deprecated api usages (#59231 ) (#59498 ) - Fix duplicate path deprecation by removing duplicate test resources - fix deprecated non annotated input property in LazyPropertyList - fix deprecated usage of AbstractArchiveTask.version - Resolve correct test resources	2020-07-14 10:25:00 +02:00
David Roberts	529aa345df	[ML] Account for per-partition categorization in model memory estimate (#59458 ) Now that we have per-partition categorization, the estimate for the model memory limit required for a particular analysis config needs to take into account whether categorization is operating for the job as a whole or per-partition.	2020-07-14 09:16:28 +01:00
Yang Wang	4350add12c	Allow null name when deserialising API key document (#59485 ) (#59496 ) API keys can be created without names using grant API key action. This is considered a bug (#59484). Since the feature has already been released, we need to accommodate existing keys that are created with null names. This PR relaxes the parser logic so that a null name is accepted.	2020-07-14 16:08:32 +10:00
Tim Brooks	623df95a32	Adding indexing pressure stats to node stats API (#59467 ) We have recently added internal metrics to monitor the amount of indexing occurring on a node. These metrics introduce back pressure to indexing when memory utilization is too high. This commit exposes these stats through the node stats API.	2020-07-13 17:23:42 -06:00
Lee Hinman	81bdb20b8a	Fix license header for DataStreamRestIT	2020-07-13 14:41:29 -06:00
Lee Hinman	bf1a60130d	[7.x] Add telemetry for data streams (#59433 ) (#59454 ) This commit adds data stream info to the `/_xpack` and `/_xpack/usage` APIs. Currently the usage is pretty minimal, returning only the number of data streams and the number of indices currently abstracted by a data stream: ``` ... "data_streams" : { "available" : true, "enabled" : true, "data_streams" : 3, "indices_count" : 17 } ... ```	2020-07-13 14:30:11 -06:00
Christos Soulios	3868bcc7b8	[7.x] Histogram integration on Histogram field type (#59431 ) Backports #58930 to 7.x Implements histogram aggregation over histogram fields as requested in #53285.	2020-07-13 19:36:33 +03:00
Dimitris Athanasiou	a7895ff458	[7.x][ML] Remove unused member var from ExtractedFieldsDetector (#59395 ) (#59406 ) Removes member variable `index` from `ExtractedFieldsDetector` as it is not used. Backport of #59395 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-13 19:10:43 +03:00
Igor Motov	1acb4aeba9	EQL: Prepare for release (#59331 ) (#59426 ) Enables eql setting in release builds. Relates #51613	2020-07-13 11:54:32 -04:00
Martijn van Groningen	b1b7bf3912	Make data streams a basic licensed feature. (#59392 ) Backport of #59293 to 7.x branch. * Create new data-stream xpack module. * Move TimestampFieldMapper to the new module, this results in storing a composable index template with data stream definition only to work with default distribution. This way data streams can only be used with default distribution, since a data stream can currently only be created if a matching composable index template exists with a data stream definition. * Renamed `_timestamp` meta field mapper to `_data_stream_timestamp` meta field mapper. * Add logic to put composable index template api to fail if `_data_stream_timestamp` meta field mapper isn't registered. So that a more understandable error is returned when attempting to store a template with data stream definition via the oss distribution. In a follow up the data stream transport and rest actions can be moved to the xpack data-stream module.	2020-07-13 17:26:46 +02:00
Yang Wang	cc9166a5ea	Mute failed 120_api_key_auth test till #59425 is addressed.	2020-07-14 01:10:36 +10:00
Yang Wang	edf27cd765	Adjust BWC versions for API key auth test. API key realm name is not available in authentication metadata prior to v7.5. The issue is tracked at #59425	2020-07-14 00:38:42 +10:00
David Roberts	b5e8250a4e	[ML] Drive categorization warning notifications from annotations (#59393 ) With the introduction of per-partition categorization the old logic for creating a job notification for categorization status "warn" does not work. However, the C++ code is already writing annotations for categorization status "warn" that take into account whether per-partition categorization is being used and which partition(s) the warnings relate to. Therefore, this change alters the Java results processor to create notifications based on the annotations the C++ writes. (It is arguable that we don't need both annotations and notifications, but they show up in different ways in the UI: only annotations are visible in results and only notifications set the warning symbol in the jobs list. This means it's best to have both.) Backport of #59377	2020-07-13 15:28:57 +01:00
David Kyle	054d5236d4	Mute RegressionIT failure (#59414 ) For #59413	2020-07-13 14:12:19 +01:00
Yang Wang	a84469742c	Improve role cache efficiency for API key roles (#58156 ) (#59397 ) This PR ensures that the same roles are cached only once even when they are from different API keys. API key role descriptors and limited role descriptors are now saved in Authentication#metadata as raw bytes instead of deserialised Map<String, Object>. Hashes of these bytes are used as keys for API key roles. Only when the required role is not found in the cache will the bytes be deserialised to build the RoleDescriptors. The deserialisation is directly from raw bytes to RoleDescriptors without going through the current detour of "bytes -> Map -> bytes -> RoleDescriptors".	2020-07-13 22:58:11 +10:00
Dan Hermann	e01d73c737	[7.x] Data stream admin actions are now index-level actions	2020-07-10 14:36:18 -05:00
Dan Hermann	7fa9cf601b	Data stream support for rollup search	2020-07-10 11:13:34 -05:00
Alan Woodward	4b9cbfca64	Remove test backported in error	2020-07-09 21:45:41 +01:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Lisa Cawley	54483394ae	[DOCS] Clarify subscription requirements (#58958 ) (#59307 )	2020-07-09 12:24:45 -07:00
Dan Hermann	c7e977701a	Data stream support for async search	2020-07-09 13:12:04 -05:00
Dan Hermann	b9fb12924b	Data stream support for EQL search	2020-07-09 13:10:44 -05:00
Dimitris Athanasiou	b2243337d8	[7.x][ML] Data frame analytics max_num_threads setting (#59254 ) (#59308 ) This adds a setting to data frame analytics jobs called `max_num_threads`. The setting expects a positive integer. When used the user specifies the max number of threads that may be used by the analysis. Note that the actual number of threads used is limited by the number of processors on the node where the job is assigned. Also, the process may use a couple more threads for operational functionality that is not the analysis itself. This setting may also be updated for a stopped job. More threads may reduce the time it takes to complete the job at the cost of using more CPU. Backport of #59254 and #57274	2020-07-09 19:15:46 +03:00
Costin Leau	d9c1e531db	EQL: Introduce until functionality (#59292 ) Sequences now support until conditional, which prevents a match from occurring if the until matches a document while doing look-ups. Thus a sequence must complete before the until condition matches - if any document within the sequence occurs at, or after, the until hit, the sequence is discarded. (cherry picked from commit 1ba1b9f0661aee655aa48cf9475ac61aaee2bfda)	2020-07-09 17:12:01 +03:00
Dimitris Athanasiou	d07b11b86b	[7.x][ML] Perform test inference on java (#58877 ) (#59298 ) Since we are able to load the inference model and perform inference in java, we no longer need to rely on the analytics process to be performing test inference on the docs that were not used for training. The benefit is that we do not need to send test docs and fit them in memory of the c++ process. Backport of #58877 Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2020-07-09 16:30:49 +03:00
David Kyle	86555ec163	Remove unused function InferenceIndexConstants.mapping() (#59146 ) (#59158 ) InferenceIndexConstants.mapping() is broken and unused.	2020-07-09 14:28:53 +01:00
Andrei Stefan	d187b531ed	EQL: Give a name to all toml tests and enforce the naming of new tests (#59283 ) (#59295 ) (cherry picked from commit c8ffe3c9237d3cdd90331795b8e37517155b7e91)	2020-07-09 16:20:29 +03:00
David Kyle	dbb9c802b1	Better error message when the model cannot be parsed due to its size (#59166 ) (#59209 ) The actual cause can be lost in a long list of parse exceptions; this surfaces the cause when the problem is size.	2020-07-09 13:43:46 +01:00
David Kyle	c5443f78ce	Add Inference Pipeline aggregation to HLRC (#59086 ) (#59250 ) Adds InferencePipelineAggregationBuilder to the HLRC duplicating the server side classes	2020-07-09 13:38:45 +01:00
Daniel Mitterdorfer	10ef4d2140	Mute testMaxRestoreBytesPerSecIsUsed (#59289 ) Relates #59287	2020-07-09 12:52:17 +02:00
Alan Woodward	67a27e2b9d	Add declarative parameters to FieldMappers (#58663 ) The FieldMapper infrastructure currently has a bunch of shared parameters, many of which are only applicable to a subset of the 41 mapper implementations we ship with. Merging, parsing and serialization of these parameters are spread around the class hierarchy, with much repetitive boilerplate code required. It would be much easier to reason about these things if we could declare the parameter set of each FieldMapper directly in the implementing class, and share the parsing, merging and serialization logic instead. This commit is a first effort at introducing a declarative parameter style. It adds a new FieldMapper subclass, ParametrizedFieldMapper, and refactors two mappers, Boolean and Binary, to use it. Parameters are declared on Builder classes, with the declaration including the parameter name, whether or not it is updateable, a default value, how to parse it from mappings, and how to extract it from another mapper at merge time. Builders have a getParameters method, which returns a list of the declared parameters; this is then used for parsing, merging and serialization. Merging is achieved by constructing a new Builder from the existing Mapper, and merging in values from the merging Mapper; conflicts are all caught at this point, and if none exist then a new, merged, Mapper can be built from the Builder. This allows all values on the Mapper to be final. Other mappers can be gradually migrated to this new style, and once they have all been refactored we can merge ParametrizedFieldMapper and FieldMapper entirely.	2020-07-09 11:43:21 +01:00
Daniel Mitterdorfer	daa48329ec	[TEST] Mute FollowerFailOverIT.testFailOverOnFollower (#58659 ) (#59286 ) Relates #58534 Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co>	2020-07-09 12:38:36 +02:00
Albert Zaharovits	2b7456db7f	Improve auditing of API key authentication #58928 1. Add the `apikey.id`, `apikey.name` and `authentication.type` fields to the `access_granted`, `access_denied`, `authentication_success`, and (some) `tampered_request` audit events. The `apikey.id` and `apikey.name` are present only when authn using an API Key. 2. When authn with an API Key, the `user.realm` field now contains the effective realm name of the user that created the key, instead of the synthetic value of `_es_api_key`.	2020-07-09 13:26:18 +03:00

5855 Commits