OpenSearch

Commit Graph

Author	SHA1	Message	Date
Costin Leau	3a546f1f51	EQL: Introduce support for sequence maxspan (#58635 ) EQL sequences can specify now a maximum time allowed for their span (computed between the first and the last matching event). (cherry picked from commit 747c3592244192a2e25a092f62aec91a899afc83)	2020-06-29 21:31:00 +03:00
Igor Motov	773f3574a9	Removes debug logging from RestEqlCancellationIT (#58676 ) The test didn't fail since the fix in #58493. So, it's time to remove debug logging and close the issue. Closes #58270	2020-06-29 13:15:01 -04:00
Andrei Stefan	3cb8f54f28	EQL: case sensitivity aware integration testing (#58624 ) (#58672 ) * EQL: case sensitivity aware integration testing (#58624) * Add DataLoader * Rewrite case sensitivity settings: NULL -> run both case sensitive and insensitive tests TRUE -> run case sensitive test only FALSE -> run case insensitive test only * Rename test_queries_supported * Add more toml tests from the Python client Co-authored-by: Ross Wolf <31489089+rw-access@users.noreply.github.com> (cherry picked from commit 34d383421599f060a5c083b40df35f135de49e39)	2020-06-29 18:40:07 +03:00
Tanguy Leroux	73adcf4d44	SparseFileTracker.Gap should keep a reference to the corresponding Range (#58587 ) (#58665 ) SparseFileTracker.Gap can keep a reference to the corresponding range it is about to fill, it does not need to resolve the range each time onSuccess/onProgress/onFailure are called. Relates #58477	2020-06-29 15:24:19 +02:00
Przemysław Witek	3f7c45472e	[7.x] Introduce DataFrameAnalyticsConfig update API (#58302 ) (#58648 )	2020-06-29 10:56:11 +02:00
Yang Wang	61fa7f4d22	Change privilege of enrich stats API to monitor (#52027 ) (#52196 ) The remote_monitoring_user user needs to access the enrich stats API. But the request is denied because the API is categorized under admin. The correct privilege should be monitor.	2020-06-29 10:25:33 +10:00
Dimitris Athanasiou	1817b896c9	[7.x][ML] Add status and increased estimate to memory usage (#58588 ) (#58606 ) Adds parsing of `status` and `memory_reestimate_bytes` to data frame analytics `memory_usage`. When the training surpasses the model memory limit, the status will be set to `hard_limit` and `memory_reestimate_bytes` can be used to update the job's limit in order to restart the job. Backport of #58588	2020-06-28 16:27:26 +03:00
Costin Leau	3c81b91474	EQL: Add Head/Tail pipe support (#58536 ) Introduce pipe support, in particular head and tail (which can also be chained). (cherry picked from commit 4521ca3367147d4d6531cf0ab975d8d705f400ea) (cherry picked from commit d6731d659d012c96b19879d13cfc9e1eaf4745a4)	2020-06-27 09:49:14 +03:00
Benjamin Trent	7a202b149e	Muting analytics tests (#58617 ) (#58618 )	2020-06-26 16:50:59 -04:00
Tanguy Leroux	775fb5d4cf	Allows SparseFileTracker to progressively execute listeners during Gap processing (#58477 ) (#58584 ) Today SparseFileTracker allows waiting for a range to become available before executing a given listener. In the case of searchable snapshots, we'd like to be able to wait for a large range to be filled (i.e., downloaded and written to disk) while being able to execute the listener as soon as a smaller range is available. This pull request is an extract from #58164 which introduces a ProgressListenableActionFuture that is used internally by SparseFileTracker. The progressive listenable future allows registering listeners attached to SparseFileTracker.Gap so that they are executed once the Gap is completed (with success or failure) or as soon as the Gap progress reaches a given progress value. This progress value is defined when the tracker.waitForRange() method is called; this method has been modified to accept a range and another listener's range to operate on. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-26 18:26:20 +02:00
James Baiera	89243857ce	Update precommit to filter out project dependencies (#58189 ) (#58572 ) If a project is pulling in an external org.elasticsearch dependency, the dependency report generation would require a license file for the dependency to be present. This would break precommit because a license was present that it did not feel was warranted. This un-reverts the update to the dependenciesInfo task, as well as the JNA license addition.	2020-06-25 16:33:25 -04:00
Lee Hinman	f732003370	[7.x] Fix negative limiting with fewer PARTIAL snapshots than minimum required (#58563 ) (#58569 ) In SLM retention, when a minimum number of snapshots is required for retention, we prefer to remove the oldest snapshots first. To perform this, we limit one of the streams, in a rare case this can cause: ``` [mynode] error during snapshot retention task java.lang.IllegalArgumentException: -5 at java.util.stream.ReferencePipeline.limit(ReferencePipeline.java:469) ~[?:?] at org.elasticsearch.xpack.core.slm.SnapshotRetentionConfiguration.lambda$getSnapshotDeletionPredicate$6(SnapshotRetentionConfiguration.java:195) ~[?:?] at org.elasticsearch.xpack.slm.SnapshotRetentionTask.snapshotEligibleForDeletion(SnapshotRetentionTask.java:245) ~[?:?] at org.elasticsearch.xpack.slm.SnapshotRetentionTask$1.lambda$onResponse$0(SnapshotRetentionTask.java:163) ~[?:?] at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:176) ~[?:?] at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1624) ~[?:?] at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?] at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?] ``` When certain criteria are met. This commit fixes the negative limiting with `Math.max(0, ...)` and adds a unit test for the behavior. Resolves #58515	2020-06-25 14:16:34 -06:00
Henning Andersen	38be2812b1	Enhance extensible plugin (#58542 ) Rather than let ExtensiblePlugins know extending plugins' classloaders, we now pass along an explicit ExtensionLoader that loads the extensions asked for. Extensions constructed that way can optionally receive their own Plugin instance in the constructor.	2020-06-25 20:37:56 +02:00
Jason Tedor	52ad5842a9	Introduce node.roles setting (#58512 ) Today we have individual settings for configuring node roles such as node.data and node.master. Additionally, roles are pluggable and we have used this to introduce roles such as node.ml and node.voting_only. As the number of roles is growing, managing these becomes harder for the user. For example, to create a master-only node, today a user has to configure: - node.data: false - node.ingest: false - node.remote_cluster_client: false - node.ml: false at a minimum if they are relying on defaults, but also add: - node.master: true - node.transform: false - node.voting_only: false if they want to be explicit. This is also challenging in cases where a user wants to configure a coordinating-only node, which requires disabling all roles, a list which we are adding to, requiring the user to keep checking whether a node has acquired any of these roles. This commit addresses this by adding a list setting node.roles for which a user has explicit control over the list of roles that a node has. If the setting is configured, the node has exactly the roles in the list, and not any additional roles. This means to configure a master-only node, the setting is merely 'node.roles: [master]', and to configure a coordinating-only node, the setting is merely: 'node.roles: []'. With this change we deprecate the existing 'node.*' settings such as 'node.data'.	2020-06-25 14:14:51 -04:00
Igor Motov	20af856abd	[7.x] EQL: Adds an ability to execute an asynchronous EQL search (#58192 ) Adds async support to EQL searches Closes #49638 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-25 14:11:57 -04:00
Benjamin Trent	c7ba79bc19	[7.x] [ML] make waiting for renormalization optional for internally flushing job (#58537 ) (#58553 ) * [ML] make waiting for renormalization optional for internally flushing job (#58537) When flushing, datafeeds only need the guarantee that the latest bucket has been handled. But, in addition to this, the typical call to flush waits for renormalization to complete. For large jobs, this can take a fair bit of time (even longer than a bucket length). This causes unnecessary delays in handling data. This commit adds a new internal only flag that allows datafeeds (and forecasting) to skip waiting on renormalization. closes #58395	2020-06-25 12:26:52 -04:00
Nik Everett	03e6d1b535	Add Variable Width Histogram Aggregation (backport of #42035 ) (#58440 ) Implements a new histogram aggregation called `variable_width_histogram` which dynamically determines bucket intervals based on document groupings. These groups are determined by running a one-pass clustering algorithm on each shard and then reducing each shard's clusters using an agglomerative clustering algorithm. This PR addresses #9572. The shard-level clustering is done in one pass to minimize memory overhead. The algorithm was lightly inspired by [this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches a small number of documents to sample the data and determine initial clusters. Subsequent documents are then placed into one of these clusters, or a new one if they are an outlier. This algorithm is described in more detail in the aggregation's docs. At reduce time, a [hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304) continually merges the closest buckets from all shards (based on their centroids) until the target number of buckets is reached. The final values produced by this aggregation are approximate. Each bucket's min value is used as its key in the histogram. Furthermore, buckets are merged based on their centroids and not their bounds. So it is possible that adjacent buckets will overlap after reduction. Because each bucket's key is its min, this overlap is not shown in the final histogram. However, when such overlap occurs, we set the key of the bucket with the larger centroid to the midpoint between its minimum and the smaller bucket’s maximum: `min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to increase the accuracy of the clustering. Nodes are unable to share centroids during the shard-level clustering phase. In the future, resolving https://github.com/elastic/elasticsearch/issues/50863 would let us solve this issue. It doesn’t make sense for this aggregation to support the `min_doc_count` parameter, since clusters are determined dynamically. The `order` parameter is not supported here to keep this large PR from becoming too complex. Co-authored-by: James Dorfman <jamesdorfman@users.noreply.github.com>	2020-06-25 11:40:47 -04:00
Nik Everett	71adade73a	Return clear error message if aggregation type is invalid (#58255 ) (#58365 ) The main changes are: 1. Catch the `NamedObjectNotFoundException` when parsing aggregation type, and then throw a `ParsingException` with clear error message with hint. 2. Add a unit test method: AggregatorFactoriesTests#testInvalidType(). Closes #58146. Co-authored-by: bellengao <gbl_long@163.com>	2020-06-25 11:08:25 -04:00
Dimitris Athanasiou	c3dfafe0b4	[7.x][ML] Avoid assertion error on empty string feature values for inference (#58541 ) (#58550 ) It is possible for the source document to have an empty string value for a field that is mapped as numeric. We should treat those as missing values and avoid throwing an assertion error. Backport of #58541	2020-06-25 18:07:29 +03:00
Dimitris Athanasiou	5af7071db0	[7.x][ML] Change inference default field name to <dep_var>_prediction… (#58546 ) This changes the default value for the results field of inference applied on models that are trained via a data frame analytics job. Previously, the results field default was `predicted_value`. This commit makes it the same as in the training job itself. The new default field is `<dependent_variable>_prediction`. Apart from making inference consistent with the training job the model came from, it is helpful to preserve the dependent variable name by default as it provides some context to the user that may avoid confusion as to which model results came from. Backport of #58538	2020-06-25 18:03:43 +03:00
Benjamin Trent	add8ff1ad3	[ML] assume data streams are enabled in data stream tests (#58502 ) (#58508 )	2020-06-24 14:14:48 -04:00
Chris Roberson	d5899d1765	[Monitoring] APM mapping update (#46244 ) (#58498 ) * Add acm mapping to APM for beats * Add root mapping for APM * Add sourcemap mapping to APM * Fix missing properties * Fix a second missing properties * Add request property to acm * Remove root and sourcemap per review Co-authored-by: Mike Place <mike.place@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-24 13:26:30 -04:00
Armin Braun	9e4c5d1dde	Cleaner Handling of Snapshot Related null Custom Values in CS (#58382 ) (#58501 ) Add the ability to get a custom value while specifying a default and use it throughout the codebase to get rid of the `null` edge case and shorten the code a little.	2020-06-24 17:24:44 +02:00
Martijn van Groningen	f4fad9c65a	Re-enable data streams yaml tests in bwc mode (#58500 ) Backport of #58403 to 7.x branch.	2020-06-24 16:59:51 +02:00
Hendrik Muhs	c1bbfeddc9	Improve rolling upgrade test setup assertions (#58313 ) wrap test setup and add proper assert messages relates #58282	2020-06-24 16:54:48 +02:00
Andrei Stefan	69f73d948b	EQL: code cleanup and further tests (#58458 ) (#58497 ) Add FunctionPipe tests to all functions. Cleanup functions code. (cherry picked from commit 0f83d5799841fe99d8aeaf46e50dd11aa6bf8a57)	2020-06-24 17:38:56 +03:00
Przemysław Witek	551b8bcd73	[7.x] Use static methods (rather than constants) to obtain .ml-meta and .ml-config index names (#58484 ) (#58490 )	2020-06-24 15:52:45 +02:00
Benjamin Trent	fa88e71532	[ML] unify usages of _all and wildcard <*> (#58460 ) (#58494 )	2020-06-24 09:47:57 -04:00
Luca Cavanna	dbbf2772d8	Mute newly added ml data streams tests (#58492 ) Relates to #58491	2020-06-24 15:11:40 +02:00
markharwood	d5ac3bb87f	Field capabilities - make `keyword` a family of field types (#58315 ) (#58483 ) Introduces a new method on `MappedFieldType` to return a family type name which defaults to the field type. Changes `wildcard` and `constant_keyword` field types to return `keyword` for field capabilities. Relates to #53175	2020-06-24 12:32:14 +01:00
Alan Woodward	d251a482e9	Move MappedFieldType.similarity() to TextSearchInfo (#58439 ) Similarities only apply to a few text-based field types, but are currently set directly on the base MappedFieldType class. This commit moves similarity information into TextSearchInfo, and removes any mentions of it from MappedFieldType or FieldMapper. It was previously possible to include a similarity parameter on a number of field types that would then ignore this information. To make it obvious that this has no effect, setting this parameter on non-text field types now issues a deprecation warning.	2020-06-24 10:00:32 +01:00
Jim Ferenczi	fcd8a432d9	Submit _async search task should cancel children on cancellation (#58332 ) This change allows the submit async search task to cancel children and removes the manual indirection that cancels the search task when the submit task is cancelled. This is now handled by the task cancellation, which can cancel grand-children since #54757.	2020-06-24 09:10:26 +02:00
Larry Gregory	2ca09cddaf	[DOCS] Rename kibana user to kibana_system (#58423 )	2020-06-23 14:25:09 -07:00
Przemysław Witek	4e4ca6ac25	Extract ClientHelper.filterSecurityHeaders method and use it in ML code (#58447 ) (#58459 )	2020-06-23 22:18:39 +02:00
Benjamin Trent	a9b868b7a9	[7.x] [ML] allow data streams to be expanded for analytics and transforms (#58280 ) (#58455 ) This commits allows data streams to be a valid source for analytics and transforms. Data streams are fairly transparent and our `_search` and `_reindex` actions work without error. For `_transforms` the check-pointing works as desired as well. Data streams are effectively treated as an `alias` and the backing index values are stored within checkpointing information.	2020-06-23 14:40:35 -04:00
Benjamin Trent	0cc84d3caf	[ML] wait for yellow state for stats index in tests (#58436 ) (#58456 ) GET inference stats now reads from the .ml-stats index. Our tests should wait for yellow state before attempting to query the index for stat information.	2020-06-23 13:32:24 -04:00
Dimitris Athanasiou	f67fee387b	[7.x][ML] Make regression training set predictable in size (#58331 ) (#58453 ) Unlike `classification`, which is using a cross validation splitter that produces training sets whose size is predictable and equal to `training_percent * class_cardinality`, for regression we have been using a random splitter that takes an independent decision for each document. This means we cannot predict the exact size of the training set. This poses a problem as we move towards performing test inference on the java side as we need to be able to provide an accurate upper bound of the training set size to the c++ process. This commit replaces the random splitter we use for regression with the same streaming-reservoir approach we do for `classification`. Backport of #58331	2020-06-23 19:49:03 +03:00
Marios Trivyzas	e7c40d973e	SQL: Relax parsing of date/time escaped literals (#58336 ) (#58450 ) Improve the usability of the MS-SQL server/ODBC escaped date/time/timestamp literals, by allowing timezone/offset ids in the parsed string, e.g.: ``` {ts '2000-01-01T11:11:11Z'} ``` Closes: #58262 (cherry picked from commit 0af1f2fef805324e802d97d2fd9b4660abb403f0)	2020-06-23 18:05:54 +02:00
David Roberts	0d6bfd0ac3	[7.x][ML] Fix wire serialization for flush acknowledgements (#58443 ) There was a discrepancy in the implementation of flush acknowledgements: most of the class was designed on the basis that the "last finalized bucket time" could be null but the wire serialization assumed that it was never null. This works because, the C++ sends zero "last finalized bucket time" when it is not known or not relevant. But then the Java code will print that to XContent as it is assuming null represents not known or not relevant. This change corrects the discrepancies. Internally within the class null represents not known or not relevant, but this is translated from/to 0 for communications from the C++ and old nodes that have the bug. Additionally I switched from Date to Instant for this class and made the member variables final to modernise it a bit. Backport of #58413	2020-06-23 16:42:06 +01:00
Mark Tozzi	52806a8f89	Small VS config cleanup (#58294 ) (#58442 )	2020-06-23 10:53:06 -04:00
Benjamin Trent	61142a3005	[ML] only log if forecasts are set to failed (#58421 ) (#58437 ) This adjusts the logging level for setting forecasts to failed to WARN. And it will only log if 1 or more forecasts were adjusted to failed.	2020-06-23 10:24:03 -04:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	519d1278e2	Make FieldTypeLookup immutable (#58162 ) (#58411 ) FieldTypeLookup maps field names to their MappedFieldTypes. In the past, due to the presence of multiple mapping types within a single index, this had to be updated in-place because a mapping update might only affect one type. However, now that we only have a single type per index, we can completely rebuild the FieldTypeLookup on each update, removing lots of concurrency worries.	2020-06-23 10:51:32 +01:00
David Roberts	f97b37190b	[ML] Add a new annotation type for categorization status changes (#58394 ) Adds a new value to the "event" enum of ML annotations, namely "categorization_status_change". This will allow users to see when categorization was found to be performing poorly. Once per-partition categorization is available, it will allow users to see when categorization is performing poorly for a specific partition. It does not make sense to reuse the "model_change" event that annotations already have, because categorizer state is separate to model state ("model" state is really anomaly detector state), and is not reverted by the revert model snapshot API. Therefore annotations related to categorization need to be treated differently to annotations related to anomaly detection.	2020-06-23 09:16:27 +01:00
Rene Groeschke	bd2dd81bc6	Fix deprecated property usage in archive tasks (#58269 ) (#58308 )	2020-06-23 09:11:46 +02:00
Martijn van Groningen	7dda9934f9	Keep track of timestamp_field mapping as part of a data stream (#58400 ) Backporting #58096 to 7.x branch. Relates to #53100 * use mapping source directly instead of using mapper service to extract the relevant mapping details * moved assertion to TimestampField class and added helper method for tests * Improved logic that inserts timestamp field mapping into a mapping. If the timestamp field path consisted of object fields and if the final mapping did not contain the parent field then an error occurred, because the prior logic assumed that the object field existed.	2020-06-22 17:46:38 +02:00
Costin Leau	765f1b5775	SQL: Fix bug in resolving aliases against filters (#58399 ) When doing aliasing with the same name over non existing fields, the analyzer gets stuck in a loop trying to resolve the alias over and over leading to SO. This PR breaks the cycle by checking the relationship between the alias and the child it tries to replace as an alias should never replace its child. Fix #57270 Close #57417 Co-authored-by: Hailei <zhh5919@163.com> (cherry picked from commit 46786ff2e1ed5951006ff4bdd2b6ac6a1ebcf17b)	2020-06-22 16:05:42 +03:00
Przemko Robakowski	a44dad9fbb	[7.x] Add support for snapshot and restore to data streams (#57675 ) (#58371 ) * Add support for snapshot and restore to data streams (#57675) This change adds support for including data streams in snapshots. Names are provided in indices field (the same way as in other APIs), wildcards are supported. If rename pattern is specified it renames both data streams and backing indices. It also adds test to make sure SLM works correctly. Closes #57127 Relates to #53100 * version fix * compilation fix * compilation fix * remove unused changes * compilation fix * test fix	2020-06-19 22:41:51 +02:00
Benjamin Trent	bf8641aa15	[7.x] [ML] calculate cache misses for inference and return in stats (#58252 ) (#58363 ) When a local model is constructed, the cache hit miss count is incremented. When a user calls _stats, we will include the sum cache hit miss count across ALL nodes. This statistic is important when comparing against the inference_count. If the cache hit miss count is near the inference_count it indicates that the cache is overburdened, or inappropriately configured.	2020-06-19 09:46:51 -04:00
Stuart Tettemer	20abba8433	Scripting: Deprecate general cache settings (#55753 ) (#58283 ) Backport: ef543b0	2020-06-18 11:54:23 -06:00
Jim Ferenczi	1c1a6d4ec8	Handle failures with no explicit cause in async search (#58319 ) This commit fixes an AOOBE in the handling of fatal failures in _async_search. If the underlying cause is not found, this change uses the root failure. Closes #58311	2020-06-18 18:57:58 +02:00
Przemysław Witek	9dd3d5aa48	[7.x] Delete auto-generated annotations when model snapshot is reverted (#58240 ) (#58335 )	2020-06-18 17:59:52 +02:00
Jason Tedor	be08268562	Allow follower indices to override leader settings (#58103 ) Today when creating a follower index via the put follow API, or via an auto-follow pattern, it is not possible to specify settings overrides for the follower index. Instead, we copy all of the leader index settings to the follower. Yet, there are cases where a user would want some different settings on the follower index such as the number of replicas, or allocation settings. This commit addresses this by allowing the user to specify settings overrides when creating a follower index via manual put follower calls, or via auto-follow patterns. Note that not all settings can be overridden (e.g., index.number_of_shards) so we also have detection that prevents attempting to override settings that must be equal between the leader and follower index. Note that we do not even allow specifying such settings in the overrides, even if they are specified to be equal between the leader and the follower index. Instead, they must be implicitly copied from the leader index, not explicitly set by the user.	2020-06-18 11:56:06 -04:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Marios Trivyzas	50b391e91b	SQL: [Docs] Fix TIME_PARSE documentation (#58182 ) (#58317 ) TIME_PARSE works correctly if both date and time parts are specified, and a TIME object (that contains only time is returned). Adjust docs and add a unit test that validates the behavior. Follows: #55223 (cherry picked from commit 9d6b679a5da88f3c131b9bdba49aa92c6c272abe)	2020-06-18 16:09:13 +02:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Tanguy Leroux	f3b6e41f02	Do not wrap CacheFile reentrant r/w locks with ReleasableLock (#58244 ) Today the read/write locks used internally by CacheFile object are wrapped into a ReleasableLock. This is not strictly required and also prevents usage of the tryLock() methods which we would like to use for early releasing of read operations (#58164).	2020-06-18 11:01:53 +02:00
Andrei Dan	caa5d3abe0	ILM actions check the managed index is not a DS write index (#58239 ) (#58295 ) This changes the actions that would attempt to make the managed index read only to check if the managed index is the write index of a data stream before proceeding. The updated actions are shrink, readonly, freeze and forcemerge. (cherry picked from commit c906f631833fee8628f898917a8613a1f436c6b1) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-18 07:45:11 +01:00
Rene Groeschke	abc72c1a27	Unify dependency licenses task configuration (#58116 ) (#58274 ) - Remove duplicate dependency configuration - Use task avoidance api across the build - Remove redundant licensesCheck config	2020-06-18 08:15:50 +02:00
Lee Hinman	d272646a55	Fix name of template in allowed warning for DS YML test (#58273 ) The warning was present, but had the incorrect template name, leading to a test failure.	2020-06-17 11:23:04 -06:00
David Roberts	3f8d16304c	Add ML admin permissions to the kibana_system role (#58172 ) As part of the "ML in Spaces" project, access to the ML UI in Kibana is migrating to being controlled by Kibana privileges. The ML UI will check whether the logged-in user has permission to do something ML-related using Kibana privileges, and if they do will call the relevant ML Elasticsearch API using the Kibana system user. In order for this to work the kibana_system role needs to have administrative access to ML. Backport of #58061	2020-06-17 17:03:32 +01:00
Benjamin Trent	2de242f80e	[ML] rename EnsembleSizeInfo#inputFieldNameLengths to this.featureNameLengths (#58241 ) (#58253 )	2020-06-17 10:08:55 -04:00
Benjamin Trent	69338b03d7	[ML] expand data_streams when assigning datafeed to node (#58175 ) (#58242 )	2020-06-17 08:34:34 -04:00
Ignacio Vera	2d3d7ab387	mute CentroidCalculatorTests#testPolygonAsPoint (#58249 ) (#58250 )	2020-06-17 14:32:13 +02:00
Jason Tedor	b78b3edeea	Upgrade to JNA 5.5.0 (#58183 ) This commit bumps our JNA dependency from 4.5.1 to 5.5.0, so that we are now on the latest maintained line, and pick up a large collection of bug fixes that have accumulated.	2020-06-17 07:35:08 -04:00
Dimitris Athanasiou	36dbf08d47	[7.x][ML] Improve stability of stratified splitter tests (#58180 ) (#58224 ) The main improvement here is that the total expected count of training rows in the test is calculated as the sum of the training fraction times the cardinality of each class (instead of the training fraction times the total doc count). Also relaxes slightly the error bound on the uniformity test from 0.12 to 0.13. Closes #54122 Backport of #58180	2020-06-17 12:40:21 +03:00
Andrei Dan	e17c51151b	[7.x] ILM: don't take snapshot of a data stream's write index (#58159 ) (#58222 ) We don't allow converting a data stream's writeable index into a searchable snapshot. We are currently preventing swapping a data stream's write index with the restored index. This adds another step that will not proceed with the searchable snapshot action until the managed index is not the write index of a data stream anymore. (cherry picked from commit ccd618ead7cf7f5a74b9fb34524d00024de1479a) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-17 09:45:16 +01:00
Ignacio Vera	7080ba5b05	Check for degenerated lines when calculating the centroid (#58216 )	2020-06-17 09:34:49 +02:00
Przemysław Witek	b22e91cefc	[7.x] Delete auto-generated annotations when job is deleted. (#58169 ) (#58219 )	2020-06-17 09:17:20 +02:00
Lisa Cawley	46d797b1d9	[DOCS] Fixes license management links (#58213 )	2020-06-16 16:49:48 -07:00
Stuart Tettemer	01795d1925	Revert "Scripting: Deprecate general cache settings (#55753 )" (#58201 ) This reverts commit `88e8b34fc2`.	2020-06-16 14:58:18 -06:00
Stuart Tettemer	88e8b34fc2	Scripting: Deprecate general cache settings (#55753 ) Backport: ef543b0	2020-06-16 13:06:59 -06:00
Benjamin Trent	081da09c72	Allow GET <pattern>/_rollup/data to expand data streams (#58173 ) (#58177 )	2020-06-16 14:01:54 -04:00
Benjamin Trent	3309817d18	[ML] fixing tree inference ctor to allow target_type to be optional (#58132 ) (#58165 ) The tree trained model object will set its target_type to be regression by default. This updates the inference object to behave the same way.	2020-06-16 13:29:11 -04:00
Benjamin Trent	6c03d97419	Mute TimeSeriesDataStreamsIT.testSearchableSnapshotAction (#58127 ) (#58181 ) Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-16 12:40:38 -04:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Dan Hermann	7079a3b09f	[7.x] Prohibit freezing the write index of a data stream (#58168 )	2020-06-16 09:37:32 -05:00
Yannick Welsch	1e235a7f55	Fix off-by-one on CCR lease (#58158 ) The leases issued by CCR keep one extra operation around on the leader shards. This is not harmful to the leader cluster, but means that there's potentially one delete that can't be cleaned up.	2020-06-16 14:04:58 +02:00
David Turner	423697f414	Default to zero replicas for searchable snapshots (#57802 ) Today a mounted searchable snapshot defaults to having the same replica configuration as the index that was snapshotted. This commit changes this behaviour so that we default to zero replicas on these indices, but allow the user to override this in the mount request. Relates #50999	2020-06-16 10:12:23 +01:00
Tal Levy	69d5e044af	Add optional description parameter to ingest processors. (#57906 ) (#58152 ) This commit adds an optional field, `description`, to all ingest processors so that users can explain the purpose of the specific processor instance. Closes #56000.	2020-06-15 19:27:57 -07:00
Lisa Cawley	554e60860f	[DOCS] Add token and HTTPS requirements for Kerberos (#57180 ) Co-authored-by: Tim Vernum <tim@adjective.org>	2020-06-15 14:30:13 -07:00
Lee Hinman	d56d2dfb09	[7.x] Scope index templates put during cluster upgrade tests (#58065 ) (#58122 ) This template was added for 7.0 for what I am guessing is a BWC issue related to deprecation warnings. It unfortunately seems to cause failures because templates for these tests are not cleared after the test (because these are upgrade tests). Resolves #56363	2020-06-15 10:47:36 -06:00
markharwood	03dd73dc0d	Fix for wildcard fields that returned ByteRefs not Strings to scripts. (#58060 ) (#58109 ) This needs some reorg of BinaryDV field data classes to allow specialisation of scripted doc values. Moved common logic to a new abstract base class and added a new subclass to return string-based representations to scripts. Closes #58044	2020-06-15 14:52:56 +01:00
Alejandro Fernández Haro	3d0c8da66d	Add monitor and view_index_metadata to the built-in `kibana_system` role (#57755 ) Allows the kibana user to collect data telemetry in a background task by giving the kibana_system built-in role the view_index_metadata and monitor privileges over all indices (*).	2020-06-15 14:40:27 +03:00
Shaunak Kashyap	5e2faad783	Add ILM policy PUT and GET for remote_monitoring_agent built-in role (#57963 ) Without this fix, users who try to use Metricbeat for Stack Monitoring today see the following error repeatedly in their Metricbeat log. Due to this error Metricbeat is unwilling to proceed further and, thus, no Stack Monitoring data is indexed into the Elasticsearch cluster. Co-authored-by: Albert Zaharovits <albert.zaharovits@elastic.co>	2020-06-15 14:35:30 +03:00
Rene Groeschke	01e9126588	Remove deprecated usage of testCompile configuration (#57921 ) (#58083 ) * Remove usage of deprecated testCompile configuration * Replace testCompile usage by testImplementation * Make testImplementation non transitive by default (as we did for testCompile) * Update CONTRIBUTING about using testImplementation for test dependencies * Fail on testCompile configuration usage	2020-06-14 22:30:44 +02:00
Jason Tedor	dcf4131f00	Revert "Add JNA license to SQL CLI dependency licenses" This reverts commit `076b32d4f3`.	2020-06-12 17:04:39 -04:00
Dan Hermann	17f3318732	[7.x] Resolve index API (#58037 )	2020-06-12 15:41:32 -05:00
Jason Tedor	076b32d4f3	Add JNA license to SQL CLI dependency licenses Previously we excluded requiring licenses for dependencies with the group name org.elasticsearch under the assumption that these use the top-level Elasticsearch license. This is not always correct, for example, for the org.elasticsearch:jna dependency as this is merely a wrapper around the upstream JNA project, and that is the license that we should be including. A recent change modified this check from using the group name to checking only if the dependency is a project dependency. This exposed the use of JNA in SQL CLI to this check, but the license for it was not added. This commit addresses this by adding the license. Relates #58015	2020-06-12 16:38:23 -04:00
Benjamin Trent	79c784932f	[ML] allow feature_names to be optional in ensemble inference model (#58059 ) (#58067 ) This has `EnsembleInferenceModel` not parse feature_names from the XContent. Instead, it will rely on `rewriteFeatureIndices` to be called ahead of time. Consequently, protections are made for a fail fast path if `rewriteFeatureIndices` has not been called before `infer`.	2020-06-12 16:33:54 -04:00
Mark Vieira	0ce102a5f4	Fix issue with bwc tests running wrong cluster versions (#58063 ) We were previously configuring BWC testing tasks by matching on task name prefix. This naive approach breaks down when you have versions like 1.0.1 and 1.0.10 since they both share a common prefix. This commit makes the pattern matching more specific so we won't inadvertently spin up the wrong cluster version.	2020-06-12 12:34:15 -07:00
Ignacio Vera	c518670f83	Fix Geo grid aggregation circuit breaker tests (#58028 ) (#58042 ) This commit makes sure we create index with only one shard.	2020-06-12 15:39:27 +02:00
Martijn van Groningen	01d8bb8cfa	Enforce valid field mapping exists for timestamp_field in templates. (#58036 ) Backport of #57741 to 7.x branch. Relates to #53100	2020-06-12 15:24:42 +02:00
David Roberts	93b693527a	[7.x][ML] Add categorizer stats ML result type (#58001 ) This type of result will store stats about how well categorization is performing. When per-partition categorization is in use, separate documents will be written for every partition so that it is possible to see if categorization is working well for some partitions but not others. This PR is a minimal implementation to allow the C++ side changes to be made. More Java side changes related to per-partition categorization will be in followup PRs. However, even in the long term I do not see a major benefit in introducing dedicated APIs for querying categorizer stats. Like forecast request stats the categorizer stats can be read directly from the job's results alias. Backport of #57978	2020-06-12 12:08:07 +01:00
markharwood	2da8e57f59	Search - add range query support to wildcard field (#57881 ) (#57988 ) Backport to add range query support to wildcard field Closes #57816	2020-06-12 11:30:54 +01:00
David Kyle	39020f3900	HLRC for delete expired data by job Id (#57722 ) (#57975 ) High level rest client changes for #57337	2020-06-12 09:44:17 +01:00
Mark Tozzi	36f551bdb4	Make ValuesSourceConfig behave like a config object (#57762 ) (#58012 )	2020-06-11 17:23:55 -04:00
Benjamin Trent	2881995a45	[ML] adding new inference model size estimate handling from native process (#57930 ) (#57999 ) Adds support for reading in `model_size_info` objects. These objects contain numeric values indicating the model definition size and complexity. Additionally, these objects are not stored or serialized to any other node. They are to be used for calculating and storing model metadata. They are much smaller on heap than the true model definition and should help prevent the analytics process from using too much memory. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-11 15:59:23 -04:00
Alan Woodward	16e230dcb8	Update to lucene snapshot e7c625430ed (#57981 ) Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.	2020-06-11 14:51:53 +01:00
David Roberts	54d4f2a623	[ML] Refresh annotations index on job flush and close (#57979 ) Now that annotations are part of the anomaly detection job results the annotations index should be refreshed on flushing and closing the job so that flush and close continue to fulfil their contracts that immediately after returning all results the job generated up to that point are searchable.	2020-06-11 12:29:04 +01:00

5750 Commits