OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dimitris Athanasiou	86c853a7c2	[7.x][ML] Rename outlier score setting to feature_influence_threshold (#43705 ) (#43734 ) Renames outlier score setting `minimum_score_to_write_feature_influence` to `feature_influence_threshold`.	2019-06-28 13:28:25 +03:00
Dimitris Athanasiou	cab879118d	[7.x][ML] Support multiple source indices for df-analytics (#43702 ) (#43731 ) This commit adds support for multiple source indices. In order to deal with multiple indices having different mappings, it attempts a best-effort approach to merge the mappings assuming there are no conflicts. In case conflicts exists an error will be returned. To allow users creating custom mappings for special use cases, the destination index is now allowed to exist before the analytics job runs. In addition, settings are no longer copied except for the `index.number_of_shards` and `index.number_of_replicas`.	2019-06-28 13:28:03 +03:00
Christoph Büscher	2cc7f5a744	Allow reloading of search time analyzers (#43313 ) Currently changing resources (like dictionaries, synonym files etc...) of search time analyzers is only possible by closing an index, changing the underlying resource (e.g. synonym files) and then re-opening the index for the change to take effect. This PR adds a new API endpoint that allows triggering reloading of certain analysis resources (currently token filters) that will then pick up changes in underlying file resources. To achieve this we introduce a new type of custom analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows swapping out analysis components. Custom analyzers that contain filters that are markes as "updateable" will automatically choose this implementation. This PR also adds this capability to `synonym` token filters for use in search time analyzers. Relates to #29051	2019-06-28 09:55:40 +02:00
Przemysław Witek	94f18da5df	Add version and create_time to data frame analytics config (#43683 ) (#43712 )	2019-06-28 07:37:21 +02:00
Ryan Ernst	5b4089e57e	Remove nodeId from BaseNodeRequest (#43658 ) TransportNodesAction provides a mechanism to easily broadcast a request to many nodes, and collect the respones into a high level response. Each node has its own request type, with a base class of BaseNodeRequest. This base request requires passing the nodeId to which the request will be sent. However, that nodeId is not used anywhere. It is private to the base class, yet serialized to each node, where the node could just as easily find the nodeId of the node it is on locally. This commit removes passing the nodeId through to the node request creation, and guards its serialization so that we can remove the base request class altogether in the future.	2019-06-27 18:45:14 -07:00
Igor Motov	3607876a71	Geo: Makes coordinate validator in libs/geo plugable (#43657 ) Moves coordinate validation from Geometry constructors into parser. Relates #43644	2019-06-27 19:53:41 -04:00
Nhat Nguyen	ce8771feb7	Do not use MockInternalEngine in GatewayIndexStateIT (#43716 ) GatewayIndexStateIT#testRecoverBrokenIndexMetadata replies on the flushing on shutdown. This behaviour, however, can be randomly disabled in MockInternalEngine. Closes #43034	2019-06-27 18:28:04 -04:00
Przemysław Witek	68dbbd8793	Deduplicate two similar TimeUtils classes. (#43697 ) * Deduplicate org.elasticsearch.xpack.core.dataframe.utils.TimeUtils and org.elasticsearch.xpack.core.ml.utils.time.TimeUtils into a common class: org.elasticsearch.xpack.core.common.time.TimeUtils. * Add unit tests for parseTimeField and parseTimeFieldToInstant methods	2019-06-27 18:51:48 +02:00
Yannick Welsch	6744344ef2	Handle situation where only voting-only nodes are bootstrapped (#43628 ) Adds support for the situation where only voting-only nodes are bootstrapped. In that case, they will still try to become elected and bring full master nodes into the cluster.	2019-06-27 18:10:15 +02:00
David Roberts	f39619d182	[ML] Don't write timing stats on no-op (#43680 ) Similar to elastic/ml-cpp#512, if a job opens and closes and does nothing in between we shouldn't write timing stats to the results index.	2019-06-27 16:37:54 +01:00
Jim Ferenczi	329d05f61e	Fix UOE on search requests that match a sparse role query (#43668 ) Search requests executed through the SecurityIndexSearcherWrapper throw an UnsupportedOperationException if they match a sparse role query. When low level cancellation is activated (which is the default since #42857), the context index searcher creates a weight that doesn't handle #scorer. This change fixes this bug and adds a test to ensure that we check this case.	2019-06-27 16:56:56 +02:00
Przemysław Witek	ba518722a2	[7.x] [ML] Tag destination index with data frame metadata (#43567 ) (#43660 )	2019-06-27 08:08:39 +02:00
Benjamin Trent	d05593c3ad	[ML][Data Frame] adds tests for continuous DF (#43601 ) (#43654 )	2019-06-26 14:59:19 -05:00
Benjamin Trent	52e26bbc42	[ML][Data Frame] improve pivot nested field validations (#43548 ) (#43636 ) * [ML][Data Frame] improve pivot nested field validations * addressing pr comments	2019-06-26 13:35:51 -05:00
Armin Braun	c00e305d79	Optimize Selector Wakeups (#43515 ) (#43650 ) * Use atomic boolean to guard wakeups * Don't trigger wakeups from the select loops thread itself for registering and closing channels * Don't needlessly queue writes Co-authored-by: Tim Brooks <tim@uncontended.net>	2019-06-26 20:00:42 +02:00
David Kyle	e1f761dfc7	[Ml Data Frame] Size the GET stats search by number of Ids requested (#43206 ) Set the size of the search request to the number of ids limited by 10,000	2019-06-26 17:01:12 +01:00
Benjamin Trent	c121b00c98	[7.x] [ML][Data Frame] Add support for allow_no_match for endpoints (#43490 ) (#43637 ) * [ML][Data Frame] Add support for allow_no_match for endpoints (#43490) * [ML][Data Frame] Add support for allow_no_match parameter in endpoints Adds support for: * Get Transforms * Get Transforms stats * stop transforms * Update DataFrameTransformDocumentationIT.java	2019-06-26 10:09:56 -05:00
David Roberts	31dc5b7d3a	[TEST] Wait for replicas before stopping nodes in ML distributed test (#43622 ) If we stop a node before replicas exist then the test can fail because we lose a whole index if we stop the node with the primary on.	2019-06-26 11:52:53 +01:00
David Roberts	558e323c89	[ML] Introduce a setting for the process connect timeout (#43234 ) This change introduces a new setting, xpack.ml.process_connect_timeout, to enable the timeout for one of the external ML processes to connect to the ES JVM to be increased. The timeout may need to be increased if many processes are being started simultaneously on the same machine. This is unlikely in clusters with many ML nodes, as we balance the processes across the ML nodes, but can happen in clusters with a single ML node and a high value for xpack.ml.node_concurrent_job_allocations.	2019-06-26 09:22:04 +01:00
Yannick Welsch	2049f715b3	Add voting-only master node (#43410 ) A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst them-selves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made as voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes. Closes #14340 Co-authored-by: David Turner <david.turner@elastic.co>	2019-06-26 08:07:56 +02:00
Yogesh Gaikwad	480453aa24	Make role descriptors optional when creating API keys (#43481 ) (#43614 ) This commit changes the `role_descriptors` field from required to optional when creating API key. The default behavior in .NET ES client is to omit properties with `null` value requiring additional workarounds. The behavior for the API does not change. Field names (`id`, `name`) in the invalidate api keys API documentation have been corrected where they were wrong. Closes #42053	2019-06-26 14:30:51 +10:00
Yogesh Gaikwad	58179af5af	Enable Kerberos tests (#43519 ) (#43612 ) Now that the fix krb5-kdc fixture (entropy problem in docker container) is in and the converting `kerberos-tests` to testclusters is done, enabling the kerberos-tests Closes #40678	2019-06-26 12:55:41 +10:00
Przemysław Witek	76a750a0a0	Remove unused mapStringsOrdered method (#42513 ) (#43585 )	2019-06-25 20:43:38 +02:00
Tanguy Leroux	0dc1c12f13	Fix indices shown in _cat/indices (#43286 ) After two recent changes (#38824 and #33888), the _cat/indices API no longer report information for active recovering indices and non-replicated closed indices. It also misreport replicated closed indices that are potentially not authorized for the user. This commit changes how the cat action works by first using the Get Settings API in order to resolve authorized indices. It then uses the Cluster State, Cluster Health and Indices Stats APIs to retrieve information about the indices. Closes #39933	2019-06-25 20:02:34 +02:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analysis: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Benjamin Trent	970e157eac	[ML][Data Frame] Adjusting error message (#43455 ) (#43580 ) * Adjusting error message * Update TransportPutDataFrameTransformAction.java * Update TransportPutDataFrameTransformAction.java	2019-06-25 10:09:39 -05:00
Przemysław Witek	c702cd7415	[7.x] Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods (#42059 ) (#43575 )	2019-06-25 16:04:54 +02:00
Przemysław Witek	b15e40ffad	Extract TimingStats-related functionality into TimingStatsReporter (#43371 ) (#43557 )	2019-06-25 15:48:39 +02:00
David Roberts	9c285ddbab	[ML] Improve message when native controller cannot connect (#43565 ) The error message if the native controller failed to run (for example due to running Elasticsearch on an unsupported platform) was not easy to understand. This change removes pointless detail from the message and adds some hints about likely causes. Fixes #42341	2019-06-25 12:06:54 +01:00
Lee Hinman	8d9a971259	Revert "Test clusters: convert x-pack qa tests (#43283 )" (#43549 ) This reverts commit `ccaa8c33ba`.	2019-06-24 17:16:29 -06:00
Tim Brooks	38516a4dd5	Move nio ip filter rule to be a channel handler (#43507 ) Currently nio implements ip filtering at the channel context level. This is kind of a hack as the application logic should be implemented at the handler level. This commit moves the ip filtering into a channel handler. This requires adding an indicator to the channel handler to show when a channel should be closed.	2019-06-24 10:03:24 -06:00
Gordon Brown	fac7efba9a	[7.x] Account for node versions during allocation in ILM Shrink (#43300 ) This commit ensures that ILM's Shrink action will take node versions into account when choosing which node to allocate to when shrinking an index. Prior to this change, ILM could pick a node with a lower version than some shards are already allocated to, which causes the new allocation to fail as shards can't be relocated onto a node with a lower version than they are already on. As part of this, when making the decision about which node to allocate to prior to Shrink, all shards in the index are considered, rather than choosing a random shard to consider. Further, the unit tests for the logic that chooses a node to allocate shards to pre-shrink has been improved to validate the behavior in more realistic and varied initial conditions.	2019-06-24 10:02:49 -06:00
Mayya Sharipova	813551e070	Fix eclipse build gradle for vectors project Closes #43496	2019-06-24 09:22:48 -04:00
Martijn van Groningen	101cf384ba	Replace Streamable w/ Writable in AcknowledgedResponse and subclasses (backport 7.x) (#43525 ) This commit replaces usages of Streamable with Writeable for the AcknowledgedResponse and its subclasses, plus associated actions. Note that where possible response fields were made final and default constructors were removed. This is a large PR, but the change is mostly mechanical. Relates to #34389 Backport of #43414	2019-06-24 13:47:37 +02:00
Alpar Torok	ccaa8c33ba	Test clusters: convert x-pack qa tests (#43283 )	2019-06-24 12:20:46 +03:00
Alpar Torok	ea44da6069	Testclusters: conver remaining x-pack (#43335 ) Convert x-pack tests	2019-06-24 12:07:42 +03:00
Benjamin Trent	f4b75d6d14	[7.x] [ML][Data Frame] Add version and create_time to transform config (#43384 ) (#43480 ) * [ML][Data Frame] Add version and create_time to transform config (#43384) * [ML][Data Frame] Add version and create_time to transform config * s/transform_version/version s/Date/Instant * fixing getter/setter for version * adjusting for backport	2019-06-21 09:11:44 -05:00
David Kyle	73221d2265	[ML] Resolve NetworkDisruptionIT (#43441 ) After the network disruption a partition is created, one side of which can form a cluster the other can't. Ensure requests are sent to a node on the correct side of the cluster	2019-06-21 10:24:02 +01:00
Simon Willnauer	424ef4f158	SecurityIndexSearcherWrapper doesn't always carry over caches and similarity (#43436 ) If DocumentLevelSecurity is enabled SecurityIndexSearcherWrapper doesn't carry over the cache, cache policy and similarity from the incoming searcher.	2019-06-21 10:19:10 +02:00
Tim Vernum	059eb55108	Use SecureString for password length validation (#43465 ) This replaces the use of char[] in the password length validation code, with the use of SecureString Although the use of char[] is not in itself problematic, using a SecureString encourages callers to think about the lifetime of the password object and to clear it after use. Backport of: #42884	2019-06-21 17:11:07 +10:00
Armin Braun	21515b9ff1	Fix IpFilteringIntegrationTests (#43019 ) (#43434 ) * Increase timeout to 5s since we saw 500ms+ GC pauses on CI * closes #40689	2019-06-20 22:31:59 +02:00
Yannick Welsch	7f8e1454ab	Advance checkpoints only after persisting ops (#43205 ) Local and global checkpoints currently do not correctly reflect what's persisted to disk. The issue is that the local checkpoint is adapted as soon as an operation is processed (but not fsynced yet). This leaves room for the history below the global checkpoint to still change in case of a crash. As we rely on global checkpoints for CCR as well as operation-based recoveries, this has the risk of shard copies / follower clusters going out of sync. This commit required changing some core classes in the system: - The LocalCheckpointTracker keeps track now not only of the information whether an operation has been processed, but also whether that operation has been persisted to disk. - TranslogWriter now keeps track of the sequence numbers that have not been fsynced yet. Once they are fsynced, TranslogWriter notifies LocalCheckpointTracker of this. - ReplicationTracker now keeps track of the persisted local and persisted global checkpoints of all shard copies when in primary mode. The computed global checkpoint (which represents the minimum of all persisted local checkpoints of all in-sync shard copies), which was previously stored in the checkpoint entry for the local shard copy, has been moved to an extra field. - The periodic global checkpoint sync now also takes async durability into account, where the local checkpoints on shards only advance when the translog is asynchronously fsynced. This means that the previous condition to detect inactivity (max sequence number is equal to global checkpoint) is not sufficient anymore. - The new index closing API does not work when combined with async durability. The shard verification step is now requires an additional pre-flight step to fsync the translog, so that the main verify shard step has the most up-to-date global checkpoint at disposition.	2019-06-20 11:12:38 +02:00
Andrei Stefan	fe0f9055d8	Fix NPE in case of subsequent scrolled requests for a CSV/TSV formatted response (#43365 ) (cherry picked from commit 0ef7bb0f8b07cd0392d37f96ca9360821b19315a)	2019-06-20 11:26:11 +03:00
Jason Tedor	1f1a035def	Remove stale test logging annotations (#43403 ) This commit removes some very old test logging annotations that appeared to be added to investigate test failures that are long since closed. If these are needed, they can be added back on a case-by-case basis with a comment associating them to a test failure.	2019-06-19 22:58:22 -04:00
Lee Hinman	c2bf628a6d	[7.x] Narrow period of Shrink action in which ILM prevents stopping (#43254 ) (#43393 ) * Narrow period of Shrink action in which ILM prevents stopping Prior to this change, we would prevent stopping of ILM if the index was anywhere in the shrink action. This commit changes `IndexLifecycleService` to allow stopping when in any of the innocuous steps during shrink. This changes ILM only to prevent stopping if absolutely necessary. Resolves #43253 * Rename variable for ignore actions -> ignore steps * Fix comment * Factor test out to test all stoppable steps	2019-06-19 16:37:41 -06:00
Benjamin Trent	77ce3260dd	[ML][Data Frame] make response.count be total count of hits (#43241 ) (#43389 ) * [ML][Data Frame] make response.count be total count of hits * addressing line length check * changing response count for filters * adjusting serialization, variable name, and total count logic * making count mandatory for creation	2019-06-19 16:19:06 -05:00
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Christos Soulios	d1637ca476	Backport: Refactor aggregation base classes to remove doEquals() and doHashCode() (#43363 ) This PR is a backport a of #43214 from v8.0.0 A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java). Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods. This PR removes the do versions and just use equals/hashCode ensuring the super when necessary.	2019-06-19 22:31:06 +03:00
Lee Hinman	d81ce9a647	Return 0 for negative "free" and "total" memory reported by the OS (#42725 ) * Return 0 for negative "free" and "total" memory reported by the OS We've had a situation where the MX bean reported negative values for the free memory of the OS, in those rare cases we want to return a value of 0 rather than blowing up later down the pipeline. In the event that there is a serialization or creation error with regard to memory use, this adds asserts so the failure will occur as soon as possible and give us a better location for investigation. Resolves #42157 * Fix test passing in invalid memory value * Fix another test passing in invalid memory value * Also change mem check in MachineLearning.machineMemoryFromStats * Add background documentation for why we prevent negative return values * Clarify comment a bit more	2019-06-19 10:35:48 -06:00
Gordon Brown	23a3471394	Fix randomization in testPerformActionAttrsRequestFails (#43304 ) The randomization in this test would occasionally generate duplicate node attribute keys, causing spurious test failures. This commit adjusts the randomization to not generate duplicate keys and cleans up the data structure used to hold the generated keys.	2019-06-19 10:34:39 -06:00

1 2 3 4 5 ...

3358 Commits