OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-05 20:48:22 +00:00

Author	SHA1	Message	Date
Ryan Ernst	3a2c698ce0	Rename Action to ActionType (#43778 ) Action is a class that encapsulates meta information about an action that allows it to be called remotely, specifically the action name and response type. With recent refactoring, the action class can now be constructed as a static constant, instead of needing to create a subclass. This makes the old pattern of creating a singleton INSTANCE both misnamed and lacking a common placement. This commit renames Action to ActionType, thus allowing the old INSTANCE naming pattern to be TYPE on the transport action itself. ActionType also conveys that this class is also not the action itself, although this change does not rename any concrete classes as those will be removed organically as they are converted to TYPE constants. relates #34389	2019-06-30 22:00:17 -07:00
Dimitris Athanasiou	8f49d01113	[7.x][ML] Rename df-analytics `_id_copy` to `ml__id_copy` (#43754) (#43783) Renames `_id_copy` to `ml__id_copy` as field names starting with underscore are deprecated. The new field name `ml__id_copy` was chosen as an obscure enough field that users won't have in their data. Otherwise, this field is only intended to be used by df-analytics.	2019-06-30 19:37:00 +03:00
Albert Zaharovits	5e17bc5dcc	Consistent Secure Settings #40416 Introduces a new `ConsistentSecureSettingsValidatorService` service that exposes a single public method, namely `allSecureSettingsConsistent`. The method returns `true` if the local node's secure settings (inside the keystore) are equal to the master's, and `false` otherwise. Technically, the local node has to have exactly the same secure settings - setting names should not be missing or in surplus - for all `SecureSetting` instances that are flagged with the newly introduced `Property.Consistent`. It is worth highlighting that the `allSecureSettingsConsistent` is not a consensus view across the cluster, but rather the local node's perspective in relation to the master.	2019-06-29 23:26:17 +03:00
David Roberts	b599c68d23	[ML] Assert that a no-op job creates no results nor state (#43681 ) If a job is opened and then closed and does nothing in between then it should not persist any results or state documents. This change adapts the no-op job test to assert no results in addition to no state, and to log any documents that cause this assertion to fail. Relates elastic/ml-cpp#512 Relates #43680	2019-06-29 14:57:49 +01:00
Ryan Ernst	28ab77a023	Add StreamableResponseAction to aid in deprecation of Streamable (#43770) The Action base class currently works for both Streamable and Writeable response types. This commit introduces StreamableResponseAction, which only the legacy Action implementations that provide newResponse() will extend. This eliminates the need for overriding newResponse() with an UnsupportedOperationException. relates #34389	2019-06-28 21:40:00 -07:00
David Roberts	7951c63b91	[ML] Mark ml-cpp dependency as regularly changing (#43760 ) Since #41817 was merged the ml-cpp zip file for any given version has been cached indefinitely by Gradle. This is problematic, particularly in the case of the master branch where the version 8.0.0-SNAPSHOT will be in use for more than a year. This change tells Gradle that the ml-cpp zip file is a "changing" dependency, and to check whether it has changed every two hours. Two hours is a compromise between checking on every build and annoying developers with slow internet connections and checking rarely causing bug fixes in the ml-cpp code to take a long time to propagate through to elasticsearch PRs that rely on them.	2019-06-28 21:21:18 +01:00
Benjamin Trent	67a3c656c3	[7.x] [ML][Data Frame] removing format support (#43659 ) (#43747 ) * [ML][Data Frame] removing format support (#43659) * Fixing conflicts	2019-06-28 10:02:37 -05:00
Jim Ferenczi	7ca69db83f	Refactor IndexSearcherWrapper to disallow the wrapping of IndexSearcher (#43645) This change removes the ability to wrap an IndexSearcher in plugins. The IndexSearcherWrapper is replaced by an IndexReaderWrapper that allows wrapping the DirectoryReader only. This simplifies the creation of the context IndexSearcher that is used on a per request basis. This change also moves the optimization that was implemented in the security index searcher wrapper to the ContextIndexSearcher that now checks the live docs to determine how the search should be executed. If the underlying live docs is a sparse bit set the searcher will compute the intersection between the query and the live docs instead of checking the live docs on every document that matches the query.	2019-06-28 16:28:02 +02:00
Alpar Torok	d1a4d8866d	Add missing dependencies so we can build in parallel (#43672 )	2019-06-28 16:41:18 +03:00
Dimitris Athanasiou	86c853a7c2	[7.x][ML] Rename outlier score setting to feature_influence_threshold (#43705 ) (#43734 ) Renames outlier score setting `minimum_score_to_write_feature_influence` to `feature_influence_threshold`.	2019-06-28 13:28:25 +03:00
Dimitris Athanasiou	cab879118d	[7.x][ML] Support multiple source indices for df-analytics (#43702) (#43731) This commit adds support for multiple source indices. In order to deal with multiple indices having different mappings, it attempts a best-effort approach to merge the mappings assuming there are no conflicts. If conflicts exist, an error will be returned. To allow users to create custom mappings for special use cases, the destination index is now allowed to exist before the analytics job runs. In addition, settings are no longer copied except for the `index.number_of_shards` and `index.number_of_replicas`.	2019-06-28 13:28:03 +03:00
Christoph Büscher	2cc7f5a744	Allow reloading of search time analyzers (#43313) Currently changing resources (like dictionaries, synonym files etc...) of search time analyzers is only possible by closing an index, changing the underlying resource (e.g. synonym files) and then re-opening the index for the change to take effect. This PR adds a new API endpoint that allows triggering reloading of certain analysis resources (currently token filters) that will then pick up changes in underlying file resources. To achieve this we introduce a new type of custom analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows swapping out analysis components. Custom analyzers that contain filters that are marked as "updateable" will automatically choose this implementation. This PR also adds this capability to `synonym` token filters for use in search time analyzers. Relates to #29051	2019-06-28 09:55:40 +02:00
Przemysław Witek	94f18da5df	Add version and create_time to data frame analytics config (#43683 ) (#43712 )	2019-06-28 07:37:21 +02:00
Ryan Ernst	5b4089e57e	Remove nodeId from BaseNodeRequest (#43658) TransportNodesAction provides a mechanism to easily broadcast a request to many nodes, and collect the responses into a high level response. Each node has its own request type, with a base class of BaseNodeRequest. This base request requires passing the nodeId to which the request will be sent. However, that nodeId is not used anywhere. It is private to the base class, yet serialized to each node, where the node could just as easily find the nodeId of the node it is on locally. This commit removes passing the nodeId through to the node request creation, and guards its serialization so that we can remove the base request class altogether in the future.	2019-06-27 18:45:14 -07:00
Igor Motov	3607876a71	Geo: Makes coordinate validator in libs/geo pluggable (#43657) Moves coordinate validation from Geometry constructors into parser. Relates #43644	2019-06-27 19:53:41 -04:00
Nhat Nguyen	ce8771feb7	Do not use MockInternalEngine in GatewayIndexStateIT (#43716) GatewayIndexStateIT#testRecoverBrokenIndexMetadata relies on flushing on shutdown. This behaviour, however, can be randomly disabled in MockInternalEngine. Closes #43034	2019-06-27 18:28:04 -04:00
Przemysław Witek	68dbbd8793	Deduplicate two similar TimeUtils classes. (#43697 ) * Deduplicate org.elasticsearch.xpack.core.dataframe.utils.TimeUtils and org.elasticsearch.xpack.core.ml.utils.time.TimeUtils into a common class: org.elasticsearch.xpack.core.common.time.TimeUtils. * Add unit tests for parseTimeField and parseTimeFieldToInstant methods	2019-06-27 18:51:48 +02:00
Yannick Welsch	6744344ef2	Handle situation where only voting-only nodes are bootstrapped (#43628 ) Adds support for the situation where only voting-only nodes are bootstrapped. In that case, they will still try to become elected and bring full master nodes into the cluster.	2019-06-27 18:10:15 +02:00
David Roberts	f39619d182	[ML] Don't write timing stats on no-op (#43680 ) Similar to elastic/ml-cpp#512, if a job opens and closes and does nothing in between we shouldn't write timing stats to the results index.	2019-06-27 16:37:54 +01:00
Jim Ferenczi	329d05f61e	Fix UOE on search requests that match a sparse role query (#43668 ) Search requests executed through the SecurityIndexSearcherWrapper throw an UnsupportedOperationException if they match a sparse role query. When low level cancellation is activated (which is the default since #42857), the context index searcher creates a weight that doesn't handle #scorer. This change fixes this bug and adds a test to ensure that we check this case.	2019-06-27 16:56:56 +02:00
Przemysław Witek	ba518722a2	[7.x] [ML] Tag destination index with data frame metadata (#43567 ) (#43660 )	2019-06-27 08:08:39 +02:00
Benjamin Trent	d05593c3ad	[ML][Data Frame] adds tests for continuous DF (#43601 ) (#43654 )	2019-06-26 14:59:19 -05:00
Benjamin Trent	52e26bbc42	[ML][Data Frame] improve pivot nested field validations (#43548 ) (#43636 ) * [ML][Data Frame] improve pivot nested field validations * addressing pr comments	2019-06-26 13:35:51 -05:00
Armin Braun	c00e305d79	Optimize Selector Wakeups (#43515 ) (#43650 ) * Use atomic boolean to guard wakeups * Don't trigger wakeups from the select loops thread itself for registering and closing channels * Don't needlessly queue writes Co-authored-by: Tim Brooks <tim@uncontended.net>	2019-06-26 20:00:42 +02:00
David Kyle	e1f761dfc7	[Ml Data Frame] Size the GET stats search by number of Ids requested (#43206 ) Set the size of the search request to the number of ids limited by 10,000	2019-06-26 17:01:12 +01:00
Benjamin Trent	c121b00c98	[7.x] [ML][Data Frame] Add support for allow_no_match for endpoints (#43490 ) (#43637 ) * [ML][Data Frame] Add support for allow_no_match for endpoints (#43490) * [ML][Data Frame] Add support for allow_no_match parameter in endpoints Adds support for: * Get Transforms * Get Transforms stats * stop transforms * Update DataFrameTransformDocumentationIT.java	2019-06-26 10:09:56 -05:00
David Roberts	31dc5b7d3a	[TEST] Wait for replicas before stopping nodes in ML distributed test (#43622 ) If we stop a node before replicas exist then the test can fail because we lose a whole index if we stop the node with the primary on.	2019-06-26 11:52:53 +01:00
David Roberts	558e323c89	[ML] Introduce a setting for the process connect timeout (#43234 ) This change introduces a new setting, xpack.ml.process_connect_timeout, to enable the timeout for one of the external ML processes to connect to the ES JVM to be increased. The timeout may need to be increased if many processes are being started simultaneously on the same machine. This is unlikely in clusters with many ML nodes, as we balance the processes across the ML nodes, but can happen in clusters with a single ML node and a high value for xpack.ml.node_concurrent_job_allocations.	2019-06-26 09:22:04 +01:00
Yannick Welsch	2049f715b3	Add voting-only master node (#43410) A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst themselves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes. Closes #14340 Co-authored-by: David Turner <david.turner@elastic.co>	2019-06-26 08:07:56 +02:00
Yogesh Gaikwad	480453aa24	Make role descriptors optional when creating API keys (#43481 ) (#43614 ) This commit changes the `role_descriptors` field from required to optional when creating API key. The default behavior in .NET ES client is to omit properties with `null` value requiring additional workarounds. The behavior for the API does not change. Field names (`id`, `name`) in the invalidate api keys API documentation have been corrected where they were wrong. Closes #42053	2019-06-26 14:30:51 +10:00
Przemysław Witek	76a750a0a0	Remove unused mapStringsOrdered method (#42513 ) (#43585 )	2019-06-25 20:43:38 +02:00
Tanguy Leroux	0dc1c12f13	Fix indices shown in _cat/indices (#43286) After two recent changes (#38824 and #33888), the _cat/indices API no longer reports information for active recovering indices and non-replicated closed indices. It also misreports replicated closed indices that are potentially not authorized for the user. This commit changes how the cat action works by first using the Get Settings API in order to resolve authorized indices. It then uses the Cluster State, Cluster Health and Indices Stats APIs to retrieve information about the indices. Closes #39933	2019-06-25 20:02:34 +02:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analysis: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Benjamin Trent	970e157eac	[ML][Data Frame] Adjusting error message (#43455 ) (#43580 ) * Adjusting error message * Update TransportPutDataFrameTransformAction.java * Update TransportPutDataFrameTransformAction.java	2019-06-25 10:09:39 -05:00
Przemysław Witek	c702cd7415	[7.x] Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods (#42059 ) (#43575 )	2019-06-25 16:04:54 +02:00
Przemysław Witek	b15e40ffad	Extract TimingStats-related functionality into TimingStatsReporter (#43371 ) (#43557 )	2019-06-25 15:48:39 +02:00
David Roberts	9c285ddbab	[ML] Improve message when native controller cannot connect (#43565 ) The error message if the native controller failed to run (for example due to running Elasticsearch on an unsupported platform) was not easy to understand. This change removes pointless detail from the message and adds some hints about likely causes. Fixes #42341	2019-06-25 12:06:54 +01:00
Tim Brooks	38516a4dd5	Move nio ip filter rule to be a channel handler (#43507 ) Currently nio implements ip filtering at the channel context level. This is kind of a hack as the application logic should be implemented at the handler level. This commit moves the ip filtering into a channel handler. This requires adding an indicator to the channel handler to show when a channel should be closed.	2019-06-24 10:03:24 -06:00
Gordon Brown	fac7efba9a	[7.x] Account for node versions during allocation in ILM Shrink (#43300) This commit ensures that ILM's Shrink action will take node versions into account when choosing which node to allocate to when shrinking an index. Prior to this change, ILM could pick a node with a lower version than some shards are already allocated to, which causes the new allocation to fail as shards can't be relocated onto a node with a lower version than they are already on. As part of this, when making the decision about which node to allocate to prior to Shrink, all shards in the index are considered, rather than choosing a random shard to consider. Further, the unit tests for the logic that chooses a node to allocate shards to pre-shrink have been improved to validate the behavior in more realistic and varied initial conditions.	2019-06-24 10:02:49 -06:00
Mayya Sharipova	813551e070	Fix eclipse build gradle for vectors project Closes #43496	2019-06-24 09:22:48 -04:00
Martijn van Groningen	101cf384ba	Replace Streamable w/ Writable in AcknowledgedResponse and subclasses (backport 7.x) (#43525 ) This commit replaces usages of Streamable with Writeable for the AcknowledgedResponse and its subclasses, plus associated actions. Note that where possible response fields were made final and default constructors were removed. This is a large PR, but the change is mostly mechanical. Relates to #34389 Backport of #43414	2019-06-24 13:47:37 +02:00
Alpar Torok	ea44da6069	Testclusters: convert remaining x-pack (#43335) Convert x-pack tests	2019-06-24 12:07:42 +03:00
Benjamin Trent	f4b75d6d14	[7.x] [ML][Data Frame] Add version and create_time to transform config (#43384 ) (#43480 ) * [ML][Data Frame] Add version and create_time to transform config (#43384) * [ML][Data Frame] Add version and create_time to transform config * s/transform_version/version s/Date/Instant * fixing getter/setter for version * adjusting for backport	2019-06-21 09:11:44 -05:00
David Kyle	73221d2265	[ML] Resolve NetworkDisruptionIT (#43441 ) After the network disruption a partition is created, one side of which can form a cluster the other can't. Ensure requests are sent to a node on the correct side of the cluster	2019-06-21 10:24:02 +01:00
Simon Willnauer	424ef4f158	SecurityIndexSearcherWrapper doesn't always carry over caches and similarity (#43436 ) If DocumentLevelSecurity is enabled SecurityIndexSearcherWrapper doesn't carry over the cache, cache policy and similarity from the incoming searcher.	2019-06-21 10:19:10 +02:00
Tim Vernum	059eb55108	Use SecureString for password length validation (#43465 ) This replaces the use of char[] in the password length validation code, with the use of SecureString Although the use of char[] is not in itself problematic, using a SecureString encourages callers to think about the lifetime of the password object and to clear it after use. Backport of: #42884	2019-06-21 17:11:07 +10:00
Armin Braun	21515b9ff1	Fix IpFilteringIntegrationTests (#43019 ) (#43434 ) * Increase timeout to 5s since we saw 500ms+ GC pauses on CI * closes #40689	2019-06-20 22:31:59 +02:00
Yannick Welsch	7f8e1454ab	Advance checkpoints only after persisting ops (#43205) Local and global checkpoints currently do not correctly reflect what's persisted to disk. The issue is that the local checkpoint is adapted as soon as an operation is processed (but not fsynced yet). This leaves room for the history below the global checkpoint to still change in case of a crash. As we rely on global checkpoints for CCR as well as operation-based recoveries, this has the risk of shard copies / follower clusters going out of sync. This commit required changing some core classes in the system: - The LocalCheckpointTracker keeps track now not only of the information whether an operation has been processed, but also whether that operation has been persisted to disk. - TranslogWriter now keeps track of the sequence numbers that have not been fsynced yet. Once they are fsynced, TranslogWriter notifies LocalCheckpointTracker of this. - ReplicationTracker now keeps track of the persisted local and persisted global checkpoints of all shard copies when in primary mode. The computed global checkpoint (which represents the minimum of all persisted local checkpoints of all in-sync shard copies), which was previously stored in the checkpoint entry for the local shard copy, has been moved to an extra field. - The periodic global checkpoint sync now also takes async durability into account, where the local checkpoints on shards only advance when the translog is asynchronously fsynced. This means that the previous condition to detect inactivity (max sequence number is equal to global checkpoint) is not sufficient anymore. - The new index closing API does not work when combined with async durability. The shard verification step now requires an additional pre-flight step to fsync the translog, so that the main verify shard step has the most up-to-date global checkpoint at disposition.	2019-06-20 11:12:38 +02:00
Andrei Stefan	fe0f9055d8	Fix NPE in case of subsequent scrolled requests for a CSV/TSV formatted response (#43365 ) (cherry picked from commit 0ef7bb0f8b07cd0392d37f96ca9360821b19315a)	2019-06-20 11:26:11 +03:00
Jason Tedor	1f1a035def	Remove stale test logging annotations (#43403 ) This commit removes some very old test logging annotations that appeared to be added to investigate test failures that are long since closed. If these are needed, they can be added back on a case-by-case basis with a comment associating them to a test failure.	2019-06-19 22:58:22 -04:00

2921 Commits