OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-04-07 15:59:51 +00:00

Author	SHA1	Message	Date
Tim Brooks	952cf770ed	Reestablish peer recovery after network errors (#57827 ) Currently a network disruption will fail a peer recovery. This commit adds network errors as retryable actions for the source node. Additionally, it adds sequence numbers to the recovery request to ensure that the requests are idempotent. Additionally it adds a reestablish recovery action. The target node will attempt to reestablish an existing recovery after a network failure. This is necessary to ensure that the retries occurring on the source node provide value in bidirectional failures.	2020-06-08 14:17:52 -06:00
Nhat Nguyen	4ecc7dcca5	Avoid StackOverflowError if write circular reference exception (#54147 ) We should never write a circular reference exception as we will fail a node with StackOverflowError. However, we have one in #53589. I tried but failed to find its location. With this commit, we will avoid StackOverflowError in production and detect circular exceptions in tests. Closes #53589	2020-04-04 13:42:27 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Alan Woodward	3cd4b97618	Remove UnknownNamedObjectException (#53105 ) This was originally thrown from NamedXContentRegistry#parseNamedObject() but that method now throws a NamedObjectNotFoundException, so this is unused.	2020-03-05 10:06:59 +00:00
Henning Andersen	125feecabc	Guess root cause support unwrap (#50525 ) (#50742 ) ElasticsearchException.guessRootCauses would return wrapper exception if inner exception was not an ElasticsearchException. Fixed to never return wrapper exceptions. At least following APIs change root_cause.0.type as a result: _update with bad script _index with bad pipeline Relates #50417	2020-01-08 19:09:14 +01:00
Jason Tedor	36dc544819	Adjust version on ingest processor exception The dedicated ingest processor exception was backported to 7.5. This commit updates the version in the 7.x branch.	2019-11-15 09:35:12 -05:00
Jason Tedor	2bcdcb17cd	Introduce dedicated ingest processor exception (#48810 ) Today we wrap exceptions that occur while executing an ingest processor in an ElasticsearchException. Today, in ExceptionsHelper#unwrapCause we only unwrap causes for exceptions that implement ElasticsearchWrapperException, which the top-level ElasticsearchException does not. Ultimately, this means that any exception that occurs during processor execution does not have its cause unwrapped, and so its status is blanket treated as a 500. This means that while executing a bulk request with an ingest pipeline, document-level failures that occur during a processor will cause the status for that document to be treated as 500. Since that does not give the client any indication that they made a mistake, it means some clients will enter infinite retries, thinking that there is some server-side problem that merely needs to clear. This commit addresses this by introducing a dedicated ingest processor exception, so that its causes can be unwrapped. While we could consider a broader change to unwrap causes for more than just ElasticsearchWrapperExceptions, that is a broad change with unclear implications. Since the problem of reporting 500s on client errors is a user-facing bug, we take the conservative approach for now, and we can revisit the unwrapping in a future change.	2019-11-14 11:04:53 -05:00
Jim Ferenczi	73a09b34b8	Replace SearchContextException with SearchException (#47046 ) This commit removes the SearchContextException in favor of a simpler SearchException that doesn't leak the SearchContext. Relates #46523	2019-09-26 14:21:23 +02:00
Nhat Nguyen	cabff5a7cd	Handle lower retaining seqno retention lease error (#46420 ) We renew the CCR retention lease at a fixed interval, therefore it's possible to have more than one in-flight renewal request at the same time. If requests arrive out of order, then the assertion is violated. Closes #46416 Closes #46013	2019-09-13 08:50:19 -04:00
Andrey Ershov	152ce62c58	Enhanced logging when transport is misconfigured to talk to HTTP port (#45964 ) If a node is misconfigured to talk to a remote node's HTTP port (instead of the transport port), eventually it will receive an HTTP response from the remote node on the transport port (this happens when a node accidentally sends a line-terminating byte in a transport request). If this happens today, it results in an unfriendly log message and a long stack trace. This commit adds a check for whether a malformed response is an HTTP response. In this case, a concise log message appears instead. (cherry picked from commit 911d02b7a9c3ce7fe316360c127a935ca4b11f37)	2019-08-30 13:02:08 +02:00
Jason Tedor	f7ff0aff79	Execute actions under permit in primary mode only (#42241 ) Today when executing an action on a primary shard under permit, we do not enforce that the shard is in primary mode before executing the action. This commit addresses this by wrapping actions to be executed under permit in a check that the shard is in primary mode before executing the action.	2019-05-21 15:54:31 -04:00
Jason Tedor	c7cdd6a46a	Add dedicated retention lease exceptions (#38754 ) When a retention lease already exists on an add retention lease invocation, or a retention lease is not found on a renew retention lease invocation today we throw an illegal argument exception. This puts a burden on the caller to catch that specific exception and parse the message. This commit relieves the burden from the caller by adding dedicated exception types for these situations.	2019-02-12 00:32:09 -05:00
Tal Levy	9923f0fe6a	fix a few versionAdded values in ElasticsearchExceptions (#37877 ) TooManyBucketsException was introduced in v6.2 and SnapshotInProgressException was introduced in v6.7	2019-01-31 08:28:20 -08:00
Martijn van Groningen	5a9dadb3ff	changed versionAdded now that #37767 is backported	2019-01-25 09:18:42 +01:00
Martijn van Groningen	1151f3b3ff	Fail with a dedicated exception if remote connection is missing or connectivity to the remote connection is failing (#37767 ) Relates to #37681	2019-01-25 08:53:18 +01:00
David Turner	5db7ed22a0	Bootstrap a Zen2 cluster once quorum is discovered (#37463 ) Today when bootstrapping a Zen2 cluster we wait for every node in the `initial_master_nodes` setting to be discovered, so that we can map the node names or addresses in the `initial_master_nodes` list to their IDs for inclusion in the initial voting configuration. This means that if any of the expected master-eligible nodes fails to start then bootstrapping will not occur and the cluster will not form. This is not ideal, and we would prefer the cluster to bootstrap even if some of the master-eligible nodes do not start. Safe bootstrapping requires that all pairs of quorums of all initial configurations overlap, and this is particularly troublesome to ensure given that nodes may be concurrently and independently attempting to bootstrap the cluster. The solution is to bootstrap using an initial configuration whose size matches the size of the expected set of master-eligible nodes, but with the unknown IDs replaced by "placeholder" IDs that can never belong to any node. Any quorum of received votes in any of these placeholder-laden initial configurations is also a quorum of the "true" initial set of master-eligible nodes, giving the guarantee that it intersects all other quorums as required. Note that this change means that the initial configuration is not necessarily robust to any node failures. Normally the cluster will form and then auto-reconfigure to a more robust configuration in which the placeholder IDs are replaced by the IDs of genuine nodes as they join the cluster; however if a node fails between bootstrapping and this auto-reconfiguration then the cluster may become unavailable. This we feel to be less likely than a node failing to start at all. This commit also enormously simplifies the cluster bootstrapping process. 
Today, the cluster bootstrapping process involves two (local) transport actions in order to support a flexible bootstrapping API and to make it easily accessible to plugins. However this flexibility is not required for the current design so it is adding a good deal of unnecessary complexity. Here we remove this complexity in favour of a much simpler ClusterBootstrapService implementation that does all the work itself.	2019-01-22 11:03:51 +00:00
Tal Levy	a0c504e4a3	Create specific exception for when snapshots are in progress (#37550 ) delete and close index actions threw IllegalArgumentExceptions when attempting to run against an index that has a snapshot in progress. This change introduces a dedicated SnapshotInProgressException for these scenarios. This is done to explicitly signal to clients that this is the reason the action failed, and it is a retryable error. relates to #37541.	2019-01-17 13:21:12 -08:00
David Turner	ca3f5c1e2e	Cancel GetDiscoveredNodesAction when bootstrapped (#36423 ) Today the `GetDiscoveredNodesAction` waits, possibly indefinitely, to discover enough nodes to bootstrap the cluster. However it is possible that the cluster forms before a node has discovered the expected collection of nodes, in which case the action will wait indefinitely despite the fact that it is no longer required. This commit changes the behaviour so that the action fails once a node receives a cluster state with a nonempty configuration, indicating that the cluster has been successfully bootstrapped and therefore the `GetDiscoveredNodesAction` need wait no longer. Relates #36380 and #36381; reverts 558f4ec27820e1a50660dc1f3437422150339af0.	2018-12-10 17:23:03 +00:00
David Turner	77789a733d	Merge branch 'master' into 2018-11-08-merge-master	2018-11-08 13:38:18 +00:00
Alpar Torok	8a85b2eada	Remove build qualifier from server's Version (#35172 ) With this change, `Version` no longer carries information about the qualifier. We still need a way to show the "display version" that does have both qualifier and snapshot. This is now stored by the build and read from `META-INF`.	2018-11-07 14:01:05 +02:00
David Turner	a2cd8f731e	Merge branch 'master' into zen2	2018-09-11 09:38:10 +02:00
Jim Ferenczi	f4e9729d64	Remove unsupported Version.V_5_* (#32937 ) This change removes the es 5x version constants and their usages.	2018-08-24 09:51:21 +02:00
Yannick Welsch	e122505a91	Zen2: Deterministic MasterService (#32493 ) Increases testability of MasterService and the discovery layer. Changes: - Async publish method - Moved a few interfaces/classes top-level to simplify imports - Deterministic MasterService implementation for tests	2018-08-13 18:03:08 +02:00
Yannick Welsch	384cc5455b	Add core coordination algorithm for cluster state publishing (#32171 ) Adds the core coordination algorithm that guarantees safety for cluster state publishing. This is a straight-forward port of the transition rules from the formal model at https://github.com/elastic/elasticsearch-formal-models/blob/master/ZenWithTerms/tla/ZenWithTerms.tla	2018-07-20 11:17:42 +02:00
olcbean	7d7ead95b2	Add Get Aliases API to the high-level REST client (#28799 ) Given the weirdness of the response returned by the get alias API, we went for a client specific response, which allows us to hold the error message, exception and status returned as part of the response together with aliases. See #30536 . Relates to #27205	2018-06-12 10:26:17 +02:00
Luca Cavanna	31351ab880	High-level client: list tasks failure to not lose nodeId (#31001 ) This commit reworks testing for `ListTasksResponse` so that random fields insertion can be tested and xcontent equivalence can be checked too. Proper exclusions need to be configured, and failures need to be tested separately. This helped finding a little problem, whenever there is a node failure returned, the nodeId was lost as it was never printed out as part of the exception toXContent.	2018-06-01 08:53:24 +02:00
Ryan Ernst	916bf9d26d	Convert server javadoc to html5 (#30279 ) This commit converts the remaining javadocs in :server using html4 to html5. This was mostly converting `tt` to `{@code}`.	2018-05-02 08:08:54 -07:00
Nik Everett	99b98fab18	Core: Pick inner most parse exception as root cause (#30270 ) Just like `ElasticsearchException`, the inner most `XContentParseException` tends to contain the root cause of the exception and should be shown to the user in the `root_cause` field. This effectively undoes most of the changes that #29373 made to the `root_cause` for parsing exceptions. The `type` field still changes from `parse_exception` to `x_content_parse_exception`, but this seems like a fairly safe change. `ElasticsearchWrapperException` looks tempting to implement this but the behavior isn't quite right. `ElasticsearchWrapperExceptions` are entirely unwrapped until the cause no longer `implements ElasticsearchWrapperException` but `XContentParseException` should be unwrapped until its cause is no longer an `XContentParseException` but no further. In other words, `ElasticsearchWrapperException`s are unwrapped one step too far. Closes #30261	2018-05-01 07:44:58 -04:00
Nik Everett	bf05c600c4	REST: Include suppressed exceptions on failures (#29115 ) This modifies xcontent serialization of Exceptions to contain suppressed exceptions. If there are any suppressed exceptions they are included in the exception response by default. The reasoning here is that they are fairly rare but when they exist they almost always add extra useful information. Take, for example, the response when you specify two broken ingest pipelines: ``` { "error" : { "root_cause" : ...snip... "type" : "parse_exception", "reason" : "[field] required property is missing", "header" : { "processor_type" : "set", "property_name" : "field" }, "suppressed" : [ { "type" : "parse_exception", "reason" : "[field] required property is missing", "header" : { "processor_type" : "convert", "property_name" : "field" } } ] }, "status" : 400 } ``` Moreover, when suppressed exceptions come from 500 level errors, they should give us more useful debugging information. Closes #23392	2018-03-19 10:52:50 -04:00
Jason Tedor	6bf742dd1b	Fix EsAbortPolicy to conform to API (#29075 ) The rejected execution handler API says that rejectedExecution(Runnable, ThreadPoolExecutor) throws a RejectedExecutionException if the task must be rejected due to capacity on the executor. We do throw something that smells like a RejectedExecutionException (it is named EsRejectedExecutionException) yet we violate the API because EsRejectedExecutionException is not a RejectedExecutionException. This has caused problems before where we try to catch RejectedExecution when invoking rejectedExecution but this causes EsRejectedExecutionException to go uncaught. This commit addresses this by modifying EsRejectedExecutionException to extend RejectedExecutionException.	2018-03-16 14:34:36 -04:00
Lee Hinman	697f3f1a3b	Factor UnknownNamedObjectException into its own class (#28931 ) * Factor UnknownNamedObjectException into its own class This moves the inner class `UnknownNamedObjectException` from `NamedXContentRegistry` into a top-level class. This is so that `NamedXContentRegistry` doesn't have to depend on StreamInput and StreamOutput. Relates to #28504	2018-03-08 15:32:41 -07:00
Tim Brooks	99f88f15c5	Rename core module to server (#28180 ) This is related to #27933. It renames the core module to server. This is the first step towards introducing an elasticsearch-core jar.	2018-01-11 11:30:43 -07:00

32 Commits