OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jason Tedor	8f873620ee	Inline global checkpoints Today we rely on background syncs to relay the global checkpoint under the mandate of the primary to its replicas. This means that the global checkpoint on a replica can lag far behind the primary. The commit moves to inlining global checkpoints with replication requests. When a replication operation is performed, the primary will send the latest global checkpoint inline with the replica requests. This keeps the replicas closer in-sync with the primary. However, consider a replication request that is not followed by another replication request for an indefinite period of time. When the replicas respond to the primary with their local checkpoint, the primary will advance its global checkpoint. During this indefinite period of time, the replicas will not be notified of the advanced global checkpoint. This necessitates a need for another sync. To achieve this, we perform a global checkpoint sync when a shard falls idle. Relates #24513	2017-05-09 15:08:11 -04:00
Areek Zillur	4f773e2dbb	Replicate write failures (#23314) * Replicate write failures Currently, when a primary write operation fails after generating a sequence number, the failure is not communicated to the replicas. Ideally, every operation which generates a sequence number on primary should be recorded in all replicas. In this change, a sequence number is associated with write operation failure. When a failure with an assigned sequence number arrives at a replica, the failure cause and sequence number are recorded in the translog and the sequence number is marked as completed via executing `Engine.noOp` on the replica engine. * use zlong to serialize seq_no * Incorporate feedback * track write failures in translog as a noop in primary * Add tests for replicating write failures. Test that document failures (w/ seq no generated) are recorded as no-op in the translog for primary and replica shards * Update to master * update shouldExecuteOnReplica comment * rename indexshard noop to markSeqNoAsNoOp * remove redundant conditional * Consolidate possible replica action for bulk item request depending on its primary execution * remove bulk shard result abstraction * fix failure handling logic for bwc * add more tests * minor fix * cleanup * incorporate feedback * incorporate feedback * add assert to remove handling noop primary response when 5.0 nodes are not supported	2017-04-19 01:23:54 -04:00
Ryan Ernst	212f24aa27	Tests: Clean up rest test file handling (#21392) This change simplifies how the rest test runner finds test files and removes all leniency. Previously multiple prefixes and suffixes would be tried, and tests could exist inside or outside of the classpath, although outside of the classpath never quite worked. Now only classpath tests are supported, and only one resource prefix is supported, `/rest-api-spec/tests`. closes #20240	2017-04-18 15:07:08 -07:00
Ryan Ernst	cc1addeac2	Build: Find bwc version during build (#23801) We currently have the last minor version of the previous major hardcoded in tests like rolling upgrade. This change programmatically finds this during gradle initialization by parsing versions from Version.java.	2017-03-29 12:11:38 -07:00
Ryan Ernst	8c53555b28	Tests: Use local clone build of 5.x with bwc tests (#22946) The current rest backcompat tests, which run against a mixed cluster of 5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has the potential for inconsistency that results in CI failures, and happens quite often, whenever some backcompat logic is added to 5.x, but the bwc test on master fails because the 5.x code has not yet been published as a snapshot. This change creates a git clone of the 5.x branch, builds the zip distribution, and ties that into gradle substitutions for the 5.x version.	2017-03-23 22:32:13 -07:00
javanna	4f487ab1b9	[TEST] randomize request content_type between all of the supported formats	2017-02-27 12:27:03 +01:00
javanna	9a2dba3036	[TEST] add support for binary responses to REST tests infra	2017-02-27 12:27:03 +01:00
Ryan Ernst	175bda64a0	Build: Rework integ test setup and shutdown to ensure stop runs when desired (#23304) Gradle's finalizedBy on tasks only ensures one task runs after another, but not immediately after. This is problematic for our integration tests since it allows multiple projects' integ test clusters to be running simultaneously. While this has not been a problem thus far (gradle 2.13 happened to keep the finalizedBy tasks close enough that no clusters were running in parallel), with gradle 3.3 the task graph generation has changed, and numerous clusters may be running simultaneously, causing memory pressure, and thus generally slower tests, or even failure if the system has a limited amount of memory (e.g. in a vagrant host). This commit reworks how integ tests are configured. It adds an `integTestCluster` extension to gradle which is equivalent to the current `integTest.cluster` and moves the rest test runner task to `integTestRunner`. The `integTest` task is then just a dummy task, which depends on the cluster runner task, as well as the cluster stop task. This means running `integTest` in one project will both run the rest tests, and shut down the cluster, before running `integTest` in another project.	2017-02-22 12:43:15 -08:00
Areek Zillur	148be11f26	Make document write requests immutable (#23038) * Make document write requests immutable Previously, write requests were mutated at the transport level to update request version, version type and sequence no. before replication. Now that all write requests go through the shard bulk transport action, we can use the primary response stored in item level bulk requests to pass the updated version and sequence no. to replicas. * incorporate feedback * minor cleanup * Add bwc test to ensure correct index version propagates to replica * Fix bwc for propagating write operation versions * Add assertion on replica request version type * fix tests using internal version type for replica op * Fix assertions to assert version type in replica and recovery * add bwc tests for version checks in concurrent indexing * incorporate feedback	2017-02-21 17:41:22 -05:00
Ryan Ernst	5db5dec0e7	Update bwc test to 5.4 snapshot	2017-02-17 14:58:20 -08:00
Jay Modi	b234644035	Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146) This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport requests and their usages. While doing this, it turns out that there are many places where *Entity classes are used from the apache http client libraries and many of these usages did not specify the content type. The methods that do not specify a content type explicitly have been added to forbidden apis to prevent more of these from entering our code base. Relates #19388	2017-02-17 14:45:41 -05:00
jaymode	d8d03f45c2	Fix communication with 5.3.0 nodes This commit fixes communication with 5.3.0 nodes to send XContentType to these nodes since #22691 was backported to the 5.3 branch.	2017-02-13 13:15:51 -05:00
Boaz Leskes	95eafe07f6	backwards tests should run against 5.4.0-SNAP	2017-02-13 10:54:08 +02:00
Jason Tedor	6e9940283b	Avoid losing ops in file-based recovery When a primary is relocated from an old node to a new node, it can have ops in its translog that do not have a sequence number assigned. When a file-based recovery is started, this can lead to skipping these ops when replaying the translog due to a bug in the recovery logic. This commit addresses this bug and adds a test in the BWC tests. Relates #22945	2017-02-03 08:11:57 -05:00
Jay Modi	7520a107be	Optionally require a valid content type for all rest requests with content (#22691) This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously. The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type. As part of this, many transport requests and builders were updated to provide methods that accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated. In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header. See #19388	2017-02-02 14:07:13 -05:00
Jason Tedor	930282e161	Introduce sequence-number-based recovery This commit introduces sequence-number-based recovery. When a replica has fallen out of sync, rather than performing a file-based recovery we first attempt to replay operations since the last local checkpoint on the replica. To do this, at the start of recovery the replica tells the primary what its local checkpoint is. The primary will then wait for all operations between that local checkpoint and the current maximum sequence number to complete; this is to ensure that there are no gaps in the operations that will be replayed from the primary to the replica. This is a best-effort attempt as we currently have no guarantees on the primary that these operations will be available; if we are not able to replay all operations in the desired range, we just fall back to file-based recovery. Later work will strengthen the guarantees. Relates #22484	2017-01-27 08:16:38 -08:00
Ali Beyad	b8934945b4	Fixes 5.3.0-SNAPSHOT typo	2017-01-06 20:37:35 -05:00
Ali Beyad	056adbd8cc	Updates backwards compatibility 5.0 tests to pull the latest 5.x version - 5.3.0-SNAPSHOT	2017-01-06 20:36:38 -05:00
javanna	f4aab0138d	introduce ToXContentObject interface `ToXContentObject` extends `ToXContent` without adding new methods to it, while allowing to mark classes that output complete xcontent objects to distinguish them from classes that require starting and ending an anonymous object externally. Ideally ToXContent would be renamed to ToXContentFragment, but that would be a huge change in our codebase, hence we simply document the fact that toXContent outputs fragments with no guarantees that the output is valid per se without an external ancestor. Relates to #16347	2017-01-06 23:31:48 +01:00
Nik Everett	232af512f4	Switch from standalone-test to standalone-rest-test standalone-rest-test doesn't configure unit tests and for these integ test only tests, that is what we want.	2017-01-05 10:55:47 +01:00
Nik Everett	812f63e5ef	Require either BuildPlugin or StandaloneTestBasePlugin to use RestTestPlugin It used to be that RestTestPlugin "came with" StandaloneTestBasePlugin but we'd like to use it with BuildPlugin for the high level rest client.	2017-01-05 10:55:47 +01:00
Nik Everett	f5f2149ff2	Remove much ceremony from parsing client yaml test suites (#22311) * Remove a checked exception, replacing it with `ParsingException`. * Remove all Parser classes for the yaml sections, replacing them with static methods. * Remove `ClientYamlTestFragmentParser`. Isn't used any more. * Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods. I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.	2016-12-22 11:00:34 -05:00
Boaz Leskes	b857b316b6	Add BWC layer to seq no infra and enable BWC tests (#22185) Sequence BWC logic consists of two elements: 1) Wire level BWC using stream versions. 2) A change to the global checkpoint maintenance semantics. For the sequence number infra to work with mixed version clusters, we have to consider the situation where the primary is on an old node and replicas are on new ones (i.e., the replicas will receive operations without seq#) and also the reverse (i.e., the primary sends operations to a replica but the replica can't process the seq# and respond with local checkpoint). A new primary with an old replica is rare because we do not allow a replica to recover from a new primary. However, it can occur if the old primary failed and a new replica was promoted or during primary relocation where the source primary is treated as a replica until the master starts the target. 1) Old Primary & New Replica - this case is easy as it is taken care of by the wire level BWC. All incoming requests will have their seq# set to `UNASSIGNED_SEQ_NO`, which doesn't confuse the local checkpoint logic (keeping it at `NO_OPS_PERFORMED`) 2) New Primary & Old replica - this one is trickier as the global checkpoint service currently takes all in sync replicas into consideration for the global checkpoint calculation. In order to deal with old replicas, we change the semantics to say all new node in sync replicas. That means the replicas on old nodes don't count for the global checkpointing. In this state the seq# infra is not fully operational (you can't search on it, because copies may miss it) but it is maintained on shards that can support it. The old replicas will have to go through a file based recovery at some point and will get the seq# information at that point. There is still an edge case where a new primary fails and an old replica takes over. I'll discuss this one with @ywelsch as I prefer to avoid it completely.
This PR also re-enables the BWC tests which were disabled. As such it had to fix any BWC issue that had crept in. Most notably an issue with the removal of the `timestamp` field in #21670. The commit also includes a fix for the default value of the seq number field in replicated write requests (it was 0 but should be -2), which surfaced some other minor bugs that are fixed as well. Last - I added some debugging tools like more sane node names and forcing replication requests to implement a `toString`	2016-12-19 13:08:24 +01:00
javanna	e6b10ca4db	Restore proper bwcVersion in qa/backwards gradle build file	2016-12-10 21:11:05 +01:00
javanna	6003cbfb64	fix typos in qa/backwards gradle build file	2016-12-10 21:09:49 +01:00
Jason Tedor	b2b7595fa7	Temporarily set BWC version to 6.0.0 for seq. no There is not yet a BWC layer in sequence numbers. This commit sets the BWC version to 6.0.0 for the BWC and rolling upgrade tests until this BWC layer is built.	2016-11-16 09:09:38 -05:00
Jason Tedor	db5f51b839	Revert "disable backwards-5.0 tests as there is no BWC layer in the seq# related components" This reverts commit `db87837c72`.	2016-11-16 09:09:35 -05:00
Boaz Leskes	db87837c72	disable backwards-5.0 tests as there is no BWC layer in the seq# related components	2016-11-16 08:04:08 +00:00
Simon Willnauer	bdc942fa72	Enable 5.x to 6.x BWC tests This commit enables real BWC testing against a 5.1 snapshot. All REST tests plus rolling upgrade test now run against a mixed version cross major version cluster.	2016-11-14 14:26:49 +01:00
Ryan Ernst	7a2c984bcc	Test: Remove multi process support from rest test runner (#21391) At one point in the past when moving out the rest tests from core to their own subproject, we had multiple test classes which evenly split up the tests to run. However, we simplified this and went back to a single test runner to have better reproducibility in tests. This change removes the remnants of that multiplexing support.	2016-11-07 15:07:34 -08:00
Boaz Leskes	68a0d20ccd	increase logging level to DEBUG in backwards-5.0 tests	2016-11-07 22:21:47 +01:00
Simon Willnauer	f319545814	Prepare master branch to be 6.0.0-alpha1	2016-09-08 12:55:30 +02:00
Clinton Gormley	0dd6f63c49	Fixed missing changes in the version bump to alpha6	2016-08-09 18:52:03 +02:00
Nik Everett	9270e8b22b	Rename client yaml test infrastructure This makes it obvious that these tests are for running the client yaml suites. Now that there are other ways of running tests using the REST client against a running cluster we can't go on calling the shared client yaml tests "REST tests". They are rest tests, but they aren't the rest tests.	2016-07-26 13:53:44 -04:00
Nik Everett	a95d4f4ee7	Add Location header and improve REST testing This adds a header that looks like `Location: /test/test/1` to the response for the index/create/update API. The requirement for the header comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative URIs are OK. So we use an absolute path which should resolve to the appropriate location. Closes #19079 This makes large changes to our rest test infrastructure, allowing us to write junit tests that test a running cluster via the rest client. It does this by splitting ESRestTestCase into two classes: * ESRestTestCase is the superclass of all tests that use the rest client to interact with a running cluster. * ESClientYamlSuiteTestCase is the superclass of all tests that use the rest client to run the yaml tests. These tests are shared across all official clients, thus the `ClientYamlSuite` part of the name.	2016-07-25 17:02:40 -04:00
Adrien Grand	4b0d317e63	Bump version to 5.0.0-alpha5.	2016-07-05 14:34:23 +02:00
Ryan Ernst	991c2221a1	Set next version back to alpha4	2016-06-13 09:26:45 -07:00
Simon Willnauer	93c5a9dacd	[TEST] Set BWC version to 5.0.0-SNAP since this is its min compat version	2016-06-01 09:11:08 +02:00
Clinton Gormley	9c9bea9258	Set version to 5.0.0-alpha3 (#18550) * Set version to 5.0.0-alpha3 * Updated version in qa/backwards tests too	2016-05-24 16:46:05 +02:00
Adrien Grand	e88ac11633	Add back Version.V_5_0_0. #18176 This was lost when releasing alpha2 since the version constant got renamed.	2016-05-06 12:30:22 +02:00
Alexander Reelsen	f71eb0b888	Version: Set version to 5.0.0-alpha2	2016-04-26 09:30:26 +02:00
Adrien Grand	82849a787a	Add back the Version.V_5_0_0 constant. #17688	2016-04-13 10:00:37 +02:00
Alexander Reelsen	b2573858b6	Version: Set version to 5.0.0-alpha1 Change version, required a minor fix in the RPM building. In case of an alpha/beta version, the release will contain alpha/beta as the RPM version cannot contain dashes/tildes.	2016-03-24 08:36:08 +01:00
Simon Willnauer	cbaa480c16	[TEST] Let the windows machine be slow as hell	2016-03-15 15:35:44 +01:00
Simon Willnauer	121e7c8ca4	Add infrastructure to run REST tests on a multi-version cluster This change adds the infrastructure to run the rest tests on a multi-node cluster that uses 2 different minor versions of elasticsearch. It doesn't implement any dedicated BWC tests but rather leverages the existing REST tests. Since we don't have a real version to test against, the tests use the current version until the first minor / RC is released to ensure the infrastructure works. Relates to #14406 Closes #17072	2016-03-13 10:52:39 +01:00

45 Commits