OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tanguy Leroux	3ae51f25dd	Move testSnapshotWithLargeSegmentFiles to ESMockAPIBasedRepositoryIntegTestCase (#46802 ) This commit moves the common test testSnapshotWithLargeSegmentFiles to the ESMockAPIBasedRepositoryIntegTestCase base class.	2019-09-18 15:41:30 +02:00
Tanguy Leroux	fd42358a6d	Add support for Multipart upload to S3 repository integration tests (#46704 ) This commit adds support for Multipart upload to the internal HTTP server used in S3 repository integration tests.	2019-09-18 09:40:25 +02:00
David Turner	65dc888623	Resume partial download from S3 on connection drop (#46589 ) Today if the connection to S3 times out or drops after starting to download an object then the SDK does not attempt to recover or resume the download, causing the restore of the whole shard to fail and retry. This commit allows Elasticsearch to detect such a mid-stream failure and to resume the download from where it failed.	2019-09-17 13:11:36 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
Tanguy Leroux	88bed09119	Mutualize code in cloud-based repository integration tests (#46483 ) This commit factors out some common code between the cloud-based repository integration tests that were recently improved. Relates #46376	2019-09-09 16:02:14 +02:00
David Turner	cc092b1be1	Add support for OneZoneInfrequentAccess storage (#46436 ) The `repository-s3` plugin has supported a storage class of `onezone_ia` since the SDK upgrade in #30723, but we do not test or document this fact. This commit adds this storage class to the docs and adds a test to ensure that the documented storage classes are all accepted by S3 too. Fixes #30474	2019-09-09 07:54:44 +01:00
Tanguy Leroux	2290865559	Fix usage of randomIntBetween() in testWriteBlobWithRetries (#46380 ) This commit fixes the usage of randomIntBetween() in the test testWriteBlobWithRetries, when the test generates a random array of a single byte.	2019-09-06 09:10:38 +02:00
Tanguy Leroux	bd7a04cd55	Disable request throttling in S3BlobStoreRepositoryTests (#46226 ) When some high values are randomly picked up - for example the number of indices to snapshot or the number of snapshots to create - the tests in S3BlobStoreRepositoryTests can generate a high number of requests to the internal S3 server. In order to test the retry logic of the S3 client, the internal server is designed to randomly generate random server errors. When many requests are made, it is possible that the S3 client reaches its maximum number of successive retries capacity. Then the S3 client will stop retrying requests until enough retry attempts succeed, but it means that any request could fail before reaching the max retries count and make the test fail too. Closes #46217 Closes #46218 Closes #46219	2019-09-02 16:44:43 +02:00
Henning Andersen	d68e05aade	Mute 2 tests in S3BlobStoreRepositoryTests (#46221 ) Muted testSnapshotAndRestore and testMultipleSnapshotAndRollback Relates #46218 and #46219	2019-09-02 10:38:03 +02:00
Tanguy Leroux	0c1b263e8d	Inject random errors in S3BlobStoreRepositoryTests (#46125 ) This commit modifies the HTTP server used in S3BlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the SDK client. It is now possible to verify that the repository tests are successfully completed even if one or more errors were returned by the S3 service in response of a blob upload, a blob deletion or a object listing request etc. Because injecting errors forces the SDK client to retry requests, the test limits the maximum errors to send in response for each request at 3 retries.	2019-08-30 11:58:09 +02:00
Tanguy Leroux	b526309fbd	Replace MockAmazonS3 usage in S3BlobStoreRepositoryTests by a HTTP server (#46081 ) This commit removes the usage of MockAmazonS3 in S3BlobStoreRepositoryTests and replaces it by a HttpServer that emulates the S3 service. This allows the repository tests to use the real Amazon's S3 client under the hood in tests and will allow to test the behavior of the snapshot/restore feature for S3 repositories by simulating random server-side internal errors. The HTTP server used to emulate the S3 service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the AmazonS3Fixture.	2019-08-29 13:16:59 +02:00
Tanguy Leroux	9e14ffa8be	Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068 )	2019-08-28 16:29:46 +02:00
Jason Tedor	3d64605075	Remove node settings from blob store repositories (#45991 ) This commit starts from the simple premise that the use of node settings in blob store repositories is a mistake. Here we see that the node settings are used to get default settings for store and restore throttle rates. Yet, since there are not any node settings registered to this effect, there can never be a default setting to fall back to there, and so we always end up falling back to the default rate. Since this was the only use of node settings in blob store repository, we move them. From this, several places fall out where we were chaining settings through only to get them to the blob store repository, so we clean these up as well. That leaves us with the changeset in this commit.	2019-08-26 16:26:13 -04:00
Tanguy Leroux	a3d918bddb	Refactor RepositoryCredentialsTests (#45919 ) This commit refactors the S3 credentials tests in RepositoryCredentialsTests so that it now uses a single node (ESSingleNodeTestCase) to test how secure/insecure credentials are overriding each other. Using a single node makes it much easier to understand what each test is actually testing and IMO better reflect how things are initialized. It also allows to fold into this class the test testInsecureRepositoryCredentials which was wrongly located in S3BlobStoreRepositoryTests. By moving this test away, the S3BlobStoreRepositoryTests class does not need the allow_insecure_settings option anymore and thus can be executed as part of the usual gradle test task.	2019-08-26 15:14:43 +02:00
Tanguy Leroux	aee92d573c	Allow partial request body reads in AWS S3 retries tests (#45847 ) This commit changes the tests added in #45383 so that the fixture that emulates the S3 service now sometimes consumes all the request body before sending an error, sometimes consumes only a part of the request body and sometimes consumes nothing. The idea here is to beef up a bit the tests that writes blob because the client's retry logic relies on marking and resetting the blob's input stream. This pull request also changes the testWriteBlobWithRetries() so that it (rarely) tests with a large blob (up to 1mb), which is more than the client's default read limit on input streams (131Kb). Finally, it optimizes the ZeroInputStream so that it is a bit more effective (now works using an internal buffer and System.arraycopy() primitives).	2019-08-23 13:43:31 +02:00
Tanguy Leroux	57a36eb373	Add tests to check that requests are retried when writing/reading blobs on S3 (#45383 ) This commit adds tests to verify the behavior of the S3BlobContainer and its underlying AWS SDK client when the remote S3 service is responding errors or not responding at all. The expected behavior is that requests are retried multiple times before the client gives up and the S3BlobContainer bubbles up an exception. The test verifies the behavior of BlobContainer.writeBlob() and BlobContainer.readBlob(). In the case of S3 writing a blob can be executed as a single upload or using multipart requests; the test checks both scenario by writing a small then a large blob.	2019-08-22 11:41:40 +02:00
Armin Braun	6aaee8aa0a	Repository Cleanup Endpoint (#43900 ) (#45780 ) * Repository Cleanup Endpoint (#43900) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 17:59:49 +02:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Armin Braun	5d7fafec14	Add Assertion to Ensure Retries in S3BlobContainer (#45224 ) (#45230 ) * We need a `markSupported` input stream to retry uploads * Relates #45153	2019-08-06 16:11:19 +02:00
Armin Braun	548c767b6b	S3 3rd Party Test Goal (#44799 ) (#45004 ) * Create S3 Third Party Test Task that Covers the S3 CLI Tool * Adjust snapshot cli test tool tests to work with real S3 * Build adjustment * Clean up repo path before testing * Dedup the logic for asserting path contents by using the correct utility method here that somehow became unused	2019-07-30 17:16:41 +02:00
Armin Braun	07cf2cb665	Add disable_chunked_encoding Setting to S3 Repo (#44052 ) (#44562 ) * Add disable_chunked_encoding setting to S3 repo plugin to support S3 implementations that don't support chunked encoding	2019-07-18 16:57:56 +02:00
Armin Braun	65fcaecce1	Remove Minio Host Hack in S3 Repository Build (#44491 ) (#44497 ) * Resolving the todo to clean this hackyness up	2019-07-17 19:59:00 +02:00
Armin Braun	c8db0e9b7e	Remove blobExists Method from BlobContainer (#44472 ) (#44475 ) * We only use this method in one place in production code and can replace that with a read -> remove it to simplify the interface * Keep it as an implementation detail in the Azure repository	2019-07-17 11:56:02 +02:00
Armin Braun	940aa71930	Cleanup S3 BlobContainer Listing Logic (#43088 ) (#44406 ) * Cleanup duplication in creating and looping over IO Requests	2019-07-16 12:19:20 +02:00
Mark Vieira	7c2e4b2857	[Backport] Enable caching of rest tests which use integ-test distribution (#44181 )	2019-07-10 15:42:28 -07:00
Alpar Torok	bde5802ad6	Test fixtures improovements (#43956 ) * Test fixtures improovements Don't disable some of the precommit tasks on fixtures. This no longer makes sense now that a project can both produce and use a fixture. In order for this to be possible, had to add an additional configuration to make JarHell class accessible to the task even if it's not a dependency of the project and fix some of the third party audit fallout from #43671 which wasn't detected at the time due to the issue being fixed here. Closes #43918	2019-07-10 21:21:06 +03:00
Alpar Torok	0c8294e633	Make sure the clean task doesn't break test fixtures (#43641 ) Use a dedicated fixture dir.	2019-07-08 17:58:27 +03:00
Armin Braun	af9b98e81c	Recursively Delete Unreferenced Index Directories (#42189 ) (#44051 ) * Use ability to list child "folders" in the blob store to implement recursive delete on all stale index folders when cleaning up instead of using the diff between two `RepositoryData` instances to cover aborted deletes * Runs after every delete operation * Relates #13159 (fixing most of the issues caused by unreferenced indices, leaving some meta files to be cleaned up only)	2019-07-08 10:55:39 +02:00
Armin Braun	2176d09c37	Provide an Option to Use Path-Style-Access with S3 Repo (#41966 ) (#44046 ) * Provide an Option to Use Path-Style-Access with S3 Repo * As discussed, added the option to use path style access back again and deprecated it. * Defaulted to `false` * Added warning to docs * Closes #41816	2019-07-08 08:10:01 +02:00
Armin Braun	be20fb80e4	Recursive Delete on BlobContainer (#43281 ) (#43920 ) This is a prerequisite of #42189: * Add directory delete method to blob container specific to each implementation: * Some notes on the implementations: * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting. * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block. * HDFS and FS are trivial since we have directory delete methods available for them * Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)	2019-07-03 17:14:57 +02:00
Armin Braun	455b12a4fb	Add Ability to List Child Containers to BlobContainer (#42653 ) (#43903 ) * Add Ability to List Child Containers to BlobContainer (#42653) * Add Ability to List Child Containers to BlobContainer * This is a prerequisite of #42189	2019-07-03 11:30:49 +02:00
Armin Braun	cd4f81e15e	Remove Unused AWS KMS Dependency (#43671 ) (#43679 ) * We don't make use of KMS at the moment, no need to have this dependency here	2019-06-27 16:51:11 +02:00
Armin Braun	b7322812e0	Upgrade AWS SDK to Latest Version (#42708 ) (#43422 ) * Just staying up to date on the SDK version * Use `AbstractAmazonEC2` to shorten code	2019-06-20 16:43:42 +02:00
Yannick Welsch	e5a4a2272b	Wipe repositories more often (#42511 ) Fixes an issue where repositories are unintentionally shared among tests (given that the repo contents is captured in a static variable on the test class, to allow "sharing" among nodes) and two tests randomly chose the same snapshot name, leading to a conflict. Closes #42519	2019-06-12 11:58:38 +02:00
Alpar Torok	9def454ea9	Clean up configuration when docker isn't available (#42745 ) We initially added `requireDocker` for a way for tasks to say that they absolutely must have it, like the build docker image tasks. Projects using the test fixtures plugin are not in this both, as the intent with these is that they will be skipped if docker and docker-compose is not available. Before this change we were lenient, the docker image build would succeed but produce nothing. The implementation was also confusing as it was not immediately obvious this was the case due to all the indirection in the code. The reason we have this leniency is that when we added the docker image build, docker was a fairly new requirement for us, and we didn't have it deployed in CI widely enough nor had CI configured to prefer workers with docker when possible. We are in a much better position now. The other reason was other stack teams running `./gradlew assemble` in their respective CI and the possibility of breaking them if docker is not installed. We have been advocating for building specific distros for some time now and I will also send out an additional notice The PR also removes the use of `requireDocker` from tests that actually use test fixtures and are ok without it, and fixes a bug in test fixtures that would cause incorrect configuration and allow some tasks to run when docker was not available and they shouldn't have. Closes #42680 and #42829 see also #42719	2019-06-10 13:44:15 +03:00
Jason Tedor	371cb9a8ce	Remove Log4j 1.2 API as a dependency (#42702 ) We had this as a dependency for legacy dependencies that still needed the Log4j 1.2 API. This appears to no longer be necessary, so this commit removes this artifact as a dependency. To remove this dependency, we had to fix a few places where we were accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since both APIs were on the compile-time classpath). Finally, we can remove our custom Netty logger factory. This was needed when we were on Log4j 1.2 and handled logging in our own unique way. When we migrated to Log4j 2 we could have dropped this dependency. However, even then Netty would still pick up Log4j 1.2 since it was on the classpath, thus the advantage to removing this as a dependency now.	2019-05-30 16:08:07 -04:00
Mark Vieira	c1816354ed	[Backport] Improve build configuration time (#42674 )	2019-05-30 10:29:42 -07:00
Armin Braun	116b050cc6	Cleanup Bulk Delete Exception Logging (#41693 ) (#42606 ) * Cleanup Bulk Delete Exception Logging * Follow up to #41368 * Collect all failed blob deletes and add them to the exception message * Remove logging of blob name list from caller exception logging	2019-05-28 11:00:28 +02:00
Armin Braun	44bf784fe1	Add Infrastructure to Run 3rd Party Repository Tests (#42586 ) (#42604 ) * Add Infrastructure to Run 3rd Party Repository Tests * Add infrastructure to run third party repository tests using our standard JUnit infrastructure * This is a prerequisite of #42189	2019-05-28 10:46:22 +02:00
Armin Braun	c4f44024af	Remove Delete Method from BlobStore (#41619 ) (#42574 ) * Remove Delete Method from BlobStore (#41619) * The delete method on the blob store was used almost nowhere and just duplicates the delete method on the blob containers * The fact that it provided for some recursive delete logic (that did not behave the same way on all implementations) was not used and not properly tested either	2019-05-27 12:24:20 +02:00
Armin Braun	aad33121d8	Async Snapshot Repository Deletes (#40144 ) (#41571 ) Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete. * Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread pool * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable. * See https://github.com/elastic/elasticsearch/pull/39656#issuecomment-470492106 * Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore). * By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657	2019-04-26 15:36:09 +02:00
Armin Braun	23b3741618	Remove Exists Check from S3 Repository Deletes (#40931 ) (#41534 ) * The check doesn't add much if anything practically, since the S3 repository is eventually consistent and we only log the non-existence of a blob anyway * We don't do the check on writes for this very reason and documented it as such * Removing the check saves one API call per single delete speeding up the deletion process and lowering costs	2019-04-25 18:25:03 +02:00
Armin Braun	c4e84e2b34	Add Bulk Delete Api to BlobStore (#40322 ) (#41253 ) * Adds Bulk delete API to blob container * Implement bulk delete API for S3 * Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs * Closes #40250	2019-04-16 17:19:05 +02:00
Mark Vieira	1287c7d91f	[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978 ) (#40993 ) * Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions. (cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6) * Fix forking JVM runner * Don't bump shadow plugin version	2019-04-09 11:52:50 -07:00
Jay Modi	f34663282c	Update apache httpclient to version 4.5.8 (#40875 ) This change updates our version of httpclient to version 4.5.8, which contains the fix for HTTPCLIENT-1968, which is a bug where the client started re-writing paths that contained encoded reserved characters with their unreserved form.	2019-04-05 13:48:10 -06:00
Alpar Torok	35d96c22c0	Fix 3rd pary S3 tests (#40588 ) * Fix 3rd pary S3 tests This is already excluded on line 186, by doing this again here, the other exclusions from around that line are removed causing the tests to fail. * Fix blacklisting with the fixture	2019-03-29 08:04:16 +02:00
Alpar Torok	524e0273ae	Testclusters: convert plugin repository-s3 (#40399 ) * Add support for setting and keystore settings * system properties and env var config * use testclusters for repository-s3 * Some cleanup of the build.gradle file for plugin-s3 * add runner {} to rest integ test task	2019-03-27 08:40:16 +02:00
Armin Braun	65732d707f	Add Support for S3 Intelligent Tiering (#39376 ) (#39620 ) * Add support for S3 intelligent tiering * Closes #38836	2019-03-04 10:32:37 +01:00
Jason Tedor	224600f370	Bump jackson-databind version for AWS SDK (#39183 ) This commit bumps the jackson-databind version for discovery-ec2 and repository-s3 to 2.8.11.3.	2019-02-20 13:04:50 -05:00
Henning Andersen	00a26b9dd2	Blob store compression fix (#39073 ) Blob store compression was not enabled for some of the files in snapshots due to constructor accessing sub-class fields. Fixed to instead accept compress field as constructor param. Also fixed chunk size validation to work. Deprecated repositories.fs.compress setting as well to be able to unify in a future commit.	2019-02-20 09:24:41 +01:00


299 Commits