OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tanguy Leroux	b1bf05bb89	Add blob container retries tests for Azure SDK client (#47032 ) Similarly to what has been done for S3 and GCS, this commit adds unit tests that verify the retry logic of the Azure SDK client implementation when the remote service returns errors. It only tests the retry logic in case of errors and not in case of timeouts because Azure client timeout options are not exposed as settings.	2019-09-25 09:19:48 +02:00
Armin Braun	00f2e7f627	Update AWS SDK for repository-s3 plugin to support IAM Roles for Service Accounts (#46969 ) (#47004 ) * Update AWS SDK for repository-s3 and discovery-ec2 plugins	2019-09-24 17:15:11 +02:00
Tanguy Leroux	6986d7f968	Add blob container retries tests for Google Cloud Storage (#46968 ) Similarly to what has been done for S3 in #45383, this commit adds unit tests that verify the behavior of the SDK client and blob container implementation for Google Storage when the remote service returns errors. The main purpose was to add an extra test to the specific retry logic for 410-Gone errors added in #45963. Relates #45963	2019-09-24 08:58:24 +02:00
Alpar Torok	5fd7505efc	Testfixtures allow a single service only (#46780 ) This PR adds some restrictions around testfixtures to make sure the same service ( as defiend in docker-compose.yml ) is not shared between multiple projects. Sharing would break running with --parallel. Projects can still share fixtures as long as each has it;s own service within. This is still useful to share some of the setup and configuration code of the fixture. Project now also have to specify a service name when calling useCluster to refer to a specific service. If this is not the case all services will be claimed and the fixture can't be shared. For this reason fixtures have to explicitly specify if they are using themselves ( fixture and tests in the same project ).	2019-09-23 14:13:49 +03:00
Tanguy Leroux	add7148f3b	GCS deleteBlobsIgnoringIfNotExists should catch StorageException (#46832 ) GoogleCloudStorageBlobStore.deleteBlobsIgnoringIfNotExists() does not correctly catch StorageException thrown by batch.submit(). In the case a snapshot is deleted through BlobStoreRepository.deleteSnapshot() a storage exception is not caught (only IOException are) so the deletion is interrupted and indices cannot be cleaned up. The storage exception bubbles up to SnapshotService.deleteSnapshotFromRepository() but the listener that removes the deletion from the cluster state is not executed, leaving the deletion in the cluster state. This bug has been reported in #46772 where batch.submit() threw an exception in the test testIndicesDeletedFromRepository and following tests failed because a snapshot deletion was running. Relates #46772	2019-09-20 10:02:23 +02:00
Tanguy Leroux	3ae51f25dd	Move testSnapshotWithLargeSegmentFiles to ESMockAPIBasedRepositoryIntegTestCase (#46802 ) This commit moves the common test testSnapshotWithLargeSegmentFiles to the ESMockAPIBasedRepositoryIntegTestCase base class.	2019-09-18 15:41:30 +02:00
Tanguy Leroux	799f7def9f	Add block support to AzureBlobStoreRepositoryTests (#46664 ) This commit adds support for Put Block API to the internal HTTP server used in Azure repository integration tests. This allows to test the behavior of the Azure SDK client when the Azure Storage service returns errors when uploading Blob in multiple blocks or when downloading a blob using ranged downloads.	2019-09-18 09:43:08 +02:00
Tanguy Leroux	fd42358a6d	Add support for Multipart upload to S3 repository integration tests (#46704 ) This commit adds support for Multipart upload to the internal HTTP server used in S3 repository integration tests.	2019-09-18 09:40:25 +02:00
Tanguy Leroux	4db37801d0	Add resumable uploads support to GCS repository integration tests (#46562 ) This commit adds support for resumable uploads to the internal HTTP server used in GoogleCloudStorageBlobStoreRepositoryTests. This way we can also test the behavior of the Google's client when the service returns server errors in response to resumable upload requests. The BlobStore implementation for GCS has the choice between 2 methods to upload a blob: resumable and multipart. In the current implementation, the client executes a resumable upload if the blob size is larger than LARGE_BLOB_THRESHOLD_BYTE_SIZE, otherwise it executes a multipart upload. This commit makes this logic overridable in tests, allowing to randomize the decision of using one method or the other. The commit add support for single request resumable uploads and chunked resumable uploads (the blob is uploaded into multiple 2Mb chunks; each chunk being a resumable upload). For this last case, this PR also adds a test testSnapshotWithLargeSegmentFiles which makes it more probable that a chunked resumable upload is executed.	2019-09-18 09:33:05 +02:00
Armin Braun	371c355bca	Retry GCS Resumable Upload on Error 410 (#45963 ) (#46783 ) A resumable upload session can fail on with a 410 error and should be retried in that case. I added retrying twice using resetting of the given `InputStream` as the retry mechanism since the same approach is used by the AWS S3 SDK already as well and relied upon by the S3 repository implementation. Related GCS documentation: https://cloud.google.com/storage/docs/json_api/v1/status-codes#410_Gone	2019-09-17 19:06:43 +02:00
Armin Braun	b00de8edf3	Ensure SAS Tokens in Test Use Minimal Permissions (#46112 ) (#46628 ) There were some issues with the Azure implementation requiring permissions to list all containers ue to a container exists check. This was caught in CI this time, but going forward we should ensure that CI is executed using a token that does not allow listing containers. Relates #43288	2019-09-17 15:40:11 +02:00
David Turner	65dc888623	Resume partial download from S3 on connection drop (#46589 ) Today if the connection to S3 times out or drops after starting to download an object then the SDK does not attempt to recover or resume the download, causing the restore of the whole shard to fail and retry. This commit allows Elasticsearch to detect such a mid-stream failure and to resume the download from where it failed.	2019-09-17 13:11:36 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
Mark Vieira	ccf656a9d0	Repository plugin test cacheability fixes (#46572 )	2019-09-11 08:24:55 -07:00
Tanguy Leroux	88bed09119	Mutualize code in cloud-based repository integration tests (#46483 ) This commit factors out some common code between the cloud-based repository integration tests that were recently improved. Relates #46376	2019-09-09 16:02:14 +02:00
Tanguy Leroux	023cf44025	Inject random server errors in AzureBlobStoreRepositoryTests (#46371 ) This commit modifies the HTTP server used in AzureBlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the Azure client.	2019-09-09 10:00:09 +02:00
Tanguy Leroux	8e3dc68454	Inject random server errors in GoogleCloudStorageBlobStoreRepositoryTests (#46376 ) This commit modifies the HTTP server used in GoogleCloudStorageBlobStoreRepositoryTests so that it randomly returns server errors. The test does not inject server errors for the following types of request: batch request, resumable upload request.	2019-09-09 09:59:59 +02:00
David Turner	cc092b1be1	Add support for OneZoneInfrequentAccess storage (#46436 ) The `repository-s3` plugin has supported a storage class of `onezone_ia` since the SDK upgrade in #30723, but we do not test or document this fact. This commit adds this storage class to the docs and adds a test to ensure that the documented storage classes are all accepted by S3 too. Fixes #30474	2019-09-09 07:54:44 +01:00
Tanguy Leroux	2290865559	Fix usage of randomIntBetween() in testWriteBlobWithRetries (#46380 ) This commit fixes the usage of randomIntBetween() in the test testWriteBlobWithRetries, when the test generates a random array of a single byte.	2019-09-06 09:10:38 +02:00
Tanguy Leroux	28974b5723	Replace mocked client in GCSBlobStoreRepositoryTests by HTTP server (#46255 ) This commit removes the usage of MockGoogleCloudStoragePlugin in GoogleCloudStorageBlobStoreRepositoryTests and replaces it by a HttpServer that emulates the Storage service. This allows the repository tests to use the real Google's client under the hood in tests and will allow us to test the behavior of the snapshot/restore feature for GCS repositories by simulating random server-side internal errors. The HTTP server used to emulate the Storage service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the GoogleCloudStorageFixture.	2019-09-05 10:37:37 +02:00
Tanguy Leroux	6d1a82134c	Add repository integration tests for Azure (#46263 ) Similarly to what had been done for S3 (#46081) and GCS (#46255) this commit adds repository integration tests for Azure, based on an internal HTTP server instead of mocks.	2019-09-05 09:26:42 +02:00
Tanguy Leroux	bd7a04cd55	Disable request throttling in S3BlobStoreRepositoryTests (#46226 ) When some high values are randomly picked up - for example the number of indices to snapshot or the number of snapshots to create - the tests in S3BlobStoreRepositoryTests can generate a high number of requests to the internal S3 server. In order to test the retry logic of the S3 client, the internal server is designed to randomly generate random server errors. When many requests are made, it is possible that the S3 client reaches its maximum number of successive retries capacity. Then the S3 client will stop retrying requests until enough retry attempts succeed, but it means that any request could fail before reaching the max retries count and make the test fail too. Closes #46217 Closes #46218 Closes #46219	2019-09-02 16:44:43 +02:00
Henning Andersen	d68e05aade	Mute 2 tests in S3BlobStoreRepositoryTests (#46221 ) Muted testSnapshotAndRestore and testMultipleSnapshotAndRollback Relates #46218 and #46219	2019-09-02 10:38:03 +02:00
Tanguy Leroux	0c1b263e8d	Inject random errors in S3BlobStoreRepositoryTests (#46125 ) This commit modifies the HTTP server used in S3BlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the SDK client. It is now possible to verify that the repository tests are s uccessfully completed even if one or more errors were returned by the S3 service in response of a blob upload, a blob deletion or a object listing request etc. Because injecting errors forces the SDK client to retry requests, the test limits the maximum errors to send in response for each request at 3 retries.	2019-08-30 11:58:09 +02:00
Tanguy Leroux	b526309fbd	Replace MockAmazonS3 usage in S3BlobStoreRepositoryTests by a HTTP server (#46081 ) This commit removes the usage of MockAmazonS3 in S3BlobStoreRepositoryTests and replaces it by a HttpServer that emulates the S3 service. This allows the repository tests to use the real Amazon's S3 client under the hood in tests and will allow to test the behavior of the snapshot/restore feature for S3 repositories by simulating random server-side internal errors. The HTTP server used to emulate the S3 service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the AmazonS3Fixture.	2019-08-29 13:16:59 +02:00
Tanguy Leroux	9e14ffa8be	Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068 )	2019-08-28 16:29:46 +02:00
Jason Tedor	3d64605075	Remove node settings from blob store repositories (#45991 ) This commit starts from the simple premise that the use of node settings in blob store repositories is a mistake. Here we see that the node settings are used to get default settings for store and restore throttle rates. Yet, since there are not any node settings registered to this effect, there can never be a default setting to fall back to there, and so we always end up falling back to the default rate. Since this was the only use of node settings in blob store repository, we move them. From this, several places fall out where we were chaining settings through only to get them to the blob store repository, so we clean these up as well. That leaves us with the changeset in this commit.	2019-08-26 16:26:13 -04:00
Tanguy Leroux	a3d918bddb	Refactor RepositoryCredentialsTests (#45919 ) This commit refactors the S3 credentials tests in RepositoryCredentialsTests so that it now uses a single node (ESSingleNodeTestCase) to test how secure/insecure credentials are overriding each other. Using a single node makes it much easier to understand what each test is actually testing and IMO better reflect how things are initialized. It also allows to fold into this class the test testInsecureRepositoryCredentials which was wrongly located in S3BlobStoreRepositoryTests. By moving this test away, the S3BlobStoreRepositoryTests class does not need the allow_insecure_settings option anymore and thus can be executed as part of the usual gradle test task.	2019-08-26 15:14:43 +02:00
Tanguy Leroux	aee92d573c	Allow partial request body reads in AWS S3 retries tests (#45847 ) This commit changes the tests added in #45383 so that the fixture that emulates the S3 service now sometimes consumes all the request body before sending an error, sometimes consumes only a part of the request body and sometimes consumes nothing. The idea here is to beef up a bit the tests that writes blob because the client's retry logic relies on marking and resetting the blob's input stream. This pull request also changes the testWriteBlobWithRetries() so that it (rarely) tests with a large blob (up to 1mb), which is more than the client's default read limit on input streams (131Kb). Finally, it optimizes the ZeroInputStream so that it is a bit more effective (now works using an internal buffer and System.arraycopy() primitives).	2019-08-23 13:43:31 +02:00
Tanguy Leroux	57a36eb373	Add tests to check that requests are retried when writing/reading blobs on S3 (#45383 ) This commit adds tests to verify the behavior of the S3BlobContainer and its underlying AWS SDK client when the remote S3 service is responding errors or not responding at all. The expected behavior is that requests are retried multiple times before the client gives up and the S3BlobContainer bubbles up an exception. The test verifies the behavior of BlobContainer.writeBlob() and BlobContainer.readBlob(). In the case of S3 writing a blob can be executed as a single upload or using multipart requests; the test checks both scenario by writing a small then a large blob.	2019-08-22 11:41:40 +02:00
Armin Braun	6aaee8aa0a	Repository Cleanup Endpoint (#43900 ) (#45780 ) * Repository Cleanup Endpoint (#43900) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 17:59:49 +02:00
Jim Ferenczi	fe2a7523ec	Add support for inlined user dictionary in the Kuromoji plugin (#45489 ) This change adds a new option called user_dictionary_rules to Kuromoji's tokenizer. It can be used to set additional tokenization rules to the Japanese tokenizer directly in the settings (instead of using a file). This commit also adds a check that no rules are duplicated since this is not allowed in the UserDictionary. Closes #25343	2019-08-21 16:28:30 +02:00
Igor Motov	1818c5fa44	Ingest Attachment: Upgrade tika to v1.22 (#45575 ) Upgrades: Apache Tika: 1.19.1 -> 1.22. pdfbox : 2.0.12 -> 2.0.16 poi : 4.0.0 -> 4.0.1	2019-08-19 18:17:16 -04:00
Luca Cavanna	c31cddf27e	Update the schema for the REST API specification (#42346 ) * Update the REST API specification This patch updates the REST API spefication in JSON files to better encode deprecated entities, to improve specification of URL paths, and to open up the schema for future extensions. Notably, it changes the `paths` from a list of strings to a list of objects, where each particular object encodes all the information for this particular path: the `parts` and the `methods`. Among the benefits of this approach is eg. encoding the difference between using the `PUT` and `POST` methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one. Also `documentation` becomes an object that supports an `url` and also a `description` which is a new field. * Adapt YAML runner to new REST API specification format The logic for choosing the path to use when running tests has been simplified, as a consequence of the path parts being listed under each path in the spec. The special case for create and index has been removed. Also the parsing code has been hardened so that errors are thrown earlier when the structure of the spec differs from what expected, and their error messages should be more helpful.	2019-08-16 14:40:00 +02:00
Yogesh Gaikwad	471d940c44	Refactor cluster privileges and cluster permission (#45265 ) (#45442 ) The current implementations make it difficult for adding new privileges (example: a cluster privilege which is more than cluster action-based and not exposed to the security administrator). On the high level, we would like our cluster privilege either: - a named cluster privilege This corresponds to `cluster` field from the role descriptor - or a configurable cluster privilege This corresponds to the `global` field from the role-descriptor and allows a security administrator to configure them. Some of the responsibilities like the merging of action based cluster privileges are now pushed at cluster permission level. How to implement the predicate (using Automaton) is being now enforced by cluster permission. `ClusterPermission` helps in enforcing the cluster level access either by performing checks against cluster action and optionally against a request. It is a collection of one or more permission checks where if any of the checks allow access then the permission allows access to a cluster action. Implementations of cluster privilege must be able to provide information regarding the predicates to the cluster permission so that can be enforced. This is enforced by making implementations of cluster privilege aware of cluster permission builder and provide a way to specify how the permission is to be built for a given privilege. This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`. `ConfigurableClusterPrivilege` is a renderable cluster privilege exposed as a `global` field in role descriptor. Other than this there is a requirement where we would want to know if a cluster permission is implied by another cluster-permission (`has-privileges`). This is helpful in addressing queries related to privileges for a user. This is not just simply checking of cluster permissions since we do not have access to runtime information (like request object). This refactoring does not try to address those scenarios. Relates #44048	2019-08-13 09:06:18 +10:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Armin Braun	a501d68f23	Upgrade to Netty 4.1.38 (#45132 ) (#45364 ) * A number of fixes to buffer handling in the .37 and .38 -> we should stay up to date	2019-08-09 03:38:14 +02:00
Tim Brooks	af908efa41	Disable netty direct buffer pooling by default (#44837 ) Elasticsearch does not grant Netty reflection access to get Unsafe. The only mechanism that currently exists to free direct buffers in a timely manner is to use Unsafe. This leads to the occasional scenario, under heavy network load, that direct byte buffers can slowly build up without being freed. This commit disables Netty direct buffer pooling and moves to a strategy of using a single thread-local direct buffer for interfacing with sockets. This will reduce the memory usage from networking. Elasticsearch currently derives very little value from direct buffer usage (TLS, compression, Lucene, Elasticsearch handling, etc all use heap bytes). So this seems like the correct trade-off until that changes.	2019-08-08 15:10:31 -06:00
Armin Braun	5d7fafec14	Add Assertion to Ensure Retries in S3BlobContainer (#45224 ) (#45230 ) * We need a `markSupported` input stream to retry uploads * Relates #45153	2019-08-06 16:11:19 +02:00
Yannick Welsch	7aeb2fe73c	Add per-socket keepalive options (#44055 ) Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings. By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like to explore whether we can enable them by default, in particular to force keepalive configurations that are better tuned for running ES.	2019-08-06 10:45:44 +02:00
Tim Brooks	984ba82251	Move nio channel initialization to event loop (#45155 ) Currently in the transport-nio work we connect and bind channels on the a thread before the channel is registered with a selector. Additionally, it is at this point that we set all the socket options. This commit moves these operations onto the event-loop after the channel has been registered with a selector. It attempts to set the socket options for a non-server channel at registration time. If that fails, it will attempt to set the options after the channel is connected. This should fix #41071.	2019-08-02 17:31:31 -04:00
Armin Braun	9450505d5b	Stop Passing Around REST Request in Multiple Spots (#44949 ) (#45109 ) * Stop Passing Around REST Request in Multiple Spots * Motivated by #44564 * We are currently passing the REST request object around to a large number of places. This works fine since we simply copy the full request content before we handle the rest itself which is needlessly hard on GC and heap. * This PR removes a number of spots where the request is passed around needlessly. There are many more spots to optimize in follow-ups to this, but this one would already enable bypassing the request copying for some error paths in a follow up.	2019-08-02 07:31:38 +02:00
Tim Brooks	aff66e3ac5	Add Cors integration tests (#44361 ) This commit adds integration tests to ensure that the basic cors functionality works for the netty and nio transports.	2019-07-31 14:24:23 -06:00
Armin Braun	548c767b6b	S3 3rd Party Test Goal (#44799 ) (#45004 ) * Create S3 Third Party Test Task that Covers the S3 CLI Tool * Adjust snapshot cli test tool tests to work with real S3 * Build adjustment * Clean up repo path before testing * Dedup the logic for asserting path contents by using the correct utility method here that somehow became unused	2019-07-30 17:16:41 +02:00
Armin Braun	4495140d1f	Release Pooled Buffers Earlier for HTTP Requests (#44952 ) (#44991 ) * We should release the buffers right after copying and not only do so after we did all the request handling on the copy * Relates #44564	2019-07-30 10:30:01 +02:00
Ignacio Vera	821f6f893b	Upgrade to Lucene 8.2.0 release (#44859 ) (#44892 )	2019-07-26 08:14:59 +02:00
Ioannis Kakavas	3714cb63da	Allow parsing the value of java.version sysprop (#44017 ) We often start testing with early access versions of new Java versions and this have caused minor issues in our tests (i.e. #43141) because the version string that the JVM reports cannot be parsed as it ends with the string -ea. This commit changes how we parse and compare Java versions to allow correct parsing and comparison of the output of java.version system property that might include an additional alphanumeric part after the version numbers (see [JEP 223[(https://openjdk.java.net/jeps/223)). In short it handles a version number part, like before, but additionally a PRE part that matches ([a-zA-Z0-9]+). It also changes a number of tests that would attempt to parse java.specification.version in order to get the full version of Java. java.specification.version only contains the major version and is thus inappropriate when trying to compare against a version that might contain a minor, patch or an early access part. We know parse java.version that can be consistently parsed. Resolves #43141	2019-07-22 20:14:56 +03:00
Jason Tedor	d82a570a2a	Reomve debugging loging statements from Azure tests This commit removes some unneeded debugging logging statements from the Azure storage tests. Relates #44672	2019-07-22 16:55:55 +09:00
Jason Tedor	a493a34143	Use debug logging instead for Azure tests (#44672 ) These Azure tests have hard println statements which means we always see these messages during configuration. Yet, there are unnecessary most of the time. This commit changes them to use debug logging.	2019-07-22 16:46:13 +09:00
Armin Braun	07cf2cb665	Add disable_chunked_encoding Setting to S3 Repo (#44052 ) (#44562 ) * Add disable_chunked_encoding setting to S3 repo plugin to support S3 implementations that don't support chunked encoding	2019-07-18 16:57:56 +02:00

1 2 3 4 5 ...

2401 Commits