76 Commits

Author SHA1 Message Date
Francisco Fernández Castaño 2bb5716b3d
Add repositories metering API (#62088)
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.

In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.

Backport of #60371
2020-09-08 14:01:04 +02:00
Ryan Ernst 6d3b691048
Add snapshot only test modules (#61954)
This commit adds external test modules. These are modules meant for
external systems to test edge cases in elasticsearch, but only within
snapshots. They are not meant to be used in production, so protections
are also added against their accidental inclusion in release builds.

Note that this commit does not actually add any new modules, it only
adds the infrastructure for the new modules, under
`test/external-modules`.
2020-09-04 16:35:18 -07:00
Jay Modi f0128ae074
Canonicalize client name in krb5kdc-fixture (#61119)
This commit changes the value for client name canonicalization to true
in the krb5.conf template file. This is done as a means to work around
JDK-8246193 which has made it into some builds of JDK8.

Closes #61050
2020-08-13 14:58:08 -06:00
Armin Braun 3e2dfc6eac
Remove GCS Bucket Exists Check (#60899) (#60914)
Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS.
We don't need to do the bucket exists check before using the repo, that just needlessly
increases the necessary permissions for using the GCS repository.
2020-08-11 09:54:27 +02:00
Mark Vieira dc7d4c615c
Ensure fixture runtime dependencies are built before starting containers (#59474) 2020-07-13 15:58:01 -07:00
Armin Braun 9268b25789
Add Check for Metadata Existence in BlobStoreRepository (#59141) (#59216)
In order to ensure that we do not write a broken piece of `RepositoryData`
because the physical repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.

Relates #56911
2020-07-08 14:25:01 +02:00
Rene Groeschke d952b101e6
Replace compile configuration usage with api (7.x backport) (#58721)
* Replace compile configuration usage with api (#58451)

- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
  - required as java-library will by default not have a built jar file
  - jar file is now an explicit input of the task and gradle will ensure it's properly built

* Fix compile usages in 7.x branch
2020-06-30 15:57:41 +02:00
Armin Braun be6fa72432
Fix GCS Mock Behavior for Missing Bucket (#57283) (#57310)
* Fix GCS Mock Behavior for Missing Bucket

We were throwing a 500 instead of a 404 for a missing bucket.
This would make yaml tests needlessly wait for multiple seconds, retrying
the 500 response with backoff, in the test checking behavior for missing buckets.
2020-05-29 10:01:20 +02:00
Armin Braun a4eb3edf46
Fix GCS Repository YAML Test Build (#57073) (#57101)
A few relatively obvious issues here:

* We cannot run the different IT runs (large blob setting one and normal integ run) concurrently
* We need to set the dependency tasks up correctly for the large blob run so that it works in isolation
* We can't use the `localAddress` for the location header of the resumable upload
(this breaks in YAML tests because GCS is using a loopback port forward for the initial request and the
local address will be chosen as the actual Docker container host)

Closes #57026
2020-05-25 11:10:39 +02:00
Armin Braun 0a879b95d1
Save Bounds Checks in BytesReference (#56577) (#56621)
Two spots that allow for some optimization:

* We are often creating a composite reference of just a single item in
the transport layer => special cased via static constructor to make sure we never do that
   * Also removed the pointless case of an empty composite bytes ref
* `ByteBufferReference` is practically always created from a heap buffer these days so there
is no point in dealing with all the bounds checks and extra references to sliced buffers from that
and we can just use the underlying array directly
2020-05-12 20:33:45 +02:00
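As an illustration of the first optimization described in the commit above (#56577), here is a minimal sketch of special-casing empty and single-item composites through a static factory. The `Bytes`, `ArrayBytes` and `CompositeBytes` types are hypothetical stand-ins invented for this example; they are not Elasticsearch's `BytesReference` classes and the real change may look quite different.

```java
// Hypothetical stand-in types for illustration; not the Elasticsearch classes.
interface Bytes {
    int length();

    byte get(int index);
}

final class ArrayBytes implements Bytes {
    private final byte[] array;

    ArrayBytes(byte[] array) {
        this.array = array;
    }

    @Override
    public int length() {
        return array.length;
    }

    // Relies on the array's own bounds check; no extra wrapper layers.
    @Override
    public byte get(int index) {
        return array[index];
    }
}

final class CompositeBytes implements Bytes {
    static final Bytes EMPTY = new ArrayBytes(new byte[0]);

    private final Bytes[] parts;
    private final int length;

    // Private: callers go through of(), which avoids pointless composites.
    private CompositeBytes(Bytes[] parts) {
        this.parts = parts;
        int total = 0;
        for (Bytes part : parts) {
            total += part.length();
        }
        this.length = total;
    }

    static Bytes of(Bytes... parts) {
        if (parts.length == 0) {
            return EMPTY;        // never build an empty composite
        }
        if (parts.length == 1) {
            return parts[0];     // never wrap a single item
        }
        return new CompositeBytes(parts.clone());
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public byte get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int offset = index;
        for (Bytes part : parts) {
            if (offset < part.length()) {
                return part.get(offset);
            }
            offset -= part.length();
        }
        throw new AssertionError("unreachable");
    }
}
```

With this shape, `CompositeBytes.of(single)` hands back `single` itself, so reads on it skip the per-part lookup entirely.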
Tanguy Leroux 35622747fd
Add Minio tests for searchable snapshots (#56112) (#56179)
This commit adds QA tests for searchable snapshot on MinIO,
similarly to what already exists for S3, GCS and Azure.
2020-05-05 11:40:06 +02:00
Yannick Welsch ba39c261e8 Use streaming reads for GCS (#55506)
To read from GCS repositories we're currently using Google SDK's official BlobReadChannel,
which issues a new request every 2MB (default chunk size for BlobReadChannel) using range
requests, and fully downloads the chunk before exposing it to the returned InputStream. This
means that the SDK issues an awfully high number of requests to download large blobs.
Increasing the chunk size is not an option, as that will mean that an awfully high amount of
heap memory will be consumed by the download process.

The Google SDK does not provide the right abstractions for a streaming download. This PR
uses the lower-level primitives of the SDK to implement a streaming download, similar to what
S3's SDK does.

Also closes #55505
2020-04-21 13:22:26 +02:00
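To make the request-count concern in the commit above (#55506) concrete, the sketch below works through the arithmetic: when every chunk is fetched with its own range request, a blob needs roughly its size divided by the chunk size in requests, whereas a streaming download needs one. The class and method names are hypothetical, and the 2MB chunk size is assumed here to mean 2 * 1024 * 1024 bytes.

```java
// Illustrative arithmetic only; not code from the PR.
final class ChunkedReadCost {
    // Chunk size quoted in the commit message, assumed to be 2 * 1024 * 1024 bytes.
    static final long CHUNK_SIZE_BYTES = 2L * 1024 * 1024;

    // Number of range requests needed when each chunk is fetched separately.
    static long rangeRequests(long blobSizeBytes, long chunkSizeBytes) {
        if (blobSizeBytes < 0 || chunkSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and chunk size positive");
        }
        // Ceiling division: a trailing partial chunk still costs one request.
        return (blobSizeBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        System.out.println(rangeRequests(oneGiB, CHUNK_SIZE_BYTES)); // prints 512
    }
}
```

Raising the chunk size lowers the request count proportionally, but each chunk is fully buffered before being exposed, which is the heap cost the commit message warns about.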
Yannick Welsch b9da307cd1 Add GCS support for searchable snapshots (#55403)
Adds ranged read support for GCS repositories in order to enable searchable snapshot support
for GCS.

As part of this PR, I've extracted some of the test infrastructure to make sure that
GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering
similar tests (as I saw those diverging in what they cover).
2020-04-20 13:02:59 +02:00
Rory Hunter a5b545b2a0
Use LTS version of Ubuntu in Dockerfiles (#55370)
We have some Dockerfiles that reference Ubuntu 19.04, which is not an LTS
version and now appears to have been retired from the Ubuntu repositories.
Switch to 18.04, which is the current long-term support version. This
also requires a switch from OpenJDK 12 to 11.

Also change a usage of 16.04 to 18.04, for consistency.
2020-04-17 16:14:14 -04:00
Rory Hunter 49f8f66a41 Revert "Use LTS version of Ubuntu in Dockerfiles (#55327)"
This reverts commit dd76fbac60.
2020-04-16 20:05:22 +01:00
Rory Hunter dd76fbac60 Use LTS version of Ubuntu in Dockerfiles (#55327)
We have some Dockerfiles that reference Ubuntu 19.04, which is not an LTS
version and now appears to have been retired from the Ubuntu repositories.
Switch to 18.04, which is the current long-term support version. Also change a
usage of 16.04 to 18.04, for consistency.
2020-04-16 19:47:18 +01:00
Ryan Ernst 29b70733ae
Use task avoidance with forbidden apis (#55034)
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our forbidden apis
usages, in preparation for upgrading to 3.0 when it is released.
2020-04-15 13:27:53 -07:00
Tanguy Leroux 4d36917e52
Merge feature/searchable-snapshots branch into 7.x (#54803) (#54825)
This is a backport of #54803 for 7.x.

This pull request cherry picks the squashed commit from #54803 with the additional commits:

    6f50c92 which adjusts master code to 7.x
    a114549 to mute a failing ILM test (#54818)
    48cbca1 and 50186b2 that clean up and fix the previous test
    aae12bb that adds a missing feature flag (#54861)
    6f330e3 that adds missing serialization bits (#54864)
    bf72c02 that adjusts the version in YAML tests
    a51955f that adds some plumbing for the transport client used in integration tests

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-07 13:28:53 +02:00
James Baiera b84c74cf70
Update the HDFS version used by HDFS Repo (#53693) (#54125) 2020-03-25 14:01:29 -04:00
Armin Braun 4271963462
Revert "Use Azure Bulk Deletes in Azure Repository (#53919)" (#54089) (#54111)
This reverts commit 23cccf088810b8416ed278571352393cc2de9523.
Unfortunately SAS token auth still doesn't work with bulk deletes so we can't use them yet.

Closes #54080
2020-03-25 12:13:25 +01:00
Armin Braun b51ea25a00
Use Azure Bulk Deletes in Azure Repository (#53919) (#53967)
Now that we upgraded the Azure SDK to 8.6.2 in #53865 we can make use of
bulk deletes.
2020-03-23 13:35:05 +01:00
Armin Braun 204c366a4e
Upgrade GCS SDK to 1.104.0 (#52839) (#53152)
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
2020-03-05 11:18:18 +01:00
Armin Braun f7d71b5930
Fix GCS Mock Range Downloads (#52804) (#52830)
We were not correctly respecting the download range which led
to the GCS SDK client closing the connection at times.
Also, fixes another instance of failing to drain the request fully before sending the response headers.

Closes #51446
2020-02-26 18:38:02 +01:00
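As an illustration of what respecting the download range means for a mock like the one fixed in the commit above (#52804), here is a minimal sketch that slices a stored blob according to a `Range` header. The `RangeSlice` helper is hypothetical and is not the fixture's code; it assumes only the simple `bytes=start-end` form, where `end` is inclusive as in HTTP range requests.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual GCS mock code.
final class RangeSlice {
    // Only the simple "bytes=start-end" form is handled in this sketch.
    private static final Pattern RANGE = Pattern.compile("^bytes=(\\d+)-(\\d+)$");

    // Returns the bytes selected by the Range header, or the whole blob if there is none.
    static byte[] slice(byte[] blob, String rangeHeader) {
        if (rangeHeader == null) {
            return blob;
        }
        Matcher matcher = RANGE.matcher(rangeHeader);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("unsupported range: " + rangeHeader);
        }
        long start = Long.parseLong(matcher.group(1));
        long end = Long.parseLong(matcher.group(2));
        if (start > end || start >= blob.length) {
            throw new IllegalArgumentException("unsatisfiable range: " + rangeHeader);
        }
        // The end offset is inclusive; clamp it to the last byte of the blob.
        int to = (int) Math.min(end, blob.length - 1L) + 1;
        return Arrays.copyOfRange(blob, (int) start, to);
    }
}
```

A handler built on this would also report the length of the slice, not of the whole blob, in its response so that the client does not wait for bytes that never arrive.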
Armin Braun 3be70f64d8
Fix GCS Mock Http Handler JDK Bug (#51933) (#51941)
There is an open JDK bug that is causing an assertion in the JDK's
http server to trip if we don't drain the request body before sending response headers.
See https://bugs.openjdk.java.net/browse/JDK-8180754
Working around this issue here by always draining the request at the beginning of the handler.

Fixes #51446
2020-02-05 15:37:06 +01:00
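The workaround described in the commit above (#51933) can be sketched as a handler for the JDK's built-in `com.sun.net.httpserver` server that reads the request body to the end before it sends any response headers. The `DrainFirstHandler` class and its fixed `ok` response are hypothetical and only illustrate the ordering; the real mock handler does considerably more.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical handler for illustration; not the actual GCS mock.
final class DrainFirstHandler implements HttpHandler {

    // Reads the stream until EOF and returns the number of bytes consumed.
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        try {
            // Drain first, so the body is fully consumed before headers go out.
            drain(exchange.getRequestBody());
            byte[] response = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, response.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(response);
            }
        } finally {
            exchange.close();
        }
    }
}
```

Such a handler could be registered with `HttpServer.createContext("/", new DrainFirstHandler())`; a handler that needs the body would keep the bytes it reads instead of discarding them.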
Armin Braun 7914c1a734
Optimize GCS Mock (#51593) (#51594)
This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.

Closes #51446
Closes #50754
2020-01-29 11:06:05 +01:00
Armin Braun a725896c92
Fix and Reenable SnapshotTool Minio Tests (#50736) (#50745)
This solves half of the problem in #46813 by moving the S3
tests to using the shared minio fixture so we at least have
some non-3rd-party, constantly running coverage on these tests.
2020-01-08 16:33:36 +01:00
Armin Braun d0d48311f4
Faster and Simpler GCS REST Mock (#50706) (#50707)
* Faster and Simpler GCS REST Mock

I reworked the GCS mock a little to use less copying+allocation,
log the full request body on failure to read a multi-part request
and generally be a little simpler and easier to follow to track down
the remaining issues that are causing almost daily failures from this
class's multi-part request parsing that can't be reproduced locally.
2020-01-07 20:17:46 +01:00
Armin Braun 72a405fafb
Fix GCS Mock Broken Handling of some Blobs (#50666) (#50671)
* Fix GCS Mock Broken Handling of some Blobs

We were incorrectly handling blobs starting with `\r\n`, which broke
tests randomly whenever a blob started with that sequence.

Relates #49429
2020-01-06 19:27:57 +01:00
Armin Braun c73930988b
Remove Unused Delete Endpoint from GCS Mock (#50128) (#50134)
Follow up to #50024: we're not using the single-delete any more so no need to have a mock endpoint for it
2019-12-12 14:18:06 +01:00
Armin Braun 0fae4065ef
Better Logging GCS Blobstore Mock (#50102) (#50124)
* Better Logging GCS Blobstore Mock

Two things:
1. We should just throw a descriptive assertion error and figure out why we're not reading a multi-part instead of
returning a `400` and failing the tests that way here since we can't reproduce these 400s locally.
2. We were missing logging the exception on a cleanup delete failure that coincides with the `400` issue in tests.

Relates #49429
2019-12-12 11:17:22 +01:00
Armin Braun d19c8db4e4
Fix GCS Mock Batch Delete Behavior (#50034) (#50084)
Batch deletes get a response for every delete request, not just those that actually hit an existing blob.
The fact that we only responded for existing blobs leads to a degenerate response that throws a parse exception if a batch delete only contains non-existent blobs.
2019-12-11 17:40:25 +01:00
Armin Braun 90e9d61f2b
Optimize GoogleCloudStorageHttpHandler (#49677) (#49707)
Removing a lot of needless buffering and array creation
to reduce the significant memory usage of tests using this.
The incoming stream from the `exchange` is already buffered
so there is no point in adding a ton of additional buffers
everywhere.
2019-11-29 11:17:47 +01:00
Armin Braun 495b543e63
Improve Stability of GCS Mock API (#49592) (#49597)
Same as #49518 pretty much but for GCS.
Fixing a few more spots where input stream can get closed
without being fully drained and adding assertions to make sure
it's always drained.
Moved the no-close stream wrapper to production code utilities since
there's a number of spots in production code where it's also useful
(will reuse it there in a follow-up).
2019-11-26 16:53:51 +01:00
Armin Braun a5fa86ed97
Improve Stability of Mock APIs (#49518) (#49524)
This commit ensures that even for requests that are known to be empty body
we at least attempt to read one byte from the request body input stream.
This is done to work around the behavior in `sun.net.httpserver.ServerImpl.Dispatcher#handleEvent`
that will close a TCP/HTTP connection that does not have the `eof` flag (see `sun.net.httpserver.LeftOverInputStream#isEOF`)
set on its input stream. As far as I can tell the only way to set this flag is to do a read when there's no more bytes buffered.
This fixes the numerous connection closing issues because the `ServerImpl` stops closing connections that it thinks
weren't fully drained.

Also, I removed a now redundant drain loop in the Azure handler as well as removed the connection closing in the error handler's
drain action (this shouldn't have an effect but makes things more predictable/easier to reason about IMO).

I would suggest merging this and closing related issue after verifying that this fixes things on CI.

The way to locally reproduce the issues we're seeing in tests is to make the retry timings more aggressive in e.g. the azure tests
and move them to single digit values. This makes the retries happen quickly enough that they run into the async connection closing
of allegedly non-eof connections by `ServerImpl` and produces the exact kinds of failures we're seeing currently.

Relates #49401, #49429
2019-11-25 10:28:55 +01:00
Armin Braun 231d079bf8
Fix Azure Mock Issues (#49377) (#49381)
Fixing a few small issues found in this code:
1. We weren't reading the request headers but the response headers when checking for blob existence in the mocked single upload path
2. Error code can never be `null`, so removed the dead code that resulted
3. In the logging wrapper we weren't checking for `Throwable` so any failing assertions in the http mock would not show up since they
run on a thread managed by the mock http server
2019-11-21 19:57:50 +01:00
Tanguy Leroux f753fa2265 HttpHandlers should return correct list of objects (#49283)
This commit fixes the server side logic of "List Objects" operations
of Azure and S3 fixtures. Until today, the fixtures were returning a
"flat" view of stored objects and were not correctly handling the
delimiter parameter. This causes some objects listing to be wrongly
interpreted by the snapshot deletion logic in Elasticsearch which
relies on the ability to list child containers of BlobContainer (#42653)
to correctly delete stale indices.

As a consequence, the blobs were not correctly deleted from the
emulated storage service and stayed in heap until they got garbage
collected, causing CI failures like #48978.

This commit fixes the server side logic of Azure and S3 fixture when
listing objects so that it now returns correct common blob prefixes as
expected by the snapshot deletion process. It also adds an after-test
check to ensure that tests leave the repository empty (besides the
root index files).

Closes #48978
2019-11-20 09:26:42 +01:00
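To illustrate the delimiter handling described in the commit above (#49283), here is a minimal sketch of a "List Objects" operation that distinguishes direct children from common prefixes instead of returning a flat view. The `DelimitedListing` class is hypothetical and is not the fixtures' code; it only models the grouping rule, where keys that contain the delimiter after the requested prefix are collapsed into a single common prefix.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of delimiter-aware listing; not the actual fixture code.
final class DelimitedListing {
    final Set<String> blobs = new TreeSet<>();           // keys directly under the prefix
    final Set<String> commonPrefixes = new TreeSet<>();  // "child containers" under the prefix

    static DelimitedListing list(Iterable<String> keys, String prefix, String delimiter) {
        DelimitedListing result = new DelimitedListing();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            // Look for the delimiter only in the part of the key after the prefix.
            int idx = delimiter.isEmpty() ? -1 : key.indexOf(delimiter, prefix.length());
            if (idx < 0) {
                result.blobs.add(key);
            } else {
                // Collapse everything below the first delimiter into one prefix entry.
                result.commonPrefixes.add(key.substring(0, idx + delimiter.length()));
            }
        }
        return result;
    }
}
```

Listing the keys `indices/a/0/__1`, `indices/a/meta` and `indices/b/0/__2` with prefix `indices/` and delimiter `/` then yields no direct blobs and the two common prefixes `indices/a/` and `indices/b/`, which is the kind of child listing the snapshot deletion logic relies on.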
Tanguy Leroux ca4f55f2e4
Add docker-compose fixtures for S3 integration tests (#49107) (#49229)
Similarly to what has been done for Azure (#48636) and GCS (#48762),
this commit removes the existing Ant fixture that emulates a S3 storage
service in favor of multiple docker-compose based fixtures.

The goals here are multiple: be able to reuse an s3-fixture outside of the
repository-s3 plugin; allow parallel execution of integration tests; and replace
the existing AmazonS3Fixture, which has evolved into a weird beast, with
dedicated, more maintainable fixtures.

The server side logic that emulates S3 mostly comes from the latest
HttpHandler made for S3 blob store repository tests, with additional
features extracted from the (now removed) AmazonS3Fixture:
authentication checks, session token checks and improved response
errors. Chunked upload request support for S3 object has been added
too.

The server side logic of all tests now resides in a single S3HttpHandler class.

Whereas AmazonS3Fixture contained logic for basic tests, session token
tests, EC2 tests or ECS tests, the S3 fixtures are now dedicated to each
kind of test. Fixtures are inheriting from each other, making things easier
to maintain.
2019-11-18 05:56:59 -05:00
Rory Hunter c46a0e8708
Apply 2-space indent to all gradle scripts (#49071)
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
2019-11-14 11:01:23 +00:00
Tanguy Leroux 20fc1dbe18
Move MinIO fixture in its own project (#49036)
This commit moves the MinIO docker-compose fixture from the
:plugins:repository-s3 to its own :test:minio-fixture Gradle project.
2019-11-13 10:03:59 -05:00
Tanguy Leroux 8a14ea5567
Add docker-composed based test fixture for GCS (#48902)
Similarly to what has been done for Azure in #48636, this commit
adds a new :test:fixtures:gcs-fixture project which provides two
docker-compose based fixtures that emulate a Google Cloud
Storage service.

Some code has been extracted from existing tests and placed
into this new project so that it can be easily reused in other
projects.
2019-11-07 13:27:22 -05:00
Tanguy Leroux 989467ca1e
Add docker-compose based test fixture for Azure (#48736)
This commit adds a new :test:fixtures:azure-fixture project which 
provides a docker-compose based container that runs an AzureHttpFixture
Java class that emulates an Azure Storage service.

The logic to emulate the service is extracted from existing tests and 
placed in AzureHttpHandler into the new project so that it can be 
easily reused. The :plugins:repository-azure project is an example 
of such utilization.

The AzureHttpFixture fixture is just a wrapper around AzureHttpHandler 
and is now executed within the docker container. 

The :plugins:repository-azure:qa:microsoft-azure project uses the new 
test fixture and the existing AzureStorageFixture has been removed.
2019-10-31 10:43:43 +01:00
Yogesh Gaikwad 91c342a888
fix and enable repository-hdfs secure tests (#44044) (#44199)
Due to recent changes made to convert `repository-hdfs` to test
clusters (#41252), the `integTestSecure*` tasks did not depend on
`secureHdfsFixture`, so they would fail when run because the fixture
was not available. This commit adds the dependency of the fixture
to the task.

The `secureHdfsFixture` is an `AntFixture` which is spawned as a process.
Internally it waits for 30 seconds for the resources to be made available.
On my local machine, it took almost 45 seconds to be available, so I have
added the wait time as an input to the `AntFixture`, which defaults to 30 seconds,
and set it to 60 seconds for the secure hdfs fixture.

The integ test for secure hdfs was disabled for a long time and so
the changes done in #42090 to fix the tests are also done in this commit.
2019-07-12 12:44:01 +10:00
Alpar Torok 0c8294e633 Make sure the clean task doesn't break test fixtures (#43641)
Use a dedicated fixture dir.
2019-07-08 17:58:27 +03:00
Alpar Torok 7cc6dca697 Remove explicitly enabled build fixture task 2019-06-14 10:42:08 +03:00
Yogesh Gaikwad 4ae1e30a98
Enable krb5kdc-fixture, kerberos tests mount urandom for kdc container (#41710) (#43178)
Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
2019-06-13 13:02:16 +10:00
Mark Vieira 1287c7d91f
[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) (#40993)
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)

This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.

(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)

* Fix forking JVM runner

* Don't bump shadow plugin version
2019-04-09 11:52:50 -07:00
Alpar Torok 293297ae3d Fix repository-hdfs when no docker and unnecessary fixture
The hdfs-fixture is actually executed in plugin/repository-hdfs as a
dependency. The fixture is not needed and actually causes a failure
because we have two copies now and both use the same ports.
2019-03-29 16:55:12 +02:00
Alpar Torok 2b91fb1cc0 Avoid building hdfs-fixture, use an image that works instead
Avoid the additional requirement for the debian package repos to be up,
and depend on dockerhub only instead.
2019-03-29 16:55:11 +02:00
Alpar Torok e8c0b53796 Add ability to mute and mute flaky fixture (#40630) 2019-03-29 12:10:04 +02:00
Alpar Torok d791e08932 Test fixtures krb5 (#40297)
Replaces the vagrant based kerberos fixtures with docker based test fixtures plugin.
The configuration is now entirely static on the docker side and no longer driven by Gradle.
Two different services are also configured, since there are two different consumers of the fixture that can run in parallel and require different configurations.
2019-03-28 17:26:58 +02:00