Commit Graph

164 Commits

Author SHA1 Message Date
Armin Braun 6aaee8aa0a
Repository Cleanup Endpoint (#43900) (#45780)
* Repository Cleanup Endpoint (#43900)

* Snapshot cleanup functionality via transport/REST endpoint.
* Added all the infrastructure for this with the HLRC and node client
* Made use of it in tests and resolved relevant TODO
* Added a new `Custom` cluster state element that tracks the cleanup logic.
Kept it similar to the delete and in-progress classes and gave it
a (for now) redundant way of handling multiple cleanups while only allowing one
* Use the exact same mechanism used by deletes to have the combination
of CS entry and increment in repository state ID provide some
concurrency safety (the initial approach of just an entry in the CS
was not enough, we must increment the repository state ID to be safe
against concurrent modifications, otherwise we run the risk of "cleaning up"
blobs that just got created without noticing)
* Isolated the logic to the transport action class as much as I could.
It's not ideal, but we don't need to keep any state, and we do the same
for other repository operations
(like getting the detailed snapshot shard status)
2019-08-21 17:59:49 +02:00
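The cleanup endpoint added above is exposed over REST as `POST /_snapshot/{repository}/_cleanup`. A minimal sketch of invoking it through the low-level Java REST client; the repository name `my_repository` is a placeholder:

```java
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RepositoryCleanupExample {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // POST /_snapshot/{repository}/_cleanup removes stale blobs
            // that are no longer referenced by any snapshot
            Request request = new Request("POST", "/_snapshot/my_repository/_cleanup");
            Response response = client.performRequest(request);
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}
```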
Yogesh Gaikwad 4c95cc3223 skip repository-hdfs integTest in case of fips jvm (#44319)
The repository-hdfs runners need to be disabled in FIPS mode.

Testing done for all the tasks, both dynamically created and static (integTest, integTestHa, integSecureTest, integSecureHaTest)
2019-07-18 21:10:53 +10:00
Armin Braun c8db0e9b7e
Remove blobExists Method from BlobContainer (#44472) (#44475)
* We only use this method in one place in production code and can replace that call with a read, so remove it to simplify the interface
   * Keep it as an implementation detail in the Azure repository
2019-07-17 11:56:02 +02:00
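A sketch of the replacement pattern the commit describes, assuming the `BlobContainer` interface from the Elasticsearch blob store API, where `readBlob` throws `NoSuchFileException` for a missing blob; the helper name is hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.NoSuchFileException;

// Instead of container.blobExists(name) followed by a read (two round-trips
// on cloud stores), attempt the read directly and treat a missing blob as
// the "does not exist" case.
InputStream readIfExists(BlobContainer container, String name) throws IOException {
    try {
        return container.readBlob(name);
    } catch (NoSuchFileException e) {
        return null; // blob absent; caller handles this like blobExists == false
    }
}
```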
Yogesh Gaikwad b40b6dd542 Disable repository-hdfs tests in FIPS jvm (#44283)
Due to https://github.com/elastic/elasticsearch/issues/40079,
we need to disable repository-hdfs tests in FIPS jvm.
2019-07-13 20:11:32 +10:00
Yogesh Gaikwad 91c342a888
fix and enable repository-hdfs secure tests (#44044) (#44199)
Due to the recent changes made to convert `repository-hdfs` to test
clusters (#41252), the `integTestSecure*` tasks did not depend on
`secureHdfsFixture`, so when run they would fail because the fixture
was not available. This commit adds the dependency on the fixture
to those tasks.

The `secureHdfsFixture` is an `AntFixture` which is spawned as a process.
Internally it waits up to 30 seconds for the resources to become available.
On my local machine it took almost 45 seconds, so I have made the wait
time an input to `AntFixture` (defaulting to 30 seconds) and set it to
60 seconds for the secure HDFS fixture.

The integ test for secure HDFS had been disabled for a long time, so
the changes done in #42090 to fix the tests are also included in this commit.
2019-07-12 12:44:01 +10:00
Mark Vieira 7c2e4b2857
[Backport] Enable caching of rest tests which use integ-test distribution (#44181) 2019-07-10 15:42:28 -07:00
Armin Braun af9b98e81c
Recursively Delete Unreferenced Index Directories (#42189) (#44051)
* Use the ability to list child "folders" in the blob store to implement a recursive delete of all stale index folders when cleaning up, instead of using the diff between two `RepositoryData` instances to cover aborted deletes
* Runs after every delete operation
* Relates #13159 (fixes most of the issues caused by unreferenced indices, leaving only some meta files to be cleaned up)
2019-07-08 10:55:39 +02:00
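A rough sketch of the recursive cleanup, assuming the `BlobContainer#children()` listing and the recursive `delete()` added by the related PRs below (#42653, #43281); method and variable names are illustrative:

```java
import java.io.IOException;
import java.util.Map;
import java.util.Set;

// Delete every index "folder" that RepositoryData no longer references.
void cleanupStaleIndices(BlobContainer indicesContainer, Set<String> liveIndexIds) throws IOException {
    for (Map.Entry<String, BlobContainer> child : indicesContainer.children().entrySet()) {
        if (liveIndexIds.contains(child.getKey()) == false) {
            child.getValue().delete(); // recursive delete of the whole stale folder
        }
    }
}
```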
Armin Braun be20fb80e4
Recursive Delete on BlobContainer (#43281) (#43920)
This is a prerequisite of #42189:

* Add directory delete method to blob container specific to each implementation:
  * Some notes on the implementations:
       * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered, which allows us to delete in the same order that we receive the blobs from the listing request. For AWS this just required listing without the delimiter setting (so we get a deep listing), and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is that even for very large numbers of blobs the memory requirements are now capped nicely, since we go page by page when deleting.
       * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block.
       * HDFS and FS are trivial since we have directory delete methods available for them
* Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)
2019-07-03 17:14:57 +02:00
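A sketch of the page-by-page approach described for AWS, using the v1 AWS SDK that the repository plugin used at the time; bucket and prefix handling are simplified:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

// A deep listing (no delimiter) returns keys lexicographically ordered, so each
// page can be deleted as it arrives and memory stays bounded by the page size.
void deletePrefix(AmazonS3 s3, String bucket, String prefix) {
    ObjectListing listing = s3.listObjects(bucket, prefix);
    while (true) {
        String[] keys = listing.getObjectSummaries().stream()
            .map(S3ObjectSummary::getKey)
            .toArray(String[]::new);
        if (keys.length > 0) {
            s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(keys));
        }
        if (listing.isTruncated() == false) {
            break;
        }
        listing = s3.listNextBatchOfObjects(listing);
    }
}
```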
Armin Braun 455b12a4fb
Add Ability to List Child Containers to BlobContainer (#42653) (#43903)
* Add Ability to List Child Containers to BlobContainer (#42653)

* Add Ability to List Child Containers to BlobContainer
* This is a prerequisite of #42189
2019-07-03 11:30:49 +02:00
Yogesh Gaikwad 4ae1e30a98
Enable krb5kdc-fixture, kerberos tests mount urandom for kdc container (#41710) (#43178)
Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: the hdfs-repository tests have been disabled; a separate issue will be raised for them.

Closes #40624 Closes #40678
2019-06-13 13:02:16 +10:00
Jason Tedor 371cb9a8ce
Remove Log4j 1.2 API as a dependency (#42702)
We had this as a dependency for legacy dependencies that still needed
the Log4j 1.2 API. This appears to no longer be necessary, so this
commit removes this artifact as a dependency.

To remove this dependency, we had to fix a few places where we were
accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since
both APIs were on the compile-time classpath).

Finally, we can remove our custom Netty logger factory. This was needed
when we were on Log4j 1.2 and handled logging in our own unique
way. When we migrated to Log4j 2 we could have dropped this
dependency. However, even then Netty would still have picked up Log4j 1.2
since it was on the classpath, hence the advantage of removing it as a
dependency now.
2019-05-30 16:08:07 -04:00
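The kind of accidental Log4j 1.2 usage the commit describes, with its Log4j 2 replacement; the class name is illustrative:

```java
// Log4j 1.2 API, which only compiled because log4j-1.2-api was on the classpath:
// import org.apache.log4j.Logger;
// private static final Logger logger = Logger.getLogger(SomeAction.class);

// Log4j 2 equivalent:
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class SomeAction {
    private static final Logger logger = LogManager.getLogger(SomeAction.class);
}
```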
Mark Vieira c1816354ed
[Backport] Improve build configuration time (#42674) 2019-05-30 10:29:42 -07:00
Armin Braun c4f44024af
Remove Delete Method from BlobStore (#41619) (#42574)
* Remove Delete Method from BlobStore (#41619)
* The delete method on the blob store was used almost nowhere and just duplicated the delete method on the blob containers
  * The fact that it provided some recursive delete logic (which did not behave the same way across implementations) was unused and not properly tested either
2019-05-27 12:24:20 +02:00
Armin Braun aad33121d8
Async Snapshot Repository Deletes (#40144) (#41571)
Motivated by slow snapshot deletes reported in e.g. #39656, and by the fact that these are likely a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete.

* Makes snapshot deletion async and parallelizes some steps of the delete process that can safely run concurrently via the snapshot thread pool
   * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable.
* See https://github.com/elastic/elasticsearch/pull/39656#issuecomment-470492106
* Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (parallel access to a blob store repository during deletes is now possible because a delete is no longer a single task).
* By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657
2019-04-26 15:36:09 +02:00
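A minimal sketch of handing delete work to the snapshot pool, assuming the `ThreadPool` and `ActionListener` infrastructure referenced above; the per-blob delete is a stub:

```java
import java.util.List;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.threadpool.ThreadPool;

class AsyncRepositoryDelete {
    private final ThreadPool threadPool;

    AsyncRepositoryDelete(ThreadPool threadPool) {
        this.threadPool = threadPool;
    }

    // Fan the work out to the dedicated SNAPSHOT pool instead of blocking
    // the calling thread; real code would split this into parallel steps.
    void deleteBlobsAsync(List<String> blobs, ActionListener<Void> listener) {
        threadPool.executor(ThreadPool.Names.SNAPSHOT).execute(() -> {
            try {
                for (String blob : blobs) {
                    deleteBlob(blob);
                }
                listener.onResponse(null);
            } catch (Exception e) {
                listener.onFailure(e);
            }
        });
    }

    private void deleteBlob(String blob) {
        // hypothetical per-blob delete against the blob store
    }
}
```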
Alpar Torok 4ef4ed66b9 Convert repository-hdfs to testclusters (#41252)
* Convert repository-hdfs to testclusters

Relates #40862
2019-04-19 09:47:14 +03:00
Mark Vieira 1287c7d91f
[Backport] Replace usages of RandomizedTestingTask with built-in Gradle Test (#40978) (#40993)
* Replace usages of RandomizedTestingTask with built-in Gradle Test (#40978)

This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.

(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)

* Fix forking JVM runner

* Don't bump shadow plugin version
2019-04-09 11:52:50 -07:00
Alpar Torok 293297ae3d Fix repository-hdfs when no docker and unnecessary fixture
The hdfs-fixture is actually executed in plugin/repository-hdfs as a
dependency. The fixture is not needed there and actually causes a failure,
because we now have two copies and both use the same ports.
2019-03-29 16:55:12 +02:00
Alpar Torok d791e08932 Test fixtures krb5 (#40297)
Replaces the vagrant based kerberos fixtures with docker based test fixtures plugin.
The configuration is now entirely static on the docker side and no longer driven by Gradle.
Two different services are also configured, since there are two different consumers of the fixture that can run in parallel and require different configurations.
2019-03-28 17:26:58 +02:00
Henning Andersen 00a26b9dd2 Blob store compression fix (#39073)
Blob store compression was not enabled for some of the files in
snapshots because a constructor accessed sub-class fields. Fixed to
instead accept the compress flag as a constructor param. Also fixed chunk
size validation to work.

Deprecated repositories.fs.compress setting as well to be able to unify
in a future commit.
2019-02-20 09:24:41 +01:00
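An illustration of the bug class the commit fixes (not the actual repository code): a base-class constructor that consults a sub-class field observes the field's default value, because sub-class fields are only initialized after the super constructor returns.

```java
class BaseRepository {
    final boolean compress;

    BaseRepository() {
        compress = isCompressEnabled(); // sub-class field not yet assigned, yields false
    }

    boolean isCompressEnabled() {
        return false;
    }
}

class ConcreteRepository extends BaseRepository {
    private final boolean compressSetting = true;

    @Override
    boolean isCompressEnabled() {
        return compressSetting; // reads the default (false) during super()
    }
}

// The fix: pass the value in explicitly instead of reading it back:
// BaseRepository(boolean compress) { this.compress = compress; }
```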
Tanguy Leroux 510829f9f7
TransportVerifyShardBeforeCloseAction should force a flush (#38401)
This commit changes the `TransportVerifyShardBeforeCloseAction` so that it 
always forces the flush of the shard. It seems that #37961 is not sufficient to 
ensure that the translog and the Lucene commit share the exact same max 
seq no and global checkpoint information when one or more noop 
operations have been made.

The `BulkWithUpdatesIT.testThatMissingIndexDoesNotAbortFullBulkRequest` 
and `FrozenIndexTests.testFreezeEmptyIndexWithTranslogOps` test this trivial 
situation and they both fail about 1 in 10 executions.

Relates to #33888
2019-02-06 13:22:54 +01:00
Jay Modi 54dbf9469c
Update httpclient for JDK 11 TLS engine (#37994)
The Apache HttpClient implementations recently released
versions that solve TLS compatibility issues with the new TLS engine
that supports TLSv1.3 with JDK 11. This change updates our code to
use these versions since JDK 11 is a supported JDK and we should
allow the use of TLSv1.3.
2019-01-30 14:24:29 -07:00
Alpar Torok a7c3d5842a
Split third party audit exclusions by type (#36763) 2019-01-07 17:24:19 +02:00
Armin Braun 99f13b90d3
SNAPSHOT: Speed up HDFS Repository Writes (#37069)
* There is no point in hsyncing after every individual write since only completely written blobs have value for restores; this is already ensured by the `SYNC` flag, so there is no need to separately invoke `hsync` after individual writes
2019-01-03 16:16:05 +01:00
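A sketch of the pattern removed, assuming Hadoop's `FSDataOutputStream`; the chunking is illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;

void writeChunked(FSDataOutputStream out, byte[][] chunks) throws IOException {
    for (byte[] chunk : chunks) {
        out.write(chunk);
        // out.hsync();  <- removed: durability of a partially written blob has
        // no value for restores; the SYNC create flag covers the completed blob
    }
    out.close();
}
```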
Armin Braun 10d9819f99
Implement Atomic Blob Writes for HDFS Repository (#37066)
* Implement atomic writes the same way we do for the FsBlobContainer, via an atomic rename
* Relates #37011
2019-01-03 15:51:47 +01:00
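The FsBlobContainer-style pattern the commit mirrors, sketched here with `java.nio` on a plain filesystem; the temp-file naming is hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Stream the data to a temporary name first, then publish it with a single
// atomic rename so readers never observe a half-written blob.
void writeBlobAtomic(Path dir, String blobName, InputStream data) throws IOException {
    Path tmp = dir.resolve("pending-" + blobName);
    Files.copy(data, tmp);
    Files.move(tmp, dir.resolve(blobName), StandardCopyOption.ATOMIC_MOVE);
}
```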
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equal to `eq`)
or a lower bound of the total (in which case it is equal to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out of this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
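The new response shape looks like the following (values illustrative); appending `?rest_total_hits_as_int=true` to a search request restores the old integer form:

```
{
  "hits": {
    "total": { "value": 163, "relation": "eq" },
    "hits": []
  }
}
```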
Alpar Torok 60e45cd81d
Testing conventions task part 2 (#36107)
Closes #35435

- make it easier to add additional testing tasks with the proper configuration and add some where they were missing.
- mute or fix failing tests
- add a check as part of testing conventions to find classes not included in any testing task.
2018-12-05 14:20:01 +02:00
Gordon Brown b2057138a7
Remove AbstractComponent from AbstractLifecycleComponent (#35560)
AbstractLifecycleComponent no longer extends AbstractComponent. In
order to accomplish this, many, many classes now instantiate their own
logger.
2018-11-19 09:51:32 -07:00
Pratik Sanglikar f1135ef0ce Core: Replace deprecated Loggers calls with LogManager. (#34691)
Replace deprecated Loggers calls with LogManager.

Relates to #32174
2018-10-29 15:52:30 -04:00
Alpar Torok 59536966c2
Add a new "contains" feature (#34738)
The contains syntax was added in #30874 but the skips were not properly
put in place.
The java runner has the feature so the tests will run as part of the
build, but language clients will be able to support it at their own
pace.
2018-10-25 08:50:50 +03:00
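A sketch of how the assertion reads in a REST test, following the `match` example style used further down in this log; the plugin name is a placeholder:

```
  - skip:
      features: contains
  - contains: { 'nodes.$master.plugins': { name: repository-hdfs } }
```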
Benjamin Trent 6ccaa83d2a
Fixing line lengths in murmur3 and hdfs plugins (#34603) 2018-10-18 09:57:20 -05:00
Alpar Torok 2cc611604f
Run Third party audit with forbidden APIs CLI (part3/3) (#33052)
The new implementation is functionally equivalent to the old, Ant-based one.
It parses the task's standard error to get the missing classes and violations in the same way.
I considered re-using ForbiddenApisCliTask, but Gradle makes it hard to build inheritance with tasks that have task actions, since the order of the task actions can't be controlled.
This inheritance isn't fully desired either, as the third party audit task is much more opinionated and we don't want to expose some of the configuration.
We could probably extract a common base class without any task actions, but that's probably more trouble than it's worth.

Closes #31715
2018-08-28 10:03:30 +03:00
Lee Hinman 48281ac5bc
Use generic AcknowledgedResponse instead of extended classes (#32859)
This removes custom Response classes that extend `AcknowledgedResponse` and do nothing; these classes are not needed and we can directly use the non-abstract super-class instead.

While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.
2018-08-15 08:06:14 -06:00
Armin Braun 0a67cb4133
LOGGING: Upgrade to Log4J 2.11.1 (#32616)
* LOGGING: Upgrade to Log4J 2.11.1
* Upgrade to `2.11.1` to fix memory leaks in the slow logger when logging large requests
   * This was caused by a bug in Log4J https://issues.apache.org/jira/browse/LOG4J2-2269 and is fixed in `2.11.1` via https://git-wip-us.apache.org/repos/asf?p=logging-log4j2.git;h=9496c0c
* Fixes #32537
* Fixes #27300
2018-08-06 14:56:21 +02:00
James Baiera bba1da642d
Add new permission for JDK11 to load JAAS libraries (#32132)
Hadoop's security model uses the OS level authentication modules to collect 
information about the current user. In JDK 11, the UnixLoginModule makes 
use of a new permission to determine if the executing code is allowed to load 
the libraries required to pull the user information from the OS. This PR adds 
that permission and re-enables the tests that were previously failing when 
testing against JDK 11.
2018-07-23 15:11:41 -04:00
Nik Everett d596447f3d
Switch non-x-pack to new style requests (#32106)
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes most of the calls not in X-Pack to their new versions.
2018-07-16 17:44:19 -04:00
Vladimir Dolzhenko b1bf643e41
lazy snapshot repository initialization (#31606)
lazy snapshot repository initialization
2018-07-13 20:05:49 +02:00
Alpar Torok eaa247d543 Properly mute test involving JDK11, closes #31739 2018-07-06 09:58:35 +03:00
Alpar Torok 1e8e3f6dae Correct exclusion of test on JDK 11 2018-07-05 10:30:20 +03:00
Alpar Torok cf2295b408
Add JDK11 support and enable in CI (#31644)
* Upgrade bouncycastle

Required to fix
`bcprov-jdk15on-1.55.jar; invalid manifest format `
on jdk 11

* Downgrade bouncycastle to avoid invalid manifest

* Add checksum for new jars

* Update tika permissions for jdk 11

* Mute test failing on jdk 11

* Add JDK11 to CI

* Thread#stop(Throwable) was removed

http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-June/053536.html

* Disable failing tests #31456

* Temporarily disable doc tests

To see if there are other failures on JDK11

* Only blacklist specific doc tests

* Disable only failing tests in ingest attachment plugin

* Mute failing HDFS tests #31498

* Mute failing lang-painless tests #31500

* Fix backwards compatibility builds

Fix JAVA version to 10 for ES 6.3

* Add 6.x to bwc -> java10

* Prefix out and err from buildBwcVersion for readability

```
> Task :distribution:bwc:next-bugfix-snapshot:buildBwcVersion
  [bwc] :buildSrc:compileJava
  [bwc] WARNING: An illegal reflective access operation has occurred
  [bwc] WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/home/alpar/.gradle/wrapper/dists/gradle-4.5-all/cg9lyzfg3iwv6fa00os9gcgj4/gradle-4.5/lib/groovy-all-2.4.12.jar) to method java.lang.Object.finalize()
  [bwc] WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
  [bwc] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
  [bwc] WARNING: All illegal access operations will be denied in a future release
  [bwc] :buildSrc:compileGroovy
  [bwc] :buildSrc:writeVersionProperties
  [bwc] :buildSrc:processResources
  [bwc] :buildSrc:classes
  [bwc] :buildSrc:jar

```

* Also set RUNTIME_JAVA_HOME for bwcBuild

So that we can make sure it's not too new for the build to understand.

* Align bouncycastle dependency

* fix painless array tests

closes #31500

* Update jar checksums

* Keep 8/10 runtime/compile until consensus builds on 11

* Only skip failing tests if running on Java 11

* Failures depend on the compile java version, not the runtime version

* Condition doc test exceptions on compiler java version as well

* Disable hdfs tests based on runtime java

* Set runtime java to minimum supported for bwc

* PR review

* Add comment with ticket for forbidden apis
2018-07-05 03:24:01 +00:00
Yannick Welsch 2bb4f38371
Add write*Blob option to replace existing blob (#31729)
Adds a new parameter to the BlobContainer#write*Blob methods to specify whether the existing file
should be overwritten or not. For some metadata files in the repository, we actually want to replace
the current file. This is currently implemented through an explicit blob delete and then a fresh write.
In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1.
This change will therefore allow us to achieve the same functionality using less API requests.
2018-07-03 09:13:50 +02:00
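The shape of the extended contract, with the parameter name as best recalled from the PR:

```java
import java.io.IOException;
import java.io.InputStream;

public interface BlobContainer {
    // failIfAlreadyExists == false lets implementations overwrite in a single
    // API call instead of delete-then-write (one request instead of two on
    // S3, GCS and Azure).
    void writeBlob(String blobName, InputStream inputStream, long blobSize,
                   boolean failIfAlreadyExists) throws IOException;
}
```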
Alpar Torok 08b8d11e30
Add support for switching distribution for all integration tests (#30874)
* remove left-over comment

* make sure of the property for plugins

* skip installing modules if these exist in the distribution

* Log the distribution being run

* Don't allow running with integ-tests-zip passed externally

* top level x-pack/qa can't run with oss distro

* Add support for matching objects in lists

Makes it possible to have a key that points to a list and assert that a
certain object is present in the list. All keys have to be present and
values have to match. The objects in the source list may have additional
fields.

example:
```
  match:  { 'nodes.$master.plugins': { name: ingest-attachment }  }
```

* Update plugin and module tests to work with other distributions

Some of the tests expected that the integration tests would always be run
with the `integ-test-zip` distribution so that there would be no other
plugins loaded.

With this change, we check for the presence of the plugin without
assuming exclusivity.

* Allow modules to run on other distros as well

To match the behavior of tests.distributions

* Add and use a new `contains` assertion

Replaces the previous changes that caused `match` to do a partial match.

* Implement PR review comments
2018-06-26 06:49:03 -07:00
Tanguy Leroux 8b4d80ad09
Fix AntFixture waiting condition (#31272)
The AntFixture waiting condition evaluated to false
when it should have been true.
2018-06-13 12:40:22 +02:00
Tanguy Leroux b5f05f676c
Remove BlobContainer.move() method (#31100)
closes #30680
2018-06-07 10:48:31 +02:00
Yannick Welsch 1dca00deb9
Remove extra checks from HdfsBlobContainer (#31126)
This commit saves one network roundtrip when reading or deleting files from an HDFS repository.
2018-06-06 16:38:37 +02:00
David Turner 15df911e41
Suppress hdfsFixture if there are spaces in the path (#30302)
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips 
these tests in this case.
2018-05-11 13:36:31 +01:00
Yannick Welsch fc870fdb4c
Use simpler write-once semantics for HDFS repository (#30439)
There's no need for an extra `blobExists()` call when writing a blob to the HDFS service. The writeBlob implementation for the HDFS repository already uses the `CreateFlag.CREATE` option on the file creation, which ensures that the blob that's uploaded does not already exist. This saves one network roundtrip.
2018-05-11 09:50:37 +02:00
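A sketch of the write-once create, assuming Hadoop's `FileContext` API:

```java
import java.io.IOException;
import java.util.EnumSet;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

// CreateFlag.CREATE (without OVERWRITE) makes the create call itself fail if
// the file already exists, so no separate blobExists() probe is needed.
FSDataOutputStream createWriteOnce(FileContext fc, Path blobPath) throws IOException {
    return fc.create(blobPath, EnumSet.of(CreateFlag.CREATE));
}
```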
Ryan Ernst fab5e21e7d Build: Split distributions into oss and default
This commit makes x-pack a module and adds it to the default
distribution. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
2018-04-20 15:33:57 -07:00
Jason Tedor aded32f48f
Fix third-party audit tasks on JDK 8
This one is interesting. The third party audit task runs inside the
Gradle JVM. This means that if Gradle is started on JDK 8, the third
party audit tasks will fail as a result of the changes to support
building Elasticsearch with the JDK 9 compiler. This commit reverts the
third party audit changes to support running this task when Gradle is
started with JDK 8.

Relates #28256
2018-01-16 22:59:29 -05:00
Jason Tedor 0a79555a12
Require JDK 9 for compilation (#28071)
This commit modifies the build to require JDK 9 for
compilation. Henceforth, we will compile with a JDK 9 compiler targeting
JDK 8 as the class file format. Optionally, RUNTIME_JAVA_HOME can be set
as the runtime JDK used for running tests. To enable this change, we
separate the meaning of the compiler Java home versus the runtime Java
home. If the runtime Java home is not set (via RUNTIME_JAVA_HOME) then
we fallback to using JAVA_HOME as the runtime Java home. This enables:
 - developers only have to set one Java home (JAVA_HOME)
 - developers can set an optional Java home (RUNTIME_JAVA_HOME) to test
   on the minimum supported runtime
 - we can test compiling with JDK 9 running on JDK 8 and compiling with
   JDK 9 running on JDK 9 in CI
2018-01-16 13:45:13 -05:00
James Baiera e16f1271b6
Fix SecurityException when HDFS Repository used against HA Namenodes (#27196)
* Sense HA HDFS settings and remove permission restrictions during regular execution.

This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured. 
The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has 
been added for reproducing the effects of a Namenode failing over during regular repository 
usage. Going forward, the HDFS Repository will still be subject to its self-imposed permission 
restrictions during normal use, but will no longer restrict them when running against an HA 
enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not 
further restrict the permissions so that the transparent operation to failover to a different 
Namenode in the client does not raise security exceptions. Additionally, we are now testing the 
secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This 
required adding a missing library (commons-codec) to support this change.
2017-12-01 14:26:05 -05:00