This commit changes some Azure tests so that they do not rely on
MockZenPing and TestZenDiscovery anymore, but instead use a mocked
AzureComputeService that exposes internal test cluster nodes as if
they were real Azure nodes.
Related to #27859Closes#27917, #11533
TestZenDiscovery is used to allow discovery based on in memory structures. This isn't a relevant for the cloud providers tests (but isn't a problem at the moment either)
* Fixes ByteSizeValue to serialise correctly
This fix makes a few fixes to ByteSizeValue to make it possible to perform round-trip serialisation:
* Changes wire serialisation to use Zlong methods instead of VLong methods. This is needed because the value `-1` is accepted but previously if `-1` is supplied it cannot be serialised using the wire protocol.
* Limits the supplied size to be no more than Long.MAX_VALUE when converted to bytes. Previously values greater than Long.MAX_VALUE bytes were accepted but would be silently interpreted as Long.MAX_VALUE bytes rather than erroring so the user had no idea the value was not being used the way they had intended. I consider this a bug and so fine to include this bug fix in a minor version but I am open to other points of view.
* Adds a `getStringRep()` method that can be used when serialising the value to JSON. This will print the bytes value if the size is positive, `”0”` if the size is `0` and `”-1”` if the size is `-1`.
* Adds logic to detect fractional values when parsing from a String and emits a deprecation warning in this case.
* Modifies hashCode and equals methods to work with long values rather than doubles so they don’t run into precision problems when dealing with large values. Previous to this change the equals method would not detect small differences in the values (e.g. 1-1000 bytes ranges) if the actual values where very large (e.g. PBs). This was due to the values being in the order of 10^18 but doubles only maintaining a precision of ~10^15.
Closes#27568
* Fix bytes settings default value to not use fractional values
* Fixes test
* Addresses review comments
* Modifies parsing to preserve unit
This should be bwc since in the case that the input is fractional it reverts back to the old method of parsing it to the bytes value.
* Addresses more review comments
* Fixes tests
* Temporarily changes version check to 7.0.0
This will be changed to 6.2 when the fix has been backported
This pull request changes the S3BlobContainer.blobExists() method implementation
to make it use the AmazonS3.doesObjectExist() method instead of
AmazonS3.getObjectMetadata(). The AmazonS3 implementation takes care of
catching any thrown AmazonS3Exception and compares its response code with 404,
returning false (object does not exist) or lets the exception be propagated.
Add support for filtering fields returned as part of mappings in get index, get mappings, get field mappings and field capabilities API.
Plugins can plug in their own function, which receives the index as argument, and return a predicate which controls whether each field is included or not in the returned output.
This commit adds the node name to the names of thread pool executors so
that the node name is visible in rejected execution exception messages.
Relates #27663
Using custom rules in the icu_collation filter can fail on Windows. If the rules are interpreted
as a file location, this leads to an InvalidPathException when trying to read the rules from a file.
This new snapshot mostly brings a change to TopFieldCollector which can now
early terminate collection when trackTotalHits is `false`.
As a follow-up, we should replace our usage of
`EarlyTerminatingSortingCollector` with this new option.
* Sense HA HDFS settings and remove permission restrictions during regular execution.
This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured.
The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has
been added for reproducing the effects of a Namenode failing over during regular repository
usage. Going forward, the HDFS Repository will still be subject to its self imposed permission
restrictions during normal use, but will no longer restrict them when running against an HA
enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not
further restrict the permissions so that the transparent operation to failover to a different
Namenode in the client does not raise security exceptions. Additionally, we are now testing the
secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This
includes a missing library (commons codec) in order to support this change.
This awaits fix has been there forever and no one seems to know what to
do with this test. I say let CI churn on it because it passed for me
three out of three times. If there is something wrong with it, we will
know quickly and can then address with the new information that we have.
The main highlight of this new snapshot is that it introduces the opportunity
for queries to opt out of caching. In case a query opts out of caching, not only
will it never be cached, but also no compound query that wraps it will be
cached.
This commit changes the DefaultHttpRequestInitializer in order to make
it create new HttpIOExceptionHandler and HttpUnsuccessfulResponseHandler
for every new HTTP request instead of reusing the same two handlers for
all requests.
Closes#27092
The AWS SDK has a transitive dependency on Jackson Databind. While the
AWS SDK was recently upgraded, the Jackson Databind dependency was not
pulled along with it to the version that the AWS SDK depends on. This
commit upgrades the dependencies for discovery-ec2 and repository-s3
plugins to match versions on the AWS SDK transitive dependencies.
Relates #27361
We use affix settings to group settings / values under a certain namespace.
In some cases like login information for instance a setting is only valid if
one or more other settings are present. For instance `x.test.user` is only valid
if there is an `x.test.passwd` present and vice versa. This change allows to specify
such a dependency to prevent settings updates that leave settings in an inconsistent
state.
Now the blob size information is available before writing anything,
the repository implementation can know upfront what will be the
more suitable API to upload the blob to S3.
This commit removes the DefaultS3OutputStream and S3OutputStream
classes and moves the implementation of the upload logic directly in the
S3BlobContainer.
related #26993closes#26969
Gradle 5.0 will remove support for colons in configuration and task
names. This commit fixes this for our build by removing all current uses
of colons in configuration and task names.
Relates #27305
Only tests should use the single argument Environment constructor. To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.
Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns. This Environment
is also available via dependency injection.
For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and do not create directories if they don't exist.
Closes#21495
* Enhances exists queries to reduce need for `_field_names`
Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.
This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.
Closes#26770
* Addresses review comments
* Addresses more review comments
* implements existsQuery explicitly on every mapper
* Reinstates ability to perform term query on `_field_names`
* Added bwc depending on index created version
* Review Comments
* Skips tests that are not supported in 6.1.0
These values will need to be changed after backporting this PR to 6.x
Currently, when we create a BeiderMorseFilter with an unspecified `languageset`,
the filter will not guess the language, which should be the default behaviour.
This change fixes this and adds a simple test for the cases with and without
provided `languageset` settings.
Closes#26771
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.
With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting. Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.
Relates to #26723
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already.
This commit simplifies a bit that part of the code.
It also removes `repositoryName` from AzureBlobStore which was not used anymore.
Also we move some properties in AzureBlobContainer to `private` members.
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the usecases needs to access a key, value map based on the _leave node_ of the setting ie. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is loosing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.
Even though you annotate the Test class with `@ThirdParty` the static
code is initialized.
In that case it fails with:
```
==> Test Info: seed=529C3C6977F695FC; jvms=3; suites=6
Suite: org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests
ERROR 0.00s J2 | AzureSnapshotRestoreTests (suite) <<< FAILURES!
> Throwable #1: java.lang.IllegalStateException: to run integration tests, you need to set -Dtests.thirdparty=true and -Dtests.azure.account=azure-account -Dtests.azure.key=azure-key
> at org.elasticsearch.cloud.azure.AzureTestUtils.generateMockSecureSettings(AzureTestUtils.java:37)
> at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.generateMockSettings(AzureSnapshotRestoreTests.java:81)
> at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.<clinit>(AzureSnapshotRestoreTests.java:84)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
Completed [1/6] on J2 in 2.21s, 0 tests, 1 error <<< FAILURES!
```
Closes#26812.
(cherry picked from commit eb6d714 for master branch)
* Use Azure upload method instead of our own implementation
We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container
Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote.
And well... Let's just read the doc!
* Adapt integration tests to secure settings
That was a missing part in #23405.
* Simplify all the integration tests and *extends ESBlobStoreRepositoryIntegTestCase tests
* removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository
* merges 2 IT classes so all IT tests are ran from one single class
* We don't remove/create anymore the container between each single test but only for the test suite
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code:
`AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff.
But with PR #22858, I wrote:
```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
if (blobStore.doesContainerExist(blobStore.container()) == false) {
throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
" creating an azure snapshot repository backed by it.");
}
}
```
instead of
```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
if (blobStore.doesContainerExist(blobStore.container()) == false) {
throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
" creating an azure snapshot repository backed by it.");
}
super.initializeSnapshot(snapshotId, indices, clusterMetadata);
}
```
As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved.
Closes#26777.
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.