This commit adds the SPDX license header and modifications copyright to security
policy files.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
This commit adds the SPDX Apache-2.0 license header along with an additional
copyright header for all modifications.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
This commit refactors instances of 'elasticsearch' with opensearch everywhere
except references to issues, and other places needed to test compatibility with
old elasticsearch clusters.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
Fix miscellaneous issues identified during `gradle precommit`. These issues are the side effects of the renaming to OpenSearch work.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
This commit refactors the remaining o.e.index and o.e.test packages in the
test/fixtures module. References throughout the codebase are also refactored.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
* [Rename] plugins (#193)
This PR refactors files under "plugins" folders part of the Elasticsearch to OpenSearch renaming effort.
Signed-off-by: Harold Wang <harowang@amazon.com>
This commit refactors the classes in o.e.action.support to
o.opensearch.action.support. The remaining directories will be refactored in a
separate commit.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors top level classes in o.e.action to o.opensearch.action.
References throughout the rest of the codebase have been updated.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.
In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.
Backport of #60371
For OSS plugins that being with repository-*, integTest
task is now a no-op and all of the tests are now executed via a test,
yamlRestTest, javaRestTest, or internalClusterTest.
related: #56841
related: #59444
Removing these limits as they cause unnecessarily many object in the blob stores.
We do not have to worry about BwC of this change since we do not support any 3rd party
implementations of Azure or GCS.
Also, since there is no valid reason to set a different than the default maximum chunk size at this
point, removing the documentation (which was incorrect in the case of Azure to begin with) for the setting
from the docs.
Closes#56018
In order to ensure that we do not write a broken piece of `RepositoryData`
because the phyiscal repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.
Relates #56911
Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account
(i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository
setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a
per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to
configure throttling in a single place.
The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to
`40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change
will be observed by clusters where the recovery and restore settings were not adapted.
Relates https://github.com/elastic/elasticsearch/issues/57023
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Adds tracking for the API calls performed by the Azure Storage
underlying SDK. It relies on the ability to hook a request
listener into the OperationContext.
Backport of #56773
Add tracking for regular and multipart uploads.
Regular uploads are categorized as PUT.
Multi part uploads are categorized as POST.
The number of documents created for the test #testRequestStats
have been increased so all upload methods are exercised.
Backport of #56826
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
This is a backport of #54803 for 7.x.
This pull request cherry picks the squashed commit from #54803 with the additional commits:
6f50c92 which adjusts master code to 7.x
a114549 to mute a failing ILM test (#54818)
48cbca1 and 50186b2 that cleans up and fixes the previous test
aae12bb that adds a missing feature flag (#54861)
6f330e3 that adds missing serialization bits (#54864)
bf72c02 that adjust the version in YAML tests
a51955f that adds some plumbing for the transport client used in integration tests
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
This reverts commit 23cccf088810b8416ed278571352393cc2de9523.
Unfortunately SAS token auth still doesn't work with bulk deletes so we can't use them yet.
Closes#54080
Upgrading to 8.6.2 in #53865 broke running against HTTPs endpoints (and hence real azure)
because the https url connection needs the newly added permission to work.
* Remove BlobContainer Tests against Mocks
Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.
Closes#30424
* Remove Unused Single Delete in BlobStoreRepository
There are no more production uses of the non-bulk delete or the delete that throws
on missing so this commit removes both these methods.
Only the bulk delete logic remains. Where the bulk delete was derived from single deletes,
the single delete code was inlined into the bulk delete method.
Where single delete was used in tests it was replaced by bulk deleting.
Today settings can declare dependencies on another setting. This
declaration is implemented so that if the declared setting is not set
when the declaring setting is, settings validation fails. Yet, in some
cases we want not only that the setting is set, but that it also has a
specific value. For example, with the monitoring exporter settings, if
xpack.monitoring.exporters.my_exporter.host is set, we not only want
that xpack.monitoring.exporters.my_exporter.type is set, but that it is
also set to local. This commit extends the settings infrastructure so
that this declaration is possible. The use of this in the monitoring
exporter settings will be implemented in a follow-up.
* Make BlobStoreRepository Aware of ClusterState (#49639)
This is a preliminary to #49060.
It does not introduce any substantial behavior change to how the blob store repository
operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency
by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation
(create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple.
This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the
repository operates in #49060
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
Fixing a few small issues found in this code:
1. We weren't reading the request headers but the response headers when checking for blob existence in the mocked single upload path
2. Error code can never be `null` removed the dead code that resulted
3. In the logging wrapper we weren't checking for `Throwable` so any failing assertions in the http mock would not show up since they
run on a thread managed by the mock http server
This commit fixes the server side logic of "List Objects" operations
of Azure and S3 fixtures. Until today, the fixtures were returning a "
flat" view of stored objects and were not correctly handling the
delimiter parameter. This causes some objects listing to be wrongly
interpreted by the snapshot deletion logic in Elasticsearch which
relies on the ability to list child containers of BlobContainer (#42653)
to correctly delete stale indices.
As a consequence, the blobs were not correctly deleted from the
emulated storage service and stayed in heap until they got garbage
collected, causing CI failures like #48978.
This commit fixes the server side logic of Azure and S3 fixture when
listing objects so that it now return correct common blob prefixes as
expected by the snapshot deletion process. It also adds an after-test
check to ensure that tests leave the repository empty (besides the
root index files).
Closes#48978
This commit adds a new :test:fixtures:azure-fixture project which
provides a docker-compose based container that runs a AzureHttpFixture
Java class that emulates an Azure Storage service.
The logic to emulate the service is extracted from existing tests and
placed in AzureHttpHandler into the new project so that it can be
easily reused. The :plugins:repository-azure project is an example
of such utilization.
The AzureHttpFixture fixture is just a wrapper around AzureHttpHandler
and is now executed within the docker container.
The :plugins:repository-azure:qa:microsoft-azure project uses the new
test fixture and the existing AzureStorageFixture has been removed.
In repository integration tests, we drain the HTTP request body before
returning a response. Before this change this operation was done using
Streams.readFully() which uses a 8kb buffer to read the input stream, it
now uses a 1kb for the same operation. This should reduce the allocations
made during the tests and speed them up a bit on CI.
Co-authored-by: Armin Braun <me@obrown.io>
In #47176 we changed the internal HTTP server that emulates
the Azure Storage service so that it includes a response body
for injected errors. This fixed most of the issues reported in
#47120 but sadly I missed to map one error to its Azure
equivalent, and it triggered some CI failures today.
Closes#47120
We were incorrectly handling `IOExceptions` thrown by
the `InputStream` side of the upload operation, resulting
in a `ClassCastException` as we expected to never get
`IOException` from the Azure SDK code but we do in practice.
This PR also sets an assertion on `markSupported` for the
streams used by the SDK as adding the test for this scenario
revealed that the SDK client would retry uploads for
non-mark-supporting streams on `IOException` in the `InputStream`.
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
The Azure SDK client expects server errors to have a body,
something that looks like:
<?xml version="1.0" encoding="utf-8"?>
<Error>
<Code>string-value</Code>
<Message>string-value</Message>
</Error>
I've forgot to add such errors in Azure tests and that triggers
some NPE in the client like the one reported in #47120.
Closes#47120
Similarly to what has been done for S3 and GCS, this commit
adds unit tests that verify the retry logic of the Azure SDK
client implementation when the remote service returns errors.
It only tests the retry logic in case of errors and not in
case of timeouts because Azure client timeout options are
not exposed as settings.
This commit adds support for Put Block API to the internal HTTP server
used in Azure repository integration tests. This allows to test the
behavior of the Azure SDK client when the Azure Storage service
returns errors when uploading Blob in multiple blocks or when
downloading a blob using ranged downloads.
There were some issues with the Azure implementation requiring
permissions to list all containers ue to a container exists
check. This was caught in CI this time, but going forward we
should ensure that CI is executed using a token that does not
allow listing containers.
Relates #43288
This commit modifies the HTTP server used in
AzureBlobStoreRepositoryTests so that it randomly returns
server errors for any type of request executed by the Azure client.