This commit refactors o.e.Version to o.opensearch.Version. This is retained in a
single commit to serve as a reference for re-versioning the opensearch codebase
from legacy 7.10 to 1.0.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors all OpenSearch classes in the root server package to
o.opensearch. All references throughout the codebase are also refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the o.e.cli and o.e.client packages from elasticsearch to
o.opensearch.cli and o.opensearch.client packages in the server module,
respectively.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the following subpackages:
* o.e.cluster.health
* o.e.cluster.metadata
* o.e.cluster.node
to o.opensearch.cluster.*. All other references throughout the codebase are
updated.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the heavily used ESPolicy, Elasticsearch (main class), and Elasticsearch
prefixed test classes used in the bootstrap package under the server module. Refactoring the
namespace will come in a separate commit.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
* Refector server/src/main/java/org/opensearch/gateway as part of Elasticsearch to OpenSearch renaming
Signed-off-by: Harold Wang <harowang@amazon.com>
Refactor the repositories package in the server module to rename the package from `org.elasticsearch.repositories` to `org.opensearch.repositories`
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
This commit refactors the following:
* o.e.cluster.ack
* o.e.cluster.action
* o.e.cluster.block
* o.e.cluster.coordination
to o.opensearch package. all other references are also refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors all classes in o.e.cluster to o.opensearch.cluster.
Refereences throughtout the code base are updated.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
Refactor the `libs/cli` module to rename the package name from `org.elasticsearch.cli` to `org.opensearch.cli` as part of the rename to OpenSearch work.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor the `org.elasticsearch.usage` package in the server module to rename it to `org.opensearch.usage`.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor the package`org.elasticsearch.script` in server module to rename it to`org.opensearch.script`.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor the `libs/geo` module to rename the package name from `org.elasticsearch.geometry` to `org.opensearch.geometry` as part of the rename to OpenSearch work.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor the libs/plugin-cli and libs/secure-sm modules to rename the package names
- `org.elasticsearch.plugins` to `org.opensearch.plugins`
- `org.elasticsearch.secure_sm` to `org.opensearch.secure_sm`
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor following code paths as part of the Elasticsearch to OpenSearch renaming effort.
- server/src/test/java/org/elasticsearch/discovery
- server/src/test/java/org/elasticsearch/gateway
- server/src/test/java/org/elasticsearch/http
- server/src/test/java/org/elasticsearch/ingest
Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>
Refactor the server/tasks package to rename the package names from`org.elasticsearch.tasks` to `org.opensearch.tasks`.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Refactor the server/threadpool package to rename the package names from`org.elasticsearch.threadpool` to `org.opensearch.threadpool`.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
* [Rename] plugins (#193)
This PR refactors files under "plugins" folders part of the Elasticsearch to OpenSearch renaming effort.
Signed-off-by: Harold Wang <harowang@amazon.com>
This commit fixes an incorrect import in ClusterHealthRequest after refactoring
o.o.action.support classes.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the classes in o.e.action.support to
o.opensearch.action.support. The remaining directories will be refactored in a
separate commit.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
Refactor `server/snapshots` to rename the package names from `org.elasticsearch.snapshots` to `org.opensearch.snapshots` as part of the rename to OpenSearch work.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
This commit refactors all classes in o.e.action.bulk to o.opensearch.action.bulk
all references throughout the rest of the codebase are updated.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
Refactor below three folders as part of the Elasticsearch to OpenSearch renaming effort.
. server/src/internalClusterTest
. server/src/main/java11
. server/src/main/resources
. rest-api-spec
Signed-off-by: Harold Wang <harowang@amazon.com>
This commit adds back the full qualified package name back to javadoc in
RestoreInfo and RepositoriesMetadata.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors all classes in o.e.action.admin.cluster to
org.opensearch.action.admin.cluster. References are updated
throughout the codebase.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors top level classes in o.e.action to o.opensearch.action.
References throughout the rest of the codebase have been updated.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors o.e.action.admin.indices package to
o.opensearch.action.admin.indices. References through out the codebase have been
updated to reflect the new package location.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchParseException class in the server module to
OpenSearchParseException. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchMergePolicy class in the server module to
OpenSearchMergePolicy. References and usages throughout the rest of the codebase
are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchConcurrentMergeScheduler class in the server
module to OpenSearchConcurrentMergeScheduler. References and usages throughout
the rest of the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors XContentElasticsearchExtension class in the server module
to XContentOpenSearchExtension. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchReaderManager class in the server module to
OpenSearchReaderManager. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchleafReader class located in the server
module to OpenSearchLeafReader. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchWrapperException class in the server
module to OpenSearchWrapperException. References and usages throughout the rest
of the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchDirectoryReader class located in the
server module to OpenSearchDirectoryReader. References and usages, along with
method names, throughout the rest of the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchNodeCommand class located in the server
module to OpenSearchNodeCommand. References and usages throughout the rest of
the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchTimeoutException class in the server
module to OpenSearchTimeoutException. References and usages throught the rest of
the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchStatusException in the server module to
OpenSearchStatusException. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchSecurityException class in the server module
to OpenSearchSecurityException. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors ElasticsearchGenerationException class in the server
module to OpenSearchGenerationException. References and usages throughout the
rest of the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors all instances of elasticsearch in server/src/test/java/org/apache to opensearch.
Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>
This commit refactors the ElasticsearchCorruptionException in the server module
to OpenSearchCorruptionException. References and usages throughtout the rest of
the codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchClient class located in the server module to
OpenSearchClient. References and usages throughout the rest of the codebase are
fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors the ElasticsearchException class located in the server module
to OpenSearchException. References and usages throughout the rest of the
codebase are fully refactored.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This commit refactors all instances of elasticsearch in
server/src/main/java/org/apache to opensearch.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
This reverts commit c50e8c83a2476ca4ca2f1dd05fa5a608bc0e9ef6
which went should have merged to the rename branch instead of
the main branch.
Signed-off-by: Peter Nied <petern@amazon.com>
This commit refactors all instances of elasticsearch in
server/src/main/java/org/apache to opensearch.
Signed-off-by: Nicholas Knize <nknize@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
This commit changes the building, packaging, and testing framework to only support OSS on different distributions.
Next steps:
completely remove -oss flag dependencies in package and build tests
move 6.x bwc testing to be an explicit option
remove any references to elastic.co download site (or replace with downloads from the OSS website)
Co-authored-by: Himanshu Setia <setiah@amazon.com>
Co-authored-by: Rabi Panda <pandarab@amazon.com>
Co-authored-by: Himanshu Setia <58999915+setiah@users.noreply.github.com>
Co-authored-by: Sarat Vemulapalli <vemsarat@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
ReplicationOperation can notify the listener twice if the primary shard
is demoted after it has completed the primary operation.
Closes#68049
Signed-off-by: Peter Nied <petern@amazon.com>
This commit fixes DeleteDataStreamRequestTests.testDeleteSnapshottingDataStream unit test failure by passing SnapshotsInProgress.Entry.SUCCESS in the createEntrymethod.
Signed-off-by: Peter Nied <petern@amazon.com>
This commit cleans up the following:
* Remove unused imports
* Remove ILM settings in hlrc testCluster formation
* Comment out security users settings in ElasticsearchNode creation for build-tools tests
Signed-off-by: Peter Nied <petern@amazon.com>
This fixes the constructor for IndexNameExpressionResolver to pass in
Settings.EMPTY to a ThreadContext used by the resolver.
Signed-off-by: Peter Nied <petern@amazon.com>
Drops an assertion that in 1911 Paris rolled its clocks back past
midnight. This assertion is so far in the past that it isn't
consistently included in the JDK's tzdb. When we upgraded to 15.0.1 it
fell out of my tzdb locally. It doesn't seem to have happened to CI,
oddly, but it just doesn't seem worth keeping.
Closes#67930
We have seen a case where the memory of IndexingPressure was
re-adjusted twice. With this commit, we will log that error with a
stack trace so that we can figure out the source of the issue.
Our test for early termination would break if we made many small
segments because none of them would be large enough to trigger the early
termination. This makes sure have only a single segment for these test,
making sure we terminate early.
Closes#62769
This test failed on WindowsFS. We failed to remove the corrupted file if
it's being opened (for a short window by ListShardStore action) and the
pending delete files were clear when we restarted that node.
This commit fixes the issue by shutting down the node before removing
the corrupted file to avoid any access to that file.
Closes#66893
We started passing down the root document's _source when processing
nested hits, to avoid reloading and reparsing the root source for each hit.
Unfortunately the approach did not work when there are multiple layers of
`inner_hits`. In this case, the second-layer inner hit received its immediate
parent's source instead of the root source. This parent source is filtered to
just contain the parts corresponding to the nested document, but the source
parsing logic is designed to always operate on the top-level root source. This
caused failures when loading the second-layer inner hits.
This PR makes sure to always pass the root document's _source when processing
inner hits, even if there are multiple layers.
Previously we treated attribute filtering for _tier-prefixed attributes a pass-through, meaning
that they were essentially always treated as matching in DiscoveryNodeFilters.match, however, for
exclude settings, this meant that the node was considered to match the node if a _tier* filter was
specified.
This commit prunes these attributes from the DiscoveryNodeFilters when considering the filters for
FilterAllocationDecider so that they are only considered in DataTierAllocationDecider.
Resolves#66679
This PR fixes two bugs that can arise when _source is disabled and we fetch nested documents:
* Fix exception when highlighting `inner_hits` with disabled _source.
* Fix exception in nested `top_hits` with disabled _source.
* Add more tests for highlighting `inner_hits`.
For async searches (EQL included) the client's request headers were
erroneously stored in the .tasks index. This might expose the requesting
client's HTTP Authorization header. This PR fixes that by employing the
usual approach to store only the security-internal headers, which carry
the authentication result, instead of the original Authorization header,
which is commonly utilized to redo authentication for scheduled tasks.
This PR removes outdated overrides in some tests that prevent them from testing
older index versions. Also removes an old comment + logic from
AggregatorFactoriesTests.
This commit fixes the cat tasks api parameter specification and the
handler so that the parameters are consumed during request preparation.
Closes#59493
Backport of #66272
This commit ensures that the after key is parsed with the doc value formatter.
This is needed for unsigned longs that uses shifted longs internally.
Closes#65685
In the assertion mentioned in the new comment we first get the `endingSnapshots`
and then check that we don't have a listener that isn't referred to by it so we need
to remove from the listers map before removing from `endingSnapshots` to avoid rare, random
assertion tripping here with concurrent repository operations.
If creating the latest translog file and retrieving a translog stats
happen within the same millisecond, then the earliestLastModifiedAge
will be zero.
Closes#66092
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.
Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.
The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.
Fixes#65533.
This commit adjusts the behavior when calculating the diff between two
`AbstractScopedSettings` objects, so that the default values of settings
whose default values depend on the values of other settings are
correctly calculated. Previously, when calculating the diff, the default
value of a depended setting would be calculated based on the default
value of the setting(s) it depends on, rather than the current value of
those settings.
In #61906 we added the possibility for the master node to fetch
the size of a shard snapshot before allocating the shard to a
data node with enough disk space to host it. When merging
this change we agreed that any failure during size fetching
should not prevent the shard to be allocated.
Sadly it does not work as expected: the service only triggers
reroutes when fetching the size succeed but never when it
fails. It means that a shard might stay unassigned until
another cluster state update triggers a new allocation
(as in #64372). More sadly, the test I wrote was wrong as
it explicitly triggered a reroute.
This commit changes the InternalSnapshotsInfoService
so that it also triggers a reroute when fetching the snapshot
shard size failed, ensuring that the allocation can move
forward by using an UNAVAILABLE_EXPECTED_SHARD_SIZE
shard size. This unknown shard size is kept around in the
snapshot info service until no corresponding unassigned
shards need the information.
Backport of #65436
This change simplifies logic and allow more legit cases in Metadata.Builder.validateDataStreams.
It will only show conflict on names that are in form of .ds-<data stream name>-<[0-9]+> and will allow any names like .ds-<data stream name>-something-else-<[0-9]+>.
This fixes problem with rollover when you have 2 data streams with names like a and a-b - currently if a-b has generation greater than a you won't be able to rollover a anymore.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
TransportService doesn't respond to the pending requests of proxy
connections when the underlying connections get disconnected because
proxy connections do not override the getCacheKey method. Some CCS
requests would never be completed because of this bug.
These strings are quite long individually and will be repeated
potentially up to the number of snapshots in the repository times.
Since these make up more than half of the size of the repository metadata
and are likely the same for all snapshots the savings from deduplicating them
can make up for more than half the size of `RepositoryData` easily in most real-world
cases.
Store stats can be `null` if e.g. the shard was already closed
when the stats where retrieved. Don't record those shards in the
sizes map to fix an NPE in this case.
When n=0 in TranslogTests.testTotalTests we never update earliestLastModifiedAge so it fails comparison with default value of total.getEarliestLastModifiedAge() which is 0.
In this change we always check this special case and then select n>0
Closes#65629
In the refactoring of TextFieldMapper, we lost the ability to define
a default search or search_quote analyzer in index settings. This
commit restores that ability, and adds some more comprehensive
testing.
Fixes#65434
In certain situations, such as when configured in FIPS 140 mode,
the Java security provider in use might throw a subclass of
java.lang.Error. We currently do not catch these and as a result
the JVM exits, shutting down elasticsearch.
This commit attempts to address this by catching subclasses of Error
that might be thrown for instance when a PBKDF2 implementation
is used from a Security Provider in FIPS 140 mode, with the password
input being less than 14 bytes (112 bits).
- In our PBKDF2 family of hashers, we catch the Error and
throw an ElasticsearchException while creating or verifying the
hash. We throw on verification instead of simply returning false
on purpose so that the message bubbles up and the cause becomes
obvious (otherwise it would be indistinguishable from a wrong
password).
- In KeyStoreWrapper, we catch the Error in order to wrap and re-throw
a GeneralSecurityException with a helpful message. This can happen when
using any of the keystore CLI commands, when the node starts or when we
attempt to reload secure settings.
- In the `elasticsearch-users` tool, we catch the ElasticsearchException that
the Hasher class re-throws and throw an appropriate UserException.
Tests are missing because it's not trivial to set CI in fips approved mode
right now, and thus any tests would need to be muted. There is a parallel
effort in #64024 to enable that and tests will be added in a followup.
KeyStoreAwareCommand attempted to deduce whether an error occurred
because of a wrong password by checking the cause of the
SecurityException that KeyStoreWrapper.decrypt() throws. Checking
for AEADBadTagException was wrong becase that exception could be
(and usually is) wrapped in an IOException. Furthermore, since we
are doing the check already in KeyStoreWrapper, we can just return
the message of the SecurityException to the user directly, as we do
in other places.
A bug was introduced in 7.10 that causes explicit `null` values to be indexed in the _field_names
field. This change fixes this bug for newly ingested data but `null` values ingested with 7.10 will
continue to match `exists` query so a reindex is required.
Fixes#65306
This change fixes the equals and hashCode methods of the custom FieldValuesSource
that is used internally to extract the value from a doc value field.
Using the field data instance to check equality prevented the query to be cached in
previous versions. Switching to the field name should make the query eligible for
caching again.
Watcher has a search template that stores indices options to be used as
part of a search during watch execution, but this was not updated to be
aware of hidden indices and the `hidden` expand_wildcards option. This
change makes use of the `IndicesOptions#toXContent` method in Watcher,
which already handles the new value. Additionally, the XContent parsing
is moved to the IndicesOptions class so that we will be less likely to
miss updating this in the future.
Closes#65148
Backport of #65332
In aa1ea96b8698aa12bed1c4e8d704882a2a639791 I made all
`testReduceRandom` tests for aggs mimick production more precisely.
More precisely, they pick the correct "lead" result when performing
partial reduction. This is great, but, sadly, some tests assumed that we
always reduced against the "first" aggregator. This fixes those tests.
Closes#65163
This commit updates the IndexAbstractionResolver so that hidden indices
are properly resolved when date math is in use and when we are checking
if the index is visible.
Closes#65157
Backport of #65236
Backport of #64454
- Add LongRareTerms and StringRareTerms to the DefaultNamedXContents,
ensure that the response of RareTerms aggregation can be parsed
correctly.
- Add testSearchWithRareTermsAgg method to test the response of
RareTerms aggregation can be parsed correctly.
- Add some test code to ensure the AggregationsTests can execute
successfully.
Co-authored-by: bellengao <gbl_long@163.com>
We were correctly dealing with boosts that had an effect, but mappers
that had a silently accepted but ignored boost parameter were throwing
an error instead of continuing to ignore the boost but emitting a
warning.
Fixes#64982
Currently a rejected execution exception can be swallowed when async
actions return during transport bulk actions. This includes scenarios
where we went async to perform ingest pipelines or index creation. This
commit resolves the issue by propagating a rejected exception.
Node roles vary by version, and new roles are suppressed for BWC. This
means we can receive a join from a node that's already in the cluster
but with a different set of roles: the node didn't change roles, but the
cluster state came via an older master. This commit ensures that we
properly process a join from such a node to ensure that the roles are
correct.
Closes#62840
This change fixes a bug introduced in #61779 that uses a compound order to
compare buckets when merging. The bug is triggered when the compound order
uses a primary sort ordered by key (asc or desc).
This commit ensures that we always extract the primary sort when comparing keys
during merging.
The PR is marked as no-issue since the bug has not been released in any official version.
This commit internalizes whether or not a role represents the ability to
contain data. In the future, this will let us remove the compatibility
role notion.
Now that we're consistently using `cat_match` to filter which shards we
run on we can get this confusing case:
1. You have a search with, say, a range and a sub-agg.
2. That search has a query that `can_match` can recognize will match no
docs. On *any* shard.
3. So we dutifully run it on a single shard so it can produce the
"empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
documents.
6. Before this commit, that range aggregator *also* never produced any
sub-aggs.
So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.
We've had this problem for a long, long time but it is more confusing
now because `can_match` is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the `range` agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
`can_match` is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what what happening I know
how the individual pieces of all of this works. It took four hours to
figure out how they fit together in this case....
Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.
Closes#64142
This commit adds logging to indicate whether or not we are using the
bundled JDK. We distinguish between using a distribution that bundles
the JDK versus using a distribution that does not bundle the JDK.
In 7.x we can't just by default generate this setting as it might not be
supported by data nodes that are assigned shards for an older version in mixed version
clusters.
Closes#64152
This commit fixes an issue with the detection on macOS for whether or
not the bundled JDK is being used. The logic between macOS and non-macOS
is different because the JDK has a different directory structure on
macOS versus non-macOS. However, due to notarization issues, we changed
the top-level directory from jdk to jdk.app, yet never updated this
detection logic to account for that.
Ideally, we would have a packaging test that asserts that we have the
behavior here correct, and it maintains over time. Alas, we do not
currently have packaging tests on macOS.
With this change, we will always return the same point in time in a
search response as its input until we implement the retry mechanism
for the point in times.
The formatting of the global bottom value does not take the resolution of the provided
numeric_type into account. This change fixes this bug by providing the resolution
directly in the doc value format if the numeric_type is provided as `date_nanos`.
Closes#63719
We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).
Closes#64115
If we run into a background merge between creating the snapshot and closing the index
then with compound files we could be in a situation where we get zero file reuse
on restore.
Force merging before the snapshot gives us a single segment that won't change down the line
so the restore always sees file reuse from the close index.
Closes#63476
Assuming the clone failed when the request failed is not sufficient.
There are failure modes where the request fails but the clone still works out
because the data node resent the requeest after the first clone had already been
failed and removed from the cluster state when master was restarted.
Closes#63473