AbstractDisruptionTestCase set a lower global checkpoint sync interval setting, but this was ignored by
testAckedIndexing, which has led to spurious test failures.
Relates #41068, #38931
ShardId already implements Writeable so there is no need for it to implement Streamable too. Also the readShardId static method can be
easily replaced with direct usages of the constructor that takes a
StreamInput as argument.
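For illustration, a minimal sketch of the constructor-based read pattern. It uses `java.io.DataInput` and `java.io.DataOutput` as stand-ins for StreamInput and StreamOutput, and the class `ShardIdSketch` with its fields and `roundTrip` helper is hypothetical, not the real ShardId.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in; java.io streams replace StreamInput/StreamOutput here.
class ShardIdSketch {
    final String indexName;
    final int shard;

    ShardIdSketch(String indexName, int shard) {
        this.indexName = indexName;
        this.shard = shard;
    }

    // Reading happens in a constructor, so the fields can stay final and no
    // separate static read helper or mutable readFrom method is needed.
    ShardIdSketch(DataInput in) throws IOException {
        this.indexName = in.readUTF();
        this.shard = in.readInt();
    }

    void writeTo(DataOutput out) throws IOException {
        out.writeUTF(indexName);
        out.writeInt(shard);
    }

    // Serializes a value and reads it back through the stream constructor.
    static ShardIdSketch roundTrip(ShardIdSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.writeTo(new DataOutputStream(bytes));
            return new ShardIdSketch(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Call sites that previously went through a static read method can call the stream constructor directly, as `roundTrip` does here.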
In case a search request holds only the suggest section, the query phase
is skipped and only the suggest phase is executed instead. There will
never be hits returned, and in case the explain flag is set to true, the
explain sub phase throws a null pointer exception as the query is null.
Usually a null query is replaced with a match all query as part of
SearchContext#preProcess, but that step is skipped as well for suggest-only
searches. To address this, we skip the explain sub fetch phase
for search requests that only requested suggestions.
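A minimal sketch of the guard described above, under the assumption of a heavily simplified request model. `ExplainSkipSketch`, `Request`, and `explainSubPhase` are hypothetical names used only for illustration; the real fetch sub phase operates on a search context, not on this class.

```java
// Hypothetical request model; the real search context has many more fields.
class ExplainSkipSketch {
    static final class Request {
        final String query;       // null when the request has no query section
        final boolean hasSuggest; // true when a suggest section is present
        final boolean explain;    // the explain flag from the request

        Request(String query, boolean hasSuggest, boolean explain) {
            this.query = query;
            this.hasSuggest = hasSuggest;
            this.explain = explain;
        }

        boolean hasOnlySuggest() {
            return query == null && hasSuggest;
        }
    }

    // Returns an explanation, or null when the explain sub phase is skipped.
    static String explainSubPhase(Request request) {
        if (!request.explain || request.hasOnlySuggest()) {
            return null; // no hits and no query to explain
        }
        // Dereferencing the query is what would fail if it were null.
        return "explanation for query [" + request.query.trim() + "]";
    }
}
```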
Closes #31260
As a follow-up to #38540 we can use lambda functions and method
references where convenient in the low-level REST client.
Also, we need to update the docs to state that the minimum Java version
required is 1.8.
The removal of payload in BulkRequest (#39843) had a side effect of making
`BulkRequest.add(DocWriteRequest<?>...)` (with varargs) recursive, thus
leading to StackOverflowError. This PR adds a small change in
RequestConvertersTests to show the error and the corresponding fix in
`BulkRequest`.
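As an illustration of how a varargs overload can end up calling itself, here is a minimal sketch. `VarargsRecursionSketch` and its methods are hypothetical and use `String` items instead of `DocWriteRequest<?>`; the exact shape of the real `BulkRequest` code is not shown here and is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a bulk request; not the real BulkRequest class.
class VarargsRecursionSketch {
    final List<String> requests = new ArrayList<>();

    // Broken: `items` is already a String[], so this call resolves back to
    // the same varargs method and recurses until the stack overflows.
    VarargsRecursionSketch addBroken(String... items) {
        return addBroken(items);
    }

    // Fixed: wrapping the array in a List selects the Iterable overload.
    VarargsRecursionSketch add(String... items) {
        return add(Arrays.asList(items));
    }

    VarargsRecursionSketch add(Iterable<String> items) {
        for (String item : items) {
            requests.add(item);
        }
        return this;
    }
}
```

The key point is that passing an array to a varargs method of the same element type binds the array to the varargs parameter directly, so an overload that disappeared can silently turn a delegating call into a self-call.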
Fixes #41668
* Remove IndexShard dependency from Repository
In order to simplify repository testing, especially for BlobStoreRepository,
it's important to remove the dependency on IndexShard and reduce it to
Store and MapperService (in the snapshot case). This significantly reduces
the dependency footprint for Repository and allows unit testing without starting
nodes or instantiating entire shard instances. This change deprecates the old
method signatures and adds a unit test for FileRepository to show the advantage
of this change.
In addition, the unit testing surfaced a bug where the internal file names that
are private to the repository were used in the recovery stats instead of the
target file names, which makes it impossible to relate them to the actual Lucene
files in the recovery stats.
* don't delegate deprecated methods
* apply comments
* test
When using the High Level REST Client Java API to build a search query, calling
`AggregationBuilders.topHits("th").sort("_score", SortOrder.DESC)`
caused the query to contain duplicate sort clauses.
A shard that is undergoing peer recovery is subject to logging warnings of the form
org.elasticsearch.action.FailedNodeException: Failed node [XYZ]
...
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in ...
These failures are actually harmless, and expected to happen while a peer recovery is ongoing (i.e.
there is an IndexShard instance, but no proper IndexCommit just yet).
As these failures are currently bubbled up to the master, they cause unnecessary reroutes and
confusion amongst users due to being logged as warnings.
Closes #40107
rp.client_secret is a required secure setting. Make sure we fail with
a SettingsException and a clear, actionable message when building
the realm, if the setting is missing.
Enhance the handling of merging the claims sets of the
ID Token and the UserInfo response. JsonObject#merge would throw a
runtime exception when attempting to merge two objects with the
same key and different values. This could happen for an OP that
returns different values for the same claim in the ID Token and the
UserInfo response (Google does that for the profile claim).
If a claim is contained in both sets, we attempt to merge the
values if they are objects or arrays, otherwise the ID Token claim
value takes precedence and overwrites the userinfo response.
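The merge rule can be sketched roughly as follows, using plain `java.util.Map` and `java.util.List` as stand-ins for JSON objects and arrays. `ClaimsMergeSketch` is a hypothetical class, and the array handling (ID Token items first, then UserInfo items not already present) is an assumption about what "merge" means for arrays.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Map and List stand in for JSON objects and arrays.
class ClaimsMergeSketch {
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> idToken, Map<String, Object> userInfo) {
        // Start from the UserInfo claims, then apply the ID Token claims on top.
        Map<String, Object> merged = new LinkedHashMap<>(userInfo);
        for (Map.Entry<String, Object> entry : idToken.entrySet()) {
            String key = entry.getKey();
            Object idValue = entry.getValue();
            Object userValue = merged.get(key);
            if (idValue instanceof Map && userValue instanceof Map) {
                // Both are objects: merge them recursively.
                merged.put(key, merge((Map<String, Object>) idValue, (Map<String, Object>) userValue));
            } else if (idValue instanceof List && userValue instanceof List) {
                // Both are arrays: assumed union that keeps the ID Token order first.
                List<Object> combined = new ArrayList<>((List<Object>) idValue);
                for (Object item : (List<Object>) userValue) {
                    if (!combined.contains(item)) {
                        combined.add(item);
                    }
                }
                merged.put(key, combined);
            } else {
                // Scalars or mismatched types: the ID Token value wins.
                merged.put(key, idValue);
            }
        }
        return merged;
    }
}
```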
This adds the node name where we fail to start a process via the native
controller to facilitate debugging as otherwise it might not be known
to which node the job was allocated.
Moves the test infrastructure away from using node.max_local_storage_nodes, allowing us in a
follow-up PR to deprecate this setting in 7.x and to remove it in 8.0.
This also changes the behavior of InternalTestCluster so that starting up nodes will not automatically
reuse data folders of previously stopped nodes. If this behavior is desired, it needs to be explicitly
done by passing the data path from the stopped node to the new node that is started.
This commit reworks and clarifies the docs for the `discovery-ec2` plugin:
- folds the tiny "Getting started with AWS" into the page on configuration
- spells out the name of each setting in full instead of noting the
`discovery.ec2` prefix at the top of the page.
- replaces each `(Secure)` marker with a sentence describing what that means in
situ
- notes some missing defaults
- clarifies the behaviour of `discovery.ec2.groups` (dependent on `.any_group`)
- clarifies what `discovery.ec2.host_type` is for
- adds `discovery.ec2.tag.TAGNAME` as a (meta-)setting rather than describing
it in a separate section
- notes that the tags mentioned in `discovery.ec2.tag.TAGNAME` cannot contain
colons (see #38406)
- clarifies the EC2-specific interface names and what they're for
- reorders and rewords the recommendations for storage
- expands on why you should not span a cluster across regions
- adds a suggestion on protecting instances against termination during scale-in
- reformats to 80 columns where possible
Fixes #38406
Simplifies the voting configuration reconfiguration logic by switching to an explicit Comparator for
the priorities. Does not make changes to the behavior of the component.
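To show what an explicit Comparator for priorities can look like, here is a minimal sketch. The `Candidate` fields and the chosen ordering (live nodes first, then nodes already in the configuration, then node id as a tiebreaker) are assumptions for illustration and are not taken from the actual reconfiguration code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of a priority ordering expressed as a Comparator chain.
class VotingPrioritySketch {
    static final class Candidate {
        final String id;
        final boolean live;
        final boolean inCurrentConfig;

        Candidate(String id, boolean live, boolean inCurrentConfig) {
            this.id = id;
            this.live = live;
            this.inCurrentConfig = inCurrentConfig;
        }
    }

    // Boolean keys sort false before true, so each flag is negated to put
    // the preferred candidates first.
    static final Comparator<Candidate> PRIORITY =
        Comparator.comparing((Candidate c) -> !c.live)
            .thenComparing(c -> !c.inCurrentConfig)
            .thenComparing(c -> c.id);

    // Returns the candidate ids in priority order without mutating the input.
    static List<String> orderedIds(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(PRIORITY);
        List<String> ids = new ArrayList<>();
        for (Candidate c : sorted) {
            ids.add(c.id);
        }
        return ids;
    }
}
```

Keeping each criterion as its own link in the chain makes the priority order readable at a glance, which is the kind of simplification an explicit Comparator is meant to give.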
SHA256 was recently added to the Hasher class in order to be used
in the TokenService. A few tests were still using values() to get
the available algorithms from the Enum and it could happen that
SHA256 would be picked up by these.
This change adds an extra convenience method
(Hasher#getAvailableAlgoCacheHash) and ensures that only this and
Hasher#getAvailableAlgoStoredHash are used for getting the list of
available password hashing algorithms in our tests.
This performs a simple restart test to move a basic licensed
cluster from no security (the default) to security & transport TLS
enabled.
Backport of: #41933
Flushing at the end of a peer recovery (if needed) can bring these
benefits:
1. Closing an index won't end up with the red state, as a recovering
replica should always be ready for closing whether it performs the
verifying-before-close step or not.
2. Good opportunities to compact store (i.e., flushing and merging
Lucene, and trimming translog)
Closes #40024
Closes #39588
This change verifies and aborts recovery if source and target have the
same syncId but different sequenceId. This commit also adds an upgrade
test to ensure that we always utilize syncId.
The verifying-before-close step ensures the global checkpoints on all
shard copies are in sync; thus, we don't need to sync global
checkpoints for closed indices.
Relates #33888
There is an off-by-one error in this test. It leads to the recovery
thread never being started, and that means joining on it will wait
indefinitely. This commit addresses that by fixing the off-by-one error.
Relates #42325
I forgot to git add these before pushing, sorry. This commit fixes
compilation in IndexShardTests. These changes are needed here and not in
master due to differences in how Java infers types in generics between JDK 8
and JDK 11.
Today when executing an action on a primary shard under permit, we do
not enforce that the shard is in primary mode before executing the
action. This commit addresses this by wrapping actions to be executed
under permit in a check that the shard is in primary mode before
executing the action.
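A minimal sketch of such a wrapper, under simplifying assumptions: `PermitListener`, `requirePrimaryMode`, and the use of `IllegalStateException` are hypothetical and stand in for the real listener and exception types.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of wrapping a permit callback in a primary mode check.
class PrimaryModeCheckSketch {
    /** Callback invoked once a permit is acquired; `release` gives it back. */
    interface PermitListener {
        void onAcquired(Runnable release);

        void onFailure(Exception e);
    }

    // Returns a listener that only runs the delegate if the shard is still in
    // primary mode at the moment the permit has been acquired.
    static PermitListener requirePrimaryMode(BooleanSupplier isPrimaryMode, PermitListener delegate) {
        return new PermitListener() {
            @Override
            public void onAcquired(Runnable release) {
                if (isPrimaryMode.getAsBoolean()) {
                    delegate.onAcquired(release);
                } else {
                    release.run(); // give the permit back before failing
                    delegate.onFailure(new IllegalStateException("shard is not in primary mode"));
                }
            }

            @Override
            public void onFailure(Exception e) {
                delegate.onFailure(e);
            }
        };
    }
}
```

Performing the check after the permit is held, rather than before requesting it, closes the window in which the shard could leave primary mode while the request waits for the permit.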
Today we are persisting the retention leases at least every thirty
seconds by a scheduled background sync. This sync causes an fsync to
disk and when there are a large number of shards allocated to slow
disks, these fsyncs can pile up and can severely impact the system. This
commit addresses this by only persisting and fsyncing the retention
leases if they have changed since the last time that we persisted and
fsynced the retention leases.
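The idea can be sketched as follows, assuming the leases can be compared against the last persisted snapshot. `LeasePersistenceSketch`, its `String` snapshot type, and the `fsyncCount` counter are hypothetical and only illustrate the skip-if-unchanged check.

```java
import java.util.Objects;

// Hypothetical sketch: persist and fsync only when the leases have changed.
class LeasePersistenceSketch {
    private String lastPersisted; // last snapshot written to disk, or null
    int fsyncCount;               // counts simulated fsyncs, for illustration

    // Returns true if the snapshot was written, false if it was unchanged.
    synchronized boolean persistIfChanged(String currentLeases) {
        if (Objects.equals(currentLeases, lastPersisted)) {
            return false; // nothing changed since the last write, skip the fsync
        }
        writeAndFsync(currentLeases);
        lastPersisted = currentLeases;
        return true;
    }

    // Placeholder for the actual write; a real implementation would do I/O.
    private void writeAndFsync(String leases) {
        fsyncCount++;
    }
}
```

With this check, a scheduled background sync that finds no change costs a comparison instead of a disk flush.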
Re-enable muted tests and accommodate recent backend changes
that result in higher memory usage being reported for a job
at the start of its life-cycle.