Today we reroute the cluster as part of the process of starting a shard, which
runs at `URGENT` priority. In large clusters, rerouting may take some time to
complete, and this means that a mere trickle of shard-started events can cause
starvation for other, lower-priority, tasks that are pending on the master.
However, it isn't really necessary to perform a reroute when starting a shard,
as long as one occurs eventually. This commit removes the inline reroute from
the process of starting a shard and replaces it with a deferred one that runs
at `NORMAL` priority, avoiding starvation of higher-priority tasks.
Backport of #44433 and #44543.
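The effect of deferring the reroute can be illustrated with a toy priority queue. This is only a sketch: the class, the task names and the coalescing of several shard-started events into one pending reroute are assumptions made for illustration, not the master service's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of a master task queue; not Elasticsearch code.
final class DeferredRerouteSketch {
    // Declared from most to least urgent so that the natural enum order sorts correctly.
    enum Priority { URGENT, HIGH, NORMAL }

    private static final class Task {
        final Priority priority;
        final long seq;      // insertion order, used to break ties within a priority
        final String name;

        Task(Priority priority, long seq, String name) {
            this.priority = priority;
            this.seq = seq;
            this.name = name;
        }
    }

    private final PriorityQueue<Task> queue = new PriorityQueue<>(
        Comparator.comparing((Task t) -> t.priority).thenComparingLong(t -> t.seq));
    private long nextSeq = 0;
    private boolean reroutePending = false;

    void submit(Priority priority, String name) {
        queue.add(new Task(priority, nextSeq++, name));
    }

    // Starting a shard stays URGENT, but the reroute is queued separately at NORMAL.
    // Assumption: at most one deferred reroute is pending at a time.
    void shardStarted(String shard) {
        submit(Priority.URGENT, "start " + shard);
        if (!reroutePending) {
            reroutePending = true;
            submit(Priority.NORMAL, "reroute");
        }
    }

    // Runs every queued task and returns the names in execution order.
    List<String> drain() {
        List<String> order = new ArrayList<>();
        Task task;
        while ((task = queue.poll()) != null) {
            if (task.name.equals("reroute")) {
                reroutePending = false;
            }
            order.add(task.name);
        }
        return order;
    }
}
```

In this model a `HIGH` task submitted between two shard-started events runs before the single deferred reroute, rather than waiting behind a reroute for each started shard.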
The repository-hdfs runners need to be disabled in FIPS mode.
Testing was done for all the tasks, both dynamically created and static (integTest, integTestHa, integSecureTest, integSecureHaTest).
Multi search accepts multiple search requests and runs them as
independent requests, each one as part of its own search task. Today,
however, they are not associated with their parent multi search task.
Such an association would be useful to monitor which msearch a certain
search was part of, if any, and also to cancel all of the sub-requests in
case the parent msearch gets cancelled (though this will also require
making the multi search task cancellable as a follow-up).
While the code works correctly for a single segment, it returns the wrong values for multiple segments. For example, if we have 500 docs in one segment and we want doc id 280, then `data.advanceExact(topDocs.scoreDocs[i].doc)` works fine. If instead we have two segments, with the first holding docs 1-200 and the second holding docs 201-500, then passing 280 unchanged fetches the document at position 280 of the second segment, which is actually doc 480. Subtracting the second segment's docBase (280 - 200 = 80) takes us to the correct document: position 80 in the second segment, which is doc 280.
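The arithmetic can be sketched in plain Java. The class and method names here are hypothetical, and the `docBases` array stands in for the per-segment `docBase` values that Lucene exposes on each leaf; the sketch assumes 0-based ids and no empty segments.

```java
import java.util.Arrays;

// Hypothetical helper illustrating global-to-segment doc id conversion.
final class DocBaseSketch {
    // docBases[i] is the global id of the first document in segment i, in ascending order.
    static int segmentIndex(int[] docBases, int globalDoc) {
        int idx = Arrays.binarySearch(docBases, globalDoc);
        // Exact hit: globalDoc is the first doc of that segment.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // owning segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    // The id to pass to a per-segment reader: the global id minus the segment's docBase.
    static int localDoc(int[] docBases, int globalDoc) {
        return globalDoc - docBases[segmentIndex(docBases, globalDoc)];
    }
}
```

With `docBases = {0, 200}`, global doc 280 resolves to segment 1 and local doc 80, matching the example above.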
Today the long list of `BUILT_IN_CLUSTER_SETTINGS` is indented differently
between `master` and `7.x`. This sometimes makes backporting painful. This
commit adjusts the indentation of earlier branches to match that in `master`.
Removes the warning suppression -Xlint:-deprecation,-rawtypes,-serial,-try,-unchecked.
Many of the warnings were unchecked warnings in the test code, often because of the use of mocks.
These are suppressed with @SuppressWarnings.
* Detect third party audit process being killed by OOM
It's very common for the third party audit to be killed by the OOM
killer when the system is running low on memory.
Since the forbidden APIs call is expected to fail, we were ignoring
these and incorrectly interpreting the partial output.
With this change we detect and provide a proper error message when this
happens.
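One way such a check could look is sketched below. The commit message does not say how the kill is detected, so the use of exit value 137 (128 plus signal 9, the common convention for a process terminated by SIGKILL) is an assumption, as are the class name and the messages.

```java
// Hypothetical sketch; not the build plugin's actual code.
final class AuditExitSketch {
    // Assumption: a SIGKILLed child is reported with exit value 128 + 9.
    static final int SIGKILL_EXIT_VALUE = 128 + 9;

    static String interpret(int exitValue) {
        if (exitValue == SIGKILL_EXIT_VALUE) {
            // Fail loudly instead of parsing output that is likely incomplete.
            throw new IllegalStateException(
                "third party audit was killed by SIGKILL; it may be a victim of the OOM killer");
        }
        // A non-zero exit is still expected when forbidden APIs are found.
        return exitValue == 0 ? "no violations" : "violations reported, parse the output";
    }
}
```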
* Many messages deserialized from a `StreamInput` only contain short strings, some use-cases of instantiating a `StreamInput` don't deserialize any strings
* Don't allocate `CharsRef` for small strings to save some allocations (especially on the IO threads)
* Lazily allocate a larger `CharsRef` if needed for larger strings like we did before and have it live as long as the `StreamInput` like before as well
This commit deprecates all constructors of HandledTransportAction
that take in a Supplier instead of a Writeable.Reader for response
objects.
In addition to the deprecation, the following modules were updated to
leverage Writeable:
- modules:ingest-common
- modules:lang-mustache
relates #34389.
Many classes still use the Streamable constructors of HandledTransportAction;
this commit moves more of those classes to the new Writeable constructors.
relates #34389.
This commit removes usage of the deprecated
constructor with a single argument and no Writeable.Reader.
The purpose of this is to reduce the boilerplate necessary for
properly implementing a new action, as well as reducing the
chances of using the incorrect super constructor while classes
are being migrated to Writeable.
relates #34389.
This commit converts all remaining ActionType response classes to
writeable in xpack core. It also converts a few from server which were
used by xpack core.
relates #34389
Today we have an annotation for controlling logging levels in
tests. This annotation serves two purposes, one is to control the
logging level used in tests, when such control is needed to impact and
assert the behavior of loggers in tests. The other use is when a test is
failing and additional logging is needed. This commit separates these
two concerns into separate annotations.
The primary motivation for this is that we have a history of leaving
behind the annotation for the purpose of investigating test failures
long after the test failure is resolved. The accumulation of these stale
logging annotations has led to excessive disk consumption. Having
recently cleaned this up, we would like to avoid falling into this state
again. To do this, we are adding a link to the test failure under
investigation to the annotation when used for the purpose of
investigating test failures. We will add tooling to inspect these
annotations, in the same way that we have tooling on awaits fix
annotations. This will enable us to report on the use of these
annotations, and report when stale uses of the annotation exist.
This commit adds constructors to AcknowledgedRequest subclasses to
implement Writeable.Reader, and ensures all future subclasses implement
the same.
relates #34389
Large histories can be problematic and occasionally cause the linearizability checker to run out of memory. As it's
very difficult to bound the size of the histories just right, this PR will let it instead run for 10 seconds
on large histories and then abort.
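A time-bounded check of this kind might be structured as below. This is a minimal sketch: the `step` supplier (returning `null` while the search is undecided) and the outcome enum are hypothetical stand-ins for the checker's real search loop.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of aborting a long-running search after a time budget.
final class BoundedCheckSketch {
    enum Outcome { LINEARIZABLE, NOT_LINEARIZABLE, ABORTED }

    // Returns a supplier that becomes true once the given number of milliseconds has elapsed.
    static BooleanSupplier deadlineAfterMillis(long millis) {
        final long start = System.nanoTime();
        final long budgetNanos = millis * 1_000_000L;
        return () -> System.nanoTime() - start >= budgetNanos;
    }

    // Repeats search steps until one yields a verdict or the time budget runs out.
    static Outcome check(Supplier<Boolean> step, BooleanSupplier timedOut) {
        while (true) {
            if (timedOut.getAsBoolean()) {
                return Outcome.ABORTED; // give up without a verdict
            }
            Boolean verdict = step.get();
            if (verdict != null) {
                return verdict ? Outcome.LINEARIZABLE : Outcome.NOT_LINEARIZABLE;
            }
        }
    }
}
```

A caller would pass `deadlineAfterMillis(10_000)` for the 10 second budget and treat `ABORTED` as inconclusive rather than as a failure.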
Closes #44429
Changes in #44187 introduced some optimization in the way shapes are
generated. These changes were not captured in GeoWKTShapeParserTests.
Relates #44187
Extracts dateline decomposition logic from ShapeBuilder into a separate
utility class that is used on the indexing side. The search side
will be handled as part of another PR, at which time we will remove
the decomposition logic from ShapeBuilders as well. This PR also doesn't
change any existing logic, including bugs.
Relates to #40908
Due to https://issues.apache.org/jira/browse/LUCENE-8916, when you
try to use a synonym filter with the index_phrases option on a text field,
you can end up with null values in a Phrase query, leading to weird
exceptions further down the querying chain. As a workaround, this commit
disables the index_phrases optimization for queries that produce token
graphs.
Fixes #43976
Zen 1 stops pinging threads in ZenDiscovery by calling Thread.interrupt(). This is incompatible with
the CancellableThreads that only allow threads to be interrupted through cancellation. The use of
CancellableThreads was introduced in #42844 and added to UnicastZenPing as part of the
backport, as both Zen1 and Zen2 share the same SeedHostsResolver implementation. This commit
effectively undoes the change in the backport while still allowing both to share the same implementation.
Closes #44425
This PR adds a list of index compatible versions to the `.ci` directory
as well as a way to generate and verify it.
Unfortunately there is no easy way in Jenkins to have the build generate then
consume this YML axis config.
I hate that this would need maintenance on version bumps, but the
potential benefit here is reducing the bwc builds that can take more than
24 hours to less than 20 minutes.
This is possible because the CI setup would use a matrix job to run
something like:
```
./gradlew 'v7.0.0#bwcTest'
```
For every index compatible version.
On top of that `--parallel` should be possible even without testclusters
due to the limited number of clusters being set up here.
The example command above runs in exactly 10 minutes on my laptop,
thus I'm proposing to accept this compromise while we work out the
infra to do this more dynamically.
BucketScript was using the old-style parser and could easily be
converted over to the newer static parser.
Also adds a test for GapPolicy enum serialization.
Moves the following API sections under the REST APIs navigation:
- API Conventions
- Document APIs
- Search APIs
- Index APIs (previously named Indices APIs)
- cat APIs
- Cluster APIs
Other supporting changes:
- Removes the previous index APIs page under REST APIs. Adds a redirect for the removed page.
- Removes several [partintro] macros so the docs build correctly.
- Changes anchors for pages that become sections of a parent page.
- Adds several redirects for existing pages that become sections of a parent page.
This commit re-applies changes from #44238. Changes from that PR were reverted due to broken links in several repos. This commit adds redirects for those broken links.
Today we report an exception on a handshake failure (e.g. cluster name
mismatch) but the message does not include all the details of the mismatch. If
the mismatch is something subtle like `my-cluster` instead of `my_cluster` then
we cannot diagnose this from the message alone. This commit adds the details of
the local cluster to the message, along with the details of the remote cluster,
improving the utility of the exception message if reported in isolation.
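As an illustration, a message that carries both sides of the mismatch might be built as follows. The method name, parameters and wording are hypothetical and not the actual exception text.

```java
// Hypothetical sketch of a self-contained handshake failure message.
final class HandshakeMessageSketch {
    static String mismatch(String localCluster, String localNode,
                           String remoteCluster, String remoteNode) {
        // Include both clusters so a subtle difference is visible in one log line.
        return "handshake failed: remote node [" + remoteNode + "] is in cluster [" + remoteCluster
            + "] but local node [" + localNode + "] is in cluster [" + localCluster + "]";
    }
}
```

With both names present, a difference such as `my-cluster` versus `my_cluster` can be spotted from the message alone.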
* Fix Incorrect Uncompressed Error Handling in InboundMessage
* CompressorFactory.compressor does not throw uncompressed exception on uncompressed bytes, it merely returns `null` in this case if the bytes are at least XContent so the current catch and re-throw logic is dead code
* Made it work again by throwing on a `null` return so we get a real error message instead of an NPE
* We only use this method in one place in production code and can replace that with a read -> remove it to simplify the interface
* Keep it as an implementation detail in the Azure repository
In #44348 we changed the cluster health action so that it sometimes uses the
cluster state directly from the master service rather than from the cluster
applier. If the state is not recovered then this is inappropriate, because
prior to state recovery the state available to the cluster applier contains no
indices. This commit moves us back to using the state from the applier.
Fixes #44416.
Today when the cluster health changes the `AllocationService` reports at most
ten shards that were started or failed, and always ends its message with `...`
suggesting that the list is truncated. This commit adjusts these messages to be
clearer about whether the list is truncated or not. When debug logging is
enabled the list is not truncated; if the list is truncated then its length is
logged, and if it is not truncated then no `...` is included in the message.
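The rules above can be sketched as a small formatting helper. The names and the exact wording of the output are assumptions; only the truncation behaviour follows the description.

```java
import java.util.List;

// Hypothetical sketch of the truncation rules; not the AllocationService code.
final class ShardListMessageSketch {
    static String describe(List<String> shards, int limit, boolean debugEnabled) {
        if (debugEnabled || shards.size() <= limit) {
            // Full list: no trailing "..." because nothing was left out.
            return "shards started [" + String.join(", ", shards) + "]";
        }
        // Truncated list: mark the truncation and report the full length.
        return "shards started [" + String.join(", ", shards.subList(0, limit))
            + ", ...] (" + shards.size() + " in total)";
    }
}
```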
This commit converts the request and response classes for broadcast
actions to implement ctors for Writeable.Reader and forces all future
implementations to implement the same.
relates #34389
This commit moves the config that stores Cors options into the server
package. Currently both nio and netty modules must have a copy of this
config. Moving it into server allows one copy and the tests to be in a
common location.
Registering a channel with a selector is a required operation for the
channel to be handled properly. Currently, we mix the registration with
other setup operations (ip filtering, SSL initiation, etc). However, a
failure to register is fatal. This PR modifies how registration occurs to
immediately close the channel if it fails.
There are still two clear loopholes for how a user can interact with a
channel even if registration fails. 1. through the exception handler.
2. through the channel accepted callback. These can perhaps be improved
in the future. For now, this PR prevents writes from proceeding if the
channel is not registered.
The contract for MappedFieldType#fielddataBuilder is to throw an
IllegalArgumentException if fielddata is not supported. The rank feature mappers
were instead throwing an UnsupportedOperationException, which caused
MappedFieldType#isAggregatable to fail.
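The interaction can be shown with a simplified model. The `FieldType` interface below is a hypothetical stand-in for `MappedFieldType`, and the sketch assumes that `isAggregatable` probes `fielddataBuilder` and treats only `IllegalArgumentException` as "not supported".

```java
// Hypothetical, simplified model of the fielddata contract.
final class FielddataContractSketch {
    interface FieldType {
        Object fielddataBuilder();
    }

    static boolean isAggregatable(FieldType fieldType) {
        try {
            fieldType.fielddataBuilder();
            return true;
        } catch (IllegalArgumentException e) {
            // The contract: this exception type means fielddata is unsupported.
            return false;
        }
        // Any other exception type, such as UnsupportedOperationException,
        // is not caught here and propagates to the caller.
    }
}
```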
This commit avoids a situation where we might stack overflow in the
auto-follower coordinator. In the face of repeated failures to get the
remote cluster state, we would previously be called back on the same
thread and then recurse to try again. If this failure persists, the
repeated callbacks on the same thread would lead to a stack
overflow. The repeated failures can occur, for example, if the connect
queue is full when we attempt to make a connection to the remote
cluster. This commit avoids this by truncating the call stack if we are
called back on the same thread as the initial request was made on.
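The idea of truncating the call stack can be sketched as follows. This is an illustration under stated assumptions, not the auto-follow coordinator's code: `FlakyRemote` is a hypothetical remote that always calls back synchronously on the caller's thread, and a single-threaded executor stands in for the real thread pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class TruncateStackSketch {
    // Hypothetical remote: fails on the caller's thread for the first `failures` calls.
    static final class FlakyRemote {
        private int remainingFailures;

        FlakyRemote(int failures) {
            this.remainingFailures = failures;
        }

        void getState(Consumer<Boolean> listener) {
            listener.accept(remainingFailures-- <= 0);
        }
    }

    static void retryUntilSuccess(FlakyRemote remote, ExecutorService executor, Runnable onSuccess) {
        final Thread caller = Thread.currentThread();
        remote.getState(succeeded -> {
            if (succeeded) {
                onSuccess.run();
            } else if (Thread.currentThread() == caller) {
                // Still inside the frames of the original request: hand the retry
                // to the executor so the current stack can unwind first.
                executor.execute(() -> retryUntilSuccess(remote, executor, onSuccess));
            } else {
                // Called back on another thread, so recursing here does not
                // grow the stack of the thread that made the request.
                retryUntilSuccess(remote, executor, onSuccess);
            }
        });
    }

    // Returns true if the retries eventually succeed without overflowing the stack.
    static boolean demo(int failures) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        try {
            retryUntilSuccess(new FlakyRemote(failures), executor, done::countDown);
            return done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }
}
```

Retrying by direct recursion in the same-thread branch would add stack frames on every failure; re-submitting keeps the depth constant regardless of how many failures occur.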