This commit changes the ack timeout mechanism so that its behavior is closer to the publish
timeout, i.e., it only comes into play after committing a cluster state. This ensures for example that
an index creation request with a low (ack) timeout value does not return before the cluster state
that contains information about the newly created index is even committed.
This adds a `description` to ML filters in order
to allow users to describe their filters in a human
readable form which is also editable (filter updates
to be added shortly).
Due to a runtime classpath clash, featureAware task was failing on JVMs
higher than 1.8 (since the ASM version from Painless was used instead
which does not recognized Java 9 or 10 bytecode) causing the task to
fail.
This commit excludes the ASM dependency (since it's not used by SQL
itself).
Packaging tests are occasionally failing (#30295) because of very slow index
template creation. It looks like the slow part is updating the on-disk cluster
state, and this change will help to confirm this.
Many fixtures have similar code for writing the pid & ports files or
for handling HTTP requests. This commit adds an AbstractHttpFixture
class in the test framework that can be extended for specific testing purposes.
We currently have a specific REST action to retrieve all aliaes, which
uses internally the get index API. This doesn't seem to be required
anymore though as the existing RestGetAliaesAction could as well take
the requests with no indices and aliases specified.
This commit removes the RestGetAllAliasesAction in favour of using
RestGetAliasesAction also for requests that don't specify indices nor
aliases. Similar to #31129.
x-pack/sql depends on lang-painless which depends on ASM 5.1
FeatureAwareCheck needs ASM 6
This is a hack to strip ASM5 from the classpath for FeatureAwareCheck
This is related to #27260. Currently when we queue a write with a
channel we set OP_WRITE and wait until the next selection loop to flush
the write. However, if the channel does not have a pending write, it
is probably ready to flush. This PR implements an optimistic flush logic
that will attempt this flush.
- All rollup pages should be marked as experimental instead of just
the top page
- While the job config docs state which aggregations are allowed, adding
a section which specifically details this in one place is more convenient
for the user
- Add a clarification that the DeleteJob API does not delete the rollup
data, just the rollup job.
Cross-cluster search selects a subset of nodes for each remote cluster
and sends requests only to them, which will act as a proxy and properly
redirect such requests to the target nodes that hold the relevant data.
What happens today is that every time we send a request to a remote
cluster, it will be sent to the next node in the proxy list
(in round-robin fashion), regardless of whether the target node is
already amongst the ones that we are connected to. In case for instance
we need to send a shard search request to a data node that's also one of
the selected proxy nodes, we may end up sending the request to it
through one of the other proxy nodes.
This commit optimizes this case to make sure that whenever we are
already connected to a remote node, we will send a direct request rather
than using the next proxy node.
There is a side-effect to this, which is that round-robin will be a bit
unbalanced as the data nodes that are also selected as proxies will
receive more requests.
The parser for the Metric config was directly instantiating
the config object, rather than using the builder. That means it was
bypassing the validation logic built into the builder, and would allow
users to create invalid metric configs (like using unsupported metrics).
The job would later blow up and abort due to bad configs, but this isn't
immediately obvious to the user since the PutJob API succeeded.
This change prevents a datafeed using cross cluster search from starting if the remote cluster
does not have x-pack installed and a sufficient license. The check is made only when starting a
datafeed.
We have some use cases for an index setting to only be manageable by
dedicated APIs rather than be updateable via the update settings
API. This commit adds the notion of an internal index setting. Such
settings can be set on create index requests, they can not be changed
via the update settings API, yet they can be changed by action on behalf
of or triggered by the user via dedicated APIs.
The `requires_replica` yaml test feature hasn't worked for years. This
is what happens if you try to use it:
```
> Throwable #1: java.lang.NullPointerException
> at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0)
> at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58)
> at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144)
> at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321)
```
None of our tests use it.
With #29331 we added support for the cluster health API to the
high-level REST client. The transport client does not support the level
parameter, and it always returns all the info needed for shards level
rendering. We have maintained that behaviour when adding support for
cluster health to the high-level REST client, to ease migration, but the
correct thing to do is to default the high-level REST client to
`cluster` level, which is the same default as when going through the
Elasticsearch REST layer.
If the publishing of a cluster state to a node fails, we currently only log it as debug information and
only on the master. This makes it hard to see the cause of (test) failures when logging is set to
default levels. This PR adds a warn level log on the node receiving the cluster state when it fails to
deserialise the cluster state and a warn level log on the master with a list of nodes for which
publication failed.
Today, if GET /_cluster/health?wait_for_active_shards=all does not immediately
succeed then it throws an exception due to an erroneous and unnecessary call to
ActiveShardCount#enoughShardsActive(). This commit fixes this logic.
Fixes#31151
Rules allow users to supply a detector with domain
knowledge that can improve the quality of the results.
The model detects statistically anomalous results but it
has no knowledge of the meaning of the values being modelled.
For example, a detector that performs a population analysis
over IP addresses could benefit from a list of IP addresses
that the user knows to be safe. Then anomalous results for
those IP addresses will not be created and will not affect
the quantiles either.
Another example would be a detector looking for anomalies
in the median value of CPU utilization. A user might want
to inform the detector that any results where the actual
value is less than 5 is not interesting.
This commit introduces a `custom_rules` field to the `Detector`.
A detector may have multiple rules which are combined with `or`.
A rule has 3 fields: `actions`, `scope` and `conditions`.
Actions is a list of what should happen when the rule applies.
The current options include `skip_result` and `skip_model_update`.
The default value for `actions` is the `skip_result` action.
Scope is optional and allows for applying filters on any of the
partition/over/by field. When not defined the rule applies to
all series. The `filter_id` needs to be specified to match the id
of the filter to be used. Optionally, the `filter_type` can be specified
as either `include` (default) or `exclude`. When set to `include`
the rule applies to entities that are in the filter. When set to
`exclude` the rule only applies to entities not in the filter.
There may be zero or more conditions. A condition requires `applies_to`,
`operator` and `value` to be specified. The `applies_to` value can be
either `actual`, `typical` or `diff_from_typical` and it specifies
the numerical value to which the condition applies. The `operator`
(`lt`, `lte`, `gt`, `gte`) and `value` complete the definition.
Conditions are combined with `and` and allow to specify numerical
conditions for when a rule applies.
A rule must either have a scope or one or more conditions. Finally,
a rule with scope and conditions applies when all of them apply.
TransportAction has many variants of execute. One of those variants
executes by returning a future, which is then often blocked on by
calling get(). This commit removes this variant of execute, instead
using a helper method for tests that want to block, or having tests
pass in a PlainActionFuture directly as a listener.
Co-authored-by: Simon Willnauer <simonw@apache.org>
Currently the http pipelining handlers seem to support chunked http
content. However, this does not make sense. There is a content
aggregator in the pipeline before the pipelining handler. This means the
pipelining handler should only see full http messages. Additionally, the
request handler immediately after the pipelining handler only supports
full messages.
This commit modifies both nio and netty4 pipelining handlers to assert
that an inbound message is a full http message. Additionally it removes
the tests for chunked content.
This was silly; Bouncy Castle has an armored input stream for reading
keys in ASCII armor format. This means that we do not need to strip the
header ourselves and base64 decode the key. This had problems anyway
because of discrepancies in the padding that Bouncy Castle would produce
and the JDK base64 decoder was expecting. Now that we armor input/output
the whole way during tests, we fix all random failures in test cases
too.
The default wait_for_active_shards is NONE for cluster health, which
differs from all the other API in master, hence we need to make sure to
set the parameter whenever it differs from NONE (0). The test around
this also had a bug, which is why this was not originally uncovered.
Relates to #29331
This commit removes all the API methods that accept a `Header` varargs
argument, in favour of the newly introduced API methods that accept a
`RequestOptions` argument.
Relates to #31069
Here is the problem: if two threads are racing and one hits a failure
freeing a context and the other succeeded, we can expose the value of
the has failure marker to the succeeding thread before the failing
thread has had a chance to set the failure marker. This is a problem if
the failing thread counted down the expected number of operations, then
be put to sleep by a gentle lullaby from the OS, and then the other
thread could count down to zero. Since the failing thread did not get to
set the failure marker, the succeeding thread would respond that the
clear scroll succeeded and that makes that thread a liar. This commit
addresses by first setting the failure marker before we potentially
expose its value to another thread.