Instead of implementing onModule(ActionModule) to register actions,
this has plugins implement ActionPlugin to declare actions. This is
yet another step in cleaning up the plugin infrastructure.
While I was in there I switched AutoCreateIndex and DestructiveOperations
to be eagerly constructed which makes them easier to use when
de-guice-ing the code base.
In 995e4eda08be99f72ef56052b3f78ceef9100885 we changed the cluster health Java API.
We need to also change the documentation.
Backport of #19093 in master branch
The discovery-plugin has been broken since 2.x because the code was not compliant with the security manager and because plugins have been refactored.
closes#18637, #15630
This commit modifies TimeValue parsing to keep the input time unit. This
enables round-trip parsing from instances of String to instances of
TimeValue and vice-versa. With this, this commit removes support for the
unit "w" representing weeks, and also removes support for fractional
values of units (e.g., 0.5s).
Relates #19102
This commit adds randomization for the packaging upgrade test. In
particular, we extract a list of the released version of Elasticsearch
from Maven Central and randomize the selection of the version to upgrade
from. The randomization is repeatable, and supports the tests.seed
property. Specific versions can be tested by setting the property
tests.packaging.upgrade.from.versions.
Relates #19033
Today all settings are only validated against their validators
that are available when settings are registered. Yet, some settings updaters
have validators that are dynamic ie. their validation depends on other variables
that are only available at runtime. We do not run those validators when settings
are updated causing index updates to fail on the data nodes instead of on the master.
Relates to #19046
Previous to this change the unresolved extended bounds was passed into the histogram aggregator which meant extendedbounds.min and extendedbounds.max was passed through as null. This had two effects on the histogram aggregator:
1. If the histogram aggregator was unmapped across all shards, the reduce phase would not add buckets for the extended bounds and the response would contain zero buckets
2. If the histogram aggregator was not unmapped in some shards, the reduce phase might sometimes chose to reduce based on the unmapped shard response and therefore the extended bounds would be ignored.
This change resolves the extended bounds in the unmapped case and solves the above two issues.
Closes#19009
#18938 has changed the timing in which we send out to nodes to fetch their shard stores. Instead of doing this after the cluster state resulting of the node's join was published, #18938 made it be sent concurrently to the publishing processes. This revealed a couple of points where the shard store fetching is dependent of the current state of affairs of the cluster state, both on the master and the data nodes. The problem discovered were already present without #18938 but required a failure/extreme situations to make them happen.This PR tries to remove as much as possible of these dependencies making shard store fetching simpler and make the way to re-introduce #18938 which was reverted.
These are the notable changes:
1) Allow TransportNodesAction (of which shard store fetching is derived) callers to supply concrete disco nodes, so it won't need the cluster state to resolve them. This was a problem because the cluster state containing the needed nodes was not yet made available through ClusterService. Note that long term we can expect the rest layer to resolve node ids to concrete nodes, making this mode the only one needed.
2) The data node relied on the cluster state to have the relevant index meta data so it can find data when custom paths are used. We now fall back to read the meta data from disk if needed.
3) The data node was relying on it's own IndexService state to indicate whether the data it has corresponds to an existing allocation. This is of course something it can not know until it got (and processed) the new cluster state from the master. This flag in the response is now removed. This is not a problem because we used that flag to protect against double assigning of a shard to the same node, but we are already protected from it by the allocation deciders.
4) I removed the redundant filterNodeIds method in TransportNodesAction - if people want to filter they can override resolveRequest.
Since the Settings infrastructure has been improved, a group setting must be registered by the repository-azure plugin to allow settings like "cloud.azure.storage.my_account.account" to be coherent with Azure plugin documentation.
This commit fixes several NPEs caused by implicitly performing a get request for a document that exists with its _source disabled and then trying to access the source. Instead of causing an NPE the following queries will throw an exception with a "source disabled" message (similar behavior as if the document does not exist).:
- GeoShape query for pre-indexed shape (throws IllegalArgumentException)
- Percolate query for an existing document (throws IllegalArgumentException)
A Terms query with a lookup will ignore the document if the source does not exist (same as if the document does not exist).
GET and HEAD requests for the document _source will return a 404 if the source is disabled (even if the document exists).
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.
This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.
Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
Today when parsing the timeout field in a query body, if time units are
supplied the parser throws a NumberFormatException. Addtionally, the
parsing allows the timeout field to not specify units (it assumes
milliseconds). This commit fixes this behavior by not only allowing time
units to be specified but requires time units to be specified. This is
consistent with the documented behavior and the behavior in 2.x.
Relates #19077
This commit fixes several NPEs caused by implicitly performing a get request for a document that exists with its _source disabled and then trying to access the source. Instead of causing an NPE the following queries will throw an exception with a "source disabled" message (similar behavior as if the document does not exist).:
- GeoShape query for pre-indexed shape (throws IllegalArgumentException)
- Percolate query for an existing document (throws IllegalArgumentException)
A Terms query with a lookup will ignore the document if the source does not exist (same as if the document does not exist).
GET and HEAD requests for the document _source will return a 404 if the source is disabled (even if the document exists).
This commit reverts the slow tests heartbeat added in
b6fbd18e09. The heartbeat has not served
its stated purpose of drawing attention to slow tests, and the heartbeat
can kill builds with SELinux enforcing enabled. While the heartbeats can
be disabled via the environment variable PULSE_SERVER, and SELinux
policy files can be changed, since the heartbeats are not accomplishing
their intended purpose they should be removed.
Relates #19071
If we don't care about scoring then for certain candidate matches we can be certain, that if they are a candidate match,
then they will always match. So verifying these queries with the MemoryIndex can be skipped.
Currently we don't throw an error when there is more than one query clause
specified in a must/must_not/should/filter object of the bool query without
using array notation, e.g.:
{ "bool" : { "must" : { "match" : { ... }, "match": { ... }}}}
In these cases, only the first query will be parsed and further behaviour is
unspecified, possibly leading to silently ignoring the rest of the query.
Instead we should throw a ParsingException if we don't encounter an END_OBJECT
token after having parsed the query clause.
This moves the "Performance Considerations for Elasticsearch Indexing" blog post
to the reference guide and adds similar recommendations for tuning disk usage
and search speed.