Today we initialize Netty in a static initializer. We trigger this
method via static initializers from Netty-related classes, but we can
trigger this method earlier than we do to ensure that Netty is
initialized how we want it to be.
`ClusterService` is responsible of updating the cluster state on every node (as a response to an API call on the master and when non-masters receive a new state from the master). When a new cluster state is processed, it is made visible via the `ClusterService#state` method and is sent to series of listeners. Those listeners come in two flavours - one is to change the state of the node in response to the new cluster state (call these cluster state appliers), the other is to start a secondary process. Examples for the later include an indexing operation waiting for a shard to be started or a master node action waiting for a master to be elected.
The fact that we expose the state before applying it means that samplers of the cluster state had to worry about two things - working based on a stale CS and working based on a future, i.e., "being applied" CS. The `ClusterStateStatus` was used to allow distinguishing between the two. Working with a stale cluster state is not avoidable. How this PR changes things to make sure consumers don't need to worry about future CS, removing the need for the status and simplifying the waiting logic.
This change does come with a price as "cluster state appliers" can't sample the cluster state from `ClusterService` whenever they want as the cluster state isn't exposed yet. However, recent clean ups made this is situation easier and this PR takes the last steps to remove such sampling. This also helps clarify the "information flow" and helps component separation (and thus potential unit testing). It also adds an assertion that will trigger if the cluster state is sampled by such listeners.
Note that there are still many "appliers" that could be made a simpler, unrestricted "listener" but this can be done in smaller bits in the future. The commit also makes it clear what the `appliers` and what the `listeners` are by using dedicated interfaces.
Also, since I had to change the listener types I went ahead and changed the data structure for temporary/timeout listeners (used for the observer) so addition and removal won't be an O(n) operation.
This commit addresses an issue in the stats APIs where
include_segment_file_sizes was not being consumed leading to requests
containing this parameter being rejected.
Relates #21879
Since #22094 has been back-ported to 5.2 we can remove all BWC layers from master since all supported version will handle handshake requests.
Relates to #22094
Today sending a message on a closed channel doesn't throw an exception. The channel
might just swallow the exception and informs the internal async exception handler
that a channel got disconnected. This change adds a safety check that we fail
the handshake if we registered a handler but the channel has been closed already
for instance due to a reset by peer.
Merging mappings ensures that fields are used consistently across mapping types. Disabling norms for a specific field in one mapping type for example also disables norms for the same field in other mapping types of that index. The logic that ensures this while merging mappings currently always creates a fresh document mapper for all existing mapping types, even if no change occurred. Creating such a fresh document mapper does not come for free though as it involves recompressing the source. Making a mapping change to one type of an index with 100 types will thus re-serialize and recompress all 100 types, independent of any changes made to those types.
This commit fixes the update logic to only create a new DocumentMapper if a field type actually changes.
* Replace _suggest endpoint to _search in docs
In 5.0, the _suggest endpoint is just sugar for _search
with suggestions specified. Users should move away from
using the _suggest endpoint, as it is marked as deprecated in 5.x and
will be removed in 6.0
* update docs to use _search endpoint instead of _suggest
* Add deprecation logging to RestSuggestAction
* Use search endpoint instead of suggest endpoint in rest tests
Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.
Fixes#21842.
* Build: Remove hardcoded reference to x-plugins in build
The gradle build currenlty allows extra plugins to be hooked into the
elasticsearch build by placing under an x-plugins directory as a sibling
of the elasticsearch checkout. This change converts this directory to
one called elasticsearch-extra. The subdirectories of
elasticsearch-extra become first level projects in gradle (while before
they would have been subprojects of `:x-plugins`. Additionally, this
allows major version checkouts to be associated with extra plugins. For
example, you could have elasticsearch checked out as
`elasticsearch-5.x`, and a sibling `elasticsearch-5.x-extra` directory.
Low level handshake code doesn't handle situations gracefully if the connection
is concurrently closed or reset by peer. This commit adds the relevant code to
fail the handshake if the connection is closed.
Moves the last of the "easy" parser construction into
`RestRequest`, this time with a new method
`RestRequest#contentParser`. The rest of the production
code that builds `XContentParser` isn't "easy" because it is
exposed in the Transport Client API (a Builder) object.
In order to start clusters with min master nodes set without setting `discovery.initial_state_timeout`, #21846 has changed the way we start nodes. Instead to the previous serial start up, we now always start the nodes in an async fashion (internally). This means that starting a cluster is unsafe without `min_master_nodes` being set. We should therefore make it mandatory.
ElasticsearchException is used in various response objects like IndexResponse, DeleteResponse or BulkItemResponse.Failure. It would be helpful to the High Level Rest Client to be able to parse these exceptions back.
This commit adds the fromXContent() method to the ElasticsearchException object. This method does not return the original (wrapped or unwrapped) exception but always returns a ElasticsearchException that serves as a simple POJO for all types of exceptions. The parsed ElasticsearchException's message will be composed of the original exception type (ex: illegal_argument_exception) concatenated with the original reason to help users/clients to known and handle the error.
PR #22049 changed the node update logic to never remove nodes from the cluster state when the cluster state is not published. This led to an issue electing a master (#22120) based on nodes with same transport address (but different node id) as previous nodes. The joining nodes should take precedence over conflicting ones. Note that this only applies to the action of becoming master. If a master is established and a node joins an existing master, it will be rejected if there is another node with same transport address.
For minDocCount=0 the numeric terms aggregator should also check the includes/excludes when buckets with empty count are added to the result.
This change fixes this bug and adds a test for it.
Fixes#22140
The creation of the `ValuesSource` used to pass `DateTimeZone.UTC` as a time
zone all the time in case of empty fields in spite of the fact that all doc
value formats but the date one reject this parameter.
This commit centralizes the creation of the `ValuesSource` and adds unit tests
to it.
Closes#22009
With this commit we enable the Jackson feature 'STRICT_DUPLICATE_DETECTION'
by default. This ensures that JSON keys are always unique. While this has
a performance impact, benchmarking has indicated that the typical drop in
indexing throughput is around 1 - 2%.
As a last resort, we allow users to still disable strict duplicate checks
by setting `-Des.json.strict_duplicate_detection=false` which is
intentionally undocumented.
Closes#19614
Today we write 0x00 or 0x01 for false or true when serializing a boolean
(and 0x02 for null when serializing an optional boolean) but we
deserialize any non-zero byte to true (except when deserializing an
optional boolean in which case we deserialize 0x02 to null, 0x01 to
true, and any other non-zero byte to false). This too easily allows
corruption into the stream. Instead, we should mark the stream as
corrupted and stop deserializing. This catches when we try to
deserialize something as a boolean that is not a boolean.
Relates #22152
This commit enables CLI commands to be closeable and installs a runtime
shutdown hook to ensure that if the JVM shuts down (as opposed to
aborting) the close method is called.
It is not enough to wrap uses of commands in main methods in
try-with-resources blocks as these will not run if, say, the virtual
machine is terminated in response to SIGINT, or system shutdown event.
Relates #22126
Grok was originally ignoring potential matches to named-capture groups
larger than one. For example, If you had two patterns containing the
same named field, but only the second pattern matched, it would fail to
pick this up.
This PR fixes this by exploring all potential places where a
named-capture was used and chooses the first one that matched.
Fixes#22117.
This commit fixes a for loop that reverses the order of shard stats
coming off the wire, and is really hard to read anyway (with the
post-increment in the loop initializer).
Relates #22150
Today we rely on the version that the API user passes in together with the DiscoveryNode. This commit introduces a low level handshake where nodes exchange their version to be used with the transport protocol that is executed every time a connection to a node is established. This, on the one hand allows to change the wire protocol based on the version we are talking to even without a full cluster restart. Today we would need to carry on a BWC layer across major versions but with a handshake we can rely on the fact that the latest version of the previous minor executes a handshake and uses the latest protocol version across all communication with the N+1 version nodes.
This change is yet fully backwards compatible, a followup PR will remove the BWC in 6.0 once this has been back-ported to the 5.x branch
Starts to centralize creation of the `XContentParser` in
`protected final` methods on `ESTestCase`. The idea is to enable
adding `NamedXContentRegistry` relatively easily by giving tests
a single place they can override to define the
`NamedXContentRegistry`. Since `NamedXContentRegistry` doesn't
exist yet neither does the override point.
This doesn't attempt to migrate all the tests to calling the
new methods to build the parsers. I wanted to make this so we
could review the concept and then I'll merge a followup to
migrate the tests.
Not only was StringJoiner unused, it's also a class only available in java 1.8, which is a problem given that the REST client has minimum java required set to 1.7
This class is just a wrapper around `SearchContext`, so let's use
`SearchContext` directly. The change is mechanical, except the
`ValuesSourceConfig` class, where I moved the logic to get a `ValuesSource`
given a config.
When using dynamic templates, ES will now throw an exception if a
`match_mapping_type` is used that doesn't correspond to an actual type.
Relates to #17285
Plugins also have the need to provide better OOTB experience by configuring
defaults unless the plugin is used in _production_ mode. This change exposes
the bootstrap check infrastructure as part of the plugin API to allow plugins
to specify / install their own bootstrap checks if necessary.
Our query DSL supports empty queries (`{}`), which have a different meaning depending on the query that holds it, either ignored, match_all or match_none. We deprecated the support for empty queries in 5.0, where we log a deprecation warning wherever they are used.
The way we supported it once we moved query parsing to the coordinating node was having an Optional<QueryBuilder> return type in all of our parse methods (called fromXContent). See #17624. The central place for this was QueryParseContext#parseInnerQueryBuilder. We can now remove all the optional return types and simply throw an exception whenever an empty query is found.
The warnings get printed out in a single line e.g. WARNING: request [DELETE http://localhost:9200/index/type/_api] returned 3 warnings:[this is warning number 0],[this is warning number 1],[this is warning number 2]
When we decided to deprecate and remove fuzzy query in #15760, we didn't realize we would take away the possibililty for uses to use a fuzzy query as part of a span query, which is not possible using match query. This means we have to go back and un-deprecate fuzzy query, which will not be removed.
Closes#15760
Queries must be rewritten before the query phase executes otherwise non-executable queries like `wrapper` query or `terms` will fail or queries that require resources like script service can't access these service unless rewritten.
Relates to #21303
`include` / `exclude` in terms / sig-terms aggs seems completely broken
and massively untested. This commit makes the TermsTests pass again that
randomly use `include` / `exclude`. This class must be tested individually
and we need real integ tests that use xcontent that use this feature.