We should open up the node to the world when it's as ready as possible
At the moment we open up the transport service before the local node has been fully initialized. This causes bugs, as some data structures are not fully initialized yet. See for example #16723.
Sadly, we can't just start the TransportService last (as we do with the HTTP server) because the ClusterService needs to know the bound published network address for the local DiscoveryNode. This address can only be determined by actually binding (people may use, for example, port 0). Instead we start the TransportService as late as possible but block any incoming requests until the node has completed initialization.
A couple of other cleanups during startup:
1) The gateway service now starts before the initial cluster join so we can simplify the logic to recover state if the local node has become master.
2) The discovery is started before the transport service accepts requests, but we only start the join process later using a dedicated method.
Closes #16723
Closes #16746
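The "bind early, accept late" approach described above can be sketched roughly as follows. This is only an illustration: `GatedTransport` and its methods are hypothetical names, no real socket is bound, and the timeout on waiting requests is an assumption rather than the behavior of the actual TransportService.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a transport layer that binds early but serves late. */
final class GatedTransport {
    private final CountDownLatch ready = new CountDownLatch(1);
    private volatile int boundPort = -1;

    /** Records the bound port so the published address is known; requests stay blocked. */
    void start(int port) {
        boundPort = port; // a real implementation would bind a socket here
    }

    int boundPort() {
        return boundPort;
    }

    /** Called once the rest of the node has finished initializing. */
    void acceptIncomingRequests() {
        ready.countDown();
    }

    /** Handles a request, waiting up to the timeout for the gate to open. */
    String handle(String request, long timeoutMillis) {
        try {
            if (!ready.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return "rejected: node not ready";
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "rejected: interrupted";
        }
        return "handled: " + request;
    }
}
```

The address is available right after `start`, so other services can use it, while nothing is served until `acceptIncomingRequests` is called.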
This commit removes the system property "es.useLinkedTransferQueue" that
defaulted to false and was used to control the queue implementation used
in a few places.
Closes #16786
This commit adds a check on startup for G1 GC while running on early
versions of HotSpot version 25. This is to prevent potential data
corruption issues that can occur on those versions.
Closes #16737
Java NIO has the notion of gathering writes. These are writes that
gather data from multiple buffers into a single channel. These gathering
writes in Netty have been enabled by default with the possibility to
disable them using "es.netty.gathering". This flag was added in case
having gathering writes on by default did not work out. We have not
published this ability and sufficient time has passed to render
judgement that using gathering writes is okay.
Closes #16774
Expose http address in cat/nodes and cat/nodeattrs APIs
We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too, since otherwise JSON parsing is required to get this information.
Elasticsearch should reject ids that are this long, to ensure a document
always remains retrievable for clients that impose a maximum URI length
Closes #16034
Most elements in SearchSourceBuilder (e.g. aggs, queries) write their top-level
ParseField name in toXContent(), while HighlightBuilder used to do it in
its own toXContent() method. Moved this up to SearchSourceBuilder for consistency.
Today we might start a node and some of the paths might not have the
required permissions. This commit goes through all data directories as
well as index, shard and state directories and ensures we have write access.
To make this work across all operating systems, we try to write a real file
and remove it again in each of those directories.
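A minimal sketch of such a probe, assuming a hypothetical helper named `WriteAccessCheck` (the file prefix and suffix are illustrative, not the ones used by the commit):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: checks write access by actually creating and deleting a file. */
final class WriteAccessCheck {
    /** Returns true if a temporary file can be created and removed in the directory. */
    static boolean isWritable(Path directory) {
        try {
            Path probe = Files.createTempFile(directory, "write-probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            // covers missing directories, permission problems and read-only file systems
            return false;
        }
    }
}
```

Writing a real file avoids relying on permission bits, which are reported differently across platforms and file systems.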
This commit removes the es.max-open-files flag as the same information
can be obtained from the cluster nodes info API, and a warning is logged on
startup if it's set too low anyway.
Closes #16757
This commit tries to 'guess' whether a user is starting a node in production by
checking if any network host is configured. If that is the case, soft limits
that are otherwise only logged, such as the number of open file descriptors, are enforced.
Closes #16727
Today we have the notion of a snapshot inside Version.java which makes
releasing complicated since to do a release Version.java must be changed.
This commit removes all notions of snapshot from the code and allows
switching between snapshot and release builds by specifying a system property on
the build. For instance running:
```
gradle run -Dbuild.snapshot=false
```
will build and package a release build while the default always
builds snapshots. Calls to the main rest action will still get the snapshot
information rendered out with the response.
This was changed when adding the text field in an attempt to clean up
how analyzers are set. Unfortunately this change was not safe for the
string field given that it can also represent keywords.
Also renamed histogram.AbstractBuilder to AbstractHistogramBuilder, range.AbstractBuilder to AbstractRangeBuilder and org.elasticsearch.search.aggregations.pipeline.having to org.elasticsearch.search.aggregations.pipeline.bucketselector.
Function Score Query now checks the type of token that we are parsing, which makes parsing stricter and allows throwing useful errors in case the JSON is malformed. It also makes the code more readable with regard to what gets parsed when.
Closes #16583
This commit updates the OrdinalsBuilder and GeoPoint FieldData loader to work with the new PREFIX_ENCODING introduced in lucene-5.5.0. Backcompat is included to support legacy encoding types.
closes #16634
After #15776 got in, we don't need these copy constructors anymore. When we used to copy requests it was to make sure that headers and context were copied from the parent requests (e.g. index/delete as part of update). This is not a problem anymore.
The current logic for doing recovery from a source to a target shard is tightly coupled with the underlying network pipes. This change decouples the two, making it easier to add unit tests for shard recovery that don't involve the node and network environment.
On top of that, RecoveryTarget is renamed to RecoveryTargetService, leaving room to rename RecoveryStatus to RecoveryTarget (and thus avoid the confusion we have today with RecoveryState).
Correspondingly RecoverySource is renamed to RecoverySourceService.
Closes #16605
All we do is check the cancelled flag and stop the request at a few key
points.
Adds the cancellation cause to the status so any request that is cancelled
but doesn't die can be seen in the task list.
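The idea of checking a flag at a few key points and exposing the cancellation cause can be sketched like this. `CancellableWork` is a hypothetical class for illustration only; it does not mirror the actual task classes.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical unit of work that cooperatively checks a cancellation flag. */
final class CancellableWork {
    // null means "not cancelled"; otherwise this holds the cancellation reason
    private final AtomicReference<String> cancelReason = new AtomicReference<>();

    void cancel(String reason) {
        cancelReason.compareAndSet(null, reason); // the first reason wins
    }

    boolean isCancelled() {
        return cancelReason.get() != null;
    }

    /** Status string of the kind that could be shown in a task listing. */
    String status() {
        String reason = cancelReason.get();
        return reason == null ? "running" : "cancelled [" + reason + "]";
    }

    /** Processes items one by one and stops at the next check once cancelled. */
    int run(int items) {
        int processed = 0;
        for (int i = 0; i < items; i++) {
            if (isCancelled()) {
                break; // key point: bail out between units of work
            }
            processed++;
        }
        return processed;
    }
}
```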
The `keyword` field is intended to replace `not_analyzed` string fields. It is
indexed and has doc values by default, and doesn't support enabling term
vectors.
Although it doesn't support setting an analyzer for now, there are plans for
it to support basic normalization in the future such as case folding.
2.x has shown so far that running with the security manager is the way to go.
This commit makes this non-optional. Users that need to pass their own rules
can still do this via the system configuration for the security manager. They
can even opt out of all security that way.
This commit moves IndicesRequestCache into o.e.indices and makes all APIs in this
class package private. All references to SearchRequest, SearchContext etc. have been factored
out and relevant glue code has been added to IndicesService. The IndicesRequestCache is now a
simple class without any hard dependencies on ThreadPool, SearchService or IndexShard. This now
allows adding unit tests.
This commit also removes two settings `indices.requests.cache.clean_interval` and `indices.fielddata.cache.clean_interval`
in favor of `indices.cache.clean_interval` which cleans both caches.
Some backwards-incompatible setting changes:
http.netty.http.blocking_server -> http.tcp.blocking_server
http.netty.host (removed, we just have http.host)
http.netty.bind_host (removed, we just have http.bind_host)
http.netty.publish_host (removed, we just have http.publish_host)
http.netty.tcp_no_delay -> http.tcp.no_delay
http.netty.tcp_keep_alive -> http.tcp.keep_alive
http.netty.reuse_address -> http.tcp.reuse_address
http.netty.tcp_send_buffer_size -> http.tcp.send_buffer_size
http.netty.tcp_receive_buffer_size -> http.tcp.receive_buffer_size
Closes #16531
This is a minor cleanup that detaches `IndicesRequestCache` and `IndicesQueryCache`
from guice and moves them into `IndicesService`. It also decouples `IndexShard` and `IndexService`
from these caches, which are unnecessary dependencies.
QueryBuilders today do all their heavy lifting in toQuery() which
can be too late for several operations. For instance if we want to fetch geo shapes
on the coordinating node we need to do all this before we create the actual lucene query
which happens on the shard itself. Also optimizations for request caching need to be done
to the query builder rather than the query, which then in turn needs to be serialized again.
This commit adds the basic infrastructure for query rewriting and moves the heavy lifting into
the rewrite method for the following queries:
* `WrapperQueryBuilder`
* `GeoShapeQueryBuilder`
* `TermsQueryBuilder`
* `TemplateQueryBuilder`
Other queries like `MoreLikeThisQueryBuilder` still need to be fixed / converted. The nice
side effect of this is that queries like template queries will now also match the request cache
if their non-template equivalent has been cached before. In the future this will allow us to
add optimizations like rewriting time-based queries into primitives like `match_all_docs` or `match_no_docs`
based on the current shard's bounds. This is especially appealing for indices that are read-only, i.e. never change.
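A rough sketch of what rewriting to a fixed point could look like. All types here (`SketchQueryBuilder`, `TermSketch`, `TemplateSketch`, `Rewriter`) and the `field:value` template format are assumptions made for illustration; they are not the interfaces introduced by this commit.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical, minimal builder contract; not the real Elasticsearch interface. */
interface SketchQueryBuilder {
    /** Returns a simpler builder, or this instance when nothing is left to rewrite. */
    default SketchQueryBuilder rewrite(Map<String, String> templates) {
        return this;
    }
}

/** A fully resolved query that needs no rewriting. */
final class TermSketch implements SketchQueryBuilder {
    final String field;
    final String value;

    TermSketch(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TermSketch
            && field.equals(((TermSketch) o).field)
            && value.equals(((TermSketch) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, value);
    }
}

/** A query that must be resolved against stored templates before it can run. */
final class TemplateSketch implements SketchQueryBuilder {
    final String templateName;

    TemplateSketch(String templateName) {
        this.templateName = templateName;
    }

    @Override
    public SketchQueryBuilder rewrite(Map<String, String> templates) {
        String body = templates.get(templateName); // assumed format: "field:value"
        int colon = body == null ? -1 : body.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("unusable template [" + templateName + "]");
        }
        return new TermSketch(body.substring(0, colon), body.substring(colon + 1));
    }
}

final class Rewriter {
    /** Rewrites repeatedly until a builder returns itself. */
    static SketchQueryBuilder rewriteFully(SketchQueryBuilder query, Map<String, String> templates) {
        SketchQueryBuilder current = query;
        while (true) {
            SketchQueryBuilder next = current.rewrite(templates);
            if (next == current) {
                return current;
            }
            current = next;
        }
    }
}
```

Because the rewritten template query is equal to the plain term query, both would map to the same cache key in a cache keyed on the rewritten builder.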
In testCanFetchIndexStatus the task check can occur before the indexing process is started, causing the test to fail. This commit adds an additional lock to make sure we check tasks only after at least one of the tasks is registered.
This commit removes bootstrap support for Java Service Wrapper. The
implementation has been moved to its own repository, where it was
deprecated. It does not work with Elasticsearch 2.x and is untested and
unmaintained.
Closes #16580
This commit handles the scenario where a replication action fails on a
replica shard, the primary shard attempts to fail the replica shard
but the primary shard is notified of demotion by the master. In this
scenario, the demoted primary shard must be failed, and then the
request rerouted again to the new primary shard.
Closes #16415, closes #14252
Adds to GeoDistanceSortBuilder:
* equals
* hashcode
* writeto/readfrom
* moves xcontent parsing logic over
* adds roundtrip tests
* fixes roundtrip test for xcontent by keeping points just as geopoints not geohashes internally
* fixes xcontent parsing of ignore_malformed if coerce is set/unset
* adds exception to sortMode setter to avoid setting invalid sort modes
Relates to #15178
Today put mapping operations only update metadata of the type that is being
modified, which is not enough since some modifications may have side-effects
on other types.
Closes #16239
Only tasks that extend CancellableTask can be cancelled using this mechanism. If a cancellable task has children it can elect to cancel all child tasks as well. In this case a special ban parent request is sent to all nodes. This request does two things: 1) it prevents any tasks with the banned parent task from being started, and 2) it cancels all currently running tasks that have the banned task as a parent. The ban is lifted as soon as the coordinating node notifies all other nodes that the cancelled task has finished executing. If the coordinating node leaves the cluster before it has a chance to lift its bans, all bans set by this coordinating node are automatically removed.
As an option a task can elect to automatically cancel all child tasks if their parent task was running on a node that just left the cluster. This option makes sense for cancellable heavy tasks that have no side-effects and only return results to the coordinating node. With the coordinating node gone, it doesn't make sense to run such tasks any longer since their results will be most likely discarded.
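The two effects of a ban described above (rejecting new children and cancelling running ones) can be illustrated with a small per-node registry. `NodeTaskRegistry` and its string ids are hypothetical and stand in for whatever each node actually keeps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-node bookkeeping for banned parent tasks. */
final class NodeTaskRegistry {
    private final Set<String> bannedParents = new HashSet<>();
    private final Map<String, String> parentByTask = new HashMap<>(); // task id -> parent id
    private final Set<String> cancelledTasks = new HashSet<>();

    /** Registers a child task unless its parent is currently banned. */
    synchronized void register(String taskId, String parentId) {
        if (bannedParents.contains(parentId)) {
            throw new IllegalStateException("parent task [" + parentId + "] is banned");
        }
        parentByTask.put(taskId, parentId);
    }

    /** Bans a parent: blocks new children and cancels the ones already running. */
    synchronized void banParent(String parentId) {
        bannedParents.add(parentId);
        for (Map.Entry<String, String> entry : parentByTask.entrySet()) {
            if (entry.getValue().equals(parentId)) {
                cancelledTasks.add(entry.getKey());
            }
        }
    }

    /** Lifts the ban, e.g. once the cancelled parent has finished executing. */
    synchronized void liftBan(String parentId) {
        bannedParents.remove(parentId);
    }

    synchronized boolean isCancelled(String taskId) {
        return cancelledTasks.contains(taskId);
    }
}
```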
That is like some kind of cardinal sin or something, right?
We had two violations though they weren't super likely to be keys in a hashmap
any time soon.
This is a simple port of the mapper attachment plugin to the ingest
functionality, no new features. The only option is to limit
the number of chars to prevent indexing of huge documents.
Fields can be selected in the processor as well.
Closes #16303
This PR renames the following three variables to fix a typo `settting` into `setting`.
* Rename a static class member:
INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTTING -> INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTING
* Rename a parameter: aSettting --> aSetting
* Rename a local variable: indexSetttings -> indexSettings
This commit registers bootstrap settings used on startup. Without
registration, setting any of these settings causes node startup to
fail. By registering these settings (rather than clearing) after use, we
enable them to be visible in any APIs that show all settings.
Closes #16513
The purpose of this commit is to speed up the runtime of
MessageDigestTests#testToHexString. As written, the test contains a loop
that creates 1024 test cases leading to a test runtime on the order of a
few seconds. Given build infrastructure, a single test case should
suffice. Therefore, this commit removes this loop so that the test can
execute on the order of a couple hundred milliseconds.
This commit includes a few minor cleanups to o/e/b/JavaVersion.java:
- Stronger argument checking in JavaVersion#parse
- Use JDK 8 string joiner
- Keep an immutable copy of the version sequence
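The three cleanups can be pictured together in a small sketch. `VersionSketch` is a hypothetical class, and the accepted format (dot-separated digits only) is an assumption, not necessarily what JavaVersion#parse accepts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.StringJoiner;

/** Hypothetical dotted-version holder illustrating the cleanups listed above. */
final class VersionSketch {
    private final List<Integer> components;

    private VersionSketch(List<Integer> components) {
        // defensive, immutable copy of the version sequence
        this.components = Collections.unmodifiableList(new ArrayList<>(components));
    }

    /** Parses strings such as "1.8.0"; anything else is rejected up front. */
    static VersionSketch parse(String value) {
        Objects.requireNonNull(value, "value");
        if (!value.matches("[0-9]+(\\.[0-9]+)*")) {
            throw new IllegalArgumentException("invalid version [" + value + "]");
        }
        List<Integer> parsed = new ArrayList<>();
        for (String part : value.split("\\.")) {
            parsed.add(Integer.parseInt(part));
        }
        return new VersionSketch(parsed);
    }

    List<Integer> components() {
        return components;
    }

    @Override
    public String toString() {
        StringJoiner joiner = new StringJoiner(".");
        for (Integer component : components) {
            joiner.add(component.toString());
        }
        return joiner.toString();
    }
}
```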
IndexShard currently holds an arbitrarily used `getQueryShardContext` that comes
out of a ThreadLocal. Its usage is undefined and arbitrary since there is also
such a method with different semantics on `IndexService`. This commit removes the ThreadLocal on
IndexShard as well as on the context itself. Its types are now a member and the QueryShardContext
lifecycle is managed by SearchContext, which passes the types on from the SearchRequest.
Recovery from store fails to correctly set the translog recovery stats. This fixes it and tightens up the logic bringing it all to IndexShard (previously it was set by the recovery logic).
Closes #15974
Closes #16493
This commit modifies the MessageDigests message digest provider to
return a thread local instance of MessageDigest instances instead of
using clone since some providers do not support clone.
Closes #16479
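A thread-local digest provider along the lines described above might look like the following. `DigestSketch` is a hypothetical name and only SHA-256 is shown; the reset before handing out the instance is an assumption about how reuse would be made safe.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical provider that hands out one MessageDigest per thread instead of cloning. */
final class DigestSketch {
    private static final ThreadLocal<MessageDigest> SHA_256 = ThreadLocal.withInitial(() -> {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    });

    /** Returns this thread's instance, reset so earlier partial input is discarded. */
    static MessageDigest sha256() {
        MessageDigest digest = SHA_256.get();
        digest.reset();
        return digest;
    }
}
```

Since each thread only ever sees its own instance, no `clone()` support is needed from the provider.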
There is no need for IndicesWarmer to be a globally accessible class. All it needs
access to is inside IndexService. It also doesn't need to be mutable once it's not a per node
instance. This commit moves IndicesWarmer to IndexWarmer and makes the default impls like field data and
norms warming an impl detail. Also the IndexShard doesn't depend on this class anymore; instead it accepts
an Engine.Warmer as a ctor argument which delegates to the actual warmer from the index.
The cat API previously used the Content-Type header field for
determining the media type of the response. This is in opposition to the
HTTP spec which specifies the Accept header field for this purpose. This
commit replaces the use of the Content-Type header field with the Accept
header field in the cat API.
Closes #14421
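As a simplified illustration of picking the response media type from the Accept header: `CatMediaType` is a hypothetical helper, the supported types and the `text/plain` fallback are assumptions, and quality values (`q=`) are ignored here for brevity.

```java
import java.util.Locale;

/** Hypothetical helper that derives the response media type from the Accept header. */
final class CatMediaType {
    static String fromAcceptHeader(String accept) {
        if (accept != null) {
            for (String entry : accept.split(",")) {
                // drop parameters such as ";q=0.9" and normalize case
                int semicolon = entry.indexOf(';');
                String type = (semicolon >= 0 ? entry.substring(0, semicolon) : entry)
                    .trim()
                    .toLowerCase(Locale.ROOT);
                if (type.equals("application/json") || type.equals("application/yaml")) {
                    return type;
                }
            }
        }
        return "text/plain"; // assumed default when nothing supported is requested
    }
}
```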
One of our tests leaked a system property here since we failed after applying some
system properties in BootstrapCLIParser. This is not a huge deal in production since
we exit the JVM if we fail on that. Yet for correctness we should only apply them if
we manage to parse them all.
This also caused a test failure lately on CI but on an unrelated test:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/314/console
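The "parse everything first, apply afterwards" idea can be sketched as follows. `PropertyArgs` and the `key=value` argument format are assumptions for illustration and do not reflect how BootstrapCLIParser is structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser that only touches system properties after all arguments parsed. */
final class PropertyArgs {
    /** Parses "key=value" arguments; throws on the first malformed one. */
    static Map<String, String> parseAll(String... args) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String arg : args) {
            int equals = arg.indexOf('=');
            if (equals <= 0) {
                throw new IllegalArgumentException("malformed property [" + arg + "]");
            }
            parsed.put(arg.substring(0, equals), arg.substring(equals + 1));
        }
        return parsed;
    }

    /** Applies the properties only if every argument was parsed successfully. */
    static void parseAndApply(String... args) {
        Map<String, String> parsed = parseAll(args); // may throw; nothing is applied yet
        parsed.forEach(System::setProperty);
    }
}
```

A failure in the second argument therefore leaves the first property unset, so nothing leaks into later tests.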
Indices level field data caching belongs in IndicesService and doesn't need to be
wired by guice. This commit also moves the async cache refresh out of the class into
IndicesService such that threadpool dependencies are removed and testing / creation becomes
simpler.
This processor is useful when all elements of a JSON array need to be processed in the same way.
This avoids having to define a processor for each element in an array.
Also, it is very likely that the number of elements inside a JSON array is unknown.
Retrieving distributed DF for TermVectors is, besides its esoteric justification,
a very slow process and can cause serious load on the cluster. We also don't have nearly
enough testing for this stuff and given the complexity we should remove it rather than carrying it
around.
During initial cluster forming, when a master is elected, it reaches out to all other master nodes and asks for the last cluster state they persisted. To make sure we select the right state, we must successfully read from `min_master_nodes` nodes. The gateway currently has specific settings to override this behavior, but I don't think they are ever used. We can drop them and reach out to the discovery layer, the single source of truth for the min master nodes settings.
Closes #16446
This change documents the Terminal abstraction that cli tools use, as
well as simplifies the api to be a minimal set of methods to interact
with a terminal.
Uses a refactored version of Netty's CORS implementation to provide more
robust cross-origin resource request functionality. The CORS specific
Elasticsearch parameters remain the same, just the underlying
implementation has changed.
It has also been refactored in a way that allows dropping in Netty's
CORS handler as a replacement once Elasticsearch is upgraded to Netty 4.
This change rewrites the entire settings filtering mechanism to be immutable.
All filters must be registered up-front in the SettingsModule. Filters that are comma-separated are
not allowed anymore, which is checked on registration.
This commit also adds settings filtering to the default settings recently added to ensure we don't render
filtered settings.
Mostly just wrapping the exception list.
Also:
* Reworks the docs on ElasticsearchExceptionHandle
* Removes long lines from ExceptionSerializationTests
* Switches one method from arrow shaped to early returns
* Adds line breaks
This splits the geo distance and geo distance sorting tests marked messy:
Test cases that don't really need Groovy support are moved back to the
core test suite closer to the code they actually test.
Relates to #15178
This commit modifies the string representation of a shard state action
request. The issue being addressed is that the previous logging would
log "failure: [Unknown]" for shard started actions but this just leads
to confusion that there is a failure but its cause is unknown.
Closes #16396
Today, shard failure requests are blindly handled on the master without
any validation that the request is a legal request. A legal request is a
shard failure request for which the shard requesting the failure is
either the local allocation or the primary allocation. This is because
shard failure requests are classified into only two sets: requests that
correspond to shards that exist, and requests that correspond to shards
that do not exist. Requests that correspond to shards that do not exist
are immediately marked as successful (there is nothing to do), and
requests that correspond to shards that do exist are sent to the
allocation service for handling the failure.
This pull request adds a third classification for shard failure requests
to separate out illegal shard failure requests and enables the master to
validate shard failure requests. The master communicates the illegality
of a shard failure request via a new exception:
NoLongerPrimaryShardException. This exception can be used by shard
failure listeners to discover when they've sent a shard failure request
that they were not allowed to send (e.g., if they are no longer the
primary allocation for the shard).
Closes #16275
Identifying when a plugin id is maven coordinates is currently done by
checking if the plugin id contains 2 colons. However, a valid url could
have 2 colons, for example when a port is specified. This change adds
another check, ensuring the plugin id with maven coordinates does not
contain a slash, which only a url would have.
closes #16376
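The combined check can be written in a couple of lines. `PluginIdSketch` is a hypothetical helper used only to illustrate the rule (exactly two colons and no slash); the example ids are made up.

```java
/** Hypothetical helper: decides whether a plugin id looks like maven coordinates. */
final class PluginIdSketch {
    static boolean looksLikeMavenCoordinates(String pluginId) {
        long colons = pluginId.chars().filter(c -> c == ':').count();
        // a url with a port also has two colons, but it always contains a slash
        return colons == 2 && !pluginId.contains("/");
    }
}
```

With this rule `org.example:my-plugin:1.0.0` is treated as maven coordinates, while `http://localhost:8080/my-plugin.zip` is not.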
This commit marks OldIndexBackwardsCompatibilityIT#testOldIndexes as
awaiting a Lucene snapshot upgrade to reflect the fact that
Elasticsearch 2.2.0 is built against Lucene 5.4.1 but the current Lucene
snapshot in master/2.x does not contain the Lucene version 5.4.1 field.
Relates #16373
This commit fixes a test bug in
JvmGcMonitorServiceSettingsTests#testMissingSetting. The purpose of the
test is to test that if settings are provided for a collector for at
least one of warn, info, and debug then it is provided for all of warn,
info, and debug. However, for a collector setting to be valid it must be
a positive time value but the randomization in the test construction
could produce zero time values.
Closes #16369