This seems to be an error introduced in refactoring around #16821
where we now wait 30 seconds by default if the node already joined
a cluster and got a master. This can slow down tests dramatically,
especially on slow boxes and notebooks.
Closes#16956
Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. This commit reuses the model across all
shard movements in an allocation round: it calculates the shard model once and simulates the application of all movable shards on this model.
Closes#16926
Today we might run into a rejected execution exception when
we shutdown the node while handling a transport exception. The
exception is run in a separate thread but that thread might
not be able to execute due to the shutdown. Today we barf and fill
the logs with a large exception. This commit catches this exception
and logs it as debug logging instead.
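A minimal sketch of the pattern, assuming a generic executor and a `logger` field in scope (names are illustrative, not the exact Elasticsearch code):
```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

// Run the failure notification on a separate thread, but tolerate the
// executor rejecting the task because the node is shutting down.
void notifyListener(ExecutorService executor, Runnable notification) {
    try {
        executor.execute(notification);
    } catch (RejectedExecutionException e) {
        // the node is shutting down; log quietly at debug instead of
        // filling the logs with a large stack trace
        logger.debug("failed to notify listener, node is shutting down", e);
    }
}
```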
Elasticsearch 5.0 doesn't support indices with legacy checksums anymore.
The last time we wrote legacy checksums was in 1.3.0, which was based
on lucene 4.9 already which means that all files have CRC32 checksums.
All indices that Elasticsearch can read today must be written with
lucene version >= 4.8 anyway so we can drop this layer of backwards
compatibility entirely.
Since we are close to upgrading to Lucene 6.0 we should get rid of this
in a more contained change than the lucene upgrade.
On shared FS / shadow replicas we rely on a lock retry if the lock has
not yet been released on a relocated primary. This commit adds this `hack`
for shared filesystems only.
Closes#16936
The field name is a required argument for all suggesters, but
it was specified via a field() setter in SuggestionBuilder so far.
This changes the field name to be a mandatory constructor argument
and lets suggestion builders throw an error if the field name is missing
or an empty string.
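A sketch of what the constructor contract looks like (simplified from the actual SuggestionBuilder hierarchy):
```java
public abstract class SuggestionBuilder<T extends SuggestionBuilder<T>> {
    protected final String field;

    protected SuggestionBuilder(String field) {
        // the field name is now mandatory and validated up front
        if (field == null || field.isEmpty()) {
            throw new IllegalArgumentException("suggestion field name must not be null or empty");
        }
        this.field = field;
    }
}
```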
This commit modifies TransportBulkAction to use relative time instead of
absolute time when measuring how long a bulk request took to be
processed, and adds tests for this functionality.
Closes#16916
`writeLockTimeout` has been removed in Lucene 6 completely, and since we have
had the shard locking mechanism for quite a while now, we don't need this anymore.
Shards should only be allocated once all resources are released such that there
can't be any other shard holding the lock to that index in any sane situation.
This commit removes the ability to use string fields on indices created on or
after 5.0. Dynamic mappings now generate text fields by default for strings
but there are plans to also add a sub keyword field (in a future PR).
Most of the changes in this commit are just about replacing string with
keyword or text. Some tests have been removed because they covered corner
cases of string mappings, like setting `ignore_above` on a text field or
enabling term vectors on a keyword field, which are now impossible.
The plan is to remove strings entirely in 6.0.
This method was originally introduced to prevent client nodes from connecting to other client nodes directly in the cluster. That said, it worked only if node.client was set to true and not when node.master and node.data were both set to false. It looks safe to remove, which solves all kinds of problems around monitoring that happen whenever there are 2 or more client nodes in the cluster and a REST call hits one of them (node counts are off, client nodes are missing).
Recent simplifications to org.elasticsearch.Version led to "-SNAPSHOT"
not being displayed in the node version upon startup. Yet, this is
useful information to have. This simple commit adds back the display of
"-SNAPSHOT" in the version on startup if the build is a snapshot build.
Closes#16908
This commit refactors the bootstrap checks into a dedicated class. The
refactoring provides a model for different limits per operating system,
and provides a model for unit tests for individual checks.
Closes#16844
This is a backport of #16845 to this branch.
We now also support marking settings with `SettingsProperty.Deprecated`.
If the setting is still used, it will log a warning to the user.
In this case we compute elapsed time, and `System.nanoTime()` is designed to do just that.
The absolute timing `System.currentTimeMillis()` provides is not needed, and the relative timing `System.nanoTime()` provides is likely to be more accurate.
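A small sketch of the measurement, with a `Runnable` standing in for the bulk processing:
```java
import java.util.concurrent.TimeUnit;

long relativeTookInMillis(Runnable bulkWork) {
    // System.nanoTime() is monotonic: it is unaffected by wall-clock
    // adjustments (NTP, manual changes), unlike System.currentTimeMillis().
    final long startNanos = System.nanoTime();
    bulkWork.run();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
}
```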
Use `includeSegmentFileSizes` as the flag name to report disk usage.
Added a test that verifies reported segment disk usage grows after adding a document.
Documentation: reference the new parameter as part of indices stats.
In 2.2, the client nodes created internally by a tribe node were changed
to be explicit about which settings the client nodes use, no longer
loading all settings from elasticsearch.yml. However, some settings were
missed, notably network bind settings. This change adds those settings
to be passed through, as well as adds unit tests for building the tribe
client node settings.
I've noticed throughout the code that we have a need to remove the boilerplate lifecycle check when starting/rescheduling certain runnables. This provides a simpler implementation to get this functionality without duplicating it.
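A sketch of the idea, assuming a `Lifecycle`-like handle that exposes the component's state (the wrapper name is hypothetical):
```java
// Wraps a delegate so the lifecycle check is done once here instead of
// being repeated inline at every call site that schedules the runnable.
class LifecycleAwareRunnable implements Runnable {
    private final Lifecycle lifecycle;
    private final Runnable delegate;

    LifecycleAwareRunnable(Lifecycle lifecycle, Runnable delegate) {
        this.lifecycle = lifecycle;
        this.delegate = delegate;
    }

    @Override
    public void run() {
        if (lifecycle.started() == false) {
            return; // component is stopping or closed; don't run or reschedule
        }
        delegate.run();
    }
}
```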
The `ingest_took` is separate from `took`, which keeps track of how much time is spent on indexing/deleting/updating.
The `ingest_took` is only visible in the rest response if at least one bulk item has ingest enabled.
Currently each suggestion keeps track of its own name. This has
the disadvantage of having to pass down the parsed name property
in the suggestions fromXContent() and the serialization methods
as an argument, since we need it in the ctor.
This change moves the naming of the suggestions to the surrounding
SuggestBuilder and thereby eliminates the need for passing down
the names in the parsing and serialization methods. By making
`name` a required argument in SuggestBuilder#addSuggestion() we
also make sure it is always set, and we can prevent using the same name
twice, which wasn't possible before.
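A sketch of that contract (simplified; the real method operates on the builder's map of suggestions):
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SuggestBuilder {
    private final Map<String, SuggestionBuilder<?>> suggestions = new HashMap<>();

    public SuggestBuilder addSuggestion(String name, SuggestionBuilder<?> suggestion) {
        Objects.requireNonNull(name, "every suggestion needs a name");
        // Map#put returns the previous value, so a non-null result means
        // the same name was already used
        if (suggestions.put(name, suggestion) != null) {
            throw new IllegalArgumentException("already added another suggestion with name [" + name + "]");
        }
        return this;
    }
}
```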
In the case where the publish address is different from all bound addresses and an explicit publish_port
setting is not provided, the publish_port is now selected as the unique port of the bound addresses.
An exception is thrown in case of ambiguities.
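A sketch of the selection rule described above (method name is illustrative):
```java
import java.net.InetSocketAddress;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

static int resolvePublishPort(List<InetSocketAddress> boundAddresses) {
    Set<Integer> ports = new HashSet<>();
    for (InetSocketAddress address : boundAddresses) {
        ports.add(address.getPort());
    }
    if (ports.size() == 1) {
        return ports.iterator().next(); // all bound addresses agree on one port
    }
    throw new IllegalStateException("publish_port is ambiguous, bound ports are " + ports
        + "; please set an explicit publish_port");
}
```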
Closes#16626
If a node was isolated from the cluster while a delete was happening,
the node will ignore the delete operation when rejoining as we couldn't
detect whether the new master genuinely deleted the indices or it is a
fresh "reset" master that was started without the old data folder.
We can now be smarter, detect these reset masters, and actually delete
the indices on the node if it's not the case of a reset master.
Note that this new protection doesn't hold if the node was shut down. In
that case its indices will still be imported as dangling indices.
Closes#16825
Closes#11665
If you connect to a client node and call _cat/nodes, and there is at least one other client node in the cluster, the http address cannot be retrieved, thus we get an NPE. This commit prevents the NPE from being thrown. This bug was introduced with #16770
The suffix TransportAction is misleading as it may make one think that it extends TransportAction, but it does not. This class makes accessible the different search operations exposed by SearchService through the transport layer. Also resolved a few compiler warnings in the class itself.
The big win here is catching tests that are incorrectly named and will
be skipped by gradle, providing a false sense of security.
The whole thing takes about 10 seconds on my Macbook Air, not counting
compiling the test classes, which seems worth it. Because this runs as
a gradle task with proper UP-TO-DATE handling, it can be skipped if the
tests haven't been changed which should save some time.
I chose to keep this in test:framework rather than a new subproject of
buildSrc because it needs to see ESIntegTestCase and doesn't introduce any
additional dependencies.
DiscoveryService was a bridge into the discovery universe. This is unneeded and we can just access discovery directly or do things in a different way.
One of those different ways is not having a dedicated discovery implementation for each of our discovery plugins but rather reusing ZenDiscovery.
UnicastHostProviders are now classified by discovery type, removing unneeded checks on plugins.
Closes#16821
It's useful when you need to have defaults that don't match
SearchSourceBuilder's defaults. You build your SearchSourceBuilder, set the
defaults, and then layer the XContent on top of that.
TransportSearchTypeAction and subclasses are not actually transport actions, but just support classes useful for their inner async actions, which can easily be extracted out so that we get rid of one level of abstraction too many.
Same pattern can be applied to TransportSearchScrollQueryAndFetchAction & TransportSearchScrollQueryThenFetchAction which we could remove in favour of keeping only their inner classes named SearchScrollQueryAndFetchAsyncAction and SearchScrollQueryThenFetchAsyncAction.
Remove org.elasticsearch.action.search.type package, collapsed remaining classes into existing org.elasticsearch.action.search package
Also make the ParsedScrollId, ScrollIdForNode and TransportSearchHelper classes and their methods package private.
Closes#11710
As we rely on active allocation ids persisted in the cluster state to select
the primary shard copy, we can write shard state metadata on the allocated node
as soon as the node knows about receiving this shard. This also ensures that
in case of primary relocation, when the relocation target is marked as started
by the master node, the shard state metadata with the correct allocation id has
already been written on the relocation target. Before this change, shard state
metadata was only written once the node knows it is marked as started. In case
of failures between master marking the node as started and the node
receiving and processing this event, the relation between the shard copy on disk
and the cluster state could get lost. This means that manual allocation of
the shard using the reroute command allocate_stale_primary was necessary.
Closes#16625
Instead of modifying methods each time we need to add a new behavior for settings, we can simply pass `SettingsProperty... properties` instead.
`SettingsProperty` could then be defined as:
```java
public enum SettingsProperty {
    Filtered,
    Dynamic,
    ClusterScope,
    NodeScope,
    IndexScope
    // HereGoesYours
}
```
Then in the setting code, it becomes much more flexible.
TODO: Note that we need to validate the `SettingsProperty` values added to a `Setting`, as some of them might be mutually exclusive.
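For example, declaring a filtered, dynamically updatable node setting might then look like this (a hypothetical sketch of the varargs API, not the final signature):
```java
// the properties replace the old boolean flags and Scope argument
Setting<Boolean> EXAMPLE_SETTING = Setting.boolSetting("example.enabled", false,
    SettingsProperty.Dynamic, SettingsProperty.NodeScope, SettingsProperty.Filtered);
```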
As part of #10136 we removed the transport action for broadcast deletes in case routing is required but not specified. Bulk api worked differently though and kept on doing the broadcast delete internally in that case. This commit makes sure that delete items are marked as failed in such cases. Also the check has been moved up in the code together with the existing check for the update api, and we now make sure that the exception is the same as the one thrown for single document apis (delete/update).
Note that the failure for the update api contained the wrong optype (the type of the document rather than "update"), that's been fixed too and tested.
Closes#16645
This commit disables the production limits checks on snapshot
builds. This is at a minimum short-term relief for developers that do in
fact bind to external network interfaces, and is possibly a long-term
fix as well. The situation with using the JVM flag MaxFDLimit is far too
complicated.
Closes#16835
At some time in the distant past, Bootstrap could be used by those
embedding elasticsearch. However, it is no longer allowed (now package
private), and so any errors (for example, log4j missing, or any other
exception while initializing) should be exposed directly and cause
elasticsearch to fail to start. This change removes hiding of logging
initialization exceptions.
Removes all our logger wrappers except the wrapper for log4j1.2. If you
depend on Elasticsearch's jar in your application you'll need to declare
log4j 1.2 and/or some bridge to your favorite logger.
We did this to simplify our builds and code. No more commons-logging like
log implementation sniffing. No more optional dependency hacks in gradle.
We might one day want to use j.u.l instead of log4j. If we do want that
we can recover its wrapper by studying this commit. We didn't go directly
to j.u.l in this commit because that is a bigger change. Our logging
configuration is based on log4j1.2 and people are used to it. So it'd
be a much more fraught breaking change to do that conversion.
It is possible to register multiple settings with complex matchers that could both match
a given key. The behavior when this occurs can lead to issues and depends on the
number of settings that have been registered. In order to identify the setting for a given
key, we iterate over the values in a map to find the first setting that matches the given key,
but the iteration order of a map should not be relied upon.
This commit checks complex settings when adding them and if the keys for these overlap,
an IllegalArgumentException is now thrown.
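A sketch of the registration-time check, assuming a `match(key)` predicate on `Setting` and a `complexSettings` collection holding the previously registered complex settings:
```java
void registerComplexSetting(Setting<?> setting) {
    for (Setting<?> existing : complexSettings) {
        // reject patterns that could both claim the same concrete key
        if (existing.match(setting.getKey()) || setting.match(existing.getKey())) {
            throw new IllegalArgumentException("complex setting key [" + setting.getKey()
                + "] overlaps existing setting [" + existing.getKey() + "]");
        }
    }
    complexSettings.add(setting);
}
```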
Currently dynamic mappings propagate through call semantics, where deeper
dynamic mappings are merged into higher level mappings through
return values of recursive method calls. This makes it tricky
to handle multiple updates in the same method, for example when
trying to create parent object mappers dynamically for a field name
that contains dots.
This change makes the api for adding mappers a simple list
of new mappers, and moves construction of the root level mapping
update to the end of doc parsing.
We should open up the node to the world when it's as ready as possible. At the moment we open up the transport service before the local node has been fully initialized. This causes bugs as some data structures are not fully initialized yet. See for example #16723.
Sadly, we can't just start the TransportService last (as we do with the HTTP server) because the ClusterService needs to know the bound published network address for the local DiscoveryNode. This address can only be determined by actually binding (people may use, for example, port 0). Instead we start the TransportService as late as possible but block any incoming requests until the node has completed initialization.
A couple of other cleanups during start time:
1) The gateway service now starts before the initial cluster join so we can simplify the logic to recover state if the local node has become master.
2) The discovery is started before the transport service accepts requests, but we only start the join process later using a dedicated method.
Closes#16723
Closes#16746
This commit removes the system property "es.useLinkedTransferQueue" that
defaulted to false and was used to control the queue implementation used
in a few places.
Closes#16786
This commit adds a check on startup for G1 GC while running on early
versions of HotSpot version 25. This is to prevent potential data
corruption issues that can occur on those versions.
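A sketch of the check, assuming the JVM version string and the G1 flag have been read from the running JVM (the affected builds report HotSpot 25 with an update below 40, i.e. JDK 8 before 8u40):
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

static void checkG1GC(boolean useG1GC, String jvmVersion) {
    // HotSpot version strings look like "25.25-b02": major.update-build
    Matcher matcher = Pattern.compile("(\\d+)\\.(\\d+)-b\\d+").matcher(jvmVersion);
    if (useG1GC && matcher.matches()
            && Integer.parseInt(matcher.group(1)) == 25
            && Integer.parseInt(matcher.group(2)) < 40) {
        throw new RuntimeException("G1 GC on this HotSpot build is known to cause "
            + "index corruption; please upgrade the JVM or disable G1");
    }
}
```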
Closes#16737
Java NIO has the notion of gathering writes. These are writes that
gather data from multiple buffers into a single channel. These gathering
writes in Netty have been enabled by default with the possibility to
disable them using "es.netty.gathering". This flag was added in case
having gathering writes on by default did not work out. We have not
published this ability and sufficient time has passed to render
judgement that using gathering writes is okay.
Closes#16774
Expose http address in cat/nodes and cat/nodeattrs APIs
We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too since otherwise json parsing is required to get this information.
Elasticsearch should reject ids that are this long, to ensure a document
always remains retrievable for clients that impose a maximum URI length.
Closes#16034
Most elements in SearchSourceBuilder (e.g. aggs, queries) write their top-level
ParseField name in toXContent(), while HighlightBuilder used to do it in
its own toXContent() method. Moved this up to SearchSourceBuilder for consistency.
Today we might start a node and some of the paths might not have the
required permissions. This commit goes through all data directories as
well as index, shard and state directories and ensures we have write access.
To make this work across all operating systems, we try to write a real file
and remove it again in each of those directories.
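A sketch of the probe, one directory at a time:
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

static void ensureWritable(Path directory) throws IOException {
    // actually writing (and removing) a real file is the only check that is
    // reliable across operating systems and filesystems; inspecting
    // permission bits is not
    Path probe = Files.createTempFile(directory, ".es_write_check", null);
    Files.delete(probe);
}
```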
This commit removes the es.max-open-files flag as the same information
can be obtained from the cluster nodes info API, and is warn logged on
startup if it's set too low anyway.
Closes#16757
This commit tries to 'guess' if a user starts a node in production by
checking if any network host is configured. If that is the case, soft limits
that are otherwise only logged are enforced, like the number of open file descriptors.
Closes#16727
Today we have the notion of a snapshot inside Version.java which makes
releasing complicated since to do a release Version.java must be changed.
This commit removes all notions of snapshot from the code and allows switching
between snapshot and release builds by specifying a system property on
the build. For instance running:
```
gradle run -Dbuild.snapshot=false
```
will build and package a release build while the default always
builds snapshots. Calls to the main rest action will still get the snapshot
information rendered out with the response.
This was changed when adding the text field in an attempt to clean up
how analyzers are set. Unfortunately this change was not safe for the
string field given that it can also represent keywords.
Also renamed histogram.AbstractBuilder to AbstractHistogramBuilder, range.AbstractBuilder to AbstractRangeBuilder and org.elasticsearch.search.aggregations.pipeline.having to org.elasticsearch.search.aggregations.pipeline.bucketselector
Function Score Query now checks the type of token that we are parsing, which makes parsing stricter and allows throwing useful errors when the json is malformed. It also makes the code more readable with respect to what gets parsed when.
Closes#16583
This commit updates the OrdinalsBuilder and GeoPoint FieldData loader to work with the new PREFIX_ENCODING introduced in lucene-5.5.0. Backcompat is included to support legacy encoding types.
Closes#16634
After #15776 got in, we don't need these copy constructors anymore. When we used to copy requests it was to make sure that headers and context were copied from the parent requests (e.g. index/delete as part of update). This is not a problem anymore.
Now that we have a nice Setting infra, we can define in the Setting class whether a setting should be filtered or not.
So when we register a setting, setting filtering is automatically done.
Instead of writing:
```java
Setting<String> KEY_SETTING = Setting.simpleString("cloud.aws.access_key", false, Setting.Scope.CLUSTER);
settingsModule.registerSetting(AwsEc2Service.KEY_SETTING, false);
settingsModule.registerSettingsFilterIfMissing(AwsEc2Service.KEY_SETTING.getKey());
```
We could simply write:
```java
Setting<String> KEY_SETTING = Setting.simpleString("cloud.aws.access_key", false, Setting.Scope.CLUSTER, true);
settingsModule.registerSetting(AwsEc2Service.KEY_SETTING, false);
```
It also removes `settingsModule.registerSettingsFilterIfMissing` method.
The plan would be to remove the `settingsModule.registerSettingsFilter` method as well, but it is still used with wildcards. For example in Azure Repository plugin:
```java
module.registerSettingsFilter(AzureStorageService.Storage.PREFIX + "*.account");
module.registerSettingsFilter(AzureStorageService.Storage.PREFIX + "*.key");
```
Closes#16598.
The current logic for doing recovery from a source to a target shard is tightly coupled with the underlying network pipes. This change decouples the two, making it easier to add unit tests for shard recovery that don't involve the node and network environment.
On top of that, RecoveryTarget is renamed to RecoveryTargetService, leaving space for renaming RecoveryStatus to RecoveryTarget (and thus avoiding the confusion we have today with RecoveryState).
Correspondingly RecoverySource is renamed to RecoverySourceService.
Closes#16605
All we do is check the cancelled flag and stop the request at a few key
points.
Adds the cancellation cause to the status so any request that is cancelled
but doesn't die can be seen in the task list.
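One of those key points might look like this sketch (the exception type and accessors are illustrative, not the exact API):
```java
void ensureNotCancelled(CancellableTask task) {
    if (task.isCancelled()) {
        // stop the request; the cancellation cause stays visible in the task
        // status so a cancelled-but-still-listed request can be understood
        throw new IllegalStateException("task cancelled [" + task.getReasonCancelled() + "]");
    }
    // ... otherwise continue with the next unit of work ...
}
```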
The `keyword` field is intended to replace `not_analyzed` string fields. It is
indexed and has doc values by default, and doesn't support enabling term
vectors.
Although it doesn't support setting an analyzer for now, there are plans for
it to support basic normalization in the future such as case folding.
2.x has shown so far that running with the security manager is the way to go.
This commit makes this non-optional. Users that need to pass their own rules
can still do this via the system configuration for the security manager. They
can even opt out of all security that way.
This commit moves IndicesRequestCache into o.e.indices and makes all API in this
class package private. All references to SearchRequest, SearchContext etc. have been factored
out and relevant glue code has been added to IndicesService. The IndicesRequestCache is now a
simple class without any hard dependencies on ThreadPool, SearchService or IndexShard. This now
allows adding unit tests.
This commit also removes two settings `indices.requests.cache.clean_interval` and `indices.fielddata.cache.clean_interval`
in favor of `indices.cache.clean_interval` which cleans both caches.
Some bw incompatible setting changes:
http.netty.http.blocking_server -> http.tcp.blocking_server
http.netty.host (removed, we just have http.host)
http.netty.bind_host (removed, we just have http.bind_host)
http.netty.publish_host (removed, we just have http.publish_host)
http.netty.tcp_no_delay -> http.tcp.no_delay
http.netty.tcp_keep_alive -> http.tcp.keep_alive
http.netty.reuse_address -> http.tcp.reuse_address
http.netty.tcp_send_buffer_size -> http.tcp.send_buffer_size
http.netty.tcp_receive_buffer_size -> http.tcp.receive_buffer_size
Closes#16531
This is a minor cleanup that detaches `IndicesRequestCache` and `IndicesQueryCache`
from guice and moves them into `IndicesService`. It also decouples the `IndexShard` and `IndexService`
from these caches, which are unnecessary dependencies.
QueryBuilders today do all their heavy lifting in toQuery() which
can be too late for several operations. For instance if we want to fetch geo shapes
on the coordinating node we need to do all this before we create the actual lucene query
which happens on the shard itself. Also optimizations for request caching need to be done
to the query builder rather than the query, which then in turn needs to be serialized again.
This commit adds the basic infrastructure for query rewriting and moves the heavy lifting into
the rewrite method for the following queries:
* `WrapperQueryBuilder`
* `GeoShapeQueryBuilder`
* `TermsQueryBuilder`
* `TemplateQueryBuilder`
Other queries like `MoreLikeThisQueryBuilder` still need to be fixed / converted. The nice
side effect of this is that queries like template queries will now also match the request cache
if their non-template equivalent has been cached before. In the future this will allow adding
optimizations like rewriting time-based queries into primitives like `match_all_docs` or `match_no_docs`
based on the current shard's bounds. This is especially appealing for indices that are read-only, i.e. never change.
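A sketch of the rewrite contract introduced here, assuming the `rewrite(QueryRewriteContext)` method this infrastructure adds: keep rewriting until a fixed point is reached, before any lucene query is built.
```java
import java.io.IOException;

static QueryBuilder rewriteUntilStable(QueryBuilder original, QueryRewriteContext context) throws IOException {
    QueryBuilder current = original;
    QueryBuilder rewritten;
    // a builder that has nothing left to rewrite returns itself
    while ((rewritten = current.rewrite(context)) != current) {
        current = rewritten;
    }
    return current;
}
```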
In testCanFetchIndexStatus the task check can occur before the indexing process is started, making the test fail. This commit adds an additional lock to make sure we check tasks only after at least one of the tasks is registered.
This adds a build method for the SuggestionContext to the PhraseSuggestionBuilder
and another one that creates the SuggestionSearchContext to the top level
SuggestBuilder. Also adding tests that make sure the current way of parsing
xContent to a SuggestionContext is reflected in the output the builders create.
This commit removes bootstrap support for Java Service Wrapper. The
implementation of this has been moved to its own repository where it was
deprecated, does not work with Elasticsearch 2.x, and is untested and
therefore unmaintained.
Closes#16580
This commit handles the scenario where a replication action fails on a
replica shard, the primary shard attempts to fail the replica shard
but the primary shard is notified of demotion by the master. In this
scenario, the demoted primary shard must be failed, and then the
request rerouted again to the new primary shard.
Closes#16415, closes#14252
Adds to GeoDistanceSortBuilder:
* equals
* hashCode
* writeTo/readFrom
* moves xcontent parsing logic over
* adds roundtrip tests
* fixes roundtrip test for xcontent by keeping points just as geopoints not geohashes internally
* fixes xcontent parsing of ignore_malformed if coerce is set/unset
* adds exception to sortMode setter to avoid setting invalid sort modes
Relates to #15178
Today put mapping operations only update metadata of the type that is being
modified, which is not enough since some modifications may have side-effects
on other types.
Closes#16239
Only tasks that extend CancellableTask can be cancelled using this mechanism. If a cancellable task has children it can elect to cancel all child tasks as well. In this case a special ban parent request is sent to all nodes. This request does two things: 1) it prevents any tasks with the banned parent task from being started, and 2) it cancels all currently running tasks that have the banned task as a parent. The ban is lifted as soon as the coordinating node notifies all other nodes that the cancelled task has finished executing. If the coordinating node leaves the cluster before it has a chance to lift its bans, all bans set by this coordinating node are automatically removed.
As an option a task can elect to automatically cancel all child tasks if their parent task was running on a node that just left the cluster. This option makes sense for cancellable heavy tasks that have no side-effects and only return results to the coordinating node. With the coordinating node gone, it doesn't make sense to run such tasks any longer since their results will be most likely discarded.
That is like some kind of cardinal sin or something, right?
We had two violations though they weren't super likely to be keys in a hashmap
any time soon.
This is a simple port of the mapper attachment plugin to the ingest
functionality, no new features. The only option is to limit
the number of chars to prevent indexing of huge documents.
Fields can be selected in the processor as well.
Closes#16303
This PR renames the following three variables to fix the typo `settting` to `setting`.
* Rename a static class member:
INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTTING -> INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTING
* Rename a parameter: aSettting -> aSetting
* Rename a local variable: indexSetttings -> indexSettings