Makes ScriptModule just a plain class that manages building the
ScriptSettings and ScriptService from plugins. When we *need*
to bind ScriptService with guice we bind it in a lambda.
- fix it so that processors with the `ignore_failure` option do not
record their exception in the response
- add more tests to make empty `on_failure`. This now throws an
exception
Today we only throw random exceptions on the translog writer. This commit
extends it to also throw exceptions during checkpoint writing etc to test
if the correct flags are provided to open method etc.
This makes this sequence:
```
curl -XDELETE localhost:9200/source,dest?pretty
for i in $( seq 1 100 ); do
curl -XPOST localhost:9200/source/test -d'{"test": "test"}'; echo
done
curl localhost:9200/_refresh?pretty
curl -XPOST 'localhost:9200/_reindex?pretty&wait_for_completion=false' -d'{
"source": {
"index": "source"
},
"dest": {
"index": "dest"
}
}'
curl 'localhost:9200/_tasks/Jsyd6d9wSRW-O-NiiKbPcQ:237?wait_for_completion&pretty'
```
Return task *AND* the response to the user.
This also renames "result" to "response" in the persisted task info
to line it up with how we name the objects in Elasticsearch.
This commit modifies reading the Elasticsearch jar manifest via the URL
instead of converting the URL to an NIO path for increased portability.
Relates #18999
The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6.
Though moving to BM25 could have some downside for queries that relies on coordination factor (match_query, multi_match_query) ?
relates #18944
```
> Throwable #1: java.lang.RuntimeException: file handle leaks: [SeekableByteChannel(/var/lib/jenkins/workspace/elastic+elasticsearch+master+g1gc/core/build/testrun/integTest/J0/temp/org.elasticsearch.search.suggest.CompletionSuggestSearch2xIT_518545A20D129C8C-001/tempDir-001/data/nodes/1/indices/4sTECv6WSJOJsw9L4CGamg/0/index/segments_1), SeekableByteChannel(/var/lib/jenkins/workspace/elastic+elasticsearch+master+g1gc/core/build/testrun/integTest/J0/temp/org.elasticsearch.search.suggest.CompletionSuggestSearch2xIT_518545A20D129C8C-001/tempDir-001/data/nodes/1/indices/4sTECv6WSJOJsw9L4CGamg/0/index/segments_1)]
> at __randomizedtesting.SeedInfo.seed([518545A20D129C8C]:0)
> at org.apache.lucene.mockfile.LeakFS.onClose(LeakFS.java:63)
> at org.apache.lucene.mockfile.FilterFileSystem.close(FilterFileSystem.java:77)
> at org.apache.lucene.mockfile.FilterFileSystem.close(FilterFileSystem.java:78)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception
> at org.apache.lucene.mockfile.LeakFS.onOpen(LeakFS.java:46)
> at org.apache.lucene.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:81)
> at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:271)
> at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212)
> at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:240)
> at java.nio.file.Files.newByteChannel(Files.java:361)
> at java.nio.file.Files.newByteChannel(Files.java:407)
> at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77)
> at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:94)
> at org.apache.lucene.util.LuceneTestCase.slowFileExists(LuceneTestCase.java:2695)
> at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:737)
> at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:94)
> at org.elasticsearch.common.lucene.Lucene$1.doBody(Lucene.java:237)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:685)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:637)
> at org.elasticsearch.common.lucene.Lucene.checkSegmentInfoIntegrity(Lucene.java:242)
> at org.elasticsearch.index.store.Store$MetadataSnapshot.loadMetadata(Store.java:847)
> at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:740)
> at org.elasticsearch.index.store.Store.getMetadata(Store.java:260)
> at org.elasticsearch.index.store.Store.getMetadata(Store.java:240)
> at org.elasticsearch.index.shard.IndexShard.doCheckIndex(IndexShard.java:1310)
> at org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:102)
> at org.elasticsearch.index.shard.IndexShard.checkIndex(IndexShard.java:1288)
> at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:921)
> at org.elasticsearch.index.shard.IndexShard.skipTranslogRecovery(IndexShard.java:964)
> at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:297)
> at
```
The MMapDirectory has a switch that allows the content of files to be loaded
into the filesystem cache upon opening. This commit exposes it with the new
`index.store.pre_load` setting.
Today we only emit that the setting wasn't found unless we have
some DYM suggestions. Yet, if a setting is not found at all and there
are no suggestions due to typos it's likely a removed setting or the plugin
that is supposed to be configured is not installed.
This commit adds some info text to the exception to help the user debugging
the problem before opening bugreports.
Instead of emitting:
`unknown setting [foo.bar]`
we now emit:
`unknown setting [foo.bar] please check the migration guide for removed settings and ensure that the plugin you are configuring is installed`
Relates to #18663
If someone sets `index.shard.check_on_startup`, indexing start up time can be slow (by design, it diligently goes and checks all data). If for some reason the shard is closed in that time, the store ref is kept around and prevents a new shard copy to be allocated to this node via the shard level locks. This is especially tricky if the shard is close due to a cancelled recovery which may re-restart soon.
This commit adds a cancellable threads instance to each IndexShard and perform index checking underneath it, so it can be cancelled on close.
We pretended to be able to ackt like a different version node for so long it's
time to be honest and remove this ability. It's just confusing and where needed
and tested we should build dedicated extension points.
This commit introduce unit testing infrastructure to test replication operations using real index shards. This is infra is complementary to the full integration tests and unit testing of ReplicationOperation we already have. The new ESIndexLevelReplicationTestCase base makes it easier to test and simulate failure mode that require real shards and but do not need the full blow stack of a complete node.
The commit also add a simple "nothing is wrong" test plus a test that checks we don't drop docs during the various stages of recovery.
For now, only single doc indexing is supported but this can be easily extended in the future.
This class was forked in 0.20 to remove a volatile keyword. While there
is no issue attached to the commit, no evidence of the criticality of the
change nor does it seem to be correct since we set this value internally as well
I think this class should be used as is from joda time even if we have to pay
the price of volatile reads. We can't do 3rd party optimization in our codebase that
way it just not maintainable.
This was added in 2280915d3c
This commit removes the search preference _only_node as the same
functionality can be obtained by using the search preference
_only_nodes. This commit also adds a test that ensures that _only_nodes
will continue to support specifying node IDs.
Relates #18875
In the past, we had the semantics where the very first cluster state a node processed after joining could not contain shard assignment to it. This was to make sure the node cleans up local / stale shard copies before receiving new ones that might confuse it. Since then a lot of work in this area, most notably the introduction of allocation ids and #17270 . This means we don't have to be careful and just reroute in the same cluster state change where we process the join, keeping things simple and following the same pattern we have in other places.
This change removes some unnecessary dependencies from ClusterService
and cleans up ClusterName creation. ClusterService is now not created
by guice anymore.
The NodeJoinController is responsible for processing joins from nodes, both normally and during master election. For both use cases, the class processes incoming joins in batches in order to be efficient and to accumulated enough joins (i.e., >= min_master_nodes) to seal an election and ensure the new cluster state can be committed. Since the class was written, we introduced a new infrastructure to support batch changes to the cluster state at the `ClusterService` level. This commit rewrites NodeJoinController to use that infra and be simpler.
The PR also introduces a new concept to ClusterService allowing to submit tasks in batches, guaranteeing that all tasks submitted in a batch will be processed together (potentially with more tasks). On top of that I added some extra safety checks to the ClusterService, around potential double submission of task objects into the queue.
This is done in preparation to revive #17811
Today we have a push model for registering basically anything. All our extension points
are defined on modules which we pass in to plugins. This is harder to maintain and adds
unnecessary dependencies on the modules itself. This change moves towards a pull model
where the plugin offers a getter kind of method to get the extensions. This will also
help in the future if we need to pass dependencies to the extension points which can
easily be defined on the method as arguments if a pull model is used.
Currently the error messages for failing tests in the TimeZoneRoundingTests test
suite are hard to read because they usually report the actual end expected date
in milliseconds utc (e.g. "Expected: <1414270860000L> but: was <1414270800000L>".
This makes failing tests hard to read.
This change introduces a new Matcher that can be used for equality checks for
long dates but reports the error both as a formated date string according to
some time zone and also as the actual long values, so you get messages like
"Expected: 2014-10-26T00:01:00.000+03:00 [1414270860000] but: was
"2014-10-26T00:00:00.000+03:00 [1414270800000]".
Also clean cleaning up some helper methods and generally simplifying a few test
cases. Otherwise this change shouldn't affect either the scope of the test or
anything about the rounding implementation itself.
They have been implemented in https://issues.apache.org/jira/browse/LUCENE-7289.
Ranges are implemented so that the accuracy loss only occurs at index time,
which means that if you are searching for values between A and B, the query will
match exactly all documents whose value rounded to the closest half-float point
is between A and B.
Registering a script engine or native scripts still uses Guice today
and is much more complicated than needed. This change moves to a pull
based model where script plugins have to implement a dedicated interface
`ScriptPlugin` and defines simple getter returning instances rather than
classes.
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
This interface used to have dedicated methods to prevent calling execute
methods. These methods are unnecessary as the checks can simply be
done inside the execute methods itself. This simplifies the interface
as well as its usage.
With this commit we exclude certain HTTP requests that are needed to inspect the cluster
from HTTP request limiting to ensure these commands are processed even in critical
memory conditions.
Relates #17951, relates #18145, closes#18833
When installing plugins, we first try the elastic download service for
official plugins, then try maven coordinates, and finally try the
argument as a url. This can lead to confusing error messages about
unknown protocols when eg an official plugin name is mispelled. This
change adds a heuristic for determining if the argument in the final
case is in fact a url that we should try, and gives a simplified error
message in the case it is definitely not a url.
closes#17226
The search preference _prefer_node allows specifying a single node to
prefer when routing a request. This functionality can be enhanced by
permitting multiple nodes to be preferred. This commit replaces the
search preference _prefer_node with the search preference _prefer_nodes
which supplants the former by specifying a single node and otherwise
adds functionality.
Relates #18872
This caches FieldStats at the field level. For one off requests or for
few indicies this doesn't save anything, but when there are 30 indices,
5 shards, 1 replica, 100 parallel requests this is about twice as fast
as not caching. I expect lots of usage won't see much benefit from this
but pointing kibana to a cluster with many indexes and shards, will be
faster.
Closes#18717
This adds a get task API that supports GET /_tasks/${taskId} and
removes that responsibility from the list tasks API. The get task
API supports wait_for_complation just as the list tasks API does
but doesn't support any of the list task API's filters. In exchange,
it supports falling back to the .results index when the task isn't
running any more. Like any good GET API it 404s when it doesn't
find the task.
Then we change reindex, update-by-query, and delete-by-query to
persist the task result when wait_for_completion=false. The leads
to the neat behavior that, once you start a reindex with
wait_for_completion=false, you can fetch the result of the task by
using the get task API and see the result when it has finished.
Also rename the .results index to .tasks.