ClusterState has 3 different methods to access RoutingNodes:
* #routingNodes() - mutable version
* #getRoutingNodes() - delegates to #getReadOnlyRoutingNodes()
* #getReadOnlyRoutingNodes() - it's docs say `NOTE, the routing nodes are mutable, use them just for read operations`
The latter also reuses the instance that it creates. This has several problems beside the obvious:
* creating RoutingNodes is costly and should be done only if really needed ie. use cached version as much as possible
* the common case is ReadOnly but all kinds of things are called
* mutable version are only needed in one place and should only be used in the AllocationService
* RoutingNodes can freeze it's ShardRoutings but doesn't
* RoutingNodes should check if it's read-only or not
This commit fixed all the problems and special cases the mutable case such that all accesses via ClusterState#getRoutingNodes()
is read-only and RoutingNodes enforces this.
This is one of our esoteric metadata mappers so I think we should distribute
it in a plugin rather than in elasticsearch core.
This introduces one limitation: the value of the `_size` parameter is not
retrievable for documents that are only in the transaction log.
The `multi_match` query groups terms that have the same analyzer together and
then applies the boost of the first query in each group. This is not necessary
given that boosts for each term are already applied another way.
Versions can be tricky with linux vendors and such. To help debug any possible issues, we should output a better version.
Today:
```
[elasticsearch] java.lang.RuntimeException: Java version: 1.7.0_55 suffers from critical bug https://bugs.openjdk.java.net/browse/JDK-8024830 which can cause data corruption.
[elasticsearch] Please upgrade the JVM, see http://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html for current recommendations.
[elasticsearch] If you absolutely cannot upgrade, please add -XX:-UseSuperWord to the JAVA_OPTS environment variable.
[elasticsearch] Upgrading is preferred, this workaround will result in degraded performance.
[elasticsearch] at org.elasticsearch.bootstrap.JVMCheck.check(JVMCheck.java:121)
[elasticsearch] at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:270)
[elasticsearch] at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:28)
```
With patch:
```
java.lang.RuntimeException: Java version: Oracle Corporation 1.7.0_40 [Java HotSpot(TM) 64-Bit Server VM 24.0-b56] suffers from critical bug https://bugs.openjdk.java.net/browse/JDK-8024830 which can cause data corruption.
Please upgrade the JVM, see http://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html for current recommendations.
If you absolutely cannot upgrade, please add -XX:-UseSuperWord to the JAVA_OPTS environment variable.
Upgrading is preferred, this workaround will result in degraded performance.
at org.elasticsearch.bootstrap.JVMCheck.check(JVMCheck.java:121)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:270)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:28)
```
This commit adds a new API to allow scripts to say whether they need scores.
In practice, only the `expression` script engine makes use of it correctly,
other engines just return `true` since they can't predict whether they'll
need scores. This should make scripted aggregations and `function_query`
faster as we'll now be able to pass needsScores=false to Query.createWeight.
Adding a named query that is null can lead to a NullPointerException
when copying the named queries. This is due to an implementation detail
in QueryParseContent.copyNamedQueries. In particular, this method uses
com.google.common.collect.ImmutableMap.copyOf. A documented requirement
of ImmutableMap is that none of the entries have a null key nor null
value. Therefore, we should not add such queries to the namedQueries
map. This will not change any behavior since Map.get returns null if no
entry with the given key exists anyway.
Closes#12683
This change improves the `function_score` query to not compute scores at all
when they are not needed, and to not compute scores on the underlying query
when the combine function is to replace the score with the scores of the
functions.
This commit makes it possible to serialize arbitrary objects by having them extend Writeable. When reading them though, we need to be able to identify which object we have to create, based on its name. This is useful for queries once we move to parsing on the coordinating node, as well as with aggregations and so on.
Introduced a new abstraction called NamedWriteable, which is supported by StreamOutput and StreamInput through writeNamedWriteable and readNamedWriteable methods. A new NamedWriteableRegistry is introduced also where named writeable prototypes need to be registered so that we are able to retrieve the proper instance of the writeable given its name and then de-serialize it calling readFrom against it.
Closes#12393
We have a smoke_test_plugins.py, but its a bit slow, not integrated
into our build, etc.
I converted this into an integration test. It is definitely uglier
but more robust and fast (e.g. 20 seconds time to verify).
Also there is refactoring of existing integ tests logic, like printing
out commands we execute and stuff
In order to avoid extra reroutes, `RoutingService` should avoid
scheduling a reroute of any shards where the delay is negative. To make
sure that we don't encounter a race condition between the
GatewayAllocator thinking a shard is delayed and RoutingService thinking
it is not, the GatewayAllocator will update the RoutingService with the
last time it checked in order to use a consistent "view" of the delay.
Resolves#12456
Relates to #12515 and #12456
When postFilter generates a token that is identical to the input term
DirectCandidateGenerator should not preFilter this token. If postFilter
and preFilter are the same analyzer instance it would fail with :
"TokenStream contract violation: close() call missing"
This is a forward port of @nomoa's #12670
Today we try to detect if there is an index existing in the directory
and if not we create one. This can be tricky and errof prone since we
rely on the filesystem without taking the context into account when the
engine gets created. We know in all situations if the index should be created
so we can just use this infromation and rely on the lucene index writer to barf
if we hit a situations where we can't append to an index while we should.
Today we miss to throw / rethrow an recovery exception if it happens during
the finalization of phase 1 if the source files are not affected. Even worse
this can cause some dataloss if the reason for this exception is a failure of
deleting a corruption marker or similar pre-existing corruptions since we continue
with the recovery and mark the target shared as started which will in-turn open
an engine with an empty index.
This makes the output of EsThreadPoolExecutor#toString less pretty but
we no longer have funky hacky that rely on the specific format of the
toString produced by ThreadPoolExecutor which isn't part of its API and
could change with any JVM version and break the output.