EsExecutorsTests had a test that was failing spuriously due to threadpools
being threadpools. This weakens the assertions that the test makes to what
should always be true.
It might turn out to be useful to have the actual commit hash of the version we are
looking for if our download manager can just redirect to the right staging repository.
This is purely for maintainance reasons since it easier to see if we can drop
certain stageing urls if we have the version next to the hash.
I also removed the gpg passphrase from the example URL since it's better to get prompted?
This is caused by sending the same file to the chunk handler with offset
`0` which in-turn opens a new outputstream and waits for bytes. But the next round
will send 0 bytes again with offset 0. This commit adds some checks / validators that those
settings are positive byte values and fixes the RecoveryStatus to throw an IAE if the same file
is opened twice.
What is the problem we are trying to solve ?
===========================================
When we are doing aggregations against a field name as shown in
https://github.com/HarishAtGitHub/elasticsearch-tester/blob/master/12135.py#L37-L46
search = {
"aggs": {
"NAME": {
"terms": {
"field": "ip_str",
"size": 10
}
}
}
}
and when the field "ip_str" has values of different types in different indices
. say one is of type StringTerms type and other is of IP(LongTerms type) then
the aggregation fails as the types do not match(incompatible).
The failure throws a class cast exception as follows:
{
"error": {
"root_cause": [],
"type": "reduce_search_phase_exception",
"reason": "[reduce] ",
"phase": "query",
"grouped": true,
"failed_shards": [],
"caused_by": {
"type": "class_cast_exception",
"reason": "org.elasticsearch.search.aggregations.bucket.terms.LongTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket"
}
},
"status": 503
}
which is hard to understand . User cannot infer anything about the cause of the problem and what he should do from seeing the
class cast exception.
What can be the possible solution ?
===================================
Make the exception more readable by showing him the root cause of the problem so that he can
understand which area actually caused the problem, so that he can take necessary steps further.
Code Analysis
=============
Debugging code shows that:
the query /{indices}/_search?search_type=count involves two phases
1) search phase
***************
searchService.sendExecuteQuery(...) [Ref: TransportSearchCountAction]
what happens here ?
the phase 1, which is the search phase goes without error.
In this phase the shards for the given indexes are collected and the search is done on all asynchronously
and finally collected in the variable "firstResults" and given to meger phase.
[Flow: .... -> TransportSearchTypeAction -> method performFirstPhase]
2) merge phase
**************
searchPhaseController.merge(...firstResults...) [Ref: TransportSearchCountAction]
what happens here ?
the "firstresults" QuerySearchResults are now to be aggregated and combined.
[Flow: SearchPhaseController.merge(...) -> ..... -> InternalTerms.doReduce(...)]
the phase 1, which is the search phase goes without error.
The problem comes in phase 2, which is merge phase.
Now the individual term buckets are available.
As per the test case , there are two indices cast and cast2, so by default 10 shards.
cast has ip_str of type StringTerms
cast2 has ip_str of type ip which is actually LongTerms
so here two types of Buckets exist. StringTerms_Bucket and LongTerms_Bucket.
Now the aggregation is to be put inside the BucketPriorityQueue(size 2: as out of 10, 2 has hits) finally.
(docs of PriorityQueue: https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/util/PriorityQueue.html#insertWithOverflow(T))
Now first the LongTerms$Bucket is put inside.
then the StringTerms$Bucket is to be put in.
This is the area where exception is thrown. What happens is when adding the StringTerms$Bucket now it has to
goes through the code "lessThan(element, heap[1])"
which finally calls
---------------------------------------------------------------------------------------------
| StringTerms$Bucket.compareTerms(other) <---------------- Area of exception |
| |
--------------------------------------------------------------------------------------------
where when comparing one to other a type cast is done and it fails as StringTerms$Bucket and LongTerms$Bucket are
incompatible.
Approach to solve:
==================
The best way is to make user understand that the problem is when reducing/merging/aggregating the buckets which came as a result of
querying different shards, so that this will make them infer that the problem is because the values of the fields are of different types.
The message is also user friendly and much better than the indecipherable classcastexception.
The only place to infer correctly that the aggregation has failed is in the place where aggregations take place.
so
at InternalTerms.java -> (BucketPriorityQueue)ordered.insertWithOverflow(b);
so here I can throw AggregationExecutionException saying it is because the buckets are of different
types.
But when can I infer at this point that the failure is due to mismatch of types of buckets ???
it can be possible only if at this point it is informed that the problem which occurred deep inside
is due to buckets that were incomparable.
so from just a classCastException we cannot make such a pointed exact inference, because
as class cast exception can be due to a number of scenarios and at a number of places.
so unless we inform the exact problem to InternalTerms it will not be able to infer properly.
so infer the classCastException at the compareTerms function itself that it is a IncomparableTermBucktesTypeException.
This is the best place to infer classCastException as this the place which generated the exception.
Best inference of exceptions can be done only at the source/origin of the exception.
so IncomparableTermBucktesTypeException to InternalTerms-> will make it infer and conclude on why
aggregation failed and give best information to user.
Close#12821
The IndicesModule was made up of two submodules, one which
handled registering queries, and the other for registering
hunspell dictionaries. This change moves those into
IndicesModule. It also adds a new extension point type,
InstanceMap. This is simply a Map<K,V>, where K and V are
actual objects, not classes like most other extension points.
I also added a test method to help testing instance map extensions.
This was particularly painful because of how guice binds the key
and value as separate bindings, and then reconstitutes them
into a Map at injection time. In order to gain access to the
object which links the key and value, I had to tweak our
guice copy to not use an anonymous inner class for the Provider.
Note that I also renamed the existing extension point types, since
they were very redundant. For example, ExtensionPoint.MapExtensionPoint
is now ExtensionPoint.ClassMap.
See #12783.
shard variable dependency from processFirstPhaseResults as shard is no more
needed here . it only deals with the results obtained from the synchronous search on each shard.
The ClusterModule contained a couple submodules. This moves the
functionality from those modules into ClusterModule. Two of those
had to do with DynamicSettings. This change also cleans up
how DynamicSettings are built, and enforces they are added, with
validators, in ClusterModule.
See #12783.
Previously settings specified in index templates were not validated upon
template creation. Creating an index from an index template with invalid
settings could lead to cluster stability issues because creation of such
indexes would bypass index settings validation.
This commit adds validation of settings specified in index templates at
template creation time. This works by routing the index template
settings through the index settings validation mechanism.
Closes#12865
Users with IPv6 preferred over IPv4 may have `localhost` resolve to
`::1` instead of `127.0.0.1`, so we should be explicit so they don't run
into issues.
Today on a failure the reproduce line printed out by the test framework
will build all projects and might fail if the test class is not present.
This commit adds a reactor filter to the reproduction line to ensure
unrelated projects are skipped.
Closes#12838
In order to match the paths of official plugins, we need to fix
the broken test by removing the elasticsearch prefix from the official
plugin names before testing.
In order to create releases without actually changing the version
as part of a commit, we also need to reflect the path of the potentially
changing S3 repo.
This method has multiple modes of resolving config files by
first looking in the config directory, then on the classpath,
and finally by prefixing with "config/" on the classpath.
Most of the places taking advantage of this were tests, so they
did not have to setup a real home dir with config. The only place
that was really relying on it was the code which loads names.txt
to randomly choose a node name.
This change fixes test to setup fake home dirs with their config
files. It also makes the logic for finding names.txt explicit:
look in config dir, and if it doesn't exist, load /config/names.txt
from the classpath.
TermsQueryParser still parses those values although deprecated. These need to be present in the java api as well to get ready for the query refactoring, where the builders are the intermediate query format that we parse our json queries into. Whatever the parser supports need to be supported by the builder as well.
Closes#12870
TermsQueryParser doesn't support the cache field anymore, so if it gets set through java api, the subsequent parsing of that query will throw error
Relates to #12870
Refactored a part out of the release script, so the user can
change the version locally as well as move the documentation
and change the Version.java
The background of this change is to have a very simple release
process that puts stuff into a staging environment, so the beta
release can be tested, before it is officially released.
This means the build_release script can be removed soon.
Settings currently has a classloader member, which any user (plugin
or core ES code) can access to load classes/resources. This is extremely
error prone as setting the classloder on the Settings instance is a
public method. Furthermore, it is not really necessary. Classes that
need resources should load resources using normal means
(getClass().getResourceAsStream). Those that need classes
should use Class.forName, which will load the class with the
same classloader as the calling class. This means, in the few
places where classes are loaded by string name, they will use
the appropriate loader: either the default classloader which loads
core ES code, or a child classloader for each plugin.
This change removes the classloader member from Settings, as
well as other classloader related uses (except for a handful
of cases which must use a classloader, at least for now).
We previous used something like Class.forName to load mock classes,
where tests would set a setting that was *supposed* to only be used by
tests. This change make these impls package private so that only tests
can change out these implementations, through test plugins.
closes#12784
No need to load catch this query since it's cheap and not reused.
If we cache it, it can cause assertions to be tripped since this
method is executed during postRecovery phase and might still run while
nodes are shutdown in tests.
This commit tries to add some infrastructure to streamline how extension
points should be strucutred. It's a simple approache with 4 implementations
for `highlighter`, `suggester`, `allocation_decider` and `shards_allocator`.
It simplifies adding new extension points and forces to register classes instead
of strings.
Build fails with maven 3.3.1 and 3.3.3. To reproduce, install one of the 3.3.x versions of maven and run `mvn clean verify` in the root directory of the project. The build will fail in the QA: Smoke Test Shaded Jar module with the following error:
```
Started J0 PID(99979@flea.local).
Suite: org.elasticsearch.shaded.test.ShadedIT
2> NOTE: reproduce with: ant test -Dtestcase=ShadedIT -Dtests.method=testJodaIsNotOnTheCP -Dtests.seed=2F4D23A7462CF921 -Dtests.locale= -Dtests.timezone=Asia/Baku -Dtests.asserts=true -Dtests.file.encoding=UTF-8
FAILURE 0.06s | ShadedIT.testJodaIsNotOnTheCP <<<
> Throwable #1: junit.framework.AssertionFailedError: Expected an exception but the test passed: java.lang.ClassNotFoundException
> at __randomizedtesting.SeedInfo.seed([2F4D23A7462CF921:3A9404F1F69FD80]:0)
> at junit.framework.Assert.fail(Assert.java:57)
> at java.lang.Thread.run(Thread.java:745)
2> NOTE: reproduce with: ant test -Dtestcase=ShadedIT -Dtests.method=testGuavaIsNotOnTheCP -Dtests.seed=2F4D23A7462CF921 -Dtests.locale= -Dtests.timezone=Asia/Baku -Dtests.asserts=true -Dtests.file.encoding=UTF-8
FAILURE 0.01s | ShadedIT.testGuavaIsNotOnTheCP <<<
> Throwable #1: junit.framework.AssertionFailedError: Expected an exception but the test passed: java.lang.ClassNotFoundException
> at __randomizedtesting.SeedInfo.seed([2F4D23A7462CF921:C2502FD54D83433D]:0)
> at junit.framework.Assert.fail(Assert.java:57)
> at java.lang.Thread.run(Thread.java:745)
2> NOTE: reproduce with: ant test -Dtestcase=ShadedIT -Dtests.method=testjsr166eIsNotOnTheCP -Dtests.seed=2F4D23A7462CF921 -Dtests.locale= -Dtests.timezone=Asia/Baku -Dtests.asserts=true -Dtests.file.encoding=UTF-8
FAILURE 0.01s | ShadedIT.testjsr166eIsNotOnTheCP <<<
> Throwable #1: junit.framework.AssertionFailedError: Expected an exception but the test passed: java.lang.ClassNotFoundException
> at __randomizedtesting.SeedInfo.seed([2F4D23A7462CF921:35593286F4269392]:0)
> at junit.framework.Assert.fail(Assert.java:57)
> at java.lang.Thread.run(Thread.java:745)
2> NOTE: leaving temporary files on disk at: /Users/Shared/Jenkins/Home/workspace/elasticsearch-master/qa/smoke-test-shaded/target/J0/temp/org.elasticsearch.shaded.test.ShadedIT_2F4D23A7462CF921-001
2> NOTE: test params are: codec=CheapBastard, sim=DefaultSimilarity, locale=, timezone=Asia/Baku
2> NOTE: Mac OS X 10.10.4 x86_64/Oracle Corporation 1.8.0_25 (64-bit)/cpus=8,threads=1,free=482137936,total=514850816
2> NOTE: All tests run in this JVM: [ShadedIT]
Completed [1/1] in 6.61s, 5 tests, 3 failures <<< FAILURES!
Tests with failures:
- org.elasticsearch.shaded.test.ShadedIT.testJodaIsNotOnTheCP
- org.elasticsearch.shaded.test.ShadedIT.testGuavaIsNotOnTheCP
- org.elasticsearch.shaded.test.ShadedIT.testjsr166eIsNotOnTheCP
```
Please note that build doesn't fail with maven 3.2.x and it doesn't fail if mvn command is executed inside the qa/smoke-test-shaded directory. Only when the build is started from the root directory the error above can be observed.
The reason is because of the shaded version which depends on elasticsearch core.
When Maven build the module only, then elasticsearch core is not added to the dependency tree.
```sh
mvn dependency:tree -pl :smoke-test-shaded
```
```
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ smoke-test-shaded ---
[INFO] org.elasticsearch.qa:smoke-test-shaded:jar:2.0.0-beta1-SNAPSHOT
[INFO] +- org.elasticsearch.distribution.shaded:elasticsearch:jar:2.0.0-beta1-SNAPSHOT:compile
[INFO] | +- org.apache.lucene:lucene-core:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-backward-codecs:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-analyzers-common:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-queries:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-memory:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-highlighter:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-queryparser:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-sandbox:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-suggest:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-misc:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-join:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-grouping:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-spatial:jar:5.2.1:compile
[INFO] | \- com.spatial4j:spatial4j:jar:0.4.1:compile
[INFO] +- org.hamcrest:hamcrest-all:jar:1.3:test
[INFO] \- org.apache.lucene:lucene-test-framework:jar:5.2.1:test
[INFO] +- org.apache.lucene:lucene-codecs:jar:5.2.1:test
[INFO] +- com.carrotsearch.randomizedtesting:randomizedtesting-runner:jar:2.1.16:test
[INFO] +- junit:junit:jar:4.11:test
[INFO] \- org.apache.ant🐜jar:1.8.2:test
```
But if shaded plugin is involved during the build, it modifies the `projectArtifactMap`:
```sh
mvn dependency:tree -pl org.elasticsearch.distribution.shaded:elasticsearch,:smoke-test-shaded
```
```
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ smoke-test-shaded ---
[INFO] org.elasticsearch.qa:smoke-test-shaded:jar:2.0.0-beta1-SNAPSHOT
[INFO] +- org.elasticsearch.distribution.shaded:elasticsearch:jar:2.0.0-beta1-SNAPSHOT:compile
[INFO] | \- org.elasticsearch:elasticsearch:jar:2.0.0-beta1-SNAPSHOT:compile
[INFO] | +- org.apache.lucene:lucene-backward-codecs:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-analyzers-common:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-queries:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-memory:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-highlighter:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-queryparser:jar:5.2.1:compile
[INFO] | | \- org.apache.lucene:lucene-sandbox:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-suggest:jar:5.2.1:compile
[INFO] | | \- org.apache.lucene:lucene-misc:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-join:jar:5.2.1:compile
[INFO] | | \- org.apache.lucene:lucene-grouping:jar:5.2.1:compile
[INFO] | +- org.apache.lucene:lucene-spatial:jar:5.2.1:compile
[INFO] | | \- com.spatial4j:spatial4j:jar:0.4.1:compile
[INFO] | +- com.google.guava:guava:jar:18.0:compile
[INFO] | +- com.carrotsearch:hppc:jar:0.7.1:compile
[INFO] | +- joda-time:joda-time:jar:2.8:compile
[INFO] | +- org.joda:joda-convert:jar:1.2:compile
[INFO] | +- com.fasterxml.jackson.core:jackson-core:jar:2.5.3:compile
[INFO] | +- com.fasterxml.jackson.dataformat:jackson-dataformat-smile:jar:2.5.3:compile
[INFO] | +- com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:jar:2.5.3:compile
[INFO] | | \- org.yaml:snakeyaml:jar:1.12:compile
[INFO] | +- com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:jar:2.5.3:compile
[INFO] | +- io.netty:netty:jar:3.10.3.Final:compile
[INFO] | +- com.ning:compress-lzf:jar:1.0.2:compile
[INFO] | +- com.tdunning:t-digest:jar:3.0:compile
[INFO] | +- org.hdrhistogram:HdrHistogram:jar:2.1.6:compile
[INFO] | +- org.apache.commons:commons-lang3:jar:3.3.2:compile
[INFO] | +- commons-cli:commons-cli:jar:1.3.1:compile
[INFO] | \- com.twitter:jsr166e:jar:1.1.0:compile
[INFO] +- org.hamcrest:hamcrest-all:jar:1.3:test
[INFO] \- org.apache.lucene:lucene-test-framework:jar:5.2.1:test
[INFO] +- org.apache.lucene:lucene-codecs:jar:5.2.1:test
[INFO] +- org.apache.lucene:lucene-core:jar:5.2.1:compile
[INFO] +- com.carrotsearch.randomizedtesting:randomizedtesting-runner:jar:2.1.16:test
[INFO] +- junit:junit:jar:4.11:test
[INFO] \- org.apache.ant🐜jar:1.8.2:test
```
A fix could consist of fixing something on Maven side. Probably something changed in a recent version and introduced this "issue" but it might be not really an issue. More a fix.
There are two workarounds:
1) exclude manually elasticsearch core from shaded version in smoke-test-shaded module and add manually each lucene lib needed by elasticsearch
2) add a new `elasticsearch-lucene` (lucene) POM module which simply declares all needed lucene libs in subprojects (such as the smoke tester one).
I choose the later.
Closes#12791.
This allows `path.shared_data` to be added to the security manager while
still allowing a custom `data_path` for indices using shadow replicas.
For example, configuring `path.shared_data: /tmp/foo`, then created an
index with:
```
POST /myindex
{
"index": {
"number_of_shards": 1,
"number_of_replicas": 1,
"data_path": "/tmp/foo/bar/baz",
"shadow_replicas": true
}
}
```
The index will then reside in `/tmp/foo/bar/baz`.
`path.shared_data` defaults to `null` if not specified.
Resolves#12714
Relates to #11065
The upper bound must be 0-based since we are corrupting an offset into
the file but it can be a 1-based length of the file which results in an
uncorrupted file.
There were two submodules of AllocationModule. This combines them into a
single module, adds a base test case for module testing, and adds back
the ability for plugins to provide custom ShardsAllocators.
closes#12781
Instead of logging the entire `_source` in the indexing slowlog we log by
default just the first 1000 characters - this is controlled by the
`index.indexing.slowlog.source` settings and can be set to `true` to log the
whole `_source`, `false` to log none of it, and a number to log at most that
many characters.
Closes#4485