1. Move `clean_before_test` to the first test so its more explicit.
2. Move `skip_not_tar_gz` to setup because it was run first in every test.
3. Remove calls to `run` that only check the status. Its simpler to just
execute the command. Its better because std-out will be captured and replayed
on error.
4. Switch from `su` to `sudo` because `su` was breaking `bats`'s error
reporting.
EsExecutorsTests had a test that was failing spuriously due to threadpools
being threadpools. This weakens the assertions that the test makes to what
should always be true.
In order to have consistent deploys across several repositories,
we should deploy to sonatype first, then mirror those contents,
and then upload to s3.
This means, the aws wagon is not needed anymore.
It might turn out to be useful to have the actual commit hash of the version we are
looking for if our download manager can just redirect to the right staging repository.
mvn has a versions:set plugin, that can be easily invoked and does not
require the python script to parse the files and hope that there are no
other snapshot mentions.
This is purely for maintainance reasons since it easier to see if we can drop
certain stageing urls if we have the version next to the hash.
I also removed the gpg passphrase from the example URL since it's better to get prompted?
This is caused by sending the same file to the chunk handler with offset
`0` which in-turn opens a new outputstream and waits for bytes. But the next round
will send 0 bytes again with offset 0. This commit adds some checks / validators that those
settings are positive byte values and fixes the RecoveryStatus to throw an IAE if the same file
is opened twice.
What is the problem we are trying to solve ?
===========================================
When we are doing aggregations against a field name as shown in
https://github.com/HarishAtGitHub/elasticsearch-tester/blob/master/12135.py#L37-L46
search = {
"aggs": {
"NAME": {
"terms": {
"field": "ip_str",
"size": 10
}
}
}
}
and when the field "ip_str" has values of different types in different indices
. say one is of type StringTerms type and other is of IP(LongTerms type) then
the aggregation fails as the types do not match(incompatible).
The failure throws a class cast exception as follows:
{
"error": {
"root_cause": [],
"type": "reduce_search_phase_exception",
"reason": "[reduce] ",
"phase": "query",
"grouped": true,
"failed_shards": [],
"caused_by": {
"type": "class_cast_exception",
"reason": "org.elasticsearch.search.aggregations.bucket.terms.LongTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket"
}
},
"status": 503
}
which is hard to understand . User cannot infer anything about the cause of the problem and what he should do from seeing the
class cast exception.
What can be the possible solution ?
===================================
Make the exception more readable by showing him the root cause of the problem so that he can
understand which area actually caused the problem, so that he can take necessary steps further.
Code Analysis
=============
Debugging code shows that:
the query /{indices}/_search?search_type=count involves two phases
1) search phase
***************
searchService.sendExecuteQuery(...) [Ref: TransportSearchCountAction]
what happens here ?
the phase 1, which is the search phase goes without error.
In this phase the shards for the given indexes are collected and the search is done on all asynchronously
and finally collected in the variable "firstResults" and given to meger phase.
[Flow: .... -> TransportSearchTypeAction -> method performFirstPhase]
2) merge phase
**************
searchPhaseController.merge(...firstResults...) [Ref: TransportSearchCountAction]
what happens here ?
the "firstresults" QuerySearchResults are now to be aggregated and combined.
[Flow: SearchPhaseController.merge(...) -> ..... -> InternalTerms.doReduce(...)]
the phase 1, which is the search phase goes without error.
The problem comes in phase 2, which is merge phase.
Now the individual term buckets are available.
As per the test case , there are two indices cast and cast2, so by default 10 shards.
cast has ip_str of type StringTerms
cast2 has ip_str of type ip which is actually LongTerms
so here two types of Buckets exist. StringTerms_Bucket and LongTerms_Bucket.
Now the aggregation is to be put inside the BucketPriorityQueue(size 2: as out of 10, 2 has hits) finally.
(docs of PriorityQueue: https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/util/PriorityQueue.html#insertWithOverflow(T))
Now first the LongTerms$Bucket is put inside.
then the StringTerms$Bucket is to be put in.
This is the area where exception is thrown. What happens is when adding the StringTerms$Bucket now it has to
goes through the code "lessThan(element, heap[1])"
which finally calls
---------------------------------------------------------------------------------------------
| StringTerms$Bucket.compareTerms(other) <---------------- Area of exception |
| |
--------------------------------------------------------------------------------------------
where when comparing one to other a type cast is done and it fails as StringTerms$Bucket and LongTerms$Bucket are
incompatible.
Approach to solve:
==================
The best way is to make user understand that the problem is when reducing/merging/aggregating the buckets which came as a result of
querying different shards, so that this will make them infer that the problem is because the values of the fields are of different types.
The message is also user friendly and much better than the indecipherable classcastexception.
The only place to infer correctly that the aggregation has failed is in the place where aggregations take place.
so
at InternalTerms.java -> (BucketPriorityQueue)ordered.insertWithOverflow(b);
so here I can throw AggregationExecutionException saying it is because the buckets are of different
types.
But when can I infer at this point that the failure is due to mismatch of types of buckets ???
it can be possible only if at this point it is informed that the problem which occurred deep inside
is due to buckets that were incomparable.
so from just a classCastException we cannot make such a pointed exact inference, because
as class cast exception can be due to a number of scenarios and at a number of places.
so unless we inform the exact problem to InternalTerms it will not be able to infer properly.
so infer the classCastException at the compareTerms function itself that it is a IncomparableTermBucktesTypeException.
This is the best place to infer classCastException as this the place which generated the exception.
Best inference of exceptions can be done only at the source/origin of the exception.
so IncomparableTermBucktesTypeException to InternalTerms-> will make it infer and conclude on why
aggregation failed and give best information to user.
Close#12821
The IndicesModule was made up of two submodules, one which
handled registering queries, and the other for registering
hunspell dictionaries. This change moves those into
IndicesModule. It also adds a new extension point type,
InstanceMap. This is simply a Map<K,V>, where K and V are
actual objects, not classes like most other extension points.
I also added a test method to help testing instance map extensions.
This was particularly painful because of how guice binds the key
and value as separate bindings, and then reconstitutes them
into a Map at injection time. In order to gain access to the
object which links the key and value, I had to tweak our
guice copy to not use an anonymous inner class for the Provider.
Note that I also renamed the existing extension point types, since
they were very redundant. For example, ExtensionPoint.MapExtensionPoint
is now ExtensionPoint.ClassMap.
See #12783.
shard variable dependency from processFirstPhaseResults as shard is no more
needed here . it only deals with the results obtained from the synchronous search on each shard.
The ClusterModule contained a couple submodules. This moves the
functionality from those modules into ClusterModule. Two of those
had to do with DynamicSettings. This change also cleans up
how DynamicSettings are built, and enforces they are added, with
validators, in ClusterModule.
See #12783.
* Centralised plugin docs in docs/plugins/
* Moved integrations into same docs
* Moved community clients into the clients section of the docs
* Removed docs/community
Closes#11734Closes#11724Closes#11636Closes#11635Closes#11632Closes#11630Closes#12046Closes#12438Closes#12579
Systemd looks to be a bit less tolerant about $VAR than bash is. Replace
$VAR with ${VAR} in places in the systemd configuration file to get the
substitutions working.