Commit Graph

787 Commits

Author SHA1 Message Date
Adrien Grand 7765b0497d Merge pull request #12497 from oyiadom/master
Update BulkProcessor.java
2015-08-17 09:43:41 +02:00
Adrien Grand 1f2345db34 Merge pull request #12913 from xuzha/xu-exception
Validate class before cast.
2015-08-17 09:38:59 +02:00
Harish Kayarohanam 3976854ada Improve error handling of ClassCastException in terms aggregations.
What is the problem we are trying to solve ?
===========================================

When we are doing aggregations against a field name as shown in
https://github.com/HarishAtGitHub/elasticsearch-tester/blob/master/12135.py#L37-L46

search = {
           "aggs": {
             "NAME": {
               "terms": {
                 "field": "ip_str",
                 "size": 10
               }
             }
           }
         }
and when the field "ip_str" has values of different types in different indices
. say one is of type StringTerms type and other is of IP(LongTerms type) then
the aggregation fails as the types do not match(incompatible).
The failure throws a class cast exception as follows:
{
   "error": {
      "root_cause": [],
      "type": "reduce_search_phase_exception",
      "reason": "[reduce] ",
      "phase": "query",
      "grouped": true,
      "failed_shards": [],
      "caused_by": {
         "type": "class_cast_exception",
         "reason": "org.elasticsearch.search.aggregations.bucket.terms.LongTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket"
      }
   },
   "status": 503
}

which is hard to understand . User cannot infer anything about the cause of the problem and what he should do from seeing the
class cast exception.

What can be the possible solution ?
===================================

Make the exception more readable by showing him the root cause of the problem so that he can
understand which area actually caused the problem, so that he can take necessary steps further.

Code Analysis
=============

Debugging code shows that:
the query /{indices}/_search?search_type=count involves two phases
1) search phase
***************
     searchService.sendExecuteQuery(...) [Ref: TransportSearchCountAction]

     what happens here ?
        the phase 1, which is the search phase goes without error.
        In this phase the shards for the given indexes are collected and the search is done on all asynchronously
        and finally collected in the variable "firstResults" and given to meger phase.

        [Flow: .... -> TransportSearchTypeAction -> method performFirstPhase]

2) merge phase
**************
     searchPhaseController.merge(...firstResults...) [Ref: TransportSearchCountAction]

     what happens here ?
        the "firstresults" QuerySearchResults are now to be aggregated and combined.

        [Flow: SearchPhaseController.merge(...) -> ..... -> InternalTerms.doReduce(...)]

the phase 1, which is the search phase goes without error.
The problem comes in phase 2, which is merge phase.
Now the individual term buckets are available.
As per the test case , there are two indices cast and cast2, so by default 10 shards.
cast has ip_str of type StringTerms
cast2 has ip_str of type ip which is actually LongTerms

so here two types of Buckets exist. StringTerms_Bucket and LongTerms_Bucket.
Now the aggregation is to be put inside the BucketPriorityQueue(size 2: as out of 10, 2 has hits) finally.
(docs of PriorityQueue: https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/util/PriorityQueue.html#insertWithOverflow(T))

Now first the LongTerms$Bucket is put inside.
then the StringTerms$Bucket is to be put in.
This is the area where exception is thrown. What happens is when adding the StringTerms$Bucket now it has to
goes through the code "lessThan(element, heap[1])"
which finally calls

---------------------------------------------------------------------------------------------
|      StringTerms$Bucket.compareTerms(other)  <---------------- Area of exception          |
|                                                                                           |
--------------------------------------------------------------------------------------------

where when comparing one to other a type cast is done and it fails as StringTerms$Bucket and LongTerms$Bucket are
incompatible.

Approach to solve:
==================

The best way is to make user understand that the problem is when reducing/merging/aggregating the buckets which came as a result of
querying different shards, so that this will make them infer that the problem is because the values of the fields are of different types.
The message is also user friendly and much better than the indecipherable classcastexception.

The only place to infer correctly that the aggregation has failed is in the place where aggregations take place.
so

at InternalTerms.java -> (BucketPriorityQueue)ordered.insertWithOverflow(b);

so here I can throw AggregationExecutionException saying it is because the buckets are of different
types.

But when can I infer at this point that the failure is due to mismatch of types of buckets ???
it can be possible only if at this point it is informed that the problem which occurred deep inside
is due to buckets that were incomparable.
so from just a classCastException we cannot make such a pointed exact inference, because
as class cast exception can be due to a number of scenarios and at a number of places.

so unless we inform the exact problem to InternalTerms it will not be able to infer properly.
so infer the classCastException at the compareTerms function itself that it is a IncomparableTermBucktesTypeException.
This is the best place to infer classCastException as this the place which generated the exception.
Best inference of exceptions can be done only at the source/origin of the exception.

so IncomparableTermBucktesTypeException to InternalTerms-> will make it infer and conclude on why
aggregation failed and give best information to user.

Close #12821
2015-08-17 09:35:32 +02:00
Martijn van Groningen 532806af1a inner hits: Use provided StreamContext instead of fetching a new one.
Closes #12905
2015-08-16 23:39:31 +02:00
Ryan Ernst 9974b79c8a Merge pull request #12916 from rjernst/module_culling
Flatten ClusterModule and add more tests
2015-08-16 10:08:40 -07:00
Simon Willnauer 5ab0833990 Don't swallow cause if Store stats can't be build 2015-08-16 16:36:22 +02:00
Simon Willnauer 5699492575 Merge pull request #12917 from HarishAtGitHub/refactorprocessFirstPhase
Refactor - shard variable dependency from processFirstPhaseResults as shard is no more needed
2015-08-16 16:21:13 +02:00
Simon Willnauer 606f5b368b mute the entire InnerHitsIT - Relates to #12905 2015-08-16 16:08:32 +02:00
Harish Kayarohanam 8122243ee7 this is a small commit to remove the
shard variable dependency from processFirstPhaseResults as shard is no more
needed here . it only deals with the results obtained from the synchronous search on each shard.
2015-08-16 17:10:25 +05:30
Ryan Ernst 008dc8ec31 Internal: Flatten ClusterModule and add more tests
The ClusterModule contained a couple submodules. This moves the
functionality from those modules into ClusterModule. Two of those
had to do with DynamicSettings. This change also cleans up
how DynamicSettings are built, and enforces they are added, with
validators, in ClusterModule.

See #12783.
2015-08-16 01:23:05 -07:00
xuzha 062e038360 Throw IllegalArgumentException instead of ClassCastException,
Let stats aggregation returns 400 error when performed over an invalid field

closes #12842
2015-08-15 15:42:03 -07:00
Simon Willnauer 20f6b41337 Mute InnerHitsIT Relates to #12905 2015-08-15 08:41:28 +02:00
Simon Willnauer b447e2ae99 Move master to [2.1.0-SNAPSHOT] 2015-08-14 23:44:06 +02:00
Jason Tedor 6292bc07f9 Merge pull request #12892 from jasontedor/fix/12865
Validate settings specified in index templates at template creation time
2015-08-14 15:43:16 -06:00
Jason Tedor b88d2f6255 Validate settings specified in index templates at template creation time
Previously settings specified in index templates were not validated upon
template creation. Creating an index from an index template with invalid
settings could lead to cluster stability issues because creation of such
indexes would bypass index settings validation.

This commit adds validation of settings specified in index templates at
template creation time. This works by routing the index template
settings through the index settings validation mechanism.

Closes #12865
2015-08-14 15:23:10 -06:00
Ryan Ernst 33690b990a Merge pull request #12872 from rjernst/resolve_your_own_config
Remove Environment.resolveConfig
2015-08-14 13:56:32 -07:00
Simon Willnauer 03ceabb1da Merge pull request #12894 from s1monw/fix_repro_line
Fix reproduction line to include project filters
2015-08-14 22:52:55 +02:00
Lee Hinman 6f5a25d98e [DOC] Use 127.0.0.1 instead of localhost in READMEs
Users with IPv6 preferred over IPv4 may have `localhost` resolve to
`::1` instead of `127.0.0.1`, so we should be explicit so they don't run
into issues.
2015-08-14 14:47:58 -06:00
Martijn van Groningen 942d040f45 inner hits: Reset the `ShardTargetType` after serializing inner hits.
This fixes a bug where only the first top level search hit has a shard target and any subsequent search hits don't.
2015-08-14 22:38:12 +02:00
Lee Hinman 1b9877bb65 Use Java 7 version of Files.readAllLines instead of Java 8 version 2015-08-14 14:36:17 -06:00
Lee Hinman d35a3a37eb Also catch NoSuchFileException 2015-08-14 13:54:14 -06:00
Lee Hinman 33f118e9c8 Print out the name of the sum that failed 2015-08-14 13:51:47 -06:00
Simon Willnauer bba34de6b3 Fix reproduction line to include project filters
Today on a failure the reproduce line printed out by the test framework
will build all projects and might fail if the test class is not present.
This commit adds a reactor filter to the reproduction line to ensure
unrelated projects are skipped.

Closes #12838
2015-08-14 21:36:49 +02:00
Ryan Ernst 867f056cf6 Simplify random name index and move method to its only user 2015-08-14 11:22:20 -07:00
Lee Hinman 41d8b552e6 Validate checksums for plugins if available
When a plugin is downloaded, this change additionally tries to download
`${pluginurl}.sha1` and verify the SHA1 checksum for the file. If no
.sha1 file is found, it tries `${pluginurl}.md5`.

Note that if neither checksum file is found, a notice is printed but the
plugin can still be installed. If the checksum check fails, the plugin
install is aborted.

Example output if no checksums are available:

```
bin/plugin install elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT
-> Installing elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT...
Trying http://download.elastic.co/elasticsearch/elasticsearch-analysis-icu/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying http://search.maven.org/remotecontent?filepath=elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/2.6.0-SNAPSHOT.zip ...
Trying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/master.zip ...
Downloading .....................................DONE
Verifying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/master.zip checksums if available ...
NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
```

Example output if checksums are available:

```
bin/plugin install elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT
-> Installing elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT...
Trying http://download.elastic.co/elasticsearch/elasticsearch-analysis-icu/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying http://search.maven.org/remotecontent?filepath=elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/elasticsearch/elasticsearch-analysis-icu/2.6.0-SNAPSHOT/elasticsearch-analysis-icu-2.6.0-SNAPSHOT.zip ...
Trying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/2.6.0-SNAPSHOT.zip ...
Trying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/master.zip ...
Downloading .....................................DONE
Verifying https://github.com/elasticsearch/elasticsearch-analysis-icu/archive/master.zip checksums if available ...
Downloading .DONE
```

Example output if checksums fail:

```
bin/plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.5.0 -url http://localhost:8000/elasticsearch-analysis-kuromoji-2.5.0.zip
-> Installing elasticsearch/elasticsearch-analysis-kuromoji/2.5.0...
Trying http://localhost:8000/elasticsearch-analysis-kuromoji-2.5.0.zip ...
Downloading .............................................DONE
Verifying http://localhost:8000/elasticsearch-analysis-kuromoji-2.5.0.zip checksums if available ...
Downloading .DONE
ERROR: incorrect hash, file hash: [dbdc9c2cd32782054497a21fbdcae3ca1ff23c80], expected: [dbdc9c2cd32782054497a21fbdcae3ca1ff23c80-bad]
```

Resolves #12750
2015-08-14 10:40:40 -06:00
Alexander Reelsen 0240b581e7 PluginManager: Fix automatically generated URLs for official plugins
In order to match the paths of official plugins, we need to fix
the broken test by removing the elasticsearch prefix from the official
plugin names before testing.
2015-08-14 18:28:54 +02:00
Igor Motov e44991d2ef Mute test SharedClusterSnapshotRestoreIT#renameOnRestoreTest
Working on the fix
2015-08-14 12:12:53 -04:00
Simon Willnauer 4c1ef2c943 Fix test to also include the mapper-size 2015-08-14 18:11:27 +02:00
Simon Willnauer 30692fe917 Add mapper-size to plugin list 2015-08-14 18:04:45 +02:00
Martijn van Groningen 0688dddebf Also delegate rewrite the the wrapped IndexSearcher.
In the the AssertingIndexSearcher is used we then also have extra validation when rewritting the query.
2015-08-14 16:34:04 +02:00
Adrien Grand 2fecc7e5c9 Merge pull request #12875 from jpountz/fix/dumber_context_indexsearcher
Simplify ContextIndexSearcher.
2015-08-14 15:42:23 +02:00
Alexander Reelsen cedbd20f2c PluginManager: Change staging URL to reflect S3 bucket
In order to create releases without actually changing the version
as part of a commit, we also need to reflect the path of the potentially
changing S3 repo.
2015-08-14 15:23:31 +02:00
Adrien Grand b3e7146b22 Simplify ContextIndexSearcher.
In particular this commit moves collector wrapping logic from
ContextIndexSearcher to QueryPhase.
2015-08-14 15:03:38 +02:00
Ryan Ernst f3d63095db Merge branch 'master' into resolve_your_own_config
Conflicts:
	core/src/main/java/org/elasticsearch/env/Environment.java
	core/src/test/java/org/elasticsearch/index/analysis/commongrams/CommonGramsTokenFilterFactoryTests.java
	core/src/test/java/org/elasticsearch/index/analysis/synonyms/SynonymsAnalysisTest.java
	plugins/analysis-kuromoji/src/test/java/org/elasticsearch/index/analysis/KuromojiAnalysisTests.java
2015-08-14 03:34:26 -07:00
Ryan Ernst be638fb6ef Internal: Remove Environment.resolveConfig
This method has multiple modes of resolving config files by
first looking in the config directory, then on the classpath,
and finally by prefixing with "config/" on the classpath.

Most of the places taking advantage of this were tests, so they
did not have to setup a real home dir with config. The only place
that was really relying on it was the code which loads names.txt
to randomly choose a node name.

This change fixes test to setup fake home dirs with their config
files. It also makes the logic for finding names.txt explicit:
look in config dir, and if it doesn't exist, load /config/names.txt
from the classpath.
2015-08-14 03:03:47 -07:00
javanna 4010e7e9a7 Java api: restore support for minimumShouldMatch and disableCoord in TermsQueryBuilder
TermsQueryParser still parses those values although deprecated. These need to be present in the java api as well to get ready for the query refactoring, where the builders are the intermediate query format that we parse our json queries into. Whatever the parser supports need to be supported by the builder as well.

Closes #12870
2015-08-14 11:17:19 +02:00
javanna 2e0b548b06 Java api: remove support for lookup cache in TermsLooukpBuilder
TermsQueryParser doesn't support the cache field anymore, so if it gets set through java api, the subsequent parsing of that query will throw error

Relates to #12870
2015-08-14 11:16:55 +02:00
Ryan Ernst 470f5370b9 Merge pull request #12868 from rjernst/bye_bye_classloaders
Remove ClassLoader from Settings
2015-08-14 02:15:07 -07:00
Alexander Reelsen 0f3ada159e Release: Create pre release script
Refactored a part out of the release script, so the user can
change the version locally as well as move the documentation
and change the Version.java

The background of this change is to have a very simple release
process that puts stuff into a staging environment, so the beta
release can be tested, before it is officially released.

This means the build_release script can be removed soon.
2015-08-14 11:04:32 +02:00
Ryan Ernst 6dcfda99e8 Internal: Remove ClassLoader from Settings
Settings currently has a classloader member, which any user (plugin
or core ES code) can access to load classes/resources. This is extremely
error prone as setting the classloder on the Settings instance is a
public method. Furthermore, it is not really necessary. Classes that
need resources should load resources using normal means
(getClass().getResourceAsStream). Those that need classes
should use Class.forName, which will load the class with the
same classloader as the calling class. This means, in the few
places where classes are loaded by string name, they will use
the appropriate loader: either the default classloader which loads
core ES code, or a child classloader for each plugin.

This change removes the classloader member from Settings, as
well as other classloader related uses (except for a handful
of cases which must use a classloader, at least for now).
2015-08-13 23:55:27 -07:00
Ryan Ernst dcf3f4679f Fourth time, for real, last mock -> test jar 2015-08-13 19:40:31 -07:00
Ryan Ernst c16772b0fc Undo accidental commit of crap
This reverts commit 589eecf55d.
2015-08-13 19:39:55 -07:00
Ryan Ernst 589eecf55d Fourth time's a charm, one more mock class to add to test jar 2015-08-13 19:36:53 -07:00
Ryan Ernst d7da6e673f One more mock rule needed for test jar 2015-08-13 16:28:38 -07:00
Ryan Ernst a89ea15b41 Add more mock classes to test jar 2015-08-13 15:32:03 -07:00
Ryan Ernst c4f8c333ec Fix test jar to contain Mock classes that were moved in 71a3bdb 2015-08-13 15:17:22 -07:00
Ryan Ernst dc1a2d2e5a Merge branch 'master' into fix/12784 2015-08-13 14:35:48 -07:00
Ryan Ernst 9f64c75391 Remove leftover classes 2015-08-13 14:33:11 -07:00
Ryan Ernst ccef0551ef Tests: Refactor classes only plugged in by tests to use package private extension points
We previous used something like Class.forName to load mock classes,
where tests would set a setting that was *supposed* to only be used by
tests. This change make these impls package private so that only tests
can change out these implementations, through test plugins.

closes #12784
2015-08-13 13:52:55 -07:00
Simon Willnauer 1f41a8c682 Don't cache percolator query on loading percolators
No need to load catch this query since it's cheap and not reused.
If we cache it, it can cause assertions to be tripped since this
method is executed during postRecovery phase and might still run while
nodes are shutdown in tests.
2015-08-13 22:27:14 +02:00