Commit Graph

15294 Commits

Author SHA1 Message Date
David Pilato 975eb60a12 [doc] we don't use `check_lucene` anymore in plugins 2015-08-18 13:02:18 +02:00
Martijn van Groningen e74f559fd4 parent/child: Explicitly disabled the query cache
It was already disabled, but during tests the test framework enabled the query cache by setting the static default query cache. The caching behaviour should be the same in production and in tests.
2015-08-18 12:56:37 +02:00
Adrien Grand a72adbf0b4 Fix mapper-murmur3 compatibility version. 2015-08-18 12:31:36 +02:00
kakakakakku d7bf510fe0 Fixed section name and api name in docs 2015-08-18 19:26:23 +09:00
Clinton Gormley 0b5a027d6a Docs: Fixed bad ID in geo bound box 2015-08-18 12:20:00 +02:00
Clinton Gormley d13078546a Docs: Fixed malforme table in geo-polygon query 2015-08-18 12:16:49 +02:00
Adrien Grand c169386dd4 Merge pull request #12931 from jpountz/fix/murmur3_defaults
Move the `murmur3` field to a plugin and fix defaults.
2015-08-18 12:09:32 +02:00
Ryan Ernst dc1fa6736a Merged AbstractPlugin and Plugin. Also added Settings back to
indexModules and shardModules
2015-08-18 02:46:32 -07:00
Adrien Grand a91b3fcbb9 Move the `murmur3` field to a plugin and fix defaults.
This move the `murmur3` field to the `mapper-murmur3` plugin and fixes its
defaults so that values will not be indexed by default, as the only purpose
of this field is to speed up `cardinality` aggregations on high-cardinality
string fields, which only requires doc values.

I also removed the `rehash` option from the `cardinality` aggregation as it
doesn't bring much value (rehashing is cheap) and allowed to remove the
coupling between the `cardinality` aggregation and the `murmur3` field.

Close #12874
2015-08-18 11:41:52 +02:00
Martijn van Groningen 2b97f5d9eb Pass down the EngineConfig to IndexSearcherWrapper
If a new IndexSearcher gets created the engine config can be used to get the query cache and query cache policy from.
2015-08-18 10:41:24 +02:00
Ryan Ernst 2bf84593e0 Plugins: Simplify Plugin API for constructing modules
The Plugin interface currently contains 6 different methods for
adding modules. Elasticsearch has 3 different levels of injectors,
and for each of those, there are two methods. The first takes no
arguments and returns a collection of class objects to construct. The
second takes a Settings object and returns a collection of module
objects already constructed. The settings argument is unecessary because
the plugin can already get the settings from its constructor. Removing
that, the only difference between the two versions is returning an
already constructed Module, or a module Class, and there is no reason
the plugin can't construct all their modules themselves.

This change reduces the plugin api down to just 3 methods for adding
modules. Each returns a Collection<Module>. It also removes the
processModule method, which was unnecessary since onModule
implementations fullfill the same requirement. And finally, it renames
the modules() method to nodeModules() so it is clear these are created
once for each node.
2015-08-17 20:41:45 -07:00
Robert Muir 34635a4b4f Merge pull request #12951 from rmuir/more_networking_cleanup
Use preferIPv6Addresses for sort order, not preferIPv4Stack
2015-08-17 23:30:35 -04:00
Robert Muir 3ca12889e5 Use preferIPv6Addresses for sort order, not preferIPv4Stack
java.net.preferIPv6Addresses is a better choice. preferIPv4Stack is a nuclear option
and you just won't even bind to any IPv6 addresses. This reduces confusion.
2015-08-17 22:52:22 -04:00
Igor Motov 4114a8359d Improve stability of Snapshot/Restore test
The restore portion of some snapshot/restore test is failing randomly due to #9421. This change suspends rebalance during snapshot/restore operations until #9421 is fixed.

Closes #12855
2015-08-17 21:30:46 -04:00
Ryan Ernst 2e90be77ff Plugins: Ensure additionalSettings() do not conflict
Plugins can preovide additional settings to be added to the settings
provided in elasticsearch.yml. However, if two different plugins supply
the same setting key, the last one to be loaded wins, which is
indeterminate. This change enforces plugins cannot have conflicting
settings, at startup time. As a followup, we should do this when
installing plugins as well, to give earlier errors when two plugins
collide.
2015-08-17 17:01:00 -07:00
Ryan Ernst 6f124e6eec Internal: Simplify custom repository type setup
Custom repository types are registered through the RepositoriesModule.
Later, when a specific repository type is used, the RespositoryModule
is installed, which in turn would spawn the module that was
registered for that repository type. However, a module is not needed
here. Each repository type has two associated classes, a Repository and
an IndexShardRepository.

This change makes the registration method for custom repository
types take both of these classes, instead of a module.

See #12783.
2015-08-17 15:08:08 -07:00
Nicholas Knize ee227efc62 move integration test dependency file gzippedmap.gz from sources to resources 2015-08-17 16:49:25 -05:00
Simon Willnauer 8624022222 Print es.node.mode if integration tests fail 2015-08-17 22:50:54 +02:00
Nicholas Knize b2ba3847f7 Refactor geo_point validate* and normalize* options to ignore_malformed and coerce*
For consistency geo_point mapper's validate and normalize options are converted to ignore_malformed and coerced
2015-08-17 14:46:23 -05:00
Robert Muir 68307aa9f3 Fix network binding for ipv4/ipv6
When elasticsearch is configured by interface (or default: loopback interfaces),
bind to all addresses on the interface rather than an arbitrary one.

If the publish address is not specified, default it from the bound addresses
based on the following sort ordering:

* ipv4/ipv6 (java.net.preferIPv4Stack, defaults to true)
* ordinary addresses
* site-local addresses
* link local addresses
* loopback addresses

One one address is published, and multicast is still always over ipv4: these
need to be future improvements.

Closes #12906
Closes #12915

Squashed commit of the following:

commit 7e60833312f329a5749f9a256b9c1331a956d98f
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 14:45:33 2015 -0400

    fix java 7 compilation oops

commit c7b9f3a42058beb061b05c6dd67fd91477fd258a
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 14:24:16 2015 -0400

    Cleanup/fix logic around custom resolvers

commit bd7065f1936e14a29c9eb8fe4ecab0ce512ac08e
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 13:29:42 2015 -0400

    Add some unit tests for utility methods

commit 0faf71cb0ee9a45462d58af3d1bf214e8a79347c
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 12:11:48 2015 -0400

    localhost all the way down

commit e198bb2bc0d1673288b96e07e6e6ad842179978c
Merge: b55d092 b93a75f
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 12:05:02 2015 -0400

    Merge branch 'master' into network_cleanup

commit b55d092811d7832bae579c5586e171e9cc1ebe9d
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 12:03:03 2015 -0400

    fix docs, fix another bug in multicast (publish host = bad here!)

commit 88c462eb302b30a82585f95413927a5cbb7d54c4
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 11:50:49 2015 -0400

    remove nocommit

commit 89547d7b10d68b23d7f24362e1f4782f5e1ca03c
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 11:49:35 2015 -0400

    fix http too

commit 9b9413aca8a3f6397b5031831f910791b685e5be
Author: Robert Muir <rmuir@apache.org>
Date:   Mon Aug 17 11:06:02 2015 -0400

    Fix transport / interface code

    Next up: multicast and then http
2015-08-17 15:43:07 -04:00
Ryan Ernst 75ced057ed Fix changed class name after bad merge, see #12921 2015-08-17 12:14:01 -07:00
Ryan Ernst 52f3eb5898 Merge pull request #12921 from rjernst/module_culling2
Flatten IndicesModule and add tests
2015-08-17 12:08:17 -07:00
Nik Everett f5735e8571 Merge pull request #12782 from nik9000/prompts_2
Set prompts to be more consistent
2015-08-17 11:33:38 -07:00
Nik Everett ded5d7456a Merge pull request #12895 from nik9000/config_tests
Use jvm-example for testing bin/plugin
2015-08-17 11:13:02 -07:00
Nik Everett 0b650ed203 Add tests for plugins with bin directory
Also removes all mention of shield:
```bash
$ find $BATS -type f -exec grep -Hi shield {} \;
$
```
2015-08-17 10:53:16 -07:00
Nik Everett 391ea379e2 Test: Use jvm-example for testing bin/plugin
Related to #12651
2015-08-17 10:53:16 -07:00
Nik Everett 513ac4471a Tests: Clean up the tar tests
1. Move `clean_before_test` to the first test so its more explicit.
2. Move `skip_not_tar_gz` to setup because it was run first in every test.
3. Remove calls to `run` that only check the status. Its simpler to just
execute the command. Its better because std-out will be captured and replayed
on error.
4. Switch from `su` to `sudo` because `su` was breaking `bats`'s error
reporting.
2015-08-17 10:31:08 -07:00
David Pilato b93a75f309 [doc] fix asciidoc format 2015-08-17 17:45:18 +02:00
Nik Everett 0314ccd2a0 Merge pull request #12940 from nik9000/unflake_test
Make a test less flakey
2015-08-17 08:26:42 -07:00
Nik Everett 4307e165c1 Tests: Make a test less flakey
EsExecutorsTests had a test that was failing spuriously due to threadpools
being threadpools. This weakens the assertions that the test makes to what
should always be true.
2015-08-17 08:20:18 -07:00
Alexander Reelsen f95a509538 Release script: Set versions for non inherited projects
rest-api-spec and dev-tools dont have the elasticsearch-parent
set as a parent and thus need a separate mvn run to change the
plugin version.
2015-08-17 16:40:29 +02:00
Nik Everett c908c582c2 Merge pull request #12909 from nik9000/systemd_start
Fix variable substitution for OS's using systemd
2015-08-17 07:14:20 -07:00
Alexander Reelsen a75adaaaaa Prepare release script: fix python compilation error 2015-08-17 15:52:36 +02:00
Alexander Reelsen d1c93fb573 Release: Remove aws-maven plugin/improve release docs
In order to have consistent deploys across several repositories,
we should deploy to sonatype first, then mirror those contents,
and then upload to s3.

This means, the aws wagon is not needed anymore.
2015-08-17 15:39:22 +02:00
Simon Willnauer ea03e5dd17 Add build short hash to the download manager headers to identify staging builds
It might turn out to be useful to have the actual commit hash of the version we are
looking for if our download manager can just redirect to the right staging repository.
2015-08-17 15:17:45 +02:00
Tanguy Leroux 8e052f0da2 Make platform specific assumptions in OS & Process probes tests 2015-08-17 14:47:23 +02:00
Boaz Leskes e424701819 Merge pull request #12922 from xuzha/xu-network
Refactor, remove _node/network and _node/stats/network. 

Closes #12889 , Closes #12922
2015-08-17 14:46:05 +02:00
Boaz Leskes bb34b2fd85 Elasticsearch bootstrap help shouldn't mention plugins
We have a dedicated entry point for that.

Closes #12933
2015-08-17 14:36:13 +02:00
Boaz Leskes 6c4ef32160 Test: un-mute PluginManagerUnitTests.testSimplifiedNaming 2015-08-17 14:14:50 +02:00
Simon Willnauer 9608fe9dff Fix test - and don't use URL.equals() 2015-08-17 14:14:23 +02:00
Boaz Leskes cd9552eb07 Test: mute PluginManagerUnitTests.testSimplifiedNaming 2015-08-17 14:02:09 +02:00
Alexander Reelsen c3b3b0e6f8 Release: Replace python search/replace with mvn versions:set plugin
mvn has a versions:set plugin, that can be easily invoked and does not
require the python script to parse the files and hope that there are no
other snapshot mentions.
2015-08-17 11:47:24 +02:00
Martijn van Groningen e649d96eb1 Merge pull request #12881 from martijnvg/allow_for_customable_query_cache
Allow a plugin to supply its own query cache implementation
2015-08-17 11:13:45 +02:00
Martijn van Groningen 5123167a99 test: added a unit test for #12261 2015-08-17 11:10:05 +02:00
Simon Willnauer e7d075f6ae Add elasticsearch version as a prefix for the staging URL
This is purely for maintainance reasons since it easier to see if we can drop
certain stageing urls if we have the version next to the hash.
I also removed the gpg passphrase from the example URL since it's better to get prompted?
2015-08-17 11:05:49 +02:00
Simon Willnauer 59f390f5d0 Endless recovery loop with `indices.recovery.file_chunk_size=0Bytes`
This is caused by sending the same file to the chunk handler with offset
`0` which in-turn opens a new outputstream and waits for bytes. But the next round
will send 0 bytes again with offset 0. This commit adds some checks / validators that those
settings are positive byte values and fixes the RecoveryStatus to throw an IAE if the same file
is opened twice.
2015-08-17 09:58:17 +02:00
Martijn van Groningen 12c40fa58a Allow plugins to register custom `QueryCache` implementations. 2015-08-17 09:55:32 +02:00
Adrien Grand 7765b0497d Merge pull request #12497 from oyiadom/master
Update BulkProcessor.java
2015-08-17 09:43:41 +02:00
Adrien Grand 1f2345db34 Merge pull request #12913 from xuzha/xu-exception
Validate class before cast.
2015-08-17 09:38:59 +02:00
Harish Kayarohanam 3976854ada Improve error handling of ClassCastException in terms aggregations.
What is the problem we are trying to solve ?
===========================================

When we are doing aggregations against a field name as shown in
https://github.com/HarishAtGitHub/elasticsearch-tester/blob/master/12135.py#L37-L46

search = {
           "aggs": {
             "NAME": {
               "terms": {
                 "field": "ip_str",
                 "size": 10
               }
             }
           }
         }
and when the field "ip_str" has values of different types in different indices
. say one is of type StringTerms type and other is of IP(LongTerms type) then
the aggregation fails as the types do not match(incompatible).
The failure throws a class cast exception as follows:
{
   "error": {
      "root_cause": [],
      "type": "reduce_search_phase_exception",
      "reason": "[reduce] ",
      "phase": "query",
      "grouped": true,
      "failed_shards": [],
      "caused_by": {
         "type": "class_cast_exception",
         "reason": "org.elasticsearch.search.aggregations.bucket.terms.LongTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket"
      }
   },
   "status": 503
}

which is hard to understand . User cannot infer anything about the cause of the problem and what he should do from seeing the
class cast exception.

What can be the possible solution ?
===================================

Make the exception more readable by showing him the root cause of the problem so that he can
understand which area actually caused the problem, so that he can take necessary steps further.

Code Analysis
=============

Debugging code shows that:
the query /{indices}/_search?search_type=count involves two phases
1) search phase
***************
     searchService.sendExecuteQuery(...) [Ref: TransportSearchCountAction]

     what happens here ?
        the phase 1, which is the search phase goes without error.
        In this phase the shards for the given indexes are collected and the search is done on all asynchronously
        and finally collected in the variable "firstResults" and given to meger phase.

        [Flow: .... -> TransportSearchTypeAction -> method performFirstPhase]

2) merge phase
**************
     searchPhaseController.merge(...firstResults...) [Ref: TransportSearchCountAction]

     what happens here ?
        the "firstresults" QuerySearchResults are now to be aggregated and combined.

        [Flow: SearchPhaseController.merge(...) -> ..... -> InternalTerms.doReduce(...)]

the phase 1, which is the search phase goes without error.
The problem comes in phase 2, which is merge phase.
Now the individual term buckets are available.
As per the test case , there are two indices cast and cast2, so by default 10 shards.
cast has ip_str of type StringTerms
cast2 has ip_str of type ip which is actually LongTerms

so here two types of Buckets exist. StringTerms_Bucket and LongTerms_Bucket.
Now the aggregation is to be put inside the BucketPriorityQueue(size 2: as out of 10, 2 has hits) finally.
(docs of PriorityQueue: https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/util/PriorityQueue.html#insertWithOverflow(T))

Now first the LongTerms$Bucket is put inside.
then the StringTerms$Bucket is to be put in.
This is the area where exception is thrown. What happens is when adding the StringTerms$Bucket now it has to
goes through the code "lessThan(element, heap[1])"
which finally calls

---------------------------------------------------------------------------------------------
|      StringTerms$Bucket.compareTerms(other)  <---------------- Area of exception          |
|                                                                                           |
--------------------------------------------------------------------------------------------

where when comparing one to other a type cast is done and it fails as StringTerms$Bucket and LongTerms$Bucket are
incompatible.

Approach to solve:
==================

The best way is to make user understand that the problem is when reducing/merging/aggregating the buckets which came as a result of
querying different shards, so that this will make them infer that the problem is because the values of the fields are of different types.
The message is also user friendly and much better than the indecipherable classcastexception.

The only place to infer correctly that the aggregation has failed is in the place where aggregations take place.
so

at InternalTerms.java -> (BucketPriorityQueue)ordered.insertWithOverflow(b);

so here I can throw AggregationExecutionException saying it is because the buckets are of different
types.

But when can I infer at this point that the failure is due to mismatch of types of buckets ???
it can be possible only if at this point it is informed that the problem which occurred deep inside
is due to buckets that were incomparable.
so from just a classCastException we cannot make such a pointed exact inference, because
as class cast exception can be due to a number of scenarios and at a number of places.

so unless we inform the exact problem to InternalTerms it will not be able to infer properly.
so infer the classCastException at the compareTerms function itself that it is a IncomparableTermBucktesTypeException.
This is the best place to infer classCastException as this the place which generated the exception.
Best inference of exceptions can be done only at the source/origin of the exception.

so IncomparableTermBucktesTypeException to InternalTerms-> will make it infer and conclude on why
aggregation failed and give best information to user.

Close #12821
2015-08-17 09:35:32 +02:00