2208 Commits

Author SHA1 Message Date
Simon Willnauer
adb5c19849 [CLIENT] Remove unnecessary intermediate interfaces
Client, ClusterAdminClient and IndicesAdminClient had corresponding
intermediate `internal` interfaces that are unnecessary and cause
a lot of casting. This commit removes the intermediate interfaces
and uses the super interfaces directly.

This commit also adds Releasable to `Node` and `Client` so that they can
be used with constructs like try-with-resources.

Closes #4355
Closes #6517
2014-06-17 12:18:37 +02:00
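
A minimal sketch of what the Releasable change in adb5c19849 is meant to enable, assuming the resource type can be closed by a try-with-resources statement. `FakeNode` and `TryWithResourcesSketch` are hypothetical stand-ins built on `java.lang.AutoCloseable`, not the Elasticsearch `Node` or `Client` types.

```java
// Hypothetical stand-in for a node-like resource; not the Elasticsearch Node class.
class FakeNode implements AutoCloseable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // Release resources here; calling close() twice is harmless in this sketch.
        closed = true;
    }
}

class TryWithResourcesSketch {
    // Opens a node, uses it, and returns it after the try block has closed it.
    static FakeNode useAndRelease() {
        FakeNode node = new FakeNode();
        try (FakeNode n = node) {
            // Work with the node while it is still open.
            if (n.isClosed()) {
                throw new IllegalStateException("node closed too early");
            }
        }
        // close() has been called automatically at this point.
        return node;
    }
}
```

The resource is released when the block exits, on both the normal and the exceptional path, without an explicit `finally`.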
Simon Willnauer
e198c58a6b [TEST] Use test.bwc.version if compatibility version is not present 2014-06-17 12:16:10 +02:00
Adrien Grand
a06fd46a72 [Benchmark] Fix TermsAggregationSearchBenchmark: The ordinals execution mode doesn't exist anymore. 2014-06-17 01:46:02 +02:00
Shay Banon
0427e49b5d [TEST] verify all threads created by node and client have the node name
closes #6516
2014-06-16 21:50:12 +02:00
Martijn van Groningen
612f4618e7 [TEST] wait for ongoing recoveries to finish. Flush fails on shards otherwise. 2014-06-16 17:01:38 +02:00
Simon Willnauer
61eac483ed [TEST] Fix test cluster naming
This commit renames `TestCluster` -> `InternalTestCluster` and
`ImmutableTestCluster` to `TestCluster` for consistency. This also
makes `ExternalTestCluster` and `InternalTestCluster` consistent
with respect to their execution environment.

Closes #6510
2014-06-16 15:14:54 +02:00
Lee Hinman
0f180bd5fd [TEST] Add test for accessing _score in scripts 2014-06-16 14:12:21 +02:00
Simon Willnauer
4dfa822e1b [TEST] Add basic Backwards Compatibility Tests
This commit adds a basic infrastructure as well as primitive tests
to ensure version backwards compatibility between the current
development trunk and an arbitrary previous version. The compatibility
tests are simple unit tests derived from a base class that starts
and manages nodes from a provided elasticsearch release package.

The following command line executes all backwards compatibility tests
in isolation:

```
mvn test -Dtests.bwc=true -Dtests.bwc.version=1.2.1 -Dtests.class=org.elasticsearch.bwcompat.*
```

These tests run basic checks like rolling upgrades and
routing/searching/get etc. against the specified version. The version
must be present in the `./backwards` folder as
`./backwards/elasticsearch-x.y.z`
2014-06-16 12:40:43 +02:00
Simon Willnauer
93b56eb004 [TEST] Force flush even if not needed to ensure successful shards is greater than 0 2014-06-16 11:00:34 +02:00
Simon Willnauer
76fab9d42a [TEST] consistently omit norms in test otherwise scoring will be dependent on merges etc. 2014-06-16 10:54:12 +02:00
Simon Willnauer
6d77a248fb [TEST] Stabilize test - wait for yellow to ensure all primaries are allocated 2014-06-14 21:52:15 +02:00
Adrien Grand
7bcabf9481 Fielddata: Don't expose hashes anymore.
Our field data currently exposes hashes of the bytes values. That takes roughly
4 bytes per unique value, which is definitely not negligible on high-cardinality
fields.

These hashes have been used for 3 different purposes:
 - term-based aggregations,
 - parent/child queries,
 - the percolator _id -> Query cache.

Both aggregations and parent/child queries have been moved to ordinals which
provide a greater speedup and lower memory usage. In the case of the percolator
it is used in conjunction with HashedBytesRef to not recompute the hash value
when resolving a query given its ID. However, removing this has no
impact on PercolatorStressBenchmark.

Close #6500
2014-06-13 23:05:02 +02:00
Adrien Grand
232394e3a8 Aggregations: Remove ordinals execution hint.
This was how terms aggregations managed to not be too slow initially by caching
reads into the terms dictionary using ordinals. However, this doesn't behave
nicely on high-cardinality fields since the reads into the terms dict are
random and this execution mode loads all unique terms into memory.

The `global_ordinals` execution mode (default since 1.2) is expected to be
better in all cases.

Close #6499
2014-06-13 23:02:20 +02:00
Adrien Grand
fbd7c9aa5d Aggregations: Fix reducing of range aggregations.
Under some rare circumstances:
 - local transport,
 - the range aggregation has both a parent and a child aggregation,
 - the range aggregation got no documents on one shard or more and several
   documents on one shard or more.
the range aggregation could return incorrect counts and sub aggregations.

The root cause is that since the reduce happens in-place and since the range
aggregation uses the same instance for all sub-aggregations in case of an
empty bucket, sometimes non-empty buckets would have been reduced into this
shared instance.

In order to avoid similar bugs in the future, aggregations have been updated
to return a new instance when reducing instead of doing it in-place.

Close #6435
2014-06-13 23:01:43 +02:00
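
The root cause described in fbd7c9aa5d (an in-place reduce polluting a shared empty instance) can be illustrated with a small sketch. `CountBucket` and `ReduceSketch` are hypothetical names for illustration only; the real aggregation classes are more involved.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical bucket type for illustration; not an Elasticsearch class.
class CountBucket {
    long docCount;

    CountBucket(long docCount) {
        this.docCount = docCount;
    }

    // In-place reduce: mutates the first bucket, which is unsafe if that bucket is shared.
    static CountBucket reduceInPlace(List<CountBucket> shardBuckets) {
        CountBucket first = shardBuckets.get(0);
        for (int i = 1; i < shardBuckets.size(); i++) {
            first.docCount += shardBuckets.get(i).docCount;
        }
        return first;
    }

    // Safe reduce: leaves every input untouched and returns a fresh bucket.
    static CountBucket reduceToNew(List<CountBucket> shardBuckets) {
        long total = 0;
        for (CountBucket b : shardBuckets) {
            total += b.docCount;
        }
        return new CountBucket(total);
    }
}

class ReduceSketch {
    // Doc count left in the shared "empty" instance after an in-place reduce.
    static long sharedCountAfterInPlace() {
        CountBucket sharedEmpty = new CountBucket(0);
        CountBucket.reduceInPlace(Arrays.asList(sharedEmpty, new CountBucket(5)));
        return sharedEmpty.docCount;
    }

    // Doc count left in the shared "empty" instance after a reduce into a new bucket.
    static long sharedCountAfterNew() {
        CountBucket sharedEmpty = new CountBucket(0);
        CountBucket.reduceToNew(Arrays.asList(sharedEmpty, new CountBucket(5)));
        return sharedEmpty.docCount;
    }
}
```

After the in-place variant the shared instance is no longer empty, so any later use of it as an "empty bucket" reports wrong counts; the new-instance variant avoids this.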
Martijn van Groningen
52be3748ff [TEST] Fix assert 2014-06-13 18:03:25 +02:00
javanna
b9ffb2b0a5 Java API: Make sure afterBulk is always called in BulkProcessor after beforeBulk
Moved BulkProcessor tests from BulkTests to newly added BulkProcessorTests class.
Strengthened BulkProcessorTests by adding randomizations to existing tests and new tests for concurrent requests and exceptions.
Also made sure that afterBulk is called only once per request if concurrentRequests==0.

Closes #5038
2014-06-13 17:40:06 +02:00
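
A rough sketch of the contract stated in b9ffb2b0a5: once `beforeBulk` has been called, `afterBulk` follows exactly once, even when the bulk call throws. `SketchListener` and `BulkSketch` are hypothetical and simplified; the real `BulkProcessor.Listener` methods take request and response arguments that are left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical, simplified listener; the real BulkProcessor.Listener has other signatures.
interface SketchListener {
    void beforeBulk(long executionId);

    // failure is null when the bulk call succeeded
    void afterBulk(long executionId, Throwable failure);
}

class BulkSketch {
    // Runs one bulk execution so that afterBulk follows beforeBulk exactly once,
    // whether the bulk call succeeds or throws.
    static void execute(long executionId, Callable<Void> bulkCall, SketchListener listener) {
        listener.beforeBulk(executionId);
        Throwable failure = null;
        try {
            bulkCall.call();
        } catch (Throwable t) {
            failure = t; // remember the failure instead of skipping afterBulk
        }
        listener.afterBulk(executionId, failure);
    }

    // Records the callback order for a bulk call that fails.
    static List<String> eventsForFailingCall() {
        final List<String> events = new ArrayList<>();
        Callable<Void> failing = () -> {
            throw new IllegalStateException("simulated bulk failure");
        };
        execute(1L, failing, new SketchListener() {
            @Override
            public void beforeBulk(long executionId) {
                events.add("before");
            }

            @Override
            public void afterBulk(long executionId, Throwable failure) {
                events.add(failure == null ? "after:ok" : "after:failed");
            }
        });
        return events;
    }
}
```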
Boaz Leskes
44097b358d [Test] set search request size testGeohashCellFilter
The default of 10 is not good enough as previously thought.
2014-06-13 16:11:13 +02:00
Martijn van Groningen
59ff05020f [TEST] Removed incorrect assertion (it is expected that the flush doesn't execute on all shard copies, because we don't wait for green status) 2014-06-13 12:25:43 +02:00
mikemccand
9620aa315e [TEST] Add FailureMarker to test listeners so -Dtests.failfast works 2014-06-13 06:04:33 -04:00
Martijn van Groningen
77e0429089 [TEST] Verify the flush response 2014-06-13 11:40:05 +02:00
Boaz Leskes
7fb16c783d Added caching support to geohash_filter
Caching is turned off by default.

Closes #6478
2014-06-12 22:19:34 +02:00
Alex Ksikes
35cba50fce More Like This Query: creates only one MLT query per field for all queried items.
Previously, one MLT query per field was created for each item. One issue with
this method is that the maximum number of selected terms was equal to the
number of items times 'max_query_terms'. Instead, users should have direct control
over the maximum number of selected terms allowed, regardless of the number of
queried items.

Another issue related to the previous method is that it could lead to the
selection of rather uninteresting terms, just because they were found in a
particular queried item. Instead, this new procedure enforces the selection of
interesting terms across ALL items, not within each item. This could lead to
search results where the best matching items share commonalities amongst the
best characteristics of all the items.

Closes #6404
2014-06-12 14:19:33 +02:00
Simon Willnauer
5575ba1a12 [BUILD] Check for tabs and nocommits in the code on validate
This commit adds checks for nocommit and tabs in the source code.
The task is executed during the validate phase and can be disabled via
`-Dvalidate.skip`
2014-06-12 11:11:23 +02:00
Clinton Gormley
673ef3db3f The StemmerTokenFilter had a number of issues:
* `english` returned the slow snowball English stemmer
* `porter2` returned the snowball Porter stemmer (v1)
* `portuguese` was used twice, preventing the second version from working

Changes:

* `english` now returns the fast PorterStemmer (for indices created from v1.3.0 onwards)
* `porter2` now returns the snowball English stemmer (for indices created from v1.3.0 onwards)
* `light_english` now returns the `kstem` stemmer (`kstem` still works)
* `portuguese_rslp` returns the PortugueseStemmer
* `dutch_kp` is a synonym for `kp`

Tests and docs updated

Fixes #6345
Fixes #6213
Fixes #6330
2014-06-11 12:30:16 +02:00
Clinton Gormley
c25de57d5d Tests: Fixed CompletionSuggester test which relied on a bug 2014-06-10 21:34:03 +02:00
Clinton Gormley
bb15def36e Stats: Bugfixes and enhancements to indices stats API
Bugs:
* "groups" and "types" were being ignored
* "completion_fields" as wildcards were not being resolved to fieldnames

Enhancements:
* Made "groups" and "types" support wildcards
* Added missing tests

Closes #6390
2014-06-10 17:35:49 +02:00
Martijn van Groningen
38be1e0dde Aggregations: if maxOrd is 0 then use noop collector
Before, the OrdinalsCollector was used, which led to an ArrayIndexOutOfBoundsException

Closes #6413
2014-06-10 09:14:06 +02:00
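
The guard in 38be1e0dde can be pictured roughly as follows. This is only a sketch under assumptions: `OrdCollector`, `CountingCollector` and `NO_OP` are hypothetical names, and the commit message does not say how the real collector is structured.

```java
// Hypothetical collector types for illustration; not the Elasticsearch classes.
interface OrdCollector {
    void collect(int ord);

    long total();
}

class CollectorSketch {
    // Counts hits per ordinal; only safe when maxOrd > 0.
    static final class CountingCollector implements OrdCollector {
        private final long[] counts;

        CountingCollector(int maxOrd) {
            counts = new long[maxOrd];
        }

        @Override
        public void collect(int ord) {
            counts[ord]++; // would throw ArrayIndexOutOfBoundsException on an empty array
        }

        @Override
        public long total() {
            long sum = 0;
            for (long c : counts) {
                sum += c;
            }
            return sum;
        }
    }

    // Collector that ignores everything; used when there are no ordinals at all.
    static final OrdCollector NO_OP = new OrdCollector() {
        @Override
        public void collect(int ord) {
            // intentionally empty
        }

        @Override
        public long total() {
            return 0;
        }
    };

    // Picks the no-op collector when maxOrd is 0, the counting collector otherwise.
    static OrdCollector create(int maxOrd) {
        return maxOrd == 0 ? NO_OP : new CountingCollector(maxOrd);
    }
}
```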
Martijn van Groningen
5e408f3d40 Change the top_hits to be a metric aggregation instead of a bucket aggregation (which can't have any sub aggs)
Closes #6395
Closes #6434
2014-06-10 09:09:50 +02:00
javanna
ed5b49a5be [TEST] Added backwards compatibility check to control whether to enable client nodes or not within TestCluster
Our REST backwards compatibility tests need to be able to disable client nodes within the TestCluster when running older tests that assume client nodes are not around.
2014-06-07 15:39:56 +02:00
mikemccand
bb8a666b6d make test less evil 2014-06-07 04:15:52 -04:00
Boaz Leskes
a06b84d392 [Test] Enabled trace logging to testAutoGenerateIdNoDuplicates
also increased iterations some, to increase chance of identifying bad shards
2014-06-07 09:47:15 +02:00
Boaz Leskes
b454f64c57 Bulk requests which try and fail to create multiple indices may never return
This is caused by an NPE in the error handling code. All is well if only 1 index creation fails (or none).

Closes #6436
2014-06-06 23:10:42 +02:00
markharwood
724129e6ce Aggregations optimisation for memory usage. Added changes to core Aggregator class to support a new mode of deferred collection.
A new "breadth_first" results collection mode allows upper branches of aggregation tree to be calculated and then pruned
to a smaller selection before advancing into executing collection on child branches.

Closes #6128
2014-06-06 15:59:51 +01:00
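
The idea behind the "breadth_first" mode in 724129e6ce (compute the upper level, prune it, then descend) can be sketched as a two-pass count. `BreadthFirstSketch` and its document representation are hypothetical and far simpler than the actual Aggregator changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class BreadthFirstSketch {
    // Each doc is {parentKey, childKey}. Returns child counts only for the topN parent keys.
    static Map<String, Map<String, Integer>> aggregate(List<String[]> docs, int topN) {
        // Pass 1: collect only the upper level (parent bucket counts).
        Map<String, Integer> parentCounts = new HashMap<>();
        for (String[] doc : docs) {
            parentCounts.merge(doc[0], 1, Integer::sum);
        }

        // Prune: keep the topN parent buckets by doc count (ties broken by key).
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(parentCounts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Set<String> survivors = new HashSet<>();
        for (int i = 0; i < Math.min(topN, entries.size()); i++) {
            survivors.add(entries.get(i).getKey());
        }

        // Pass 2: replay the docs and collect child buckets only under surviving parents.
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (String[] doc : docs) {
            if (survivors.contains(doc[0])) {
                result.computeIfAbsent(doc[0], k -> new TreeMap<>()).merge(doc[1], 1, Integer::sum);
            }
        }
        return result;
    }
}
```

Child buckets are never allocated for parents that get pruned, which is where the memory saving comes from; the cost is a second pass over the matching documents.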
mikemccand
2a6468efbd make this new test a bit less stressful for nightly; catch FlushNotAllowedEngineException 2014-06-05 13:57:59 -04:00
javanna
21772e0bf9 Scripts: exposed _uid, _id and _type fields as stored fields (_fields notation)
The _uid field wasn't available in a script even though it is always stored. Made it available, along with the _id and _type fields, which are deduced from it.

Closes #6406
2014-06-05 17:16:55 +02:00
mikemccand
2ad8a60532 add versioning test 2014-06-05 09:38:22 -04:00
Simon Willnauer
288eb3d803 [TEST] remove trace logging 2014-06-04 10:10:38 +02:00
Boaz Leskes
ef5d64c73b [Test] Extended IndexActionTests.testAutoGenerateIdNoDuplicates to check both with and without a specific type
The test also captures the first error but continues to run searches in order to gather more information before failing.
2014-06-03 21:55:10 +02:00
Colin Goodheart-Smithe
b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. The GeoBounds Aggregation also uses a wrap_longitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line. This option defaults to true.

This aggregation introduces the idea of MetricsAggregations which do not return double values and cannot be used for sorting. The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation. MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
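
For b9f4d44b14, a minimal sketch of the bounding-box computation itself, assuming points are given as {lat, lon} pairs. `GeoBoundsSketch` is hypothetical, and it deliberately ignores date-line wrapping (the behaviour controlled by the wrap_longitude parameter).

```java
class GeoBoundsSketch {
    // Returns {top, left, bottom, right}, i.e. {maxLat, minLon, minLat, maxLon},
    // for points given as {lat, lon}. No date-line wrapping is attempted here.
    static double[] bounds(double[][] points) {
        double top = Double.NEGATIVE_INFINITY;
        double bottom = Double.POSITIVE_INFINITY;
        double left = Double.POSITIVE_INFINITY;
        double right = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            top = Math.max(top, p[0]);
            bottom = Math.min(bottom, p[0]);
            left = Math.min(left, p[1]);
            right = Math.max(right, p[1]);
        }
        return new double[] {top, left, bottom, right};
    }
}
```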
Adrien Grand
7ab99de483 Routing: Restore shard routing.
Routing has been inadvertently changed in #5562 resulting in documents going to
different shards in 1.2. This is a terrible bug because an indexing request
would not necessarily go to the same shard anymore, potentially leading to
duplicates.

Close #6391
2014-06-03 16:37:54 +02:00
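
Why a routing change like the one fixed in 7ab99de483 leads to duplicates can be seen in a small sketch: if the hash of a routing value changes, the same document id can map to a different shard. `hashA` and `hashB` below are hypothetical hash functions chosen for illustration; they are not the functions Elasticsearch uses.

```java
class RoutingSketch {
    // Maps a hash to a shard number; assumes numShards > 0.
    static int shardFor(int hash, int numShards) {
        return Math.floorMod(hash, numShards);
    }

    // Two hypothetical hash functions standing in for "before" and "after" a routing change.
    static int hashA(String routing) {
        return routing.hashCode();
    }

    static int hashB(String routing) {
        return 31 * routing.hashCode() + 7;
    }

    // True if the same routing value lands on different shards under the two hashes.
    static boolean movesShard(String routing, int numShards) {
        return shardFor(hashA(routing), numShards) != shardFor(hashB(routing), numShards);
    }
}
```

An index request for an existing id that now lands on a different shard does not see the old copy there, so it creates a second document instead of overwriting the first.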
Alex Ksikes
9797e343aa More Like This Query: values of a multi-value field are compared at the same level.
Previously, More Like This would create a new mlt query for each value of a
multi-value field. This could result in all the values of the field being
selected, which defeats the purpose of More Like This. Instead, the correct
behavior is to generate only one mlt query for all the values of the field.
This commit provides the correct behavior for More Like This DSL. The fix for
More Like This API will be coming in another commit.

Closes #6310
2014-06-03 13:43:51 +02:00
javanna
90b1e6a461 [TEST] make sure that the -Dtests.rest.blacklist parameter works on windows too
Some reserved characters need to be replaced in the test section names, which gets parsed as a path although it isn't a filename
2014-06-03 09:23:37 +02:00
Britta Weber
125e0c16cd Object and Type parsing: Fix include_in_all in type
include_in_all can also be set on type level (root object).
This fixes a regression introduced in #6093

closes #6304
2014-06-02 17:48:19 +02:00
Colin Goodheart-Smithe
a23e4aefaa Geo: Issue with polygons near date line
If a polygon is constructed which overlaps the date line but has a hole which lies entirely to one side of the date line, JTS throws an error saying that the hole is not within the bounds of the polygon, because the code which splits the polygon on either side of the date line does not add the hole to the correct component of the final set of polygons. The fix ensures this selection happens correctly.

Closes #6179
2014-06-02 15:03:32 +01:00
Martijn van Groningen
f2641d29ae [TEST] Added sort duel between a single shard index and a multi shard index. 2014-06-02 14:16:55 +02:00
Martijn van Groningen
43b21719f5 [TEST] size should start from 1, top_hits aggregation doesn't support size <= 0 2014-06-02 13:21:13 +02:00
Simon Willnauer
3b31f25624 [TEST] Ensure cluster size reflected in the cluster state
We perform some management operations that require the cluster to be
consistent with respect to the number of nodes in the cluster state
/ visible to the master in order to rely on the ack mechanism. This
only applies to the test infrastructure when nodes are not explicitly
started / stopped as well as while tearing down the cluster and wiping
indices after the tests.
2014-06-02 11:57:32 +02:00
javanna
e8995ecaa7 [TEST] speed up HighlightSearchTests a bit
Randomize rewrite methods instead of trying them all when highlighting multi term queries with postings highlighter
Rely on search type randomization and remove all the explicit setSearchType calls as they are not needed anymore
Remove explicit `.from`, `.size` and `.explain`, not needed and might slow tests down (especially explain)
2014-05-31 16:29:53 +02:00
Clinton Gormley
46a67b638d Parent/Child: Added min_children/max_children to has_child query/filter
Added support for min_children and max_children parameters to
the has_child query and filter. A parent document will only
be considered a match if the number of matching children
falls between the min/max bounds.

Closes #6019
2014-05-30 19:38:39 +02:00
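
The min_children/max_children rule from 46a67b638d can be sketched as a count filter over matching children. `HasChildSketch` is hypothetical, and treating both bounds as inclusive is an assumption of this sketch; the commit message only says the count must fall between the bounds.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class HasChildSketch {
    // matchingChildParents holds the parent id of every child that matched the inner query.
    // Returns the parents whose matching-child count lies in [minChildren, maxChildren]
    // (inclusive bounds are an assumption of this sketch).
    static Set<String> matchingParents(List<String> matchingChildParents,
                                       int minChildren, int maxChildren) {
        Map<String, Integer> counts = new HashMap<>();
        for (String parentId : matchingChildParents) {
            counts.merge(parentId, 1, Integer::sum);
        }
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int count = e.getValue();
            if (count >= minChildren && count <= maxChildren) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```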
Martijn van Groningen
760cee7c24 Aggregations: Take the 'from' into account when getting a fetched hit (InternalSearchHit). Hits before the 'from' are included in each shard result. 2014-05-30 16:23:28 +02:00