Commit Graph

7714 Commits

Author SHA1 Message Date
Simon Willnauer 30ca937dbb [TEST] Stabilize ConcurrentPercolatorTests after # shard randomization 2014-03-13 21:09:13 +01:00
Simon Willnauer 10a1fcb65a [TEST] Add mapping to use an actual stopword analyzer
This test was added when the default analyzer was filtering stopwords. But since
1.0 the default analyzer doesn't filter stopwords.
2014-03-13 20:40:25 +01:00
Bill Hwang 2e56253293 Added static analysis profile to pom.xml
Added PMD and FindBugs as well as site generation logic to top pom.xml file
Created customized pmd ruleset
2014-03-13 12:23:07 -07:00
Adrien Grand 5821fa042c Cardinality aggregation.
This aggregation computes unique term counts using the hyperloglog++ algorithm
which uses linear counting to estimate low cardinalities and hyperloglog on
higher cardinalities.

Since this algorithm works on hashes, it is useful for high-cardinality fields
to store the hash of values directly in the index, which is the purpose of
the new `murmur3` field type. This is less necessary on low-cardinality
string fields because the aggregator is smart enough to only compute the hash
once per unique value per segment thanks to ordinals, or on numeric fields
since hashing them is very fast.

Close #5426
2014-03-13 19:19:56 +01:00
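As a rough illustration of the linear-counting half of the approach described in the commit above, the following sketch estimates a distinct count from hashed values. It is not the Elasticsearch implementation: the function name, the bucket count, and the use of `hashlib.md5` as a stand-in for murmur3 are all assumptions made for the example.

```python
import hashlib
import math


def linear_counting_estimate(values, num_buckets=1024):
    """Estimate the number of distinct values with linear counting.

    Each value is hashed into one of `num_buckets` slots. The estimate is
    m * ln(m / V), where m is the number of slots and V the number of
    slots that are still empty.
    """
    occupied = [False] * num_buckets
    for value in values:
        # Assumption: md5 stands in for murmur3 only because it ships
        # with the Python standard library.
        digest = hashlib.md5(str(value).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % num_buckets
        occupied[bucket] = True
    empty = occupied.count(False)
    if empty == 0:
        # Every slot is taken, so linear counting has no signal left.
        # This is the regime where a HyperLogLog estimate is used instead.
        return float("inf")
    return num_buckets * math.log(num_buckets / empty)
```

Because only the set of occupied slots matters, repeated values do not change the estimate, which is what makes the technique usable for unique term counts.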
Adrien Grand 4e5714b31f Remove AggregationContext.cacheRecycler(). 2014-03-13 19:02:47 +01:00
Adrien Grand 40d67c7e09 Make aggregations CacheRecycler-free.
Aggregations were still using CacheRecycler on the reduce phase. They are now
using page-based recycling for both the aggregation phase and the reduce phase.

Close #4929
2014-03-13 16:15:38 +01:00
Simon Willnauer 8a1e77c50c Allow edit distances > 2 on FuzzyLikeThisQuery
Due to a regression, edit distances > 2 threw exceptions after unifying
the fuzziness factor in Elasticsearch `1.0`. This commit brings back the
expected behavior.

Closes #5292
2014-03-13 14:21:15 +01:00
javanna 20d5481ac6 [TEST] Randomized number of replicas used for indices created during tests
Introduced two levels of randomization for the number of replicas when running tests:

1) through the existing random index template, which now sets a random number of replicas that can either be 0 or 1 that is shared across all the indices created in the same test method unless overwritten

2) through createIndex and prepareCreate methods, between 0 and the number of data nodes available, similar to what happens using the indexSettings method, which changes for every createIndex or prepareCreate unless overwritten (overwrites index template for what concerns the number of replicas)

Added the following facilities to deal with the random number of replicas:

- made it possible to retrieve how many data nodes are available in the `TestCluster`
- added common methods similar to indexSettings, to be used in combination with createIndex and prepareCreate method and explicitly control the second level of randomization: numberOfReplicas, minimumNumberOfReplicas and maximumNumberOfReplicas

Tests that specified the number of replicas have been reviewed:
- removed manual replicas randomization where present, replaced with ordinary one that's now available
- adapted tests that didn't need a specific number of replicas to the new random behaviour
- also done some more cleanup, used common methods like assertAcked, ensureGreen, refresh, flush and refreshAndFlush where possible
2014-03-13 12:52:41 +01:00
Florian Schilling 81e537bd5e ContextSuggester
================

This commit extends the `CompletionSuggester` with context
information. For example, such context information can
be a simple string representing a category, restricting the
suggestions to that category.

Three base implementations of this context information
have been set up in this commit.

- a Category Context
- a Geo Context

All mappings for this context information are
specified within a context field in the completion
field that should use this kind of information.
2014-03-13 11:24:46 +01:00
Martijn van Groningen aecadfcc61 Invoke postCollection on aggregation collectors.
Also clean up how facet and aggs collectors are used inside the QueryCollector

Closes #5387
2014-03-13 17:21:07 +07:00
Martijn van Groningen ca65a2ee9e [TESTS] Added AwaitsFix 2014-03-13 15:35:08 +07:00
Martijn van Groningen 669a7ec498 For unicast zen discovery don't overwrite a ping response for a node if the previous ping response has a set master and the current response hasn't.
For each main ping request we maintain the received ping response per node. Each node-level ping response is mapped into that. If a response has already been set for a node by a previous node-level ping request, it is overwritten, since we give higher value to the latest response. This change makes sure that this doesn't happen if the previous response has a set master and the current response doesn't. Otherwise a node would lose the fact that another node has elected itself as master, and the result would be multiple master nodes in a single cluster.

Closes #5413
2014-03-13 15:25:05 +07:00
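The overwrite rule described in the commit above can be sketched as follows. This is only an illustration of the stated rule, not the actual zen discovery code: `PingResponse`, `record_response`, and the dictionary keyed by node name are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PingResponse:
    node: str               # hypothetical node identifier
    master: Optional[str]   # master reported by that node, None if unset


def record_response(responses: Dict[str, PingResponse], new: PingResponse) -> None:
    """Store the latest response per node, unless that would drop a known master."""
    previous = responses.get(new.node)
    if previous is not None and previous.master is not None and new.master is None:
        # Keep the earlier response: it carries a master and the new one does not.
        return
    # In every other case the latest response wins.
    responses[new.node] = new
```

A later response that does carry a master still replaces the earlier one, so the latest-wins behaviour is preserved everywhere except the one case the commit singles out.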
Kurt Hurtado ca6a2bb790 [DOCS] Various aggregation doc fixes 2014-03-13 09:05:25 +01:00
Mohsin Husen 9fcee312dc [DOCS] Added spring data elasticsearch integration 2014-03-13 08:44:17 +01:00
Martijn van Groningen 77abf027af [TESTS] Fix incorrect discovery options. 2014-03-13 14:11:01 +07:00
Lee Hinman e7ddef9974 compare with .bytes() instead of ByteSizeValue.equals() in breaker service 2014-03-12 20:47:14 -06:00
Costin Leau 76e92ffbea Disable by default plugins isolation 2014-03-13 01:11:06 +02:00
Adrien Grand e3b87926bf [Build] Remove XReferenceManager and XSearcherManager from forbidden-apis' exclude list.
These classes have been removed on the upgrade to Lucene 4.7.
2014-03-12 15:06:39 +01:00
Adrien Grand 39ebf813ee [Test] Fix missing release. 2014-03-12 15:06:39 +01:00
Simon Willnauer aa43c7a69e [TEST] Stabilize SearchStatsTests 2014-03-12 12:58:37 +01:00
Derek Slife 0236a77c0b Corrected issue with throttle type setting not respected upon updates 2014-03-12 10:50:22 +01:00
Shay Banon 965620c3ff Blocking writes on a tribe node creates a "blocks" tribe
fixes #5389
2014-03-12 10:30:18 +01:00
Martijn van Groningen d05b4ef769 Keep track of the exceptions instead of just flagging that an exception has occurred. 2014-03-12 16:27:57 +07:00
Martijn van Groningen c841aa296a Added more logging 2014-03-12 14:51:11 +07:00
Igor Motov 39d2377be6 Use patched version of TermsFilter to prevent using wrong cached results
See LUCENE-5502

Closes #5363
2014-03-11 20:48:22 -04:00
javanna 5378fd7901 [TEST] fixed SimpleQueryTests#testMultiMatchQuery check for shard failures
It can happen that not all shards are ready, thus we won't have a total failure, but we do need to check that we have at least one failure. Also checked the message of the failure.
2014-03-12 00:55:17 +01:00
Igor Motov 7703183cef [TEST] Make sure that a snapshot is completed before trying to modify repository 2014-03-11 17:17:36 -04:00
Simon Willnauer 7e0beead9d [TEST] Beef up SearchStatsTests 2014-03-11 11:36:22 +01:00
Simon Willnauer bb83c823b6 [TEST] Fix SearchStatsTests to have all shards allocated
If randomization brings up a single shard per index in this test,
we might run our searches on only one index, which causes the assertions
to fail afterwards. That's why we need to wait until everything is allocated.
2014-03-11 11:36:22 +01:00
Costin Leau 9624b215fb Add docs for plugin isolation 2014-03-11 12:32:58 +02:00
Costin Leau 5182a3c3fe Add randomized plugin isolation to test infrastructure
fix #5296
2014-03-11 11:45:07 +02:00
Martijn van Groningen a465d97adb Changed debug log to warn for when IW#rollback fails with an exception other than AlreadyClosedException 2014-03-11 13:01:02 +07:00
Igor Motov a0206acbc6 Improve speed of running snapshot cancelation
The delete snapshot operation on a running snapshot should cancel the snapshot execution. However, it interrupts the snapshot only when currently running snapshot files are completely copied, which might take a long time for large files.

Closes #5242
2014-03-10 20:24:04 -04:00
Andrew Selden ba875c3b47 Merge pull request #5360 from aleph-zero/issue-111
REST Testing framework enhancement
2014-03-10 15:24:57 -07:00
Andrew Selden 673c282abd REST Testing framework enhancement
Adding operators 'lte' and 'gte' to our REST test framework. These
operators test for, respectively, less-than-or-equal and
greater-than-or-equal.
2014-03-10 15:08:43 -07:00
Boaz Leskes b7a95d11a7 Introduced VersionType.FORCE & VersionType.EXTERNAL_GTE
Also added "external_gt" as an alias name for VersionType.EXTERNAL, accessible for the rest layer.

Closes #4213, Closes #2946
2014-03-10 21:07:17 +01:00
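The difference between the version types named in the commit above can be sketched as a simple acceptance check. The semantics are inferred from the names (`external_gt` as strictly greater, `external_gte` as greater-or-equal, `force` as always accepted) and should be treated as an assumption; `accepts_write` is a hypothetical helper, not part of the Elasticsearch API.

```python
def accepts_write(version_type: str, current: int, provided: int) -> bool:
    """Return True if a write carrying `provided` may replace a doc at `current`."""
    if version_type in ("external", "external_gt"):
        # Assumed: the provided version must be strictly greater.
        return provided > current
    if version_type == "external_gte":
        # Assumed: an equal version is also accepted.
        return provided >= current
    if version_type == "force":
        # Assumed: the provided version is taken without any check.
        return True
    raise ValueError("unknown version type: %s" % version_type)
```

Under these assumptions, re-sending a document with the same version is rejected by `external` but accepted by `external_gte`.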
uboness bf8d8dc33e Fixed a bug in date_histogram aggregation parsing
- pre_zone_adjust_large_interval was not parsed properly
- added tests for pre_zone and pre_zone_adjust_large_interval
- changed DateHistogram#getBucketByKey(String) to support date formats (next to numeric strings)
- added randomized testing for fetching the bucket by key in date_histogram tests
- added missing "format" support in DateHistogramBuilder

Closes #5375
2014-03-10 19:39:34 +01:00
javanna 48f6df3f8e [TEST] Raise shardSize parameter if number of shards is > 5 2014-03-10 18:35:29 +01:00
javanna 045e43163f [TEST] fixed SimpleQueryTests#testDateRangeInQueryString to specify the mappings upfront and wait for green 2014-03-10 17:50:22 +01:00
Simon Willnauer af4c112907 [TEST] Raise shardSize parameter if number of shards is > 5 2014-03-10 17:39:27 +01:00
Martijn van Groningen e9bc7a8cd1 [TEST] Moving mapping creation to create index call, this will make sure that in the test the mapping is always available on all nodes. 2014-03-10 19:21:54 +07:00
javanna d5aaa90f34 [TEST] Randomized number of shards used for indices created during tests
Introduced two levels of randomization for the number of shards (between 1 and 10) when running tests:

1) through the existing random index template, which now sets a random number of shards that is shared across all the indices created in the same test method unless overwritten

2) through `createIndex` and `prepareCreate` methods, similar to what happens using the `indexSettings` method, which changes for every `createIndex` or `prepareCreate` unless overwritten (overwrites index template for what concerns the number of shards)

Added the following facilities to deal with the random number of shards:
- `getNumShards` to retrieve the number of shards of a given existing index, useful when doing comparisons based on the number of shards and we can avoid specifying a static number. The method returns an object containing the number of primaries, number of replicas and the total number of shards for the existing index

- added `assertFailures` that checks that a shard failure happened during a search request, either partial failure or total (all shards failed). Also checks the error code and the error message related to the failure. This is needed because, without knowing the number of shards upfront, when simulating errors we can run into either partial (search returns partial results and failures) or total failures (search returns an error)

- added common methods similar to `indexSettings`, to be used in combination with `createIndex` and `prepareCreate` method and explicitly control the second level of randomization: `numberOfShards`, `minimumNumberOfShards` and `maximumNumberOfShards`. Also added `numberOfReplicas`, although the number of replicas is not randomized (default not specified but can be overwritten by tests)

Tests that specified the number of shards have been reviewed and the results follow:
- removed number_of_shards in node settings, ignored anyway as it would be overwritten by both mechanisms above
- remove specific number of shards when not needed
- removed manual shards randomization where present, replaced with ordinary one that's now available
- adapted tests that didn't need a specific number of shards to the new random behaviour
- fixed a couple of test bugs (e.g. 3 levels parent child test could only work on a single shard as the routing key used for grand-children wasn't correct)
- also done some cleanup, shared code through shard size facets and aggs tests and used common methods like `assertAcked`, `ensureGreen`, `refresh`, `flush` and `refreshAndFlush` where possible
- made sure that `indexSettings()` is always used as a basis when using `prepareCreate` to inject specific settings
- converted `indexRandom(false, ...)` + refresh to `indexRandom(true, ...)`
2014-03-10 13:01:52 +01:00
Boaz Leskes bb63b3fa61 Improve error detection in geo_filter parsing
Relates to #5370
2014-03-10 12:22:41 +01:00
Simon Willnauer fbb8c0fafa [DOCS] Add `coming` tag to multiple rescores
Closes #5365
2014-03-10 09:27:44 +01:00
Martijn van Groningen 502f24d7e4 Also make use of the thread local memory reuse for a document being percolated with nested objects.
The memory index will only be reused for the root doc, since most of the time that will be the biggest document.
2014-03-10 13:53:02 +07:00
Martijn van Groningen 52d099dfae Don't throw UOE in PercolateContext#from and #size
Create mapping in PercolatorTests#testPercolateSorting_unsupportedField in create index call instead of lazily via index call.
2014-03-10 11:59:24 +07:00
Shay Banon 8cfff9d796 jackson: upgrade to 2.3.2 2014-03-09 23:40:43 +01:00
Shay Banon 3ea746c45e mvel: upgrade to 2.1.9 2014-03-09 22:56:09 +01:00
Clinton Gormley 8383f271d1 [DOCS] Updated the Perl docs 2014-03-09 19:45:16 +01:00
Lee Hinman 51f869cfc2 Increase RamAccountingTermsEnum flush size from 1mb to 5mb
Reduces the number of logs when TRACE logging is turned on for the
circuit breaker
2014-03-07 16:30:43 -07:00