2208 Commits

Author SHA1 Message Date
Matt Weber
4df4506875 Use URI vs URL accessing File from classpath.
URL escapes special characters such as spaces which
causes the resource to not be found when used to create
a File object.  Use URI.

Closes 
2014-04-28 18:49:55 +02:00
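The difference described in the commit above can be illustrated with a minimal sketch. The class and method names here are hypothetical and not taken from the commit; only `URL#getPath`, `URL#toURI` and the `File` constructors are standard JDK calls.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper, for illustration only.
final class ClasspathFileExample {
    // Uses the raw URL path: a space stays encoded as "%20",
    // so the resulting File points at a path that does not exist.
    static File viaUrlPath(URL resource) {
        return new File(resource.getPath());
    }

    // Converts to a URI first: the File(URI) constructor decodes
    // percent-escapes, so the File points at the real location.
    static File viaUri(URL resource) throws URISyntaxException {
        return new File(resource.toURI());
    }
}
```

With a resource stored under a directory whose name contains a space, `viaUri` should resolve to the existing file while `viaUrlPath` should not.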
javanna
51ba3ca220 [TEST] made sure nodeSettings method gets called for every node type, not only data nodes in case numDataNodes is specified.
This fixes a test ZenUnicastDiscoveryTests when running in network mode
2014-04-28 18:31:47 +02:00
javanna
a414e4f2f3 [TEST] randomly introduced a client node within test cluster
The default number of client nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`.

In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`:

- ensureAtLeastNumNodes to ensureAtLeastNumDataNodes
- ensureAtMostNumNodes to ensureAtMostNumDataNodes
- stopRandomNode to stopRandomDataNode

and the following ones in `ElasticsearchIntegrationTest`:

- allowNodes to allowDataNodes
- dataNodes to numDataNodes.
- @ClusterScope#numNodes to numDataNodes
- @ClusterScope#minNumNodes to minNumDataNodes
- @ClusterScope#maxNumNodes to maxNumDataNodes

Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes.

Adapted existing tests to successfully run although there's a node client around.

Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info.

Closes 
2014-04-28 16:31:36 +02:00
Martijn van Groningen
17a5575757 Disabled parent/child queries in the delete by query api.
It wasn't properly implemented and could lead to a shard being failed and not able to recover.

Closes  
2014-04-28 20:12:54 +07:00
Adrien Grand
22cbdd930c [TEST] Fix test bug in MultiOrdinalsTests. 2014-04-28 13:56:01 +02:00
Robert Muir
8e0a479316 Upgrade to Lucene 4.8
Closes 
2014-04-28 06:45:50 -04:00
Simon Willnauer
f285ffc610 Multi value handling in decay functions
Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.

Relates to 
closes 
2014-04-28 11:55:32 +02:00
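A rough sketch of the multi-value handling described above: compute the distance of every value to the origin, then reduce with a user-chosen mode. The class, enum and method names are hypothetical, and the sketch assumes a one-dimensional numeric field; it is not the code from the commit.

```java
// Hypothetical sketch: reduce the distances of all field values to one number.
final class MultiValueDistance {
    enum Mode { MIN, MAX, SUM, AVG }

    static double distance(double[] values, double origin, Mode mode) {
        if (values.length == 0) {
            throw new IllegalArgumentException("field has no values");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = 0.0;
        double sum = 0.0;
        for (double value : values) {
            double d = Math.abs(value - origin); // distance of this value to the origin
            min = Math.min(min, d);
            max = Math.max(max, d);
            sum += d;
        }
        switch (mode) {
            case MIN: return min;
            case MAX: return max;
            case SUM: return sum;
            default:  return sum / values.length; // AVG
        }
    }
}
```

The reduced distance would then be fed into the decay function in place of the distance of the first value only.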
Britta Weber
f993945e5c Move SortMode to org.elasticsearch.search and rename to MultiValueMode 2014-04-28 11:55:32 +02:00
Shay Banon
6b2c1d0f62 spelling 2014-04-28 11:07:54 +02:00
Shay Banon
dedddf3908 Raise node disconnected even if the transport is stopped
During the stop process we raise network disconnects, so it is valid to raise them while we are in stop mode; in fact, we should not miss any events in such a case.
Typically this is not a problem, since it happens during the normal shutdown process of the JVM, but when running a reused cluster within the JVM (like in our test infra with the shared cluster), we should properly raise those node disconnects.
closes 
2014-04-28 10:56:43 +02:00
Adrien Grand
fc32875ae9 Make ordinals start at 0.
Our ordinals currently start at 1, like FieldCache did in older Lucene versions.
However, Lucene 4.2 changed it in order to make ordinals start at 0, using -1
as the ordinal for the missing value. We should switch to the same numbering as
Lucene for consistency. This also allows us to remove some abstraction on top of
Lucene doc values.

Close 
2014-04-28 10:21:50 +02:00
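The numbering convention described above (ordinals start at 0, with -1 for the missing value) can be sketched as follows. The class, constant and method names are hypothetical and only illustrate the convention.

```java
// Hypothetical sketch of 0-based ordinals with -1 as the missing-value ordinal.
final class OrdinalLookup {
    static final long MISSING_ORDINAL = -1L;

    // Returns the term for an ordinal, or null if the document has no value.
    static String termFor(String[] sortedTerms, long ordinal) {
        if (ordinal == MISSING_ORDINAL) {
            return null;
        }
        return sortedTerms[(int) ordinal]; // ordinal 0 is the first term
    }
}
```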
Lee Hinman
81e83cca74 Disable dynamic scripting by default
Closes 
2014-04-25 15:08:26 -06:00
Boaz Leskes
051beb51a3 Version types EXTERNAL & EXTERNAL_GTE test for version equality in read operation & disallow them in the Update API
Separate version check logic for reads and writes for all version types, which allows different behavior in these cases.
Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations.
The previous behavior was fit for writes but is useless in reads.

This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update api raise a validation error as it may cause data to
be lost.

Closes  , Closes , Closes 
2014-04-25 23:06:12 +02:00
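One way to picture the split between read and write checks is the sketch below. All names are hypothetical. The read rule (plain equality for every version type) follows the commit message; the write rules for the external types are an assumption based on their usual meaning (strictly greater for `EXTERNAL`, greater or equal for `EXTERNAL_GTE`), and special cases such as "match any version" or missing documents are left out.

```java
// Hypothetical sketch: separate conflict checks for writes and reads.
final class VersionCheckSketch {
    enum Type { INTERNAL, EXTERNAL, EXTERNAL_GTE }

    // Write path: each version type has its own rule (assumed semantics).
    static boolean conflictOnWrite(Type type, long current, long requested) {
        switch (type) {
            case EXTERNAL:     return requested <= current; // must be strictly newer
            case EXTERNAL_GTE: return requested < current;  // must be newer or equal
            default:           return requested != current; // INTERNAL: must match
        }
    }

    // Read path: every type tests for equality, like INTERNAL.
    static boolean conflictOnRead(Type type, long current, long requested) {
        return requested != current;
    }
}
```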
Martijn van Groningen
eb9805389a Use segment ordinals as global ordinals if a segment contains all values for a field on a shard level.
Relates to 
Closes 
2014-04-25 23:05:07 +07:00
Shay Banon
65bc017271 Don't lookup version for auto generated id and create
When a create document is executed, and it's an auto generated id (based on UUID), we know that the document will not exist in the index, so there is no need to try and lookup the version from the index.
For many cases, like logging, where ids are auto generated, this can improve the indexing performance, specifically for lightweight documents where analysis is not a big part of the execution.
closes 
2014-04-25 14:31:20 +02:00
Simon Willnauer
0b3605f4f2 [TEST] Logger names differ based on the classpath, inside the IDE the package name is used as a prefix 2014-04-25 12:36:32 +02:00
Simon Willnauer
b7325d005b Make Create/Update/Delete classes less mutable
Today we use a builder pattern / setters to set relevant information
to Engine#Delete|Create|Index. Yet almost all the values are required
but they are not passed via ctor arguments but via an error prone builder
pattern. If we add a required argument we should see compile errors on that
level to make sure we don't miss any place to set them.

Prerequisite for 
2014-04-25 11:42:05 +02:00
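The idea behind the commit above (required values as constructor arguments instead of setters) can be sketched like this. The class and its fields are hypothetical and much smaller than the real engine operations.

```java
import java.util.Objects;

// Hypothetical operation object: all required values are constructor
// arguments, so adding a new required value breaks every call site at
// compile time instead of being silently left unset.
final class ImmutableDeleteSketch {
    private final String type;
    private final String id;
    private final long version;

    ImmutableDeleteSketch(String type, String id, long version) {
        this.type = Objects.requireNonNull(type, "type");
        this.id = Objects.requireNonNull(id, "id");
        this.version = version;
    }

    String type() { return type; }
    String id() { return id; }
    long version() { return version; }
}
```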
mikemccand
908c0d4165 temporarily mute this test on Java 8 until we fix getFiniteStrings 2014-04-25 05:41:18 -04:00
Simon Willnauer
d0f8742f8d [TEST] Prevent deletion of the second document by using different ids 2014-04-25 11:07:31 +02:00
Simon Willnauer
ec5dbbaf51 [TEST] Expect all shards failed in SearchWithRandomExceptionsTests 2014-04-25 11:03:22 +02:00
Britta Weber
c7bb784b08 Fix TemplateQueryParser swallows additional parameters
Request parameters such as "size" and "fields" were ignored when
placed after the template query in the request.

closes 
2014-04-25 08:51:08 +02:00
Adrien Grand
d8f0f7077f [TEST] Use assertAllSuccessful instead of assertNoFailures in CompletionSuggestSearchTests. 2014-04-25 00:07:10 +02:00
Adrien Grand
f1916d16dc [TEST] Fix typo in DateHistogramTests that fails the test since it expects dates to be rounded by day. 2014-04-24 23:25:23 +02:00
mikemccand
ba73877580 Send Lucene's IndexWriter infoStream messages to Logger lucene.iw, level=TRACE
Lucene's IndexWriter logs many low-level details to its infoStream
which can be helpful for diagnosing; with this change, if you enable
TRACE logging for the "lucene.iw" logger name then IndexWriter's
infoStream output will be captured and can later be scrutinized
e.g. using https://code.google.com/a/apache-extras.org/p/luceneutil/source/browse/src/python/iwLogToGraphs.py
to generate graphs like http://people.apache.org/~mikemccand/lucenebench/iw.html

Closes 
2014-04-24 16:31:23 -04:00
javanna
9a68e60142 [TEST] Allow to disable randomization of shards and replicas via system property
Needed for REST backwards compatibility tests, since we need to run older tests with the latest runner, which randomizes shards and replicas, but the tests rely on defaults (5,1).

Done in a generic way based on compatibility versions, e.g. `-Dtests.compatibility=1.0.0` allows running tests in a manner that is compatible with version 1.0.0.

Also moved back randomIndexTemplate to ElasticsearchIntegrationTest (from ImmutableCluster) where all the randomized aspects should be.

Closes 
2014-04-24 22:18:31 +02:00
mikemccand
84af7d9f9a test was missing tie-break for the two suggestions 2014-04-24 15:42:08 -04:00
Britta Weber
e84d3111a3 Revert "Throw exception if decay is requested for a field with multiple values"
This reverts commit 95d781510f9ec66b6615df4e26ccc18da9c2b155.

see https://github.com/elasticsearch/elasticsearch/issues/3960#issuecomment-41279373
2014-04-24 15:46:48 +02:00
Britta Weber
95d781510f Throw exception if decay is requested for a field with multiple values
closes 
2014-04-24 15:18:39 +02:00
Adrien Grand
0631b6a042 [TEST] DisabledFieldDataFormatTests assumes a single replica. 2014-04-24 14:16:07 +02:00
Adrien Grand
cb8139a583 Remove abstraction in the percentiles aggregation.
We initially added abstraction in the percentiles aggregation in order to be
able to plug in different percentiles estimators. However, only one of the 3
options that we looked into proved useful and I don't see us adding new
estimators in the future.

Moreover, because of this, we let the parser put unknown parameters into a hash
table in case these parameters would have meaning for a specific percentiles
estimator impl. But this makes parsing error-prone: for example a user reported
that his percentiles aggregation reported extremely high values (on the order of
several millions while the maximum field value was `5`), and the reason was that
he had a typo and had written `fields` instead of `field`. As a consequence,
the percentiles aggregation used the parent value source which was a timestamp,
hence the large values. Parsing would now barf in case of an unknown parameter.

Close 
2014-04-24 09:44:36 +02:00
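The stricter parsing described above can be sketched as a parser that rejects any parameter it does not know. The class name, the parameter map and the accepted keys are assumptions made for illustration; the real parser works on a streaming request body.

```java
import java.util.Map;

// Hypothetical strict parser: unknown parameters fail instead of being
// stashed away for a pluggable estimator.
final class StrictPercentilesParser {
    static String parseField(Map<String, Object> params) {
        String field = null;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            switch (entry.getKey()) {
                case "field":
                    field = String.valueOf(entry.getValue());
                    break;
                case "percents":
                    break; // accepted, but ignored in this sketch
                default:
                    throw new IllegalArgumentException(
                        "Unknown parameter [" + entry.getKey() + "]");
            }
        }
        return field;
    }
}
```

With this behavior, the `fields` typo from the report would fail at parse time rather than silently falling back to the parent value source.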
Adrien Grand
b3e0e58094 Field data diet.
We have lots of unused, or almost unused methods in our field data impls,
especially when dealing with ordinals. Let's nuke them.

Close 
2014-04-24 09:14:09 +02:00
Shay Banon
0a84253045 [TEST] add a test that explicitly verifies no duplicates are created
We do this check in other places in ES, but there is no dedicated test for it. This test was born out of the auto generated id work, but we should have it regardless of whether that work gets in or not.
2014-04-23 21:13:12 +02:00
Lee Hinman
b5adc877ca Include name of the field that caused a circuit break in the log and exception message
Fixes 
Closes 
2014-04-23 09:54:00 -06:00
javanna
6eb655380c [TEST] Randomized number of replicas between 0 and the number of data nodes - 1 (rather than just between 0 and 1)
Closes 
2014-04-23 17:46:35 +02:00
Simon Willnauer
b36ef995bb Change default recovery throttling to 50MB / sec
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware / network throughput.
A 50MB default should provide better out of the box performance.
2014-04-23 15:40:21 +02:00
Robert Muir
8568c18e6f Change default numeric precision_step
Change the default numeric precision_step to 16 for 64-bit types,
8 for 32-bit and 16-bit types. Disable precision_step for the 8-bit
byte type.

Closes 
2014-04-23 09:01:25 -04:00
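The new defaults listed above can be written as a small lookup. The class and method names are hypothetical, and representing "disabled" as `Integer.MAX_VALUE` is an assumption of this sketch, not something stated in the commit message.

```java
// Hypothetical lookup of the default precision_step by numeric width.
final class PrecisionStepDefaults {
    static final int DISABLED = Integer.MAX_VALUE; // assumed "disabled" marker

    static int defaultPrecisionStep(int bits) {
        switch (bits) {
            case 64: return 16;       // 64-bit types
            case 32:
            case 16: return 8;        // 32-bit and 16-bit types
            case 8:  return DISABLED; // byte type
            default:
                throw new IllegalArgumentException("unsupported width: " + bits);
        }
    }
}
```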
Martijn van Groningen
f8d35d81d8 Re-order log statements to be correct for segment and top level warming. 2014-04-23 17:13:44 +07:00
Simon Willnauer
b4f0603169 Change default merge throttling to 50MB / sec
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware. Even on cloud infrastructure this
seems to be too low. A 50MB default should provide better out of the box
performance
2014-04-22 21:08:40 +02:00
Lee Hinman
029b13cf68 Parse has_child query/filter after child type has been parsed
Fixes 
Fixes 
2014-04-22 09:29:48 -06:00
Simon Willnauer
cb9f7c1da5 [TEST] Randomize translog setting per index 2014-04-22 16:41:00 +02:00
Boaz Leskes
1434f6bcbb A new ClusterStateStatus to indicate cluster state life cycles
When the ClusterService applies a new cluster state, it is first assigned as the new active one and then all listeners are called. Some of ES's features sample the current state and try to take action on it (for example index a document). If that fails, they will wait for change in the cluster state and try again (for example, wait for a shard to start and try indexing again).

If you're unlucky you sample the state after it has been assigned as the "active" state but before all listeners have done their work. In this case the action taken (i.e., indexing a doc) will still fail (as the shard is not yet started) but waiting for a new state may take a long time or fail.

This commit adds a new ClusterStateStatus that allows better tracking of the stages a cluster state goes through (currently `RECEIVED`, `BEING_APPLIED` & `APPLIED`). This allows detecting that a cluster state is not yet fully applied and retrying without waiting for a new state to arrive.

This commit also adds a utility class, ClusterStateObserver, to make this pattern slightly simpler and avoid common pitfalls.

Closes 
2014-04-22 10:14:41 +02:00
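The retry decision described above can be sketched as follows. The three status names come from the commit message; the surrounding class, the `Decision` enum and the method are hypothetical and do not reflect the actual ClusterStateObserver API.

```java
// Hypothetical sketch: decide what to do after an action failed on a
// sampled cluster state.
final class ClusterStateStatusSketch {
    enum Status { RECEIVED, BEING_APPLIED, APPLIED }

    enum Decision { RETRY_ON_SAME_STATE, WAIT_FOR_NEW_STATE }

    static Decision afterFailure(Status sampledStatus) {
        // A state that is not fully applied may still become usable once
        // its listeners finish, so there is no need to wait for a newer one.
        return sampledStatus == Status.APPLIED
            ? Decision.WAIT_FOR_NEW_STATE
            : Decision.RETRY_ON_SAME_STATE;
    }
}
```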
Simon Willnauer
41cc1f5bcb [TEST] Ensure that iteration order of TestSection is consistent 2014-04-22 10:06:58 +02:00
Simon Willnauer
ae911f6e75 [TEST] Remove ambiguous 4th suggestion - order differs slightly on Java 8 2014-04-22 10:00:02 +02:00
javanna
918da65d35 [TEST] Added blacklist to be able to skip specific REST tests
The blacklist can be provided through -Dtests.rest.blacklist and supports a comma separated list of globs
e.g. -Dtests.rest.blacklist=get/10_basic/*,index/*/*

Also added some missing docs and made it clearer that the suite/test descriptions effectively contains their (relative) path (api/yaml_file/test section)

Closes 
2014-04-22 09:52:48 +02:00
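A minimal sketch of how a comma separated list of globs could be matched against a test path of the form api/yaml_file/test section. The class and method names are hypothetical, and the rule that `*` does not match across `/` is an assumption of this sketch rather than documented behavior of the test runner.

```java
import java.util.regex.Pattern;

// Hypothetical matcher for a blacklist such as "get/10_basic/*,index/*/*".
final class RestBlacklistSketch {
    // Translate a glob into a regex; '*' matches any run of characters
    // except '/' (assumed), everything else is matched literally.
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append("[^/]*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True if any glob in the comma separated blacklist matches the path.
    static boolean isBlacklisted(String blacklist, String testPath) {
        for (String glob : blacklist.split(",")) {
            if (globToRegex(glob.trim()).matcher(testPath).matches()) {
                return true;
            }
        }
        return false;
    }
}
```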
Shay Banon
2f8fc98012 [TEST] make fetch time in millis test more resilient
Beef up the fetch work, and increase the number of iterations (since we count in nanos, but report in rounded millis).
2014-04-22 00:00:08 +02:00
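The underlying effect is easy to reproduce with the JDK alone: converting a sub-millisecond duration from nanoseconds to milliseconds yields zero. The class and method names below are hypothetical; `TimeUnit#toMillis` is the standard JDK conversion, which truncates.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing why very fast work can report 0 ms.
final class FetchTimeRounding {
    static long reportedMillis(long elapsedNanos) {
        // The conversion truncates: anything below 1,000,000 ns becomes 0 ms.
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }
}
```

A test asserting a time greater than zero therefore needs enough accumulated work to cross the one millisecond boundary.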
Boaz Leskes
baea1827d1 [Tests] SimpleRecoveryLocalGatewayTests.testSingleNodeNoFlush could fail if shards were not started
The test starts a single node, indexes into it, restarts the node and checks that no data was lost. It only indexed into 2 shards and didn't wait for green, meaning that the node could be restarted with a non-started primary. In that case the node will not re-assign the primary as it was not started. This commit makes sure that we either wait for primaries to start or index into all shards, which has the same net effect.

Also extending some logging in InternalIndexShard.
2014-04-21 11:44:16 +02:00
Boaz Leskes
2580099cf2 [Test] Let SuggestStatsTests.testSimpleStats do more work
The test verifies that stats are measured by checking timeInMillis>0. On fast machines the suggestions are done in < 1 millis. The test now indexes documents (to power suggestions) and does multiple suggestions per iteration to slow things down.
2014-04-19 17:46:52 +02:00
Simon Willnauer
b6515e2979 [TEST] Make InternalEngineMergeTests more stable 2014-04-18 18:20:44 +02:00
javanna
442dda2ac8 [TEST] _id is not indexed by default, sort on score,_uid in MultiMatchQueryTests 2014-04-18 15:09:00 +02:00
javanna
d6a676724a [TEST] added sort by "_id" when score is the same to MultiMatchQueryTests#testEquivalence
A merge (and refresh) might rarely happen in the background between the two queries whose output is compared. It might then happen that two docs with same scores get returned by the two queries in a different order due to different lucene document id (which has changed in the meantime). To fix this we need to order by id when the score is the same, so that we can safely compare the output of the two queries (multimatch and dismax).
2014-04-18 12:15:44 +02:00
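The tie-break described above amounts to a two-level ordering: score descending, then id ascending. The sketch below shows that ordering on plain Java objects; the `Hit` class and the comparator name are hypothetical and unrelated to the search API used in the test.

```java
import java.util.Comparator;

// Hypothetical sketch of a deterministic ordering for equal scores.
final class TieBreakSort {
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Highest score first; equal scores are ordered by id so that two
    // result lists can be compared position by position.
    static final Comparator<Hit> BY_SCORE_THEN_ID =
        Comparator.comparingDouble((Hit h) -> h.score)
            .reversed()
            .thenComparing((Hit h) -> h.id);
}
```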