Commit Graph

8060 Commits

Author SHA1 Message Date
Martijn van Groningen fc3efda6af Cut other aggregations over to use collectExistingBucket() when a hit lands on a bucket ord that already exists.
Closes #5955
2014-04-29 11:07:12 +07:00
Martijn van Groningen f3219f7098 Added global ordinals terms aggregator impl that is optimized for low cardinality fields.
Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during post collect.
On fields with not so many unique values, that can reduce the number of global ordinals significantly.

Closes #5895
Closes #5854
2014-04-29 11:04:03 +07:00
Matt Weber 4df4506875 Use URI vs URL accessing File from classpath.
URL escapes special characters such as spaces which
causes the resource to not be found when used to create
a File object.  Use URI.

Closes #5915
2014-04-28 18:49:55 +02:00
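A minimal sketch of the difference this commit describes, assuming a resource URL whose path contains a space. The class name, method names and the path are hypothetical and are not taken from the Elasticsearch code base; only standard `java.io` and `java.net` calls are used.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;

class ClasspathFileSketch {
    // URL#getPath() keeps percent-escapes such as %20, so the resulting
    // File points at a name that does not exist on disk.
    static File viaUrlPath(URL resource) {
        return new File(resource.getPath());
    }

    // Converting to a URI first lets the File(URI) constructor decode the escapes.
    static File viaUri(URL resource) throws URISyntaxException {
        return new File(resource.toURI());
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a URL returned by ClassLoader#getResource (hypothetical path).
        URL resource = new File("/tmp/my dir/config.yml").toURI().toURL();
        System.out.println(viaUrlPath(resource).getPath()); // still contains %20
        System.out.println(viaUri(resource).getPath());     // contains a real space
    }
}
```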
javanna 51ba3ca220 [TEST] made sure nodeSettings method gets called for every node type, not only data nodes in case numDataNodes is specified.
This fixes the ZenUnicastDiscoveryTests test when running in network mode
2014-04-28 18:31:47 +02:00
javanna a414e4f2f3 [TEST] randomly introduced a client node within test cluster
The default number of client nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`.

In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`:

- ensureAtLeastNumNodes to ensureAtLeastNumDataNodes
- ensureAtMostNumNodes to ensureAtMostNumDataNodes
- stopRandomNode to stopRandomDataNode

and the following ones in `ElasticsearchIntegrationTest`:

- allowNodes to allowDataNodes
- dataNodes to numDataNodes
- @ClusterScope#numNodes to numDataNodes
- @ClusterScope#minNumNodes to minNumDataNodes
- @ClusterScope#maxNumNodes to maxNumDataNodes

Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes.

Adapted existing tests to successfully run although there's a node client around.

Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info.

Closes #5949
2014-04-28 16:31:36 +02:00
Martijn van Groningen 17a5575757 Disabled parent/child queries in the delete by query api.
It wasn't properly implemented and could lead to a shard being failed and not able to recover.

Closes #5828 #5916
2014-04-28 20:12:54 +07:00
Adrien Grand 22cbdd930c [TEST] Fix test bug in MultiOrdinalsTests. 2014-04-28 13:56:01 +02:00
Clinton Gormley 2dfc77a4ed Removed spec and YAML tests for indices.status
Related #4854
2014-04-28 13:00:08 +02:00
Robert Muir 8e0a479316 Upgrade to Lucene 4.8
Closes #5932
2014-04-28 06:45:50 -04:00
Chris Earle 5528370e24 Added type, max, min, queueSize & keepAlive to _cat/thread_pool
Closes #5366
2014-04-28 12:00:27 +02:00
Simon Willnauer f285ffc610 Multi value handling in decay functions
Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.

Relates to #3960
closes #5940
2014-04-28 11:55:32 +02:00
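The reduction described in this commit can be sketched as follows. This is an illustration only: the class, the `Mode` enum and the `reduce` method are hypothetical names and do not reproduce the actual decay function code. It computes the distance of every value to the origin and then combines the distances with the user-chosen mode.

```java
class MultiValueDistanceSketch {
    // Hypothetical stand-in for the user-selected reduction mode.
    enum Mode { MIN, MAX, SUM, AVG }

    // Reduces the distances of all field values to the origin into one distance.
    static double reduce(double origin, double[] values, Mode mode) {
        if (values.length == 0) {
            throw new IllegalArgumentException("field has no values");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = 0;
        double sum = 0;
        for (double value : values) {
            double distance = Math.abs(value - origin);
            min = Math.min(min, distance);
            max = Math.max(max, distance);
            sum += distance;
        }
        switch (mode) {
            case MIN: return min;
            case MAX: return max;
            case SUM: return sum;
            default:  return sum / values.length; // AVG
        }
    }

    public static void main(String[] args) {
        double[] values = {7, 12, 20}; // distances to origin 10 are 3, 2 and 10
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + reduce(10, values, mode));
        }
    }
}
```

The reduced distance would then be fed into the decay function in place of the distance of the first value alone.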
Britta Weber f993945e5c Move SortMode to org.elasticsearch.search and rename to MultiValueMode 2014-04-28 11:55:32 +02:00
javanna 5d1d5d6754 [DOCS] Removed leftover indices status link 2014-04-28 11:39:12 +02:00
javanna 1685e3611c [DOCS] Fixed get asciidoc missing section warning 2014-04-28 11:39:12 +02:00
javanna 16468f9ca3 [DOCS] Fixed scripting example 2014-04-28 11:39:12 +02:00
Shay Banon 6b2c1d0f62 spelling 2014-04-28 11:07:54 +02:00
Shay Banon 6899e642b5 Upgrade to Guava 17
closes #5953
2014-04-28 11:02:30 +02:00
Shay Banon dedddf3908 Raise node disconnected even if the transport is stopped
During the stop process, we raise network disconnects, so it is valid to raise them while we are in stop mode; in fact, we should not miss any events in such a case.
Typically, this is not a problem, since it happens during the normal shutdown process of the JVM, but when running a reused cluster within the JVM (like in our test infra with the shared cluster), we should properly raise those node disconnects
closes #5918
2014-04-28 10:56:43 +02:00
Clinton Gormley 4b9f1d261d Removed indices-status docs.
Related #4854
2014-04-28 10:40:45 +02:00
Adrien Grand fc32875ae9 Make ordinals start at 0.
Our ordinals currently start at 1, like FieldCache did in older Lucene versions.
However, Lucene 4.2 changed it in order to make ordinals start at 0, using -1
as the ordinal for the missing value. We should switch to the same numbering as
Lucene for consistency. This also allows us to remove some abstraction on top of
Lucene doc values.

Close #5871
2014-04-28 10:21:50 +02:00
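To illustrate the numbering this commit switches to, here is a small sketch that assigns 0-based ordinals to the sorted unique terms of a field and uses -1 for documents without a value. The class and method names are hypothetical; this is not the field data implementation itself.

```java
import java.util.Arrays;
import java.util.TreeSet;

class ZeroBasedOrdinalsSketch {
    static final int MISSING_ORD = -1;

    // Returns one ordinal per document: the index of its term among the
    // sorted unique terms, or -1 when the document has no value (null).
    static int[] ordinals(String[] docValues) {
        TreeSet<String> unique = new TreeSet<>();
        for (String value : docValues) {
            if (value != null) {
                unique.add(value);
            }
        }
        String[] sorted = unique.toArray(new String[0]);
        int[] ords = new int[docValues.length];
        for (int doc = 0; doc < docValues.length; doc++) {
            ords[doc] = docValues[doc] == null
                    ? MISSING_ORD
                    : Arrays.binarySearch(sorted, docValues[doc]);
        }
        return ords;
    }

    public static void main(String[] args) {
        // prints [1, -1, 0, 1]
        System.out.println(Arrays.toString(ordinals(new String[] {"b", null, "a", "b"})));
    }
}
```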
javanna aa4dc092da _cat/allocation to return no value for `disk.total` when not available (e.g. non data nodes) instead of `-1b`
Closes #5948
2014-04-26 16:46:34 +02:00
Lee Hinman 81e83cca74 Disable dynamic scripting by default
Closes #5853
2014-04-25 15:08:26 -06:00
Boaz Leskes 051beb51a3 Version types `EXTERNAL` & `EXTERNAL_GTE` test for version equality in read operation & disallow them in the Update API
Separate version check logic for reads and writes for all version types, which allows different behavior in these cases.
Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations.
The previous behavior was fit for writes but is useless in reads.

This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update api raise a validation error as it may cause data to
be lost.

Closes #5663, Closes #5661, Closes #5929
2014-04-25 23:06:12 +02:00
Uwe Dauernheim 080c4ade25 Fix typo 2014-04-25 14:59:10 -06:00
Benoss ed33b022d3 Update setup repositories documentation
Update the doc so that the example at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html
points to 1.1 instead of 0.90
2014-04-25 14:57:23 -06:00
Martijn van Groningen a2aa167e6e Don't create docCounts equal to maxOrd for the GlobalOrdinalsStringTermsAggregator.WithHash impl.
Relates #5873
2014-04-26 00:01:52 +07:00
Martijn van Groningen eb9805389a Use segment ordinals as global ordinals if a segment contains all values for a field on a shard level.
Relates to #5854
Closes #5873
2014-04-25 23:05:07 +07:00
Shay Banon 65bc017271 Don't lookup version for auto generated id and create
When a create document is executed, and it has an auto generated id (based on UUID), we know that the document will not exist in the index, so there is no need to try and lookup the version from the index.
For many cases, like logging, where ids are auto generated, this can improve the indexing performance, specifically for lightweight documents where analysis is not a big part of the execution.
closes #5917
2014-04-25 14:31:20 +02:00
Simon Willnauer 0b3605f4f2 [TEST] Logger names differ based on the classpath, inside the IDE the package name is used as a prefix 2014-04-25 12:36:32 +02:00
Simon Willnauer b7325d005b Make Create/Update/Delete classes less mutable
Today we use a builder pattern / setters to set relevant information
to Engine#Delete|Create|Index. Yet almost all the values are required
but they are not passed via ctor arguments but via an error prone builder
pattern. If we add a required argument we should see compile errors on that
level to make sure we don't miss any place to set them.

Prerequisite for #5917
2014-04-25 11:42:05 +02:00
mikemccand 908c0d4165 temporarily mute this test on Java 8 until we fix getFiniteStrings 2014-04-25 05:41:18 -04:00
Simon Willnauer d0f8742f8d [TEST] Prevent deletion of the second document by using different ids 2014-04-25 11:07:31 +02:00
Simon Willnauer ec5dbbaf51 [TEST] Expect all shards failed in SearchWithRandomExceptionsTests 2014-04-25 11:03:22 +02:00
Britta Weber 8076a31ac1 Throw exception if an additional field was placed inside the "query" body
Currently the parser accepts queries like

```
"query" : {
     "any_query": {
         ...
     },
     "any_field_name":...
}
```

The "any_field_name" is silently ignored. However, this also causes the parser
not to move to the next closing bracket which in turn can lead to additional query
parameters being ignored such as "fields", "highlight",...
This was the case in issue #4895

closes issue #4895
2014-04-25 08:57:06 +02:00
Britta Weber c7bb784b08 Fix TemplateQueryParser swallows additional parameters
Request parameters such as "size" and "fields" were ignored when
placed after the template query in the request.

closes #5933
2014-04-25 08:51:08 +02:00
Adrien Grand d8f0f7077f [TEST] Use assertAllSuccessful instead of assertNoFailures in CompletionSuggestSearchTests. 2014-04-25 00:07:10 +02:00
Adrien Grand f1916d16dc [TEST] Fix typo in DateHistogramTests that fails the test since it expects dates to be rounded by day. 2014-04-24 23:25:23 +02:00
Adrien Grand f109802960 Remove java6ism in FSTBytesAtomicFieldData. 2014-04-24 22:41:12 +02:00
mikemccand ba73877580 Send Lucene's IndexWriter infoStream messages to Logger lucene.iw, level=TRACE
Lucene's IndexWriter logs many low-level details to its infoStream
which can be helpful for diagnosing; with this change, if you enable
TRACE logging for the "lucene.iw" logger name then IndexWriter's
infoStream output will be captured and can later be scrutinized
e.g. using https://code.google.com/a/apache-extras.org/p/luceneutil/source/browse/src/python/iwLogToGraphs.py
to generate graphs like http://people.apache.org/~mikemccand/lucenebench/iw.html

Closes #5891
2014-04-24 16:31:23 -04:00
javanna 9a68e60142 [TEST] Allow to disable randomization of shards and replicas via system property
Needed for REST backwards compatibility tests, since we need to run older tests with the latest runner, which randomizes shards and replicas, but the tests rely on defaults (5,1).

Done in a generic way based on compatibility versions e.g. `-Dtests.compatibility=1.0.0` allows to run tests in a special manner that is compatible with version 1.0.0.

Also moved back randomIndexTemplate to ElasticsearchIntegrationTest (from ImmutableCluster) where all the randomized aspects should be.

Closes #5897
2014-04-24 22:18:31 +02:00
Isabel Drost-Fromm dcfc7cead0 Add some more documentation to TemplateQueryParser
Relates to #4879
2014-04-24 22:11:17 +02:00
mikemccand 84af7d9f9a test was missing tie-break for the two suggestions 2014-04-24 15:42:08 -04:00
Clinton Gormley c1e03bf860 Update keyword-repeat-tokenfilter.asciidoc 2014-04-24 16:44:02 +02:00
Britta Weber e84d3111a3 Revert "Throw exception if decay is requested for a field with multiple values"
This reverts commit 95d781510f.

see https://github.com/elasticsearch/elasticsearch/issues/3960#issuecomment-41279373
2014-04-24 15:46:48 +02:00
Britta Weber 95d781510f Throw exception if decay is requested for a field with multiple values
closes #3960
2014-04-24 15:18:39 +02:00
Adrien Grand 0631b6a042 [TEST] DisabledFieldDataFormatTests assumes a single replica. 2014-04-24 14:16:07 +02:00
Adrien Grand d792d14926 Instantiate facets/aggregations during the QUERY phase.
In case of a DFS_QUERY_THEN_FETCH request, facets and aggregations are currently
instantiated during the DFS phase while they only become useful during the QUERY
phase. By instantiating during the QUERY phase instead, we can make better use
of recycling since objects will have a shorter life out of the recyclers.

Close #5821
2014-04-24 11:48:36 +02:00
Adrien Grand d8880f2906 Fail a DFS_QUERY_THEN_FETCH request if all shards failed the QUERY phase.
Today, if some shards pass the DFS phase but all of them fail the QUERY phase,
the response will only consist of failed shards. We should throw an exception
instead in order to be consistent with the QUERY_THEN_FETCH type.
2014-04-24 11:48:24 +02:00
Adrien Grand cb8139a583 Remove abstraction in the percentiles aggregation.
We initially added abstraction in the percentiles aggregation in order to be
able to plug in different percentiles estimators. However, only one of the 3
options that we looked into proved useful and I don't see us adding new
estimators in the future.

Moreover, because of this, we let the parser put unknown parameters into a hash
table in case these parameters would have meaning for a specific percentiles
estimator impl. But this makes parsing error-prone: for example a user reported
that his percentiles aggregation returned extremely high values (on the order of
several million while the maximum field value was `5`), and the reason was that
he had a typo and had written `fields` instead of `field`. As a consequence,
the percentiles aggregation used the parent value source which was a timestamp,
hence the large values. Parsing now fails in case of an unknown parameter.

Close #5859
2014-04-24 09:44:36 +02:00
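The parsing behavior described in this commit might look roughly like the sketch below, which rejects any parameter it does not know instead of storing it for later. The class, the method and the exception message are hypothetical; the real aggregation parser works on a streaming parser, not on a `Map`.

```java
import java.util.Map;

class StrictParamsSketch {
    // Returns the value of "field" and fails on any other key, so a typo
    // such as "fields" is reported instead of being silently ignored.
    static String parseField(Map<String, Object> params) {
        String field = null;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            if ("field".equals(entry.getKey())) {
                field = String.valueOf(entry.getValue());
            } else {
                throw new IllegalArgumentException(
                        "Unknown parameter [" + entry.getKey() + "]");
            }
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(parseField(java.util.Collections.<String, Object>singletonMap("field", "load_time")));
    }
}
```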
Adrien Grand b3e0e58094 Field data diet.
We have lots of unused, or almost unused methods in our field data impls,
especially when dealing with ordinals. Let's nuke them.

Close #5874
2014-04-24 09:14:09 +02:00