4263 Commits

Author SHA1 Message Date
Nik Everett 0ff0985e01 Limit guava caches to 31.9GB
Guava's caches have overflow issues around 32GB with our default segment
count of 16 and weight of 1 unit per byte.  We give them 100MB of headroom
so 31.9GB.

This limits the sizes of both the field data and filter caches, the two
large guava caches.

Closes #6268
2014-05-23 00:20:12 +02:00
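The 31.9GB figure in 0ff0985e01 can be checked with a little arithmetic. This is a minimal sketch, assuming each cache segment tracks its total weight in a signed 32-bit integer, which is an assumption made here to explain the overflow point and is not stated in the commit. The function names are hypothetical.

```python
# Assumption: each segment's total weight is held in a signed 32-bit int.
INT_MAX = 2**31 - 1
SEGMENTS = 16                     # default segment count cited in the commit
HEADROOM_BYTES = 100 * 1024**2    # 100MB of headroom, as in the commit


def max_safe_cache_bytes(segments=SEGMENTS, headroom=HEADROOM_BYTES):
    """Largest total weight (1 unit per byte) that stays clear of the overflow point."""
    overflow_point = segments * (INT_MAX + 1)   # 16 * 2GB = 32GB
    return overflow_point - headroom


def clamp_cache_size(requested_bytes):
    """Hypothetical helper: cap a configured cache size at the safe limit."""
    return min(requested_bytes, max_safe_cache_bytes())


limit = max_safe_cache_bytes()
limit_gb = round(limit / 1024**3, 1)
print(limit, limit_gb)
```

Under these assumptions the limit works out to 32GB minus 100MB, which rounds to the 31.9GB named in the commit title.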
Adrien Grand a836496e57 [TESTS] Randomly disable the filter cache.
Close #6280
2014-05-22 23:13:29 +02:00
Adrien Grand 6e49256fa8 Nested: Make sure queries/filters/aggs get a FixedBitSet when they expect one.
Close #6279
2014-05-22 23:13:13 +02:00
Adrien Grand b3274bd770 Aggregations: Fix ReverseNestedAggregator to compute the parent document correctly.
Close #6278
2014-05-22 23:13:13 +02:00
Martijn van Groningen cbdd11777f [TEST] Just start two nodes 2014-05-22 21:13:52 +02:00
Martijn van Groningen 41bcb3e0d3 [TEST] Don't stop master node. 2014-05-22 19:17:54 +02:00
Nik Everett 3573822b7e Highlight fields in request order
Because json objects are unordered this also adds an explicit order syntax
that looks like
    "highlight": {
        "fields": [
            {"title":{ /*params*/ }},
            {"text":{ /*params*/ }}
        ]
    }

This is not useful for any of the builtin highlighters but will be useful
in plugins.

Closes #4649
2014-05-22 16:44:14 +02:00
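To illustrate the explicit order syntax from 3573822b7e: an array of single-key objects keeps the field order regardless of how the consumer treats object keys. The sketch below is not the Elasticsearch parser; `parse_highlight_fields` is a hypothetical helper, and the empty parameter objects stand in for the `/*params*/` placeholders.

```python
import json


def parse_highlight_fields(fields):
    """Return (field_name, params) pairs in request order.

    Accepts either the object form or the array-of-single-key-objects form.
    """
    if isinstance(fields, dict):
        return list(fields.items())
    ordered = []
    for entry in fields:
        if len(entry) != 1:
            raise ValueError("each array entry must name exactly one field")
        (name, params), = entry.items()
        ordered.append((name, params))
    return ordered


request = json.loads('{"highlight": {"fields": [{"title": {}}, {"text": {}}]}}')
order = [name for name, _ in parse_highlight_fields(request["highlight"]["fields"])]
print(order)
```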
Alex Ksikes 2546c06131 More Like This Query: allow for both 'like_text' and 'docs/ids' to be specified.
Closes #6246
2014-05-22 13:50:17 +02:00
Martijn van Groningen a717af505a [TEST] Use _uid sort field as tie, so that hits with the same score are sorted in the same way in both search responses. 2014-05-22 12:10:03 +02:00
Colin Goodheart-Smithe cabd2340dd Aggregations: Fixed conversion of date field values when using multiple date formats
When multiple date formats are specified using the || syntax in the field mappings the date_histogram aggregation breaks.  This is because we are getting a parser rather than a printer from the date formatter for the object we use to convert the DateTime values back into Strings.  Simple fix to get the printer from the date format and test to back it up

Closes #6239
2014-05-22 10:21:50 +01:00
Martijn van Groningen e8e684c6c4 Add number of shards statistic to PercolateContext instead of throwing exception.
Certain features like significant_terms aggregation rely on this statistic for sizing heuristics.

Closes #6037
Closes #6123
2014-05-22 10:44:50 +02:00
Martijn van Groningen 16e5cdf8d0 Cut over to Lucene's TopDocs#merge for shard topdocs sorting.
Closes #6197
2014-05-22 10:40:56 +02:00
Martijn van Groningen 157d511061 [TEST] Use SuiteScopeTest annotation instead of ClusterScope(scope = ElasticsearchIntegrationTest.Scope.SUITE, numDataNodes = 1) 2014-05-21 22:08:59 +02:00
Alex Ksikes a29b4a800d More Like This Query: replaced 'exclude' with 'include' to avoid double negation when set.
Closes #6248
2014-05-21 18:45:03 +02:00
Britta Weber 8cca9b28df Percolator: Fix assertion in percolation with nested docs
Assertion was triggered for percolating documents with nested object
in mapping if the document did not actually contain a nested object.
Reason:
MultiDocumentPercolatorIndex checks if the number of documents is
actually >1. Instead we can just use the SingleDocumentPercolatorIndex
in this case.

closes #6263
2014-05-21 18:17:36 +02:00
Simon Willnauer 17d34d5c97 Fix FieldDataWeighter generics to accept RamUsage instead of AtomicFieldData
The `FieldDataWeighter` allowed a concrete subclass of the cache's generic
type to be used, which causes a ClassCastException and also prevents the
CircuitBreaker from being decremented appropriately.

This was tripped by settings randomization also part of this commit.

Closes #6260
2014-05-21 17:50:45 +02:00
Lee Hinman 03402c7ed8 [TEST] prevent dummy documents from being indexed in testSimpleQueryString() since scores are compared 2014-05-21 17:37:54 +02:00
Martijn van Groningen a6b0b80f3d [TEST] Added test for #6256 2014-05-21 16:17:03 +02:00
Adrien Grand 34f7bd1ca4 Fail queries that have two aggregations with the same name.
Close #6255
2014-05-21 15:11:23 +02:00
Simon Willnauer f29744cc2f XFilteredQuery default strategy prefers query first in the deleted docs case
Today we check if the DocIdSet we filter by is `fast` but the check fails
if the DocIdSet is wrapped in an `ApplyAcceptedDocsFilter` which is always
the case if the index has deleted documents. This commit unwraps
the original DocIdSet in the case of deleted documents.

Closes #6247
2014-05-21 13:04:41 +02:00
Adrien Grand fa3bd738ab Remove `DocIdSets.isFastIterator(DocIdSetIterator iterator)`.
This method was unused and its implementation wasn't correct since FixedBitSet
has its own iterator since Lucene 4.7.
2014-05-21 11:25:35 +02:00
Simon Willnauer a60dabdf0c [TEST] skip benchmark tests for now 2014-05-20 22:21:37 +02:00
Martijn van Groningen 9494bbd9b7 Verify that the current node is still master before the reroute is executed and if that isn't the case skip reroute
Invoke listener when reroute fails.

Closes #6244
2014-05-20 18:29:06 +02:00
Simon Willnauer 0e445d3aaf [TEST] Wait for all benchmarks to be started if more than one is used 2014-05-20 17:17:25 +02:00
Simon Willnauer 75efa47d5a [TEST] Allow to disable plugin loading from classpath 2014-05-20 16:31:32 +02:00
mikemccand 9c45fe8f9b Don't use AllTokenStream when no fields were boosted
AllTokenStream, used to index the _all field, adds some overhead, but
it's not necessary when no fields were boosted or when positions are
not indexed for the _all field.

Closes #6187 Closes #6219
2014-05-20 10:28:31 -04:00
Andrew Selden 476e28f4ce Benchmark abort accepts wildcard patterns
This adds support for sending a list of benchmark names and/or wildcard
patterns when submitting an abort request. Standard wildcards such as:
"*", "*xxx", and "xxx*" are supported. Multiple benchmark names and
wildcard patterns can be submitted together as a comma-separated list
from the REST API or as a string array from the Java API.

Closes #6185
2014-05-20 16:00:11 +02:00
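The wildcard forms listed in 476e28f4ce (`*`, `*xxx`, `xxx*`) could be matched along these lines. This is a hedged sketch rather than the actual implementation; the function names and the benchmark names in the example are hypothetical.

```python
def matches(pattern, name):
    """Match a benchmark name against "*", "*xxx", "xxx*", or an exact name."""
    if pattern == "*":
        return True
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def select_benchmarks(patterns_csv, active):
    """Pick active benchmarks matching any entry of a comma-separated list."""
    patterns = [p.strip() for p in patterns_csv.split(",") if p.strip()]
    return [n for n in active if any(matches(p, n) for p in patterns)]


active = ["nightly_1", "search_smoke", "exact", "other"]
selected = select_benchmarks("nightly*,*_smoke,exact", active)
print(selected)
```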
Simon Willnauer e47ee6f683 [TEST] Disable dummy documents for QueryRescorerTests#testEquivalence 2014-05-20 15:52:00 +02:00
Igor Motov 91c7892305 Add ability to snapshot replicating primary shards
This change adds a new cluster state that waits for the replication of a shard to finish before starting the snapshotting process. Because this change adds a new snapshot state, pre-1.2.0 nodes will not be able to join a 1.2.0 cluster that is currently running a snapshot/restore operation.

Closes #5531
2014-05-20 08:57:21 -04:00
Boaz Leskes 05d131c39d Before deleting a local unused shard copy, verify we're connected to the node it's supposed to be on
This is yet another safety guard to make sure we don't delete data if the local copy is the only one (even if it's not part of the cluster state any more)

Closes #6191
2014-05-20 11:16:13 +02:00
Boaz Leskes 541acc7e9b Honor time delay when retrying recoveries
In some places we want to delay the start of a shard recovery because the source node is not ready to receive. At the moment the retry logic ignores the time delay parameter (`retryAfter`), causing a busy-waiting-like scenario. This is fixed in this commit.

Closes #6226
2014-05-20 11:03:14 +02:00
Simon Willnauer 223550bf3c [TEST] Opt out of dummy documents where scores are relevant. 2014-05-20 10:26:50 +02:00
Simon Willnauer ac28557228 [TEST] Provide overloaded indexRandom to opt out of dummy documents 2014-05-20 10:20:31 +02:00
Clinton Gormley 0741ce3684 CharArraySet doesn't know how to lookup the original string
in an ImmutableList.

Closes #6237
2014-05-19 21:27:04 +02:00
Simon Willnauer 7d76548a1a Added Version [1.3.0] 2014-05-19 20:55:23 +02:00
Simon Willnauer 85a0b76dbb Upgrade to Lucene 4.8.1
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
2014-05-19 20:47:55 +02:00
Andrew Selden 3731362ca8 Do not throw exception on no available nodes when listing benchmarks.
Changed behavior to not throw an exception on a status request
when there are no available benchmark nodes.

Closes #6146
2014-05-19 11:42:15 -07:00
Tiago Alves Macambira 4dd2ba6d50 Register uppercase as an exposed ES token filter.
Just follow "lowercase" token filter example and register "uppercase" token filter as an exposed token filter. This will not, by itself, test whether ES is correctly handling "uppercase" TF; this is more of a "code as documentation" fix.
2014-05-19 13:49:33 -03:00
Simon Willnauer fc28fbfada Add dummy docs injection to indexRandom
This commit adds `dummy docs` to `ElasticsearchIntegrationTest#indexRandom`.
It indexes documents with an empty body into the indices specified by the docs
and deletes them after all docs have been indexed. This produces gaps in
the segments and enforces usage of accept docs on lower levels to ensure
the features work with deleted documents as well.
2014-05-19 17:23:14 +02:00
Simon Willnauer 579a79d1ac Check accepts docs before MatchDocIdSet#matchDoc(int)
We currently ask `MatchDocIdSet#matchDoc(int)` before consulting
the accept docs. This can also have a negative performance impact
since `matchDoc(int)` calls might be way more expensive than
acceptDocs calls.

Closes #6234
2014-05-19 17:17:55 +02:00
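The ordering change in 579a79d1ac can be pictured as follows: consult the cheap accept-docs check first, and only then call the potentially expensive match. This is a simplified sketch with plain lists and callables standing in for Lucene's bit sets and `matchDoc(int)`; all names are hypothetical.

```python
def matching_docs(max_doc, accept_docs, match_doc):
    """Collect doc IDs that are accepted (e.g. not deleted) and match.

    accept_docs is a list of booleans or None (meaning every doc is accepted).
    """
    hits = []
    for doc in range(max_doc):
        # Cheap check first: skip docs that are not accepted.
        if accept_docs is not None and not accept_docs[doc]:
            continue
        # Only now pay for the potentially expensive match.
        if match_doc(doc):
            hits.append(doc)
    return hits


calls = []


def expensive_match(doc):
    calls.append(doc)          # record how often the costly check runs
    return doc % 2 == 0


live = [True, False, True, False, True, True]
hits = matching_docs(6, live, expensive_match)
print(hits, calls)
```

In this example the expensive check runs only for the four accepted documents instead of all six.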
Simon Willnauer 3e4c896944 [TEST] Drop obsolete test - the option is obsolete and won't be fixed 2014-05-19 15:06:04 +02:00
Simon Willnauer 72da764261 Don't report terms as live if all its docs are filtered out
FilterableTermsEnum allows filtering stats by supplying per-segment
bits. Today, if all docs are filtered out, the term is still reported as
live but shouldn't be.

Relates to #6211
2014-05-19 13:48:56 +02:00
Simon Willnauer c593234b7c [TEST] Ensure multi_match & match query equivalence in the single field case 2014-05-19 13:32:24 +02:00
Martijn van Groningen 39018c5d0b [TEST] Added await for yellow status,
because the shard the get request for 'test' index, 'type1' type and id 1 is getting executed on may not be in a started state
and also added more logging.
2014-05-19 11:56:26 +02:00
Simon Willnauer d9441747e8 [TEST] Beef up MoreLikeThisActionTests#testCompareMoreLikeThisDSLWithAPI 2014-05-18 23:02:08 +02:00
Simon Willnauer 91b74931a3 [TEST] Stabilize MoreLikeThisActionTests
The `testCompareMoreLikeThisDSLWithAPI` test compares results from query
and API which might query different shards. Those shards might use
different doc IDs internally to disambiguate. This commit re-sorts the
results and compares them after stable disambiguation.
2014-05-18 22:57:46 +02:00
mikemccand 4f7792e64b Tie-break suggestions from phrase suggester by term
If the score for two suggestions is the same, we now tie break by term; earlier terms (aaa) sort before later terms (zzz).

Closes #5978
2014-05-18 16:45:37 -04:00
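The tie-break rule in 4f7792e64b amounts to a two-part sort key: score descending, then term ascending. A minimal sketch, with suggestions modeled as hypothetical `(term, score)` tuples:

```python
def rank_suggestions(suggestions):
    """Sort by score (highest first); equal scores fall back to term order."""
    return sorted(suggestions, key=lambda s: (-s[1], s[0]))


ranked = rank_suggestions([("zzz", 0.5), ("mmm", 0.9), ("aaa", 0.5)])
print(ranked)
```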
Simon Willnauer dab4596b13 Use default forceAnalyzeQueryString if no query builder is present
In the single field case no query builder is selected which causes NPE
when the query has only a numeric field.

Closes #6215
2014-05-18 10:20:31 +02:00
Boaz Leskes 1e5138889e Translog: remove unneeded Versions.readVersion & Versions.writeVersion
These calls were introduced in pr #6149 as a backward compatibility layer for the previous value of `Versions.MATCH_ANY`. This is not needed as the translog never contains these values. On top of that, the calls are not effective as the stream the translog uses is effectively not versioned (versioning is done on an item-by-item basis).
2014-05-18 09:45:00 +02:00
Boaz Leskes 682acfcacd DeleteRequest.version was not initialized to `Versions.MATCH_ANY` 2014-05-18 09:45:00 +02:00
Simon Willnauer c7db8843b3 [TEST] Stabilize BenchmarkIntegrationTest#testAbortBenchmark 2014-05-17 23:33:49 +02:00
Alex Ksikes db991dc3a4 More Like This Query: Added searching for multiple items.
The syntax to specify one or more items is the same as for the Multi GET API.
If only one document is specified, the results returned are the same as when
using the More Like This API.

Relates #4075 Closes #5857
2014-05-17 19:14:56 +02:00
Igor Motov a3581959d7 [TESTS] Ignore SnapshotMissingException in snapshotWithStuckNodeTest
The retry mechanism in the transport layer might cause the delete snapshot request to be executed twice if the cluster master is closed while the request is executed. First time delete snapshot request is getting successfully executed on the old master and then it is retried on the newly elected master. When the new master tries to delete the snapshot - the snapshot no longer exists (since it was successfully deleted by the old master) and SnapshotMissingException is returned.
2014-05-17 11:18:11 -04:00
Igor Motov c20713530d Switch to shared thread pool for all snapshot repositories
Closes #6181
2014-05-16 19:03:15 -04:00
Igor Motov 7f5befd95e Add Partial snapshot state
Currently, even if some shards of the snapshot are not snapshotted successfully, the snapshot is still marked as "SUCCESS". Users may miss the fact that there are shard failures present in the snapshot and think that the snapshot was completed. This change adds a new snapshot state "PARTIAL" that provides a quick indication that the snapshot was only partially successful.

Closes #5792
2014-05-16 18:26:56 -04:00
Boaz Leskes 9f10547f4b Allow 0 as a valid external version
Until now all version types have officially required the version to be a positive long number. Despite this being documented, ES versions <=1.0 did not enforce it when using the `external` version type. As a result people have successfully indexed documents with 0 as a version. In 1.1 we introduced validation checks on incoming version values, causing indexing requests to fail if the version was set to 0. While this is strictly speaking OK, we effectively have a situation where data already indexed does not match the version invariant.

To be lenient and adhere to the spirit of our data backward compatibility policy, we have decided to allow 0 as a valid external version. This is somewhat complicated as 0 is also the internal value of `MATCH_ANY`, which indicates requests should succeed regardless of the current doc version. To keep things simple, this commit changes the internal value of `MATCH_ANY` to `-3` for all version types.

Since we're doing this in a minor release (and because versions are stored in the transaction log), the default `internal` version type still accepts 0 as a `MATCH_ANY` value. This is not a problem for other version types as `MATCH_ANY` doesn't make sense in that context.

Closes #5662
2014-05-16 22:10:16 +02:00
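The rules described in 9f10547f4b can be summarized in a few lines. This is only a sketch of the behavior as the commit message describes it; the function names are hypothetical, and the real checks live in the version type implementations.

```python
MATCH_ANY = -3  # new internal sentinel value named in the commit


def is_match_any(version, version_type):
    """True when the request should succeed regardless of the doc version."""
    if version == MATCH_ANY:
        return True
    # Legacy leniency: the `internal` type still reads 0 as MATCH_ANY.
    return version_type == "internal" and version == 0


def is_valid_external_version(version):
    """Hypothetical validation: 0 is now accepted, negative values are not."""
    return version >= 0


print(is_match_any(0, "internal"), is_match_any(0, "external"))
```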
Simon Willnauer bf22df7fd0 Remove SoftReferences from StreamInput/StreamOutput
We try to reuse character arrays and UTF8 writers with SoftReferences.
SoftReferences have a negative impact on GC and should be avoided in
general. Yet in this case they can simply be replaced with a per-stream
Bytes/CharsRef that is thread local and has the same lifetime as the
stream.
2014-05-16 20:58:42 +02:00
Simon Willnauer 11a3201a09 Use EnumSet rather than static mutable arrays
ClusterBlockLevel uses arrays but should use EnumSets instead
2014-05-16 20:54:01 +02:00
Simon Willnauer d65e9e9bea Add some finals where appropriate 2014-05-16 20:54:01 +02:00
Simon Willnauer c561900512 Use UTF-8 as string encoding 2014-05-16 20:54:01 +02:00
David Pilato 0dbc83e7b0 [TEST] Do not filter gz files 2014-05-16 15:23:09 +02:00
Simon Willnauer d806b567e4 Remove dead code 2014-05-16 15:08:56 +02:00
Simon Willnauer eef505ed51 RecoveryID should not be a per JVM but per Node
Today the RecoveryID is taken from a static atomic long which
is essentially a per JVM ID. We run the tests within the same
JVM and that means we don't really simulate what happens in
production environments. Instead we should use a per node generated
ID.
2014-05-16 14:59:32 +02:00
Simon Willnauer 9a9cc0b8e4 Add simple example to XContentParser how to obtain an instance of it 2014-05-16 14:55:22 +02:00
David Pilato bd871f96c2 Check that a plugin is Lucene compatible with the current running node using `lucene` property in `es-plugin.properties` file.
* If plugin does not provide `lucene` property, we consider that the plugin is compatible.
* If plugin provides `lucene` property, we try to load related Enum org.apache.lucene.util.Version. If this fails, it means that the node is too "old" compared to the Lucene version the plugin was built for.
* We then compare the first two digits of the current node's Lucene version against the first two digits of the plugin's Lucene version. If not equal, it means that the plugin is too "old" for the current node.

Plugin developers who want to enable the plugin check only have to add a `lucene` property in `es-plugin.properties` file. If you are using maven to build your plugin, you can do it like this:

In `pom.xml`:

```xml
    <properties>
        <lucene.version>4.6.0</lucene.version>
    </properties>

    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>true</filtering>
            </resource>
        </resources>
    </build>
```

In `es-plugin.properties`, add:

```properties
lucene=${lucene.version}
```

BTW, if you don't already have it, you can add the plugin version as well:

```properties
version=${project.version}
```

You can disable that check using `plugins.check_lucene: false`.
2014-05-16 13:41:20 +02:00
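The three rules in bd871f96c2 can be sketched as a small decision function. This is an illustration only: `known_versions` is a hypothetical stand-in for the lookup against `org.apache.lucene.util.Version`, and the version strings are examples.

```python
def major_minor(version):
    """Return the first two numeric components, e.g. "4.6.0" -> (4, 6)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_plugin_compatible(node_lucene, plugin_lucene, known_versions):
    # Rule 1: no `lucene` property means the plugin is considered compatible.
    if plugin_lucene is None:
        return True
    # Rule 2: a version the node does not know means the node is too old.
    if major_minor(plugin_lucene) not in known_versions:
        return False
    # Rule 3: the first two digits must be equal, else the plugin is too old.
    return major_minor(node_lucene) == major_minor(plugin_lucene)


known = {(4, 6), (4, 7), (4, 8)}   # hypothetical versions known to the node
print(is_plugin_compatible("4.8.1", "4.8.0", known))
print(is_plugin_compatible("4.8.1", "4.6.0", known))
```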
Simon Willnauer 094908ac7f Randomize CMS settings in index template
This commit adds randomization for:
 * `index.merge.scheduler.max_thread_count`
 * `index.merge.scheduler.max_merge_count`

This commit also moves to use
EsExecutors#boundedNumberOfProcessors(Settings) to
configure the default `max_thread_count` for better reproducibility

Closes #6194
2014-05-15 23:16:45 +02:00
javanna 7548b2edb7 Unified MetaData#concreteIndices methods into a single method that accepts indices (or aliases) and indices options
Added new internal flag to IndicesOptions that tells whether aliases can be resolved to multiple indices or not.

Cut over to new metaData#concreteIndices(IndicesOptions, String...) for all the api previously using MetaData#concreteIndices(String[], IndicesOptions) and removed old method, deprecation is not needed as it doesn't break client code.

Introduced constants for flags in IndicesOptions for more readability

Renamed MetaData#concreteIndex to concreteSingleIndex, left method as a shortcut although it calls the common concreteIndices that accepts IndicesOptions and multipleIndices
2014-05-15 20:53:05 +02:00
Boaz Leskes 1f28cd0ba8 When sending shard start/failed message due to a cluster state change, use the master indicated in the new state rather than current
This commit also adds extra protection in other cases against a master node being de-elected and thus being null.

Closes #6189
2014-05-15 18:42:26 +02:00
Boaz Leskes 84593f0d7c Added meta data and routing version to cluster state's pretty print 2014-05-15 15:55:11 +02:00
Boaz Leskes dc07ece790 Added some debug logs to the recovery process 2014-05-15 15:37:30 +02:00
Simon Willnauer e47de1f809 [TEST] Randomize number of available processors
We configure the threadpools according to the number of processors which is
different on every machine. Yet, we had some test failures related to this
and #6174 that only happened reproducibly on a node with 1 available processor.
This commit does:
  * sometimes randomize the number of available processors
  * if we don't randomize we should set the actual number of available processors
    in the settings on the test node
  * always print out the num of processors when a test fails to make sure we can
    reproduce the thread pool settings with the reproduce info line

Closes #6176
2014-05-15 12:24:53 +02:00
Simon Willnauer 53bfe44e19 Fix debug logging message for put template action 2014-05-15 11:13:30 +02:00
Andrew Selden fc0bed5236 Fix bug for BENCH thread pool size == 1
On small hardware, the BENCH thread pool can be set to size 1. This is
problematic as it means that while a benchmark is active, there are no
threads available to service administrative tasks such as listing and
aborting. This change fixes that by executing list and abort operations
on the GENERIC thread pool.

Closes #6174
2014-05-14 10:40:39 -07:00
Simon Willnauer 2c1c5c163f [TEST] Ensure all benchmarks are aborted on failure and latches are counted down 2014-05-14 16:40:34 +02:00
Simon Willnauer fc2ab0909e [TEST] Remove busy waiting from BenchmarkIntegrationTest
I think Chuck Norris is required to fix this at this point until we have an API
that can for instance pause a Benchmark. We basically wait for a query to be executed
and that query syncs on a latch with the test in a script :)

This commit also adds some more testing for benchmarks that run into errors.
2014-05-14 14:40:27 +02:00
David Pilato e0a95d9c19 Allow sorting on nested sub generated field
When you have a nested document and want to sort on its fields, it's perfectly doable on regular fields but not on "generated" sub fields.

Here is a SENSE recreation:

```
DELETE /tmp

PUT /tmp

PUT /tmp/doc/_mapping
{
  "properties": {
    "flat": {
      "type": "string",
      "index": "analyzed",
      "fields": {
        "sub": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    },
    "nested": {
      "type": "nested",
      "properties": {
        "foo": {
          "type": "string",
          "index": "analyzed",
          "fields": {
            "sub": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

PUT /tmp/doc/1
{
  "flat":"bar",
  "nested":{
    "foo":"bar"
  }
}
```

When sorting on `flat.sub` sub field, everything is fine:

```
GET /tmp/doc/_search
{
  "sort": [
    {
      "flat.sub": {
        "order": "desc"
      }
    }
  ]
}

```

When sorting on `nested.foo` field, everything is fine:

```
GET /tmp/doc/_search
{
  "sort": [
    {
      "nested.foo": {
        "order": "desc"
      }
    }
  ]
}

```

But when sorting on `nested.foo.sub` field, sorting is incorrect:

```
GET /tmp/doc/_search
{
  "sort": [
    {
      "nested.foo.sub": {
        "order": "desc"
      }
    }
  ]
}
```

Closes #6150.
2014-05-14 14:13:44 +02:00
Britta Weber 08e57890f8 use shard_min_doc_count also in TermsAggregation
This was discussed in issue #6041 and #5998 .

closes #6143
2014-05-14 14:10:04 +02:00
Britta Weber d4a0eb818e refactor: make requiredSize, shardSize, minDocCount and shardMinDocCount a single parameter
Every class using these parameters has its own members where these four
are stored. This clutters the code. Because they are mostly needed together
it might make sense to group them.
2014-05-14 14:10:02 +02:00
Britta Weber 8e3bcb5e2f refactor: unify terms and significant_terms parsing
Both need the requiredSize, shardSize, minDocCount and shardMinDocCount.
Parsing should not be duplicated.
2014-05-14 14:09:59 +02:00
Adrien Grand bfcebbb957 [TESTS] Fix test bug in PagedBytesReferenceTest. 2014-05-14 10:09:11 +02:00
Adrien Grand 265b386fa7 [TESTS] Fix test bugs for parent/child queries.
If you got a bad seed and tests.nightly=true, these tests would either call
Random#nextInt on `0` or trigger infinite loops.
2014-05-14 09:35:45 +02:00
Boaz Leskes 9daa72941a [Test] increase ping timeout to 400ms in MinimumMasterNodesTests.dynamicUpdateMinimumMasterNodes 2014-05-14 09:28:44 +02:00
Boaz Leskes fb501b22e1 [Test] SimpleNodesInfoTests.testNodesInfos didn't wait for cluster to form properly 2014-05-14 08:59:48 +02:00
Lee Hinman 588ae1ba9e Track the number of times the CircuitBreaker has been tripped
Fixes #6130
2014-05-13 21:08:48 +02:00
javanna ffe97f004e [TEST] improved MetaDataTests coverage for different index options
Relates to #6068
2014-05-13 20:17:46 +02:00
David Pilato 2971a102f6 [Javadoc] Add full link to TDigest class
(cherry picked from commit ed72484)
2014-05-13 20:04:45 +02:00
Andrew Selden 8713a090c2 Fix recovery percentage > 100%
The recovery API was sometimes misreporting the recovered byte
percentages of index files. This was caused by summing up total file
lengths on each file chunk transfer. It should have been summing the
lengths of each transfer request.

Closes #6113
2014-05-13 09:38:02 -07:00
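A toy model of the accounting bug fixed in 8713a090c2: adding the whole file length on every chunk overshoots, while adding each chunk's own length does not. The function and the numbers are hypothetical and only illustrate the arithmetic.

```python
def recovered_percent(file_length, chunk_sizes, buggy=False):
    """Percentage of a file reported as recovered after all chunks arrive."""
    recovered = 0
    for chunk in chunk_sizes:
        # Buggy variant: count the full file length once per chunk transfer.
        # Fixed variant: count only the bytes carried by this transfer.
        recovered += file_length if buggy else chunk
    return 100.0 * recovered / file_length


chunks = [25, 25, 25, 25]          # a 100-byte file sent in four chunks
before = recovered_percent(100, chunks, buggy=True)
after = recovered_percent(100, chunks)
print(before, after)
```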
Simon Willnauer 0457b0b765 [TEST] Raise request timeout since Windows is sometimes extraordinarily slow 2014-05-13 18:05:34 +02:00
Martijn van Groningen c6c9bbdd72 Removed useless and illegal json object in the response.
Relates to #5865
2014-05-13 14:32:03 +02:00
Adrien Grand 3ad321fcb2 Fix NPE when initializing an accepted socket in NettyTransport.
NettyTransport's ChannelPipelineFactory uses the instance variable
serverOpenChannels in order to create sockets. However, this instance variable
is set to null when stopping the netty transport, so if the transport tries to
stop and to initialize a socket at the same time you might hit the following
NullPointerException:

[2014-05-13 07:33:47,616][WARN ][netty.channel.socket.nio.AbstractNioSelector] Failed to initialize an accepted socket.
java.lang.NullPointerException: handler
	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.<init>(DefaultChannelPipeline.java:725)
	at org.jboss.netty.channel.DefaultChannelPipeline.init(DefaultChannelPipeline.java:667)
	at org.jboss.netty.channel.DefaultChannelPipeline.addLast(DefaultChannelPipeline.java:96)
	at org.elasticsearch.transport.netty.NettyTransport$2.getPipeline(NettyTransport.java:327)
	at org.jboss.netty.channel.socket.nio.NioServerBoss.registerAcceptedChannel(NioServerBoss.java:134)
	at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:104)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
	at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

This fix ensures that the ChannelPipelineFactory always uses the channels that
have been used upon start, even if a stop request is issued concurrently.

Close #6144
2014-05-13 13:39:56 +02:00
Simon Willnauer d8c02c2599 Read full message on free context
Since #5730 we write a boolean in the FreeContextResponse which should be deserialized

Closes #6147
2014-05-13 12:36:13 +02:00
Simon Willnauer 1feddac315 Log cache recycler clear call as debug 2014-05-13 12:26:31 +02:00
Simon Willnauer 53db387698 Report all errors if benchmark fails and mark as failed 2014-05-13 12:26:31 +02:00
Simon Willnauer 65d27bff9d [TEST] Ensure no benchmarks are running after test 2014-05-13 12:26:30 +02:00
Benjamin Devèze 240a2a8abf [JAVADOCS] Fix wrong javadoc in IdentityHashSet.
Close #6121
2014-05-13 11:57:11 +02:00
Adrien Grand cc530b9037 Use t-digest as a dependency.
Our improvements to t-digest have been pushed upstream and t-digest also got
some additional nice improvements around memory usage and speedups of quantile
estimation. So it makes sense to use it as a dependency now.

This also allows to remove the test dependency on Apache Mahout.

Close #6142
2014-05-13 10:38:08 +02:00
markharwood 1e560b0d92 Significant_terms agg: added option for a background_filter to define background context for analysis of term frequencies
Closes #5944
2014-05-13 09:10:30 +01:00
Lee Hinman 3484ca3737 Log script change/add and removal at INFO level
Closes #6104
2014-05-13 09:33:22 +02:00
Igor Motov dfdc183ba6 Fix for hanging aborted snapshot during node shutdown
If a node is shutdown while a snapshot that runs on this node is aborted, it might cause the snapshot process to hang.

Closes #5958
2014-05-12 18:02:07 -04:00
Andrew Selden fdbefa0cd1 Fix for benchmark test timeout
Lower number of random requests generated for each test so as not to
timeout on heavy tests.

Addresses #6094
2014-05-12 14:45:43 -07:00
javanna 154688bba1 improved IndicesOptions javadocs 2014-05-12 23:26:29 +02:00
javanna c69c66bb7a fixed MetaData#concreteIndices to throw exception with a single index argument in case allowNoIndices == false and ignoreUnavailable == true
Closes #6137
2014-05-12 23:26:29 +02:00
mikemccand 00fcf4d560 #6081: set IO throttling back to 20 MB/sec now that #6018 is fixed 2014-05-12 14:42:26 -04:00
mikemccand 254ebc2f88 #6120 Remove SerialMergeScheduler (master only)
It's dangerous to expose SerialMergeScheduler as an option: since it only allows one merge at a time, it can easily cause merging to fall behind.

Closes #6120
2014-05-12 14:06:20 -04:00
mikemccand eae304aa39 5882: put back Elasticsearch's 1.1 defaults for ConcurrentMergeScheduler 2014-05-12 13:22:33 -04:00
Britta Weber 4b2e4becc7 Check if root mapping is actually valid
When a mapping is declared and the type is known from the uri
then the type can be skipped in the body (see #4483). However,
there was no check if the given keys actually make a valid mapping.

closes #5864
closes #6093
2014-05-12 18:36:14 +02:00
Adrien Grand caacce9429 [TESTS] Improve BenchmarkIntegrationTest's check that percentiles are increasing.
Percentiles are supposed to be monotonically increasing but floating-point
rounding issues can come into play and make the test fail if checks are too
strict.
2014-05-12 17:21:11 +02:00
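The relaxed check described in caacce9429 might look like the sketch below, which tolerates tiny floating-point dips between consecutive percentiles. The function name and the absolute tolerance value are assumptions made for illustration, not taken from the test.

```python
def is_non_decreasing(values, tolerance=1e-9):
    """True if each value is at least the previous one, within a tolerance."""
    return all(b >= a - tolerance for a, b in zip(values, values[1:]))


percentiles = [1.0, 2.0, 2.0 - 1e-12, 3.0]   # tiny dip caused by rounding
strict = all(b >= a for a, b in zip(percentiles, percentiles[1:]))
relaxed = is_non_decreasing(percentiles)
print(strict, relaxed)
```

A strict comparison rejects this sequence, while the tolerant one accepts it and still rejects genuine decreases.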
Shay Banon 78e39882ee Allow to change concurrent merge scheduling setting dynamically
Allow to change the concurrent merge scheduler settings dynamically using the update settings API
closes #6098
2014-05-12 07:33:31 -07:00
Adrien Grand 6d9da390ed [TESTS] Fix MinDocCountTests.
The new include/exclude support for global ordinals didn't exclude terms in
`buildAggregation` (which is required if minDocCount is 0).
2014-05-12 15:45:29 +02:00
javanna 9361305177 [TEST] made catch request more accurate in REST tests runner
Excluded 404, 403 and 409 status codes from the catch request as they have their own specific catch codes
2014-05-12 14:29:56 +02:00
Martijn van Groningen 64c43c6dc0 Made the include and exclude support for terms and significant terms aggregations based on global ordinals.
Closes #6000
2014-05-12 13:14:13 +02:00
javanna 7980911d96 restored @Test annotation in SimpleValidateQueryTests 2014-05-12 12:52:48 +02:00
Alex Marandon d1ddbd2c51 Detect unsupported fields after query in validate query api
The validate API was failing to reject JSON input that had unsupported
fields placed after a supported field. This was causing invalid requests
to be reported as valid.

Fixes #5685
2014-05-12 12:49:25 +02:00
javanna 3d63bac51d Fixed validate query parsing issues
Made sure that a match_all query is used when no query is specified and ensure no NPE is thrown either.
Also used the same code path as the search api to ensure that alias filters are taken into account, same for type filters.

Closes #6111 Closes #6112 Closes #6116
2014-05-12 12:49:25 +02:00
Alex Ksikes 513f25ae97 More Like This: Fix correct use of size and from parameters
More Like This API would not take into account 'size' and 'from' in request body parameters.
Instead these values would always be overridden by the default values of REST parameters
'search_size' and 'search_from'.

Closes #5981
2014-05-12 12:30:04 +02:00
Martijn van Groningen d4d6c3459e [TEST] Make sure all shards are allocated before the delete type is being executed. 2014-05-12 11:59:09 +02:00
Adrien Grand ebfab19400 [TESTS] Disable BenchmarkIntegrationTest#testSubmitBenchmark until it is fixed. 2014-05-12 11:30:01 +02:00
Martijn van Groningen 145efbf6ea Return missing (404) if a scroll_id that no longer exists is cleared.
Closes #5730
2014-05-12 09:43:56 +02:00
Adrien Grand 51de01bae5 [TESTS] Tentative fix of BigArrays byte-accounting checks. 2014-05-12 09:25:49 +02:00
cccabot 58ebcf1252 Fixed typos in FieldSortBuilder 2014-05-10 02:57:51 +02:00
Andrew Selden 48879752a2 [TEST] Fix for benchmark tests
- Fix bug where repeatedly calling computeSummaryStatistics() could
  accumulate some values incorrectly
- Fix check for number of responsive nodes on list is <= number of
  candidate benchmark nodes
- Add public getters for summary statistics
- Add javadoc for new getters
- Add javadoc comments about API use
- Improve abort and status tests by calling awaitBusy() to wait for jobs
  to be completely submitted before testing them
2014-05-09 16:01:57 -07:00
mikemccand 5e40a4b95a don't call isFinite from XAnalyzingSuggester; re-enable test on Java 8 2014-05-09 18:24:13 -04:00
javanna 6678da8c28 [TEST] randomly added node.bench=true to client node in test cluster and re-enabled REST benchmark tests based on number of bench nodes available
In our REST tests we already have support for features and skip sections that allow to skip tests if a feature is not supported.
We can then add a skip section based on the benchmark feature to the benchmark tests and execute them only when they are supported, knowing that they need at least a node with node.bench settings within the cluster. We can check that this requirement is met by calling the nodes info api.

This way we can dynamically decide whether to execute those tests or not and we don't need to have a node.bench around all the time. In fact, given that the REST tests use the GLOBAL cluster, we want to be able to randomize settings as much as possible and run tests against default settings as well. Also, this mechanism can be easily supported by the external cluster implementation that is used during the release process.

Introduced ability to disable benchmark nodes which is needed by BenchmarkNegativeTest.
2014-05-09 23:36:00 +02:00
Alex Ksikes d8bb7c157a [TEST] Removed the restriction on the number of bool clauses that must match.
The test failed because 'percent_terms_to_match' defaults to 0.3, which results
in requiring that some terms only found in the queried document must match, when
all the documents are on the same shard.
2014-05-09 19:14:32 +02:00
Lee Hinman e7e4ef859a Add /_cat/fielddata to display fielddata usage
Closes #4593
2014-05-09 13:18:02 +02:00
Alex Ksikes dae48d9fe8 Added the ability to include the queried document for More Like This API.
By default More Like This API excludes the queried document from the response.
However, when debugging or when comparing scores across different queries, it
could be useful to have the best possible matched hit. So this option lets users
explicitly specify the desired behavior.

Closes #6067
2014-05-09 12:59:39 +02:00
mikemccand aa31c71776 mute this test until we fix isFinite 2014-05-09 05:24:22 -04:00
Martijn van Groningen 67fe88c63c [TEST] Enforce that only one shard per node is allocated. This prevents a second shard from being assigned to another node during node shutdown. 2014-05-09 10:43:08 +02:00
Martijn van Groningen d7c05e5924 Temporarily disabling benchmark tests.
Relates #6094
2014-05-08 13:18:12 +02:00
Martijn van Groningen d5b95e3e8a A number of changes to fix reduce failures if shard failures have occurred:
* The shardTopDocs array should get created with the size equal to the total number of shard level requests and not the total number of requests that have a shard level result.
* Make sure no null TopDocs entries are passed down to TopDocs#merge
* Added dedicated scroll tests that test scrolling on an index that has missing shards due to node failure.
* Made sure that the sort fields in SimpleNestedTests exist by adding the fields in the mapping during index creation.

Closes #6022
2014-05-08 10:17:00 +02:00
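A minimal sketch of the first two points in the commit above: one slot per shard-level request, and no null entries handed to the merge step. This is not the Elasticsearch or Lucene implementation; the function name `merge_top_hits`, the `(score, doc_id)` hit representation, and the use of `heapq.merge` are assumptions made for illustration.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # hypothetical representation: (score, doc id)


def merge_top_hits(
    num_shard_requests: int,
    results_by_shard: Dict[int, List[Hit]],
    size: int,
) -> List[Hit]:
    """Merge per-shard hits (each list sorted by descending score).

    The slot list is sized by the number of shard-level requests, not by
    the number of shards that returned a result, so a failed shard simply
    leaves its slot empty instead of shifting or overflowing indices.
    """
    # One slot per shard-level request; failed shards keep an empty list
    # rather than None, so the merge never sees a null entry.
    shard_top_docs: List[List[Hit]] = [[] for _ in range(num_shard_requests)]
    for shard_index, hits in results_by_shard.items():
        shard_top_docs[shard_index] = hits

    # Negating the score turns "descending score" into the ascending
    # order that heapq.merge expects.
    merged = heapq.merge(*shard_top_docs, key=lambda hit: -hit[0])
    return list(itertools.islice(merged, size))
```

In this sketch, a result from shard index 2 still has a valid slot even when only two of three shards answered, which is the case an array sized by the number of results would get wrong.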
Martijn van Groningen 0efeeff49a The percolator needs to take deleted percolator documents into account when running in near realtime mode.
This bug only occurs in non-realtime mode when query, filter, facet or aggs are used.

Closes #5843
Closes #5840
2014-05-08 09:52:27 +02:00
Andrew Selden c00120b818 Fix for benchmark test
- Fix bug where repeatedly calling computeSummaryStatistics() could
  accumulate some values incorrectly.
- Fix check for number of responsive nodes on list is <= number of
  candidate benchmark nodes.
- Add public getters for summary statistics
- Add javadoc for new getters
- Add javadoc comments about API use
2014-05-07 18:42:39 -07:00
mikemccand 82aad78ff2 it's safe to use OneMerge.getTotalBytesSize (fixed in LUCENE-4775) 2014-05-07 17:25:06 -04:00
Andrew Selden f23274523a Integration tests for benchmark API.
- Randomized integration tests for the benchmark API.
- Negative tests for cases where the cluster cannot run benchmarks.
- Return 404 on missing benchmark name.
- Allow to specify 'types' as an array in the JSON syntax when describing a benchmark competition.
- Don't record slowest for single-request competitions.

Closes #6003, #5906, #5903, #5904
2014-05-07 14:14:54 -07:00
uboness fc52db1209 Changed the response structure of the percentiles aggregation: all the percentiles are now placed under a `values` object (or a `values` array if the `keyed` flag is set to `false`).
Closes #5870
2014-05-07 18:35:24 +02:00
Shay Banon 743dc19acb Node version sometimes empty in _cat/nodes
closes #5480
2014-05-07 18:08:11 +02:00
Britta Weber 7944369fd1 Add `shard_min_doc_count` parameter for significant terms similar to `shard_size`
Significant terms internally maintain a priority queue per shard with a size potentially
lower than the number of terms. This queue uses the score as criterion to determine if
a bucket is kept or not. If many terms with low subsetDF score very high
but the `min_doc_count` is set high, this might result in no terms being
returned because the pq is filled with low frequent terms which are all sorted
out in the end.

This can be avoided by increasing the `shard_size` parameter to a higher value.
However, it is not immediately clear to which value this parameter must be set
because we cannot know how many terms with low frequency are scored higher than
the high frequent terms that we are actually interested in.

On the other hand, if there is no routing of docs to shards involved, we can maybe
assume that the documents of classes and also the terms therein are distributed evenly
across shards. In that case it might be easier to not add documents to the pq that have
subsetDF <= `shard_min_doc_count` which can be set to something like
`min_doc_count`/number of shards  because we would assume that even when summing up
the subsetDF across shards `min_doc_count` will not be reached.

closes #5998
closes #6041
2014-05-07 18:02:56 +02:00
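The arithmetic behind the `shard_min_doc_count` suggestion in the commit above can be sketched as follows. Both helper names are hypothetical and this is not the aggregator's code. The sketch assumes documents are spread evenly across shards, as the commit message does, and uses integer division as one possible reading of "`min_doc_count`/number of shards".

```python
def suggested_shard_min_doc_count(min_doc_count: int, num_shards: int) -> int:
    """Hypothetical helper: split min_doc_count evenly across shards."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return min_doc_count // num_shards


def keep_on_shard(subset_df: int, shard_min_doc_count: int) -> bool:
    """Follow the commit message: terms with subsetDF <= shard_min_doc_count
    are not added to the per-shard priority queue."""
    return subset_df > shard_min_doc_count
```

For example, with `min_doc_count` of 10 and 5 shards the suggested threshold is 2, so a term seen in only 2 documents on a shard is dropped there, while a term seen in 3 documents remains a candidate.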
javanna f554178fc7 Renamed IndicesOptions#strict and IndicesOptions#lenient to make it clearer what they actually return, reused methods and introduced new one
Relates to #6059, where two new constants were introduced in IndicesOptions. There were already two constants there though, one of which we could have reused. This commit tries to unify them.
2014-05-07 17:40:57 +02:00
Alexander Reelsen 0c0f717aba Removed Index Status API
The functionality of the index status API has been replaced by the recovery API.

Relates #4854
2014-05-07 16:57:19 +02:00
Adrien Grand c49276cda7 Add a dedicated field data type for the _index field mapper.
This makes aggregations work on the _index field, and also allows to remove the
special facet aggregator for the _index field.

Close #5848
2014-05-07 14:06:13 +02:00
Adrien Grand c4f127fb6f Limit the number of bytes that can be allocated to process requests.
This should prevent costly requests from killing the whole cluster.

Close #6050
2014-05-07 12:55:48 +02:00
Adrien Grand 8cd7811955 Lower initial sizing of sub aggregations.
We currently compute initial sizings based on the cardinality of our fields.
This can be highly exaggerated for sub aggregations, for example if there is a
parent terms aggregation that is executed over a field that has a very long
tail: most buckets will only collect a couple of documents.

Close #5994
2014-05-06 17:23:34 +02:00
Adrien Grand c306d8c5f5 Don't assume fixed earth diameter in the geo-distance bounding box optimization.
We switched to Lucene's SloppyMath way of computing an approximate value of
the earth diameter given a latitude in order to compute distances, yet the
bounding box optimization of the geo distance filter still assumed a constant
earth diameter, equal to the average.

Close #6008
2014-05-06 16:20:31 +02:00
Shay Banon 44fd962a9f Improve 404 on missing scroll id
This relates to #6040, the fix is twofold, first, not handling missing context specifically in the search code, but behave the same as we do in non scroll search, where if all the shards failed, raise an exception. The second is to apply this logic in both scroll cases.
2014-05-06 15:55:42 +02:00
Shay Banon 66296de38d Remove unused dump infra
Way back when ES started, there was an idea for a dump infrastructure, but ES ended up supporting its serviceability aspects through APIs. Remove the unused code.
2014-05-06 14:02:24 +02:00
javanna a8b6f81525 Made it mandatory to specify IndicesOptions when calling MetaData#concreteIndices
Removed MetaData#concreteIndices variations that didn't require an IndicesOptions argument. Every caller should specify how indices should be resolved to concrete indices based on the indices options argument.

Closes #6059
2014-05-06 12:45:16 +02:00
Adrien Grand 90b547cf2c Remove RootMapper.validate and validate the routing key up-front.
RootMapper.validate was only used by the routing field mapper, which makes
buggy assumptions about how fields are indexed. For example, it assumes that
the index representation of a field is the same as its external representation.

Close #5844
2014-05-06 11:55:31 +02:00
Adrien Grand 589360c8b1 [TESTS] Don't randomize mappings in SimpleValidateQueryTests.
This test relies on the fact that the _id field is not indexed.
2014-05-06 11:46:31 +02:00
Adrien Grand 17a32fca03 [TEST] Random dynamic templates.
This change randomly indexes the _id field and randomizes field data formats
and loading.

Close #5834
2014-05-06 11:07:43 +02:00
Alexander Reelsen d356881664 [REST] Missing scroll id now returns 404
A bad/non-existing scroll ID used to return a 200, however a 404 might be more useful.
Also, this PR returns the right Exception (SearchContextMissingException) in the Java API.

Additionally: Added StatusToXContent interface and RestStatusToXContentListener listener, so
the appropriate RestStatus can be returned

Closes #5729
2014-05-05 17:37:26 +02:00
Shay Banon fad5e2d0e1 Remove operation threading from broadcast actions
Similar to search removal, the operation threading options are not really used, and the default should always be used. This also considerably simplifies the code.
A side effect is that we can now remove the ShardIterator#firstOrNull method, which can cause sneaky bugs.
closes #6044
2014-05-05 17:09:36 +02:00
Alexander Reelsen 799bb2491c Analyze API: Default analyzer accidentally removed stopwords
The analyze API used the standard analyzer from lucene and therefore removed
stopwords instead of using the elasticsearch default analyzer.

Closes #5974
2014-05-05 15:55:33 +02:00
Alexander Reelsen d4fcf23057 Cluster State API: Remove index template filtering
The possibility of filtering for index templates in the cluster state API
had been introduced before there was a dedicated index templates API. This
commit removes this support from the cluster state API, as it was not really
clean, requiring you to specify the metadata and the index templates.

Closes #4954
2014-05-05 14:54:14 +02:00
Shay Banon 7ce8306bc5 Remove search operation threading option
Search operation threading is an option that is not really used, and current non default implementations are flawed. Handling it also creates quite the complexity in the search handling codebase...
This is a breaking change, but one that is actually a good one, since I haven't seen/heard anybody use it, and if it's used, it's problematic...
closes #6042
2014-05-05 11:39:16 +02:00
Benjamin Devèze cea2d21c50 Fix bug in PropertyPlaceholder and add unit tests.
Close #6034
2014-05-05 10:21:18 +02:00
Adrien Grand 727e6172e3 Restore read/write visibility in PlainShardsIterator.
Change #5561 introduced a potential bug in that iterations that are performed
on a thread might not be visible to other threads due to the removal of the
`volatile` keyword.

Close #6039
2014-05-05 10:05:44 +02:00
Shay Banon 342a32fb16 Search might not return on thread pool rejection
When a thread pool rejects the execution on the local node, the search might not return.
This happens due to the fact that we move to the next shard only *within* the execution on the thread pool in the start method. If it fails to submit the task to the thread pool, it will go through the fail shard logic, but without "counting" the current shard itself. When this happens, the relevant shard will then execute more times than intended, causing the total ops counter to skew, and for example, if on another shard the search is successful, the total ops will be incremented *beyond* the expectedTotalOps, causing the check on == as the exit condition to never happen.
The fix here makes sure that the shard iterator properly progresses even in the case of rejections, and also includes improvement to when cleaning a context is sent in case of failures (which were exposed by the test).
Though the change fixes the problem, we should work on simplifying the code path considerably, the first suggestion as a followup is to remove the support for operation threading (also in broadcast), and move the local optimization execution to SearchService, this will simplify the code in different search action considerably, and will allow to remove the problematic #firstOrNull method on the shard iterator.
The second suggestion is to move the optimization of local execution to the TransportService, so all actions will not have to explicitly do the mentioned optimization.
fixes #4887
2014-05-05 09:24:53 +02:00
javanna e96e634d10 [TEST] fixed _cat/thread_pool REST tests with local transport, in case the transport port is not available and gets returned as '-'
Re-enabled REST tests suite

Closes #6033
2014-05-04 22:10:03 +02:00
mikemccand 6bc3a744a1 Fix StackOverflowException for long suggestion strings
Changed getFiniteStrings to use an iterative implementation instead of
recursive, so we don't use a Java stack-frame per character for each
suggestion at build & query time.
2014-05-04 13:35:05 -04:00
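The recursive-to-iterative change described in the commit above can be illustrated with a small, generic sketch. This is not Lucene's `getFiniteStrings`. The automaton representation (a dict of labeled transitions), the function name, and the assumption that the automaton is acyclic are all made up for illustration. The point is only that an explicit stack replaces one call-stack frame per character.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: state -> list of (label, target state).
Transitions = Dict[int, List[Tuple[str, int]]]


def finite_strings_iterative(
    transitions: Transitions, start: int, accept: Set[int]
) -> List[str]:
    """Enumerate all strings accepted by an acyclic automaton.

    Uses an explicit stack of (state, prefix) pairs instead of recursion,
    so the depth of the language call stack does not grow with the length
    of the string being built.
    """
    results: List[str] = []
    stack: List[Tuple[int, str]] = [(start, "")]
    while stack:
        state, prefix = stack.pop()
        if state in accept:
            results.append(prefix)
        for label, target in transitions.get(state, []):
            stack.append((target, prefix + label))
    return results
```

A chain of several thousand single-character transitions, which would exceed Python's default recursion limit in a naive recursive version, is handled by this loop without deep call nesting.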
Shay Banon c9f1792c81 Change default filter cache to 10% and circuit breaker to 60%
The defaults we have today in our data intensive memory structures don't add up to proper protection from potential OOM.
The circuit breaker, today at 80%, aims at protecting from extensive field data loading. The default threshold today is too permissive and can still cause OOMs.
The filter cache today is at 20%, and it's too high when adding it to other limits we have, reduce it to 10%, which is still a big enough portion of the heap, yet provides improved safety measure.
closes #5990
2014-05-04 15:38:16 +02:00
Adrien Grand 01eb01cb70 [TEST] Disable REST tests until #6033 is fixed. 2014-05-04 11:58:30 +02:00
Boaz Leskes 694bf287d6 Do not start a recovery process if the primary shard is currently allocated on a node which is not part of the cluster state
If a source node disconnects during recovery, the target node will respond by canceling the recovery. Typically the master will respond by removing the disconnected node from the cluster state, promoting another shard to become primary. This is sent to all nodes and the target node will start recovering from the new primary. However, if the drop of a node caused the node count to go below min_master_node, the master will step down and will not promote a shard immediately. When a new master is elected we may publish a new cluster state (whose point is to notify of a new master) which is not yet updated. This caused the node to start a recovery to a non existent node. Before, we aborted the recovery without cleaning up the shard, causing subsequent correct cluster states to be ignored. We should not start the recovery process but wait for another cluster state to come in.

Closes #6024
2014-05-02 23:30:24 +02:00
Alex Ksikes b55d8ed2e3 Fix behavior on default boost factor for More Like This.
A boost terms factor of 1.0 is not the same as no boosting of terms.
The desired behavior is to deactivate boosting by default. If the user
specifies any value other than 0, then boosting is activated.

Closes #6021
2014-05-02 16:59:09 +02:00
Holger Hoffstätte f5c9bf6f0f Update JNA to latest version
Updating to this version allows to configure a special JNA directory,
in case the /tmp directory is mounted with the noexec option, as JNA
extracts some data and tries to execute parts of it.

Also updated documentation to clarify mlockall and memory settings as well
as pointing to the new jna.tmpdir system property.

Closes #5493
2014-05-02 11:52:57 +02:00
Britta Weber 2e44040388 function_score parser throws exception if both functions:[] and single function given
In addition, add a special warning if the misplaced function is a "boost_factor"
function to avoid confusion of "boost" and "boost_function".

closes #5995
2014-05-02 10:53:33 +02:00
Shay Banon a557ee8daf Support empty properties array in mappings
closes #5887
2014-05-01 12:18:39 -04:00
Boaz Leskes 42a112f50b debug log of receiving a cluster state from another master could be erroneously logged
Added trace logging to MinimumMasterNodesTests.multipleNodesShutdownNonMasterNodes
2014-05-01 13:15:08 +02:00
Martijn van Groningen 9493824a0e [TEST] (RecoveryPercolatorTests) Don't stop the master node and always use the client of the master node 2014-05-01 14:06:34 +07:00
Martijn van Groningen 61093f1bd1 [TEST] Replace execute().actionGet() with get() 2014-05-01 14:06:34 +07:00
Shay Banon 23f200bc0e Use non analyzed token stream optimization everywhere
In the string type, we have an optimization to reuse the StringTokenStream on a thread local when a non analyzed field is used (instead of creating it each time). We should use this across the board on all places where we create a field with a String.
Also, move to a specific XStringField, so that we can reuse StringTokenStream instead of copying it.
closes #6001
2014-04-30 17:18:15 -04:00
Martijn van Groningen 12f43fbbc0 Fixed license headers. 2014-05-01 00:33:17 +07:00
Martijn van Groningen 013b319415 Added `reverse_nested` aggregation.
The `reverse_nested` aggregation allows to aggregate on properties outside of the nested scope of a `nested` aggregation.

Closes #5507
2014-05-01 00:23:05 +07:00
Martijn van Groningen 5a0070071a Use collectExistingBucket in GlobalOrdinalsSignificantTermsAggregator.WithHash.
Relates to #5955.
2014-04-30 23:24:33 +07:00
Matt Weber 2663d04a96 Run tests through forbidden-apis. 2014-04-30 17:48:33 +02:00
Adrien Grand 34fb5e48e2 Use collectExistingBucket in GlobalOrdinalsStringTermsAggregator.WithHash.
Relates to #5955.
2014-04-30 15:34:01 +02:00
Boaz Leskes 870bd90f54 ThreadPool.EstimatedTimeThread should be set on initialization
Some tests run before the thread is started and thus use 0 as the current time, which later on leads to big time jumps and thus failures.
Ex. InternalEngineTests.testVersioningReplicaConflict2
2014-04-30 11:47:47 +02:00
Adrien Grand b2db7c8222 Improve the way sub-aggregations are collected.
Sub-aggregations are currently collected directly, by just forwarding the
doc ID and bucket ordinal to them. This change adds the new BucketCollector
abstract class that Aggregator extends, so that we have more flexibility to
add implicit filters or buffering between an aggregator and its sub
aggregators.

Close #5975
2014-04-30 08:47:25 +02:00
Adrien Grand 2eeaa56d95 Fix setting of readerGen in BytesRefOrdValComparator on nested documents.
Sorting was broken on nested documents because the `missing(slot)` method
didn't correctly set the segment ordinal (readerGen), causing term ordinals to
be compared across segments.

Close #5986
2014-04-30 08:21:26 +02:00
Shay Banon 2076194d8f Upgrade to Jackson 2.3.3
fixes the long value bug as well...
2014-04-29 20:13:43 -04:00
Shay Banon 34302a7cc5 disable using CBOR in randomized test infra
due to a bug in CBOR handling long values (test case to verify it is included), disable using CBOR in our tests till it gets fixed
2014-04-29 19:11:12 -04:00
Martijn van Groningen dce127bcdf Added global ordinals based implementations for significant terms aggregator.
Closes #5970
2014-04-30 01:36:02 +07:00
Shay Banon a4ef418e6e Range/Term query/filter on dates fail to handle numbers properly
When providing a number (milliseconds since epoch, UTC), range and term query/filter don't handle it correctly: they convert it to a string, which is then first parsed as a date
closes #5969
2014-04-29 14:25:05 -04:00
mikemccand fb53784e3b add thread name to logger message from IndexWriter's infoStream 2014-04-29 10:50:36 -04:00
Adrien Grand 6ec01c13e5 Fix computation of the missing ord (leftover of the ordinals change). 2014-04-29 16:29:01 +02:00
Britta Weber 9d214d14fe Provide meaningful error message if field has no fielddata type
closes #5930
2014-04-29 15:19:01 +02:00
mikemccand a8d4c04fc2 include thread name when logging IndexWriter's infoStream messages 2014-04-29 05:50:13 -04:00
Adrien Grand d07c5a5c32 Aggregations parsing is too lenient.
Close #5827
2014-04-29 11:07:06 +02:00
Martijn van Groningen 8817281a70 Added AwaitsFix 2014-04-29 13:58:39 +07:00
Martijn van Groningen 0f23485a3c Cut p/c queries (has_child and has_parent queries) over to use global ordinals instead of being bytes values based.
Closes #5846
2014-04-29 12:41:04 +07:00
Martijn van Groningen fc3efda6af Cut other aggregations over to use collectExistingBucket() if a bucket ord has been hit, that already exists.
Closes #5955
2014-04-29 11:07:12 +07:00
Martijn van Groningen f3219f7098 Added global ordinals terms aggregator impl that is optimized for low cardinality fields.
Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during post collect.
On fields with not so many unique values, that can reduce the number of global ordinals significantly.

Closes #5895
Closes #5854
2014-04-29 11:04:03 +07:00
Matt Weber 4df4506875 Use URI vs URL accessing File from classpath.
URL escapes special characters such as spaces which
causes the resource to not be found when used to create
a File object.  Use URI.

Closes #5915
2014-04-28 18:49:55 +02:00
javanna 51ba3ca220 [TEST] made sure nodeSettings method gets called for every node type, not only data nodes in case numDataNodes is specified.
This fixes a test ZenUnicastDiscoveryTests when running in network mode
2014-04-28 18:31:47 +02:00
javanna a414e4f2f3 [TEST] randomly introduced a client node within test cluster
The default number of client nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`.

In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`:

- ensureAtLeastNumNodes to ensureAtLeastNumDataNodes
- ensureAtMostNumNodes to ensureAtMostNumDataNodes
- stopRandomNode to stopRandomDataNode

and the following ones in `ElasticsearchIntegrationTest`:

- allowNodes to allowDataNodes
- dataNodes to numDataNodes.
- @ClusterScope#numNodes to numDataNodes
- @ClusterScope#minNumNodes to minNumDataNodes
- @ClusterScope#maxNumNodes to maxNumDataNodes

Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes.

Adapted existing tests to successfully run although there's a node client around.

Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info.

Closes #5949
2014-04-28 16:31:36 +02:00
Martijn van Groningen 17a5575757 Disabled parent/child queries in the delete by query api.
It wasn't properly implemented and could lead to a shard being failed and not able to recover.

Closes #5828 #5916
2014-04-28 20:12:54 +07:00
Adrien Grand 22cbdd930c [TEST] Fix test bug in MultiOrdinalsTests. 2014-04-28 13:56:01 +02:00
Robert Muir 8e0a479316 Upgrade to Lucene 4.8
Closes #5932
2014-04-28 06:45:50 -04:00
Chris Earle 5528370e24 Added type, max, min, queueSize & keepAlive to _cat/thread_pool
Closes #5366
2014-04-28 12:00:27 +02:00
Simon Willnauer f285ffc610 Multi value handling in decay functions
Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.

Relates to #3960
closes #5940
2014-04-28 11:55:32 +02:00
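The multi-value handling described in the commit above can be sketched as follows for a one-dimensional numeric field. This is not the function_score implementation. The function name, the `mode` strings, and the use of absolute difference as the distance are assumptions for illustration. The sketch only shows the step where all distances to the origin are computed and then reduced with one of min/max/sum/avg before any decay function is applied.

```python
from typing import Sequence


def combine_distances(values: Sequence[float], origin: float, mode: str) -> float:
    """Reduce the distances of all field values to the origin into one number.

    Earlier behaviour, per the commit message, corresponds to looking only
    at values[0]; here every value contributes.
    """
    if not values:
        raise ValueError("field has no values")
    distances = [abs(value - origin) for value in values]
    if mode == "min":
        return min(distances)
    if mode == "max":
        return max(distances)
    if mode == "sum":
        return sum(distances)
    if mode == "avg":
        return sum(distances) / len(distances)
    raise ValueError(f"unsupported mode: {mode}")
```

With values `[1.0, 4.0, 7.0]` and origin `5.0`, the distances are 4, 1 and 2, so `min` selects the value closest to the origin while `max` selects the farthest one.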
Britta Weber f993945e5c Move SortMode to org.elasticsearch.search and rename to MultiValueMode 2014-04-28 11:55:32 +02:00