Commit Graph

5507 Commits

Author SHA1 Message Date
Simon Willnauer f8cf5ae7e9 Added Random test for ShardAllocator.
This test will make random allocation decisions on a growing and
shrinking cluster, leading to a random distribution of the shards.
After a certain number of iterations the test allows allocation unless
the same shard is already allocated on a node, and balances the cluster
to gain optimal balance.
2013-08-30 10:11:45 +02:00
Martijn van Groningen a7b2b7847a Use atomic collections to make sure all of the memory contents are visible from writing to reading thread. 2013-08-30 00:06:45 +02:00
Simon Willnauer 7113731022 Execute listeners on current thread if threadpool is shutting down 2013-08-29 16:40:38 +02:00
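A minimal sketch of the fallback this commit describes, assuming a plain `java.util.concurrent.ExecutorService` in place of Elasticsearch's own thread pool. The class and method names are hypothetical and are not taken from the commit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical helper, not the code from the commit.
final class ListenerDispatchSketch {
    /** Runs the listener on the pool; returns true if it ran on the calling thread instead. */
    static boolean dispatch(ExecutorService pool, Runnable listener) {
        try {
            pool.execute(listener);
            return false;
        } catch (RejectedExecutionException e) {
            if (!pool.isShutdown()) {
                throw e; // rejected for another reason, e.g. a saturated queue
            }
            listener.run(); // pool is shutting down: notify on the current thread
            return true;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.shutdown();
        boolean inline = dispatch(pool, () -> System.out.println("listener notified"));
        System.out.println("ran on caller thread: " + inline);
    }
}
```

Running the listener inline means a caller waiting on it is still notified even though the pool no longer accepts work.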
Simon Willnauer f408673bd4 Force default codec in tests where we rely on the latest lucene defaults 2013-08-29 16:40:38 +02:00
Nik Everett 10e55bd3ef Recheck cutoffScore during phrase_suggest merge.
The goal is to throw out suggestions that only meet the cutoff in some
shards.  This will happen if your input phrase is only contained in a
few shards.  If your shards are unbalanced this rechecking can throw out
good suggestions.

Closes #3547.
2013-08-29 16:40:38 +02:00
Martijn van Groningen 76939b82d3 Removed unnecessary catch clause.
Improved logging
2013-08-29 16:00:59 +02:00
Boaz Leskes ffd019c07e Logging shard level multi percolate errors under debug rather than trace 2013-08-29 14:15:38 +02:00
Luca Cavanna 649108656d Added logging and asserts to make it easier to understand potential failures 2013-08-29 14:10:49 +02:00
Simon Willnauer 8b617fb48f Convert relative paths to URI first in tests to prevent URL encoded paths
If a path to the test classes contains spaces Hunspell tests failed due
to URL encoded paths. This happens on CI builds if you give the build
a name containing a space. This is fixed by first converting to a URI
and creating a File object from the URI directly.
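The difference can be sketched with the JDK alone. The class name, helper names, and the directory name below are made up for illustration, and the directory does not need to exist.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical illustration, not the test code from the commit.
final class UriPathSketch {
    /** Builds the File from the URL's raw path, which keeps escapes such as %20. */
    static File viaUrlPath(URL url) {
        return new File(url.getPath());
    }

    /** Converts to a URI first, so File(URI) decodes the escapes. */
    static File viaUri(URL url) {
        try {
            return new File(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Turns a File into a file: URL, escaping spaces on the way. */
    static URL toUrl(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File dir = new File("ci build/test classes").getAbsoluteFile();
        URL url = toUrl(dir);
        System.out.println(viaUrlPath(url).getPath()); // still contains %20
        System.out.println(viaUri(url).getPath());     // spaces restored
    }
}
```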
2013-08-29 12:56:33 +02:00
Martijn van Groningen 25c1b93d57 Ignore AlreadyExpiredException when indexing a document on a replica.
The document should just be indexed to stay consistent with the primary shard.
2013-08-29 11:38:13 +02:00
Martijn van Groningen a1f2f44eef Make sure 2 writes happened, so we can check if 2 deletes happen that have been triggered by ttl 2013-08-29 11:27:01 +02:00
Adrien Grand 5b6be0c456 Use a separate build directory for Eclipse.
The fact that Maven and Eclipse share the same build directories can trigger
race conditions when both are trying to build at the same time, eg. if you run
`mvn clean test` while Eclipse is up and running: Eclipse will notice that some
class files are missing and start compiling in parallel with Maven.
2013-08-29 10:29:26 +02:00
Simon Willnauer 11ff90420a Fixed typo 2013-08-29 10:27:20 +02:00
Boaz Leskes e807c99f27 Fixed a typo in the config of light finnish stemmer (old last_finish is still supported for backward compatibility)
Closes #3594
2013-08-29 10:15:40 +02:00
Clinton Gormley 822043347e Migrated documentation into the main repo 2013-08-29 01:24:34 +02:00
Simon Willnauer b9558edeff Prevent allocation algorithm from prematurely exiting balance operations
In a special case, allocations from the "heaviest" to the "lighter" nodes are not possible
due to some allocation decider restrictions like zone awareness, for instance if one zone has
fewer nodes than another zone, so that one zone is horribly overloaded from a balance perspective but we
can't move shards to the "lighter" nodes since otherwise the zone would go over capacity.

Closes #3580
2013-08-28 23:31:35 +02:00
Simon Willnauer 82397c0554 Treat empty preference as a `not set` in Plain Operation Routing
An empty preference was causing an AIOOB exception in
PlainOperationRouting. We now check for `null` or `empty` instead of
just `null`
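A minimal sketch of the check, with hypothetical names and return values. The source does not say where the exception was raised; reading the first character of the preference is just one way an empty string can run out of bounds.

```java
// Hypothetical illustration, not the PlainOperationRouting code.
final class PreferenceSketch {
    /** An empty preference counts as "not set", the same as null. */
    static boolean isSet(String preference) {
        return preference != null && !preference.isEmpty();
    }

    static String route(String preference) {
        if (!isSet(preference)) {
            return "default";
        }
        // Safe only because the empty string was ruled out above.
        return preference.charAt(0) == '_' ? "special:" + preference : "custom:" + preference;
    }

    public static void main(String[] args) {
        System.out.println(route(null));
        System.out.println(route(""));
        System.out.println(route("_primary"));
    }
}
```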

Closes #3591
2013-08-28 20:57:39 +02:00
Boaz Leskes 86cb76a0ce Added a status method to CountResponse that resolves shard failures into a REST status code. That method is now used in RestCountAction to return the proper status.
Closes #3585
2013-08-28 20:21:22 +02:00
Shay Banon db11c30dd5 batch failed shards into a single cluster state event
make sure we process as many failed shard events as possible within a single cluster state event callback (similar to what we do with started shards)
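The batching idea can be sketched as draining a queue of pending failures in one pass. This is an assumption-laden illustration: the queue, the string shard ids, and the update counter are all hypothetical and stand in for the real cluster state machinery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of batching, not the cluster service code.
final class FailedShardBatchSketch {
    private final BlockingQueue<String> failedShards = new LinkedBlockingQueue<>();
    private int clusterStateUpdates = 0;

    /** Records a failure; nothing is processed yet. */
    void shardFailed(String shardId) {
        failedShards.add(shardId);
    }

    /** Handles every pending failure in one update instead of one update per failure. */
    List<String> processPending() {
        List<String> batch = new ArrayList<>();
        failedShards.drainTo(batch);
        if (!batch.isEmpty()) {
            clusterStateUpdates++;
        }
        return batch;
    }

    int clusterStateUpdates() {
        return clusterStateUpdates;
    }

    public static void main(String[] args) {
        FailedShardBatchSketch sketch = new FailedShardBatchSketch();
        sketch.shardFailed("[index][0]");
        sketch.shardFailed("[index][1]");
        System.out.println(sketch.processPending() + " in " + sketch.clusterStateUpdates() + " update");
    }
}
```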
2013-08-28 15:29:16 +02:00
Adrien Grand b63af53313 Make BytesValues documentation clearer about BytesRef ownership. 2013-08-28 15:24:51 +02:00
Martijn van Groningen e6f014bd62 no need for flush/optimize 2013-08-28 14:56:26 +02:00
Shay Banon 25d42e5caf Don't recover a shard if it just failed on this node and wasn't reassigned to this node by master yet.
additional places where we should track failed shards, and call clean as part of the top level calls
2013-08-28 14:12:07 +02:00
Igor Motov ed2740a50a Don't recover a shard if it just failed on this node and wasn't reassigned to this node by master yet.
When recovery of a shard fails on a node, the node sends notification to the master with information about the failure. During the period between the shard failure and the time when notification about the failure reaches the master, any changes in shard allocations can trigger the node with the failed shard to allocate this shard again. This allocation (especially if successful) creates a ripple effect of the shard going through failure/started states in order to match the delayed states processed by master. Under certain conditions, a node involved in this process might generate warning messages: "marked shard as started, but shard has not been created, mark shard as failed".

This fix makes sure that nodes keep track of failed shard allocations and will not try to allocate such shards repeatedly while waiting for the failure notification to be processed by master.
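The bookkeeping described above might look roughly like this. It is a sketch under stated assumptions: the class, its method names, and the use of string shard ids are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration, not the code from the commit.
final class FailedShardTrackerSketch {
    private final Set<String> failedLocally = ConcurrentHashMap.newKeySet();

    /** Remember a shard whose recovery just failed on this node. */
    void onLocalFailure(String shardId) {
        failedLocally.add(shardId);
    }

    /** A shard that failed here is not retried until the master has caught up. */
    boolean shouldRecover(String shardId) {
        return !failedLocally.contains(shardId);
    }

    /** Called once the master has processed the failure and reassigned the shard. */
    void onMasterProcessedFailure(String shardId) {
        failedLocally.remove(shardId);
    }

    public static void main(String[] args) {
        FailedShardTrackerSketch tracker = new FailedShardTrackerSketch();
        tracker.onLocalFailure("[index][0]");
        System.out.println(tracker.shouldRecover("[index][0]"));
        tracker.onMasterProcessedFailure("[index][0]");
        System.out.println(tracker.shouldRecover("[index][0]"));
    }
}
```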
2013-08-28 14:12:07 +02:00
Britta Weber 513c761aee ElasticsearchAssertions.assertHitCount also checks shard failures 2013-08-28 10:47:39 +02:00
Martijn van Groningen 60ac34ff3a Added `_name` support to queries.
This extends the named filter support from only filters to also queries.

Closes #3581
2013-08-28 10:42:53 +02:00
Boaz Leskes 45d4864021 Refreshing after green so also recovering replicas will be refreshed. 2013-08-28 08:44:10 +02:00
Simon Willnauer e6dcd137a6 AwarenessAllocationTests must extend ESTestCase in order to respect AwaitsFix annotations 2013-08-27 19:28:20 +02:00
Lee Hinman 9d5868904b Recover small files (< 1mb) using a separate threadpool from large files.
Fixes #3576
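A minimal sketch of routing files to one of two pools by size. The pool sizes, the reading of "1mb" as 1024 * 1024 bytes, and all names are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration, not the recovery code from the commit.
final class RecoveryPoolsSketch {
    static final long SMALL_FILE_BYTES = 1024L * 1024L; // assumed meaning of "1mb"

    final ExecutorService smallFiles = Executors.newFixedThreadPool(2); // assumed size
    final ExecutorService largeFiles = Executors.newFixedThreadPool(2); // assumed size

    /** Small files get their own pool so they do not queue behind large transfers. */
    ExecutorService poolFor(long fileLength) {
        return fileLength < SMALL_FILE_BYTES ? smallFiles : largeFiles;
    }

    void shutdown() {
        smallFiles.shutdown();
        largeFiles.shutdown();
    }

    public static void main(String[] args) {
        RecoveryPoolsSketch pools = new RecoveryPoolsSketch();
        System.out.println(pools.poolFor(512) == pools.smallFiles);
        System.out.println(pools.poolFor(5L * 1024 * 1024) == pools.largeFiles);
        pools.shutdown();
    }
}
```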
2013-08-27 11:01:54 -06:00
Britta Weber 1ab037d4d0 Fix searching while shard is being relocated
Shard relocation caused queries to fail in the fetch phase either with an
`AlreadyClosedException` or a `SearchContextMissingException`.
This was caused by the `SearchService` releasing the `SearchContext`s via the
superfluous call of `releaseContextsForShard()` when the
2013-08-27 18:30:00 +02:00
Simon Willnauer e7ff8ea509 Added unit tests for zone awareness 2013-08-27 18:05:42 +02:00
Luca Cavanna d5b2c8e82f Fixed extraction of site plugins downloaded from github, so that we skip the top-level folder and we place the files directly under the _site folder
Closes #3551
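The path mapping can be sketched as follows, assuming every archive entry sits under a single top-level folder. The method and the sample entry names are hypothetical.

```java
// Hypothetical illustration, not the plugin manager code.
final class SitePluginPathSketch {
    /**
     * Maps an archive entry name to its target under _site, dropping the
     * top-level folder. Returns null for the top-level folder entry itself.
     */
    static String targetPath(String entryName) {
        int slash = entryName.indexOf('/');
        if (slash < 0) {
            return "_site/" + entryName; // no folder to strip
        }
        if (slash == entryName.length() - 1) {
            return null; // the top-level folder entry, nothing to extract
        }
        return "_site/" + entryName.substring(slash + 1);
    }

    public static void main(String[] args) {
        System.out.println(targetPath("plugin-master/"));
        System.out.println(targetPath("plugin-master/index.html"));
        System.out.println(targetPath("plugin-master/css/style.css"));
    }
}
```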
2013-08-27 17:03:55 +02:00
Boaz Leskes fa067e058a MultiTermVectorsAction was wrongly registered under mget 2013-08-27 13:46:33 +02:00
Simon Willnauer 39040b5f17 Add more tests for Zone Awareness 2013-08-27 12:25:35 +02:00
Simon Willnauer 6b8dd0b08f Added log info on test finished 2013-08-27 12:03:49 +02:00
Luca Cavanna 6e19ca8080 Fixed order of parameters when calling byteSizeField and timeValueField methods (introduced with #3432 - support for human readable flag) 2013-08-27 11:48:20 +02:00
Martijn van Groningen 45699bae5a Make sure preference isn't null 2013-08-27 11:05:12 +02:00
Adrien Grand db46946d16 Configurable sort order for missing string values.
This commit allows for configuring the sort order of missing values in BytesRef
comparators (used for strings) with the following options:
 - _first: missing values will appear in the first positions,
 - _last: missing values will appear in the last positions (default),
 - <any value>: documents with missing sort value will use the given value when
   sorting.

Since the default is _last, sorting by string value will have a different
behavior than in previous versions of elasticsearch which used to insert missing
values in the first positions when sorting in ascending order.
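The three options can be illustrated with plain JDK comparators. This is a sketch for ascending order only, with `null` standing in for a missing value; it does not use the BytesRef comparators from the commit, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the option semantics, ascending order only.
final class MissingSortSketch {
    static Comparator<String> ascending(String missing) {
        if ("_first".equals(missing)) {
            return Comparator.nullsFirst(Comparator.<String>naturalOrder());
        }
        if ("_last".equals(missing)) {
            return Comparator.nullsLast(Comparator.<String>naturalOrder());
        }
        // Any other value: missing entries sort as if they held that value.
        return Comparator.comparing((String v) -> v == null ? missing : v);
    }

    static List<String> sorted(List<String> values, String missing) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(ascending(missing));
        return copy;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("b", null, "a");
        System.out.println(sorted(values, "_first")); // [null, a, b]
        System.out.println(sorted(values, "_last"));  // [a, b, null]
        System.out.println(sorted(values, "ab"));     // [a, null, b]
    }
}
```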

Implementation notes:
 - Nested sorting is supported through the implementation of
   NestedWrappableComparator,
 - BytesRefValComparator was mostly broken since no field data implementation
   was using it, it is now tested through NoOrdinalsStringFieldDataTests,
 - Specialized BytesRefOrdValComparators have been removed now that the ordinals
   rely on packed arrays instead of raw arrays,
 - Field data tests hierarchy has been changed so that the numeric tests don't
   inherit from the string tests anymore,
 - When _first or _last is used, internally the comparators are told to use
   null or BytesRefFieldComparatorSource.MAX_TERM to replace missing values
   (depending on the sort order),
 - BytesRefValComparator just replaces missing values with the provided value
   and uses them for comparisons,
 - BytesRefOrdValComparator multiplies ordinals by 4 so that it can find
   ordinals for the missing value and the bottom value which are directly
   comparable with the segment ordinals. For example, if the segment values and
   ordinals are (a,1) and (b,2), they will be stored internally as (a,4) and
   (b,8) and if the missing value is 'ab', it will be assigned 6 as an ordinal,
   since it is between 'a' and 'b'. Then if the bottom value is 'abc', it will
   be assigned 7 as an ordinal since it is between 'ab' and 'b'.
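
One way to reproduce the numbers in the ordinal example is sketched below. It assumes 1-based segment ordinals and plain `String` comparison in place of BytesRef ordering; the class and method names are hypothetical and the real comparator may work differently.

```java
import java.util.Arrays;

// Hypothetical illustration of the "multiply ordinals by 4" idea.
final class ScaledOrdinalSketch {
    /** Scaled ordinal of a value: 4 * ord if present, else midway into the gap. */
    static long scaledOrd(String[] sortedTerms, String value) {
        int idx = Arrays.binarySearch(sortedTerms, value);
        if (idx >= 0) {
            return 4L * (idx + 1); // segment ordinals are assumed to start at 1
        }
        int smallerTerms = -idx - 1; // insertion point = number of smaller terms
        return 4L * smallerTerms + 2;
    }

    /** Scaled ordinal of the bottom value, kept comparable with the missing value. */
    static long bottomOrd(String[] sortedTerms, String missing, String bottom) {
        long ord = scaledOrd(sortedTerms, bottom);
        if (ord == scaledOrd(sortedTerms, missing)) {
            // Same slot as the missing value: shift one step to either side.
            ord += Integer.signum(bottom.compareTo(missing));
        }
        return ord;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "b"};
        System.out.println(scaledOrd(terms, "a"));         // 4
        System.out.println(scaledOrd(terms, "b"));         // 8
        System.out.println(scaledOrd(terms, "ab"));        // 6
        System.out.println(bottomOrd(terms, "ab", "abc")); // 7
    }
}
```

The factor of 4 leaves three free slots between neighbouring segment ordinals, enough to place both the missing value and the bottom value in the same gap.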

Closes #896
2013-08-27 10:46:21 +02:00
Martijn van Groningen 2c939847b4 Simplified percolate reduce logic and the percolator recovery test 2013-08-27 10:45:09 +02:00
Martijn van Groningen 28e1744e79 No test is simple 2013-08-26 18:01:12 +02:00
Martijn van Groningen 29626dd201 Rename multi percolate item actions:
count_percolate -> count
percolate_existing_doc -> percolate
count_percolate_existing_doc -> count

If the header contains an `id` field, then it will automatically percolate an existing document.
2013-08-26 18:00:36 +02:00
Martijn van Groningen 3ca0239668 Added highlighter to percolate api.
The highlighter in the percolate api highlights snippets in the document being percolated. If highlighting is enabled then for each matching query, highlight snippets will be generated.
All highlight options that are supported via the search api are also supported in the percolate api, since the percolate api embeds the same highlighter infrastructure as the search api.
The `size` option is required if highlighting is specified in the percolate api. Other than that, the `highlight` request part can just be placed in the percolate api request body.

Closes #3574
2013-08-26 16:37:07 +02:00
Martijn van Groningen df3922a22a Small cleanup 2013-08-26 15:42:19 +02:00
Martijn van Groningen 35dcdb0b5a Only use the client variable and don't use client from client() in percolator test 2013-08-26 15:37:18 +02:00
Shay Banon 8b295b53d0 Improve refresh logic when replica moves to started
closes #3573
2013-08-26 15:15:01 +02:00
Shay Banon b329943632 improve search while create test
- improve the test to be more re-creatable
- have tests for various number of replica counts, to check if failures are caused by searching on replicas that might not have been refreshed yet
- improve test to test explicit index creation, and index creation caused by index operation
- have an initial search go to _primary, to check if the failure happens when searching on a replica because it missed a refresh
2013-08-26 14:29:40 +02:00
Martijn van Groningen 9f2e615ed9 Added extra logging 2013-08-26 09:48:04 +02:00
Boaz Leskes a96ecea653 Multi term vector request
--------------------------

This feature allows retrieving [term vectors](https://github.com/elasticsearch/elasticsearch/issues/3114) for a list of documents. The JSON request has exactly the same [format](https://github.com/elasticsearch/elasticsearch/issues/3484) as the ```_termvectors``` endpoint.

To use it, call

```
curl -XGET 'http://localhost:9200/index/type/_mtermvectors' -d '{
    "fields": [
        "field1",
        "field2",
        ...
    ],
    "ids": [
        "docId1",
        "docId2",
        ...
    ],
    "offsets": false|true,
    "payloads": false|true,
    "positions": false|true,
    "term_statistics": false|true,
    "field_statistics": false|true
}'

```

The return format is an array, each entry of which contains the term vector response for one document:

```
{
   "docs": [
      {
         "_index": "index",
         "_type": "type",
         "_id": "docId1",
         "_version": 1,
         "exists": true,
         "term_vectors": {
         ...
         }
      },
      {
         "_index": "index",
         "_type": "type",
         "_id": "docId2",
         "_version": 1,
         "exists": true,
         "term_vectors": {
         ...
         }
      }
   ]
}
```

Note that, like term vectors, the multi term vectors request will silently skip over documents that have no term vectors stored in the index and will simply return an empty response in this case.

Closes #3536
2013-08-26 09:25:21 +02:00
Boaz Leskes 18c71b16b5 Refactor term vector api
This is necessary to allow adding a multi term vector request
2013-08-26 09:25:21 +02:00
Martijn van Groningen a09f217b45 Don't reduce twice if failure occurs. 2013-08-25 21:50:04 +02:00
Simon Willnauer 020e68f2a0 Make sure all shards are refreshed during recovery test 2013-08-24 09:08:39 +02:00