Commit Graph

5860 Commits

Author SHA1 Message Date
Simon Willnauer e416ed2426 Don't pass indexing buffer side to the translog 2015-01-09 23:45:27 +01:00
Ryan Ernst 4cda543637 Tests: Add logic to handle static index upgrade case where index is
already on latest version.

See #9207
2015-01-09 12:40:44 -08:00
Robert Muir d226a973f7 core: upgrade to lucene 5 r1650327.
refactor _version docvalues migration to be more efficient.

closes #9206
2015-01-09 12:12:31 -05:00
Colin Goodheart-Smithe 91e00c6c8e Aggregations: Numeric metric aggregations are now formattable
You can now specify `format` in the request definition for most numeric metric aggregations. The exceptions are Percentile_Ranks, Cardinality and Value_Count as the response type of these can be different from the field type so the formatter won't work.

Closes #6812
2015-01-09 16:10:58 +00:00
Simon Willnauer d2277d70ff [ENGINE] Simplify Engine construction and ref counting
Today the internal engine closes itself it the engine hits an exception
it can not recover from. This complicates a lot of refcounting issues
if such an exception happens during engine creation. This commit
only markes the engine as failed and let the user close it once the exception
bubbles up. Additionally it rolls back the indexwriter to prevent any changes after
the engine is failed.
2015-01-09 02:04:16 +01:00
Martijn van Groningen 592f517583 Serialize the rest status code, not the rest status enum. 2015-01-08 23:58:46 +01:00
David Pilato 6d58db8868 Mapping With a `null` Default Timestamp Causes NullPointerException on Merge
I have a field with a `null` [default `_timestamp` value](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html#mapping-timestamp-field-default) and when I try to update the mapping I get a server error caused by a `NullPointerException`

```
[2015-01-08 17:28:56,040][DEBUG][action.admin.indices.mapping.put] [...] failed to put mappings on indices [[feed_170_v1, feed_204_v1, feed_229_v1, feed_232_v1, feed_239_v1, feed_248_v1, feed_268_v1, feed_256_v1, feed_272_v1, feed_159_v1, feed_255_v1, feed_164_v1, feed_259_v1, feed_266_v1, feed_188_v1, feed_240_v1, feed_233_v1, feed_13_v1, feed_184_v1, feed_261_v1, feed_267_v1, feed_271_v1, feed_257_v1, feed_172_v1, feed_238_v1, feed_254_v1, feed_223_v1, feed_274_v1, feed_203_v1, feed_269_v1, feed_262_v1, feed_205_v1, feed_168_v1, feed_219_v1, feed_253_v1, feed_251_v1, feed_173_v1, feed_252_v1, feed_210_v1, feed_216_v1, feed_218_v1, feed_118_v1, feed_273_v1, feed_227_v1, feed_166_v1, feed_213_v1, feed_226_v1]], type [history]
java.lang.NullPointerException
        at org.elasticsearch.index.mapper.internal.TimestampFieldMapper.merge(TimestampFieldMapper.java:287)
        at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:936)
        at org.elasticsearch.index.mapper.DocumentMapper.merge(DocumentMapper.java:693)
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:508)
        at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
```

https://github.com/elasticsearch/elasticsearch/blob/v1.4.2/src/main/java/org/elasticsearch/index/mapper/internal/TimestampFieldMapper.java#L286

Looks like the existence of default timestamp is not checked before use. The next line also has the same issue -- uses of default timestamp without checked to see if it's not null.

To reproduce:

```
$ curl -XPUT localhost:9200/twitter2

$ curl -XPUT localhost:9200/twitter2/tweet/_mapping -d '{
     "tweet" : {
         "_timestamp" : {
             "enabled" : true,
             "default" : null
         }
     }
}'

$ curl -XPUT localhost:9200/twitter2/tweet/_mapping -d '{
     "tweet" : {
         "_timestamp" : {
             "enabled" : true,
             "default" : null
         },
         "properties": {
             "user": {"type": "string"}
         }
     }
}'
```

Closes #9204.

(cherry picked from commit 62c6d63)
2015-01-08 21:33:17 +01:00
Ryan Ernst 7f9ffea97c Tests: Add upgrade step to static bwc tests 2015-01-08 11:53:48 -08:00
Ryan Ernst 060f963a8e Mappings: Remove allow_type_wrapper setting
Before Elasticsearch 1.0, the type was allowed to be passed as the root
element when uploading a document.  However, this was ambiguous if the
mappings also contained a field with the same name as the type.  The
behavior was changed in 1.0 to not allow this, but a setting was added
for backwards compatibility.  This change removes the setting for 2.0.
2015-01-08 09:13:40 -08:00
Martijn van Groningen ca4f27f40e Core: Added `_shards` header to all write responses.
The header indicates to how many shard copies (primary and replicas shards) a write was supposed to go to, to how many
shard copies to write succeeded and potentially captures shard failures if writing into a replica shard fails.

For async writes it also includes the number of shards a write is still pending.

Closes #7994
2015-01-08 18:10:08 +01:00
Ryan Ernst 1ad64a97ec Mappings: Remove includeExisting flag from adding ObjectMapper and
FieldMapper listeners

This flag is unused by the 2 places that add these listeners.
2015-01-08 09:08:54 -08:00
Simon Willnauer 959e3ca9da [CORE] Fold engine into IndexShard
This commit removes most of the Engine abstractions and removes
Engine exposure via dependency injection. It also removes the Holder
abstraction and makes the engine itself start at constrcution time.
It removes the start method from the engine entire which means no engine
instances exists until it's started. There is also no way to stop the
engine to restart, it needs to be an entire new Engine
2015-01-08 17:48:27 +01:00
Martijn van Groningen dedaf9387e Core: Also check if indices resolved via aliases resolution aren't closed and deal with this according to IndicesOptions.
Closes #9057
2015-01-08 16:45:34 +01:00
Martijn van Groningen b0b61ee0c3 Renamed allowNoIndices to failNoIndices and changed parameter order. 2015-01-08 16:43:56 +01:00
Simon Willnauer 78fc7c3f01 [TEST] Ensure shard lock is acquired before we try the timeout version 2015-01-08 15:37:31 +01:00
Colin Goodheart-Smithe ecfe72ebcc Indices API: Fix to make GET Index API consistent with docs
This fix ensures that calls to the GET alias/mappings/settings/warmers APIs return the aliases/mappings/settings/warmers object even if there is no content within them.. This make them consistent with the GET Index API docs and the breaking changes in 1.4 docs

Closes #9148
2015-01-08 08:48:44 +00:00
Boaz Leskes ad66d25fa2 Test: trace logging for testDeleteSafe 2015-01-07 23:16:19 +01:00
Simon Willnauer acf6132a99 Fix 1.5.0 Lucene version constant 2015-01-07 22:13:12 +01:00
Simon Willnauer 77493762e2 [TEST] Add back MockIndexEngine
This test class was lost accidentially in a8fa650
2015-01-07 16:38:21 +01:00
Adrien Grand fda727e20c Internal: assert that we do not call blocking code from transport threads.
This currently only adds checks to BaseFuture, but this should already cover
lots of client code. We could add more in the future, like interactions with
the filesystem and so on.

Close #9164
2015-01-07 14:08:40 +01:00
Martijn van Groningen 20f7be378b Removed parent parameter from update request, because it is just sets the routing.
The routing option should be used instead. The parent a child document points to can't be updated.

Closes #4538
2015-01-07 10:26:20 +01:00
Martijn van Groningen 687be70736 Made sure that named filters and queries defined in a wrapped query and filter are not lost.
Closes #6871
2015-01-07 09:10:16 +01:00
Martijn van Groningen c94d056454 Fixed a bug that was caused by specifying routing on a multi percolate request causing an ArrayIndexOutOfBoundsException.
The multi percolate shard responses are collected in an atomic array which uses the shard id is used as index, but the number of shards the multi percolate request was meant to go to was used as size of this array instead the total number of shards an index has. This caused the exception when routing was used.

Closes #6214
2015-01-07 08:49:25 +01:00
Simon Willnauer 7ec8973fbc [CORE] Delete shard content under lock
Once we delete the the index on a node we are closing all resources
and subsequently need to delete all shards contents from disk. Yet
this happens today under a lock (the shard lock) that needs to be
acquried in order to execute any operation on the shards data
path. We try to delete all the index meta-data once we acquired
all the shard lock but this operation can run into a timeout which causes
the index to remain on disk. Further, all shard data will be left on
disk if the timeout is reached.

This commit removes all the shards data just before the shard lock
is release as the last operation on a shard that belongs to a deleted
index.
2015-01-06 21:53:28 +01:00
Ryan Ernst f7f99b8dbf Stats: Added verbose option to segments api, with full ram tree as first
additional element per segment.

This commit adds a verbose flag to the _segments api.  Currently the
only additional information returned when set to true is the full
ram tree from lucene for each segment.
2015-01-06 10:04:52 -08:00
Adrien Grand bc86796592 Core: Remove terms filter cache.
This is our only cache which is not 'exact' and might allow for stalled results.
Additionally, a similar cache that we have and needs to perform lookups in other
indices in order to run queries is the script index, and for this index we rely
on the filesystem cache, so we should probably do the same with terms filters
lookups.

Close #9056
2015-01-06 17:21:20 +01:00
Simon Willnauer 236e2491b4 [ALLOCATION] Remove primary balance factor
The `cluster.routing.allocation.balance.primary` setting has caused
a lot of confusion in the past while it has very little benefit form a
shard allocatioon point of view. Users tend to modify this value to
evently distribute primaries across the nodes which is dangerous since
a prmiary flag on it's own can trigger relocations. The primary flag for a shard
is should not have any impact on cluster performance unless the high level feature
suffereing from primary hotspots is buggy. Yet, this setting was intended to be a
tie-breaker which is not necessary anymore since the algorithm is deterministic.

This commit removes this setting entriely.
2015-01-06 16:43:39 +01:00
Robert Muir 8948b489d6 core: Populate metadata.writtenBy for pre 1.3 index files.
Today this not populated (null) in these cases. But it would be useful to have
    this available, even just for improved error messages.

    The trickiest part today is the handling of 1.2.x files written with
    lucene 4.8+ which have both ES checksums and lucene ones. This is now simplified:
    when the lucene checksum is there, we always use it.

Closes #9152
2015-01-06 09:54:39 -05:00
Simon Willnauer 4900f52619 [ALLOCATION] Weight deltas must be absolute deltas
In some situations the shard balanceing weight delta becomes negative. Yet,
a negative delta is always treated as `well balanced` which is wrong. I wasn't
able to reproduce the issue in any way other than useing the real world data
from issue #9023. This commit adds a fix for absolute deltas as well as a base
test class that allows to build tests or simulations from the cat API output.

Closes #9023
2015-01-06 15:48:44 +01:00
Martijn van Groningen ca68136628 Made the `nested`, `reverse_nested` and `children` aggs ignore unmapped nested fields or unmapped child / parent types.
Closes #8760
2015-01-06 12:14:15 +01:00
Adrien Grand 4cb23a0520 Search: Fix paging on strings sorted in ascending order.
For the comparator to work correctly, we need to give it the same value in
`setTopValue` as the value that it gave back in `value`.

Close #9136
2015-01-06 11:48:05 +01:00
Adrien Grand 999bec1243 Make DistributorDirectory not call fsync on sub directories and missing files.
Related to #9145
2015-01-06 11:15:42 +01:00
Boaz Leskes 9090e0381f Internal: AdapterActionFuture should not set currentThread().interrupt()
If someone blocks on it and it is interrupted, we throw an ElasticsearchIllegalStateException. We should not set Thread.currentThread().interrupt(); in this case because we already communicate the interrupt through an exception.

Similar to #9001

Closes #9141
2015-01-06 11:03:27 +01:00
Adrien Grand cd04851206 Recovery: RecoveryTarget does not fsync the right file name.
Close #9144
2015-01-06 09:24:18 +01:00
Adrien Grand 90f98579a2 Upgrade to Lucene 5.0.0-snapshot-1649544.
A couple of changes that triggerred a refactoring in Elasticsearch:

 - LUCENE-6148: Accountable.getChildResources returns a collection instead of
   a list.

 - LUCENE-6121: CachingTokenFilter now propagates reset(), as a result
   SimpleQueryParser.newPossiblyAnalyzedQuery has been fixed to not reset both
   the underlying stream and the wrapper (otherwise lucene would barf because of
   a doubl reset).

 - LUCENE-6119: The auto-throttle issue changed a couple of method
   names/parameters. It also made
   `UpdateSettingsTests.testUpdateMergeMaxThreadCount` dead slow so I muted this
   test until we clea up merge throttling to use LUCENE-6119.

Close #9145
2015-01-06 09:24:18 +01:00
Alexander Reelsen 8626c18ad9 Settings: Ensure fields are overriden and not merged when using arrays
In the case you try to merge two settings, one being an array and one being
a field, together, the settings were merged instead of being overridden.

First config
my.value: 1

Second config
my.value: [ 2, 3 ]

If you execute

settingsBuilder().put(settings1).put(settings2).build()

now only values 2,3 will be in the final settings

Closes #8381
2015-01-06 09:13:10 +01:00
tlrx a5127d2ffd Plugins: Installation failed when bin/ and plugins/ directories are on different filesystems
Plugin installation failed when bin/, conf/ and plugins/ directories are on different file systems. The method File.move() can't be used to move a non-empty directory between different filesystems.

I didn't find a simple way to unittest that, even with in-memory filesystems like jimfs or the Lucene test framework.

Closes #8999
2015-01-06 08:45:59 +01:00
Robert Muir 9f6a6a832f fix TODO for master, we don't need to support this version here 2015-01-05 15:31:49 -05:00
Robert Muir 8f2f2c5663 Tests: add 0.20 index and fix test bugs in assertNewReplicasWork() 2015-01-05 15:30:18 -05:00
Clinton Gormley 0d9fad79e0 Fixed typo in geoshape exception
Invaild -> Invalid
2015-01-05 13:27:54 +01:00
Britta Weber 9454593d6a [TEST] mute ExceptionRetryTests 2015-01-03 18:40:19 +01:00
Britta Weber f45e6ae3f9 [index] Prevent duplication of documents when retry indexing after fail
If bulk index request fails due to a disconnect, unavailable shard etc, the request is
retried once before actually failing. However, even in case of failure the documents
might already be indexed. For autogenerated ids the request must not add the
documents again and therfore canHaveDuplicates must be set to true.

closes #8788
2015-01-02 15:44:47 +01:00
Nicholas Knize b21024b5f9 [GEO] Throw helpful exception for Polygons with holes outside the shell
A recent situation occured where a MultiPolygon coordinate array was accidentally defined as a single polygon with multiple holes. Since the intent was a MultiPolygon, the holes of the unintended Polygon fell outside the outer shell.  This exposed a bug in the contains logic inside BasePolygonBuilder. An ArrayIndexOutOfBoundsException was being thrown instead of a more useful ElasticsearchParseException( "hole is not within polygon" ).  This pull request fixes the bug and adds additional unit tests for verifying proper MultiPolygon type parsing.

closes #9071
2015-01-02 08:14:07 -06:00
Simon Willnauer 93dddcdfd9 [TEST] Wait for green if index is closed and reopened
if we reopen an index and the majority of the replicas where
not created the reopen will fail sicne on master this runs with
local gatway all the time.
2015-01-02 14:58:18 +01:00
Simon Willnauer 3e37c89932 [INTERNAL] Remove OperationRouting abstraction
This commit removes the unneeded OperationRouting interface and flattens
the package structure inside cluster.routing
2015-01-02 12:17:35 +01:00
Simon Willnauer b936c2f845 [RECOVERY] Allow cancel waiting for mapping changes
This commit interrupts the wait for mapping change if the index
shard gateway is waiting for the master on a mapping update.
2015-01-02 11:02:45 +01:00
Simon Willnauer 54ce210c8e [RECOVERY] Release store lock before blocking on mapping updates
This can lead to sporadic shard creating timeouts if the same shard is
created, closed and created again on the same node. The reason for this is
that we holding on to the store reference while blocking on the mapping update
that will prevent the shard lock from being released. Holding the lock is unnecessary
in this case and can simply be removed.
2015-01-02 11:02:19 +01:00
Adrien Grand 56974bf867 [TEST] Fix GroovyScriptTests failures. 2014-12-31 10:02:45 +01:00
Ryan Ernst 6304f68715 Scripting: Make _score in groovy scripts comparable
closes #8828
closes #9094
2014-12-30 16:38:44 -08:00
Nicholas Knize 0e24f34b0c [GEO] GIS envelope validation
ShapeBuilder expected coordinates for Envelope types in strict Top-Left, Bottom-Right order. Given that GeoJSON does not enforce coordinate order (as seen in #8672) clients could specify envelope bounds in any order and be compliant with the GeoJSON spec but not the ES ShapeBuilder logic. This change loosens the ShapeBuilder requirements on envelope coordinate order, reordering where necessary.

closes #2544
closes #9067
closes #9079
closes #9080
2014-12-30 11:54:07 -06:00
Lee Hinman 31652a8b3d Fix TransportNodesListShardStoreMetaData for custom data paths
Cleans up the testReusePeerRecovery test as well

The actual fix is in TransportNodesListShardStoreMetaData.java, which
needs to use `nodeEnv.shardDataPaths` instead of `nodeEnv.shardPaths`.

Due to the difficulty in tracking this down, I've added a lot of
additional logging. This also fixes a logging issue in GatewayAllocator
2014-12-30 17:50:38 +01:00
Simon Willnauer bc65afba8a [TEST] Wait for threads to finish / start before asserting 2014-12-30 15:45:28 +01:00
Adrien Grand 3af3def30b Remove some dead code. 2014-12-30 14:30:40 +01:00
Lee Hinman a4e2230ebd Add index.data_path setting
This allows specifying the path an index will be at.

`index.data_path` is specified in the settings when creating an index,
and can not be dynamically changed.

An example request would look like:

POST /myindex
{
  "settings": {
    "number_of_shards": 2,
    "data_path": "/tmp/myindex"
  }
}

And would put data in /tmp/myindex/0/index/0 and /tmp/myindex/0/index/1

Since this can be used to write data to arbitrary locations on disk, it
requires enabling the `node.enable_custom_paths` setting in
elasticsearch.yml on all nodes.

Relates to #8976
2014-12-29 14:40:50 +01:00
tlrx 2ccfde76f1 Native: Kernel32Library throws NoClassDefFound if JNA is missing
Introduced by #8993
2014-12-28 21:05:02 +01:00
Martijn van Groningen d8054ec299 inner_hits: Added another more compact syntax for inner hits.
Closes #8770
2014-12-24 17:41:35 +01:00
Adrien Grand cc71f7730a [TESTS] Make sure to wait for all shards to be allocated before running the test. 2014-12-24 11:18:40 +01:00
Martijn van Groningen a345e98575 Core: `ignore_unavailable` shouldn't ignore closed indices if a single index is specified in a search or broadcast request.
Closes #9047
Closes #7153
2014-12-24 10:46:03 +01:00
Adrien Grand 7678ab5264 Parent/child: Fix concurrency issues of the _parent field data.
`_parent` field data mistakenly shared some stateful data-structures across
threads.

Close #8396
2014-12-24 09:34:40 +01:00
Adrien Grand 67eba23b2d Core: Terms filter lookup caching should cache values, not filters.
The terms filter lookup mechanism today caches filters. Because of this, the
cache values depend on two things: the values that can be found in the lookup
index AND the mapping of the local index, since changing the mapping can change
the way that the filter is parsed. We should make the cache depend solely on
the content of the lookup index.

For instance the issue I was seeing was due to the following scenario:
 - create index1 with _id indexed
 - run terms filter with lookup, the parsed filter looks like `_id: 1 OR _id: 2`
 - remove index1
 - create index1 with _id not indexed
 - run terms filter without lookup, the parsed filter is `_uid: type#1 OR _uid: type#2` (the _id field mapper knows how to use the _uid field when _id is not indexed)
 - run terms filter with lookup, the filter is fetched from the cache: `_id: 1 OR _id: 2` but does not match anything since `_id` is not indexed.

Close #9027
2014-12-24 09:33:21 +01:00
Adrien Grand 24591b3c70 Search: parse terms filters on a single term as a term filter.
Running a terms filter on a single term is equivalent to loading a postings
list into a bit set and then returning the bit set instead of reading the
postings list on the fly.

Close #9014
2014-12-24 09:33:21 +01:00
Nicholas Knize 6d872843bd [GEO] Removing unnecessary orientation enumerators
PR #8978 included 4 unnecessary enumeration values ('cw', 'clockwise', 'ccw', 'counterclockwise'). Since the ShapeBuilder.parse method handles these as strings and maps them to LEFT and RIGHT enumerators, respectively, their enumeration counterpart is unnecessary. This minor change adds 4 static convenience variables (COUNTER_CLOCKWISE, CLOCKWISE, CCW, CW) for purposes of the API and removes the unnecessary values from the Orientation Enum.

closes #9035
2014-12-22 22:00:40 -06:00
Nicholas Knize 77a7ef28b3 [GEO] Add optional left/right parameter to GeoJSON
This feature adds an optional orientation parameter to the GeoJSON document and geo_shape mapping enabling users to explicitly define how they want Elasticsearch to interpret vertex ordering.  The default uses the right-hand rule (counterclockwise for outer ring, clockwise for inner ring) complying with OGC Simple Feature Access standards. The parameter can be explicitly specified for an entire index using the geo_shape mapping by adding "orientation":{"left"|"right"|"cw"|"ccw"|"clockwise"|"counterclockwise"} and/or overridden on each insert by adding the same parameter to the GeoJSON document.

closes #8764
2014-12-22 12:09:45 -06:00
Colin Goodheart-Smithe 391b5f3f5e Aggregations: Adds methods to get to/from as Strings for Range Aggs
Adds getToAsString and getFromAsString to Range interface and implements them for all range aggregations

Closes #9003
2014-12-22 09:56:25 +00:00
Tomas Varaneckas f8897a40af Mappings: Include currentFieldName into ObjectMapper errors
Without currentFieldName error is very generic and non informative

Close #9020
2014-12-22 10:11:25 +01:00
Nik Everett a95d75e074 Mappings: Reencode transformed result with same xcontent
When I originally wrote the transform feature I didn't think that the
XContentType of the reencoded source mattered.  It actually matters because
payloads for the completion suggester are stored and returned exactly
as encoded by this XContentType.

This revision changes the transform feature from always reencoding with smile
to always reencoding with the provided XContentType to support the completion
suggester.

Closes #8959
2014-12-22 10:11:25 +01:00
tlrx a4133ec4a3 Shutdown: Add support for Ctrl-Close event on Windows platforms to gracefully shutdown node
This commit adds the support for the Ctrl-Close event on Windows using native system calls. This way, it is possible to catch the Ctrl-Close event sent by a 'taskill /pid' command (or when the user closes the console window where elasticsearch.bat was started) and gracefully close the node. Before this commit, the node was simply killed on taskkill/window closing.
2014-12-22 09:36:29 +01:00
David Pilato 90f2f1da84 Plugins: NPE when plugins dir is inaccessible
Steps to reproduce:

1. Download fresh es.
2. `sudo mkdir plugins && sudo chmod 0700 plugins`
3. Start elasticsearch

```
elasticsearch-1.4.1 λ ./bin/elasticsearch
[2014-12-09 12:18:59,025][INFO ][node                     ] [Piotr Rasputin] version[1.4.1], pid[16338], build[89d3241/2014-11-26T15:49:29Z]
[2014-12-09 12:18:59,025][INFO ][node                     ] [Piotr Rasputin] initializing ...
{1.4.1}: Initialization Failed ...
- NullPointerException[null]
```

Closes #8837.
2014-12-21 11:59:54 +01:00
Boaz Leskes defecb3f80 Test: added some logging to NodeEnvironmentTests.testDeleteSafe 2014-12-20 00:27:37 +01:00
Boaz Leskes 4d699bd76c Internal: remove IndexCloseListener & Store.OnCloseListener
Closes #9009
2014-12-19 21:11:46 +01:00
Boaz Leskes c077683248 Test: ZenFaultDetectionTests.testNodesFaultDetectionConnectOnDisconnect should account for initial ping
There was a race condition in the test in the case where the nodes fault detection would manage to send and initial ping, followed by 2 attempts before the target service was disconnected.
2014-12-19 13:12:39 +01:00
Boaz Leskes cb0d462aa0 Test: fix racing condition in IndicesRequestTests
a request could be captured after action array was cleared.
2014-12-19 11:25:12 +01:00
Boaz Leskes 635ae29bf1 Recovery: cleaner interrupt handling during cancellation
RecoveryTarget initiates the recovery by sending a start recovery request to the source node and then waits for the recovery to complete. During recovery cancellation, we interrupt the thread so it will wake up and clean the recovery. Depending on timing, this can leave an unneeded interrupted thread status causing future IO commands to fail unneeded.

RecoverySource already had a handy utility called CancellableThreads. This extracts it to a top level class, and uses it in RecoveryTarget as well.

Closes #9000
2014-12-19 10:39:21 +01:00
Guillaume Hiron 8738583de6 FunctionScore: Fix 'avg' score mode to correctly implement weighted mean.
closes #8992
closes #9004
2014-12-18 16:36:39 -08:00
Boaz Leskes e6a190ec58 Test: AutoFilterCachingPolicy.HISTORY_SIZE should be large enough to accommodate other param 2014-12-18 21:00:47 +01:00
Adrien Grand 55d8bfd691 [TEST] Fix IndexStatsTests failures. 2014-12-18 19:33:05 +01:00
Adrien Grand ce11e0ee6d Filter cache: add a `_cache: auto` option and make it the default.
Up to now, all filters could be cached using the `_cache` flag that could be
set to `true` or `false` and the default was set depending on the type of the
`filter`. For instance, `script` filters are not cached by default while
`terms` are. For some filters, the default is more complicated and eg. date
range filters are cached unless they use `now` in a non-rounded fashion.

This commit adds a 3rd option called `auto`, which becomes the default for
all filters. So for all filters a cache wrapper will be returned, and the
decision will be made at caching time, per-segment. Here is the default logic:
 - if there is already a cache entry for this filter in the current segment,
   then return the cache entry.
 - else if the doc id set cannot iterate (eg. script filter) then do not cache.
 - else if the doc id set is already cacheable and it has been used twice or
   more in the last 1000 filters then cache it.
 - else if the filter is costly (eg. multi-term) and has been used twice or more
   in the last 1000 filters then cache it.
 - else if the doc id set is not cacheable and it has been used 5 times or more
   in the last 1000 filters, then load it into a cacheable set and cache it.
 - else return the uncached set.

So for instance geo-distance filters and script filters are going to use this
new default and are not going to be cached because of their iterators.

Similarly, date range filters are going to use this default all the time, but
it is very unlikely that those that use `now` in a not rounded fashion will get
reused so in practice they won't be cached.

`terms`, `range`, ... filters produce cacheable doc id sets with good iterators
so they will be cached as soon as they have been used twice.

Filters that don't produce cacheable doc id sets such as the `term` filter will
need to be used 5 times before being cached. This ensures that we don't spend
CPU iterating over all documents matching such filters unless we have good
evidence of reuse.

One last interesting point about this change is that it also applies to compound
filters. So if you keep on repeating the same `bool` filter with the same
underlying clauses, it will be cached on its own while up to now it used to
never be cached by default.

`_cache: true` has been changed to only cache on large segments, in order to not
pollute the cache since small segments should not be the bottleneck anyway.
However `_cache: false` still has the same semantics.

Close #8449
2014-12-18 15:51:36 +01:00
Boaz Leskes b9db5b178c Internal: PlainTransportFuture should not set currentThread().interrupt()
We use PlainTransportFuture as a future for our transport calls. If someone blocks on it and it is interrupted, we throw an ElasticsearchIllegalStateException. We should not set  Thread.currentThread().interrupt(); in this case because we already communicate the interrupt through an exception.

Closes #9001
2014-12-18 11:57:12 +01:00
Adrien Grand 6d253aba08 Upgrade to lucene-5.0.0-snapshot-1646179. 2014-12-18 09:51:20 +01:00
Boaz Leskes ee7ed387d4 Test: use less shards in SimpleQueryTests 2014-12-18 09:02:51 +01:00
Michael McCandless 242e631e95 Core: ignore known idle threads by default in /_nodes/hot_threads
Add a new ignore_idle_threads boolean option (default true) to
/_nodes/hot_threads, to filter out threads in known idle places like
waiting on a socket select or on pulling the next task from an empty
queue.

Closes #8985

Closes #8908
2014-12-17 11:59:31 -05:00
Adrien Grand f1da788211 Aggregations: reduce histogram buckets on the fly using a priority queue.
This commit makes histogram reduction a bit cleaner by expecting buckets
returned from shards to be sorted by key and merging them on-the-fly on the
coordinating node using a priority queue.

Close #8797
2014-12-17 16:46:16 +01:00
Alex Ksikes 86e1655e4b Term Vectors: support for version and version_type
This commit adds support for version and version_type to the Term Vectors API.
This could be useful in the following case whereby the user gets a document
and later wants to generate its TVs. With version, this would ensure that only
the TVs of that particular document are generated, and error out if the
document has been updated in between.

Closes #7480
2014-12-17 15:43:15 +01:00
Adrien Grand c2695d3d77 Revert "Aggregations: reduce histogram buckets on the fly using a priority queue."
This reverts commit 5694626f79.
2014-12-17 15:41:23 +01:00
Adrien Grand 5694626f79 Aggregations: reduce histogram buckets on the fly using a priority queue.
This commit makes histogram reduction a bit cleaner by expecting buckets
returned from shards to be sorted by key and merging them on-the-fly on the
coordinating node using a priority queue.

Close #8797
2014-12-17 14:21:00 +01:00
Lee Hinman ddf83a90dd [TEST] Inject IndexSettings, not node Settings objects
Guice was injecting the wrong Settings object
2014-12-17 10:55:13 +01:00
Lee Hinman 853879a121 Revert "Add index.data_path setting"
This reverts commit b2ec19ab36.
2014-12-17 09:39:19 +01:00
Boaz Leskes 8f146f9ab0 Discovery: only retry join when other node is not (yet) a master
When a node tries to join a master, the master may not yet be ready to accept the join request. In such cases we retry sending the join request up to 3 times before going back to ping. To detect this the current logic uses ExceptionsHelper.unwrapCause(t) to unwrap the incoming RemoteTransportException and inspect it's source, looking for ElasticsearchIllegalStateException. However, local ElasticsearchIllegalStateException can also be thrown when the join process should be cancelled (i.e., node shut down). In this case we shouldn't retry.

This commit adds an explicit NotMasterException to indicate the remote node is not a master. A similarly named exception (but meaning something else) in the master fault detection code was given a better name. Also clean up some other exceptions while at it.

Closes #8972
2014-12-16 23:12:46 +01:00
Lee Hinman 154e9d90cd [TEST] Mute IndicesCustomDataPathTests 2014-12-16 23:02:36 +01:00
Adrien Grand a50e3930c9 Terms aggs: Validate the aggregation order on unmapped terms too.
Close #8946
2014-12-16 18:50:37 +01:00
Lee Hinman b2ec19ab36 Add index.data_path setting
This allows specifying the path an index will be at.

`index.data_path` is specified in the settings when creating an index,
and can not be dynamically changed.

An example request would look like:

POST /myindex
{
  "settings": {
    "number_of_shards": 2,
    "data_path": "/tmp/myindex"
  }
}

And would put data in /tmp/myindex/0/index/0 and /tmp/myindex/0/index/1

Since this can be used to write data to arbitrary locations on disk, it
requires enabling the `node.enable_custom_paths` setting in
elasticsearch.yml on all nodes.
2014-12-16 18:25:21 +01:00
Nicholas Knize 18d56f154c Adding unit tests for clockwise non-OGC ordering
Adding unit tests to validate cw defined polys not-crossing and crossing the dateline, respectively
2014-12-16 10:54:51 -06:00
Nicholas Knize ac0e37449e Adding unit test for self intersecting polygons. Relevant to #7751 even/odd discussion
Updating documentation to describe polygon ambiguity and vertex ordering.
2014-12-16 10:54:39 -06:00
Nicholas Knize 437afd6f45 Adding dateline test with valid lat/lon pairs
Cleanup: Removing unnecessary logic checks
2014-12-16 10:54:28 -06:00
Nicholas Knize 85502ac40a Updating translation gate check to disregard order of hole vertices for non dateline crossing polys.
Updating comments and code readability

Correcting code formatting
2014-12-16 10:54:13 -06:00
Nicholas Knize e9e13d5cfc Computational geometry logic changes to support OGC standards
This commit adds the logic necessary for supporting polygon vertex ordering per OGC standards. Exterior rings will be treated in ccw (right-handed rule) and interior rings will be treated in cw (left-handed rule).  This feature change supports polygons that cross the dateline, and those that span the globe/map.  The unit tests have been updated and corrected to test various situations.  Greater test coverage will be provided in future commits.

Addresses #8672
2014-12-16 10:54:02 -06:00
Nicholas Knize 9466e16e24 Updating connect method to prevent duplicate edges 2014-12-16 10:53:46 -06:00
Nicholas Knize f8f92f816a [GEO] OGC compliant polygons fail with ambiguity
This feature branch implements OGC compliance for Polygon/Multi-polygon.  That is, vertex order for the exterior ring follows the right-hand rule (ccw) and all holes follow the left-hand rule (cw).  While GeoJSON imposes no restrictions, a user that wants to specify a complex poly across the dateline must do so in compliance with the OGC spec, otherwise a polygon that spans the globe will be assumed.

Reference issue #8672

Fix orientation of outer and inner ring for polygon with holes.  Updated unit tests.  Bug exists in boundary condition on negative side of dateline.
2014-12-16 10:53:34 -06:00
Michael McCandless 5910b17ece Add 1.4.3 2014-12-16 09:54:56 -05:00
mikemccand 8017f788e6 Add 1.3.8 version 2014-12-16 09:40:54 -05:00