10428 Commits

Author SHA1 Message Date
Martijn van Groningen
a345e98575 Core: ignore_unavailable shouldn't ignore closed indices if a single index is specified in a search or broadcast request.
Closes #9047
Closes #7153
2014-12-24 10:46:03 +01:00
Adrien Grand
7678ab5264 Parent/child: Fix concurrency issues of the _parent field data.
`_parent` field data mistakenly shared some stateful data-structures across
threads.

Close #8396
2014-12-24 09:34:40 +01:00
Adrien Grand
67eba23b2d Core: Terms filter lookup caching should cache values, not filters.
The terms filter lookup mechanism today caches filters. Because of this, the
cache values depend on two things: the values that can be found in the lookup
index AND the mapping of the local index, since changing the mapping can change
the way that the filter is parsed. We should make the cache depend solely on
the content of the lookup index.

For instance the issue I was seeing was due to the following scenario:
 - create index1 with _id indexed
 - run terms filter with lookup, the parsed filter looks like `_id: 1 OR _id: 2`
 - remove index1
 - create index1 with _id not indexed
 - run terms filter without lookup, the parsed filter is `_uid: type#1 OR _uid: type#2` (the _id field mapper knows how to use the _uid field when _id is not indexed)
 - run terms filter with lookup, the filter is fetched from the cache: `_id: 1 OR _id: 2` but does not match anything since `_id` is not indexed.

Close #9027
2014-12-24 09:33:21 +01:00
Adrien Grand
24591b3c70 Search: parse terms filters on a single term as a term filter.
Running a terms filter on a single term is equivalent to loading a postings
list into a bit set and then returning the bit set instead of reading the
postings list on the fly.

Close #9014
2014-12-24 09:33:21 +01:00
Janmejay Singh
01bb02a0a4 ignore intellij project/workspace files
closes #9044
2014-12-23 12:00:11 -08:00
Ryan Ernst
39b3613420 Fix date histogram docs grammar. 2014-12-23 10:19:55 -08:00
Nicholas Knize
6d872843bd [GEO] Removing unnecessary orientation enumerators
PR #8978 included 4 unnecessary enumeration values ('cw', 'clockwise', 'ccw', 'counterclockwise'). Since the ShapeBuilder.parse method handles these as strings and maps them to LEFT and RIGHT enumerators, respectively, their enumeration counterpart is unnecessary. This minor change adds 4 static convenience variables (COUNTER_CLOCKWISE, CLOCKWISE, CCW, CW) for purposes of the API and removes the unnecessary values from the Orientation Enum.

closes #9035
2014-12-22 22:00:40 -06:00
Nicholas Knize
77a7ef28b3 [GEO] Add optional left/right parameter to GeoJSON
This feature adds an optional orientation parameter to the GeoJSON document and geo_shape mapping enabling users to explicitly define how they want Elasticsearch to interpret vertex ordering.  The default uses the right-hand rule (counterclockwise for outer ring, clockwise for inner ring) complying with OGC Simple Feature Access standards. The parameter can be explicitly specified for an entire index using the geo_shape mapping by adding "orientation":{"left"|"right"|"cw"|"ccw"|"clockwise"|"counterclockwise"} and/or overridden on each insert by adding the same parameter to the GeoJSON document.

closes #8764
2014-12-22 12:09:45 -06:00
Adrien Grand
fb6c3b7c29 [Docs] Improve documentation of the new caching policy for filters. 2014-12-22 17:14:47 +01:00
Colin Goodheart-Smithe
391b5f3f5e Aggregations: Adds methods to get to/from as Strings for Range Aggs
Adds getToAsString and getFromAsString to Range interface and implements them for all range aggregations

Closes #9003
2014-12-22 09:56:25 +00:00
Tomas Varaneckas
f8897a40af Mappings: Include currentFieldName into ObjectMapper errors
Without currentFieldName, the error is very generic and uninformative

Close #9020
2014-12-22 10:11:25 +01:00
Nik Everett
a95d75e074 Mappings: Reencode transformed result with same xcontent
When I originally wrote the transform feature I didn't think that the
XContentType of the reencoded source mattered.  It actually matters because
payloads for the completion suggester are stored and returned exactly
as encoded by this XContentType.

This revision changes the transform feature from always reencoding with smile
to always reencoding with the provided XContentType to support the completion
suggester.

Closes #8959
2014-12-22 10:11:25 +01:00
tlrx
a4133ec4a3 Shutdown: Add support for Ctrl-Close event on Windows platforms to gracefully shutdown node
This commit adds support for the Ctrl-Close event on Windows using native system calls. This way, it is possible to catch the Ctrl-Close event sent by a 'taskkill /pid' command (or when the user closes the console window where elasticsearch.bat was started) and gracefully close the node. Before this commit, the node was simply killed on taskkill/window closing.
2014-12-22 09:36:29 +01:00
David Pilato
90f2f1da84 Plugins: NPE when plugins dir is inaccessible
Steps to reproduce:

1. Download fresh es.
2. `sudo mkdir plugins && sudo chmod 0700 plugins`
3. Start elasticsearch

```
elasticsearch-1.4.1 λ ./bin/elasticsearch
[2014-12-09 12:18:59,025][INFO ][node                     ] [Piotr Rasputin] version[1.4.1], pid[16338], build[89d3241/2014-11-26T15:49:29Z]
[2014-12-09 12:18:59,025][INFO ][node                     ] [Piotr Rasputin] initializing ...
{1.4.1}: Initialization Failed ...
- NullPointerException[null]
```

Closes #8837.
2014-12-21 11:59:54 +01:00
Boaz Leskes
defecb3f80 Test: added some logging to NodeEnvironmentTests.testDeleteSafe 2014-12-20 00:27:37 +01:00
Boaz Leskes
4d699bd76c Internal: remove IndexCloseListener & Store.OnCloseListener
Closes #9009
2014-12-19 21:11:46 +01:00
Boaz Leskes
c077683248 Test: ZenFaultDetectionTests.testNodesFaultDetectionConnectOnDisconnect should account for initial ping
There was a race condition in the test in the case where the nodes fault detection would manage to send an initial ping, followed by 2 attempts before the target service was disconnected.
2014-12-19 13:12:39 +01:00
Boaz Leskes
cb0d462aa0 Test: fix race condition in IndicesRequestTests
A request could be captured after the action array was cleared.
2014-12-19 11:25:12 +01:00
Boaz Leskes
635ae29bf1 Recovery: cleaner interrupt handling during cancellation
RecoveryTarget initiates the recovery by sending a start recovery request to the source node and then waits for the recovery to complete. During recovery cancellation, we interrupt the thread so it will wake up and clean up the recovery. Depending on timing, this can leave the thread's interrupted status set when it is no longer needed, causing future IO commands to fail unnecessarily.

RecoverySource already had a handy utility called CancellableThreads. This extracts it to a top level class, and uses it in RecoveryTarget as well.

Closes #9000
2014-12-19 10:39:21 +01:00
Guillaume Hiron
8738583de6 FunctionScore: Fix 'avg' score mode to correctly implement weighted mean.
closes #8992
closes #9004
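
For reference, a weighted mean divides the weighted sum of the scores by the sum of the weights. The sketch below is a minimal standalone illustration and is not the code from this commit: the class name, the method name and the handling of an all-zero weight sum are assumptions.

```java
// Hypothetical illustration of a weighted mean; not the function_score code.
final class WeightedAvgSketch {

    /** Returns sum(weight_i * score_i) / sum(weight_i). */
    static double weightedAvg(double[] scores, double[] weights) {
        if (scores.length != weights.length) {
            throw new IllegalArgumentException("scores and weights must have the same length");
        }
        double weightedSum = 0;
        double weightSum = 0;
        for (int i = 0; i < scores.length; i++) {
            weightedSum += scores[i] * weights[i];
            weightSum += weights[i];
        }
        // Assumption: return 0 when no weight is present, to avoid dividing by zero.
        return weightSum == 0 ? 0 : weightedSum / weightSum;
    }

    public static void main(String[] args) {
        // (1.0 * 3 + 3.0 * 1) / (3 + 1) = 1.5
        System.out.println(weightedAvg(new double[] {1.0, 3.0}, new double[] {3.0, 1.0}));
    }
}
```

A plain arithmetic mean of the same two scores would be 2.0, which is the kind of discrepancy that appears when the weights are left out of the denominator.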
2014-12-18 16:36:39 -08:00
Boaz Leskes
e6a190ec58 Test: AutoFilterCachingPolicy.HISTORY_SIZE should be large enough to accommodate other param 2014-12-18 21:00:47 +01:00
Adrien Grand
55d8bfd691 [TEST] Fix IndexStatsTests failures. 2014-12-18 19:33:05 +01:00
Adrien Grand
ce11e0ee6d Filter cache: add a _cache: auto option and make it the default.
Up to now, all filters could be cached using the `_cache` flag that could be
set to `true` or `false` and the default was set depending on the type of the
`filter`. For instance, `script` filters are not cached by default while
`terms` are. For some filters, the default is more complicated: e.g. date
range filters are cached unless they use `now` in a non-rounded fashion.

This commit adds a 3rd option called `auto`, which becomes the default for
all filters. So for all filters a cache wrapper will be returned, and the
decision will be made at caching time, per-segment. Here is the default logic:
 - if there is already a cache entry for this filter in the current segment,
   then return the cache entry.
 - else if the doc id set cannot iterate (eg. script filter) then do not cache.
 - else if the doc id set is already cacheable and it has been used twice or
   more in the last 1000 filters then cache it.
 - else if the filter is costly (eg. multi-term) and has been used twice or more
   in the last 1000 filters then cache it.
 - else if the doc id set is not cacheable and it has been used 5 times or more
   in the last 1000 filters, then load it into a cacheable set and cache it.
 - else return the uncached set.

So for instance geo-distance filters and script filters are going to use this
new default and are not going to be cached because of their iterators.

Similarly, date range filters are going to use this default all the time, but
it is very unlikely that those that use `now` in a non-rounded fashion will get
reused so in practice they won't be cached.

`terms`, `range`, ... filters produce cacheable doc id sets with good iterators
so they will be cached as soon as they have been used twice.

Filters that don't produce cacheable doc id sets such as the `term` filter will
need to be used 5 times before being cached. This ensures that we don't spend
CPU iterating over all documents matching such filters unless we have good
evidence of reuse.

One last interesting point about this change is that it also applies to compound
filters. So if you keep on repeating the same `bool` filter with the same
underlying clauses, it will be cached on its own while up to now it used to
never be cached by default.

`_cache: true` has been changed to only cache on large segments, in order to not
pollute the cache since small segments should not be the bottleneck anyway.
However `_cache: false` still has the same semantics.
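
The default `auto` logic listed above can be read as a chain of checks. The sketch below is only an illustration of that chain under stated assumptions: the class, enum and parameter names are hypothetical, the usage count is passed in rather than tracked over the last 1000 filters, and nothing here is the actual Elasticsearch implementation.

```java
// Hypothetical sketch of the per-segment decision order described in this commit message.
final class AutoCachePolicySketch {

    enum Decision { CACHED_ENTRY, CACHE, LOAD_AND_CACHE, UNCACHED }

    /**
     * @param alreadyCached a cache entry exists for this filter in the current segment
     * @param canIterate    the doc id set can iterate (false e.g. for script filters)
     * @param cacheable     the doc id set is already cacheable
     * @param costly        the filter is costly (e.g. multi-term)
     * @param recentUses    how often the filter was seen in the recent filter history
     */
    static Decision decide(boolean alreadyCached, boolean canIterate, boolean cacheable,
                           boolean costly, int recentUses) {
        if (alreadyCached) {
            return Decision.CACHED_ENTRY;
        }
        if (!canIterate) {
            return Decision.UNCACHED;
        }
        if (cacheable && recentUses >= 2) {
            return Decision.CACHE;
        }
        if (costly && recentUses >= 2) {
            return Decision.CACHE;
        }
        if (!cacheable && recentUses >= 5) {
            // The set is first loaded into a cacheable structure, then cached.
            return Decision.LOAD_AND_CACHE;
        }
        return Decision.UNCACHED;
    }
}
```

With these rules, a filter that produces a non-cacheable doc id set stays uncached on its fourth use and is loaded and cached on its fifth.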

Close #8449
2014-12-18 15:51:36 +01:00
Boaz Leskes
b9db5b178c Internal: PlainTransportFuture should not set currentThread().interrupt()
We use PlainTransportFuture as a future for our transport calls. If someone blocks on it and it is interrupted, we throw an ElasticsearchIllegalStateException. We should not call Thread.currentThread().interrupt() in this case because we already communicate the interrupt through an exception.
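
A minimal sketch of the pattern, assuming a simplified future built on a `CountDownLatch`. The class is hypothetical, and `IllegalStateException` stands in for ElasticsearchIllegalStateException; this is not the PlainTransportFuture code.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical simplified future; illustrates reporting an interrupt via an exception only.
final class SimpleFutureSketch<T> {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile T value;

    void complete(T v) {
        value = v;
        done.countDown();
    }

    T blockingGet() {
        try {
            done.await();
        } catch (InterruptedException e) {
            // The interrupt is communicated through the exception below, so the
            // thread's interrupt flag is deliberately not set again here.
            throw new IllegalStateException("Future got interrupted", e);
        }
        return value;
    }
}
```

The caller learns about the interrupt from the exception, and later blocking calls on the same thread are not affected by a leftover interrupt flag.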

Closes #9001
2014-12-18 11:57:12 +01:00
javanna
d17db85794 [TEST] upgrade randomized runner to 2.1.11
2.1.11 contains the fix for this issue: https://github.com/carrotsearch/randomizedtesting/issues/179

Closes #8930
2014-12-18 10:40:05 +01:00
Adrien Grand
6d253aba08 Upgrade to lucene-5.0.0-snapshot-1646179. 2014-12-18 09:51:20 +01:00
Boaz Leskes
ee7ed387d4 Test: use less shards in SimpleQueryTests 2014-12-18 09:02:51 +01:00
Michael McCandless
242e631e95 Core: ignore known idle threads by default in /_nodes/hot_threads
Add a new ignore_idle_threads boolean option (default true) to
/_nodes/hot_threads, to filter out threads in known idle places like
waiting on a socket select or on pulling the next task from an empty
queue.

Closes #8985

Closes #8908
2014-12-17 11:59:31 -05:00
Adrien Grand
f1da788211 Aggregations: reduce histogram buckets on the fly using a priority queue.
This commit makes histogram reduction a bit cleaner by expecting buckets
returned from shards to be sorted by key and merging them on-the-fly on the
coordinating node using a priority queue.
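
The merge described above is a k-way merge of sorted lists. The following is a rough standalone sketch under stated assumptions: `Bucket`, `Cursor` and `reduce` are hypothetical names, buckets carry only a key and a doc count, and this is not the aggregation code from the commit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: merge per-shard bucket lists that are already sorted by key.
final class HistogramMergeSketch {

    static final class Bucket {
        final long key;
        final long docCount;

        Bucket(long key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    /** Position inside one shard's sorted bucket list. */
    private static final class Cursor {
        final List<Bucket> buckets;
        int pos;

        Cursor(List<Bucket> buckets) {
            this.buckets = buckets;
        }

        Bucket current() {
            return buckets.get(pos);
        }

        boolean advance() {
            return ++pos < buckets.size();
        }
    }

    static List<Bucket> reduce(List<List<Bucket>> shardResults) {
        // The queue always exposes the cursor whose current bucket has the smallest key.
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.current().key));
        for (List<Bucket> shard : shardResults) {
            if (!shard.isEmpty()) {
                queue.add(new Cursor(shard));
            }
        }
        List<Bucket> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            Bucket bucket = top.current();
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).key == bucket.key) {
                // Same key as the previous output bucket: combine the doc counts.
                merged.set(last, new Bucket(bucket.key, merged.get(last).docCount + bucket.docCount));
            } else {
                merged.add(bucket);
            }
            if (top.advance()) {
                queue.add(top);
            }
        }
        return merged;
    }
}
```

Because every input list is sorted, equal keys arrive consecutively, so the output is built in key order without a separate sort step.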

Close #8797
2014-12-17 16:46:16 +01:00
Alex Ksikes
86e1655e4b Term Vectors: support for version and version_type
This commit adds support for version and version_type to the Term Vectors API.
This could be useful in the following case whereby the user gets a document
and later wants to generate its TVs. With version, this would ensure that only
the TVs of that particular document are generated, and error out if the
document has been updated in between.

Closes #7480
2014-12-17 15:43:15 +01:00
Adrien Grand
c2695d3d77 Revert "Aggregations: reduce histogram buckets on the fly using a priority queue."
This reverts commit 5694626f79555af65b1109125afef49657186f0a.
2014-12-17 15:41:23 +01:00
Martijn Laarman
bc76032fdd Documented the new terminate_after querystring option on search as implemented in #6885 2014-12-17 14:49:05 +01:00
Adrien Grand
5694626f79 Aggregations: reduce histogram buckets on the fly using a priority queue.
This commit makes histogram reduction a bit cleaner by expecting buckets
returned from shards to be sorted by key and merging them on-the-fly on the
coordinating node using a priority queue.

Close #8797
2014-12-17 14:21:00 +01:00
Yasir Bamarni
5059d6fe1c Update percolate.asciidoc
wrong type used in the -GET request

Closes #8942
2014-12-17 14:05:27 +01:00
Pablo Díaz-López
adb1a5b43b Update getting-started.asciidoc
Missing -X flag at the curl template

Closes #8977
2014-12-17 14:03:38 +01:00
Peter Johnson a.k.a. insertcoffee
4b5e6b2de0 [docs] pedantry
Closes #8982
2014-12-17 13:46:39 +01:00
Joao Duarte
d73f7c90aa doc: transport sniff only adds data nodes 2014-12-17 11:29:01 +00:00
Lee Hinman
ddf83a90dd [TEST] Inject IndexSettings, not node Settings objects
Guice was injecting the wrong Settings object
2014-12-17 10:55:13 +01:00
Lee Hinman
853879a121 Revert "Add index.data_path setting"
This reverts commit b2ec19ab360cc5f23d3cde391c8fc6e700dcb41f.
2014-12-17 09:39:19 +01:00
Boaz Leskes
8f146f9ab0 Discovery: only retry join when other node is not (yet) a master
When a node tries to join a master, the master may not yet be ready to accept the join request. In such cases we retry sending the join request up to 3 times before going back to ping. To detect this the current logic uses ExceptionsHelper.unwrapCause(t) to unwrap the incoming RemoteTransportException and inspect its source, looking for ElasticsearchIllegalStateException. However, local ElasticsearchIllegalStateException can also be thrown when the join process should be cancelled (i.e., node shut down). In this case we shouldn't retry.

This commit adds an explicit NotMasterException to indicate the remote node is not a master. A similarly named exception (but meaning something else) in the master fault detection code was given a better name. Also clean up some other exceptions while at it.

Closes #8972
2014-12-16 23:12:46 +01:00
Lee Hinman
154e9d90cd [TEST] Mute IndicesCustomDataPathTests 2014-12-16 23:02:36 +01:00
Adrien Grand
a50e3930c9 Terms aggs: Validate the aggregation order on unmapped terms too.
Close #8946
2014-12-16 18:50:37 +01:00
Lee Hinman
b2ec19ab36 Add index.data_path setting
This allows specifying the path an index will be at.

`index.data_path` is specified in the settings when creating an index,
and can not be dynamically changed.

An example request would look like:

POST /myindex
{
  "settings": {
    "number_of_shards": 2,
    "data_path": "/tmp/myindex"
  }
}

And would put data in /tmp/myindex/0/index/0 and /tmp/myindex/0/index/1

Since this can be used to write data to arbitrary locations on disk, it
requires enabling the `node.enable_custom_paths` setting in
elasticsearch.yml on all nodes.
2014-12-16 18:25:21 +01:00
Nicholas Knize
18d56f154c Adding unit tests for clockwise non-OGC ordering
Adding unit tests to validate cw defined polys not-crossing and crossing the dateline, respectively
2014-12-16 10:54:51 -06:00
Nicholas Knize
ac0e37449e Adding unit test for self intersecting polygons. Relevant to #7751 even/odd discussion
Updating documentation to describe polygon ambiguity and vertex ordering.
2014-12-16 10:54:39 -06:00
Nicholas Knize
437afd6f45 Adding dateline test with valid lat/lon pairs
Cleanup: Removing unnecessary logic checks
2014-12-16 10:54:28 -06:00
Nicholas Knize
85502ac40a Updating translation gate check to disregard order of hole vertices for non dateline crossing polys.
Updating comments and code readability

Correcting code formatting
2014-12-16 10:54:13 -06:00
Nicholas Knize
e9e13d5cfc Computational geometry logic changes to support OGC standards
This commit adds the logic necessary for supporting polygon vertex ordering per OGC standards. Exterior rings will be treated in ccw (right-handed rule) and interior rings will be treated in cw (left-handed rule).  This feature change supports polygons that cross the dateline, and those that span the globe/map.  The unit tests have been updated and corrected to test various situations.  Greater test coverage will be provided in future commits.
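
As background, the winding direction of a ring in a flat x/y plane can be read from the sign of its shoelace (signed area) sum. The sketch below is an assumption-laden illustration with hypothetical names: it treats coordinates as planar, does not handle dateline crossing, and is not the ShapeBuilder logic from this commit.

```java
// Hypothetical planar orientation test; ignores dateline wrapping entirely.
final class RingOrientationSketch {

    /**
     * Returns twice the signed area of the ring. Each point is {x, y}.
     * Positive means counterclockwise, negative means clockwise.
     */
    static double signedAreaTimesTwo(double[][] ring) {
        double sum = 0;
        for (int i = 0; i < ring.length; i++) {
            double[] a = ring[i];
            double[] b = ring[(i + 1) % ring.length];
            sum += a[0] * b[1] - b[0] * a[1];
        }
        return sum;
    }

    static boolean isCounterClockwise(double[][] ring) {
        return signedAreaTimesTwo(ring) > 0;
    }

    public static void main(String[] args) {
        double[][] square = {{0, 0}, {1, 0}, {1, 1}, {0, 1}};
        System.out.println(isCounterClockwise(square)); // true
    }
}
```

Under the convention in this commit message, an exterior ring would be expected to test as counterclockwise and an interior ring (hole) as clockwise.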

Addresses #8672
2014-12-16 10:54:02 -06:00
Nicholas Knize
9466e16e24 Updating connect method to prevent duplicate edges 2014-12-16 10:53:46 -06:00
Nicholas Knize
f8f92f816a [GEO] OGC compliant polygons fail with ambiguity
This feature branch implements OGC compliance for Polygon/Multi-polygon.  That is, vertex order for the exterior ring follows the right-hand rule (ccw) and all holes follow the left-hand rule (cw).  While GeoJSON imposes no restrictions, a user that wants to specify a complex poly across the dateline must do so in compliance with the OGC spec, otherwise a polygon that spans the globe will be assumed.

Reference issue #8672

Fix orientation of outer and inner ring for polygon with holes.  Updated unit tests.  Bug exists in boundary condition on negative side of dateline.
2014-12-16 10:53:34 -06:00