5230 Commits

Author SHA1 Message Date
Florian Schilling 0c2d12bda3 Geo-Refactoring
===============
The code handling geo-shapes is not centralized, and points are created in
several different places. Also, the set of supported geo_shapes is
incomplete with respect to the GeoJSON specification. This commit
centralizes the code related to geo calculations and extends the old API
with a set of new shapes.

Null-Shapes
===========
The latest implementation of geo-shapes allows indexing null-shapes. This
means a field that is defined to hold a geo-shape can be set to null. For
example:

    {
        "shape": null
    }

New Shapes
==========
The geo-shapes multipoint and multilinestring have been added to the
geo_shape types. Also geo_circle is introduced by this commit.

Dateline wrapping
=================
A major issue of geo-shapes is the spherical geometry. Since ElasticSearch
works on geo-coordinates by projecting the Earth's surface onto a plane,
some shapes are hard to define if they cross the +180°/-180° longitude.
To solve this issue, ElasticSearch makes it possible to define geo-shapes
that cross this border: it decomposes these shapes and automatically
re-composes them in a spherical manner. This feature may change the indexed
shape type. If, for example, a polygon that crosses the dateline is defined,
it will be re-assembled into a set of polygons, so a multipolygon is
indexed. Likewise, linestrings crossing the dateline might be re-assembled
into multilinestrings.
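
The decomposition described above can be illustrated with a minimal sketch. It is not the code from this commit: the class `DatelineSplit`, its `split` method, and the rule "a longitude jump of more than 180 degrees means the segment crosses the dateline" are assumptions made for illustration. Points are `{lon, lat}` pairs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the ElasticSearch API.
final class DatelineSplit {
    /** Splits a linestring wherever a segment crosses the +/-180 meridian. */
    static List<List<double[]>> split(double[][] points) {
        List<List<double[]>> parts = new ArrayList<>();
        List<double[]> current = new ArrayList<>();
        current.add(points[0]);
        for (int i = 1; i < points.length; i++) {
            double[] a = points[i - 1];
            double[] b = points[i];
            // Assumption: a jump of more than 180 degrees in longitude means
            // the shorter path between a and b crosses the dateline.
            if (Math.abs(b[0] - a[0]) > 180) {
                double edge = a[0] > 0 ? 180 : -180;
                // Unwrap b so that both longitudes lie on a's side.
                double bLon = a[0] > 0 ? b[0] + 360 : b[0] - 360;
                // Linear interpolation of the latitude at the crossing.
                double t = (edge - a[0]) / (bLon - a[0]);
                double lat = a[1] + t * (b[1] - a[1]);
                current.add(new double[] {edge, lat});
                parts.add(current);
                // Continue on the other side of the dateline.
                current = new ArrayList<>();
                current.add(new double[] {-edge, lat});
            }
            current.add(b);
        }
        parts.add(current);
        return parts;
    }
}
```

A linestring from `{170, 0}` to `{-170, 10}` comes back as two parts, which is why a single linestring may end up indexed as a multilinestring.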

Builders
========
The API has been refactored to use builders instead of shapes, so
parsing geo-shapes results in builder objects. These builders can be
parsed and serialized without generating any shapes. This means shapes are
generated only on the nodes executing the actual operation. Also, the
base class ShapeBuilder implements the ToXContent interface, which allows
setting fields of XContent directly.
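
The idea of serializing a builder without creating a shape can be sketched as follows. `PointBuilderSketch` is a hypothetical stand-in and does not reproduce the real `ShapeBuilder` or `ToXContent` API; the `double[]` returned by `build()` merely stands for a geometry object.

```java
// Hypothetical illustration of a builder that defers shape creation.
final class PointBuilderSketch {
    private final double lon;
    private final double lat;

    PointBuilderSketch(double lon, double lat) {
        this.lon = lon;
        this.lat = lat;
    }

    // Serializes the builder itself; no geometry object is created here.
    String toJson() {
        return "{\"type\":\"point\",\"coordinates\":[" + lon + "," + lat + "]}";
    }

    // Only the node that executes the operation would call build().
    double[] build() {
        return new double[] {lon, lat};
    }
}
```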

TODOs
=====
 - The geo-circle will not work if it crosses the dateline
 - The envelope also needs to be wrapped

Closes #1997 #2708
2013-07-03 10:53:41 +02:00
Shay Banon 0643569e70 remove table tests for now 2013-07-03 10:31:32 +02:00
Andrew Raines 0d57c4eafd Prefer getHostAddress(). 2013-07-02 23:40:22 -05:00
Andrew Raines 750f20d9d4 Add relocation info to _cat/shards. 2013-07-02 23:30:59 -05:00
Andrew Raines 8a90f4b5ff Don't trim() table since we need the newline most of the time. 2013-07-02 23:17:02 -05:00
Andrew Raines 4ab2cd13f2 Add /_cat/master. 2013-07-02 22:43:43 -05:00
Andrew Raines 7aa9d4bc9f Start cat api with shards endpoint. 2013-07-02 18:00:05 -05:00
Shay Banon 8919e7e602 simplify builder API with simpler get()
also simplify some common API calls, for example, a simplified format in Java API for providing mapping
2013-07-02 12:10:35 +02:00
Boaz Leskes 98bd5a0e66 _source could be loaded twice from disk
If only partial_fields was specified, or fields needed to be extracted from _source, the source itself does not need to be returned.
2013-07-02 11:59:14 +02:00
Alexander Reelsen 2dcc664310 Support for parent in multi get request
When specifying the docs to be returned in a multi get request, a parent
field could not be specified, so that some docs seemingly did not exist,
even though they did.

This fix behaves like the normal GetRequest and simply overwrites the
routing value if it has not yet been set.

Also a test for routing with mget has been added.

Closes #3274
2013-07-02 08:51:55 +02:00
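The routing fallback described in 2dcc664310 (use the parent as routing unless a routing value was already set) can be sketched as below. `RoutedItem` is a hypothetical class for illustration, not the request item class changed by the commit.

```java
// Hypothetical request item; names are illustrative only.
final class RoutedItem {
    private String routing;
    private String parent;

    RoutedItem routing(String routing) {
        this.routing = routing;
        return this;
    }

    RoutedItem parent(String parent) {
        this.parent = parent;
        // Fall back to the parent only if no routing was given explicitly.
        if (this.routing == null) {
            this.routing = parent;
        }
        return this;
    }

    String routing() {
        return routing;
    }

    String parent() {
        return parent;
    }
}
```

With this rule an explicitly set routing value is preserved, while a request that only names a parent is still routed to the shard holding the child document.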
Shay Banon 3a0ce0bde8 (Java) Using primitive arrays instead of Object with map/builder
also simplify and consolidate the builder generic value write handling
fixes #3279
2013-07-01 21:47:46 +02:00
Luca Cavanna 2314b8665b Parent was ignored in exists request
The routing was always set to null right after the parent was set.

Closes #3276
2013-07-01 17:15:17 +02:00
Benjamin Devèze 9ce0156d39 Supports mget fields parameter given as string.
Closes 3270
2013-07-01 13:48:27 +02:00
Alexander Reelsen 7790d2bf65 Stop aborting of multiget requests in case of missing index
The MultiGet API stops with an IndexMissingException if even one of the
requests tries to access a non-existing index. This patch creates a
failure for this item without failing the whole request.

Closes #3267
2013-07-01 09:43:43 +02:00
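The per-item failure handling from 7790d2bf65 can be illustrated with a small sketch. Everything here is hypothetical: indices are modelled as in-memory maps, each request is an `{index, id}` pair, and `ItemResult` is not the response class used by ElasticSearch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of a multi get that records failures per item.
final class PerItemFailureSketch {
    static final class ItemResult {
        final String source;   // document source, or null
        final String failure;  // failure message, or null

        ItemResult(String source, String failure) {
            this.source = source;
            this.failure = failure;
        }
    }

    static List<ItemResult> multiGet(Map<String, Map<String, String>> indices,
                                     String[][] requests) {
        List<ItemResult> results = new ArrayList<>();
        for (String[] request : requests) {
            String index = request[0];
            String id = request[1];
            Map<String, String> docs = indices.get(index);
            if (docs == null) {
                // Record the failure for this item and keep going.
                results.add(new ItemResult(null, "index [" + index + "] is missing"));
            } else {
                results.add(new ItemResult(docs.get(id), null));
            }
        }
        return results;
    }
}
```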
Martijn van Groningen 751d4ab68e Clean up update tests 2013-06-30 18:50:22 +02:00
Martijn van Groningen fdec15f204 Changes validation error message 2013-06-30 18:50:22 +02:00
Benjamin Devèze e815f257cc Fixes issue that in some cases the doc_as_upsert option is ignored.
Closes #3265
2013-06-30 18:49:37 +02:00
Adrien Grand 40cd549c37 Ignore live docs when loading field data, the ID cache and filter caches.
Relying on deleted documents when loading field data is dangerous because a
field data instance might be loaded for a given generation of a segment and
then loaded from the cache by an older generation of the same segment which
has fewer deleted documents. This could, for example, lead to under-estimated
facet counts. The same issue applies to the ID cache and filter caches.

Close #3224
2013-06-29 11:41:09 +02:00
Drew Raines 477489ac82 Fix misspelling. 2013-06-28 10:02:22 -05:00
Martijn van Groningen f8780751c4 Fixes the issue that the `parent` option was ignored for delete requests.
The `parent` option was ignored in the delete api (rest only) and for delete actions in the bulk api.
This bug occurred in the case that the _parent field is enabled, and only the parent option was used. This resulted in a situation that documents are deleted even if the specified parent value is incorrect.

Closes #3257
2013-06-28 14:31:17 +02:00
Alexander Reelsen 2d5b832f17 Updated elasticsearch.yml file for recovery throttling 2013-06-28 13:24:32 +02:00
Alexander Reelsen 455bc32460 Moving forbidden-api checks to compile phase instead of test phase (fail fast) 2013-06-28 13:12:52 +02:00
Alexander Reelsen 0a50ed0a27 Don't execute suggest before parsing the full request
The current implementation executed suggestions inside the pull parser
while parsing, which made the result dependent on the order of the
elements in the request. This fix changes the behaviour to parse the
relevant parts of the request first and then execute all the suggestions
afterwards, so we can be sure that all information has been extracted
from the request before execution.

Closes #3247
2013-06-28 12:07:34 +02:00
Alexander Reelsen 71d5148b1c Make index.warmer.enabled setting dynamic
Even though proposed in the documentation, the realtime enabling/disabling of
index warmers was not supported. This commit adds support for
index.warmer.enabled as a dynamic setting.

Closes #3246
2013-06-28 10:28:08 +02:00
Shay Banon 0114fb0f58 add running the tests in headless mode in maven 2013-06-27 23:23:13 +01:00
David Pilato f5e5eb5fb9 NPE in PluginManager when asking for list on non existing dir
Asking for list of installed plugins with no existing plugin dir:

```sh
$ bin/plugin --list
```

It causes an NPE in PluginManager.
Closes #3253.
2013-06-27 16:12:27 +02:00
Adrien Grand 2fb5d3ff51 Merge integer field data implementations.
This commit merges field data implementations for byte, short, int and long
data into PackedArrayAtomicFieldData which uses Lucene's PackedInts API to
store data.

Close #3220
2013-06-26 22:22:23 +02:00
David Pilato 5a20ba5ff2 PluginManager fails with unknown command when passing url or verbose parameters
Closes #3245.
2013-06-26 18:49:50 +02:00
Adrien Grand cb34cccc1e Fix field number attribution to _version.
IndexUpgraderMergePolicy assumed that field numbers were dense and that
fieldInfos.size() was a free field number. This can however be wrong for a
segment which doesn't have one or more fields that some older segments have.

Close #3237
2013-06-26 16:57:13 +02:00
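The pitfall described in cb34cccc1e can be shown with a short sketch. It assumes field numbers are available as a plain `int[]`; `FreeFieldNumber` is a hypothetical helper and not the code used in IndexUpgraderMergePolicy.

```java
// Hypothetical helper showing why size() is not a safe free field number.
final class FreeFieldNumber {
    /** Returns a number not used by any field, even if the numbers are sparse. */
    static int nextFree(int[] usedFieldNumbers) {
        int max = -1;
        for (int n : usedFieldNumbers) {
            max = Math.max(max, n);
        }
        return max + 1;
    }
}
```

For a segment that holds fields numbered 0, 2 and 3, the field count is 3, which collides with an existing field, whereas the maximum plus one gives 4.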
Adrien Grand 1954f770a1 Put Eclipse settings in the root directory.
This ensures that the settings are taken into account whichever means is used
to import the project into Eclipse (manual import, m2e, mvn eclipse:eclipse, ...).
2013-06-26 16:51:47 +02:00
Alexander Reelsen 7e55354f4a Added support for PatternReplaceCharFilter
PatternReplaceCharFilter allows the use of a regex to manipulate the characters in a string before analysis

Closes #3197
2013-06-26 15:25:18 +02:00
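The effect of the char filter added in 7e55354f4a (a regex applied to the raw characters before tokenization) can be approximated with plain `java.util.regex`. This sketch deliberately does not use the Lucene class; `PatternReplaceSketch` is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in that mimics a regex replacement before analysis.
final class PatternReplaceSketch {
    static String filter(String input, Pattern pattern, String replacement) {
        // Replace every match before the text would reach a tokenizer.
        return pattern.matcher(input).replaceAll(replacement);
    }
}
```

For instance, the pattern `[^0-9]` with an empty replacement strips the separators from a phone number such as `555-123-4567`.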
Florian Schilling 42b3f06a32 fixed ShapeFetchService. closes #3242 2013-06-26 12:55:56 +02:00
Shay Banon c3ef49f5b0 add 0.90.3 2013-06-26 09:02:54 +01:00
Shay Banon 1b870774b6 Terms Filter Lookup: Allow to disable caching of lookup terms
closes #3241
2013-06-26 08:45:57 +01:00
Shay Banon 991b5abdf4 Terms Filter Lookup: When no cache key defined, use terms values as key to filter cache
closes #3240
2013-06-26 08:34:25 +01:00
Martijn van Groningen 64d42782a9 No need to fetch the freq for term filter 2013-06-25 22:40:59 +02:00
Boaz Leskes 99cb26fa02 A small doc change to reflect StreamOutput.writeVInt() does support negative numbers but not efficiently. StreamOutput.writeVLong & StreamInput.readVLong really support it.
This is to better describe the current situation. We probably want to normalize these methods and potentially add optimization/support for -1 values.
2013-06-25 14:13:44 +02:00
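Why negative values are inefficient for writeVInt, as noted in 99cb26fa02, can be seen in a sketch of the common encoding with 7 payload bits per byte. It is assumed here that StreamOutput uses this scheme; `VIntSketch` is a hypothetical class, not the ElasticSearch implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical re-implementation of a 7-bits-per-byte variable-length int.
final class VIntSketch {
    static byte[] writeVInt(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While more than 7 significant bits remain, emit the low 7 bits
        // with the continuation flag (0x80) set.
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
        return out.toByteArray();
    }
}
```

Small non-negative values fit in a single byte, while any negative value has its top bit set and therefore always takes the full five bytes.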
Martijn van Groningen 4c0b10aec7 Made the minimum score only active when executing the main query and not during the context rewrite phase.
This fixes parent/child queries when using minimum_score.

Closes #3203
2013-06-25 13:38:10 +02:00
Florian Schilling 84fa9ead4d The `geohash_cell` filter now adopts the format of other geo-filters. The object field names automatically match the field names of the document. This invalidates the `field` field of previous versions. The value of these fields is a `geo_point` value (all formats supported), which is internally translated to a geohash. Since those points always have the maximum precision (level 12), a `precision` definition has been included. This precision can be defined either as the *length* of the geohash string or as a *distance*. It is assumed that a distance without any unit is a geohash length.
```
curl -XGET 'http://127.0.0.1:9200/locations/_search?pretty=true' -d '{
    "query": {
        "match_all": {}
    },
    "filter": {
        "geohash_cell": {
            "pin": {
                "lat": 13.4080,
                "lon": 52.5186
            },
            "precision": 3,
            "neighbors": true
        }
    }
}'
```
Closes #3229
2013-06-25 12:16:08 +02:00
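How a `geo_point` is translated to a geohash of a given length, as mentioned in 84fa9ead4d, can be sketched with the standard geohash algorithm (interleaved longitude and latitude bits, base-32 alphabet). `GeohashSketch` is a hypothetical class and not the utility used by ElasticSearch.

```java
// Hypothetical encoder for the standard geohash scheme.
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int precision) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonBit = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;
        while (hash.length() < precision) {
            if (lonBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else { value <<= 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else { value <<= 1; latMax = mid; }
            }
            lonBit = !lonBit;
            // Every 5 bits form one base-32 character.
            if (++bits == 5) {
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }
}
```

A shorter `precision` simply keeps fewer leading characters, so the resulting cell covers a larger area.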
Shay Banon d094042b08 Lookup Terms Filter ignores the routing parameter
fixes #3233
2013-06-25 11:54:09 +02:00
Shay Banon cbe18608ef Deleting or closing an index doesn't clean the memory properly
fixes #3232
2013-06-25 00:44:00 +02:00
Shay Banon b91cb8b779 properly set the set flag 2013-06-25 00:06:55 +02:00
Alexander Reelsen c561b1bbcf Added Arabic/PersianNormalizationFilters from Lucene 2013-06-24 22:09:53 +02:00
Shay Banon f3c068f637 only call terms lookup once and not per segment 2013-06-24 18:01:35 +02:00
Florian Schilling e0846448e9 Reduced geobulk data 2013-06-24 16:20:38 +02:00
Shay Banon 160cb36b9d better handling of null filters when caching them 2013-06-24 15:34:44 +02:00
Shay Banon 80ede081c3 Lookup Terms Filter _cache parameter not being taken into account
fixes #3219
2013-06-24 15:23:16 +02:00
Adrien Grand 432628086f Fix NumericTokenizer.
NumericTokenizer is a simple wrapper around a NumericTokenStream. However, its
implementation had a few issues: its reset() method was not idempotent,
causing exceptions if reset() was called twice (causing #3211), and it had no
attributes, meaning that the only thing it allowed was counting the number
of generated tokens. The reason why indexing numeric data worked is that
the mapper's parseCreateField directly generates a NumericTokenStream and
bypasses the analyzer.

This commit makes NumericTokenizer.reset idempotent and makes consuming a
NumericTokenizer behave the same way as consuming the underlying
NumericTokenStream.
2013-06-24 14:13:27 +02:00
Shay Banon 58e68db148 improve geohash_filter to use terms filter
and various other cleanups
2013-06-24 11:34:59 +02:00
Shay Banon 6fd74fa39e Terms Filter Lookup: Failure when no mappings for the terms field exists (no data indexed)
closes #3216
2013-06-22 19:41:02 +02:00