4932 Commits

Author SHA1 Message Date
Shay Banon
58e68db148 improve geohash_filter to use terms filter
and various other cleanups
2013-06-24 11:34:59 +02:00
Shay Banon
6fd74fa39e Terms Filter Lookup: Failure when no mappings for the terms field exists (no data indexed)
closes 
2013-06-22 19:41:02 +02:00
Simon Willnauer
7206c60019 stabilize more tests 2013-06-20 17:04:30 +02:00
Boaz Leskes
178629382c Added version support to update requests
Moved version handling from RobinEngine into VersionType. This avoids code duplication and makes it cleaner and easier to read.

Closes 
2013-06-20 13:27:00 +02:00
Cédric HOURCADE
71849668e9 Add Lucene CommonGrams/CommonGramsQuery token filter
Both filters merged in a single "common_grams" tokenfilter.

Closes 
2013-06-19 17:39:04 +02:00
Florian Schilling
5aa0a8438f GeoHash Filter
##############

Previous versions of the GeoPointFieldMapper just stored the actual geohash
of a point. This commit changes the behavior of storing geohashes by storing
the geohash and all its prefixes in decreasing order in the same field. To
enable this functionality the option geohash_prefix must be set in the mapping.

This behavior allows filtering GeoPoints by their geohashes. The filter
defines a geohash prefix, and all geohashes that match this prefix are
returned. The neighbors flag additionally matches the geohash cells that
surround the given cell. In general, the neighborhood of a geohash is
defined by its eight adjacent cells.

To enable this, the type of filtered fields must be geo_point with geohashes
and geohash_prefix enabled.

For example:
    curl -XPUT 'http://127.0.0.1:9200/locations/?pretty=true' -d '{
        "mappings" : {
            "location": {
                "properties": {
                    "pin": {
                        "type": "geo_point",
                        "geohash": true,
                        "geohash_prefix": true
                    }
                }
            }
        }
    }'

This example defines a mapping for a type location in an index locations
with a field pin. The geohash option stores the geohash of the pin field,
and geohash_prefix additionally stores all of its prefixes.

To filter the results by geohash, a geohash_cell filter needs to be defined.
For example:
    curl -XGET 'http://127.0.0.1:9200/locations/_search?pretty=true' -d '{
        "query": {
            "match_all":{}
        },
        "filter": {
            "geohash_cell": {
                "field": "pin",
                "geohash": "u30",
                "neighbors": true
            }
        }
    }'

This filter will match all geohashes that start with one of the following
prefixes: u30, u1r, u32, u33, u1p, u31, u0z, u2b and u2c.

Internally, the GeoHashFilter is either a simple TermFilter, if no
neighbors should be matched, or a BooleanFilter combining the TermFilters
of the geohash and all its neighbors.
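
The neighbor prefixes listed above can be reproduced with a short sketch.
This is not the code from this commit; it is a minimal Python illustration
that assumes the standard geohash bit layout (base-32 characters, bits
alternating between longitude and latitude, longitude first). The function
names, the wrap-around at the antimeridian and the clipping at the poles are
assumptions made for the example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def decode_cell(geohash):
    """Split a geohash into integer (lon, lat) cell indices and their bit counts."""
    lon = lat = lon_bits = lat_bits = 0
    is_lon = True  # bits alternate, starting with longitude
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                lon, lon_bits = (lon << 1) | bit, lon_bits + 1
            else:
                lat, lat_bits = (lat << 1) | bit, lat_bits + 1
            is_lon = not is_lon
    return lon, lat, lon_bits, lat_bits


def encode_cell(lon, lat, lon_bits, lat_bits):
    """Interleave the cell indices back into a geohash string."""
    bits = []
    lon_pos, lat_pos = lon_bits - 1, lat_bits - 1
    is_lon = True
    for _ in range(lon_bits + lat_bits):
        if is_lon:
            bits.append((lon >> lon_pos) & 1)
            lon_pos -= 1
        else:
            bits.append((lat >> lat_pos) & 1)
            lat_pos -= 1
        is_lon = not is_lon
    chars = []
    for i in range(0, len(bits), 5):
        value = 0
        for bit in bits[i:i + 5]:
            value = (value << 1) | bit
        chars.append(BASE32[value])
    return "".join(chars)


def cell_and_neighbors(geohash):
    """Return the cell itself plus its (up to) eight adjacent cells."""
    lon, lat, lon_bits, lat_bits = decode_cell(geohash)
    cells = []
    for dlat in (1, 0, -1):
        new_lat = lat + dlat
        if not 0 <= new_lat < (1 << lat_bits):
            continue  # assumption: no neighbors beyond the poles
        for dlon in (-1, 0, 1):
            new_lon = (lon + dlon) % (1 << lon_bits)  # assumption: wrap in longitude
            cells.append(encode_cell(new_lon, new_lat, lon_bits, lat_bits))
    return cells


def matches(point_geohash, prefixes):
    """A point matches if its geohash starts with any of the cell prefixes."""
    return any(point_geohash.startswith(p) for p in prefixes)
```

Calling cell_and_neighbors("u30") yields the nine prefixes listed above, and
matches() plays the role of the OR over the individual prefix terms.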

Closes 
2013-06-19 14:35:02 +02:00
Clinton Gormley
bc90e73932 Expose fielddata "fields" param in standard in indicesStatsRequest
Closes 
2013-06-19 13:18:55 +02:00
Clinton Gormley
b27ad99b8d The "fielddata" qs param to index stats was setting idCache, not fieldData
Closes 
2013-06-19 12:33:01 +02:00
Boaz Leskes
02c6222320 Trimming MVEL scripts before compiling them.
This bypasses an issue with MVEL error handling which can go into an infinite loop in some edge cases. More info here: http://jira.codehaus.org/browse/MVEL-292

Closes 
2013-06-19 12:14:10 +02:00
Adrien Grand
fccbe9c185 Import the new n-gram tokenizers and filters from Lucene.
Lucene 4.4 will feature new n-gram tokenizers and filters that should not
generate broken offsets (that cause highlighting bugs) anymore. They also
correctly handle supplementary characters and the tokenizers can work in a
streaming fashion (they are not limited to the first 1024 chars of the
stream anymore).
2013-06-19 09:45:17 +02:00
Simon Willnauer
a388588b1f Upgrade to Lucene 4.3.1 2013-06-18 22:15:31 +02:00
Simon Willnauer
c9c68fced7 Add ShardId and Index to SuggestionContext
Suggesters might need access to the shard they run on as well as the
index they operate on. This patch adds indexname and shard ID to the
SuggestionContext

Closes 
2013-06-18 15:00:42 +02:00
Cédric HOURCADE
d41c37fdfa Add support for "high_freq" and "low_freq" parameters for Common Query
"minimum_should_match" parameter. The "high_freq" parameter is used when the
query has only high-frequency terms.

Closes 
2013-06-17 20:31:38 +02:00
Simon Willnauer
8363fcf281 create 'shape' index explicitly to ensure tests don't hang 2013-06-17 17:48:35 +02:00
Simon Willnauer
deda7a37fc Ensure tests wait for relocations 2013-06-17 13:55:18 +02:00
Martijn van Groningen
e7d13971f3 Simplified validate check 2013-06-17 10:36:38 +02:00
Marcus Granström
b7cb479a72 Added doc_as_upsert option to update api.
This option can reduce the amount of data being sent to Elasticsearch.
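
As a rough illustration of where the saving comes from: without the flag, a
client that wants create-or-merge behavior sends a partial "doc" plus a
separate "upsert" document; with "doc_as_upsert" the partial doc doubles as
the initial document. The Python sketch below models this with an in-memory
dict. The apply_update function and the store are hypothetical and are not
part of Elasticsearch, and the merge is deliberately shallow.

```python
def apply_update(store, doc_id, request):
    """Toy model of an update request against an in-memory document store."""
    doc = request["doc"]
    if doc_id in store:
        store[doc_id].update(doc)  # document exists: merge the partial doc
    elif request.get("doc_as_upsert"):
        store[doc_id] = dict(doc)  # missing: the partial doc becomes the document
    elif "upsert" in request:
        store[doc_id] = dict(request["upsert"])  # missing: use the separate upsert body
    else:
        raise KeyError(doc_id)  # missing and nothing to insert
    return store[doc_id]


# Both requests create the same document when it does not exist yet,
# but the second one carries the document body only once.
with_upsert = {"doc": {"name": "x", "count": 1}, "upsert": {"name": "x", "count": 1}}
with_flag = {"doc": {"name": "x", "count": 1}, "doc_as_upsert": True}
```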
Closes 
2013-06-17 10:23:37 +02:00
Clinton Gormley
2f616e3c2a Merge pull request from clintongormley/nodes_info_timeout
Expose timeout for nodes_info requests in the REST interface
2013-06-15 10:28:42 -07:00
Clinton Gormley
27a8083b7d Expose timeout for nodes_info requests in the REST interface
Closes 
2013-06-15 19:01:09 +02:00
Adrien Grand
a30d58aae2 Compress PagedBytesAtomicFieldData's termOrdToBytesOffset.
Using MonotonicAppendingLongBuffer instead of a GrowableWriter should help
save several bits per value, especially when the bytes to store have similar
lengths.
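
The intuition can be sketched as follows. This is not Lucene's
implementation, which works block by block and bit-packs the result; it is a
minimal Python illustration of the underlying idea, assuming that offsets
grow roughly linearly when the stored byte sequences have similar lengths.
All function names are made up for the example.

```python
def bits_required(values):
    """Fixed bit width needed to store every non-negative value in the list."""
    return max(max(values).bit_length(), 1)


def monotonic_encode(offsets):
    """Store each offset as a small residual from a straight-line prediction."""
    n = len(offsets)
    base = offsets[0]
    avg = (offsets[-1] - base) / (n - 1) if n > 1 else 0.0
    deltas = [value - (base + round(avg * i)) for i, value in enumerate(offsets)]
    min_delta = min(deltas)
    residuals = [d - min_delta for d in deltas]  # shifted so that all are >= 0
    return base, avg, min_delta, residuals


def monotonic_decode(base, avg, min_delta, residuals):
    """Invert monotonic_encode using the same prediction."""
    return [base + round(avg * i) + min_delta + r for i, r in enumerate(residuals)]


# Offsets of byte sequences whose lengths are all 7 or 8 bytes.
offsets = [0, 7, 15, 22, 30, 37, 44, 52]
base, avg, min_delta, residuals = monotonic_encode(offsets)
```

Because the residuals only capture how far each offset strays from the
straight line, they need fewer bits per value than the raw offsets, and the
gap widens as the offsets grow.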

Closes 
2013-06-15 09:31:23 +02:00
Simon Willnauer
25f19f8b87 Wait for relocations in utility methods 2013-06-14 21:59:43 +02:00
Simon Willnauer
a4fc11b3d1 Wait for Yellow state after indexing 2013-06-14 12:14:43 +02:00
Clinton Gormley
f537b8ccee Change default operator to "or" for "low_freq_operator" and "high_freq_operator" parameters for "common" queries
Closes 
2013-06-14 11:08:56 +02:00
Martijn van Groningen
8d59ed3ab0 Use SinglePackedOrdinals over SingleArrayOrdinals to reduce the memory ordinals take for single valued fields in field data.
Closes 
2013-06-14 10:16:49 +02:00
Simon Willnauer
b995abfa80 Call DISI#cost() ahead of time to prevent NPE
NotDocIdSet resets the internal DocIdSetIterator to null causing NPE
if cost is called.

Closes 
2013-06-14 09:49:30 +02:00
Clinton Gormley
c3332db7d0 Fixed an error message on the terms filter 2013-06-13 19:40:47 +02:00
Simon Willnauer
4e4529f3dc Check if Alias Creation was acknowledged in tests.
If there is a failure during alias creation, the tests don't fail with the
correct exception. This commit simplifies debugging by asserting on the ack
flag.
2013-06-13 15:52:33 +02:00
Simon Willnauer
a654c3d103 Set a hard limit on the number of tokens we run suggestion on
PhraseSuggester can be very slow and CPU intensive if a lot of terms
are suggested. To prevent cluster instability and long-running requests,
this commit adds a hard limit (10 tokens by default): if the query is
parsed into more tokens than that, no correction is returned.

Closes 
2013-06-13 15:12:38 +02:00
Alexander Reelsen
9d3e34b9f9 Allow date format to support groups of built-in patterns
Until now 'named dates' like dateOptionalTime could not be used as part of a
group of date formats. This patch allows them to be grouped arbitrarily, like this:

* yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||dateOptionalTime
* dateOptionalTime||yyyy/MM/dd HH:mm:ss||yyyy/MM/dd
* yyyy/MM/dd HH:mm:ss||dateOptionalTime||yyyy/MM/dd
* date_time||date_time_no_millis

Closes 
2013-06-13 15:03:55 +02:00
Martijn van Groningen
015d820e53 Made not found logic easier.
Relates 
2013-06-13 13:21:36 +02:00
Simon Willnauer
7e2d8f1358 add more verbose assertions to tests 2013-06-13 11:58:28 +02:00
Adrien Grand
c20d44a1ff Forbid usage of Character.codePoint(At|Before) and Collections.sort.
Character.codePointAt and codePointBefore have two versions: one which only
accepts an offset, and one which accepts an offset and a limit. The former can
be dangerous when working with buffers of characters because if the offset
is the last char of the buffer, a char outside the buffer might be used to
compute the code point, so one should always use the version which accepts a
limit.
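
The hazard can be modeled outside of Java. The sketch below is a Python
stand-in for the limit-accepting variant, operating on a list of UTF-16 code
units; it mirrors the documented surrogate-pair behavior but is not the JDK
code, and the function name is chosen for the example only.

```python
def code_point_at(units, index, limit):
    """Code point at `index`, never reading code units at or beyond `limit`."""
    if not 0 <= index < limit <= len(units):
        raise IndexError(index)
    high = units[index]
    if 0xD800 <= high <= 0xDBFF and index + 1 < limit:
        low = units[index + 1]
        if 0xDC00 <= low <= 0xDFFF:
            # Valid surrogate pair: combine into a supplementary code point.
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return high  # BMP character or unpaired surrogate


# A reused buffer: only the first two units are valid, the third is stale data.
buffer = [0x0041, 0xD83D, 0xDE00]
valid_length = 2

bounded = code_point_at(buffer, 1, valid_length)   # stays inside the valid region
unbounded = code_point_at(buffer, 1, len(buffer))  # pairs with the stale unit
```

With the limit set to the valid length, the trailing high surrogate is
returned on its own; without it, the stale low surrogate is silently combined
into a code point that was never part of the data.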

Collections.sort is wasteful on random-access lists: it dumps data into an
array, sorts the array and then writes the elements back to the list. However,
the sorting can easily be performed in-place by using Lucene's
CollectionUtil.(merge|quick|tim)Sort.
2013-06-13 10:14:35 +02:00
Martijn van Groningen
6d8a85c6af Made get mapping rest response consistent.
Closes 
2013-06-13 10:11:06 +02:00
Martijn van Groningen
96af4ee44f Use XConstantScoreQuery instead of ConstantScoreQuery.
Relates to 
2013-06-13 10:00:54 +02:00
Boaz Leskes
aa851225e5 Added created flag to index related request classes.
The flag is set to true when a document is new, false when replacing an existing object.

Other minor changes:
Fixed an issue with dynamic gc deletes settings update
Added an assertThrows to ElasticsearchAssertion

Closes  , Closes 
2013-06-13 09:10:32 +02:00
Martijn van Groningen
a2de34eead Added filter support to custom_score query.
Closes 
2013-06-12 22:41:49 +02:00
Martijn van Groningen
dc0d81b8aa Improves the way the get mapping and get warmer get their data from the master's cluster state copy.
Both apis now also support a `local` parameter, that fetches the mapping / warmer from the cluster state of the node that received the request. The `type` option in the get mapping api now also supports wildcards. The warmer api now also supports the `type` option.

Closes 
2013-06-12 21:03:47 +02:00
Simon Willnauer
8e33e0e69d Use CFS in any case if index.compound_format is set to true
Lucene's MergePolicies support a noCFSRatio. This commit introduces
support for this ratio via `index.compound_format`. This setting
can parse a boolean value or a value in the interval [0..1] that
is equivalent to the noCFSRatio. The settings `1`, `1.0` and `true`
are equivalent, as are `0`, `0.0` and `false`.
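
The parsing rule described above can be sketched in a few lines. This is a
Python illustration of the stated equivalences, not the code from this
commit; the function name and the error handling are assumptions.

```python
def parse_no_cfs_ratio(value):
    """Map an index.compound_format value to a ratio in [0.0, 1.0]."""
    text = str(value).strip().lower()
    if text == "true":
        return 1.0  # same as a ratio of 1: always use compound files
    if text == "false":
        return 0.0  # same as a ratio of 0: never use compound files
    ratio = float(text)  # raises ValueError if the value is not numeric
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("compound_format must be a boolean or in [0..1]: %r" % (value,))
    return ratio
```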

Closes 
2013-06-12 20:45:18 +02:00
Simon Willnauer
cb0cf3167c stabilize more tests 2013-06-12 13:25:26 +02:00
Shay Banon
c449fbdd68 missing/exists filters should also work for objects
closes 
2013-06-12 04:42:23 +02:00
Shay Banon
f155525cad upgrade jackson to 2.2.2, netty to 3.6.6 2013-06-11 20:36:08 +02:00
Simon Willnauer
66cd74d2df Always create index with mapping in test to ensure shards are available 2013-06-11 19:08:33 +02:00
Shay Banon
dac2c559d4 remove the index level class support
fix the test that relies on it, just index the data for each test case
2013-06-11 16:35:13 +02:00
Shay Banon
78fb12bcaa fix the type of the mapping 2013-06-11 14:49:34 +02:00
Shay Banon
3a0f9c6ea3 fix shared cluster to delete templates as well per test run 2013-06-11 14:43:18 +02:00
Shay Banon
1d63ff64c7 simplify parsing code 2013-06-11 13:19:54 +02:00
Shay Banon
41e4ee22e6 Thread pool: rename capacity to queue_size
fixes 
2013-06-11 13:07:07 +02:00
Simon Willnauer
7afffbe13b Cleanup String to UTF-8 conversion
Currently we have many different places that convert String to UTF-8
bytes and back. We shouldn't maintain more code than necessary to
do this conversion and rather use Lucene's support for it.
2013-06-10 21:56:24 +02:00
Alexander Reelsen
9323e677bd Cleaning up some tests by using assertHitCount assertion 2013-06-10 16:57:09 +02:00
Simon Willnauer
21945e5060 Ensure all shards return comparable scores for rescore tests 2013-06-10 16:50:10 +02:00