9020 Commits

Author SHA1 Message Date
Simon Willnauer d403e68f43 add missing import 2014-07-28 14:33:51 +02:00
Simon Willnauer bf7f97d22f [CORE] Support alpha/beta releases in version parsing too
Pull request #7055 fixed version parsing for bugfix releases, which were
causing problems with the minor version in segments files. Even though
we never release anything with Lucene in alpha / beta status, this
commit fixes lenient parsing for these cases as well.

Relates to #7055
2014-07-28 14:04:39 +02:00
Simon Willnauer d2493ea48a [CORE] Support parsing lucene minor version strings
We parse the version that is shipped with the Lucene segments in order
to find the version of Lucene that wrote a particular segment. Yet some Lucene
versions, namely:
 * 4.3.1 (Elasticsearch 0.90.2)
 * 4.5.1 (Elasticsearch 0.90.7)
 * 3.6.1 (pre Elasticsearch 0.90.0)

wrote illegal strings containing the minor version, which causes
IllegalArgumentExceptions (IAE) to be thrown from Lucene's parsing method.

Closes #7055
2014-07-28 13:02:00 +02:00
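The two version-parsing commits above (d2493ea48a and bf7f97d22f) describe lenient parsing of version strings that carry a bugfix component or an alpha/beta marker. A minimal sketch of that idea follows. The helper name `parse_lenient`, the regular expression, and the exact alpha/beta spellings it accepts are assumptions made for illustration; this is not the parser that Lucene or Elasticsearch ships.

```python
import re

# Hypothetical pattern: major.minor, an optional bugfix component, and an
# optional alpha/beta marker. The accepted spellings are assumptions.
_VERSION_RE = re.compile(
    r"^(\d+)\.(\d+)(?:\.(\d+))?(?:[._-]?(alpha|beta)\w*)?$", re.IGNORECASE
)


def parse_lenient(version):
    """Return (major, minor), ignoring any bugfix or alpha/beta suffix."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError("unparseable version string: %r" % version)
    return int(match.group(1)), int(match.group(2))
```

With this sketch, `parse_lenient("4.3.1")` yields `(4, 3)` instead of raising, which is the behaviour the commit messages ask for.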
Lee Hinman 07c9b5b08d Change logging level for circuit breaking to warn 2014-07-28 12:10:13 +02:00
Lee Hinman 6abe4c951d Add HierarchyCircuitBreakerService
Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%

`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also accept the backwards-compatible
settings `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)

`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

Conflicts:
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
	src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
	src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
	src/main/java/org/elasticsearch/node/internal/InternalNode.java
	src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
	src/test/java/org/elasticsearch/index/codec/CodecTests.java
	src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
	src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
	src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
	src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
	src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
	src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
	src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
	src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
	src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java
2014-07-28 11:27:33 +02:00
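The HierarchyCircuitBreakerService commit above (6abe4c951d) describes a parent breaker with fielddata and request children. The sketch below illustrates how such a hierarchy could behave, using the default percentages and overheads listed in the commit message. The class names, the `build_breakers` helper, and the way the overhead is applied to the estimate are assumptions, not the actual Elasticsearch implementation.

```python
class CircuitBreakingError(Exception):
    """Raised when an estimate would push a breaker over its limit."""


class ParentBreaker:
    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.children = []

    def total_used(self):
        return sum(child.used for child in self.children)


class ChildBreaker:
    def __init__(self, name, limit_bytes, overhead, parent):
        self.name = name
        self.limit_bytes = limit_bytes
        self.overhead = overhead
        self.used = 0
        self.parent = parent
        parent.children.append(self)

    def add_estimate(self, num_bytes):
        """Reserve num_bytes, tripping if the child or the parent limit is exceeded."""
        new_used = self.used + num_bytes
        if new_used * self.overhead > self.limit_bytes:
            raise CircuitBreakingError("[%s] limit exceeded" % self.name)
        if self.parent.total_used() + num_bytes > self.parent.limit_bytes:
            raise CircuitBreakingError("[parent] limit exceeded")
        self.used = new_used

    def add_without_breaking(self, num_bytes):
        """Account for memory without ever tripping (e.g. responses in flight)."""
        self.used += num_bytes


def build_breakers(heap_bytes):
    # Defaults from the commit message: total 70%, fielddata 60% with
    # overhead 1.03, request 40% with overhead 1.0.
    parent = ParentBreaker(heap_bytes * 70 // 100)
    fielddata = ChildBreaker("fielddata", heap_bytes * 60 // 100, 1.03, parent)
    request = ChildBreaker("request", heap_bytes * 40 // 100, 1.0, parent)
    return parent, fielddata, request
```

Note that the child limits add up to more than the parent limit, so the parent can trip even when each child is still within its own budget.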
Clinton Gormley be86556946 Update request-body.asciidoc
Added link from `timeout` to time-units

Closes #6361
2014-07-28 11:08:59 +02:00
Martijn van Groningen 5631bbb02b [TEST] All shards should be allocated before snapshotting. 2014-07-28 10:48:35 +02:00
Brian Altenhofel dbd5cbee7f Docs: Add Drupal Search API Elasticsearch module
The module at drupal.org/project/elasticsearch has been abandoned. The Search API Elasticsearch module allows Drupal to use Elasticsearch as a backend for Search API.

Closes #7001
2014-07-28 10:46:56 +02:00
Martijn van Groningen 86c0d693c3 [TEST] Ignore Lucene40 codec 2014-07-28 10:40:25 +02:00
Colin Goodheart-Smithe f7b7f67522 Aggregations: fixed value count so it can be used in terms order
Closes #7050
2014-07-28 09:19:01 +01:00
Martijn van Groningen 2e9ee5c937 The `nested` aggregator should also resolve and use the parentFilter of the closest `reverse_nested` aggregator.
Closes #6994
Closes #7048
2014-07-28 10:07:57 +02:00
mikemccand 96ecec34d1 Docs: fix documentation for bloom filter defaults 2014-07-27 18:39:29 -04:00
Clinton Gormley c367ae09e3 Update nested-query.asciidoc
Changed score_mode `total` to `sum` to be consistent with parent-child etc
2014-07-26 22:32:28 +02:00
Clinton Gormley 10b4177def Docs: Fixed path to search-shards 2014-07-26 15:05:53 +02:00
Clinton Gormley 88c8754a3c Docs: Removed search-shards from request-body 2014-07-26 14:52:50 +02:00
Clinton Gormley 93d9628975 Docs: Reorganised the search-shards API docs 2014-07-26 14:51:44 +02:00
mikemccand e42b73c6d4 Test: more verbosity for this test on failure 2014-07-26 04:42:26 -04:00
Adrien Grand f682461b2f Mappings: Enforce non-null settings.
Now that we are using the index created version to make index-time decisions,
assuming that the version is the current version when settings are null is
very error-prone. Instead we should ensure that settings are always non-null
and contain the version when the index was created.

Close #7032
2014-07-25 21:01:44 +02:00
David Pilato 11eced01da Add multi_field support for Mapper externalValue (plugins)
In the context of the mapper attachment plugin and other mapper plugins, sub-fields of a multi field never get the `externalValue` even though it was set.

Here is a full script which reproduces the issue when used with the mapper attachment plugin:

```
DELETE /test

PUT /test
{
    "mappings": {
        "test": {
            "properties": {
                "f": {
                    "type": "attachment",
                    "fields": {
                        "f": {
                            "analyzer": "english",
                            "fields": {
                                "no_stemming": {
                                    "type": "string",
                                    "store": "yes",
                                    "analyzer": "standard"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

PUT /test/test/1
{
    "f": "VGhlIHF1aWNrIGJyb3duIGZveGVz"
}

GET /test/_search
{
    "query": {
        "match": {
           "f": "quick"
        }
    }
}

GET /test/_search
{
    "query": {
        "match": {
           "f.no_stemming": "quick"
        }
    }
}

GET /test/test/1?fields=f.no_stemming
```

Related to https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/57

Closes #5402.
2014-07-25 16:59:42 +02:00
Colin Goodheart-Smithe 655157c83a Aggregations: Added an option to show the upper bound of the error for the terms aggregation.
This is only applicable when the order is set to _count.  The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term.  The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.

Closes #6696
2014-07-25 14:24:24 +01:00
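The error-bound calculation described in commit 655157c83a above can be worked through with a small example. The sketch follows the wording of the commit message: sum the doc count of the last returned term over all shards, then subtract the same sum restricted to the shards that did return the term. The function name and the input layout (one list of `(term, doc_count)` pairs per shard, ordered by count descending) are assumptions for illustration.

```python
def doc_count_error_upper_bound(shard_results, term):
    """Upper bound of the doc count error for `term` across all shards."""
    # Doc count of the last (smallest) term each shard returned.
    last_counts = [buckets[-1][1] if buckets else 0 for buckets in shard_results]
    total_last = sum(last_counts)
    # Same sum, but only over shards whose response contains the term.
    returned_last = sum(
        last
        for buckets, last in zip(shard_results, last_counts)
        if any(name == term for name, _ in buckets)
    )
    return total_last - returned_last


shards = [
    [("x", 10), ("y", 7), ("z", 5)],
    [("x", 9), ("w", 6), ("y", 4)],
    [("w", 8), ("x", 3), ("v", 2)],
]
```

Here the last-term counts are 5, 4 and 2. The term `y` is missing only from the third shard, so its error bound is 2; the term `x` was returned by every shard, so its bound is 0.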
Justin Honold 593fffc7a1 Docs: Changing ES_MAX_MEM default from '1gb' to '1g'
If you set ES_HEAP_SIZE to '1gb' as suggested, Java will yield an "Invalid initial heap size".

Closes #6824
2014-07-25 12:50:59 +02:00
rendel 50634e6a3d Docs: Added new entry for the SIREn plugin.
Closes #6961
2014-07-25 12:49:50 +02:00
Alexander Reelsen a1e335b1e9 CORS: Support regular expressions for origin to match against
This commit adds regular expression support for the allow-origin
header depending on the value of the request `Origin` header.

The existing HttpRequestBuilder is also extended to support the
OPTIONS HTTP method.

Relates #5601
Closes #6891
2014-07-25 10:51:22 +02:00
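The CORS commit above (a1e335b1e9) makes the allow-origin header depend on whether the request `Origin` matches a regular expression. A rough sketch of that decision is shown below. The commit message does not say how a pattern is configured, so the slash-delimited convention used here to mark a value as a regular expression, along with the function name, is an assumption.

```python
import re


def allow_origin_header(configured, request_origin):
    """Return the Access-Control-Allow-Origin value, or None to reject."""
    # Assumption: a value wrapped in slashes is treated as a regular expression.
    if len(configured) > 2 and configured.startswith("/") and configured.endswith("/"):
        pattern = re.compile(configured[1:-1])
        # Echo the request origin back only when the whole origin matches.
        return request_origin if pattern.fullmatch(request_origin) else None
    if configured == "*" or configured == request_origin:
        return configured
    return None
```

Matching the whole origin rather than a substring is a deliberate choice in this sketch, since a partial match would let `http://localhost.evil.example` through a pattern meant for `localhost`.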
Lee Hinman 1fb9f404df [DOCS] correct documentation about groovy/mvel defaults and deprecations 2014-07-25 10:39:33 +02:00
Alexander Reelsen 35e562343f Tests: Remove HttpClient to only use one Http client
The HTTP client implementation used by the Elasticsearch REST tests is
backed by Apache HttpClient rather than a self-written helper class
that uses HttpURLConnection. This commit removes the old simple HttpClient
class and uses the more powerful and reliable one for all tests.

It also fixes a minor bug uncovered by switching to the new client:
when sending a 301 redirect, a Location header needs to be added
as well.

Closes #7003
2014-07-25 10:26:52 +02:00
Adrien Grand 51fd2f513c [TESTS] Fix NPE in FreqTermsEnumTests. 2014-07-25 09:12:01 +02:00
Martijn van Groningen a0e5684d7b [TEST] more logging 2014-07-25 01:16:32 +02:00
Adrien Grand a3d8022dc5 Fielddata: Fix thread safety issue with field data on the `_index` field. 2014-07-24 19:04:22 +02:00
Lee Hinman 89e03910f4 Add a periodic cleanup thread for IndexFieldCache caches
Fixes #7010
2014-07-24 17:23:52 +02:00
Martijn van Groningen 297a97cd23 Core: Use the provided cluster state instead of fetching a new cluster state from cluster service.
Close #7013
2014-07-24 16:23:42 +02:00
Colin Goodheart-Smithe 5483c62de6 Geo: Fixes parse error with complex shapes
The bug reproduces when the point under test for the placement of the hole of the polygon has an x coordinate which only intersects with the ends of edges in the main polygon. The previous code threw out these cases as not relevant but an intersect at 1.0 of the distance from the start to the end of an edge is just as valid as an intersect at any other point along the edge.  The fix corrects this and adds a test.

Closes #5773
2014-07-24 15:17:55 +01:00
Simon Willnauer bd51d7a07f Add `wait_if_ongoing` option to _flush requests
This commit adds the ability to force blocking on the flush operation
to make sure all files have been written and synced to disk. Without
this option a flush might be executing at the same time, causing the
current flush to fail and return before all files are synced.

Closes #6996
2014-07-24 15:34:53 +02:00
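The `wait_if_ongoing` behaviour from commit bd51d7a07f above can be pictured as the difference between a blocking and a non-blocking lock acquisition. The sketch below is only an analogy: the `Engine` class, the exception name, and the counter standing in for the actual write-and-sync work are all hypothetical.

```python
import threading


class FlushAlreadyRunning(Exception):
    """Raised when a flush is in progress and the caller chose not to wait."""


class Engine:
    def __init__(self):
        self._flush_lock = threading.Lock()
        self.flush_count = 0

    def flush(self, wait_if_ongoing=False):
        # Block until the running flush finishes only when asked to;
        # otherwise fail immediately.
        acquired = self._flush_lock.acquire(blocking=wait_if_ongoing)
        if not acquired:
            raise FlushAlreadyRunning("another flush is already executing")
        try:
            self.flush_count += 1  # stand-in for writing and syncing files
        finally:
            self._flush_lock.release()
```

A caller that needs the guarantee that everything is on disk would pass `wait_if_ongoing=True` and accept the extra latency.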
Colin Goodheart-Smithe 127649d174 Aggregations: Added pre and post offset to histogram aggregation
Added preOffset and postOffset parameters to the API for the histogram aggregation which work in the same way as in the date histogram

Closes #6605
2014-07-24 14:32:33 +01:00
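Commit 127649d174 above says the new histogram offsets work the same way as in the date histogram. Assuming that means the pre offset is added before the value is rounded down to its bucket and the post offset is added to the resulting key, the arithmetic can be sketched as follows. The function name and the integer-only handling are illustrative assumptions.

```python
def histogram_key(value, interval, pre_offset=0, post_offset=0):
    """Bucket key for an integer value under the assumed offset semantics."""
    shifted = value + pre_offset                 # applied before rounding
    rounded = (shifted // interval) * interval   # round down to the bucket start
    return rounded + post_offset                 # applied to the resulting key
```

For example, with an interval of 10 the value 17 falls in bucket 10; a pre offset of 5 moves it to bucket 20, while a post offset of 3 relabels bucket 10 as 13.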
Adrien Grand f5d1e0a37d [TESTS] Ensure yellow in SimpleFacetsTests.testFilterFacetWithFacetFilterPostMode. 2014-07-24 15:21:20 +02:00
Shay Banon eb37a5992b remove use of recycled set in filters eviction
closes #7012
2014-07-24 15:00:30 +02:00
javanna d9ff42f88a Internal: expose the indices names every action relates to if applicable
Added two new interfaces:
1) IndicesRequest, which allows retrieving the indices the request relates to in a generic manner, together with the indices options that tell how they are going to get resolved and expanded
2) CompositeIndicesRequest for compound requests that hold multiple indices request like MultiSearchRequest, MultiGetRequest, MultiTermVectorsRequest, BulkRequest, BenchmarkRequest, PercolateRequest, MultiPercolateRequest and MoreLikeThisRequest

Took the chance to streamline the indices options and add them to every request where it makes sense (although they can't be changed from the outside), rather than leaving them implicit in the related TransportAction when indices get expanded (typically MetaData#concreteIndices or MetaData#concreteSingleIndex). Added IndicesOptions parameter to MetaData#concreteSingleIndex to make sure it is taken from the request, where the information belongs, instead of hardcoded within MetaData. The concreteSingleIndex method remains but it's just a utility method that returns a single index instead of an array and complains otherwise.

Also made sure NPE is never thrown when setting indices(null) to IndicesAliasesRequest, similar to what SearchRequest does.

Closes #6933
2014-07-24 14:42:40 +02:00
Adrien Grand 6f31b1135a [Benchmark] Make TermsAggregationSearchBenchmark fairer to uninverted field data.
The benchmark indexes 200 unique full-width longs. For uninverted field data
we try to use the most memory-efficient storage, and in that case it would use
two arrays: one for the doc->ordinals mapping and one for the ordinal->value
mapping. This is slower than what doc values do, which is to store the
mapping from docs to values directly.
2014-07-24 14:35:47 +02:00
Colin Goodheart-Smithe fdf2bb9371 Aggregations: Better JSON output scoping
Before this change each aggregation had to output an object field with its name and write its JSON inside that object.  This allowed for badly behaved aggregations which could write JSON content in the root of the 'aggs' object.  This change moves the writing of the aggregation name to a level above the aggregation itself, ensuring that aggregations can only write within their own scope in the JSON output.

Closes #7004
2014-07-24 12:02:40 +01:00
Robert Muir d8cd755445 Speed up string sort with custom missing value
Today if the user supplies a custom missing value for a string sort,
we do it in an extremely slow way, not using ordinals but dereferencing
bytes for every document. Ordinals are only used if the missing value
is _first or _last.

Instead, use ordinals with custom missing values too.

Closes #7005
2014-07-24 06:27:59 -04:00
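Commit d8cd755445 above replaces per-document byte comparisons with ordinal comparisons even when the missing value is a custom string. One way to picture this is to place the missing value into the sorted term dictionary once, then sort purely on integers. The sketch below does so by doubling ordinals, so that a missing value absent from the dictionary can sit between two existing terms. This is an illustration of the general idea only; the function names and the doubling trick are assumptions, not the code from the commit.

```python
from bisect import bisect_left


def missing_ordinal(sorted_terms, missing_value):
    """Position of missing_value on a doubled ordinal scale (term i -> 2 * i)."""
    pos = bisect_left(sorted_terms, missing_value)
    if pos < len(sorted_terms) and sorted_terms[pos] == missing_value:
        return 2 * pos        # ties with an existing term
    return 2 * pos - 1        # falls between term pos - 1 and term pos


def sort_docs(doc_ords, sorted_terms, missing_value):
    """doc_ords maps doc id -> ordinal into sorted_terms, or None if missing."""
    missing_ord = missing_ordinal(sorted_terms, missing_value)

    def key(doc_id):
        ordinal = doc_ords[doc_id]
        return missing_ord if ordinal is None else 2 * ordinal

    return sorted(doc_ords, key=key)
```

The single binary search costs one string comparison per step, after which every document comparison is an integer comparison.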
Simon Willnauer f130d60b72 [TEST] Don't randomize preference PRIMARY it might not try replicas depending on the clusterstate 2014-07-24 11:36:31 +02:00
Martijn van Groningen 73f7f426de Made `_source` parsing in `top_hits` aggregation consistent with regular `_source` parsing in search api.
Closes #6997
2014-07-24 11:23:59 +02:00
Adrien Grand 8cb4471cca [TESTS] Add more assertions to SimpleFacetsTests. 2014-07-24 11:13:53 +02:00
Brian Murphy ce864d4016 [REFACTOR] TransportActions
Get rid of boilerplate code for handling transport actions.
Make these transport actions extend HandledTransportAction where this code
now lives.
2014-07-24 11:05:29 +01:00
javanna 3e30fa2089 Internal: streamline use of IndexClosedException when executing operation on closed indices
Single index operations now use the IndexClosedException introduced with #6475. This way we can also fail faster when we are trying to execute operations on closed indices and their use is not allowed (depending on indices options). Indices blocks are still checked but we can already throw an error while resolving indices (MetaData#concreteIndices).

Effectively this change also affects what we return when using one of the following apis: analyze, bulk, index, update, delete, explain, get, multi_get, mlt, term vector, multi_term vector. We now return `{"error":"IndexClosedException[[test] closed]","status":403}` instead of `{"error":"ClusterBlockException[blocked by: [FORBIDDEN/4/index closed];]","status":403}`.

Closes #6988
2014-07-24 10:33:58 +02:00
Colin Goodheart-Smithe dc9e9cb4cc Aggregations: change to default shard_size in terms aggregation
The default shard size in the terms aggregation now uses BucketUtils.suggestShardSideQueueSize() to set the shard size if the user does not specify it as a parameter.

Closes #6857
2014-07-24 07:55:09 +01:00
Areek Zillur 5487c56c70 Search & Count: Add option to early terminate doc collection
Allow users to stop document collection early by setting a terminate_after
number. When the newly added parameter is set, the response will include a boolean
terminated_early flag, indicating if the document collection for any shard terminated early.

closes #6876
2014-07-23 15:10:15 -04:00
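The early-termination option from commit 5487c56c70 above can be sketched as a per-shard collector that stops once it has gathered `terminate_after` documents, with the response flag set when any shard stopped early. The function names, the response layout, and the choice to report early termination only when documents were actually left uncollected are assumptions of this sketch.

```python
def collect_shard(doc_ids, terminate_after=0):
    """Collect doc ids from one shard; return (collected, terminated_early)."""
    collected = []
    for doc_id in doc_ids:
        # A terminate_after of 0 means no limit in this sketch.
        if terminate_after and len(collected) >= terminate_after:
            return collected, True
        collected.append(doc_id)
    return collected, False


def search(shards, terminate_after=0):
    results = [collect_shard(doc_ids, terminate_after) for doc_ids in shards]
    return {
        "hits": [doc_id for docs, _ in results for doc_id in docs],
        "terminated_early": any(flag for _, flag in results),
    }
```

Because the limit applies per shard, the total number of hits can still exceed `terminate_after` when several shards contribute.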
Robert Muir 66825ac851 Change numeric data types to use SORTED_NUMERIC docvalues type
instead of a custom encoding in BINARY.

In low level benchmarks this is 2x to 5x faster: it's also optimized
for the common case where fields actually only contain at most one
value for each document.

Additionally SORTED_NUMERIC doesn't lose values if they appear more
than once, so mathematical computations such as averages are correct.

Closes #6967
2014-07-23 14:55:03 -04:00
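The remark in commit 66825ac851 above about averages can be made concrete with a tiny worked example. It assumes, as the commit message implies, that the earlier encoding dropped repeated values within a document; the two encoder functions below are simplified stand-ins for that contrast, not the actual doc values formats.

```python
def encode_sorted_numeric(values):
    """Keep every value, duplicates included, in sorted order."""
    return sorted(values)


def encode_deduplicated(values):
    """Stand-in for an encoding that stores each distinct value once."""
    return sorted(set(values))


def mean(values):
    return sum(values) / len(values)


doc_values = [3, 9, 3]
```

For the document with values 3, 9 and 3, the duplicate-preserving form averages to 5, while the de-duplicated form averages to 6, which is the kind of skew the commit avoids.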
Adrien Grand ff2903d2c6 [TEST] Don't recycle in facets.
The recycling happening in facets is done manually and arrays are sometimes not
released. Aggregations do it in a less error-prone way by registering on to the
SearchContext.
2014-07-23 20:20:16 +02:00
Adrien Grand 629f91ae57 Fielddata: goodbye comparators.
This commit removes custom comparators in favor of the ones that are in Lucene.

The major change is for nested documents: instead of having a comparator wrapper
that deals with nested documents, this is done at the fielddata level by having
a selector that returns the value to use for comparison.

Sorting with custom missing string values might be slower because it uses
TermValComparator, as Lucene's TermOrdValComparator only supports sorting
missing values first or last. But other than this particular case, this change
will allow us to benefit from improvements on comparators from the Lucene side.

Close #5980
2014-07-23 20:08:36 +02:00
Lee Hinman a1a03a184c [DOCS] Fix nested root object indexing documentation
Types can no longer be specified when indexing, see:
https://github.com/elasticsearch/elasticsearch/pull/4552
2014-07-23 18:34:27 +02:00