1766 Commits

Author SHA1 Message Date
Shay Banon e580507fbe wait for yellow after the index is created
also, remove starting one node; it is not useful for the test and slows down the execution
2013-05-17 18:21:51 +02:00
Simon Willnauer 17681d7104 Grow array buffer in ScriptDocValues if needed
The buffer in ScriptDocValues for Strings was never grown, causing an
NPE in scripts if a document has > 10 distinct values in a field.

Closes #3051
2013-05-17 15:16:59 +02:00
Alexander Reelsen bd857d6d2e Ensuring test isolation when jvm plugins are loaded
Instead of specifying 'path.plugins' configuration option, 'plugin.types'
is used to load plugins in integration tests. This makes sure the JVM
plugins are not loaded in all following tests from then on.

Also removed the now unneeded es-plugin.properties files from JVM test
plugins.
2013-05-17 13:21:26 +02:00
Alexander Reelsen 2485c4890c Packaging improvements & bugfixes
* RPM: Use the ES_USER variable to set the user (same name as in the debian package
  now), while retaining backwards compatibility to existing /etc/sysconfig/elasticsearch
* RPM: Bugfix: Remove the user when uninstalling the package
* RPM: Set an existing homedir when adding the user (allows one to run cronjobs as this user)
* DEB & RPM: Unify Required-Start/Required-Stop fields in initscripts
2013-05-17 11:14:44 +02:00
Alexander Reelsen 2e07af63ba Allowing pluggable highlighter implementations.
Currently elasticsearch ships with the plain and the fast-vector highlighter.
In order to support arbitrary highlighters via plugins, you only need to
implement a Highlighter interface and register your implementation in your
plugin at the HighlightModule.

In addition you can also add arbitrary options via the 'options' field in
the highlight request, which can be parsed in the highlighter implementation.

In order to find out how to write your own highlighter, check out the test
classes (CustomHighlighterSearchTests and CustomHighlighter).

Closes #2828
2013-05-17 09:07:13 +02:00
Martijn van Groningen db421742f7 Added support for nested sorting for script sorting and geo sorting.
Closes #3044
2013-05-16 18:45:00 +02:00
Martijn van Groningen 42d5bdd337 If matching root doc's inner objects don't match the `nested_filter` then the `missing` value should be used to sort the root doc.
Closes #3020
2013-05-16 10:12:02 +02:00
Shay Banon 2779967279 fix package name... 2013-05-15 17:06:42 +02:00
Martijn van Groningen bc0c7f8f28 Added simple id loading test.
Relates to #3028
2013-05-15 16:10:22 +02:00
Simon Willnauer 8235b89e9c Don't apply min frequency smoothing if suggest type is 'always'
Using an automatically detected 'min_doc_freq' if suggest type is set to
'always' is counterintuitive. If we always suggest, ignore the frequency and
set the threshold frequency to 0 to allow all possible candidates to be drawn if
they are within the given bounds.

Closes #3037
2013-05-15 15:17:49 +02:00
Martijn van Groningen 48cb06c9cf Keep backwards compatible with 0.90.0 on the transport layer.
Relates to #3039
2013-05-15 13:28:55 +02:00
Martijn van Groningen 585cbf6886 Routing value not serialized on transport layer.
Closes #3039
2013-05-15 13:09:13 +02:00
Clinton Gormley db805cf5a9 Corrected English in a shard error message 2013-05-15 12:41:49 +02:00
Clinton Gormley 4d09e7562a Corrected a typo and improved the English in a master-discovery error 2013-05-15 12:39:31 +02:00
Shay Banon f92eed8591 clean thread locals without needing a wrapper
clean thread locals smartly by identifying "our" classes and removing them, so there is no need to wrap them in our own cleanable values
2013-05-15 12:13:13 +02:00
Shay Banon 4d357660ca reuse version key in an actual operation
no need to compute the hash several times
2013-05-15 00:27:48 +02:00
Shay Banon 1fb78c53b8 remove unused class 2013-05-14 20:21:37 +02:00
Shay Banon 1c7d2442c8 use bytes instead of String as key in versionMap
no need to create a String every time we put or get a value from the version map
2013-05-14 20:18:54 +02:00
Martijn van Groningen 15fcb17a81 During parent uid loading seek to next parent type when child type is encountered.
Relates to #3028
2013-05-14 16:22:05 +02:00
Simon Willnauer 6d5805c901 Use Recovery Throttling by default.
To prevent excessive resource use during recovery we use
recovery throttling by default to prevent unexpected peak load
on clusters. The default is set to 20 MB/sec.

Closes #3035
2013-05-14 15:10:03 +02:00
Simon Willnauer 6624949501 Use Merge Throttling by default on node level.
Merge Throttling is one of the most recommended settings and crucial in the
RealTime indexing case. We should set the default to a reasonable setting
that allows folks to index into a production index without seeing large merge
peaks by default. The default is set to 20 MB/sec on the node level.

Closes #3033
2013-05-14 15:10:03 +02:00
Simon Willnauer 09fb2264d0 Raise search threadpool default size.
The default size used to be 2x availableProcessors which seemed to
be too low a value in practice. 3x appeared to be a sweet spot for
most applications. The default is now 3 x availableProcessors.

Closes #3023
2013-05-14 15:10:03 +02:00
uboness d06a15ec3e Support for term facets on unmapped fields
Added support for unmapped & partially mapped fields (partially mapped fields may occur when searching across multiple indices where the faceted field is mapped on some and unmapped on others). If a shard doesn't have mappings for a field, the matching documents count on that shard will be added to the missing count for that facet.
2013-05-14 13:53:41 +02:00
Martijn van Groningen 906f278896 Make sure only relevant documents are evaluated in the second round lookup phase.
Both has_parent and has_child filters are internally executed in two rounds. In the second round all documents are evaluated whilst only specific documents need to be checked. In the has_child case only documents belonging to a specific parent type need to be checked and in the has_parent case only child documents need to be checked.

Closes #3034
2013-05-14 11:02:03 +02:00
Shay Banon ae6c1b345f Allow to disable allocation on the index level
Similar to the global cluster-wide disable allocation flags, allow setting those on a specific index by updating its settings. The keys are the same as the cluster ones, except they start with `index`, for example: index.routing.allocation.disable_allocation set to true.
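As a rough illustration, a settings update for a single index might be assembled as follows. Only the setting key comes from the commit message; the host, the index name `my_index`, the helper name, and the use of the index settings update path `/{index}/_settings` are assumptions made for this sketch, and nothing is actually sent.

```python
import json


def build_disable_allocation_request(index, disabled=True, host="http://localhost:9200"):
    """Return (method, url, body) for an index settings update; nothing is sent."""
    # Assumption: index settings are updated with a PUT on /{index}/_settings.
    url = f"{host}/{index}/_settings"
    # The key is the one named in the commit message.
    body = json.dumps({"index.routing.allocation.disable_allocation": disabled})
    return "PUT", url, body


# "my_index" is a hypothetical index name.
method, url, body = build_disable_allocation_request("my_index")
print(method, url, body)
```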
closes #3031
2013-05-14 10:25:23 +02:00
Simon Willnauer 7b437e801a Added test for LimitTokenCountFilterFactory 2013-05-14 09:58:43 +02:00
Brusic 183ac1e04c Expose LimitTokenCountFilter as a TokenFilter
Closes #3013
2013-05-14 09:58:42 +02:00
Martijn van Groningen 669cf90d0c Not load the ids of child documents into memory.
Closes #3028
2013-05-14 09:46:43 +02:00
Alexander Reelsen 31b4b7ea58 Renaming span_multi_term query to span_multi
... due to discussing this on #2610 in order to have a more concise name
2013-05-13 12:32:57 +02:00
Simon Willnauer cffe333fe3 Ensure tests pass if store dir is a soft-link 2013-05-13 12:08:41 +02:00
Simon Willnauer a3a2ca0ad3 Reduce branches in TopChildrenQuery
The branches used in the score method can be moved into the
scorer call and be essentially a constant operation rather than
a linear operation depending on the number of parent docs.
2013-05-13 12:08:41 +02:00
Alexander Reelsen 52654179e7 Fix for RPM postinstall on old OpenSUSE distributions
Older OpenSUSE distributions do not ship with systemd and therefore are
using chkconfig, but do not have their scripts placed at /etc/init.d/
This patch is more defensive and adds additional checks in the postinstall
script to prevent aborted post install scripts, which makes the RPM
uninstallable.
2013-05-13 11:48:04 +02:00
Martijn van Groningen 3c58176d29 Also support `sum` as `score_mode` option for the nested query.
Relates to #3026
2013-05-13 10:38:20 +02:00
Martijn van Groningen 6eaad25621 Made all the queries support `score_mode` parameter name in addition to the existing parameter name for score mode.
Closes #3026
2013-05-13 10:30:01 +02:00
Martijn van Groningen bacf969dd3 Improved the stability of hl tests by adding waiting for at least yellow status.
In some test cases this was missing.
2013-05-13 10:18:36 +02:00
Shay Banon 21d749a6aa resolved empty setting values should be removed
when resolving empty settings values, their value should be removed, for example, when using ${env.ENV_VAR}, and ENV_VAR is not set, then the setting should be removed
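The behavior described above can be sketched roughly as follows. This is an illustrative model only: the placeholder pattern, the function name, and the sample setting keys are assumptions, and the real settings resolution may handle more placeholder forms.

```python
import re

# Assumption: only placeholders of the form ${env.NAME} are modeled here.
PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_settings(settings, env):
    """Substitute ${env.NAME} placeholders and drop settings that resolve to empty."""
    resolved = {}
    for key, value in settings.items():
        # An unset variable is substituted with the empty string.
        new_value = PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)
        if new_value == "":
            continue  # the setting is removed instead of being kept as ""
        resolved[key] = new_value
    return resolved


# Hypothetical settings: NODE_NAME is not set in the environment, CLUSTER is.
settings = {"node.name": "${env.NODE_NAME}", "cluster.name": "${env.CLUSTER}"}
print(resolve_settings(settings, {"CLUSTER": "prod"}))
```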
2013-05-12 05:18:03 +02:00
Shay Banon 2ab72da7d6 update to joda 2.2 2013-05-11 23:37:56 +02:00
Shay Banon ee636c2330 use throwable in transport layer
catch throwable when processing messages in the transport layer, to report back failures even under errors
2013-05-11 21:30:16 +02:00
Shay Banon 342e9cf18e test no longer needed... 2013-05-11 01:22:23 +02:00
Shay Banon 6e26efcd87 not active shards should translate to 503 not 500 2013-05-10 18:09:42 +02:00
Alexander Reelsen 21fcc482eb Allow to set headers in HTTP response
This commit allows to set custom headers in HTTP responses (like
setting the WWW-Authenticate header for basic auth) by adding
RestRequest.addHeader() method.

Closes #2936
Closes #2540

To get the history right: This is based on PR #2723
2013-05-10 17:58:46 +02:00
Shay Banon da5dff9ee4 remove concrete bytes for field data
no real need for it, especially given that we don't need to deepCopy on makeSafe for the (default) paged bytes
2013-05-10 17:42:34 +02:00
Shay Banon 455b5da52f No need for deepCopy on makeSafe for paged field data
Since it's a reference to a buffer in the PagedBytes, we don't need to deep copy it on makeSafe, just shallow copy it
2013-05-10 17:25:39 +02:00
Martijn van Groningen 2be23d2427 Added test that checks if a validation error is thrown when both doc and script are provided in an update request.
Closes #2967
2013-05-10 16:43:20 +02:00
Alexander Kahn 47971ac808 Reject update request that has both script and doc 2013-05-10 16:31:54 +02:00
Martijn van Groningen 9ddd675a02 Added support for the update operation in the bulk api.
Update requests can now be put in the bulk api. All update request options are supported.

Example usage:
```
curl -XPOST 'localhost:9200/_bulk' --data-binary @bulk.json
```

Contents of bulk.json that contains two update request items:
```
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "script" : "counter += param1", "lang" : "js", "params" : {"param1" : 1}, "upsert" : {"counter" : 1}}
```
The `doc`, `upsert` and all script related options are part of the payload. The `retry_on_conflict` option is part of the header.
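For illustration, the same two-item body could be generated programmatically. This is a minimal sketch; the helper name is hypothetical, and it only builds the newline-delimited JSON text without sending it.

```python
import json


def bulk_update_lines(items):
    """Serialize (header, payload) pairs as newline-delimited JSON for the bulk API."""
    lines = []
    for header, payload in items:
        # The action line carries the header options, e.g. _retry_on_conflict.
        lines.append(json.dumps({"update": header}))
        # The following line carries doc, upsert, and script-related options.
        lines.append(json.dumps(payload))
    # A bulk body is terminated by a newline.
    return "\n".join(lines) + "\n"


items = [
    ({"_id": "1", "_type": "type1", "_index": "index1", "_retry_on_conflict": 3},
     {"doc": {"field": "value"}}),
    ({"_id": "0", "_type": "type1", "_index": "index1", "_retry_on_conflict": 3},
     {"script": "counter += param1", "lang": "js",
      "params": {"param1": 1}, "upsert": {"counter": 1}}),
]
body = bulk_update_lines(items)
print(body, end="")
```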

Closes #2982
2013-05-10 16:03:24 +02:00
Shay Banon c5e177dc56 lazy compute the hash and actually use it... 2013-05-10 11:51:55 +02:00
Igor Motov 4d66575abe Make GetField behavior more consistent for multivalued fields.
Before this change, the GetField#getValue() method was returning a list of values of a multivalued field if the field values were obtained from source or if the field was stored and real-time get was used. If the field was stored but non-realtime get was used, GetField#getValue() was returning only the first element and GetField#getValues() was returning a list of elements. This change makes the behavior consistent. GetField#getValue() now always returns only the first value of the field and GetField#getValues() returns the entire list.
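The resulting contract can be modeled in a few lines. This is a stand-in written in Python, not the actual Java class; the class name and the choice to return `None` for an empty field are assumptions of the sketch.

```python
class GetFieldSketch:
    """Toy model of the contract described above, not the real GetField class."""

    def __init__(self, name, values):
        self.name = name
        self._values = list(values)

    def get_value(self):
        # Always the first value, regardless of how the field was loaded.
        # Returning None for an empty field is an assumption of this sketch.
        return self._values[0] if self._values else None

    def get_values(self):
        # Always the entire list of values.
        return list(self._values)


field = GetFieldSketch("tags", ["a", "b", "c"])
print(field.get_value(), field.get_values())
```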
2013-05-09 12:45:49 -04:00
Igor Motov d69dd321fc Improve test stability 2013-05-09 10:21:21 -04:00
Shay Banon 8a2e5bbe68 Reroute Allocate to force primary allocation when enabled
Typically, the main reason a reroute allocation command is issued with allow_primary enabled is to force the creation of an empty new shard because a shard (and its replicas) were lost. This can't be done today because the shard expects to have a valid index where it's allocated; we need to clear its post allocation flag to make sure it is allowed to create a fresh index.
2013-05-09 00:47:57 +02:00
Igor Motov 15c8510e65 Fix DfsSearchResult method names in AggregatedDfs 2013-05-08 18:21:47 -04:00
Simon Willnauer 436e23b8d4 Use simplified asserts and better naming
addOne / subOne are likely easier to understand without reading the docs.
If I would read my emails this would have made it in the last commit.
2013-05-08 22:06:01 +02:00
Simon Willnauer 1ef8761b70 Handle optional term and field statistics gracefully
Lucene provides a set of statistics that depend on the codec / postingsformat
as well as on the index options used when the field is created / indexed.
If a certain stats value is not available, Lucene returns `-1` instead of the
correct value. We need to ensure that those values are encoded correctly if
we try to write vLongs as well as when we aggregate those values.
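One way to picture the problem: a variable-length long encoding has no room for negative numbers, so a `-1` marker has to be shifted before writing and shifted back after reading, and an aggregate has to stay "unavailable" if any input is. The sketch below assumes that approach; the function names are illustrative Python, not the actual implementation.

```python
def write_vlong(value):
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("vLong cannot encode negative values")
    out = bytearray()
    while value & ~0x7F:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)
    return bytes(out)


def read_vlong(data):
    """Decode a value produced by write_vlong."""
    result, shift = 0, 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return result


def add_one(value):
    return value + 1  # maps the -1 "unavailable" marker to 0 before writing


def sub_one(value):
    return value - 1  # restores the original value after reading


def optional_sum(left, right):
    # Assumption: if either side is unavailable, the sum is unavailable too.
    return -1 if -1 in (left, right) else left + right


print(sub_one(read_vlong(write_vlong(add_one(-1)))), optional_sum(5, -1))
```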

Closes #3012
2013-05-08 21:34:48 +02:00
Igor Motov dbaf39c792 Add more informative toString method to StoreDirectory 2013-05-08 12:05:40 -04:00
Simon Willnauer a89230945f Add NGramTokenizer and NGramTokenFilter to broken chains
NGramTokenizer and NGramTokenFilter are broken with a version < 4.2.
We should still support these filters but should prevent the StringIOOB
exceptions. Adding these filters to the FragmentBuilderHelper will
allow seamless highlighting on fields indexed with those tokenizers or
token filters.
2013-05-08 17:57:20 +02:00
chilling b7cd8a64cd Merge pull request #3009 from chilling/issue2986_scores
Fixed parsing of track_scores in RestSearchAction
2013-05-08 03:54:16 -07:00
Florian Schilling 19fab7cd0e Fixed parsing of `track_scores` in `RestSearchAction`
Closes #2986
2013-05-08 12:45:32 +02:00
Simon Willnauer c1e8d4787a Don't use smart query wrapping for span term query
Lucene's span queries are a different family than 'ordinary' queries
in Lucene. Spans only work with other spans, so smart query
wrapping doesn't work with span queries at all, i.e. we can't wrap
them in a filtered query.

Closes #2994
2013-05-07 22:52:57 +02:00
Simon Willnauer 992a40cbd8 Add `field_masking_span` to IndexQueryModule
The query parser for `field_masking_span` has never been added / bound to
the IndexQueryModule.

Closes #3007
2013-05-07 21:50:47 +02:00
Igor Motov 32abf6b890 Fix error getting array fields
Fixes #3000
2013-05-07 13:30:29 -04:00
Simon Willnauer 130f0f6afd Remove Java 7 only API
We still run on Java 6 as minimum requirement. Integer.compare(int,int)
was added in Java 7. This caused compile errors on CI.
2013-05-07 18:40:56 +02:00
uboness 74317fec8b Fixed custom hunspell dictionary directory
Properly load dictionaries based on the "indices.analysis.hunspell.dictionary.location" setting if one exists
2013-05-07 17:21:37 +02:00
Simon Willnauer e1b66b34ea Don't fail hard if broken analysis is used.
Today an analysis chain with broken tokenfilters or tokenizers like
WordDelimiterFilter might produce somewhat broken term vectors that cause
`StringIndexOutOfBoundsExceptions` if FastVectorHighlighter is used
since the positions / offsets contract is violated and offsets of highlight
tokens are not increasing but decreasing even if their positions are increasing.

Yet, if we detect such a situation we can re-sort the tokens, which might cause
somewhat odd highlights but doesn't fail hard with a StringIndexOOBException.

Closes #3006
2013-05-07 16:29:00 +02:00
uboness 14ae2fb765 Changed the priority of delete-index action to URGENT
All index meta data APIs have urgent priority when it comes to cluster state updates. We'd like to remove indices ASAP to avoid things like unnecessary shard relocations
2013-05-07 05:43:08 +02:00
Simon Willnauer 758a4fcdc0 Enable Geo-Shape Relations Within and Disjoint 2013-05-06 18:03:16 +02:00
Simon Willnauer 2219925485 Upgrade to Lucene 4.3.0
This Lucene Release introduced a new API on DocIdSetIterator that requires each
implementation to return a `cost` upperbound as a function of the iterated documents.
This API allows for several optimizations during query execution especially in
Conjunction and Disjunction Queries with min_should_match set.

Closes #2990
2013-05-06 18:03:16 +02:00
Shay Banon f566527513 Rest Get Source
Allow to get the source directly using a specific REST endpoint without any additional content around it, the endpoint is `{index}/{type}/{id}/_source`.
Note, HEAD now also supports the _source endpoint.
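As a small illustration of the endpoint shape, the URL for a document's source could be built like this. The helper name, the host, and the sample index, type, and id values are assumptions made for the sketch; only the `{index}/{type}/{id}/_source` layout comes from the commit message.

```python
from urllib.parse import quote


def source_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the URL of the source-only endpoint for one document."""
    # Percent-encode each path segment so ids containing '/' stay one segment.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    return f"{host}/{path}/_source"


# "twitter", "tweet" and "1" are hypothetical values.
print(source_url("twitter", "tweet", "1"))
```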
closes #2993, closes #2995
2013-05-06 14:33:23 +02:00
Derek McNeil fbd732cde2 Added support for Collections in TermsQuery/InQuery. 2013-05-06 10:30:30 +02:00
Simon Willnauer 3c995d5dcc Expose Lucene Main Version via Main Action. A call to `/` will
return the version of the used Lucene library next to the Elasticsearch
version.

Closes #2988
2013-05-06 09:48:08 +02:00
Simon Willnauer 29da615afd Use full ord range in binary search. The upperbound of the binary search in
BytesRefOrdValComparator starts at 1 and ends at maxOrd - 1. Yet, numOrd is defined
as maxOrd - 1 excluding the 0 ord.

This causes wrong sort ords when the bottom of the queue is compared to the next
segment and the greatest term in the new segment is in fact less than the current
queue bottom. If that is true we treat the values as equal and never include the right
value in the queue.

Closes #2991
2013-05-05 00:48:10 +02:00
Igor Motov f92c53efdb Accept loopback interfaces in the network.host setting
Closes #2924. Adds support for loopback interfaces such as _lo0_ in network.host and other network settings.
2013-05-03 14:36:49 -04:00
Martijn van Groningen f22510cab5 A neater approach for processing should clauses before must or must_not clauses. 2013-05-03 18:25:32 +02:00
Martijn van Groningen 52edc4c652 Fixed issue where 'fast' should filter can make documents that didn't match the must or must_not clause a match again. Relates to #2979 2013-05-03 17:37:41 +02:00
Alexander Reelsen 70355f693f Refactoring SpanMultiTermQuery support
* Added license headers where needed
* Refactored SpanMultiTermQueryParser
* Refactored tests to adhere to other tests
2013-05-03 16:00:51 +02:00
Anton Hägerstrand e30aa6b221 Support SpanMultiTerm, closes #2610, #2400
This adds support for lucene span multi term queries. This lucene query
allows users to form complicated queries such as wildcards or prefix
queries embedded within span queries.
2013-05-03 16:00:31 +02:00
Igor Motov ed289dc6c7 Improve stability of SimpleDataNodesTests
Make sure that we are waiting for the new state to be propagated to the node where we are executing the followup query that depends on this state.
2013-05-03 09:14:09 -04:00
Simon Willnauer c9c10273a6 Introduced an Operation enum that is passed to each call of
WeightFunction#weight to allow dedicated weight calculations per operation. In certain
circumstances it is more efficient / required to ignore certain factors in the weight
calculation to prevent, for instance, relocations if they are solely triggered by tie-breakers.
In particular the primary balance property should not be taken into account if the delta for
early termination is calculated, since otherwise a relocation could be triggered solely by the
fact that two nodes have a different number of primaries allocated to them.

Closes #2984
2013-05-03 14:37:47 +02:00
Alexander Reelsen ad92d82680 Added a first small set of hamcrest matchers
A first implementation of adding matchers and helper methods to elasticsearch.
The following ones are supported

assertHitCount(searchResponse, 2);

// helper methods to easily access the first hits
assertFirstHit(searchResponse, hasId("foo"));
assertSecondHit(searchResponse, hasType("foo"));
assertThirdHit(searchResponse, hasIndex("foo"));

// methods to access all other hits
assertSearchHit(searchResponse, 5, hasId("10"));
// same as above, but maybe more readable
assertSearchHit(searchResponse.getHits().getAt(5), hasIndex("foo"));

I changed GeoFilterTests to show how it works.

Furthermore I inlined assertHighlight() from HighlighterSearchTests.
The ElasticsearchAssertions class can now be used as a centralized assertion class,
so that every developer has a single class to look at.
2013-05-03 09:29:56 +02:00
Simon Willnauer 345b63e2d0 Use less aggressive threshold to prevent primary relocation in recovery test 2013-05-02 17:51:21 +02:00
Simon Willnauer 72982d955a Use current settings as default in BalancedShardsAllocator instead of defaults.
Custom settings are not always present in the `Settings` that are passed
to `NodeSettingsService.Listener#onRefreshSettings` such that using the defaults
will necessarily override the custom settings if set before.

Closes #2973
2013-05-02 16:23:48 +02:00
uboness 58bc21a216 Added tests for hunspell token filter factory 2013-05-02 14:10:30 +02:00
Martijn van Groningen 59a741cee5 Properly cache parent/child queries in the case they are wrapped in a compound filter.
Closes #2971
2013-05-02 12:08:54 +02:00
uboness f430953ca1 Changed hunspell token filter factory to use "dedup = true" by default 2013-05-01 23:57:14 +02:00
Martijn van Groningen 0d3b7871df Added support for sort_mode `avg` for sorting by geo_distance.
Closes #2962
2013-05-01 12:53:31 +02:00
Martijn van Groningen c21ab1a9cf Return proper response code for delete by query api in the case of failures.
Closes #2963
2013-05-01 11:53:40 +02:00
Igor Motov 6437c51501 Improve stability of SimpleRecoveryLocalGatewayTests
Fixed testX and testSingleNodeNoFlush by specifying mapping on index creation instead of using dynamic mapping. Dynamic mapping is updated on the cluster level asynchronously and if mapping changes are not applied to the cluster state before the node is closed, these changes are not available after node restart. While data added in the test is preserved, due to the absence of mapping, the test still fails. This is a known issue that we are not planning to fix at the moment.
2013-04-30 12:11:30 -04:00
Alexander Reelsen a694e97ab9 Support source include/exclude for realtime GET
Currently realtime GET does not take source includes/excludes into account.
This patch adds support for the source field mapper includes/excludes
when getting an entry from the transaction log. Even though it introduces
a slight performance penalty, it now adheres to the defined configuration
instead of returning all source data when a realtime get is done.
2013-04-30 17:48:03 +02:00
Alexander Reelsen d5f4c8230d XContentMapValues.filter now works with nested arrays
The filter method of XContentMapValues actually filtered out nested
arrays/lists completely due to a bug in the filter method, which threw
away all data inside of such an array.

Closes #2944
This bug was a follow up problem, because of the filtering of nested arrays
in case source exclusion was configured.
2013-04-30 17:33:09 +02:00
Simon Willnauer 773ea0306b Fail with IAE if a numeric field is used for the analysis endpoint.
Analysing a numeric field will return UTF-16 representations of
Lucene's numeric prefix terms. Those terms are meaningless in general
unless used for lookups in the Lucene index. Passing a numeric field
to the analysis action is most likely a bug.

Closes #2953 #2952
2013-04-30 16:07:11 +02:00
Simon Willnauer 8c6ba59b83 Upgrade Lucene Version to 4.2. The latest Elasticsearch version must
use the latest Lucene version as specified in o.e.common.lucene.Lucene
and must be upgraded with each lucene release.

This commit adds an assert that fails once the actual lucene version
that is used is higher than the current releases version.
2013-04-30 14:06:57 +02:00
Simon Willnauer 42b9674d0c added simple test for numeric match query 2013-04-30 13:53:49 +02:00
Shay Banon 6c3bb4dcdd move to 1.0.0.Beta1 snap 2013-04-29 13:51:09 +02:00
Shay Banon cb75ce0caa release 0.90.0 GA 2013-04-29 13:41:43 +02:00
Shay Banon 9ded2405a0 Use Lucene Version that was used to create the index in Analysis
Lucene ships with a version constant that is mainly used to provide consistent behaviour across lucene release versions. Lucene's Analysis capabilities are commonly applied at index and search time such that the search-time behaviour should be identical to the index-time behaviour in most of the cases. Currently ElasticSearch always uses the latest version from Lucene which can break backwards compatibility with the index for users that rely on behaviour that changed in new Lucene version.

Users should always use the version the index was created with unless it's explicitly configured.

closes #2945
2013-04-29 13:18:51 +02:00
Simon Willnauer bd7ff6946e Added X Versions of NGramTokenFilter and NGramTokenizer to ElasticSearch. These versions
don't produce broken positions anymore and prevent certain highlighter bugs that fail with
StringArrayOutOfBoundsExceptions as in #2931

This commit breaks backwards compatibility in terms of highlighting when NGramTokenFilter is used.
The highlighter will highlight the entire terms as produced by the tokenizer instead of the individual
sub-gram. To do sub-gram highlighting, the ngram tokenizer should be used. This behavior was based on
broken NGramTokenFilter behavior which will be fixed in Lucene 4.4 but was ported in this commit
to elasticsearch 0.90. The broken behavior can still be used if a version < LUCENE_42 is used
in the token filter mapping.

Closes #2931
2013-04-27 16:48:25 +02:00
Shay Banon f09ad507a4 open context stats
- rename to open_contexts from open, we might have other open stats in the future related to search (lucene index searchers?)
- add a test to verify it works
2013-04-27 15:09:47 +02:00
Simon Willnauer 8a7f81104f Remove XSimpleFragmentsBuilder and XScoreOrderFragmentsBuilder since the only difference
to the lucene version is that `discreteMultiValueHighlighting` does default to `true`. Yet
we set this anyway in the HighlightingPhase such that the classes are obsolete.
2013-04-26 20:04:38 +02:00
Simon Willnauer 355f80adc9 Added temporary fix for LUCENE-4899 where FastVectorHighlighter failed with StringIndexOutOfBoundsException
if a single highlight phrase or term was greater than the fragCharSize producing negative string offsets

The fixed BaseFragListBuilder was added as XSimpleFragListBuilder which triggers an assert once Elasticsearch
upgrades to Lucene 4.3
2013-04-26 19:48:48 +02:00
Simon Willnauer 2ed2fab904 Add assert that fails once Elasticsearch upgrades to Lucene 4.3 in order to remove the duplicated class 2013-04-26 19:16:21 +02:00
Alexander Reelsen 90353ceb79 Fixing possible NoClassDefFoundError when trying to load nonexisting classes
In order to handle exceptions correctly, when classes are not found, one
needs to handle ClassNotFoundException as well as NoClassDefFoundError
in order to be sure to have caught every possible case. We did not cater
for the latter in ImmutableSettings yet.

This fix is just executing the same logic for both exceptions instead of
simply bubbling up NoClassDefFoundError.
2013-04-26 10:34:10 +02:00
Alexander Reelsen 22e25cc165 Added stolen time to OsStats output 2013-04-25 10:46:24 +02:00
Shay Banon c4968d7d65 no longer support snappy... 2013-04-25 09:38:58 +02:00
Igor Motov 982b570037 Fix serialization of sync/async replication type 2013-04-25 08:25:31 +02:00
Martijn van Groningen dd12e0b86c If searchContext not set, abort parsing and throw ISE 2013-04-24 10:24:15 +02:00
Simon Willnauer c884304753 Fall back to local statistics if global statistics are not available for a field or term
Closes #2926
2013-04-23 13:32:35 +02:00
Simon Willnauer f372f7c109 Cut over StringScriptDataComparator to use BytesRef instead of Strings
Closes #2920
2013-04-23 13:29:19 +02:00
Simon Willnauer 7a36bed031 Remove per-doc ord collector callback in favor of an iterator 2013-04-23 10:35:40 +02:00
Martijn van Groningen c390f9b1a9 Added more test assertions 2013-04-19 22:16:42 +02:00
Simon Willnauer 7ea6cd6888 use Double/Float.compare for stable and correct float sort order 2013-04-19 21:40:01 +02:00
Clinton Gormley 1483a3a0e5 Added tests for multi_match with minimum_should_match 2013-04-19 21:40:01 +02:00
Clinton Gormley e508b27203 Apply minimum_should_match to inner clauses of multi_match query
When specifying minimum_should_match in a multi_match query it was being applied
to the outer bool query instead of to each of the inner field-specific bool queries.

Closes #2918
2013-04-19 21:39:54 +02:00
Simon Willnauer 3ab56e16b7 Support empty string in FSTBytesAtomicFieldData 2013-04-19 12:49:06 +02:00
Simon Willnauer a1c62759c9 remove size bound from cache recycler for performance reasons 2013-04-19 12:36:12 +02:00
Simon Willnauer 2d13aa29f8 s/ES.RECYCLE/es.cache.recycle 2013-04-19 11:48:28 +02:00
Simon Willnauer 05b6c46bec allow CacheRecycler to be cleared via the REST API 2013-04-19 11:45:33 +02:00
Simon Willnauer 79db1bfbf0 make object caching optional 2013-04-18 19:14:19 +02:00
Florian Schilling 54cb4b9615 # Response for Cluster Settings Update API
If cluster settings are updated, the REST API returns the accepted values. For
example, updating the `cluster.routing.allocation.disable_allocation` via
cluster settings:

```
curl -XPUT http://localhost:9200/_cluster/settings -d '{
    "transient":{
        "cluster.routing.allocation.disable_allocation":"true"
    }
}'
```

will respond:

```
{
    "persistent":{},
    "transient":{
        "cluster.routing.allocation.disable_allocation":"true"
    }
}
```

Closes #2907
2013-04-18 11:34:58 +02:00
Lucas Ward 99c101c37e If a value/field is a Calendar, it will be converted to a Date using getTime()
Closes #2911
2013-04-18 10:57:08 +02:00
Shay Banon 0eb298fe64 use more aggressive concurrency levels for CHM
- long running ones with high update rates
- also expose a *system* property of es.useConcurrentHashMapV8 to use the new non blocking Java8 CHM impl
2013-04-17 14:28:38 -07:00
Shay Banon 271305d5eb Search Stats: Add current open searches
closes #2906
2013-04-16 18:08:57 -07:00
Simon Willnauer efc9e8fe7b only return primary if it is active in PlainOperationRouting
Closes #2896
2013-04-16 17:20:22 +02:00
Martijn van Groningen bcc16654d2 Better error messaging when postings_format can't be resolved or when a custom postings_format type can't be instantiated.
Relates to #2893
2013-04-16 16:29:54 +02:00
Martijn van Groningen 9a1c03408b Added support for the `_cache` and `_cache_key` options to the `has_child` and `has_parent` filters.
Closes #2900
2013-04-16 14:42:45 +02:00
Florian Schilling ef5b7412e6 Allow PolygonBuilder to create polygons with holes
Closes #2899
2013-04-16 11:22:48 +02:00
Simon Willnauer 30f9f278c3 Added UNICODE_CHARACTER_CLASS support to Regex flags. This flag is only supported in Java7 and is ignored if set on a java 6 JVM
Closes #2895
2013-04-16 10:06:53 +02:00
uboness eb21526552 Added missing support for lat, lats, lon, lons for doc notation in scripts 2013-04-13 13:58:30 -07:00
uboness 20e6df9f34 Optimization in fielddata cache where ordinals are used instead of flat arrays when number of unique values is low 2013-04-13 12:42:53 -07:00
Igor Motov e7b49d8936 Add more dynamic settings validation 2013-04-12 20:55:45 -04:00
Shay Banon d385e1b356 Clear Cache API: Streamline option names
closes #2890
2013-04-12 15:58:24 -07:00
Shay Banon a2d72697eb Expose field level field data statistics
closes #2889
2013-04-12 15:51:08 -07:00
David Pilato 3b7a195f6f Add toString() for FilterBuilders
Closes #2887.
2013-04-12 22:27:51 +02:00
Martijn van Groningen bf21466291 CacheTests test fix. 2013-04-12 19:14:38 +02:00
Martijn van Groningen 80dbca0809 Field data: Try to load short values as byte values and load int values as short or byte values to reduce the size they take in memory. 2013-04-12 19:11:18 +02:00
Shay Banon 5fbd4a12a0 fix memory computation for int field data 2013-04-12 08:38:52 -07:00
Martijn van Groningen 5c90e5f940 If no options are specified with the clear cache api then all caches should be cleared.
Closes #2886
2013-04-12 15:24:50 +02:00
Igor Motov 00c035f88c Make sure that settings are propagated to all nodes 2013-04-11 10:59:14 -04:00
Martijn van Groningen 2dfcc3c740 Test that size is actually computed.
Relates to #2882
2013-04-11 10:22:48 +02:00
Simon Willnauer 9a2d27a035 rename prefix_length to prefix_len for consistency
Closes #2883
2013-04-10 17:39:32 +02:00
Martijn van Groningen 4fd8c2c6d2 Ordinals were omitted from fielddata cache size calculation if field has more than one term.
Closes #2882
2013-04-10 14:50:07 +02:00
Martijn van Groningen 637eeacb20 Better error description if field(s) (statistical facet) and value_field (term_stats facet) are not a numeric field 2013-04-10 11:11:52 +02:00
Martijn van Groningen 6a3c53ef44 Should prevent OOM 2013-04-10 10:00:51 +02:00
Martijn van Groningen b8b28041e5 Fix for extended facets test. 2013-04-10 00:47:00 +02:00
Igor Motov b0e44a2b40 Fix term counters in script field terms facet
Fixes #2878
2013-04-09 12:42:35 -04:00
Simon Willnauer ae74a8dbb7 Configure FieldData using a hash not a string
Closes #2876
2013-04-09 15:53:05 +02:00
Simon Willnauer 374bbbfa7b # FieldData Filter
FieldData is an in-memory representation of the term dictionary in an uninverted form. Under certain circumstances this FieldData representation can grow very large on high-cardinality fields like tokenized full-text. Depending on the use case, filtering the terms that are held in the FieldData representation can greatly improve execution performance and application stability.
FieldData Filters can be applied on a per-segment basis. During FieldData loading the terms enumeration is passed through a filter predicate that either accepts or rejects a term.

## Frequency Filter

The Frequency Filter acts as a high / low pass filter based on the document frequencies of a certain term within the segment that is loaded into field data. It allows rejecting terms with a very high or very low frequency, based on absolute frequencies or on percentages relative to the number of documents in the segment or, more precisely, the number of documents that have at least one value in the field that is loaded in the current segment.

Here is an example mapping:

```json
{
    "tweet" : {
        "properties" : {
            "locale" : {
                "type" : "string",
                "fielddata" : "format=paged_bytes;filter.frequency.min=0.001;filter.frequency.max=0.1",
                "index" : "analyzed"
            }
        }
    }
}
```
### Parameters

 * `filter.frequency.min` - the minimum document frequency (inclusive) in order to be loaded into memory. Either a percentage if < `1.0` or an absolute value. `0` if omitted.
 * `filter.frequency.max` - the maximum document frequency (inclusive) in order to be loaded into memory. Either a percentage if < `1.0` or an absolute value. `0` if omitted.
 * `filter.frequency.min_segment_size` - the minimum number of documents in a segment in order for the filter to be applied. Small segments might be omitted with this setting.
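
The parameters above can be read as the following predicate. This is only a sketch of the described semantics, not the actual implementation: the function and argument names are hypothetical, defaults for omitted bounds are not modeled, and `docs_with_field` stands for the number of documents in the segment that have at least one value in the field.

```python
def frequency_threshold(value, docs_with_field):
    """Values below 1.0 are read as a fraction of documents, otherwise as a count."""
    return value * docs_with_field if value < 1.0 else value


def accept_term(doc_freq, docs_with_field, segment_size,
                min_freq, max_freq, min_segment_size=0):
    """Return True if a term with the given document frequency would be loaded."""
    if segment_size < min_segment_size:
        return True  # the filter is not applied to small segments
    low = frequency_threshold(min_freq, docs_with_field)
    high = frequency_threshold(max_freq, docs_with_field)
    return low <= doc_freq <= high  # both bounds are inclusive


# With 10,000 documents, min=0.001 and max=0.1 keep terms that occur in
# roughly 10 to 1,000 documents.
for freq in (5, 50, 5000):
    print(freq, accept_term(freq, 10_000, 10_000, min_freq=0.001, max_freq=0.1))
```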

## Regular Expression Filter

The regular expression filter applies a regular expression to each term  during loading and only loads terms into memory that match the given regular expression.

Here is an example mapping:

```json
{
    "tweet" : {
        "properties" : {
            "locale" : {
                "type" : "string",
                "fielddata" : "format=paged_bytes;filter.regex=^en_.*",
                "index" : "analyzed"
            }
        }
    }
}
```
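
To make the effect of the `^en_.*` expression concrete, here is a minimal sketch that applies it to a handful of made-up locale terms. It uses Python's `re` module rather than the Java regex engine, and whether the real filter requires a full match is not stated in the commit message; for this particular pattern the result is the same either way.

```python
import re


def filter_terms(terms, pattern):
    """Keep only the terms that match the regular expression from the start."""
    regex = re.compile(pattern)
    return [term for term in terms if regex.match(term)]


# Hypothetical terms of the "locale" field in one segment.
terms = ["en_US", "en_GB", "de_DE", "fr_en_"]
print(filter_terms(terms, r"^en_.*"))
```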

Closes #2874
2013-04-09 11:34:48 +02:00
Igor Motov acc0950957 Get template should return warmers
Fixes #2868
2013-04-08 19:12:20 -04:00
Simon Willnauer a10c80e20f ensure that modifications to the enum order trigger test failures since we rely on the ordinal 2013-04-08 23:29:56 +02:00
Simon Willnauer 7e77ddb88f use enum to represent flags and fail if flags are not respected 2013-04-08 22:56:11 +02:00
Igor Motov 2a588dc1f1 Fix IndexMissingException in get template request
Fixes #2873
2013-04-08 16:25:09 -04:00
Shay Banon 3120457bfe move to 0.90.0.RC3 snap 2013-04-08 05:48:29 -07:00