Commit Graph

5064 Commits

Author SHA1 Message Date
Martijn van Groningen 18141b8da0 Fix SimplePercolatorTests#testPercolateStatistics 2013-07-23 00:10:27 +02:00
Shay Banon 4930b93c26 move master actions to accept a listener, improve cluster service state execution
- master actions many times end up being executed on the cluster service, so there is no need to block them on the management thread pool to wait for a response, this removes the load on the management thread pool, and also simplifies the code implementing it
- cluster service state update exception handling was improved to include a callback when a failure happens during state update execution, this makes sure that we catch all relevant exceptions and invoke the callback, as well as simplify the code of cluster state update tasks, as they can throw failures from within the execute method and then handle them properly
2013-07-23 00:08:45 +02:00
Adrien Grand e943cc81a5 Add FastVectorHighlighter support for more complex queries.
FastVectorHighlighter fails at highlighting some complex queries such as
multi phrase queries which have two terms at the same position. This can be
easily triggered by running a `match_phrase` query with an analyzer which
outputs synonyms such as SynonymFilter or WordDelimiterFilter.

Close #3357
2013-07-22 19:43:22 +02:00
Shay Banon 6b21414520 should be 500... 2013-07-22 17:53:44 +02:00
Shay Banon c766f6bd97 Cluster State Update APIs (master node) to respect master_timeout better
also respect the timeout when trying to obtain the md lock
relates #3365
2013-07-22 17:53:18 +02:00
Shay Banon 235a68c3bd Cluster State Update APIs (master node) to respect master_timeout better
Currently, the master node might be processing too many cluster state events, and then be blocked on waiting for its respective event to be processed. We can use the new cluster state update timeout support to use the master_timeout value and respect it.

closes #3365
2013-07-22 16:58:00 +02:00
Luca Cavanna 0b33394476 Added alias action validation
Moved alias action validation to IndicesAliasesRequest, so that Java api and RestIndexPutAliasAction can benefit from it too.
Added check in MetaDataIndexAliasesService too.

Closes #3363
2013-07-22 15:27:20 +02:00
Martijn van Groningen 2f7d1189b1 Also serialize the routing and preference 2013-07-22 15:16:29 +02:00
Martijn van Groningen d310b94904 Fix ConcurrentPercolatorTests: replace the CountDownLatches array with a Semaphore.
The writes to the CountDownLatches array weren't visible to all threads when the countdown latch array slots were re-initialized.
2013-07-22 11:32:55 +02:00
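The fix described in this commit can be illustrated with a minimal sketch. It assumes a test that repeatedly waits for a batch of worker threads; the class and method names are hypothetical and are not taken from ConcurrentPercolatorTests. A single Semaphore is reused across rounds, so no shared array slot has to be re-initialized and republished to other threads.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the code from ConcurrentPercolatorTests.
final class SemaphoreBatchSketch {

    /** Runs `workers` threads per round and returns the total number of completed tasks. */
    static int run(int workers, int rounds) {
        final Semaphore done = new Semaphore(0);
        final AtomicInteger completed = new AtomicInteger();
        for (int round = 0; round < rounds; round++) {
            for (int i = 0; i < workers; i++) {
                new Thread(() -> {
                    completed.incrementAndGet();
                    done.release(); // signal that one task has finished
                }).start();
            }
            // Block until every worker of this round has released a permit.
            // release() happens-before the matching acquire, so the counter
            // updates are visible here without any extra synchronization.
            done.acquireUninterruptibly(workers);
        }
        return completed.get();
    }
}
```

Calling `SemaphoreBatchSketch.run(4, 3)` waits for three rounds of four workers each and returns the number of tasks that ran.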
Shay Banon 4da7086df8 catch failures and notify the listener in aliases action handling 2013-07-22 11:19:23 +02:00
Shay Banon 7a9350c9a1 Transport: Add a dedicated ping channel
Today, we have low/med/high channel groups in our transport layer. High is used to publish cluster state and send ping requests. Sometimes, the overhead of publishing large cluster states can interfere with ping requests.

Introduce a new, dedicated ping channel (with size 1) to have a channel that only handles ping requests.
closes #3362
2013-07-22 10:29:57 +02:00
Shay Banon e2961c0c7a add ability to associate a timeout with a priority executor
enhancement that can be used later to timeout tasks that are pending, also added the ability to get the pending task list from the executor
2013-07-22 09:16:56 +02:00
Martijn van Groningen e8ff7de6b8 Test to some extent the time spent on percolating. 2013-07-22 09:02:27 +02:00
Shay Banon bd52d61d5d reroute post shard started should have HIGH prio as well 2013-07-21 15:21:22 +02:00
Shay Banon f2614b22de Zen Discovery Cluster Events to have Priority.URGENT
Master node cluster state events resulting in zen discovery (node gets added, removed, for example) should be processed with priority URGENT as it's always better to process them as fast as possible, and not let other events get in the way.
closes #3361
2013-07-21 14:57:26 +02:00
Shay Banon 6a25395c97 fix percolate stats tests failures 2013-07-21 14:25:26 +02:00
Adrien Grand 09362f47e9 Be consistent with the default value of `acceptable_overhead_ratio`. 2013-07-19 19:37:36 +02:00
Martijn van Groningen 32a96aea71 Added percolate statistics to indices and node stats.
Relates to #3173
2013-07-19 19:14:49 +02:00
Shay Banon eb75a815db pattern replace with empty "" setting fails
we should default the replacement to "", since in the settings, if it's set to "", we remove the setting
2013-07-19 19:13:15 +02:00
skymeyer 08e35e4dbe Use systemd configuration file for applying limits correctly
In case systemd is used, ulimit is not called (as it would be in the initscript)
and has to be configured in the systemd configuration.

For more information about the parameters LimitNOFILE and LimitMEMLOCK
see http://www.freedesktop.org/software/systemd/man/systemd.exec.html
2013-07-19 18:10:00 +02:00
Martijn van Groningen c346ed3d2d Fix ConcurrentPercolatorTests that failed occasionally when running with other tests, by not using the shared test cluster. 2013-07-19 16:54:14 +02:00
Shay Banon b12acbcf9e introduce read/writeSharedString while streaming
currently, we treat all strings as shared (either by full equality or identity equality), while almost all times we know if they should be serialized as shared or not. Add an explicit write/readSharedString, and use it where applicable, and all other write/readString will not treat them as shared
relates to #3322
2013-07-19 16:17:22 +02:00
Shay Banon 74a7c46b0e top level filter not resulting in an actual filter is ignored
when parsing a filter, we use null to indicate that this filter should not match anything, the top level filter doesn't take it into account
fixes #3356
2013-07-19 13:12:23 +02:00
Adrien Grand fe4c2a9d02 Work around the fact that AssertionError(String message, Throwable cause) is a Java 1.7-only API. 2013-07-19 09:47:55 +02:00
Adrien Grand 12d9268db2 Make field data able to support more than 2B ordinals per segment.
Although segments are limited to 2B documents, there is no limit on the number
of unique values that a segment may store. This commit replaces 'int' with
'long' every time a number is used to represent an ordinal and modifies the
data-structures used to store ordinals so that they can actually support more
than 2B ordinals per segment.

This commit also improves memory usage of the multi-ordinals data-structures
and the transient memory usage which is required to build them (OrdinalsBuilder)
by using Lucene's PackedInts data-structures. In the end, loading the ordinals
mapping from disk may be a little slower, field-data-based features such as
faceting may be slightly slower or faster depending on whether being nicer to
the CPU caches balances the overhead of the additional abstraction or not, and
memory usage should be better in all cases, especially when the size of the
ordinals mapping is not negligible compared to the size of the values (numeric
data for example).

Close #3189
2013-07-19 09:10:08 +02:00
Martijn van Groningen 4d05c9cfd5 Optimize `has_child` query & filter execution with two short circuit mechanisms:
* If all parent ids have been emitted as hit, abort the query / filter execution.
* If a relatively small number of parent ids have been collected in the first phase then limit the number of second phase parent id lookups by putting a short circuit filter before parent document evaluation or omit it in the case of the filter. This is controllable via the `short_circuit_cutoff` option which is exposed in the `has_child` query & filter.

All parent / child queries and filters (except `top_children` query) abort execution if no parent ids have been collected in the first phase.

Closes #3190
2013-07-18 17:41:23 +02:00
Martijn van Groningen c222ce28fc Redesigned the percolator engine to execute in a distributed manner.
With this design the percolate queries will be stored in a special `_percolator` type with its own mapping in the same index where the actual data is or in a different index (dedicated percolation index, which might require different sharding behavior compared to the index that holds actual data and is being searched on). This approach allows percolate requests to scale to the number of primary shards an index has been configured with and effectively distributes the percolate execution.

This commit doesn't add new percolate features other than scaling. The response remains similar, with exception that a header similar to the search api has been added to the percolate response.

Closes #3173
2013-07-18 16:52:42 +02:00
Adrien Grand f38103a232 Eclipse: organize imports on save.
It can happen that Eclipse fails at correctly adding a new import entry to an
existing list of imports since we don't use its default rules. This commit
forces Eclipse to organize imports on save.
2013-07-18 14:49:35 +02:00
Florian Schilling 1e5e8d83b1 Changed GeoPoint parsing in several parsers using Geopoint.parse() Closes #3351 2013-07-18 12:49:12 +02:00
Robin Hughes 45a756c203 Analysis: update ThaiAnalyzerProvider to use custom stopwords setting 2013-07-18 11:18:37 +02:00
Luca Cavanna c28452ee67 added missing link to the elasticsearch.org website 2013-07-17 19:07:22 +02:00
Adrien Grand ffcc710e4e Add the ability to ignore or fail on numeric fields when executing more-like-this or fuzzy-like-this queries.
More-like-this and fuzzy-like-this queries expect analyzers which are able to
generate character terms (CharTermAttribute), so unfortunately this doesn't
work with analyzers which generate binary-only terms (BinaryTermAttribute,
the default CharTermAttribute impl being a special BinaryTermAttribute) such as
our analyzers for numeric fields (byte, short, integer, long, float, double but
also date and ip).

To work around this issue, this commit adds a fail_on_unsupported_field
parameter to the more-like-this and fuzzy-like-this parsers. When this parameter
is false, numeric fields will just be ignored and when it is true, an error will
be returned, saying that these queries don't support numeric fields. By default,
this setting is true but the mlt API sets it to false in order not to fail on
documents which contain numeric fields.

Close #3252
2013-07-16 18:37:34 +02:00
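The two modes of fail_on_unsupported_field described in this commit can be sketched as follows. This is only an illustration of the stated semantics under simplifying assumptions: the class, the method, and the map from field name to "is numeric" are hypothetical and are not the parser code from the commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the parameter's semantics; not the actual parser code.
final class UnsupportedFieldSketch {

    /**
     * Returns the fields a like-this query could use.
     * Numeric fields are skipped when failOnUnsupportedField is false
     * and cause an error when it is true.
     */
    static List<String> selectFields(Map<String, Boolean> fieldIsNumeric,
                                     boolean failOnUnsupportedField) {
        List<String> usable = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : fieldIsNumeric.entrySet()) {
            if (entry.getValue()) {
                if (failOnUnsupportedField) {
                    throw new IllegalArgumentException(
                            "numeric field [" + entry.getKey() + "] is not supported");
                }
                continue; // lenient mode: ignore the numeric field
            }
            usable.add(entry.getKey());
        }
        return usable;
    }
}
```

With a `title` text field and a numeric `price` field, lenient mode returns only `title`, while strict mode throws.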
Clinton Gormley 1bc8f82d0a Merge pull request #3341 from clintongormley/pattern_capture
Added the "pattern_capture" token filter from Lucene 4.4
2013-07-16 09:20:13 -07:00
Clinton Gormley 16e137ebbc Added the "pattern_capture" token filter from Lucene 4.4
The XPatternCaptureGroupTokenFilter.java file can be removed once we
upgrade to Lucene 4.4.

This change required the addition of the commaDelimited flag to getAsArray()
to disable parsing strings as comma-delimited values.

Closes #3340
2013-07-16 18:08:12 +02:00
Luca Cavanna 933fd50466 Added support for multiple indices in open/close index apis
Open/Close index api now supports multiple indices the same way as the delete index api works. The only exception is when dealing with all indices: it's required to explicitly use _all or a pattern that identifies all the indices, not just an empty array of indices. Supports the ignore_missing param too.
Added also a new flag action.disable_close_all_indices (default false) to disable closing all indices

Closes #3217
2013-07-16 15:10:13 +02:00
Florian Schilling 6e9ad03b27 Fixed nullshape indexing.
Closes #3310
2013-07-16 10:49:05 +02:00
Alexander Reelsen 3087fd8b2a Removed useless TODO 2013-07-16 10:29:48 +02:00
Shay Banon 21677964a5 rename variable and add comment about TopDocs#merge 2013-07-16 10:28:10 +02:00
Brett Dargan 94fd152eb1 Added statistical facet to term facet in SimpleNestedTests
The test now uses a statistical facet plus a filter facet on nested documents.
2013-07-16 09:59:02 +02:00
Boaz Leskes 88eb3552d8 AtomicArray.toArray will now throw an exception if target array is of the wrong size. 2013-07-16 09:22:12 +02:00
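A minimal sketch of the kind of size check this commit describes, written against the JDK's AtomicReferenceArray rather than Elasticsearch's AtomicArray. The helper class, its signature, and the choice of IllegalArgumentException are assumptions made for illustration; the commit message does not say which exception type is thrown.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical helper; not the AtomicArray implementation from the commit.
final class ToArraySketch {

    /** Copies every slot of source into target, rejecting a target of the wrong size. */
    static <E> E[] toArray(AtomicReferenceArray<E> source, E[] target) {
        if (target.length != source.length()) {
            // Assumed exception type; the commit only says "an exception".
            throw new IllegalArgumentException(
                    "target array has size " + target.length
                            + " but " + source.length() + " was expected");
        }
        for (int i = 0; i < source.length(); i++) {
            target[i] = source.get(i);
        }
        return target;
    }
}
```

Failing fast here avoids silently returning a truncated or null-padded array when the caller sizes the target incorrectly.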
Andrew Raines 092fd6fc7a Add info to _cat/nodes, add _cat/indices. 2013-07-15 16:03:21 -05:00
Boaz Leskes c3038889f9 Using AtomicArray to collect responses in mget and bulk indexing (instead of synchronised) 2013-07-15 22:44:57 +02:00
Alexander Reelsen 28b9e25053 Fix xcontent serialization of timestamp/routing index field
The index field was serialized as a boolean instead of showing the
'analyzed', 'not_analyzed', 'no' options. Fixed by calling
indexTokenizeOptionToString() in the builder.

Closes #3174
2013-07-15 18:02:39 +02:00
Luca Cavanna baea7fd1c2 fixed existing test and linked it to its issue 2013-07-15 17:31:49 +02:00
Alexander Reelsen c59b0b22e2 Debian/Redhat package improvements
This decision helps people who want to roll out Oracle Java without having an OpenJDK Java installed.

* Removed any hard dependency on Java in the debian package
* The debian init script does not check for an existing JAVA_HOME anymore
* Debian and RedHat initscripts now exit if they do not find a java binary (instead of starting elasticsearch in the background and swallowing the error as there is no way to log it in that case)
* Changed the debian init script to rely on the pid file instead of the argument name of process
* Added a useful error message in case no java binary is available (in elasticsearch shell script)

Closes #3304
Closes #3311
2013-07-15 16:03:24 +02:00
Simon Willnauer 37edfe060b Set spare before comparing comparator bottom value
The actual document's value was never calculated if setSpare wasn't called
before compareBottom was called on a certain document.

Closes #3309
2013-07-15 15:40:58 +02:00
Benjamin Devèze b116097ea5 Add found field for bulk deletes. Closes #3320 2013-07-15 15:08:43 +02:00
Martijn van Groningen 470b685fa9 Renamed IndicesGetAliases* classes to begin with GetAliases* 2013-07-15 14:55:46 +02:00
Shay Banon f28ff2becc move to use ScoreDoc/FieldDoc instead of our wrappers
now that we have the concept of a shardIndex as part of our search execution, we can simply move to use ScoreDoc and FieldDoc instead of having our own wrappers that held the info
Also, rename shardRequestId where needed to be called shardIndex to conform with the variable name in Lucene
2013-07-15 14:54:28 +02:00
Britta Weber 7098073a66 fix term vector api retrieved wrong doc
The previous loading of term vectors from the top level reader did not use the
correct docId. The docId in Versions.DocIdAndVersion is relative to the segment
reader and not to the top level reader.
Consequently the term vectors for the wrong document were returned if the
document was not on the first segment of the shard.
2013-07-15 14:50:48 +02:00