4263 Commits

Author SHA1 Message Date
stephlag
b5c9d8c98b Add Javadoc 2014-06-04 17:18:25 +02:00
mikemccand
50e42265ef Indexing: clear versionMap on refresh (not flush) to reduce heap usage
The versionMap holds all versions (keyed by _uid) for recently indexed
documents.  Previously we only cleared it during flush, which can be
infrequent if the translog flush thresholds are high, and can cause
excessive heap usage especially for small documents.

Now we clear it during refresh which is usually more frequent
(e.g. once per second by default).

Closes #6379
2014-06-04 05:37:51 -04:00
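A minimal sketch of the idea in the commit above, not the actual Elasticsearch code: a map of recent versions that is emptied on every refresh instead of waiting for a flush. The class and method names (`VersionMap`, `record`, `lookup`, `on_refresh`) are hypothetical, and the sketch assumes that once a refresh has happened the index itself can answer version lookups.

```python
class VersionMap:
    """Hypothetical in-memory map of _uid -> version for recently indexed docs."""

    def __init__(self):
        self._versions = {}

    def record(self, uid, version):
        # Remember the latest version written for this _uid.
        self._versions[uid] = version

    def lookup(self, uid):
        # None means "not in the map"; a caller would then consult the index.
        return self._versions.get(uid)

    def on_refresh(self):
        # Assumption: after a refresh the index can serve version lookups,
        # so the in-memory entries can be dropped to free heap.
        self._versions.clear()

    def __len__(self):
        return len(self._versions)


vm = VersionMap()
vm.record("type#1", 1)
vm.record("type#2", 7)
size_before = len(vm)
vm.on_refresh()
size_after = len(vm)
```

Because refresh typically runs far more often than flush, the map stays small even when many small documents are indexed between flushes.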
Colin Goodheart-Smithe
f78480a0bc Aggregations: Fixed failures when geo points are all either positive or negative 2014-06-04 09:16:29 +01:00
Simon Willnauer
288eb3d803 [TEST] remove trace logging 2014-06-04 10:10:38 +02:00
Boaz Leskes
ef5d64c73b [Test] Extended IndexActionTests.testAutoGenerateIdNoDuplicates to check both with and without a specific type
The test also captures the first error but continues to run searches in order to gather more information before failing.
2014-06-03 21:55:10 +02:00
Simon Willnauer
963f627dca Add [1.2.1] Release 2014-06-03 17:25:57 +02:00
Colin Goodheart-Smithe
b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. The GeoBounds Aggregation also uses a wrap_longitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line.  This option defaults to true.

This aggregation introduces the idea of MetricsAggregations which do not return double values and cannot be used for sorting.  The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation.  MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
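A rough sketch of how a bounding box with an optional date-line wrap could be computed, assuming points are given as `(lat, lon)` pairs in degrees. The function name `geo_bounds` and the width comparison are illustrative assumptions, not the aggregation's actual implementation; only the `wrap_longitude` option and its default of true come from the commit message.

```python
def geo_bounds(points, wrap_longitude=True):
    """Return (top, left, bottom, right) for (lat, lon) points, or None if empty."""
    pts = list(points)
    if not pts:
        return None
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    top, bottom = max(lats), min(lats)
    left, right = min(lons), max(lons)

    # Split longitudes by hemisphere to consider a box crossing the date line.
    pos = [lon for lon in lons if lon >= 0]
    neg = [lon for lon in lons if lon < 0]
    if wrap_longitude and pos and neg:
        unwrapped_width = right - left
        # Width of the box running east from min(pos) across 180 to max(neg).
        wrapped_width = (180 - min(pos)) + (max(neg) + 180)
        if wrapped_width < unwrapped_width:
            left, right = min(pos), max(neg)
    return top, left, bottom, right


near_date_line = [(10.0, 170.0), (-5.0, -170.0)]
wrapped = geo_bounds(near_date_line)
unwrapped = geo_bounds(near_date_line, wrap_longitude=False)
```

With wrapping allowed, the two points near the date line yield a narrow box whose left edge (170) is greater than its right edge (-170); with wrapping disabled, the box spans almost the whole globe.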
Simon Willnauer
4b28bc396d Translog: Revert unlimited flush_threshold_ops for translog
This commit reverts the commit for issue #5900 introduced
in `1.2.0`. The unlimited translog size can cause memory pressure
on ES instances with low memory and high indexing load.

Closes #6377
2014-06-03 16:54:22 +02:00
Adrien Grand
7ab99de483 Routing: Restore shard routing.
Routing has been inadvertently changed in #5562, resulting in documents going to
different shards in 1.2. This is a terrible bug because an indexing request
would not necessarily go to the same shard anymore, potentially leading to
duplicates.

Close #6391
2014-06-03 16:37:54 +02:00
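To illustrate why a routing change is so damaging, here is a small sketch assuming the common "hash of the routing value modulo the number of shards" scheme. Both hash functions below (`djb2_hash` and a CRC32-based one) are stand-ins chosen for illustration; neither is claimed to be the function Elasticsearch used before or after #5562.

```python
import zlib


def djb2_hash(value):
    """Illustrative string hash (djb2 style), truncated to 32 bits."""
    h = 5381
    for byte in value.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def crc32_hash(value):
    """A second, unrelated hash standing in for an accidental change."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


def shard_for(routing, num_shards, hash_fn):
    # The shard is fully determined by the hash function and the shard count.
    return hash_fn(routing) % num_shards


keys = ["a", "b", "doc-1", "doc-2", "user-42"]
moved = [k for k in keys
         if shard_for(k, 5, djb2_hash) != shard_for(k, 5, crc32_hash)]
```

Any key in `moved` would be indexed to a different shard after the change, so re-indexing the same document could create a duplicate instead of overwriting the original.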
Kevin Wang
6a399d4c9a Remove support for field names in node_stats url
Field names ended up making the URLs too long; fields are still supported as query string parameters, though (same as indices stats).
2014-06-03 13:57:07 +02:00
stephlag
10cb136eb0 [DOCS] Fixed typo in IndexRequestBuilder Javadocs 2014-06-03 13:48:41 +02:00
Alex Ksikes
9797e343aa More Like This Query: values of a multi-value field are compared at the same level.
Previously, More Like This would create a new mlt query for each value of a
multi-value field. This could result in all the values of the field being
selected, which defeats the purpose of More Like This. Instead, the correct
behavior is to generate only one mlt query for all the values of the field.
This commit provides the correct behavior for More Like This DSL. The fix for
More Like This API will be coming in another commit.

Closes #6310
2014-06-03 13:43:51 +02:00
Adrien Grand
df67b17646 BigArrays: Disable breaking.
The BigArrays limit is currently shared by the translog, netty, http and some
queries/aggregations. If any of these consumers starts taking a lot of memory,
then other ones might fail to allocate memory, which could have bad
consequences, e.g. if ping requests can't be sent. The plan is to come up with
a better solution in 1.3.

Close #6332
2014-06-03 11:34:25 +02:00
javanna
90b1e6a461 [TEST] make sure that the -Dtests.rest.blacklist parameter works on windows too
Some reserved characters need to be replaced in the test section names, which get parsed as a path although they aren't filenames
2014-06-03 09:23:37 +02:00
Britta Weber
125e0c16cd Object and Type parsing: Fix include_in_all in type
include_in_all can also be set on type level (root object).
This fixes a regression introduced in #6093

closes #6304
2014-06-02 17:48:19 +02:00
Colin Goodheart-Smithe
a23e4aefaa Geo: Issue with polygons near date line
If a polygon is constructed which overlaps the date line but has a hole which lies entirely to one side of the date line, JTS raises an error saying that the hole is not within the bounds of the polygon. This happens because the code which splits the polygon either side of the date line does not add the hole to the correct component of the final set of polygons.  The fix ensures this selection happens correctly.

Closes #6179
2014-06-02 15:03:32 +01:00
Martijn van Groningen
f2641d29ae [TEST] Added sort duel between a single shard index and a multi shard index. 2014-06-02 14:16:55 +02:00
Martijn van Groningen
43b21719f5 [TEST] size should start from 1, top_hits aggregation doesn't support size <= 0 2014-06-02 13:21:13 +02:00
Simon Willnauer
3b31f25624 [TEST] Ensure cluster size reflected in the cluster state
We perform some management operations that require the cluster to be
consistent with respect to the number of nodes in the cluster state
/ visible to the master in order to rely on the ack mechanism. This
only applies to the test infrastructure when nodes are not explicitly
started / stopped as well as while tearing down the cluster and wiping
indices after the tests.
2014-06-02 11:57:32 +02:00
mikemccand
7552b69b1f Core: reuse Lucene's TermsEnum for faster _uid/version lookup during
Reusing Lucene's TermsEnum for _uid/version lookups gives a small
indexing (updates) speedup and brings us closer to not having
to spend RAM on bloom filters.

Closes #6212
2014-05-31 17:38:48 -04:00
Martijn van Groningen
f51a09d8f7 Core: Protects against: 'from + size > scoreDocs.length' in case only single shard response 2014-05-31 20:30:11 +02:00
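The guard described in the commit above can be sketched as a clamped slice, assuming the shard's hits are held in a plain list. The function name `page_of_hits` is hypothetical, and this is only an illustration of the bounds check, not the code in the commit.

```python
def page_of_hits(score_docs, from_, size):
    """Return the requested page without reading past the available hits."""
    total = len(score_docs)
    # Clamp the start so that from_ beyond the end yields an empty page.
    start = min(max(from_, 0), total)
    # Clamp the end so that from_ + size > total returns only what exists.
    end = min(start + max(size, 0), total)
    return score_docs[start:end]


hits = ["d0", "d1", "d2", "d3", "d4"]
last_partial_page = page_of_hits(hits, from_=3, size=10)
past_the_end = page_of_hits(hits, from_=8, size=10)
```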
javanna
e8995ecaa7 [TEST] speed up HighlightSearchTests a bit
Randomize rewrite methods instead of trying them all when highlighting multi term queries with postings highlighter
Rely on search type randomization and remove all the explicit setSearchType calls as they are not needed anymore
Remove explicit `.from`, `.size` and `.explain`, not needed and might slow tests down (especially explain)
2014-05-31 16:29:53 +02:00
Martijn van Groningen
01ca8491cf Core: apply 'from' if there is one shard result. 2014-05-31 13:35:11 +02:00
Martijn van Groningen
b8366a3213 Aggregations: apply 'from' if there is one shard result. 2014-05-31 13:34:49 +02:00
Clinton Gormley
46a67b638d Parent/Child: Added min_children/max_children to has_child query/filter
Added support for min_children and max_children parameters to
the has_child query and filter. A parent document will only
be considered a match if the number of matching children
falls between the min/max bounds.

Closes #6019
2014-05-30 19:38:39 +02:00
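A small sketch of the counting rule behind min_children/max_children, assuming matching children arrive as `(parent_id, child_id)` pairs. The function name, the inclusive bounds, and the defaults used here (at least one child, no upper limit) are assumptions for illustration; the commit message does not spell them out.

```python
from collections import Counter


def parents_in_bounds(matching_children, min_children=1, max_children=None):
    """Return parent ids whose number of matching children is within bounds."""
    counts = Counter(parent_id for parent_id, _child_id in matching_children)
    return sorted(
        parent_id
        for parent_id, n in counts.items()
        # Assumption: both bounds are inclusive; None means "no upper bound".
        if n >= min_children and (max_children is None or n <= max_children)
    )


children = [("p1", "c1"), ("p1", "c2"), ("p1", "c3"), ("p2", "c4"), ("p3", "c5"), ("p3", "c6")]
two_or_three = parents_in_bounds(children, min_children=2, max_children=3)
at_most_one = parents_in_bounds(children, max_children=1)
```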
mikemccand
48ccb06160 remove stale nocommit 2014-05-30 13:22:48 -04:00
Martijn van Groningen
760cee7c24 Aggregations: Take the 'from' into account when getting a fetched hit (InternalSearchHit). Hits before the 'from' are included in each shard result. 2014-05-30 16:23:28 +02:00
Shay Banon
9c98bb3554 Have a dedicated join timeout that is higher than ping.timeout for node join
Using ping.timeout, which defaults to 3s, as the timeout for the join request a node makes to the master once it's discovered can be too small, specifically when there is a large cluster state involved (and by definition, all the buffers and such on the nio layer will be "cold"). Introduce a dedicated join.timeout setting that by default is 10x the ping.timeout (so 30s by default).
closes #6342
2014-05-30 12:42:08 +02:00
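The default described above can be sketched as a derived setting: use an explicit join timeout if one is configured, otherwise ten times the ping timeout. The function name and the use of a plain dict with short keys are assumptions for illustration; the full setting names in Elasticsearch may carry a prefix.

```python
DEFAULT_PING_TIMEOUT_SECONDS = 3.0


def resolve_join_timeout(settings):
    """Return the join timeout in seconds for a dict of settings."""
    ping = settings.get("ping.timeout", DEFAULT_PING_TIMEOUT_SECONDS)
    # An explicit join.timeout wins; otherwise derive it as 10x ping.timeout.
    return settings.get("join.timeout", 10 * ping)


default_join = resolve_join_timeout({})
derived_join = resolve_join_timeout({"ping.timeout": 5.0})
explicit_join = resolve_join_timeout({"ping.timeout": 5.0, "join.timeout": 20.0})
```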
Martijn van Groningen
0e2d33b4a4 [BUILD] Fix compile error 2014-05-30 12:24:11 +02:00
Martijn van Groningen
aab38fb2e6 Aggregations: added pagination support to top_hits aggregation by adding from option.
Closes #6299
2014-05-30 11:45:31 +02:00
Martijn van Groningen
35755cd8a4 Aggregations: Fixed bug in top_hits aggregation to not fail with NPE when shard results are empty.
The top_hits aggregation returned an empty InternalTopHits instance with no fields set when there were no results, causing reduce and serialization errors down the road. This is fixed by setting all required fields when there are no results.

Closes #6346
2014-05-30 11:40:45 +02:00
Igor Motov
8c903f4787 [TESTS] Add get snapshot status test for partial snapshots 2014-05-29 19:07:04 -04:00
Boaz Leskes
93e0ce0c5b [Test] added search trace logging to IndexActionTests.testAutoGenerateIdNoDuplicates 2014-05-28 22:12:23 +02:00
Boaz Leskes
dc34ccebfe [Tests] assert indexRandom's deletion of injection dummy docs find them 2014-05-28 22:06:38 +02:00
Adrien Grand
4ff511000e [TESTS] There might be several live BigArrays instances at the same time. 2014-05-28 16:55:26 +02:00
Adrien Grand
cc9a7bd454 Recycling: change the default type of the page recycler to CONCURRENT instead of SOFT_CONCURRENT.
This default type has been inherited from its ancestor, the (non-paged) recycler whose memory
usage was unbounded and required soft references to make sure it could release memory eventually.
On the contrary, the page cache recycler memory usage is bounded so we could remove soft
references in order to remove load on the garbage collector.

Note: the cache type is already randomized in integration tests.

Close #6320
2014-05-28 15:23:18 +02:00
Simon Willnauer
a5866e226e Mustache: Ensure internal scope extractors are always operating on a Map
Mustache extracts the key/value pairs for parameter substitution from
objects and maps but it's decided on the first execution. We need to
make sure if the params are null we pass an empty map to ensure we
bind the map based extractor

Closes #6318
2014-05-28 13:29:21 +02:00
Mathias Fussenegger
82e9a4e80a Serialization: Add support for Byte to the XContentBuilder.
Close #6127
2014-05-28 12:19:44 +02:00
Adrien Grand
be29138962 [BUILD] Remember to use AtomicReader.addCoreClosedListener when upgrading to Lucene 4.9. 2014-05-28 09:35:00 +02:00
mateusz_kaczynski
e97a381db2 Highlighting: Plain highlighter to use analyzer defined on a document level when available.
At the moment the plain highlighter only uses an analyzer defined on the type
level. However, during the indexing stage it is possible to define an analyzer on
a per-document level, for example by mapping '_analyzer' to another field containing
the required name. This commit attempts to make sure that highlighting works
correctly in this scenario.

Closes #5497
2014-05-28 08:27:14 +02:00
Shay Banon
13f49237df [Test] make sure to close the file at the end of the test 2014-05-27 11:08:29 +02:00
Shay Banon
cd94af2c9e [Test] make sure we test writeTo(Channel) in BytesReference
also introduce proper randomization of content in the bytes
2014-05-26 13:32:52 +02:00
Alex Brasetvik
15ff3df243 Fix MatchQueryParser not parsing fuzzy_transpositions 2014-05-23 22:02:21 +02:00
Martijn van Groningen
3f2f1f088d Set the sortValues on SearchHit post aggregation instead of during the reduce. 2014-05-23 19:05:30 +02:00
Lee Hinman
65ce5acfb4 Explicitly clean up fielddata cache when clearing entire cache 2014-05-23 16:29:26 +02:00
Robert Muir
2cbe9371d2 Improve error when mlockall fails (closes #6288) 2014-05-23 10:16:26 -04:00
Martijn van Groningen
5fafd2451a Added top_hits aggregation that keeps track of the most relevant document being aggregated per bucket.
Closes #6124
2014-05-23 16:01:18 +02:00
Adrien Grand
2d417cf5b6 [TESTS] Left-over from 14420d7c4e15df9b565b50ef5beab797f756c3ac. 2014-05-23 10:10:00 +02:00
Adrien Grand
14420d7c4e [TESTS] Fix test to use index-level doc IDs instead of segment-level doc IDs. 2014-05-23 01:20:41 +02:00
Adrien Grand
0d3410a837 [TESTS] Fix test bug in SimpleValidateQueryTests. 2014-05-23 00:52:56 +02:00