Commit Graph

5015 Commits

Author SHA1 Message Date
Britta Weber 7098073a66 fix term vector api retrieved wrong doc
The previous loading of term vectors from the top level reader did not use the
correct docId. The docId in Versions.DocIdAndVersion is relative to the segment
reader stored in Versions.DocIdAndVersion, not to the top level reader.
Consequently, the term vectors of the wrong document were returned if the
document was not in the first segment of the shard.
2013-07-15 14:50:48 +02:00
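
The difference between a top level docId and a segment-relative docId can be sketched as follows. This is a minimal illustration that does not use Lucene: the class `SegmentDocIds`, the `docBases` array, and all method names are hypothetical. It assumes each non-empty segment owns a contiguous range of top level docIds starting at its doc base.

```java
import java.util.Arrays;

// Hypothetical illustration, not Elasticsearch or Lucene code.
// Assumption: each segment owns a contiguous range of top level docIds
// that starts at its "doc base", and no segment is empty.
final class SegmentDocIds {
    private final int[] docBases; // sorted ascending, docBases[0] == 0

    SegmentDocIds(int[] docBases) {
        this.docBases = docBases.clone();
    }

    // Index of the segment that contains the given top level docId.
    int segmentOf(int topLevelDocId) {
        int idx = Arrays.binarySearch(docBases, topLevelDocId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    // Converts a top level docId to a docId relative to its segment.
    int toSegmentDocId(int topLevelDocId) {
        return topLevelDocId - docBases[segmentOf(topLevelDocId)];
    }

    // Converts a segment-relative docId back to a top level docId.
    int toTopLevelDocId(int segment, int segmentDocId) {
        return docBases[segment] + segmentDocId;
    }
}
```

In the first segment both numbering schemes coincide, which is consistent with the bug only showing up for documents in later segments.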
Shay Banon 3004a2a696 move the fields doc queue to a better package location 2013-07-15 14:33:44 +02:00
Martijn van Groningen 127c62924b Rename IndicesAdminClient#existsAliases to IndicesAdminClient#aliasesExist.
Closes #3330
2013-07-15 14:28:09 +02:00
Adrien Grand 1310f02e6c Rename DocIdAndVersion.reader to DocIdAndVersion.context to avoid confusion. 2013-07-15 14:21:35 +02:00
Boaz Leskes 9e8c42f0c6 multiget requests which referred to missing indexes blocked and never returned. 2013-07-15 09:58:10 +02:00
Shay Banon 8e0d23b147 search reducer to use atomic reference arrays
move away from maps to correlate between responses from different shards, and instead use a unique incremental integer representing a shardRequestId (unique for the specific search request)

this removes the need for maps (or CHM); we can simply use atomic reference arrays, which rely on volatiles. it also removes the need to use a cache for heavy data structures since we don't really have them around anymore...
2013-07-14 00:51:54 +02:00
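
The idea of indexing shard responses by an incremental integer instead of a map key might look roughly like this. The class `ShardResults` and its methods are hypothetical and only illustrate the technique with `java.util.concurrent.atomic.AtomicReferenceArray`; they are not the actual search reducer code.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch, not the actual Elasticsearch classes: one slot per
// shard, addressed by an incremental per-request integer instead of a map key.
final class ShardResults<T> {
    private final AtomicReferenceArray<T> results;

    ShardResults(int shardCount) {
        this.results = new AtomicReferenceArray<>(shardCount);
    }

    // Called by whichever thread receives the response for this shard.
    // AtomicReferenceArray.set has volatile write semantics.
    void set(int shardRequestId, T result) {
        results.set(shardRequestId, result);
    }

    // Volatile read of the slot; null means no response has arrived yet.
    T get(int shardRequestId) {
        return results.get(shardRequestId);
    }

    // Number of shards that have responded so far.
    int received() {
        int count = 0;
        for (int i = 0; i < results.length(); i++) {
            if (results.get(i) != null) {
                count++;
            }
        }
        return count;
    }
}
```

Because every shard writes only to its own slot, no locking or hashing is needed to correlate responses.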
Shay Banon 2762fed04f remove unused class 2013-07-13 17:23:14 +02:00
Shay Banon 9f6117612c cache recycler now node/client level component 2013-07-13 01:00:45 +02:00
Shay Banon 17936fabb0 latest jsr166 upgrade only compiled with 1.7 2013-07-11 22:48:49 +02:00
Shay Banon fe6fb7135b update to latest jsr166 2013-07-10 13:09:04 -07:00
Boaz Leskes abf2268574 Added an error message for when child mapping is not properly configured (incorrect type) 2013-07-09 14:06:48 +02:00
Adrien Grand c37de66fb6 Don't reset TokenStreams twice when highlighting.
When using PlainHighlighter, TokenStreams are reset both before highlighting
and at the beginning of highlighting, causing issues with analyzers that read
in reset() such as PatternAnalyzer. This commit removes the call to reset which
was performed before passing the TokenStream to the highlighter.

Close #3200
2013-07-08 14:31:06 +02:00
Shay Banon 759a13f1de optimize reroute
- optimize initialization when building the state of all the assigned shards
- optimize iteration in throttling allocation decider
2013-07-06 14:12:52 -07:00
Shay Banon cc1173b58f automatically set translog buffer size based on number of shards
similar to how we set the indexing buffer size, automatically set the translog buffer size based on the number of shards allocated on a node
2013-07-06 10:56:36 -07:00
Shay Banon 4574489c27 make utf8 bytes response not reuse thread local buffer
no need, optimized conversion to bytes anyhow, and when sending, it will just get wrapped by a buffer
2013-07-06 10:42:34 -07:00
Shay Banon f4d1895399 guice optimization
only under debug logging use the source provider to find the line number through stack trace elements, otherwise, it's very expensive
2013-07-05 22:47:27 -07:00
Shay Banon b9a2fbd874 properly reuse indices analyzers
don't wrap in AnalysisService the indices analyzers we have with a NamedAnalyzer, since it effectively creates a new instance of an analyzer (with per field reuse strategy) and we don't benefit as much from reusing analyzers on the indices / node level

Now, the indices level analyzers return a NamedAnalyzer, also NamedAnalyzer will use the non per field reuse strategy since that's really the common case for it (no need for per field reuse there).

Also, try and reuse numeric analyzers globally instead of creating them per numeric mapper. Although those analyzers are not used during indexing (we have a custom numeric field for it), they can be used sometimes when searching in a query string for example without specific query implementation in the mappers
2013-07-05 19:07:08 -07:00
Shay Banon 8d9c84f84e optimize guice injector once created
in guice, we always use eager loaded singletons for all modules we create, thus, we can actually optimize the memory used by injectors by reducing the construction information they store per binding, resulting in an extensive reduction in memory usage for the many indices/shards case on a node

also, because all are eager singletons (and effectively read only), we can skip trying to create just in time bindings in the parent injector before trying to create them in the current injector, which improves object creation time and the time it takes to create an index or a shard on a node
2013-07-05 17:30:09 -07:00
Shay Banon 09a6907cca optimize applyDeletes event
- reuse set
- don't copy over again the shard ids immutable set
2013-07-05 17:30:09 -07:00
Boaz Leskes 5b078ebfed fixed casting that caused compilation errors with JDK7 2013-07-05 16:26:50 +02:00
Boaz Leskes 491d2b721c added support for a prefix wild card (*.field) in includes 2013-07-05 12:51:07 +02:00
Andrew Raines 1645e4230f Add /_cat/nodes. 2013-07-04 18:58:59 -05:00
Andrew Raines bcacbff096 Don't consider header width if not printed. 2013-07-04 18:58:57 -05:00
Andrew Raines 2bb681466c Use standard CSS separator. 2013-07-04 18:58:53 -05:00
Andrew Raines dd6267aaf2 Add /_cat/master.
% curl localhost:9200/_cat/master
id                     ip             node
Zdumn8bkTuOGRLfr9JL8jQ 192.168.20.109 Petros, Dominic
2013-07-04 18:26:30 -05:00
Shay Banon 9d0ce1b1d3 when thread pool queue size is negative use unbounded queue 2013-07-04 22:59:36 +02:00
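
A minimal sketch of the rule in the commit above, using only `java.util.concurrent`. `QueueFactory` and `newQueue` are hypothetical names, and the handling of a size of zero is an assumption added here because `ArrayBlockingQueue` rejects a capacity of zero; the commit message only covers the negative case.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical helper, not the actual thread pool code.
final class QueueFactory {
    private QueueFactory() {}

    static BlockingQueue<Runnable> newQueue(int queueSize) {
        if (queueSize < 0) {
            // Negative size: unbounded queue, as the commit describes.
            return new LinkedBlockingQueue<>();
        }
        if (queueSize == 0) {
            // Assumption of this sketch: zero means direct hand-off.
            return new SynchronousQueue<>();
        }
        // Positive size: bounded queue with exactly that capacity.
        return new ArrayBlockingQueue<>(queueSize);
    }
}
```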
Martijn van Groningen 5f9581d4f0 Simplified the rewrite logic for the parent/child queries. 2013-07-04 22:27:44 +02:00
Martijn van Groningen 08c3359e6f Fixed embedded percolator benchmark 2013-07-04 19:55:00 +02:00
Alexander Reelsen 3162f5b725 Updated maven shade plugin to version 2.1
The currently used maven shade plugin still keeps references to the
original classes in their constant pools. This is never a problem
at runtime, but dependency tools which try to use the constant pool
for determining dependencies will get confused (OSGI for example). This
patch simply bumps the version and will implicitly fix
http://jira.codehaus.org/browse/MSHADE-105

Closes #3254
Closes #3255
2013-07-04 15:57:13 +02:00
Martijn van Groningen f9efa02a85 Fix test, because of term filter change. 2013-07-04 15:51:26 +02:00
Shay Banon 0c5a87608d only ask for the relevant stats 2013-07-04 14:40:38 +02:00
Martijn van Groningen 953dda2aee Changed the TermFilter to return a native DocIdSetIterator instead of the FixedBitSet's implementation.
This has two advantages in the case term filter is *not* cached:
 * We iterate only once over the matching docs. Before this fix we iterated once to create the FBS and another time to consume the matching docs from the FBS.
 * The DocIdSetIterator#cost method of a DocIdSetIterator from the DocsEnum is accurate, because it is based on the document frequency, whereas the cost method of the FBS' iterator impl is based on the total number of bits (which is based on maxDoc). This will make this filter execute faster when it is included in a filtered query, because the filtered query can base its decision on what strategy to pick on an accurate heuristic.

This change doesn't have any negative implications in the case a filter is cached (which is the default). The FBS is now created lazily in the DocIdSets#toCacheable method, which is always invoked when the term filter needs to be cached.
2013-07-04 13:02:45 +02:00
Boaz Leskes 018ca58cdb Filtering maps had a false hit if a field was a prefix (but not a match) of an include. Also, exact matching a key whose value is an object resulted in an empty value.
Closes #3288
2013-07-04 10:22:45 +02:00
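
One way to avoid this kind of false hit is to compare dotted field paths at segment boundaries rather than by raw string prefix. The sketch below is only an illustration of that idea: `IncludePaths` and both methods are hypothetical, wildcards are not handled, and it is not the actual map filtering code.

```java
// Hypothetical sketch, not the actual Elasticsearch filtering code.
// Paths are dot-separated field names such as "obj.field".
final class IncludePaths {
    private IncludePaths() {}

    // True if `path` is the include itself or a field nested under it.
    // Appending "." prevents "object" from matching the include "obj".
    static boolean isIncluded(String include, String path) {
        return path.equals(include) || path.startsWith(include + ".");
    }

    // True if `path` is a proper ancestor of the include, so a filter has
    // to descend into it. "ob" is a string prefix of "obj.field" but not
    // an ancestor, so it is rejected.
    static boolean isAncestorOf(String include, String path) {
        return include.startsWith(path + ".");
    }
}
```

With these checks, an include that names an object key exactly also covers every field nested under it, so the object is kept with its contents.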
Martijn van Groningen 4c8f3de34b Fixes class cast exception when `top_children`, `has_child` and `has_parent` queries are cached via `fquery` filter.
The error only occurs for `has_child` and `has_parent` if `score_mode` is used.
Closes #3290
2013-07-03 21:37:16 +02:00
Shay Banon ceb7d55857 more cat support
- add attributes to each cell
- change how it gets rendered, allow for other formats
- various other changes
2013-07-03 21:05:20 +02:00
Florian Schilling 0c2d12bda3 Geo-Refactoring
===============
The code handling geo-shapes is not centralized, and points are created
in several different places. Also, the collection of supported geo_shapes is
not complete with regard to the GeoJSON specification. This commit
centralizes the code related to geo calculations and extends the old API by
a set of new shapes.

Null-Shapes
===========
The latest implementation of geo-shapes allows indexing null-shapes. This
means a field that is defined to hold a geo-shape can be set to null. For
example:
    {
        "shape": null
    }

New Shapes
==========
The geo-shapes multipoint and multilinestring have been added to the
geo_shape types. Also geo_circle is introduced by this commit.

Dateline wrapping
=================
A major issue of geo-shapes is the spherical geometry. Since ElasticSearch
works on geo-coordinates by mapping the Earth's surface to a plane,
some shapes are hard to define if they cross the +180°/-180° longitude.
To solve this issue, ElasticSearch offers the possibility to define geo
shapes crossing this border; it decomposes these shapes and automatically
re-composes them in a spherical manner. This feature may change the indexed
shape type. If, for example, a polygon is defined that crosses the dateline,
it will be re-assembled into a set of polygons, so a multipolygon is
indexed. Likewise, linestrings crossing the dateline might be re-assembled
into multilinestrings.

Builders
========
The API has been refactored to use builders instead of shapes, so
parsing geo-shapes will result in builder objects. These builders can be
parsed and serialized without generating any shapes. As a result, shapes are
generated only on the nodes executing the actual operation. Also, the
base class ShapeBuilder implements the ToXContent interface, which allows
setting fields of XContent directly.

TODO’s
======
 - The geo-circle will not work if it crosses the dateline
 - The envelope also needs to be wrapped

Closes #1997 #2708
2013-07-03 10:53:41 +02:00
Shay Banon 0643569e70 remove table tests for now 2013-07-03 10:31:32 +02:00
Andrew Raines 0d57c4eafd Prefer getHostAddress(). 2013-07-02 23:40:22 -05:00
Andrew Raines 750f20d9d4 Add relocation info to _cat/shards. 2013-07-02 23:30:59 -05:00
Andrew Raines 8a90f4b5ff Don't trim() table since we need the newline most of the time. 2013-07-02 23:17:02 -05:00
Andrew Raines 4ab2cd13f2 Add /_cat/master. 2013-07-02 22:43:43 -05:00
Andrew Raines 7aa9d4bc9f Start cat api with shards endpoint. 2013-07-02 18:00:05 -05:00
Shay Banon 8919e7e602 simplify builder API with simpler get()
also simplify some common API calls, for example, a simplified format in Java API for providing mapping
2013-07-02 12:10:35 +02:00
Boaz Leskes 98bd5a0e66 _source could be loaded twice from disk
if only partial_fields was specified, or fields needed to be extracted from _source, the source itself doesn't need to be returned.
2013-07-02 11:59:14 +02:00
Alexander Reelsen 2dcc664310 Support for parent in multi get request
When specifying the docs to be returned in a multi get request, a parent
field could not be specified, so that some docs seemingly did not exist,
even though they did.

This fix behaves like the normal GetRequest and simply overwrites the
routing value if it has not yet been set.

Also a test for routing with mget has been added.

Closes #3274
2013-07-02 08:51:55 +02:00
Shay Banon 3a0ce0bde8 (Java) Using primitive arrays instead of Object with map/builder
also simplify and consolidate the builder generic value write handling
fixes #3279
2013-07-01 21:47:46 +02:00
Luca Cavanna 2314b8665b Parent was ignored in exists request
The routing was always set to null right after the parent was set.

Closes #3276
2013-07-01 17:15:17 +02:00
Benjamin Devèze 9ce0156d39 Supports mget fields parameter given as string.
Closes #3270
2013-07-01 13:48:27 +02:00
Alexander Reelsen 7790d2bf65 Stop aborting of multiget requests in case of missing index
The MultiGet API stops with an IndexMissingException if even one of the
requests tries to access a non-existing index. This patch creates a
failure for this item without failing the whole request.

Closes #3267
2013-07-01 09:43:43 +02:00
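
The per-item failure behaviour described above can be sketched with plain collections. Everything here (`MultiGetSketch`, `Item`, `ItemResponse`, the in-memory `indices` map, and the failure message text) is hypothetical and stands in for the real request and response classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual MultiGet implementation: each item is
// resolved independently, so a missing index yields a failure for that item
// only instead of aborting the whole request.
final class MultiGetSketch {
    private MultiGetSketch() {}

    static final class Item {
        final String index;
        final String id;

        Item(String index, String id) {
            this.index = index;
            this.id = id;
        }
    }

    static final class ItemResponse {
        final String source;  // null if the item failed or the doc is absent
        final String failure; // null if the item did not fail

        ItemResponse(String source, String failure) {
            this.source = source;
            this.failure = failure;
        }
    }

    // `indices` maps an index name to its documents (id -> source).
    static List<ItemResponse> multiGet(Map<String, Map<String, String>> indices,
                                       List<Item> items) {
        List<ItemResponse> responses = new ArrayList<>(items.size());
        for (Item item : items) {
            Map<String, String> docs = indices.get(item.index);
            if (docs == null) {
                // Record the failure and keep going with the other items.
                responses.add(new ItemResponse(null, "index [" + item.index + "] missing"));
            } else {
                responses.add(new ItemResponse(docs.get(item.id), null));
            }
        }
        return responses;
    }
}
```

The response list keeps the order of the request items, so a caller can match each failure to the item that caused it.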
Martijn van Groningen 751d4ab68e Clean up update tests 2013-06-30 18:50:22 +02:00