OpenSearch

Commit Graph

Author	SHA1	Message	Date
Shay Banon	b12acbcf9e	introduce read/writeSharedString while streaming. Currently, we treat all strings as shared (either by full equality or identity equality), while almost all times we know if they should be serialized as shared or not. Add an explicit write/readSharedString, and use it where applicable; all other write/readString calls will not treat them as shared. Relates to #3322	2013-07-19 16:17:22 +02:00
Shay Banon	74a7c46b0e	top level filter not resulting in an actual filter is ignored. When parsing a filter, we use null to indicate that this filter should not match anything; the top level filter doesn't take it into account. Fixes #3356	2013-07-19 13:12:23 +02:00
Adrien Grand	fe4c2a9d02	Work around the fact that AssertionError(String message, Throwable cause) is a Java 1.7-only API.	2013-07-19 09:47:55 +02:00
Adrien Grand	12d9268db2	Make field data able to support more than 2B ordinals per segment. Although segments are limited to 2B documents, there is no limit on the number of unique values that a segment may store. This commit replaces 'int' with 'long' every time a number is used to represent an ordinal and modifies the data-structures used to store ordinals so that they can actually support more than 2B ordinals per segment. This commit also improves memory usage of the multi-ordinals data-structures and the transient memory usage which is required to build them (OrdinalsBuilder) by using Lucene's PackedInts data-structures. In the end, loading the ordinals mapping from disk may be a little slower, field-data-based features such as faceting may be slightly slower or faster depending on whether being nicer to the CPU caches balances the overhead of the additional abstraction or not, and memory usage should be better in all cases, especially when the size of the ordinals mapping is not negligible compared to the size of the values (numeric data for example). Close #3189	2013-07-19 09:10:08 +02:00
Martijn van Groningen	4d05c9cfd5	Optimize `has_child` query & filter execution with two short circuit mechanisms: * If all parent ids have been emitted as hit, abort the query / filter execution. * If a relatively small number of parent ids have been collected in the first phase then limit the number of second phase parent id lookups by putting a short circuit filter before parent document evaluation or omit it in the case of the filter. This is controllable via the `short_circuit_cutoff` option which is exposed in the `has_child` query & filter. All parent / child queries and filters (except `top_children` query) abort execution if no parent ids have been collected in the first phase. Closes #3190	2013-07-18 17:41:23 +02:00
Martijn van Groningen	c222ce28fc	Redesigned the percolator engine to execute in a distributed manner. With this design the percolate queries will be stored in a special `_percolator` type with its own mapping in the same index where the actual data is or in a different index (dedicated percolation index, which might require different sharding behavior compared to the index that holds actual data and is being searched on). This approach allows percolate requests to scale to the number of primary shards an index has been configured with and effectively distributes the percolate execution. This commit doesn't add new percolate features other than scaling. The response remains similar, with the exception that a header similar to the search api has been added to the percolate response. Closes #3173	2013-07-18 16:52:42 +02:00
Adrien Grand	f38103a232	Eclipse: organize imports on save. It can happen that Eclipse fails at correctly adding a new import entry to an existing list of imports since we don't use its default rules. This commit forces Eclipse to organize imports on save.	2013-07-18 14:49:35 +02:00
Florian Schilling	1e5e8d83b1	Changed GeoPoint parsing in several parsers using Geopoint.parse() Closes #3351	2013-07-18 12:49:12 +02:00
Robin Hughes	45a756c203	Analysis: update ThaiAnalyzerProvider to use custom stopwords setting	2013-07-18 11:18:37 +02:00
Luca Cavanna	c28452ee67	added missing link to the elasticsearch.org website	2013-07-17 19:07:22 +02:00
Adrien Grand	ffcc710e4e	Add the ability to ignore or fail on numeric fields when executing more-like-this or fuzzy-like-this queries. More-like-this and fuzzy-like-this queries expect analyzers which are able to generate character terms (CharTermAttribute), so unfortunately this doesn't work with analyzers which generate binary-only terms (BinaryTermAttribute, the default CharTermAttribute impl being a special BinaryTermAttribute) such as our analyzers for numeric fields (byte, short, integer, long, float, double but also date and ip). To work around this issue, this commit adds a fail_on_unsupported_field parameter to the more-like-this and fuzzy-like-this parsers. When this parameter is false, numeric fields will just be ignored and when it is true, an error will be returned, saying that these queries don't support numeric fields. By default, this setting is true but the mlt API sets it to false in order not to fail on documents which contain numeric fields. Close #3252	2013-07-16 18:37:34 +02:00
Clinton Gormley	1bc8f82d0a	Merge pull request #3341 from clintongormley/pattern_capture Added the "pattern_capture" token filter from Lucene 4.4	2013-07-16 09:20:13 -07:00
Clinton Gormley	16e137ebbc	Added the "pattern_capture" token filter from Lucene 4.4 The XPatternCaptureGroupTokenFilter.java file can be removed once we upgrade to Lucene 4.4. This change required the addition of the commaDelimited flag to getAsArray() to disable parsing strings as comma-delimited values. Closes #3340	2013-07-16 18:08:12 +02:00
Luca Cavanna	933fd50466	Added support for multiple indices in open/close index apis. The open/close index api now supports multiple indices the same way as the delete index api works. The only exception is when dealing with all indices: it's required to explicitly use _all or a pattern that identifies all the indices, not just an empty array of indices. Supports the ignore_missing param too. Also added a new flag action.disable_close_all_indices (default false) to disable closing all indices. Closes #3217	2013-07-16 15:10:13 +02:00
Florian Schilling	6e9ad03b27	Fixed nullshape indexing. Closes #3310	2013-07-16 10:49:05 +02:00
Alexander Reelsen	3087fd8b2a	Removed useless TODO	2013-07-16 10:29:48 +02:00
Shay Banon	21677964a5	rename variable and add comment about TopDocs#merge	2013-07-16 10:28:10 +02:00
Brett Dargan	94fd152eb1	Added statistical facet to term facet in SimpleNestedTests The test now uses a statistical facet plus a filter facet on nested documents.	2013-07-16 09:59:02 +02:00
Boaz Leskes	88eb3552d8	AtomicArray.toArray will now throw an exception if target array is of the wrong size.	2013-07-16 09:22:12 +02:00
Andrew Raines	092fd6fc7a	Add info to _cat/nodes, add _cat/indices.	2013-07-15 16:03:21 -05:00
Boaz Leskes	c3038889f9	Using AtomicArray to collect responses in mget and bulk indexing (instead of synchronised)	2013-07-15 22:44:57 +02:00
Alexander Reelsen	28b9e25053	Fix xcontent serialization of timestamp/routing index field. The index field was serialized as a boolean instead of showing the 'analyzed', 'not_analyzed', 'no' options. Fixed by calling indexTokenizeOptionToString() in the builder. Closes #3174	2013-07-15 18:02:39 +02:00
Luca Cavanna	baea7fd1c2	fixed existing test and linked it to its issue	2013-07-15 17:31:49 +02:00
Alexander Reelsen	c59b0b22e2	Debian/Redhat package improvements. This decision helps people who want to rollout the oracle java without having an openjdk java installed. * Removed any hard dependency on Java in the debian package * The debian init script does not check for an existing JAVA_HOME anymore * Debian and RedHat initscripts now exit if they do not find a java binary (instead of starting elasticsearch in the background and swallowing the error as there is no way to log it in that case) * Changed the debian init script to rely on the pid file instead of the argument name of process * Added a useful error message in case no java binary is available (in elasticsearch shell script) Closes #3304 Closes #3311	2013-07-15 16:03:24 +02:00
Simon Willnauer	37edfe060b	Set spare before comparing comparator bottom value. The actual document's value was never calculated if setSpare wasn't called before compareBottom was called on a certain document. Closes #3309	2013-07-15 15:40:58 +02:00
Benjamin Devèze	b116097ea5	Add found field for bulk deletes. Closes #3320	2013-07-15 15:08:43 +02:00
Martijn van Groningen	470b685fa9	Renamed IndicesGetAliases* classes to begin with GetAliases*	2013-07-15 14:55:46 +02:00
Shay Banon	f28ff2becc	move to use ScoreDoc/FieldDoc instead of our wrappers. Now that we have the concept of a shardIndex as part of our search execution, we can simply move to use ScoreDoc and FieldDoc instead of having our own wrappers that held the info. Also, rename shardRequestId where needed to be called shardIndex to conform with the variable name in Lucene	2013-07-15 14:54:28 +02:00
Britta Weber	7098073a66	fix term vector api retrieved wrong doc. The previous loading of term vectors from the top level reader did not use the correct docId. The docId in Versions.DocIdAndVersion is relative to the segment reader and not to the top level reader. Consequently the term vectors for the wrong document were returned if the document was not on the first segment of the shard.	2013-07-15 14:50:48 +02:00
Shay Banon	3004a2a696	move the fields doc queue to a better package location	2013-07-15 14:33:44 +02:00
Martijn van Groningen	127c62924b	Rename IndicesAdminClient#existsAliases to IndicesAdminClient#aliasesExist. Closes #3330	2013-07-15 14:28:09 +02:00
Adrien Grand	1310f02e6c	Rename DocIdAndVersion.reader to DocIdAndVersion.context to avoid confusion.	2013-07-15 14:21:35 +02:00
Boaz Leskes	9e8c42f0c6	multiget requests which referred to missing indexes blocked and never returned.	2013-07-15 09:58:10 +02:00
Shay Banon	8e0d23b147	search reducer to use atomic reference arrays. Move away from maps to correlate between responses from different shards to a unique incremental integer representing a shardRequestId (unique for the specific search request). This allows us to no longer require using maps (or CHM), and simply use atomic reference arrays, which rely on volatiles. It also removes the need to use a cache for heavy data structures since we don't really have them around anymore...	2013-07-14 00:51:54 +02:00
Shay Banon	2762fed04f	remove unused class	2013-07-13 17:23:14 +02:00
Shay Banon	9f6117612c	cache recycler now node/client level component	2013-07-13 01:00:45 +02:00
Shay Banon	17936fabb0	latest jsr166 upgrade only compiled with 1.7	2013-07-11 22:48:49 +02:00
Shay Banon	fe6fb7135b	update to latest jsr166	2013-07-10 13:09:04 -07:00
Boaz Leskes	abf2268574	Added an error message for when child mapping is not properly configured (incorrect type)	2013-07-09 14:06:48 +02:00
Adrien Grand	c37de66fb6	Don't reset TokenStreams twice when highlighting. When using PlainHighlighter, TokenStreams are reset both before highlighting and at the beginning of highlighting, causing issues with analyzers that read in reset() such as PatternAnalyzer. This commit removes the call to reset which was performed before passing the TokenStream to the highlighter. Close #3200	2013-07-08 14:31:06 +02:00
Shay Banon	759a13f1de	optimize reroute - optimize initialization of building all the assigned shards state - optimize iteration in throttling allocation decider	2013-07-06 14:12:52 -07:00
Shay Banon	cc1173b58f	automatically set translog buffer size based on number of shards similar to how we set the indexing buffer size, automatically set the translog buffer size based on the number of shards allocated on a node	2013-07-06 10:56:36 -07:00
Shay Banon	4574489c27	make utf8 bytes response not reuse thread local buffer no need, optimized conversion to bytes anyhow, and when sending, it will just get wrapped by a buffer	2013-07-06 10:42:34 -07:00
Shay Banon	f4d1895399	guice optimization: only under debug logging use the source provider to find the line number through stack trace elements; otherwise, it's very expensive	2013-07-05 22:47:27 -07:00
Shay Banon	b9a2fbd874	properly reuse indices analyzers. Don't wrap in AnalysisService the indices analyzers we have with a NamedAnalyzer, since it effectively creates a new instance of an analyzer (with per field reuse strategy) and we don't benefit as much from reusing analyzers on the indices / node level. Now, the indices level analyzers return a NamedAnalyzer; also NamedAnalyzer will use the non per field reuse strategy since that's really the common case for it (no need for per field reuse there). Also, try and reuse numeric analyzers globally instead of creating them per numeric mapper. Although those analyzers are not used during indexing (we have a custom numeric field for it), they can be used sometimes when searching in a query string for example without specific query implementation in the mappers	2013-07-05 19:07:08 -07:00
Shay Banon	8d9c84f84e	optimize guice injector once created. In guice, we always use eager loaded singletons for all modules we create; thus, we can actually optimize the memory used by injectors by reducing the construction information they store per binding, resulting in extensive reduction in memory usage for the many indices/shards case on a node. Also, because all are eager singletons (and effectively, read only), we can skip trying to create just in time bindings in the parent injector before trying to create it in the current injector, resulting in improvement of object creation time and the time it takes to create an index or a shard on a node	2013-07-05 17:30:09 -07:00
Shay Banon	09a6907cca	optimize applyDeletes event - reuse set - don't copy over again the shard ids immutable set	2013-07-05 17:30:09 -07:00
Boaz Leskes	5b078ebfed	fixed casting that caused compilation errors with JDK7	2013-07-05 16:26:50 +02:00
Boaz Leskes	491d2b721c	added support for a prefix wild card (*.field) in includes	2013-07-05 12:51:07 +02:00
Andrew Raines	1645e4230f	Add /_cat/nodes.	2013-07-04 18:58:59 -05:00
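
Commits 8e0d23b147 and c3038889f9 above describe collecting per-shard responses in atomic reference arrays indexed by a per-request integer, instead of maps or synchronized blocks. The following is a minimal sketch of that general idea using only the JDK. The class name `ShardResponseCollector` and its methods are hypothetical illustrations; they are not the project's `AtomicArray` class or its search reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration: one slot per shard, addressed by a shard index,
// so no map lookup or synchronized block is needed to correlate responses.
public class ShardResponseCollector {
    private final AtomicReferenceArray<String> responses;
    private final CountDownLatch pending;

    public ShardResponseCollector(int shardCount) {
        this.responses = new AtomicReferenceArray<>(shardCount);
        this.pending = new CountDownLatch(shardCount);
    }

    // Called from any thread when a shard answers. Only the first response
    // for a given slot is stored and counted.
    public void onResponse(int shardIndex, String response) {
        if (responses.compareAndSet(shardIndex, null, response)) {
            pending.countDown();
        }
    }

    // Blocks until every shard has answered, then returns the responses
    // ordered by shard index.
    public List<String> awaitAll() {
        try {
            pending.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shards", e);
        }
        List<String> ordered = new ArrayList<>(responses.length());
        for (int i = 0; i < responses.length(); i++) {
            ordered.add(responses.get(i));
        }
        return ordered;
    }

    public static void main(String[] args) {
        ShardResponseCollector collector = new ShardResponseCollector(3);
        // Shards answer concurrently and in no particular order.
        for (int shard = 2; shard >= 0; shard--) {
            final int shardIndex = shard;
            new Thread(() -> collector.onResponse(shardIndex, "hit-" + shardIndex)).start();
        }
        System.out.println(collector.awaitAll());
    }
}
```

Because each writer touches only its own slot, the results come back ordered by shard index regardless of arrival order.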

5293 Commits