OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-23 13:26:02 +00:00

Author	SHA1	Message	Date
Luca Cavanna	48ac9747a8	Added third highlighter type based on lucene postings highlighter. Requires field index_options set to "offsets" in order to store positions and offsets in the postings list. Considerably faster than the plain highlighter since it doesn't need to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be. Requires less disk space than term_vectors, needed for the fast_vector_highlighter. Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance. Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm. Uses a forked version of lucene postings highlighter to support: - per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value - manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same. The lucene postings highlighter api is quite different compared to the existing highlighters api, the main difference being that it allows highlighting multiple fields in multiple docs with a single call, ensuring sequential IO. The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents. Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding. Closes #3704	2013-10-24 23:38:00 +02:00
Simon Willnauer	08b4ca66c3	Use at least one query to prevent division by zero in PercolatorFacetsTests	2013-10-24 21:45:12 +02:00
Britta Weber	ebd328e340	remove trove import from mvel script all subclasses of trove maps have been removed in 088e05b368	2013-10-24 19:44:45 +02:00
Simon Willnauer	3a34aa735e	Upgrade to Lucene 4.5.1	2013-10-24 18:37:44 +02:00
Luca Cavanna	e981e411d7	[DOCS] rephrased docs for highlight no_match_size parameter (removed 0.90.6 coming tag as it's needed only in 0.90 branch)	2013-10-24 14:38:32 +02:00
Luca Cavanna	8e6c4ce8e8	Minor changes to no_match_size highlight parameter and highlight tests (#1171) - Randomly store the field to highlight - Updated test to use common assertions - Restored previously commented out testCommonTermsQuery	2013-10-24 14:38:32 +02:00
Nik Everett	14a709f563	Highlighting can return excerpt with no highlights You can configure the highlighting api to return an excerpt of a field even if there wasn't a match on the field. The FVH makes excerpts from the beginning of the string to the first boundary character after the requested length or the boundary_max_scan, whichever comes first. The Plain highlighter makes excerpts from the beginning of the string to the end of the last token before the requested length. Closes #1171	2013-10-24 14:38:32 +02:00
Costin Leau	919720ab4f	add detection of JRE server (JRE on Windows Server) fixes #3928 (cherry picked from commit a176ffda6fd97b6efbce6cc4f02a824bf5a10a17)	2013-10-24 15:28:17 +03:00
Costin Leau	64e4883e2e	add stop timeout and start mode for windows service.bat fixes #3938 fixes #3962 (cherry picked from commit f1d6ccc5845ca8f93177c0de3da59a2f37f67818)	2013-10-24 15:27:32 +03:00
Simon Willnauer	fb9cd5a562	Use abstract classes as super types for Acknowledge operations Currently we have a marker interface for Acknowledged[Request|Response], this makes not much sense since we duplicate the code in each subclass or class that implements the interface. We can simply use abstract classes and have it implemented only once.	2013-10-24 14:01:43 +02:00
Simon Willnauer	7867de4f5b	Refactor FieldData iterations This commit primarily folds [Double|Bytes|Long|GeoPoint]Values.Iter into [Double|Bytes|Long|GeoPoint]Values. Iterations now don't require an auxiliary class (Iter) but are instead driven by native for loops. All [Double|Bytes|Long|GeoPoint]Values are stateful and provide `setDocId` and `nextValue` methods to iterate over all values in a document. This has several advantages: * The amount of specialized classes is reduced * Iteration is clearly stateful ie. Iters can't be confused to be local. * All iterations are size bounded which prevents runtime checks and allows JIT optimizations / loop un-rolling and most iterations are branch free. * Due to the bounded iteration the need for a `hasNext` method call is removed. * Value iteration feels more native. This commit also adds consistent documentation and unifies the calculation if SortMode is involved. This commit also changes the runtime behavior of BytesValues#getValue() such that it will never return `null` anymore. If a document has no value in a field this method still returns a `BytesRef` with a `length` of 0. To identify documents with no values #hasValue() or #setDocument(int) should be used. The latter should be preferred if the value will be consumed in the case the document has a value.	2013-10-24 10:33:56 +02:00
Adrien Grand	7bd1a55f6e	Revert "Fix for has_child can cause an infinite loop (100% CPU) when used in bool query." Temporarily revert the commit while waiting for the CLA to be signed.	2013-10-24 09:43:58 +02:00
Josh Canfield	adadc72da3	Fix for has_child can cause an infinite loop (100% CPU) when used in bool query. Closes #3955	2013-10-24 09:24:14 +02:00
Shay Banon	35b573ff24	Transport: Have a separate channel for recovery Have a separate channel for recovery, so it won't overflow the "low" channel which is also used for bulk indexing. Also, rename the channel names to be more descriptive. Change low to bulk (for bulk based operations, currently just bulk indexing), med to reg (for "regular" operations), and high to state (for state based communication). The new channel for recovery will be named recovery, and the ping channel will remain the same. closes #3954	2013-10-23 15:55:27 -07:00
Adrien Grand	d18192b39f	Add doc values to TermsFacetSearchBenchmark.	2013-10-23 10:58:35 +02:00
Shay Banon	a3122a88e4	Java API: Setting track scores does not affect scan search type When setting track scores, the scan search type will return the scores for each document. The Java API builder does not properly set this value (it only sets it if a sort is in place, which is not relevant for scan search type). closes #3949	2013-10-22 18:04:22 -07:00
Boaz Leskes	0e6e6f97dc	Merge pull request #3940 from rboulton/patch-1 [Docs] Clean up wording in cluster health api doc	2013-10-22 04:09:13 -07:00
Shay Banon	5bc3825c70	externalize writing "raw" fields to helper method	2013-10-21 12:19:14 -07:00
Markus Fischer	782d315da3	Fix markup	2013-10-21 16:11:09 +02:00
Martijn van Groningen	8d49aa398f	Added facet support to the percolate api. Closes #3851	2013-10-21 19:13:28 +07:00
Richard Boulton	b62cc7c716	Clean up wording to reduce confusion The description of the timeout parameter was worded misleadingly; it implied that the API would wait until the cluster reached the desired level and then stayed at that level for the timeout. I've tweaked the sentence to remove the risk of confusion.	2013-10-21 12:37:50 +01:00
Clinton Gormley	b2d82d7e75	[DOCS] Reorganised the highlight_query docs and added a version flag	2013-10-18 18:03:31 +02:00
Luca Cavanna	b7d8c275eb	Fix small typo in terms lookup tests mapping (count api tests)	2013-10-18 17:55:21 +02:00
Matt Weber	e6fc416adc	Fix small typo in terms lookup tests mapping.	2013-10-18 17:55:21 +02:00
Matt Weber	1e0a834c68	Document strict dynamic type mapping.	2013-10-18 08:29:31 -07:00
Simon Willnauer	f749db26e8	Allow awareness attributes to be reset via the API Currently we don't allow resetting the awareness attribute via the API since it requires at least one non-empty string to update the setting. This commit allows resetting this using an empty string. Closes #3931	2013-10-18 16:25:00 +02:00
Nik Everett	60550e4cc2	phrase_len is not called phrase_length	2013-10-18 09:29:53 -04:00
Blake Smith	03a89297ba	Fix teh typos in javadocs	2013-10-18 12:19:37 +02:00
Shay Banon	c9b0e1de6c	Settings queue_size on index/bulk TP can cause rejection failures when executed over network The #3526 fix was not complete, it handled cases of on node execution, but didn't properly handle cases where it was executed over the network, and forcing the execution of the replica operation when done over the wire. This relates to #3854 closes #3929	2013-10-17 17:06:44 +03:00
Clinton Gormley	adf0c8424b	[DOCS] How to check max_file_descriptors	2013-10-17 11:54:36 +02:00
David Pilato	4efd94e7cf	Java API Documentation (0.90+) needs update for accessors in Facets docs Closes #3921. (cherry picked from commit a753c48)	2013-10-17 09:50:15 +02:00
Boaz Leskes	2593b6e644	Terms facet will now throw a parsing exception if no field, fields or a script is supplied. Previously you'd get an NPE. Also added extra exception when called with unknown parameters.	2013-10-17 09:26:43 +02:00
Honza Kral	dd43d932f1	Added a link to official Python client to the client list, fixed perl link	2013-10-16 17:51:50 +02:00
Honza Kral	4f3ad73854	Added brief overview of the python client to the guide	2013-10-16 17:45:05 +02:00
Martijn van Groningen	b7c4adeea3	[Docs] update reference to remove documentation about percolating during an index, bulk or update request.	2013-10-16 16:31:36 +02:00
Martijn van Groningen	1d0841e2b8	Added initial documentation for the redesigned percolator.	2013-10-16 14:12:19 +02:00
Boaz Leskes	18e12ef66c	[Docs] updated references to dynamic_date_formats	2013-10-16 12:04:31 +02:00
Boaz Leskes	57b2d45142	[Docs] added document for the lenient option in match queries	2013-10-16 10:53:25 +02:00
Martijn van Groningen	cc9ab111a0	Prohibit indexing a document with parent for a type that doesn't have a `_parent` field configured and prohibit adding a _parent field to an existing mapping. Closes #3848 #3849	2013-10-15 18:29:44 +02:00
Simon Willnauer	89de3ab627	Added simple count down class that allows to be fast forwarded Closes #3910	2013-10-15 17:47:51 +02:00
Luca Cavanna	fcf13e0fa7	Delete warmer api to support acknowledgements Added support for acknowledgements in delete warmer api using the generic mechanism introduced in #3786 Closes #3833	2013-10-15 17:47:50 +02:00
Luca Cavanna	31142ae471	Put warmer api to support acknowledgements Added support for acknowledgements in put warmer api using the generic mechanism introduced in #3786 Closes #3831	2013-10-15 17:47:50 +02:00
Luca Cavanna	55f1eab09a	Added generic cluster state update ack mechanism Added new AckedClusterStateUpdateTask interface that can be used to submit cluster state update tasks and allows actions to be notified back when a set of (configurable) nodes have acknowledged the cluster state update. Supports a configurable timeout, so that we wait for acknowledgement for a limited amount of time (will be provided in the request as it currently happens, default 10s). Internally, a low level AckListener is created (InternalClusterService) and passed to the publish method, so that it can be notified whenever each node responds to the publish request. Once all the expected nodes have responded or the timeout has expired, the AckListener notifies the action which will return adding the proper acknowledged flag to the response. Ideally, this new mechanism will gradually replace the existing ones based on custom endpoints and notifications (per api). Closes #3786	2013-10-15 17:47:50 +02:00
Clinton Gormley	f5e2cf9785	[Docs] Typo	2013-10-15 17:27:05 +02:00
Clinton Gormley	4798425da6	[Docs] Added a page for the Perl client	2013-10-15 17:22:34 +02:00
Alexander Reelsen	4d19239ec4	Add support for Lucene SuggestStopFilter The suggest stop filter is an improved version of the stop filter, which takes stopwords only into account if the last char of a query is a whitespace. This allows you to keep stopwords, but to allow suggesting for "a". Example: Index document content "a word". You are now able to suggest for "a" and get back results in the completion suggester, if the suggest stop filter is used on the query side, but will not get back any results for "a " as this is identified as a stopword. The implementation allows setting the `remove_trailing` parameter for a custom stop filter and thus using the suggest stop filter instead of the standard stop filter.	2013-10-15 16:12:02 +02:00
Clinton Gormley	870346070e	[DOCS] Added compound_on_flush docs and updated compound_format docs to include note about accepting a float	2013-10-15 13:30:56 +02:00
Clinton Gormley	d67331b554	[DOCS] Added script.disable_dynamic to the scripting page	2013-10-15 12:25:07 +02:00
Martijn van Groningen	c1ec32aa1e	Added `num_queries` and `memory_size` stats to percolate stats. Closes #3883	2013-10-15 10:30:49 +02:00
steve mayzak	48656fd1ed	removed a duplicate paragraph in config docs	2013-10-14 15:33:56 -07:00

5854 Commits