652 Commits

Author SHA1 Message Date
Martijn van Groningen
ebf6519965 Added aggs option to percolate api documentation. 2013-12-10 14:09:37 +01:00
Martijn van Groningen
8c1de501e7 Update percolator highlighting docs. 2013-12-07 16:40:49 -05:00
Adrien Grand
32eb5ffa92 [Docs] Document which encoding should be used in order to make sense of the offsets returned by the term vectors API.
Close #4363
2013-12-06 22:39:08 +01:00
Markus Fischer
2da0611dfb [DOCS] Completion suggest: Clarify de-duplication, optimize/merge
This contribution is based on the feedback given in issue #4254 and
issue #4255, and should clear things up, when suggestions are being
removed and not displayed anymore after deletion of data.
2013-12-05 11:10:56 +01:00
Nik Everett
8e34057bc0 Add support for combining fields to the FVH
The Fast Vector Highlighter can combine matches on multiple fields to
highlight a single field using `matched_fields`.  This is most
intuitive for multifields that analyze the same string in different
ways.  Example:
{
    "query": {
        "query_string": {
            "query": "content.plain:running scissors",
            "fields": ["content"]
        }
    },
    "highlight": {
        "order": "score",
        "fields": {
            "content": {
                "matched_fields": ["content", "content.plain"],
                "type" : "fvh"
            }
        }
    }
}

Closes #3750
2013-12-03 11:10:01 +01:00
Clinton Gormley
d9a480c97a [DOCS] Typos in aggregations 2013-12-02 15:14:25 +01:00
Conrad Pankoff
87246af256 [DOCS] Fixed typos and corrected grammar 2013-12-02 10:08:26 +01:00
uboness
cdc7dfbb2c Changed the "script_lang" parameter to "lang" in all value source based aggs - to be consistent with all other script based APIs. 2013-12-02 02:01:03 +01:00
Clinton Gormley
bc393b6d79 Changed the minScore comparator from > to >=
Closes #4303
2013-11-29 20:29:20 +01:00
uboness
0d6a35b9a7 - Added support for term filtering based on include/exclude regex on the terms agg
- Added javadoc to the TermsBuilder

Closes #4267
2013-11-29 13:46:48 +01:00
uboness
afb0d119e4 - Added docs for the value_count aggregation
- Fixed typos in the terms facets docs
- Fixed aggregation docs layout
- Added docs for shard_size in term aggregation
2013-11-29 12:35:42 +01:00
Clinton Gormley
cdc1935b6e [DOCS] Documented rest.action.multi.allow_explicit_index 2013-11-27 17:33:09 +01:00
Boaz Leskes
c63d8c4fb5 [Docs] Added _source filtering to documentation
Relates to #3301
2013-11-26 19:16:24 +01:00
Britta Weber
dbef64009f [DOC] add doc for multi term vector api
closes #3998
2013-11-26 17:03:14 +01:00
Alexander Reelsen
bf74f49fdd Updated Analyzing/Fuzzysuggester from lucene trunk
* Minor alignments (like setter to ctor)
* FuzzySuggester has a unicode aware flag, which is not exposed in the fuzzy completion request parameters
* Made XAnalyzingSuggester flags (PAYLOAD_SEP, END_BYTE, SEP_LABEL) to be written into the postings format, so we can retain backwards compatibility
* The above change also implies, that these flags can be set per instantiated XAnalyzingSuggester
* CompletionPostingsFormatTest now uses a randomProvider for writing data to check for bwc
2013-11-26 12:52:06 +01:00
uboness
c7f6c5266d initial commit of the aggregations module
Closes #3300
2013-11-24 03:13:08 -08:00
Clinton Gormley
3465e69e83 [DOCS] Changed all store:yes/no to store:true/false
which is how this setting is stored internally
2013-11-07 16:57:18 +01:00
Simon Willnauer
77bc5d5ecf release [1.0.0.Beta1] 2013-11-06 15:32:43 +01:00
Shay Banon
7c32269f4f Dist. Percolation: Use .percolator instead of _percolator for type name
Use .percolator as the internal (hidden) type name for percolators within the index. Seems nicer name to represent "hidden" types within an index.
closes #4090
2013-11-05 20:02:59 +01:00
Martijn van Groningen
30ab6f841d [DOCS] Fixed percolate docs errors 2013-11-01 11:44:07 +01:00
Clinton Gormley
8b2efd4849 [DOCS] Added a version flag to percolation 2013-10-30 13:59:03 +01:00
Luca Cavanna
48ac9747a8 Added third highlighter type based on lucene postings highlighter
Requires field index_options set to "offsets" in order to store positions and offsets in the postings list.
Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be.
Requires less disk space than term_vectors, needed for the fast_vector_highlighter.
Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance.
Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm.

Uses forked version of lucene postings highlighter to support:
- per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value
- manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same

The lucene postings highlighter api is  quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO.
The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents.

Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding.

Closes #3704
2013-10-24 23:38:00 +02:00
Luca Cavanna
e981e411d7 [DOCS] rephrased docs for highlight no_match_size parameter
(removed 0.90.6 coming tag as it's needed only in 0.90 branch)
2013-10-24 14:38:32 +02:00
Nik Everett
14a709f563 Highlighting can return excerpt with no highlights
You can configure the highlighting api to return an excerpt of a field
even if there wasn't a match on the field.

The FVH makes excerpts from the beginning of the string to the first
boundary character after the requested length or the boundary_max_scan,
whichever comes first.  The Plain highlighter makes excerpts from the
beginning of the string to the end of the last token before the requested
length.

Closes #1171
2013-10-24 14:38:32 +02:00
Markus Fischer
782d315da3 Fix markup 2013-10-21 16:11:09 +02:00
Clinton Gormley
b2d82d7e75 [DOCS] Reorganised the highlight_query docs and added a version flag 2013-10-18 18:03:31 +02:00
Nik Everett
60550e4cc2 phrase_len is not called phrase_length 2013-10-18 09:29:53 -04:00
Martijn van Groningen
1d0841e2b8 Added initial documentation for the redesigned percolator. 2013-10-16 14:12:19 +02:00
Clinton Gormley
9a062e465c [DOCS] Reorganised common API conventions 2013-10-13 16:46:56 +02:00
Clinton Gormley
ba1b4886e3 [DOCS] Moved "named filters/queries" up one level 2013-10-10 11:23:08 +02:00
Clinton Gormley
264a00a40f [DOCS] Added pages explaining lucene query parser syntax and regular expression syntax 2013-10-07 14:42:49 +02:00
Clinton Gormley
7a53d41446 [DOCS] Changed capitalization of operator in rescore query 2013-10-05 17:18:15 +02:00
uboness
f3c6108b71 introduced support for "shard_size" for terms & terms_stats facets. The "shard_size" is the number of term entries each shard will send back to the coordinating node. "shard_size" > "size" will increase the accuracy (both in terms of the counts associated with each term and the terms that will actually be returned the user) - of course, the higher "shard_size" is, the more expensive the processing becomes as bigger queues are maintained on a shard level and larger lists are streamed back from the shards.
closes #3821
2013-10-02 22:02:00 +02:00
Nik Everett
6b000d8c6d Support specifing score query on highlight.
This is useful if you want to highlight terms not in the search query or
you want sort highlighted snippets based on another query.

Closes #3630
2013-10-02 15:46:24 -04:00
Lee Hinman
ba40aa374e Uniquify anchor links to fix asciidoc/docbook generation 2013-09-30 15:32:00 -06:00
Lee Hinman
0442b737be Add more anchor links to documentation
Related to #3679
2013-09-30 13:13:16 -06:00
Adrien Grand
90524d7ad2 Fix formatting of the documentation.
Remaining '@'s have been replaced with '`'s.
2013-09-18 12:35:44 +02:00
David Pilato
1e3ffa0df7 Add distance supported units 2013-09-17 14:21:45 +02:00
Clinton Gormley
85bba668f7 [DOCS] Tidied up various doc formatting errors 2013-09-16 16:13:01 +02:00
Martijn van Groningen
f6f4b5014f Added docs for named queries.
Relates to #3581
2013-09-16 11:17:01 +02:00
David Pilato
169cd007b5 Fix typo
Thanks to @ybonnel for finding it ;-)
2013-09-12 11:00:59 +02:00
Martijn van Groningen
8ddb809f98 If all scroll ids should be removed then the _all value should be used instead of not specifying any scroll ids. 2013-09-12 10:41:38 +02:00
Martijn van Groningen
0efa78710b Added clear scroll api.
The clear scroll api allows clear all resources associated with a `scroll_id` by deleting the `scroll_id` and its associated SearchContext.

Closes #3657
2013-09-10 21:17:34 +02:00
Clinton Gormley
6d667e5d41 [DOCS] Missing sort values now works for all field types 2013-09-04 23:20:55 +02:00
Clinton Gormley
e1c6f45ff0 [DOCS] Added clarification about global scope in facets 2013-09-04 23:20:55 +02:00
Clinton Gormley
08f8e77b8f [DOCS] Added fuzzy options to completion suggester 2013-09-04 23:20:55 +02:00
Clinton Gormley
9f5d0b6e89 [DOCS] Added a few clarifications to the docs from the issues list 2013-09-04 23:20:55 +02:00
Clinton Gormley
5b60506b2e [DOCS] Added highlighting to the phrase suggester 2013-09-04 23:20:54 +02:00
Clinton Gormley
53ad7330fc [DOCS] Added docs for term vectors 2013-09-04 23:20:54 +02:00
Clinton Gormley
393c28bee4 [DOCS] Removed outdated new/deprecated version notices 2013-09-03 21:28:31 +02:00