OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	8bd9e34e39	Stop FVH from throwing away some query boosts The FVH was throwing away some boosts on queries, stopping a number of ways to boost phrase matches to the top of the list of fragments from working. The plain highlighter also doesn't work for this but that is because it doesn't support the concept of the same term having a different score at different positions. Also update documentation claiming that FVH is nicer for weighing terms found by query combinations. Closes #4351	2014-01-08 11:51:48 +01:00
Nik Everett	522d620eb6	Use FVH's phraseLimit This prevents poisoning the FVH with documents that contain TONS of matches which take tons of memory and time to highlight. Closes #4645	2014-01-08 11:27:58 +01:00
Simon Willnauer	fa16969360	Cleanup comments and class names s/ElasticSearch/Elasticsearch * Clean up s/ElasticSearch/Elasticsearch on docs/* * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile Closes #4634	2014-01-07 11:21:51 +01:00
Martijn van Groningen	f1bf585089	The `fields` option should always return an array for json document fields and single valued field for metadata fields. Also the `fields` option can only be used to fetch leaf fields, trying to fetch object fields will result in a client error. Closes #4542	2014-01-03 17:29:12 +01:00
Clinton Gormley	34b9b16233	[DOCS] Fixed some bad link refs	2013-12-16 18:07:33 +01:00
Martijn van Groningen	23d2b1ea7b	Renamed top level `filter` to `post_filter`. Closes #4119	2013-12-16 17:10:14 +01:00
Martijn van Groningen	10e2528cce	Added the `force_source` option to highlighting that enforces use of the _source even if there are stored fields. The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields, this is possible b/c the _source of the document being percolated is always present. Closes #4348	2013-12-13 13:39:53 +01:00
Nik Everett	8e34057bc0	Add support for combining fields to the FVH The Fast Vector Highlighter can combine matches on multiple fields to highlight a single field using `matched_fields`. This is most intuitive for multifields that analyze the same string in different ways. Example: { "query": { "query_string": { "query": "content.plain:running scissors", "fields": ["content"] } }, "highlight": { "order": "score", "fields": { "content": { "matched_fields": ["content", "content.plain"], "type" : "fvh" } } } } Closes #3750	2013-12-03 11:10:01 +01:00
Conrad Pankoff	87246af256	[DOCS] Fixed typos and corrected grammar	2013-12-02 10:08:26 +01:00
Clinton Gormley	bc393b6d79	Changed the minScore comparator from > to >= Closes #4303	2013-11-29 20:29:20 +01:00
Boaz Leskes	c63d8c4fb5	[Docs] Added _source filtering to documentation Relates to #3301	2013-11-26 19:16:24 +01:00
Clinton Gormley	3465e69e83	[DOCS] Changed all store:yes/no to store:true/false which is how this setting is stored internally	2013-11-07 16:57:18 +01:00
Luca Cavanna	48ac9747a8	Added third highlighter type based on lucene postings highlighter Requires field index_options set to "offsets" in order to store positions and offsets in the postings list. Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be. Requires less disk space than term_vectors, needed for the fast_vector_highlighter. Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance. Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm. Uses forked version of lucene postings highlighter to support: - per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value - manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same The lucene postings highlighter api is quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO. The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents. Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding. Closes #3704	2013-10-24 23:38:00 +02:00
Luca Cavanna	e981e411d7	[DOCS] rephrased docs for highlight no_match_size parameter (removed 0.90.6 coming tag as it's needed only in 0.90 branch)	2013-10-24 14:38:32 +02:00
Nik Everett	14a709f563	Highlighting can return excerpt with no highlights You can configure the highlighting api to return an excerpt of a field even if there wasn't a match on the field. The FVH makes excerpts from the beginning of the string to the first boundary character after the requested length or the boundary_max_scan, whichever comes first. The Plain highlighter makes excerpts from the beginning of the string to the end of the last token before the requested length. Closes #1171	2013-10-24 14:38:32 +02:00
Clinton Gormley	b2d82d7e75	[DOCS] Reorganised the highlight_query docs and added a version flag	2013-10-18 18:03:31 +02:00
Clinton Gormley	ba1b4886e3	[DOCS] Moved "named filters/queries" up one level	2013-10-10 11:23:08 +02:00
Clinton Gormley	7a53d41446	[DOCS] Changed capitalization of operator in rescore query	2013-10-05 17:18:15 +02:00
Nik Everett	6b000d8c6d	Support specifying score query on highlight. This is useful if you want to highlight terms not in the search query or you want to sort highlighted snippets based on another query. Closes #3630	2013-10-02 15:46:24 -04:00
Lee Hinman	ba40aa374e	Uniquify anchor links to fix asciidoc/docbook generation	2013-09-30 15:32:00 -06:00
Lee Hinman	0442b737be	Add more anchor links to documentation Related to #3679	2013-09-30 13:13:16 -06:00
Clinton Gormley	85bba668f7	[DOCS] Tidied up various doc formatting errors	2013-09-16 16:13:01 +02:00
Martijn van Groningen	f6f4b5014f	Added docs for named queries. Relates to #3581	2013-09-16 11:17:01 +02:00
Martijn van Groningen	8ddb809f98	If all scroll ids should be removed then the `_all` value should be used instead of not specifying any scroll ids.	2013-09-12 10:41:38 +02:00
Martijn van Groningen	0efa78710b	Added clear scroll api. The clear scroll api allows clearing all resources associated with a `scroll_id` by deleting the `scroll_id` and its associated SearchContext. Closes #3657	2013-09-10 21:17:34 +02:00
Clinton Gormley	6d667e5d41	[DOCS] Missing sort values now works for all field types	2013-09-04 23:20:55 +02:00
Clinton Gormley	393c28bee4	[DOCS] Removed outdated new/deprecated version notices	2013-09-03 21:28:31 +02:00
Clinton Gormley	822043347e	Migrated documentation into the main repo	2013-08-29 01:24:34 +02:00

28 Commits