65 Commits

Author SHA1 Message Date
Michael McCandless
85065f9c8e Core: cutover to Lucene's query rescorer
This is functionally equivalent to before, so there should be no
user-visible impact, except I added a NOTE in the docs warning about
the interaction of pagination and rescoring.

Closes #6232

Closes #7707
2014-10-18 05:25:50 -04:00
Clinton Gormley
6a180d1803 Docs: Update highlighting.asciidoc
Added note about how to highlight on the `_all` field

Closes #7991
2014-10-15 13:45:56 +02:00
Clinton Gormley
cb00d4a542 Docs: Removed all the added/deprecated tags from 1.x 2014-09-26 21:04:42 +02:00
Adrien Grand
4bfad644b3 Aggregations: Forbid usage of aggregations in conjunction with search_type=SCAN.
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.

Close #7429
2014-09-03 09:03:01 +02:00
Adrien Grand
203e80e650 Aggregations: Only return aggregations on the first page when scrolling.
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.

Same as #1642 but for aggregations.
2014-09-03 09:03:01 +02:00
Adrien Grand
8e1d3d56b3 Docs: Replace added[1.4.0] with coming[1.4.0] since 1.4 is not released yet. 2014-08-29 11:57:22 +02:00
Adrien Grand
ea96359d82 Facets: Removal from master.
Close #7337
2014-08-21 10:34:39 +02:00
Britta Weber
639692943f Docs: Document distance type and sort mode for many to many geo_points
closes #7280
2014-08-18 16:15:55 +02:00
Britta Weber
d49ed93488 Docs: md -> asciidoc 2014-08-08 11:25:14 +02:00
Adrien Grand
d9d5b35be9 Sort: Make ignore_unmapped work for cross-index queries.
Close #2255
2014-08-01 15:30:17 +02:00
Britta Weber
d6a18ab2ba Docs: add 1.4.0 label to many to many geo distance sort 2014-08-01 12:30:08 +02:00
Kurt Hurtado
66560acebb Update fielddata-fields.asciidoc 2014-08-01 09:20:19 +02:00
Britta Weber
fe86c8bc88 _geo_distance sort: allow many to many geo point distance
Add computation of disyance to many geo points. Example request:

```
{
  "sort": [
    {
      "_geo_distance": {
        "location": [
          {
            "lat":1.2,
            "lon":3
          },
          {
             "lat":1.2,
            "lon":3
          }
        ],
        "order": "desc",
        "unit": "km",
        "sort_mode": "max"
      }
    }
  ]
}
```

closes #3926
2014-07-31 17:33:45 +02:00
Clinton Gormley
36e1c7928c Rewrote post-filter.asciidoc
Closes #5166
2014-07-31 12:56:11 +02:00
Clinton Gormley
10b4177def Docs: Fixed path to search-shards 2014-07-26 15:05:53 +02:00
Clinton Gormley
0f943850a0 Update named-queries-and-filters.asciidoc 2014-07-23 17:28:49 +02:00
Clinton Gormley
6a7a77eada Docs: Add links to client helper classes for bulk/scroll/reindexing 2014-07-18 13:55:47 +02:00
Clinton Gormley
b6baa4be4a Update preference.asciidoc
Clarify that `preference` is a query string parameter only
and provide an example.
2014-07-09 11:13:17 +02:00
Clinton Gormley
feb81e228b Docs: Rewrote the scroll/scan docs
Closes #6774
2014-07-08 11:54:53 +02:00
Clinton Gormley
64a4acc49b Docs: Added IDs to the highlighters for linking 2014-06-22 16:46:42 +02:00
Luke Fender
f9da5259bc [DOCS] Fixed typo in post-filter.asciidoc
Remove 'be' where it is not needed
2014-06-12 12:09:19 +02:00
Nik Everett
3573822b7e Highlight fields in request order
Because json objects are unordered this also adds an explicit order syntax
that looks like
    "highlight": {
        "fields": [
            {"title":{ /*params*/ }},
            {"text":{ /*params*/ }}
        ]
    }

This is not useful for any of the builtin highlighters but will be useful
in plugins.

Closes #4649
2014-05-22 16:44:14 +02:00
Clinton Gormley
ff12585fea Improved wording in search-type.asciidoc
Closes #5951
2014-05-14 12:15:48 +02:00
Lee Hinman
57bee03193 [DOCS] Add /_search_shards documentation 2014-04-22 08:54:32 -06:00
Andrew O'Brien
48031b6236 Fixes typo in "Scan" search type documention 2014-04-07 16:01:37 -06:00
Hannes Korte
c11293ad78 Fix some typos in documentation. 2014-03-31 13:48:17 +02:00
bleskes
5d832374dd Update Documentation Feature Flags [1.1.0] 2014-03-25 17:51:30 +01:00
Randy Stauner
1486188a3b [DOCS] Reword clear-scroll sentence 2014-03-17 12:08:49 +01:00
Simon Willnauer
fbb8c0fafa [DOCS] Add coming tag to multiple rescores
Closes #5365
2014-03-10 09:27:44 +01:00
Luca Cavanna
4e6610a798 Fixed multi term queries support in postings highlighter for non top-level queries
In #4052 we added support for highlighting multi term queries using the postings highlighter. That worked only for top-level queries though, and not for multi term queries that are nested for instance within a bool query, or filtered query, or a constant score query.

The way we make this work is by walking the query structure and temporarily overriding the query rewrite method with a method that allows for multi terms extraction.

Closes #5102
2014-02-21 21:43:40 +01:00
Simon Willnauer
990ce658a4 [Docs] Remove custom_score from documentation and add a migration
section.
2014-02-11 14:59:15 +01:00
Clinton Gormley
93930d6dc7 Removed 0.90.* deprecation and addition notifications
Closes #5052
2014-02-07 20:52:49 +01:00
Lars Francke
1bd9dc129b Fix confusing sentence
The original sentence didn't make much sense. I hope this is a bit better. Taken heavy inspiration from c63d8c4fb5
2014-02-03 17:20:40 +01:00
Nik Everett
93a8e80aff Support multiple rescores
Detects if rescores arrive as an array instead of a plain object.  If so
then parse each element of the array as a separate rescore to be executed
one after another.  It looks like this:
   "rescore" : [ {
      "window_size" : 100,
      "query" : {
         "rescore_query" : {
            "match" : {
               "field1" : {
                  "query" : "the quick brown",
                  "type" : "phrase",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }, {
      "window_size" : 10,
      "query" : {
         "score_mode": "multiply",
         "rescore_query" : {
            "function_score" : {
               "script_score": {
                  "script": "log10(doc['numeric'].value + 2)"
               }
            }
         }
      }
   } ]

Rescores as a single object are still supported.

Closes #4748
2014-01-23 16:29:07 +01:00
Nik Everett
37f80c8d80 Documentation for score_mode
Closes #4742
2014-01-23 16:24:48 +01:00
Lee Hinman
2c289fb538 Add the ability to retrieve fields from field data
Adds a new FetchSubPhase, FieldDataFieldsFetchSubPhase, which loads the
field data cache for a field and returns an array of values for the
field.

Also removes `doc['<field>']` and `_source.<field>` workaround no longer
needed in field name resolving.

Closes #4492
2014-01-21 09:13:32 -07:00
Clinton Gormley
2e4b70d40f [DOCS] Fixed duplicate ID in highlighting 2014-01-09 00:37:18 +01:00
Nik Everett
8bd9e34e39 Stop FVH from throwing away some query boosts
The FVH was throwing away some boosts on queries stopping a number of
ways to boost phrase matches to the top of the list of fragments from
working.

The plain highlighter also doesn't work for this but that is because it
doesn't support the concept of the same term having a different score at
different positions.

Also update documentation claiming that FHV is nicer for weighing terms
found by query combinations.

Closes #4351
2014-01-08 11:51:48 +01:00
Nik Everett
522d620eb6 Use FHV's phraseLimit
This prevents poisoning the FVH with documents that contain TONS of matches
which take tons of memory and time to highlight.

Closes #4645
2014-01-08 11:27:58 +01:00
Simon Willnauer
fa16969360 Cleanup comments and class names s/ElasticSearch/Elasticsearch
* Clean up s/ElasticSearch/Elasticsearch on docs/*
 * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml
 * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile

Closes #4634
2014-01-07 11:21:51 +01:00
Martijn van Groningen
f1bf585089 The fields option should always return an array for json document fields and single valued field for metadata fields.
Also the `fields` option can only be used to fetch leaf fields, trying to do fetch object fields will return in a client error.

Closes #4542
2014-01-03 17:29:12 +01:00
Clinton Gormley
34b9b16233 [DOCS] Fixed some bad link refs 2013-12-16 18:07:33 +01:00
Martijn van Groningen
23d2b1ea7b Renamed top level filter to post_filter.
Closes #4119
2013-12-16 17:10:14 +01:00
Martijn van Groningen
10e2528cce Added the force_source option to highlighting that enforces to use of the _source even if there are stored fields.
The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields,
this is possible b/c the _source of the document being percolated is always present.

Closes #4348
2013-12-13 13:39:53 +01:00
Nik Everett
8e34057bc0 Add support for combining fields to the FVH
The Fast Vector Highlighter can combine matches on multiple fields to
highlight a single field using `matched_fields`.  This is most
intuitive for multifields that analyze the same string in different
ways.  Example:
{
    "query": {
        "query_string": {
            "query": "content.plain:running scissors",
            "fields": ["content"]
        }
    },
    "highlight": {
        "order": "score",
        "fields": {
            "content": {
                "matched_fields": ["content", "content.plain"],
                "type" : "fvh"
            }
        }
    }
}

Closes #3750
2013-12-03 11:10:01 +01:00
Conrad Pankoff
87246af256 [DOCS] Fixed typos and corrected grammar 2013-12-02 10:08:26 +01:00
Clinton Gormley
bc393b6d79 Changed the minScore comparator from > to >=
Closes #4303
2013-11-29 20:29:20 +01:00
Boaz Leskes
c63d8c4fb5 [Docs] Added _source filtering to documentation
Relates to #3301
2013-11-26 19:16:24 +01:00
Clinton Gormley
3465e69e83 [DOCS] Changed all store:yes/no to store:true/false
which is how this setting is stored internally
2013-11-07 16:57:18 +01:00
Luca Cavanna
48ac9747a8 Added third highlighter type based on lucene postings highlighter
Requires field index_options set to "offsets" in order to store positions and offsets in the postings list.
Considerably faster than the plain highlighter since it doesn't require to reanalyze the text to be highlighted: the larger the documents the better the performance gain should be.
Requires less disk space than term_vectors, needed for the fast_vector_highlighter.
Breaks the text into sentences and highlights them. Uses a BreakIterator to find sentences in the text. Plays really well with natural text, not quite the same if the text contains html markup for instance.
Treats the document as the whole corpus, and scores individual sentences as if they were documents in this corpus, using the BM25 algorithm.

Uses forked version of lucene postings highlighter to support:
- per value discrete highlighting for fields that have multiple values, needed when number_of_fragments=0 since we want to return a snippet per value
- manually passing in query terms to avoid calling extract terms multiple times, since we use a different highlighter instance per doc/field, but the query is always the same

The lucene postings highlighter api is  quite different compared to the existing highlighters api, the main difference being that it allows to highlight multiple fields in multiple docs with a single call, ensuring sequential IO.
The way it is introduced in elasticsearch in this first round is a compromise trying not to change the current highlight api, which works per document, per field. The main disadvantage is that we lose the sequential IO, but we can always refactor the highlight api to work with multiple documents.

Supports pre_tag, post_tag, number_of_fragments (0 highlights the whole field), require_field_match, no_match_size, order by score and html encoding.

Closes #3704
2013-10-24 23:38:00 +02:00