773 Commits

Author SHA1 Message Date
Simon Willnauer
040b86a76b Set shard count limit to unlimited (#24012)
Now that we have incremental reduce functions for topN and aggregations
we can set the default for `action.search.shard_count.limit` to unlimited.
This still allows users to restrict these settings while by default we executed
across all shards matching the search requests index pattern.
2017-04-10 17:09:21 +02:00
Jim Ferenczi
9b3c85dd88 Deprecate _field_stats endpoint (#23914)
_field_stats has evolved quite a lot to become a multi purpose API capable of retrieving the field capabilities and the min/max value for a field.
In the mean time a more focused API called `_field_caps` has been added, this enpoint is a good replacement for _field_stats since he can
retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures).
Also the recent improvement made to range queries makes the _field_stats API obsolete since this queries are now rewritten per shard based on the min/max found for the field.
This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently.
For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet which is why
 this PR is made directly to 6.0.
 The rest tests have also been adapted to not throw an error while this change is backported to 5.4.
2017-04-10 10:10:16 +02:00
Nik Everett
7fad7c675d Rewrite the scripting security docs (#23930)
They needed to be updated now that Painless is the default and
the non-sandboxed scripting languages are going away or gone.

I dropped the entire section about customizing the classloader
whitelists. In master this barely does anything (exposes more
things to expressions).
2017-04-07 11:46:41 -04:00
Nik Everett
048191ceb6 CONSOLEify highlighting a function_score docs
Converts many of the partial examples into full search requests.

Relates #18160
2017-04-06 08:13:56 -04:00
Christoph Büscher
024ed1b6ca Merge branch 'master' into feature/rank-eval 2017-04-04 18:23:41 +02:00
Jim Ferenczi
a8250b26e7 Add FieldCapabilities (_field_caps) API (#23007)
This change introduces a new API called `_field_caps` that allows to retrieve the capabilities of specific fields.

Example:

````
GET t,s,v,w/_field_caps?fields=field1,field2
````
... returns:
````
{
   "fields": {
      "field1": {
         "string": {
            "searchable": true,
            "aggregatable": true
         }
      },
      "field2": {
         "keyword": {
            "searchable": false,
            "aggregatable": true,
            "non_searchable_indices": ["t"]
            "indices": ["t", "s"]
         },
         "long": {
            "searchable": true,
            "aggregatable": false,
            "non_aggregatable_indices": ["v"]
            "indices": ["v", "w"]
         }
      }
   }
}
````

In this example `field1` have the same type `text` across the requested indices `t`, `s`, `v`, `w`.
Conversely `field2` is defined with two conflicting types `keyword` and `long`.
Note that `_field_caps` does not treat this case as an error but rather return the list of unique types seen for this field.
2017-03-31 15:34:46 +02:00
Glen Smith
c62d4b7b0f Clarify preference docs
This commit clarifies the preference docs regarding the explanation of
how operations are routed by default. In particular, the previous use of
"shard replicas" was confusing as it could imply an operation would only
be routed to replicas by default.

Relates #23794
2017-03-29 12:55:47 -04:00
Christoph Büscher
96fc3aaf6f Merge branch 'master' into feature/rank-eval 2017-03-23 19:55:47 +01:00
Igor Motov
f927a2708d Make it possible to validate a query on all shards instead of a single random shard (#23697)
This is especially useful when we rewrite the query because the result of the rewrite can be very different on different shards. See #18254 for example.
2017-03-22 17:39:21 -04:00
Jim Ferenczi
b8c352fc3f Add support for fragment_length in the unified highlighter (#23431)
* Add support for fragment_length in the unified highlighter

This commit introduce a new break iterator (a BoundedBreakIterator) designed for the unified highlighter
 that is able to limit the size of fragments produced by generic break iterator like `sentence`.
The `unified` highlighter now supports `boundary_scanner` which can `words` or `sentence`.
The `sentence` mode will use the bounded break iterator in order to limit the size of the sentence to `fragment_length`.
When sentences bigger than `fragment_length` are produced, this mode will break the sentence at the next word boundary **after**
 `fragment_length` is reached.
2017-03-17 18:10:13 +01:00
Jack Conradson
8e04561c0d Change params._source to params['_source'] in example. 2017-03-15 17:29:31 -07:00
Jack Conradson
4c11ebc8b9 Fix example in documentation for Painless using _source. (#21322) 2017-03-15 17:18:34 -07:00
Christoph Büscher
cf35545e2d Merge branch 'master' into feature/rank-eval 2017-03-13 17:36:13 -07:00
NFM
f8fa5c96aa Fix indentation in sort docs
This commit fixes the indentation in an example query in the sort docs.

Relates #23561
2017-03-12 17:08:06 -07:00
Christoph Büscher
1f4c4d99b9 Merge branch 'master' into feature/rank-eval 2017-02-27 11:25:17 +01:00
Shai Erera
eeac6d27f2 Add BreakIteratorBoundaryScanner support for FVH (#23248)
This commit adds a boundary_scanner property to the search highlight
request so the user can specify different boundary scanners:

* `chars` (default,  current behavior)
* `word` Use a WordBreakIterator
* `sentence` Use a SentenceBreakIterator

This commit also adds "boundary_scanner_locale" to define which locale
should be used when scanning the text.
2017-02-23 23:32:22 +01:00
Christoph Büscher
cfa52f8b9a Merge branch 'master' into feature/rank-eval 2017-02-16 10:39:07 +01:00
Adrien Grand
8d6a41f671 Nested queries should avoid adding unnecessary filters when possible. (#23079)
When nested objects are present in the mappings, many queries get deoptimized
due to the need to exclude documents that are not in the right space. For
instance, a filter is applied to all queries that prevents them from matching
non-root documents (`+*:* -_type:__*`). Moreover, a filter is applied to all
child queries of `nested` queries in order to make sure that the child query
only matches child documents (`_type:__nested_path`), which is required by
`ToParentBlockJoinQuery` (the Lucene query behing Elasticsearch's `nested`
queries).

These additional filters slow down `nested` queries. In 1.7-, the cost was
somehow amortized by the fact that we cached filters very aggressively. However,
this has proven to be a significant source of slow downs since 2.0 for users
of `nested` mappings and queries, see #20797.

This change makes the filtering a bit smarter. For instance if the query is a
`match_all` query, then we need to exclude nested docs. However, if the query
is `foo: bar` then it may only match root documents since `foo` is a top-level
field, so no additional filtering is required.

Another improvement is to use a `FILTER` clause on all types rather than a
`MUST_NOT` clause on all nested paths when possible since `FILTER` clauses
are more efficient.

Here are some examples of queries and how they get rewritten:

```
"match_all": {}
```

This query gets rewritten to `ConstantScore(+*:* -_type:__*)` on master and
`ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})`
with this change. The automaton is the complement of `_type:__*` so it matches
the same documents, but is faster since it is now a positive clause. Simplistic
performance testing on a 10M index where each root document has 5 nested
documents on average gave a latency of 420ms on master and 90ms with this change
applied.

```
"term": {
  "foo": {
    "value": "0"
  }
}
```

This query is rewritten to `+foo:0 #(ConstantScore(+*:* -_type:__*))^0.0` on
master and `foo:0` with this change: we do not need to filter nested docs out
since the query cannot match nested docs. While doing performance testing in
the same conditions as above, response times went from 250ms to 50ms.

```
"nested": {
  "path": "nested",
  "query": {
    "term": {
      "nested.foo": {
        "value": "0"
      }
    }
  }
}
```

This query is rewritten to
`+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+*:* -_type:__*))^0.0`
on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The
top-level filter (`-_type:__*`) could be removed since `nested` queries only
match documents of the parent space, as well as the child filter
(`#_type:__nested`) since the child query may only match nested docs since the
`nested` object has both `include_in_parent` and `include_in_root` set to
`false`. While doing performance testing in the same conditions as above,
response times went from 850ms to 270ms.
2017-02-14 16:05:19 +01:00
Tanguy Leroux
e2e5937455 Use typed_keys parameter to prefix suggester names by type in search responses (#23080)
This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.
2017-02-10 10:53:38 +01:00
Tanguy Leroux
63ea6f7168 [Docs] Remove unnecessary // TEST[continued] in search-template doc
It has been explained in e39b96f257
2017-02-10 10:08:24 +01:00
Jim Ferenczi
94087b3274 Removes ExpandCollapseSearchResponseListener, search response listeners and blocking calls
This changes removes the SearchResponseListener that was used by the ExpandCollapseSearchResponseListener to expand collapsed hits.
The removal of SearchResponseListener is not a breaking change because it was never released.
This change also replace the blocking call in ExpandCollapseSearchResponseListener by a single asynchronous multi search request. The parallelism of the expand request can be set via CollapseBuilder#max_concurrent_group_searches

Closes #23048
2017-02-09 18:06:10 +01:00
Tanguy Leroux
832952cb29 [Docs] Fix consoleify search-template.asciidoc
It does not reproduce well, hopefully this will fix the failure on DELETE _search/template/<templatename>.
2017-02-08 21:23:38 +01:00
Jay Modi
7f3769c745 Remove ldjson support and document ndjson for bulk/msearch (#23049)
This commit removes support for the `application/x-ldjson` Content-Type header as this was only used in the first draft
of the spec and had very little uptake. Additionally, the docs for bulk and msearch have been updated to specifically
call out ndjson and mention that the newline character may be preceded by a carriage return.

Finally, the bulk request handling of the carriage return has been improved to remove this character from the source.

Closes #23025
2017-02-08 11:55:50 -05:00
Tanguy Leroux
477d1aa8bf [Docs] Consoleify multi-search and search-template docs (#23047)
Relates #23001
2017-02-08 17:05:22 +01:00
Nik Everett
0e98c9107a Docs: CONSOLEify some more docs
These need to be CONSOLEified *now* because we're starting to
require Content-Type headers and they didn't have any.

* cluster/reroute: Marked as CONSOLE but skipped because the docs
build runs with a single node.
* docs/bulk: Marked as NOTCONSOLE because the snippets describe
either examples or `curl` commands. Fixed the `curl` command to
include the `Content-Type` header.
* query-dsl/terms-query: Marked as CONSOLE.
* search/request/rescore: Marked as CONSOLE. Fixed deprecated
syntax.

Relates #23001
Relates #18160
2017-02-07 16:49:01 -05:00
Jim Ferenczi
bbf62e3472 CONSOLify indices/analyze.asciidoc and search/field-stats.asciidoc
Relates #23001
2017-02-07 20:15:09 +01:00
Clinton Gormley
e181a020a9 Replaced absolute URLs in docs with attributes 2017-02-04 12:05:03 +01:00
Christoph Büscher
4cb8d9d08c Merge branch 'master' into feature/rank-eval
Conflicts:
	core/src/main/java/org/elasticsearch/script/Script.java
        docs/reference/search.asciidoc
2017-02-03 17:27:20 +01:00
Clinton Gormley
8ace37e214 Fix asciidoc in stored fields 2017-02-03 10:18:01 +01:00
Nicholas Knize
b41d5747f0 Reduce GeoDistance insanity
GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC, PLANE. This commit removes SLOPPY_ARC, and FACTOR and cleans up the way Geo distance is computed.
2017-02-02 12:39:42 -06:00
Jim Ferenczi
f6d38d480a Integrate UnifiedHighlighter (#21621)
* Integrate UnifiedHighlighter

This change integrates the Lucene highlighter called "unified" in the list of supported highlighters for ES.
This highlighter can extract offsets from either postings, term vectors, or via re-analyzing text.
The best strategy is picked automatically at query time and depends on the field and the query to highlight.
2017-01-31 19:06:03 +01:00
Jim Ferenczi
e7e871acdd Fix link to keyword and numeric type 2017-01-30 13:57:28 +01:00
Clinton Gormley
938f5194ef Include field-collapsing docs in request-body search 2017-01-30 11:47:12 +01:00
Jim Ferenczi
e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Christoph Büscher
9ed867ea83 [DOCS] Fix inconsistent formatting for fieldnames in profile.asciidoc 2017-01-18 10:41:22 +01:00
Clinton Gormley
401438819e Docs: Fix the first highlighting example to work
Closes #22642
2017-01-17 12:20:03 +01:00
Christoph Büscher
2791c69960 Update profile.asciidoc
Making the "Human readable output" section a note instead of an own section.
2017-01-16 16:19:07 +01:00
Christoph Büscher
49a49da3f5 [Docs] Fix section title in profile.asciidoc 2017-01-16 14:53:06 +01:00
Christoph Büscher
59a48ffc41 ProfileResult and CollectorResult should print machine readable timing information (#22561)
Currently both ProfileResult and CollectorResult print the time field in a human readable string format
 (e.g. "time": "55.20315000ms"). When trying to parse this back to a long value, for example to use in 
the planned high level java rest client, we can lose precision because of conversion and rounding issues. 
This change adds a new additional field (`time_in_nanos`) to the profile response to be able to get the 
original time value in nanoseconds back. 

The old `time` field is only printed when the `?`human=true` flag in the url is set. This follow the behaviour for 
all other stats-related apis. Also the format of the `time` field is slightly changed. Instead of always formatting 
the output as a 10-digit ms value, by using the `XContentBuilder#timeValueField()` method we now print 
the largest time unit present is used (e.g. "s", "ms", "micros").
2017-01-16 14:27:55 +01:00
maciejkula
b4c8c21553 State default sort order on missing values
Closes #19099
2017-01-13 17:05:13 +01:00
Lee Hinman
2db01b6127 Merge remote-tracking branch 'dakrone/disable-all-by-default' 2017-01-12 10:17:51 -07:00
Tanguy Leroux
df703dce0a [DOC] Document {{url}} mustache function (#22549)
This function introduced in #20838 wasn't documented at all.

Related to #22459
2017-01-12 14:57:03 +01:00
Lee Hinman
7a18bb50fc Disable _all by default
This change disables the _all meta field by default.

Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.

_all can no longer be configured for indices created after 6.0.

Relates to #20925 and #21341
Resolves #19784
2017-01-11 16:47:13 -07:00
Martijn van Groningen
cb2333dacd percolator: remove deprecated percolate and mpercolate apis 2017-01-10 11:18:27 +01:00
Isabel Drost-Fromm
46c30e6bc3 Make maximum number of parallel search requests configurable. (#22192)
Problem: So far all rank eval requests are being executed in parallel. If there
are more than the search thread pool can handle, or if there are other search
requests executed in parallel rank eval can fail.

Solution: Make number of max_concurrent_searches configurable.

Name of configuration parameter is analogous to msearch. Default
max_concurrent_searches set to 10: Rank_eval isn't particularly time critical so
trying to avoid being more clever than probably needed here. Can set this value
through the API to a higher value anytime.

Fixes #21403
2016-12-19 13:05:49 +01:00
Isabel Drost-Fromm
bdc32be8b7 Support specifying multiple templates (#22139)
Problem: We introduced the ability to shorten the rank eval request by using a
template in #20231. When playing with the API it turned out that there might be
use cases where - e.g. due to various heuristics - folks might want to translate
the original user query into more than just one type of Elasticsearch query.

Solution: Give each template an id that can later be referenced in the
actual requests.

Closes #21257
2016-12-19 12:49:15 +01:00
Isabel Drost-Fromm
b1e0d698ac Merge branch 'master' into feature/rank-eval 2016-12-19 10:16:16 +01:00
Masaru Hasegawa
a0185c83a7 Merge pull request #21393 from masaruh/alias_boost
Resolve index names in indices_boost
2016-12-16 15:07:51 +09:00
Isabel Drost-Fromm
5618d6ca49 Merge branch 'master' into feature/rank-eval 2016-12-15 10:29:26 +01:00
Areek Zillur
cdd5fbe3a1 Deprecate _suggest endpoint in favour of _search (#20305)
* Replace _suggest endpoint to _search in docs

In 5.0, the _suggest endpoint is just sugar for _search
with suggestions specified. Users should move away from
using the _suggest endpoint, as it is marked as deprecated in 5.x and
will be removed in 6.0

* update docs to use _search endpoint instead of _suggest

* Add deprecation logging to RestSuggestAction

* Use search endpoint instead of suggest endpoint in rest tests
2016-12-14 21:49:53 -05:00