Commit Graph

912 Commits

Author SHA1 Message Date
Jim Ferenczi 7dc00ef1f5
Search option terminate_after does not handle post_filters and aggregations correctly (#28459)
* Search option terminate_after does not handle post_filters and aggregations correctly

This change fixes the handling of the `terminate_after` option when post_filters (or min_score) are used.
`post_filter` should be applied before `terminate_after` in order to terminate the query when enough document are accepted
by the post_filters.
This commit also changes the type of exception thrown by `terminate_after` in order to ensure that multi collectors (aggregations)
do not try to continue the collection when enough documents have been collected.

Closes #28411
2018-02-12 13:36:33 +01:00
markharwood 77d2dd203e
Search - add allow_partial_search_results flag with default setting false (#28440)
Adds allow_partial_search_results flag to search requests with default setting = true.
When false, will error if search either timeouts, has partial errors or has missing shards rather
than returning partial search results. A cluster-level setting provides a default for search requests with no flag.

Closes #27435
2018-01-31 15:51:29 +00:00
Vlad Holubiev eea9ee57dd [Docs] Fix typo in inner-hits.asciidoc (#27998) 2018-01-31 11:55:53 +01:00
Christoph Büscher 6731c76900
Add ranking evaluation API to High Level Rest Client (#28357)
This change adds support for the new ranking evaluation API to the High Level Rest Client.
This mostly means adding support for parsing the various response objects back from the
REST representation. It includes one change to the response syntax where previously we didn't
print the type of the metric details section but we now need it to pick the right parser to
parse this section back.

Closes #28198
2018-01-30 17:48:09 +01:00
Robin Stocker 64bbb3a235 [Docs] Clarify `html` encoder in highlighting.asciidoc (#27766)
The previous description was a bit confusing because the pre/post tags used for highlighting are not escaped, the rest of the content is.
2018-01-24 16:45:40 +01:00
Andrew Kramarev ef468327e9 mistyping in one of the highlighting examples comment -> content (#28139) 2018-01-18 17:32:42 -05:00
Jim Ferenczi defb53a0bc
add a note regarding rescore and sort (#28251) 2018-01-18 09:23:19 +01:00
Christoph Büscher 39ff7b5a3f [Docs] Correct response json in rank-eval.asciidoc 2018-01-11 15:52:11 +01:00
Andrew Banchich e92acefba0 [Docs] Improvements in script-fields.asciidoc (#28174) 2018-01-11 10:59:27 +01:00
Vlad Holubiev 31d4a4bf7c [DOCS] Fix link formatting (#27990) 2017-12-26 16:25:05 +00:00
Mayya Sharipova cbd271e497
Limit the analyzed text for highlighting (#27934)
* Limit the analyzed text for highlighting

- Introduce index level settings to control the max number of character
to be analyzed for highlighting
- Throw an error if analysis is required on a larger text

Closes #27517
2017-12-21 10:19:58 -05:00
Christoph Büscher f3293879b5 [Docs] Improve rendering of ranking evaluation docs 2017-12-15 10:45:44 +01:00
Adrien Grand 1b660821a2
Allow `_doc` as a type. (#27816)
Allowing `_doc` as a type will enable users to make the transition to 7.0
smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`.
This also moves most of the documentation to `_doc` as a type name.

Closes #27750
Closes #27751
2017-12-14 17:47:53 +01:00
Christoph Büscher 3d3a1d2a0d Adding short description for experimental status in docs 2017-12-08 15:12:15 +01:00
Christoph Büscher 52cb6c8ef2 Merge branch 'master' into rankeval 2017-12-07 14:22:46 +01:00
Deb Adair 2f9a882061 [DOCS] Fixed typos and broken attribute. 2017-12-05 11:46:40 -08:00
Christoph Büscher bbec33d35c Merge branch 'master' into rankeval 2017-12-04 12:57:19 +01:00
olcbean d25c9671de Deprecate `jarowinkler` in favor of `jaro_winkler` (#27526)
Jaro and Winkler are two people, so we should use the same naming convention as for Damerau–Levenshtein.
2017-11-30 12:49:34 +00:00
Martijn van Groningen dbf17152d1
docs: use `doc_value_fields` fields as alternative for nested inner hits _source fetching
instead of stored fields as doc values are more likely to be enabled by default
2017-11-29 17:31:39 +01:00
Christoph Büscher 35688f6441 Merge branch 'master' into rankeval 2017-11-29 15:24:06 +01:00
Christoph Büscher 0d11b9fe34
[Docs] Unify spelling of Elasticsearch (#27567)
Removes occurences of "elasticsearch" or "ElasticSearch" in favour of
"Elasticsearch" where appropriate.
2017-11-29 09:44:25 +01:00
Martijn van Groningen cb1204774b
Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response.
Also include _type and _id for parent/child hits inside inner hits.

In the case of top_hits aggregation the nested search hits are
directly returned and are not grouped by a root or parent document, so
it is important to include the _id and _index attributes in order to know
to what documents these nested search hits belong to.

Closes #27053
2017-11-28 14:05:29 +01:00
Christoph Büscher 5661b1c3df Merge branch 'master' into rankeval 2017-11-24 16:25:05 +01:00
olcbean fd564b10db Deprecate `levenstein` in favor of `levenshtein` (#27409)
Support both spellings thoughout 6.x, reporting the incorrect one as deprecated.
2017-11-23 12:53:47 +00:00
Christoph Büscher 5735477283 Fix some documentation typos 2017-11-23 12:31:25 +01:00
Simon Willnauer fadbe0de08
Automatically prepare indices for splitting (#27451)
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.

This change automatically sets the number of routinng shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.

NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificually changing the number of buckets in the consistent hashign space
document might be hashed into different shards compared to previous versions.

This is a 7.0 only change.
2017-11-23 09:48:54 +01:00
Christoph Büscher d979ccace9 Merge branch 'master' into rankeval 2017-11-21 14:11:02 +01:00
Christoph Büscher 3348d2317f Reworking javadocs, minor changes in some implementation classes 2017-11-21 14:09:04 +01:00
Christoph Büscher 5c65a59369 Extending rank_eval asciidocs 2017-11-21 14:08:42 +01:00
Christoph Büscher d9e67a2c95 Extending `_rank_eval` documentation 2017-11-21 14:08:28 +01:00
Zachary Tong 6e9e07d6f8
Fix profiling naming issues (#27133)
Some code-paths use anonymous classes (such as NonCollectingAggregator
in terms agg), which messes up the display name of the profiler.  If
we encounter an anonymous class, we need to grab the super's name.

Another naming issue was that ProfileAggs were not delegating to the
wrapped agg's name for toString(), leading to ugly display.

This PR also fixes up the profile documentation.  Some of the examples were
executing against empty indices, which shows different profile results
than a populated index (and made for confusing examples).

Finally, I switched the agg display names from the fully qualified name
to the simple name, so that it's similar to how the query profiles work.

Closes #26405
2017-11-06 16:37:33 -05:00
Shai Erera bd0261916c Fix Laplace scorer to multiply by alpha (and not add) (#27125) 2017-10-31 13:08:44 +01:00
Martijn van Groningen 87c9b79b10
Return the _source of inner hit nested as is without wrapping it into its full path context
Due to a change happened via #26102 to make the nested source consistent
with or without source filtering, the _source of a nested inner hit was
always wrapped in the parent path. This turned out to be not ideal for
users relying on the nested source, as it would require additional parsing
on the client side. This change fixes this, the _source of nested inner hits
is now no longer wrapped by parent json objects, irregardless of whether
the _source is included as is or source filtering is used.

Internally source filtering and highlighting relies on the fact that the
_source of nested inner hits are accessible by its full field path, so
in order to now break this, the conversion of the _source into its binary
form is performed in FetchSourceSubPhase, after any potential source filtering
is performed to make sure the structure of _source of the nested inner hit
is consistent irregardless if source filtering is performed.

PR for #26944

Closes #26944
2017-10-19 12:04:56 +02:00
Nhat bf4c3642b2 remove _primary and _replica shard preferences (#26791)
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.

Closes #26335
2017-10-08 11:03:06 -04:00
Christoph Büscher bea8451b2f Merge branch 'master' into feature/rank-eval 2017-09-15 11:44:51 +02:00
Jim Ferenczi 401f4ba2ce Fix percolator highlight sub fetch phase to not highlight query twice (#26622)
* Fix percolator highlight sub fetch phase to not highlight query twice

The PercolatorHighlightSubFetchPhase does not override hitExecute and since it extends HighlightPhase the search hits
are highlighted twice (by the highlight phase and then by the percolator). This does not alter the results, the second highlighting
just overrides the first one but this slow down the request because it duplicates the work.
2017-09-14 09:31:14 +02:00
Tanguy Leroux 7404221b55 [Docs] Clarify size parameter in Completion Suggester doc (#26617) 2017-09-13 17:28:31 +02:00
Jim Ferenczi d68d8c9cef Expose duplicate removal in the completion suggester (#26496)
This change exposes the duplicate removal option added in Lucene for the completion suggester
with a new option called `skip_duplicates` (defaults to false).
This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input.

Closes #23364
2017-09-07 17:11:01 +02:00
Matt Weber 140395c83f Multi-level Nested Sort with Filters (#26395)
Multi-level Nested Sort with Filters

Allow multiple levels of nested sorting where each level can have it's own filter.
Backward compatible with previous single-level nested sort.
2017-08-30 18:52:56 +02:00
Martijn van Groningen c821dce3fe
Revert "Multi-level Nested Sort with Filters"
This reverts commit 6377afa6c3.
2017-08-30 14:53:25 +02:00
Martijn van Groningen 6377afa6c3
Multi-level Nested Sort with Filters
Allow multple levels of nested sorting where each level
can have it's own filter.  Backward compatible with
previous single-level nested sort.
2017-08-30 14:30:20 +02:00
Tanguy Leroux db54c4dc7c [Docs] Convert more doc snippets (#26404)
This commit converts some remaining doc snippets so that they are now
testable.
2017-08-30 09:30:36 +02:00
Jim Ferenczi 86d97971a4 Remove the _all metadata field (#26356)
* Remove the _all metadata field

This change removes the `_all` metadata field. This field is deprecated in 6
and cannot be activated for indices created in 6 so it can be safely removed in
the next major version (e.g. 7).
2017-08-28 17:43:59 +02:00
Christoph Büscher 62a7cac3a0 Merge branch 'master' into feature/rank-eval 2017-08-23 11:19:16 +02:00
Alexander Reelsen 483086220f Docs: Add search response took time explanation (#26202) 2017-08-15 08:43:26 +02:00
Martijn van Groningen 076167fbe5
inner hits: Unfiltered nested source should keep its full path
like filtered nested source.

Closes #23090
2017-08-10 15:58:29 +02:00
Christoph Büscher 18155ed69a Merge branch 'master' into feature/rank-eval 2017-08-07 16:07:34 +02:00
Clinton Gormley ff4a2519f2 Update experimental labels in the docs (#25727)
Relates https://github.com/elastic/elasticsearch/issues/19798

Removed experimental label from:
* Painless
* Diversified Sampler Agg
* Sampler Agg
* Significant Terms Agg
* Terms Agg document count error and execution_hint
* Cardinality Agg precision_threshold
* Pipeline Aggregations
* index.shard.check_on_startup
* index.store.type (added warning)
* Preloading data into the file system cache
* foreach ingest processor
* Field caps API
* Profile API

Added experimental label to:
* Moving Average Agg Prediction


Changed experimental to beta for:
* Adjacency matrix agg
* Normalizers
* Tasks API
* Index sorting

Labelled experimental in Lucene:
* ICU plugin custom rules file
* Flatten graph token filter
* Synonym graph token filter
* Word delimiter graph token filter
* Simple pattern tokenizer
* Simple pattern split tokenizer

Replaced experimental label with warning that details may change in the future:
* Analysis explain output format
* Segments verbose output format
* Percentile Agg compression and HDR Histogram
* Percentile Rank Agg HDR Histogram
2017-07-18 14:06:22 +02:00
Christoph Büscher 6d999f074a Merge branch 'master' into feature/rank-eval 2017-07-14 18:36:08 +02:00
Jim Ferenczi fe383b7c27 More clarifications on the unified highlighter being the new default (#25668)
* More clarifications on the unified highlighter being the new default
2017-07-13 15:38:58 +02:00
Deb Adair ded9f55263 [DOCS] Incorporated feedback on the highlighting changes. 2017-07-12 16:36:33 -07:00
Ryan Ernst 70b2897bdf Scripting: Deprecate stored search template apis (#25437)
This commit deprecates the PUT, GET and DELETE search template apis.
Instead, the stored script api should be used.

closes #24596
2017-07-12 16:07:28 -07:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
Deb Adair b5e81132cf [DOCS] Reorganized the highlighting topic so it's less confusing. 2017-07-11 21:16:14 -07:00
Simon Willnauer 98c91a3bd0 Limit the number of concurrent shard requests per search request (#25632)
This is a protection mechanism to prevent a single search request from
hitting a large number of shards in the cluster concurrently. If a search is
executed against all indices in the cluster this can easily overload the cluster
causing rejections etc. which is not necessarily desirable. Instead this PR adds
a per request limit of `max_concurrent_shard_requests` that throttles the number of
concurrent initial phase requests to `256` by default. This limit can be increased per request
and protects single search requests from overloading the cluster. Subsequent PRs can introduces
addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of
the number of nodes or sort shards iters such that we gain the best concurrency across nodes.
2017-07-11 16:23:10 +02:00
Clinton Gormley bd7ddfa175 Removed field-stats docs 2017-07-11 15:15:25 +02:00
Martijn van Groningen d0f9f425bd
parent/child: Removed ParentJoinFieldSubFetchPhase 2017-07-06 13:15:02 +02:00
Clinton Gormley 0170e0e8d3 Remove usage of multi-types from the docs and added a page explaining type removal (#25543)
Closes #25401
2017-07-05 12:30:19 +02:00
Christoph Büscher 2708bcc6ed Merge branch 'master' into feature/rank-eval 2017-06-29 15:07:45 +02:00
Jim Ferenczi 664193185e [Docs] Fix cross reference for parent-join field 2017-06-16 11:53:16 +02:00
Jim Ferenczi ccb3c9aae7 Add documentation for the new parent-join field (#25227)
* Add documentation for the new parent-join field

This commit adds the docs for the new parent-join field.
It explains how to define, index and query this new field.

Relates #20257
2017-06-16 11:13:23 +02:00
Adrien Grand 0c117145f6 Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222)
This snapshot has faster range queries on range fields (LUCENE-7828), more
accurate norms (LUCENE-7730) and the ability to use fake term frequencies
(LUCENE-7854).
2017-06-15 09:52:07 +02:00
Christoph Büscher ac3db8c30f Merge branch 'master' into feature/rank-eval 2017-06-14 11:57:05 +02:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Jim Ferenczi 5e8b569255 fix highlighting docs 2017-06-09 14:42:08 +02:00
Jim Ferenczi 8250aa4267 Remove the postings highlighter and make unified the default highlighter choice (#25028)
This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves
exactly like the `unified` highlighter when index_options is set to `offsets`:
https://issues.apache.org/jira/browse/LUCENE-7815

It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided).
The strategy used internally by this highlighter remain the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text.
Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such.
There are few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available.
I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features have been added to the `unified`.
2017-06-09 14:09:57 +02:00
Andrey Groshev e4fd8485ce Made the same length of opening and closing lines (#23583) 2017-06-09 00:50:43 -07:00
Jim Ferenczi 36a5cf8f35 Automatically early terminate search query based on index sorting (#24864)
This commit refactors the query phase in order to be able
to automatically detect queries that can be early terminated.
If the index sort matches the query sort, the top docs collection is early terminated
on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector.
This change also adds a new parameter to the search request called `track_total_hits`.
It indicates if the total number of hits that match the query should be tracked.
If false, queries sorted by the index sort will not try to compute this information and 
and will limit the collection to the first N documents per segment.
Aggregations are not impacted and will continue to see every document
even when the index sort matches the query sort and `track_total_hits` is false.

Relates #6720
2017-06-08 12:10:46 +02:00
Yibin Lin fbf2e3d574 Tiny correction in inner-hits.asciidoc (#25066) 2017-06-06 13:26:37 +02:00
Christoph Büscher 3d6fb4eb0b Merge branch 'master' into feature/rank-eval 2017-05-30 14:24:26 +02:00
Clinton Gormley 0656d0236b Update context-suggest.asciidoc
Removed incorrect parameter
2017-05-26 17:41:40 +02:00
Matt Weber 601a61a91c Support Multiple Collapse Inner Hits
Support multiple named inner hits on a field collapsing
request.
2017-05-26 13:23:57 +02:00
António Ribeiro 85a1b2b406 Fix link to perl docs (#24842)
* Fixes Elasticsearch issue #24606.

* Fixes Elasticsearch issue #24606.

* Fixes Elasticsearch issue #24606.

* Fixes Elasticsearch issue #24606.

* Issue #24606 - Changed the link text to Search::Elasticsearch::Client::5_0::Bulk and
Search::Elasticsearch::Client::5_0::Scroll.
2017-05-24 11:43:54 +02:00
Nik Everett 13a86fec99 Add magic $_path stash key to docs tests (#24724)
Adds a "magic" key to the yaml testing stash mostly for use with
documentation tests. When unstashing an object, `$_path` is the
path into the current position in the object you are unstashing.
This means that in docs tests you can use
`// TESTRESPONSEs/somevalue/$body.${_path}/` to mean "replace
`somevalue` with whatever is the response in the same position."

Compare how you must carefully mock out all the numbers in the profile
response without this change:
```
// TESTRESPONSE[s/"id": "\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[1\]"/"id": $body.profile.shards.0.id/]
// TESTRESPONSE[s/"rewrite_time": 51443/"rewrite_time": $body.profile.shards.0.searches.0.rewrite_time/]
// TESTRESPONSE[s/"score": 51306/"score": $body.profile.shards.0.searches.0.query.0.breakdown.score/]
// TESTRESPONSE[s/"time_in_nanos": "1873811"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.time_in_nanos/]
// TESTRESPONSE[s/"build_scorer": 2935582/"build_scorer": $body.profile.shards.0.searches.0.query.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 919297/"create_weight": $body.profile.shards.0.searches.0.query.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 53876/"next_doc": $body.profile.shards.0.searches.0.query.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "391943"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.0.time_in_nanos/]
// TESTRESPONSE[s/"score": 28776/"score": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 784451/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 1669564/"create_weight": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 10111/"next_doc": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "210682"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.1.time_in_nanos/]
// TESTRESPONSE[s/"score": 4552/"score": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 42602/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 89323/"create_weight": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 2852/"next_doc": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "304311"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.time_in_nanos/]
// TESTRESPONSE[s/"time_in_nanos": "32273"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.children.0.time_in_nanos/]
```

To how you can cavalierly mock all the numbers at once with this change:
```
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
```
2017-05-23 15:33:48 -04:00
Jack Conradson 0aa380b770 Fix search template documentation reference to scripting security. 2017-05-18 14:27:58 -07:00
Christoph Büscher cd0941810f Merge branch 'master' into feature/rank-eval 2017-05-18 16:47:47 +02:00
Ryan Ernst 463fe2f4d4 Scripting: Remove file scripts (#24627)
This commit removes file scripts, which were deprecated in 5.5.

closes #21798
2017-05-17 14:42:25 -07:00
Martijn van Groningen 840da4aebf
Removed deprecated template query.
Relates to #19390
2017-05-11 14:56:45 +02:00
Adrien Grand a72eaa8e0f Identify documents by their `_id`. (#24460)
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.

One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
2017-05-09 16:33:52 +02:00
Anupam 0b36fb052c Update completion-suggest.asciidoc (#24506) 2017-05-05 11:34:41 -04:00
Simon Willnauer 6b67e0bf2f Include all aliases including non-filtering in `_search_shards` response (#24489)
`_search_shards`API today only returns aliases names if there is an alias
filter associated with one of them. Now it can be useful to see which aliases
have been expanded for an index given the index expressions. This change also includes non-filtering aliases even without a filtering alias being present.
2017-05-05 09:34:12 +02:00
Nik Everett 9f431543fc CONSOLEify inner hits docs
Rewrites most of the snippets in the `innert_hits` docs to be
complete examples and enables `VIEW IN CONSOLE`, `COPY AS CURL`,
and automatic testing of the snippets.
2017-05-04 17:30:54 -04:00
Adrien Grand 977016ba25 Do not index `_type` when there is at most one type. (#24363)
This change makes `_type` behave pretty much like `_index` when
`index.mapping.single_type` is true.
2017-05-04 16:29:35 +02:00
Clinton Gormley 582b3c06b6 Added docs for batched_reduce_size
Relates to #23288
2017-05-02 14:25:03 +02:00
Jim Ferenczi 9d8254fadf Fix FieldCaps documentation
Fix the expected output for field_caps call.
Fixes #24413
2017-05-02 10:14:47 +02:00
Martijn van Groningen b77254871b
docs: document alternative for nested inner hits source
Closes #24110
2017-04-28 11:09:24 +02:00
Guillaume Le Floch 739cb35d1b Allow passing single scrollID in clear scroll API body (#24242)
* Allow single scrollId in string format

Closes #24233
2017-04-25 13:43:21 +02:00
Christoph Büscher 5254731039 Merge branch 'master' into feature/rank-eval 2017-04-22 21:47:32 +02:00
Suhas Karanth f97d8bc78d Update reference docs for Highlighter fragmenter (#23754)
Explain the fragmenter and add examples.
2017-04-17 14:00:24 -04:00
Simon Willnauer 040b86a76b Set shard count limit to unlimited (#24012)
Now that we have incremental reduce functions for topN and aggregations
we can set the default for `action.search.shard_count.limit` to unlimited.
This still allows users to restrict these settings while by default we executed
across all shards matching the search requests index pattern.
2017-04-10 17:09:21 +02:00
Jim Ferenczi 9b3c85dd88 Deprecate _field_stats endpoint (#23914)
_field_stats has evolved quite a lot to become a multi purpose API capable of retrieving the field capabilities and the min/max value for a field.
In the mean time a more focused API called `_field_caps` has been added, this enpoint is a good replacement for _field_stats since he can
retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures).
Also the recent improvement made to range queries makes the _field_stats API obsolete since this queries are now rewritten per shard based on the min/max found for the field.
This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently.
For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet which is why
 this PR is made directly to 6.0.
 The rest tests have also been adapted to not throw an error while this change is backported to 5.4.
2017-04-10 10:10:16 +02:00
Nik Everett 7fad7c675d Rewrite the scripting security docs (#23930)
They needed to be updated now that Painless is the default and
the non-sandboxed scripting languages are going away or gone.

I dropped the entire section about customizing the classloader
whitelists. In master this barely does anything (exposes more
things to expressions).
2017-04-07 11:46:41 -04:00
Nik Everett 048191ceb6 CONSOLEify highlighting a function_score docs
Converts many of the partial examples into full search requests.

Relates #18160
2017-04-06 08:13:56 -04:00
Christoph Büscher 024ed1b6ca Merge branch 'master' into feature/rank-eval 2017-04-04 18:23:41 +02:00
Jim Ferenczi a8250b26e7 Add FieldCapabilities (_field_caps) API (#23007)
This change introduces a new API called `_field_caps` that allows to retrieve the capabilities of specific fields.

Example:

````
GET t,s,v,w/_field_caps?fields=field1,field2
````
... returns:
````
{
   "fields": {
      "field1": {
         "string": {
            "searchable": true,
            "aggregatable": true
         }
      },
      "field2": {
         "keyword": {
            "searchable": false,
            "aggregatable": true,
            "non_searchable_indices": ["t"]
            "indices": ["t", "s"]
         },
         "long": {
            "searchable": true,
            "aggregatable": false,
            "non_aggregatable_indices": ["v"]
            "indices": ["v", "w"]
         }
      }
   }
}
````

In this example `field1` have the same type `text` across the requested indices `t`, `s`, `v`, `w`.
Conversely `field2` is defined with two conflicting types `keyword` and `long`.
Note that `_field_caps` does not treat this case as an error but rather return the list of unique types seen for this field.
2017-03-31 15:34:46 +02:00
Glen Smith c62d4b7b0f Clarify preference docs
This commit clarifies the preference docs regarding the explanation of
how operations are routed by default. In particular, the previous use of
"shard replicas" was confusing as it could imply an operation would only
be routed to replicas by default.

Relates #23794
2017-03-29 12:55:47 -04:00
Christoph Büscher 96fc3aaf6f Merge branch 'master' into feature/rank-eval 2017-03-23 19:55:47 +01:00
Igor Motov f927a2708d Make it possible to validate a query on all shards instead of a single random shard (#23697)
This is especially useful when we rewrite the query because the result of the rewrite can be very different on different shards. See #18254 for example.
2017-03-22 17:39:21 -04:00
Jim Ferenczi b8c352fc3f Add support for fragment_length in the unified highlighter (#23431)
* Add support for fragment_length in the unified highlighter

This commit introduce a new break iterator (a BoundedBreakIterator) designed for the unified highlighter
 that is able to limit the size of fragments produced by generic break iterator like `sentence`.
The `unified` highlighter now supports `boundary_scanner` which can `words` or `sentence`.
The `sentence` mode will use the bounded break iterator in order to limit the size of the sentence to `fragment_length`.
When sentences bigger than `fragment_length` are produced, this mode will break the sentence at the next word boundary **after**
 `fragment_length` is reached.
2017-03-17 18:10:13 +01:00
Jack Conradson 8e04561c0d Change params._source to params['_source'] in example. 2017-03-15 17:29:31 -07:00
Jack Conradson 4c11ebc8b9 Fix example in documentation for Painless using _source. (#21322) 2017-03-15 17:18:34 -07:00
Christoph Büscher cf35545e2d Merge branch 'master' into feature/rank-eval 2017-03-13 17:36:13 -07:00
NFM f8fa5c96aa Fix indentation in sort docs
This commit fixes the indentation in an example query in the sort docs.

Relates #23561
2017-03-12 17:08:06 -07:00
Christoph Büscher 1f4c4d99b9 Merge branch 'master' into feature/rank-eval 2017-02-27 11:25:17 +01:00
Shai Erera eeac6d27f2 Add BreakIteratorBoundaryScanner support for FVH (#23248)
This commit adds a boundary_scanner property to the search highlight
request so the user can specify different boundary scanners:

* `chars` (default,  current behavior)
* `word` Use a WordBreakIterator
* `sentence` Use a SentenceBreakIterator

This commit also adds "boundary_scanner_locale" to define which locale
should be used when scanning the text.
2017-02-23 23:32:22 +01:00
Christoph Büscher cfa52f8b9a Merge branch 'master' into feature/rank-eval 2017-02-16 10:39:07 +01:00
Adrien Grand 8d6a41f671 Nested queries should avoid adding unnecessary filters when possible. (#23079)
When nested objects are present in the mappings, many queries get deoptimized
due to the need to exclude documents that are not in the right space. For
instance, a filter is applied to all queries that prevents them from matching
non-root documents (`+*:* -_type:__*`). Moreover, a filter is applied to all
child queries of `nested` queries in order to make sure that the child query
only matches child documents (`_type:__nested_path`), which is required by
`ToParentBlockJoinQuery` (the Lucene query behing Elasticsearch's `nested`
queries).

These additional filters slow down `nested` queries. In 1.7-, the cost was
somehow amortized by the fact that we cached filters very aggressively. However,
this has proven to be a significant source of slow downs since 2.0 for users
of `nested` mappings and queries, see #20797.

This change makes the filtering a bit smarter. For instance if the query is a
`match_all` query, then we need to exclude nested docs. However, if the query
is `foo: bar` then it may only match root documents since `foo` is a top-level
field, so no additional filtering is required.

Another improvement is to use a `FILTER` clause on all types rather than a
`MUST_NOT` clause on all nested paths when possible since `FILTER` clauses
are more efficient.

Here are some examples of queries and how they get rewritten:

```
"match_all": {}
```

This query gets rewritten to `ConstantScore(+*:* -_type:__*)` on master and
`ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})`
with this change. The automaton is the complement of `_type:__*` so it matches
the same documents, but is faster since it is now a positive clause. Simplistic
performance testing on a 10M index where each root document has 5 nested
documents on average gave a latency of 420ms on master and 90ms with this change
applied.

```
"term": {
  "foo": {
    "value": "0"
  }
}
```

This query is rewritten to `+foo:0 #(ConstantScore(+*:* -_type:__*))^0.0` on
master and `foo:0` with this change: we do not need to filter nested docs out
since the query cannot match nested docs. While doing performance testing in
the same conditions as above, response times went from 250ms to 50ms.

```
"nested": {
  "path": "nested",
  "query": {
    "term": {
      "nested.foo": {
        "value": "0"
      }
    }
  }
}
```

This query is rewritten to
`+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+*:* -_type:__*))^0.0`
on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The
top-level filter (`-_type:__*`) could be removed since `nested` queries only
match documents of the parent space, as well as the child filter
(`#_type:__nested`) since the child query may only match nested docs since the
`nested` object has both `include_in_parent` and `include_in_root` set to
`false`. While doing performance testing in the same conditions as above,
response times went from 850ms to 270ms.
2017-02-14 16:05:19 +01:00
Tanguy Leroux e2e5937455 Use `typed_keys` parameter to prefix suggester names by type in search responses (#23080)
This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.
2017-02-10 10:53:38 +01:00
Tanguy Leroux 63ea6f7168 [Docs] Remove unnecessary // TEST[continued] in search-template doc
It has been explained in e39b96f257
2017-02-10 10:08:24 +01:00
Jim Ferenczi 94087b3274 Removes ExpandCollapseSearchResponseListener, search response listeners and blocking calls
This changes removes the SearchResponseListener that was used by the ExpandCollapseSearchResponseListener to expand collapsed hits.
The removal of SearchResponseListener is not a breaking change because it was never released.
This change also replace the blocking call in ExpandCollapseSearchResponseListener by a single asynchronous multi search request. The parallelism of the expand request can be set via CollapseBuilder#max_concurrent_group_searches

Closes #23048
2017-02-09 18:06:10 +01:00
Tanguy Leroux 832952cb29 [Docs] Fix consoleify search-template.asciidoc
It does not reproduce well, hopefully this will fix the failure on DELETE _search/template/<templatename>.
2017-02-08 21:23:38 +01:00
Jay Modi 7f3769c745 Remove ldjson support and document ndjson for bulk/msearch (#23049)
This commit removes support for the `application/x-ldjson` Content-Type header as this was only used in the first draft
of the spec and had very little uptake. Additionally, the docs for bulk and msearch have been updated to specifically
call out ndjson and mention that the newline character may be preceded by a carriage return.

Finally, the bulk request handling of the carriage return has been improved to remove this character from the source.

Closes #23025
2017-02-08 11:55:50 -05:00
Tanguy Leroux 477d1aa8bf [Docs] Consoleify multi-search and search-template docs (#23047)
Relates #23001
2017-02-08 17:05:22 +01:00
Nik Everett 0e98c9107a Docs: CONSOLEify some more docs
These need to be CONSOLEified *now* because we're starting to
require Content-Type headers and they didn't have any.

* cluster/reroute: Marked as CONSOLE but skipped because the docs
build runs with a single node.
* docs/bulk: Marked as NOTCONSOLE because the snippets describe
either examples or `curl` commands. Fixed the `curl` command to
include the `Content-Type` header.
* query-dsl/terms-query: Marked as CONSOLE.
* search/request/rescore: Marked as CONSOLE. Fixed deprecated
syntax.

Relates #23001
Relates #18160
2017-02-07 16:49:01 -05:00
Jim Ferenczi bbf62e3472 CONSOLify indices/analyze.asciidoc and search/field-stats.asciidoc
Relates #23001
2017-02-07 20:15:09 +01:00
Clinton Gormley e181a020a9 Replaced absolute URLs in docs with attributes 2017-02-04 12:05:03 +01:00
Christoph Büscher 4cb8d9d08c Merge branch 'master' into feature/rank-eval
Conflicts:
	core/src/main/java/org/elasticsearch/script/Script.java
        docs/reference/search.asciidoc
2017-02-03 17:27:20 +01:00
Clinton Gormley 8ace37e214 Fix asciidoc in stored fields 2017-02-03 10:18:01 +01:00
Nicholas Knize b41d5747f0 Reduce GeoDistance insanity
GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC, PLANE. This commit removes SLOPPY_ARC, and FACTOR and cleans up the way Geo distance is computed.
2017-02-02 12:39:42 -06:00
Jim Ferenczi f6d38d480a Integrate UnifiedHighlighter (#21621)
* Integrate UnifiedHighlighter

This change integrates the Lucene highlighter called "unified" in the list of supported highlighters for ES.
This highlighter can extract offsets from either postings, term vectors, or via re-analyzing text.
The best strategy is picked automatically at query time and depends on the field and the query to highlight.
2017-01-31 19:06:03 +01:00
Jim Ferenczi e7e871acdd Fix link to keyword and numeric type 2017-01-30 13:57:28 +01:00
Clinton Gormley 938f5194ef Include field-collapsing docs in request-body search 2017-01-30 11:47:12 +01:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Christoph Büscher 9ed867ea83 [DOCS] Fix inconsistent formatting for fieldnames in profile.asciidoc 2017-01-18 10:41:22 +01:00
Clinton Gormley 401438819e Docs: Fix the first highlighting example to work
Closes #22642
2017-01-17 12:20:03 +01:00
Christoph Büscher 2791c69960 Update profile.asciidoc
Making the "Human readable output" section a note instead of an own section.
2017-01-16 16:19:07 +01:00
Christoph Büscher 49a49da3f5 [Docs] Fix section title in profile.asciidoc 2017-01-16 14:53:06 +01:00
Christoph Büscher 59a48ffc41 ProfileResult and CollectorResult should print machine readable timing information (#22561)
Currently both ProfileResult and CollectorResult print the time field in a human readable string format
 (e.g. "time": "55.20315000ms"). When trying to parse this back to a long value, for example to use in 
the planned high level java rest client, we can lose precision because of conversion and rounding issues. 
This change adds a new additional field (`time_in_nanos`) to the profile response to be able to get the 
original time value in nanoseconds back. 

The old `time` field is only printed when the `?`human=true` flag in the url is set. This follow the behaviour for 
all other stats-related apis. Also the format of the `time` field is slightly changed. Instead of always formatting 
the output as a 10-digit ms value, by using the `XContentBuilder#timeValueField()` method we now print 
the largest time unit present is used (e.g. "s", "ms", "micros").
2017-01-16 14:27:55 +01:00
maciejkula b4c8c21553 State default sort order on missing values
Closes #19099
2017-01-13 17:05:13 +01:00
Lee Hinman 2db01b6127 Merge remote-tracking branch 'dakrone/disable-all-by-default' 2017-01-12 10:17:51 -07:00
Tanguy Leroux df703dce0a [DOC] Document {{url}} mustache function (#22549)
This function introduced in #20838 wasn't documented at all.

Related to #22459
2017-01-12 14:57:03 +01:00
Lee Hinman 7a18bb50fc Disable _all by default
This change disables the _all meta field by default.

Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.

_all can no longer be configured for indices created after 6.0.

Relates to #20925 and #21341
Resolves #19784
2017-01-11 16:47:13 -07:00
Martijn van Groningen cb2333dacd percolator: remove deprecated percolate and mpercolate apis 2017-01-10 11:18:27 +01:00
Isabel Drost-Fromm 46c30e6bc3 Make maximum number of parallel search requests configurable. (#22192)
Problem: So far all rank eval requests are being executed in parallel. If there
are more than the search thread pool can handle, or if there are other search
requests executed in parallel rank eval can fail.

Solution: Make number of max_concurrent_searches configurable.

Name of configuration parameter is analogous to msearch. Default
max_concurrent_searches set to 10: Rank_eval isn't particularly time critical so
trying to avoid being more clever than probably needed here. Can set this value
through the API to a higher value anytime.

Fixes #21403
2016-12-19 13:05:49 +01:00
Isabel Drost-Fromm bdc32be8b7 Support specifying multiple templates (#22139)
Problem: We introduced the ability to shorten the rank eval request by using a
template in #20231. When playing with the API it turned out that there might be
use cases where - e.g. due to various heuristics - folks might want to translate
the original user query into more than just one type of Elasticsearch query.

Solution: Give each template an id that can later be referenced in the
actual requests.

Closes #21257
2016-12-19 12:49:15 +01:00
Isabel Drost-Fromm b1e0d698ac Merge branch 'master' into feature/rank-eval 2016-12-19 10:16:16 +01:00
Masaru Hasegawa a0185c83a7 Merge pull request #21393 from masaruh/alias_boost
Resolve index names in indices_boost
2016-12-16 15:07:51 +09:00
Isabel Drost-Fromm 5618d6ca49 Merge branch 'master' into feature/rank-eval 2016-12-15 10:29:26 +01:00
Areek Zillur cdd5fbe3a1 Deprecate _suggest endpoint in favour of _search (#20305)
* Replace _suggest endpoint to _search in docs

In 5.0, the _suggest endpoint is just sugar for _search
with suggestions specified. Users should move away from
using the _suggest endpoint, as it is marked as deprecated in 5.x and
will be removed in 6.0

* update docs to use _search endpoint instead of _suggest

* Add deprecation logging to RestSuggestAction

* Use search endpoint instead of suggest endpoint in rest tests
2016-12-14 21:49:53 -05:00
Isabel Drost-Fromm eeff9fb100 Adjust docs to reflect removed template endpoint
The dedicated template endpoint for rank_eval queries was removed, reflect this in the docs as well.
2016-12-13 13:59:11 +01:00
Masaru Hasegawa 3df2a086d4 Resolve index names in indices_boost
This change allows specifying alias/wildcard expression in indices_boost.
And added another format for specifying indices_boost. It accepts array of index name and boost pair.
If an index is included in multiple aliases/wildcard expressions, the first match will be used.
With new format, old format is marked as deprecated.

Closes #4756
2016-12-11 21:41:49 +09:00
Isabel Drost-Fromm 5c6cdb90ad Merge branch 'master' into feature/rank-eval 2016-12-07 10:59:05 +01:00
Jim Ferenczi b42ca6bcc9 Include unindexed field in FieldStats response (#21821)
* Include unindexed field in FieldStats response

This change adds non-searchable fields to the FieldStats response. These fields do not have min/max informations but they can be aggregatable. Fields that are only stored in _source (store:no, index:no, doc_values:no) will still be missing since they do not have any useful information to show. Indices and clients must be at least on V_5_2_0 to see this change.
2016-12-06 13:32:57 +01:00
Isabel Drost-Fromm 2fecd8e394 Merge branch 'master' into feature/rank-eval 2016-11-30 11:17:56 +01:00
Jim Ferenczi d791ddf704 Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853)
Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license
Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true
Adapt the expectations of some tests to the new format of the Lucene explain output
2016-11-29 18:40:31 +01:00
Isabel Drost-Fromm 3e0aedd7da Merge branch 'master' into feature/rank-eval 2016-11-29 11:09:22 +01:00
Adrin Jalali eec05ec208 then -> than (#21829) 2016-11-28 17:04:56 +01:00
Isabel Drost-Fromm 2b24091361 Merge branch 'master' into feature/rank-eval 2016-11-28 11:45:29 +01:00
Adrin Jalali 953928b2c5 typo fix (it self -> itself) (#21781)
* typo fix.

* apply "stored field value"

* replaced "whereas" with "on the contrary"
2016-11-24 17:11:43 +01:00
Adrin Jalali 0871073f9b clarification on geo distance sorting (#21779)
* clarification on geo distance sorting

* applying the suggested change
2016-11-24 16:06:10 +01:00
Christoph Büscher e1fe0dc462 Merge branch 'master' into feature/rank-eval 2016-11-24 08:57:26 +01:00
Luca Cavanna db5a72774b Add indices and filter information to search shards api output (#21738)
Add indices and filter information to search shards api output

The search shards api returns info about which shards are going to be hit by executing a search with provided parameters: indices, routing, preference. Indices can also be aliases, which can also hold filters. The output includes an array of shards and a summary of all the nodes the shards are allocated on. This commit adds a new indices section to the search shards output that includes one entry per index, where each index can be associated with an optional filter in case the index was hit through a filtered alias.

This is relevant since we have moved parsing of alias filters to the coordinating node.

Relates to #20916
2016-11-22 23:00:25 +01:00
Luca Cavanna db8b2dceea Remove ignored type parameter in search_shards api (#21688)
The `type` parameter has always been accepted by the search_shards api, probably to make the api and its urls the same as search. Truth is that the type never had any effect, it's been ignored from day one while accepting it may make users think that we actually do something with it.

This commit removes support for the type parameter from the REST layer and the Java API. Backwards compatibility is maintained on the transport layer though.

The new added serialization test also uncovered a bug in the java API where the `ClusterSearchShardsRequest` could be created with no arguments, but the indices were required to be not null otherwise the request couldn't be serialized as `writeTo` would throw NPE. Fixed by setting a default value (empty array) for indices.
2016-11-22 17:22:33 +01:00
Christoph Büscher 1d3e58ab9f Merge branch 'master' into feature/rank-eval 2016-11-22 11:31:46 +01:00
Lee Hinman 11da09e9bc Allow overriding all-field leniency when `lenient` option is specified
As part of #20925 and #21341 we added an "all-fields" mode to the
`query_string` and `simple_query_string`. This would expand the query to
all fields and automatically set `lenient` to true.

However, we should still allow a user to override the `lenient` flag to
whichever value they desire, should they add it in the request. This
commit does that.
2016-11-21 21:32:25 -07:00
Luca Wintergerst 277f4b8d24 fix two errors in suggester docs
The first changed referred to an example of the 2.4 documentation. I removed the no longer relevant parts. We should consider adding a little more here. 

The second change was just then->than in the suggest_mode popular section
2016-11-18 12:05:49 +01:00
Christoph Büscher 1f5f2e312a Use the twitter index in documentation snippets 2016-11-17 17:03:46 +01:00
Isabel Drost-Fromm e5a08937e3 Fix failing doc test, add missing CONSOLE 2016-11-17 15:01:04 +01:00
Isabel Drost-Fromm 19bb0a928d Reference documentation for rank evaluation API (#21427)
* Reference documentation for rank evaluation API

This adds a first page of reference documentation to the current state of the
rank evaluation API.

Closes to #21402

* Add default values for precision metric.

Add information on default relevant_rating_threshold and ignore_unlabeled
settings.

Relates to #21304

* Move under search request docs, fix formatting

Also removes some detail where it seemed unneeded for reference docs
2016-11-17 10:27:57 +01:00
Nik Everett 593d47efe2 Make it clear _suggest doesn't support source filtering (#21268)
We plan to deprecate `_suggest` during 5.0 so it isn't worth fixing
it to support the `_source` parameter for `_source` filtering. But we
should fix the docs so they are accurate.

Since this removes the last non-`// CONSOLE` line in
`completion-suggest.asciidoc` this also removes it from the list of
files that have non-`// CONSOLE` docs.

Closes #20482
2016-11-06 20:15:45 -05:00
Adrien Grand 52de0645fb Remove `lowercase_expanded_terms` and `locale` from query-parser options. (#20208)
Lucene 6.2 introduces the new `Analyzer.normalize` API, which allows to apply
only character-level normalization such as lowercasing or accent folding, which
is exactly what is needed to process queries that operate on partial terms such
as `prefix`, `wildcard` or `fuzzy` queries. As a consequence, the
`lowercase_expanded_terms` option is not necessary anymore. Furthermore, the
`locale` option was only needed in order to know how to perform the lowercasing,
so this one can be removed as well.

Closes #9978
2016-11-02 14:25:08 +01:00
Craig Squire 1f1daf59bc Documentation updates for scroll API size parameter (#21229)
* Document size parameter for scroll API

* Fix size parameter behavior description for scroll
2016-11-01 15:55:09 -04:00
Igor Motov 17ad88d539 Makes search action cancelable by task management API
Long running searches now can be cancelled using standard task cancellation mechanism.
2016-10-25 12:27:34 -10:00
Jim Ferenczi d0bbe89c16 Optimize query with types filter in the URL (t/t/_search) (#20979)
This change adds a TypesQuery that checks if the disjunction of types should be rewritten to a MatchAllDocs query. The check is done only if the number of terms is below a threshold (16 by default and configurable via max_boolean_clause).
2016-10-20 12:33:32 +02:00
Joshua Rich cdb156e691 Merge pull request #20794 from joshuar/doc/fix_highlighter_ambiguities
[DOCS] Use a better name for fields in examples to avoid ambiguity
2016-10-18 14:23:27 +11:00
Adrien Grand 7a403f640b Clarify some docs about geo-distance sorting. (#20735)
This also improves formatting a bit.
2016-10-07 15:26:34 +02:00
Jason Tedor d01a62908a Change separator for shards preference
The shards preference on a search request enables specifying a list of
shards to hit, and then a secondary preference (e.g., "_primary") can be
added. Today, the separator between the shards list and the secondary
preference is ';'. Unfortunately, this is also a valid separtor for URL
query parameters. This means that a preference like "_shards:0;_primary"
will be parsed into two URL parameters: "_shards:0" and "_primary". With
the recent change to strict URL parsing, the second parameter will be
rejected, "_primary" is not a valid URL parameter on a search
request. This means that this feature has never worked (unless the ';'
is escaped, but no one does that because our docs do not that, and there
was no indication from Elasticsearch that this did not work). This
commit changes the separator to '|'.

Relates #20786
2016-10-07 07:17:01 -05:00
Joshua Rich e06a40ccbd [DOCS] Use a better name for fields in examples to avoid ambiguity
Previously, this doc was using a field called "content". This is
confusing, especially when the doc starts talking about the content of
the content field.  This change makes the field name "comment" which
is less ambiguous and also changes some related field names in the doc
to make a consistent example theme of editing docs around blog posts.
2016-10-07 14:46:55 +11:00
Nik Everett 41d6529d06 CONSOLEify scroll docs
This causes the snippets to be tested during the build and gives
helpful links to the reader to open the docs in console or copy them
as curl commands.

Relates to #18160
2016-10-05 11:21:54 -04:00
Simon Willnauer 74184cb1b0 Stabelize tests in phrase-suggest.asciidoc 2016-09-29 11:13:17 +02:00
Nik Everett 560fba1b28 Document that sliced scroll works for reindex
Surprise! You can use sliced scroll to easily parallelize reindex
and friend. They support it because they use the same infrastructure
as a regular search to parse the search request. While we would like
to make an "automatic" option for parallelizing reindex, this manual
option works right now and is pretty convenient!
2016-09-26 05:27:44 +02:00
David Pilato ed4d0881b1 Add profile and explain parameters to template API
We can now run templates using `explain` and/or `profile` parameters.
Which is interesting when you have defined a complicated profile but want to debug it in an easier way than running the full query again.

You can use `explain` parameter when running a template:

```js
GET /_search/template
{
  "file": "my_template",
  "params": {
    "status": [ "pending", "published" ]
  },
  "explain": true
}
```

You can use `profile` parameter when running a template:

```js
GET /_search/template
{
  "file": "my_template",
  "params": {
    "status": [ "pending", "published" ]
  },
  "profile": true
}
```
2016-09-19 17:52:13 +02:00
Nik Everett e4c80c94e9 Convert more search docs to CONSOLE
`profile.asciidoc` now runs all of its command but it doesn't validate
all of the results. Writing the validation is time consuming so I only
did some of it.
2016-09-15 11:58:21 -04:00
Nik Everett 2d568ece2d CONSOLEify some search docs
* search/search.asciidoc
* search/request-body.asciidoc
* search/explain.asciidoc
* search/search-shards.asciidoc
2016-09-14 13:06:37 -04:00
Tobias Günther 3a7a437594 Update rescoring docs in respect to sort (#20477)
* Update rescoring docs in respect to sort

If sort is present in a query the rescore query is not executed. As long as this feature is neither implemented (see discussion in #6788) nor  the combination of sort and rescoring raises an error, we should warn the user in the documentation about this.

* Missed a dot
2016-09-14 17:07:10 +01:00
Nicholas Knize 598bab93ae [DOC] Cleanup dangling references to deprecated geo parameters
With the cut over to LatLonPoint the geohash, geohash_precision, lat_lon, and geohash_prefix parameters have been removed. This commit fixes the doc build by removing the remaining dangling references to these removed parameters.
2016-09-13 16:38:38 -05:00
Jim Ferenczi 1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Isabel Drost-Fromm 0e707f241e Add docs to template support for _msearch (#17382)
Add docs to template support for _msearch

Relates to #10885
Relates to #15674

* Reference those docs from the rest api spec for _msearch/template support.
2016-09-13 13:19:25 +02:00
Clinton Gormley 9c3007496c Update search-template.asciidoc
Fixed typo
2016-09-09 13:06:49 +02:00
Nik Everett e03fb602cd Add CONSOLE places where it is obviously missing
These places already have other annotations like `// TEST` and
`// TESTSETUP` so they are already in console format.
2016-09-06 10:48:19 -04:00
Nik Everett 5cff2a046d Remove most of the need for `// NOTCONSOLE`
and be much more stingy about what we consider a console candidate.

* Add `// CONSOLE` to check-running
* Fix version in some snippets
* Mark groovy snippets as groovy
* Fix versions in plugins
* Fix language marker errors
* Fix language parsing in snippets

  This adds support for snippets who's language is written like
  `[source, txt]` and `["source","js",subs="attributes,callouts"]`.

  This also makes language required for snippets which is nice because
  then we can be sure we can grep for snippets in a particular language.
2016-09-06 10:32:54 -04:00
Areek Zillur c92f82e624 Merge pull request #20169 from areek/doc/fix_completion_breaking_changes
Update breaking changes for completion suggester
2016-09-02 12:39:16 -04:00
Areek Zillur af215b528f move completion performance tips from migration docs to completion docs 2016-09-02 12:37:56 -04:00
Nik Everett dcaed58f90 [doc] Remove leftover from CONSOLE conversion 2016-08-31 13:14:47 -04:00
Clinton Gormley 70f4d718f8 Update percolate.asciidoc
Tidy up percolator docs
2016-08-30 13:05:02 +02:00
Josh Becker 3c24ea43fd Docs: Remove extra word from phrase-suggester 2016-08-26 13:02:54 -04:00
Jim Ferenczi 9bedbbaa6a Fixed doc links 2016-08-24 22:37:59 +02:00
Jim Ferenczi 4682fc34ae Add the ability to disable the retrieval of the stored fields entirely
This change adds a special field named _none_ that allows to disable the retrieval of the stored fields in a search request or in a TopHitsAggregation.

To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this:

````
POST _search
{
   "stored_fields": "_none_"
}
````
2016-08-24 16:40:08 +02:00
Adrien Grand d894db1590 Only use `PUT` for index creation, not POST. #20001
Currently both `PUT` and `POST` can be used to create indices. This commit
removes support for `POST index_name` so that we can use it to index documents
with auto-generated ids once types are removed.

Relates #15613
2016-08-17 10:15:42 +02:00
Gytis Šk 8a97f05e41 Fix typos in inner-hits documentation (#19910) 2016-08-11 21:15:11 +02:00
Clinton Gormley 2e3bc656e6 Update inner-hits.asciidoc
Typo

Closes #19775
2016-08-11 12:36:31 +02:00
Areek Zillur d107141bf6 Remove payload option from completion suggester
The payload option was introduced with the new completion
suggester implementation in v5, as a stop gap solution
to return additional metadata with suggestions.

Now we can return associated documents with suggestions
(#19536) through fetch phase using stored field (_source).
The additional fetch phase ensures that we only fetch
the _source for the global top-N suggestions instead of
fetching _source of top results for each shard.
2016-08-08 16:04:06 -04:00
Areek Zillur 469eb2546d Merge pull request #19536 from areek/enhancement/completion_suggester_documents
Add support for returning documents with completion suggester
2016-08-05 18:55:08 -04:00
Areek Zillur fee013c07c Add support for returning documents with completion suggester
This commit enables completion suggester to return documents
associated with suggestions. Now the document source is returned
with every suggestion, which respects source filtering options.

In case of suggest queries spanning more than one shard, the
suggest is executed in two phases, where the last phase fetches
the relevant documents from shards, implying executing suggest
requests against a single shard is more performant due to the
document fetch overhead when the suggest spans multiple shards.
2016-08-05 17:51:45 -04:00
Nik Everett 1e587406d8 Fail yaml tests and docs snippets that get unexpected warnings
Adds `warnings` syntax to the yaml test that allows you to expect
a `Warning` header that looks like:
```
    - do:
        warnings:
            - '[index] is deprecated'
            - quotes are not required because yaml
            - but this argument is always a list, never a single string
            - no matter how many warnings you expect
        get:
            index:    test
            type:    test
            id:        1
```

These are accessible from the docs with:
```
// TEST[warning:some warning]
```

This should help to force you to update the docs if you deprecate
something. You *must* add the warnings marker to the docs or the build
will fail. While you are there you *should* update the docs to add
deprecation warnings visible in the rendered results.
2016-08-04 15:23:05 -04:00
debadair bcc5c7c07a Docs: Fixed callout error that broke the build. 2016-08-03 17:20:00 -07:00
Nik Everett 3be1e7ec35 CONSOLify the completion suggester docs (#19758)
* CONSOLEify search/suggesters/completion
* CONSOLEify context suggester docs
2016-08-03 18:40:17 -04:00
Isabel Drost-Fromm 672ffb6e4d Revert "Add console to docs for inner hits, explain, and friends" 2016-08-01 14:09:54 +02:00
Isabel Drost-Fromm 5104437e4f Merge pull request #18519 from MaineC/docs/add_console_to_search
Add console to docs for inner hits, explain, and friends
2016-08-01 12:10:09 +02:00
Clinton Gormley 3c639f0673 Removed array-of-string example from search template
Relates to #19643
2016-07-28 13:49:53 +02:00
Isabel Drost-Fromm 2a148a5b25 Update to current format. 2016-07-27 14:30:14 +02:00
Nik Everett 3c0288ee98 Consolify term and phrase suggester docs
This includes a working example of reverse filters to support
correcting prefix errors.
2016-07-26 12:28:31 -04:00
Isabel Drost-Fromm 00a8516780 Merge branch 'master' into docs/add_console_to_search 2016-07-25 11:54:26 +02:00
Nik Everett 7aeea764ba Remove wait_for_status=yellow from the docs
It is no longer required after 687e2e12b3.
2016-07-15 16:02:07 -04:00
Zachary Tong c950ea0023 Record method counts while profiling (#18302)
Invocation counts can be used to help judge the selectivity of individual query components in the context of the entire query.  E.g. a query may not look selective when run by itself (matches most of the index), but when run in context of a full search request, is evaluated only rarely due to execution order

Since this is modifying the base timing class, it'll enrich both query and agg profiles (as well as future profile results)
2016-07-14 09:46:24 -04:00
Clinton Gormley ee86a9f634 Update field-stats.asciidoc
Change use of index constraints to correctly identify any indices containing relevant docs

Closes #19232
2016-07-07 14:56:40 +02:00
Clinton Gormley f572f8cc17 Bad asciidoc link 2016-07-04 11:02:06 +02:00
Jim Ferenczi afe99fcdcd Restore reverted change now that alpha4 is out:
Rename `fields` to `stored_fields` and add `docvalue_fields`

`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-07-04 10:39:49 +02:00
Colin Goodheart-Smithe 0d7c11ea1d [DOCS] put profiling performance and limitations section on same page 2016-06-30 12:28:46 +01:00
Isabel Drost-Fromm 9e155e48a5 Fix request-body search test. 2016-06-29 11:32:27 +02:00
Isabel Drost-Fromm 9f30ae3359 Merge branch 'master' into docs/add_console_to_search 2016-06-29 10:20:25 +02:00
Tanguy Leroux 4820d49120 Mustache: Add util functions to render JSON and join array values
This pull request adds two util functions to the Mustache templating engine:
- {{#toJson}}my_map{{/toJson}} to render a Map parameter as a JSON string
- {{#join}}my_iterable{{/join}} to render any iterable (including arrays) as a comma separated list of values like `1, 2, 3`. It's also possible de change the default delimiter (comma) to something else.

closes #18970
2016-06-29 09:48:58 +02:00
Colin Goodheart-Smithe 1aa31ec934 #19133 Added documentation for aggregation profiling
Added documentation for aggregation profiling
2016-06-28 19:33:55 +01:00
Colin Goodheart-Smithe 44ee56c073 Added documentation for aggregation profiling 2016-06-28 19:33:29 +01:00
Robert Muir 6d52cec2a0 Merge pull request #19092 from rmuir/more_painless_docs
cutover some docs to painless
2016-06-28 13:40:25 -04:00
Jim Ferenczi eb1e231a63 Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`"
This reverts commit 2f46f53dc8.
2016-06-27 17:20:32 +02:00
Robert Muir 6fc1a22977 cutover some docs to painless 2016-06-27 09:55:16 -04:00
Adrien Grand c87ba0bfa8 Fix docs build. 2016-06-23 09:44:33 +02:00
Tanguy Leroux 04da1bda0d Move templates out of the Search API, into lang-mustache module
This commit moves template support out of the Search API to its own dedicated Search Template API in the lang-mustache module. It provides a new SearchTemplateAction that can be used to render templates before it gets delegated to the usual Search API. The current REST endpoint are identical, but the Render Search Template endpoint now uses the same Search Template API with a new "simulate" option. When this option is enabled, the Search Template API only renders template and returns immediatly, without executing the search.

Closes #17906
2016-06-23 09:30:53 +02:00
Jim Ferenczi 2f46f53dc8 Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-06-22 17:38:30 +02:00
Jim Ferenczi 881afcba60 Fixed tests that failed now that BM25 is the default similarity. 2016-06-21 15:42:42 +02:00
Martijn van Groningen 5ad2fdaa8e inner_hits: Don't include `_id`, `_type` and `_index` keys in search response for inner hits
Closes #18091
2016-06-21 14:13:38 +02:00
Jason Tedor d09d89f8c5 Remove only node preference
This commit removes the search preference _only_node as the same
functionality can be obtained by using the search preference
_only_nodes. This commit also adds a test that ensures that _only_nodes
will continue to support specifying node IDs.

Relates #18875
2016-06-17 15:27:46 -04:00
Alexander Lin 7d42e7e716 Closes #18013. Added status field to _msearch response bodies. 2016-06-16 00:25:17 -07:00
Jason Tedor e96722d91c Add search preference to prefer multiple nodes
The search preference _prefer_node allows specifying a single node to
prefer when routing a request. This functionality can be enhanced by
permitting multiple nodes to be preferred. This commit replaces the
search preference _prefer_node with the search preference _prefer_nodes
which supplants the former by specifying a single node and otherwise
adds functionality.

Relates #18872
2016-06-14 21:34:24 -04:00
Vladimir Kovpak d5f71f9e85 Updated from parameter description. (#18852)
Not sure that my description better but origin description looks very weird,
and i try to make emphasize to offset...
2016-06-14 14:33:15 +02:00
Itamar Syn-Hershko 5a9303dec2 Fixing typos (#18851) 2016-06-14 14:22:55 +02:00
Martijn van Groningen 3b96055b23 msearch: Cap the number of searches the msearch api will concurrently execute
By default the number of searches msearch executes is capped by the number of
nodes multiplied with the default size of the search threadpool. This default can be
overwritten by using the newly added `max_concurrent_searches` parameter.

Before the msearch api would concurrently execute all searches concurrently. If many large
msearch requests would be executed this could lead to some searches being rejected
while other searches in the msearch request would succeed.

The goal of this change is to avoid this exhausting of the search TP.

Closes #17926
2016-06-13 10:13:08 +02:00
Jim Ferenczi 439b2a96e5 Add an index setting to limit the maximum number of slices allowed in a scroll request (default to 1024). 2016-06-10 09:43:32 +02:00
Britta Weber 053a615686 [TEST] wait for yellow before query execution
We can remove this once https://github.com/elastic/elasticsearch/pull/18759
is in.
2016-06-09 15:11:48 +02:00
Jim Ferenczi b9030bf6fe Add the ability to partition a scroll in multiple slices.
API:

```
curl -XGET 'localhost:9200/twitter/tweet/_search?scroll=1m' -d '{
    "slice": {
        "field": "_uid", <1>
        "id": 0, <2>
        "max": 10 <3>
    },
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    }
}
```

<1> (optional) The field name used to do the slicing (_uid by default)
<2> The id of the slice

By default the splitting is done on the shards first and then locally on each shard using the _uid field
with the following formula:
`slice(doc) = floorMod(hashCode(doc._uid), max)`
For instance if the number of shards is equal to 2 and the user requested 4 slices then the slices 0 and 2 are assigned
to the first shard and the slices 1 and 3 are assigned to the second shard.

Each scroll is independent and can be processed in parallel like any scroll request.

Closes #13494
2016-06-07 16:21:53 +02:00
Isabel Drost-Fromm ea3320e171 Merge pull request #18424 from MaineC/docs/add_console_to_highlighting
Docs/add console to highlighting
2016-05-24 12:14:36 +02:00
Isabel Drost-Fromm fcd6a08366 Add CONSOLE annotations to request body docs 2016-05-19 15:11:43 +02:00
Isabel Drost-Fromm e414881a43 Add CONSOLE to field-stats docs. 2016-05-19 13:45:53 +02:00
Isabel Drost-Fromm 88c5a574a6 Add CONSOLE to explain 2016-05-19 12:17:13 +02:00
Isabel Drost-Fromm 4057682d6f Add CONSOLE to inner hits examples. 2016-05-19 11:01:36 +02:00
Isabel Drost-Fromm 10874fbdf9 Add CONSOLE to scroll docs
Relates to #18160
2016-05-19 10:30:58 +02:00
Isabel Drost-Fromm 27e6908c8d Add indent 2016-05-19 09:33:29 +02:00
Isabel Drost-Fromm bf471c6950 Merge branch 'master' into docs/add_console_to_search_request_options 2016-05-19 09:32:46 +02:00
Isabel Drost-Fromm 9d2a3c0600 Merge pull request #18442 from MaineC/docs/add_console_to_fromsize
Add CONSOLE to from/size docs
2016-05-18 18:38:41 +02:00
Isabel Drost-Fromm 394a60f3fd Switch to more match query for better illustration 2016-05-18 15:52:08 +02:00
Isabel Drost-Fromm 1f0f6132be Merge branch 'master' into docs/add_console_to_highlighting 2016-05-18 15:45:20 +02:00
Isabel Drost-Fromm a5268cd40d Add CONSOLE to version docs 2016-05-18 15:32:48 +02:00
Isabel Drost-Fromm c20a669c2d Add CONSOLE to source filtering docs 2016-05-18 15:20:21 +02:00
Isabel Drost-Fromm a849cc97ea Add CONSOLE to script-fields docs 2016-05-18 14:38:54 +02:00
Isabel Drost-Fromm 0032d4760e Add CONSOLE to preference docs 2016-05-18 14:34:22 +02:00
Isabel Drost-Fromm eca53d909c Merge branch 'master' into docs/add_console_to_search_request_options 2016-05-18 14:31:48 +02:00
Isabel Drost-Fromm a3425b4bf8 Add CONSOLE to post-filter 2016-05-18 14:31:04 +02:00
Isabel Drost-Fromm 125b715e45 Adds CONSOLE to count api 2016-05-18 13:36:19 +02:00
Isabel Drost-Fromm f22f3c7df5 Add CONSOLE to several trivial search request docs.
Relates to #18160

Touches explain, fielddata-fields, fields, index-boost, min-score,
named-queries-and-filters, query
2016-05-18 13:15:19 +02:00