OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	713c07e14d	Add early termination support to BucketCollector (#33279 ) This commit adds the support to early terminate the collection of a leaf in the aggregation framework. This change introduces a MultiBucketCollector which handles CollectionTerminatedException exactly like the Lucene MultiCollector. Any aggregator can now throw a CollectionTerminatedException without stopping the collection of a sibling aggregator. This is useful for aggregators that can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).	2018-09-03 09:34:35 +02:00
lipsill	b7c0d2830a	[Docs] Remove repeating words (#33087 )	2018-08-28 13:16:43 +02:00
Ignacio Vera	d7219c05a2	Search: Support of wildcard on docvalue_fields (#32980 ) * Search: Support of wildcard on docvalue_fields For consistency with stored_fields, docvalue_fields should support the use of wildcards. Documentation of doc values fields is updated accordingly. See also: #26390 Closes #26299	2018-08-23 10:04:00 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Simon Willnauer	ffb1a5d5b7	Expose `max_concurrent_shard_requests` in `_msearch` (#33016 ) Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests` per sub search request. This change adds support for setting this parameter on all sub-search requests in an `_msearch`. Relates to #31877	2018-08-22 08:45:08 +02:00
markharwood	70d80a3d09	Docs enhancement: added reference to cluster-level setting `search.default_allow_partial_results` (#32810 ) Closes #32809	2018-08-16 10:21:37 +01:00
Christoph Büscher	c1cc0cef61	Add ERR to ranking evaluation documentation (#32314 ) This change adds a section about the Expected Reciprocal Rank metric (ERR) to the Ranking Evaluation documentation.	2018-07-24 19:58:34 +02:00
Christoph Büscher	fe6bb75eb4	Rename ranking evaluation `quality_level` to `metric_score` (#32168 ) The notion of "quality" is an overloaded term in the search ranking evaluation context. It's usually used to describe certain levels of "good" vs. "bad" of a search result with respect to the user's information need. We currently report the result of the ranking evaluation as `quality_level`, which is a bit misleading. This changes the response parameter name to `metric_score`, which fits better.	2018-07-23 22:25:02 +02:00
Christoph Büscher	5cbd9ad177	Rename ranking evaluation response section (#32166 ) Currently the ranking evaluation response contains a 'unknown_docs' section for each search use case in the evaluation set. It contains document ids for results in the search hits that currently don't have a quality rating. This change renames it to `unrated_docs`, which better reflects its purpose.	2018-07-20 11:43:46 +02:00
David Turner	380b45b965	Improve docs for search preferences (#32159 ) Today it is unclear what guarantees are offered by the search preference feature, and we claim a guarantee that is stronger than what we really offer: > A custom value will be used to guarantee that the same shards will be used > for the same custom value. This commit clarifies this documentation. Forward-port of #32098 to `master`.	2018-07-18 12:58:17 +01:00
Mayya Sharipova	80492cacfc	Add second level of field collapsing (#31808 ) * Put second level collapse under inner_hits Closes #24855	2018-07-13 11:40:03 -04:00
Christoph Büscher	450a450b2c	[Docs] Clarify accepted sort case (#31605 ) Rescore only works with an explicit "sort" element if it is on descending "_score". Even using "order" : "asc" will throw an error.	2018-07-06 10:11:36 +02:00
Christoph Büscher	5f87a84bef	[Docs] Correct default window_size (#31582 )	2018-07-04 14:07:20 +02:00
Julie Tibshirani	26a927a120	Fix a formatting issue in the docvalue_fields documentation. (#31563 )	2018-06-26 10:15:56 -07:00
Igor Motov	7a9d9b0abf	Add support for ignore_unmapped to geo sort (#31153 ) Adds support for `ignore_unmapped` parameter in geo distance sorting, which is functionally equivalent to specifying an `unmapped_type` in the field sort. Closes #28152	2018-06-07 11:11:13 -04:00
Jim Ferenczi	0f5e570184	Deprecates indexing and querying a context completion field without context (#30712 ) This change deprecates completion queries and documents without context that target a context enabled completion field. Querying without context degrades the search performance considerably (even when the number of indexed contexts is low). This commit targets master but the deprecation will take place in 6.x and the functionality will be removed in 7 in a follow up. Closes #29222	2018-05-31 16:09:48 +02:00
Adrien Grand	a19df4ab3b	Add a `format` option to `docvalue_fields`. (#29639 ) This commit adds the ability to configure how a docvalue field should be formatted, so that it would be possible eg. to return a date field formatted as the number of milliseconds since Epoch. Closes #27740	2018-05-23 14:39:04 +02:00
Fernando Medina Corey	739bb4f0ec	Fix a grammatical error in the 'search types' documentation. Simple grammatical fix.	2018-05-22 22:09:04 -07:00
Christoph Büscher	f7b5986682	[Docs] Fix script-fields snippet execution (#30693 ) Currently the first snippet in the documentation test in script-fields.asciidoc isn't executed, although it has the CONSOLE annotation. Adding a test setup annotation to it seems to fix the problem.	2018-05-22 20:22:42 +02:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Ke Li	d373e1b49c	Fix the search request default operation behavior doc (#29302 ) (#29405 )	2018-05-07 14:43:45 +02:00
Julie Tibshirani	5c9f08402e	Correct an example in the top-level suggester documentation. (#30224 )	2018-05-01 15:16:28 -07:00
Julie Tibshirani	f5978d6d33	In the field capabilities API, remove support for providing fields in the request body. (#30185 )	2018-04-27 16:14:11 -07:00
Saren Currie	0b4d2f5225	Clarify documentation of scroll_id (#29424 ) * Clarify documentation of scroll_id The Scroll API may return the same scroll ID for multiple requests due to server side state. This is not clear from the current documentation. * Further clarify scroll ID return behaviour	2018-04-26 09:45:48 +01:00
Julie Tibshirani	32dfb65144	In the field capabilities API, deprecate support for providing fields in the request body. (#30157 ) (cherry picked from commit d8d884b29d4aa7d01070484fee5de8d3db60cb25)	2018-04-25 23:01:53 -07:00
debadair	0c9baebe15	[DOCS] Added include for internal highlighters section. (#29597 )	2018-04-18 16:56:09 -07:00
Mayya Sharipova	bf6cfff080	[DOCS] Update highlighting docs (#28802 ) - add more explanation to some highlighting parameters - add a document describing how highlighters work internally	2018-04-18 17:41:19 -04:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224 ) Some features have been deprecated since `6.0` like the `_parent` field or the ability to have multiple types per index. This allows to remove quite some code, which in-turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
Christoph Büscher	9f0c5ccf34	[Docs] Correct typos in rank-eval and multi-search	2018-04-10 12:48:16 +02:00
Christoph Büscher	dc1c16964a	[Docs] Correct experimental note formatting	2018-04-03 16:16:21 +02:00
Christoph Büscher	c3fdf8fbfb	[Docs] Fix small typo in ranking evaluation docs	2018-03-28 17:45:44 +02:00
Christoph Büscher	afe95a7738	[Docs] Add rank_eval size parameter k (#29218 ) The rank_eval documentation was missing an explanation of the parameter `k` that controls the number of top hits that are used in the ranking evaluation. Closes #29205	2018-03-23 18:04:32 +01:00
Mayya Sharipova	fb5b2dff57	Correct the way to reference params in painless	2018-03-13 12:33:37 -07:00
Mayya Sharipova	f53d159aa1	Limit analyzed text for highlighting (improvements) (#28808 ) Increase the default limit of `index.highlight.max_analyzed_offset` to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to https://github.com/elastic/kibana/issues/16764	2018-03-02 08:09:05 -08:00
Jason Tedor	fb073216b1	Move search concurrency and parallelism paragraphs These paragraphs should be on the top-level search page for visibility so this commit moves them, and puts them under a clear heading.	2018-02-26 07:47:57 -08:00
olcbean	beb8b10556	Fix inconsistency in docs regarding single types (#28715 ) This commit fixes some inconsistencies in the docs regarding single types. The inconsistencies are between the verbiage and the relevant snippets.	2018-02-26 07:08:37 -08:00
Rachel Johnson	617044e5fe	Update search.asciidoc (#28646 ) [DOCS] Corrected typo - singe to single.	2018-02-12 15:02:13 -08:00
Jim Ferenczi	7dc00ef1f5	Search option terminate_after does not handle post_filters and aggregations correctly (#28459 ) * Search option terminate_after does not handle post_filters and aggregations correctly This change fixes the handling of the `terminate_after` option when post_filters (or min_score) are used. `post_filter` should be applied before `terminate_after` in order to terminate the query when enough documents are accepted by the post_filters. This commit also changes the type of exception thrown by `terminate_after` in order to ensure that multi collectors (aggregations) do not try to continue the collection when enough documents have been collected. Closes #28411	2018-02-12 13:36:33 +01:00
markharwood	77d2dd203e	Search - add allow_partial_search_results flag with default setting true (#28440 ) Adds allow_partial_search_results flag to search requests with default setting = true. When false, will error if search either times out, has partial errors or has missing shards rather than returning partial search results. A cluster-level setting provides a default for search requests with no flag. Closes #27435	2018-01-31 15:51:29 +00:00
Vlad Holubiev	eea9ee57dd	[Docs] Fix typo in inner-hits.asciidoc (#27998 )	2018-01-31 11:55:53 +01:00
Christoph Büscher	6731c76900	Add ranking evaluation API to High Level Rest Client (#28357 ) This change adds support for the new ranking evaluation API to the High Level Rest Client. This mostly means adding support for parsing the various response objects back from the REST representation. It includes one change to the response syntax where previously we didn't print the type of the metric details section but we now need it to pick the right parser to parse this section back. Closes #28198	2018-01-30 17:48:09 +01:00
Robin Stocker	64bbb3a235	[Docs] Clarify `html` encoder in highlighting.asciidoc (#27766 ) The previous description was a bit confusing because the pre/post tags used for highlighting are not escaped, the rest of the content is.	2018-01-24 16:45:40 +01:00
Andrew Kramarev	ef468327e9	mistyping in one of the highlighting examples comment -> content (#28139 )	2018-01-18 17:32:42 -05:00
Jim Ferenczi	defb53a0bc	add a note regarding rescore and sort (#28251 )	2018-01-18 09:23:19 +01:00
Christoph Büscher	39ff7b5a3f	[Docs] Correct response json in rank-eval.asciidoc	2018-01-11 15:52:11 +01:00
Andrew Banchich	e92acefba0	[Docs] Improvements in script-fields.asciidoc (#28174 )	2018-01-11 10:59:27 +01:00
Vlad Holubiev	31d4a4bf7c	[DOCS] Fix link formatting (#27990 )	2017-12-26 16:25:05 +00:00
Mayya Sharipova	cbd271e497	Limit the analyzed text for highlighting (#27934 ) * Limit the analyzed text for highlighting - Introduce index level settings to control the max number of characters to be analyzed for highlighting - Throw an error if analysis is required on a larger text Closes #27517	2017-12-21 10:19:58 -05:00
Christoph Büscher	f3293879b5	[Docs] Improve rendering of ranking evaluation docs	2017-12-15 10:45:44 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	3d3a1d2a0d	Adding short description for experimental status in docs	2017-12-08 15:12:15 +01:00
Christoph Büscher	52cb6c8ef2	Merge branch 'master' into rankeval	2017-12-07 14:22:46 +01:00
Deb Adair	2f9a882061	[DOCS] Fixed typos and broken attribute.	2017-12-05 11:46:40 -08:00
Christoph Büscher	bbec33d35c	Merge branch 'master' into rankeval	2017-12-04 12:57:19 +01:00
olcbean	d25c9671de	Deprecate `jarowinkler` in favor of `jaro_winkler` (#27526 ) Jaro and Winkler are two people, so we should use the same naming convention as for Damerau–Levenshtein.	2017-11-30 12:49:34 +00:00
Martijn van Groningen	dbf17152d1	docs: use `doc_value_fields` fields as alternative for nested inner hits _source fetching instead of stored fields as doc values are more likely to be enabled by default	2017-11-29 17:31:39 +01:00
Christoph Büscher	35688f6441	Merge branch 'master' into rankeval	2017-11-29 15:24:06 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Martijn van Groningen	cb1204774b	Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response. Also include _type and _id for parent/child hits inside inner hits. In the case of top_hits aggregation the nested search hits are directly returned and are not grouped by a root or parent document, so it is important to include the _id and _index attributes in order to know to what documents these nested search hits belong to. Closes #27053	2017-11-28 14:05:29 +01:00
Christoph Büscher	5661b1c3df	Merge branch 'master' into rankeval	2017-11-24 16:25:05 +01:00
olcbean	fd564b10db	Deprecate `levenstein` in favor of `levenshtein` (#27409 ) Support both spellings throughout 6.x, reporting the incorrect one as deprecated.	2017-11-23 12:53:47 +00:00
Christoph Büscher	5735477283	Fix some documentation typos	2017-11-23 12:31:25 +01:00
Simon Willnauer	fadbe0de08	Automatically prepare indices for splitting (#27451 ) Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routing shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller amount of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificially changing the number of buckets in the consistent hashing space documents might be hashed into different shards compared to previous versions. This is a 7.0 only change.	2017-11-23 09:48:54 +01:00
Christoph Büscher	d979ccace9	Merge branch 'master' into rankeval	2017-11-21 14:11:02 +01:00
Christoph Büscher	3348d2317f	Reworking javadocs, minor changes in some implementation classes	2017-11-21 14:09:04 +01:00
Christoph Büscher	5c65a59369	Extending rank_eval asciidocs	2017-11-21 14:08:42 +01:00
Christoph Büscher	d9e67a2c95	Extending `_rank_eval` documentation	2017-11-21 14:08:28 +01:00
Zachary Tong	6e9e07d6f8	Fix profiling naming issues (#27133 ) Some code-paths use anonymous classes (such as NonCollectingAggregator in terms agg), which messes up the display name of the profiler. If we encounter an anonymous class, we need to grab the super's name. Another naming issue was that ProfileAggs were not delegating to the wrapped agg's name for toString(), leading to ugly display. This PR also fixes up the profile documentation. Some of the examples were executing against empty indices, which shows different profile results than a populated index (and made for confusing examples). Finally, I switched the agg display names from the fully qualified name to the simple name, so that it's similar to how the query profiles work. Closes #26405	2017-11-06 16:37:33 -05:00
Shai Erera	bd0261916c	Fix Laplace scorer to multiply by alpha (and not add) (#27125 )	2017-10-31 13:08:44 +01:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context Due to a change made in #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this, the _source of nested inner hits is now no longer wrapped by parent json objects, regardless of whether the _source is included as is or source filtering is used. Internally source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed to make sure the structure of _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
Christoph Büscher	bea8451b2f	Merge branch 'master' into feature/rank-eval	2017-09-15 11:44:51 +02:00
Jim Ferenczi	401f4ba2ce	Fix percolator highlight sub fetch phase to not highlight query twice (#26622 ) * Fix percolator highlight sub fetch phase to not highlight query twice The PercolatorHighlightSubFetchPhase does not override hitExecute and since it extends HighlightPhase the search hits are highlighted twice (by the highlight phase and then by the percolator). This does not alter the results, the second highlighting just overrides the first one, but this slows down the request because it duplicates the work.	2017-09-14 09:31:14 +02:00
Tanguy Leroux	7404221b55	[Docs] Clarify size parameter in Completion Suggester doc (#26617 )	2017-09-13 17:28:31 +02:00
Jim Ferenczi	d68d8c9cef	Expose duplicate removal in the completion suggester (#26496 ) This change exposes the duplicate removal option added in Lucene for the completion suggester with a new option called `skip_duplicates` (defaults to false). This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input. Closes #23364	2017-09-07 17:11:01 +02:00
Matt Weber	140395c83f	Multi-level Nested Sort with Filters (#26395 ) Multi-level Nested Sort with Filters Allow multiple levels of nested sorting where each level can have its own filter. Backward compatible with previous single-level nested sort.	2017-08-30 18:52:56 +02:00
Martijn van Groningen	c821dce3fe	Revert "Multi-level Nested Sort with Filters" This reverts commit `6377afa6c3`.	2017-08-30 14:53:25 +02:00
Martijn van Groningen	6377afa6c3	Multi-level Nested Sort with Filters Allow multiple levels of nested sorting where each level can have its own filter. Backward compatible with previous single-level nested sort.	2017-08-30 14:30:20 +02:00
Tanguy Leroux	db54c4dc7c	[Docs] Convert more doc snippets (#26404 ) This commit converts some remaining doc snippets so that they are now testable.	2017-08-30 09:30:36 +02:00
Jim Ferenczi	86d97971a4	Remove the _all metadata field (#26356 ) * Remove the _all metadata field This change removes the `_all` metadata field. This field is deprecated in 6 and cannot be activated for indices created in 6 so it can be safely removed in the next major version (e.g. 7).	2017-08-28 17:43:59 +02:00
Christoph Büscher	62a7cac3a0	Merge branch 'master' into feature/rank-eval	2017-08-23 11:19:16 +02:00
Alexander Reelsen	483086220f	Docs: Add search response took time explanation (#26202 )	2017-08-15 08:43:26 +02:00
Martijn van Groningen	076167fbe5	inner hits: Unfiltered nested source should keep its full path like filtered nested source. Closes #23090	2017-08-10 15:58:29 +02:00
Christoph Büscher	18155ed69a	Merge branch 'master' into feature/rank-eval	2017-08-07 16:07:34 +02:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Christoph Büscher	6d999f074a	Merge branch 'master' into feature/rank-eval	2017-07-14 18:36:08 +02:00
Jim Ferenczi	fe383b7c27	More clarifications on the unified highlighter being the new default (#25668 ) * More clarifications on the unified highlighter being the new default	2017-07-13 15:38:58 +02:00
Deb Adair	ded9f55263	[DOCS] Incorporated feedback on the highlighting changes.	2017-07-12 16:36:33 -07:00
Ryan Ernst	70b2897bdf	Scripting: Deprecate stored search template apis (#25437 ) This commit deprecates the PUT, GET and DELETE search template apis. Instead, the stored script api should be used. closes #24596	2017-07-12 16:07:28 -07:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range, i.e. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
Deb Adair	b5e81132cf	[DOCS] Reorganized the highlighting topic so it's less confusing.	2017-07-11 21:16:14 -07:00
Simon Willnauer	98c91a3bd0	Limit the number of concurrent shard requests per search request (#25632 ) This is a protection mechanism to prevent a single search request from hitting a large number of shards in the cluster concurrently. If a search is executed against all indices in the cluster this can easily overload the cluster causing rejections etc. which is not necessarily desirable. Instead this PR adds a per request limit of `max_concurrent_shard_requests` that throttles the number of concurrent initial phase requests to `256` by default. This limit can be increased per request and protects single search requests from overloading the cluster. Subsequent PRs can introduce additional improvements, i.e. limiting this on a `_msearch` level, making defaults a factor of the number of nodes or sort shards iters such that we gain the best concurrency across nodes.	2017-07-11 16:23:10 +02:00
Clinton Gormley	bd7ddfa175	Removed field-stats docs	2017-07-11 15:15:25 +02:00
Martijn van Groningen	d0f9f425bd	parent/child: Removed ParentJoinFieldSubFetchPhase	2017-07-06 13:15:02 +02:00
Clinton Gormley	0170e0e8d3	Remove usage of multi-types from the docs and added a page explaining type removal (#25543 ) Closes #25401	2017-07-05 12:30:19 +02:00
Christoph Büscher	2708bcc6ed	Merge branch 'master' into feature/rank-eval	2017-06-29 15:07:45 +02:00
Jim Ferenczi	664193185e	[Docs] Fix cross reference for parent-join field	2017-06-16 11:53:16 +02:00
Jim Ferenczi	ccb3c9aae7	Add documentation for the new parent-join field (#25227 ) * Add documentation for the new parent-join field This commit adds the docs for the new parent-join field. It explains how to define, index and query this new field. Relates #20257	2017-06-16 11:13:23 +02:00
Adrien Grand	0c117145f6	Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222 ) This snapshot has faster range queries on range fields (LUCENE-7828), more accurate norms (LUCENE-7730) and the ability to use fake term frequencies (LUCENE-7854).	2017-06-15 09:52:07 +02:00
Christoph Büscher	ac3db8c30f	Merge branch 'master' into feature/rank-eval	2017-06-14 11:57:05 +02:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Jim Ferenczi	5e8b569255	fix highlighting docs	2017-06-09 14:42:08 +02:00
Jim Ferenczi	8250aa4267	Remove the postings highlighter and make unified the default highlighter choice (#25028 ) This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves exactly like the `unified` highlighter when index_options is set to `offsets`: https://issues.apache.org/jira/browse/LUCENE-7815 It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided). The strategy used internally by this highlighter remains the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text. Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such. There are a few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available. I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features has been added to the `unified`.	2017-06-09 14:09:57 +02:00
Andrey Groshev	e4fd8485ce	Made the same length of opening and closing lines (#23583 )	2017-06-09 00:50:43 -07:00
Jim Ferenczi	36a5cf8f35	Automatically early terminate search query based on index sorting (#24864 ) This commit refactors the query phase in order to be able to automatically detect queries that can be early terminated. If the index sort matches the query sort, the top docs collection is early terminated on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector. This change also adds a new parameter to the search request called `track_total_hits`. It indicates if the total number of hits that match the query should be tracked. If false, queries sorted by the index sort will not try to compute this information and will limit the collection to the first N documents per segment. Aggregations are not impacted and will continue to see every document even when the index sort matches the query sort and `track_total_hits` is false. Relates #6720	2017-06-08 12:10:46 +02:00
Yibin Lin	fbf2e3d574	Tiny correction in inner-hits.asciidoc (#25066 )	2017-06-06 13:26:37 +02:00
Christoph Büscher	3d6fb4eb0b	Merge branch 'master' into feature/rank-eval	2017-05-30 14:24:26 +02:00
Clinton Gormley	0656d0236b	Update context-suggest.asciidoc Removed incorrect parameter	2017-05-26 17:41:40 +02:00
Matt Weber	601a61a91c	Support Multiple Collapse Inner Hits Support multiple named inner hits on a field collapsing request.	2017-05-26 13:23:57 +02:00
António Ribeiro	85a1b2b406	Fix link to perl docs (#24842 ) * Fixes Elasticsearch issue #24606. * Fixes Elasticsearch issue #24606. * Fixes Elasticsearch issue #24606. * Fixes Elasticsearch issue #24606. * Issue #24606 - Changed the link text to Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll.	2017-05-24 11:43:54 +02:00
Nik Everett	13a86fec99	Add magic $_path stash key to docs tests (#24724 ) Adds a "magic" key to the yaml testing stash mostly for use with documentation tests. When unstashing an object, `$_path` is the path into the current position in the object you are unstashing. This means that in docs tests you can use `// TESTRESPONSEs/somevalue/$body.${_path}/` to mean "replace `somevalue` with whatever is the response in the same position." Compare how you must carefully mock out all the numbers in the profile response without this change: ``` // TESTRESPONSE[s/"id": "\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[1\]"/"id": $body.profile.shards.0.id/] // TESTRESPONSE[s/"rewrite_time": 51443/"rewrite_time": $body.profile.shards.0.searches.0.rewrite_time/] // TESTRESPONSE[s/"score": 51306/"score": $body.profile.shards.0.searches.0.query.0.breakdown.score/] // TESTRESPONSE[s/"time_in_nanos": "1873811"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.time_in_nanos/] // TESTRESPONSE[s/"build_scorer": 2935582/"build_scorer": $body.profile.shards.0.searches.0.query.0.breakdown.build_scorer/] // TESTRESPONSE[s/"create_weight": 919297/"create_weight": $body.profile.shards.0.searches.0.query.0.breakdown.create_weight/] // TESTRESPONSE[s/"next_doc": 53876/"next_doc": $body.profile.shards.0.searches.0.query.0.breakdown.next_doc/] // TESTRESPONSE[s/"time_in_nanos": "391943"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.0.time_in_nanos/] // TESTRESPONSE[s/"score": 28776/"score": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.score/] // TESTRESPONSE[s/"build_scorer": 784451/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.build_scorer/] // TESTRESPONSE[s/"create_weight": 1669564/"create_weight": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.create_weight/] // TESTRESPONSE[s/"next_doc": 10111/"next_doc": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.next_doc/] // 
TESTRESPONSE[s/"time_in_nanos": "210682"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.1.time_in_nanos/] // TESTRESPONSE[s/"score": 4552/"score": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.score/] // TESTRESPONSE[s/"build_scorer": 42602/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.build_scorer/] // TESTRESPONSE[s/"create_weight": 89323/"create_weight": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.create_weight/] // TESTRESPONSE[s/"next_doc": 2852/"next_doc": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.next_doc/] // TESTRESPONSE[s/"time_in_nanos": "304311"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.time_in_nanos/] // TESTRESPONSE[s/"time_in_nanos": "32273"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.children.0.time_in_nanos/] ``` To how you can cavalierly mock all the numbers at once with this change: ``` // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/] ```	2017-05-23 15:33:48 -04:00
Jack Conradson	0aa380b770	Fix search template documentation reference to scripting security.	2017-05-18 14:27:58 -07:00
Christoph Büscher	cd0941810f	Merge branch 'master' into feature/rank-eval	2017-05-18 16:47:47 +02:00
Ryan Ernst	463fe2f4d4	Scripting: Remove file scripts (#24627 ) This commit removes file scripts, which were deprecated in 5.5. closes #21798	2017-05-17 14:42:25 -07:00
Martijn van Groningen	840da4aebf	Removed deprecated template query. Relates to #19390	2017-05-11 14:56:45 +02:00
Adrien Grand	a72eaa8e0f	Identify documents by their `_id`. (#24460 ) Now that indices have a single type by default, we can move to the next step and identify documents using their `_id` rather than the `_uid`. One notable change in this commit is that I made deletions implicitly create types. This helps with the live version map in the case that documents are deleted before the first type is introduced. Otherwise there would be no way to differentiate `DELETE index/foo/1` followed by `PUT index/foo/1` from `DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are different if versioning is involved.	2017-05-09 16:33:52 +02:00
Anupam	0b36fb052c	Update completion-suggest.asciidoc (#24506 )	2017-05-05 11:34:41 -04:00
Simon Willnauer	6b67e0bf2f	Include all aliases including non-filtering in `_search_shards` response (#24489 ) The `_search_shards` API today only returns alias names if there is an alias filter associated with one of them. Now it can be useful to see which aliases have been expanded for an index given the index expressions. This change also includes non-filtering aliases even without a filtering alias being present.	2017-05-05 09:34:12 +02:00
Nik Everett	9f431543fc	CONSOLEify inner hits docs Rewrites most of the snippets in the `inner_hits` docs to be complete examples and enables `VIEW IN CONSOLE`, `COPY AS CURL`, and automatic testing of the snippets.	2017-05-04 17:30:54 -04:00
Adrien Grand	977016ba25	Do not index `_type` when there is at most one type. (#24363 ) This change makes `_type` behave pretty much like `_index` when `index.mapping.single_type` is true.	2017-05-04 16:29:35 +02:00
Clinton Gormley	582b3c06b6	Added docs for batched_reduce_size Relates to #23288	2017-05-02 14:25:03 +02:00
Jim Ferenczi	9d8254fadf	Fix FieldCaps documentation Fix the expected output for field_caps call. Fixes #24413	2017-05-02 10:14:47 +02:00
Martijn van Groningen	b77254871b	docs: document alternative for nested inner hits source Closes #24110	2017-04-28 11:09:24 +02:00
Guillaume Le Floch	739cb35d1b	Allow passing single scrollID in clear scroll API body (#24242 ) * Allow single scrollId in string format Closes #24233	2017-04-25 13:43:21 +02:00
Christoph Büscher	5254731039	Merge branch 'master' into feature/rank-eval	2017-04-22 21:47:32 +02:00
Suhas Karanth	f97d8bc78d	Update reference docs for Highlighter fragmenter (#23754 ) Explain the fragmenter and add examples.	2017-04-17 14:00:24 -04:00
Simon Willnauer	040b86a76b	Set shard count limit to unlimited (#24012 ) Now that we have incremental reduce functions for topN and aggregations we can set the default for `action.search.shard_count.limit` to unlimited. This still allows users to restrict these settings, while by default we execute across all shards matching the search request's index pattern.	2017-04-10 17:09:21 +02:00
Jim Ferenczi	9b3c85dd88	Deprecate _field_stats endpoint (#23914 ) _field_stats has evolved quite a lot to become a multi-purpose API capable of retrieving the field capabilities and the min/max value for a field. In the meantime a more focused API called `_field_caps` has been added. This endpoint is a good replacement for _field_stats since it can retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures). Also, the recent improvement made to range queries makes the _field_stats API obsolete since these queries are now rewritten per shard based on the min/max found for the field. This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently. For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet, which is why this PR is made directly to 6.0. The REST tests have also been adapted to not throw an error while this change is backported to 5.4.	2017-04-10 10:10:16 +02:00
Nik Everett	7fad7c675d	Rewrite the scripting security docs (#23930 ) They needed to be updated now that Painless is the default and the non-sandboxed scripting languages are going away or gone. I dropped the entire section about customizing the classloader whitelists. In master this barely does anything (exposes more things to expressions).	2017-04-07 11:46:41 -04:00
Nik Everett	048191ceb6	CONSOLEify highlighting a function_score docs Converts many of the partial examples into full search requests. Relates #18160	2017-04-06 08:13:56 -04:00
Christoph Büscher	024ed1b6ca	Merge branch 'master' into feature/rank-eval	2017-04-04 18:23:41 +02:00
Jim Ferenczi	a8250b26e7	Add FieldCapabilities (_field_caps) API (#23007 ) This change introduces a new API called `_field_caps` that allows retrieving the capabilities of specific fields. Example: ```` GET t,s,v,w/_field_caps?fields=field1,field2 ```` ... returns: ```` { "fields": { "field1": { "string": { "searchable": true, "aggregatable": true } }, "field2": { "keyword": { "searchable": false, "aggregatable": true, "non_searchable_indices": ["t"], "indices": ["t", "s"] }, "long": { "searchable": true, "aggregatable": false, "non_aggregatable_indices": ["v"], "indices": ["v", "w"] } } } } ```` In this example `field1` has the same type `text` across the requested indices `t`, `s`, `v`, `w`. Conversely `field2` is defined with two conflicting types `keyword` and `long`. Note that `_field_caps` does not treat this case as an error but rather returns the list of unique types seen for this field.	2017-03-31 15:34:46 +02:00
Glen Smith	c62d4b7b0f	Clarify preference docs This commit clarifies the preference docs regarding the explanation of how operations are routed by default. In particular, the previous use of "shard replicas" was confusing as it could imply an operation would only be routed to replicas by default. Relates #23794	2017-03-29 12:55:47 -04:00
Christoph Büscher	96fc3aaf6f	Merge branch 'master' into feature/rank-eval	2017-03-23 19:55:47 +01:00
Igor Motov	f927a2708d	Make it possible to validate a query on all shards instead of a single random shard (#23697 ) This is especially useful when we rewrite the query because the result of the rewrite can be very different on different shards. See #18254 for example.	2017-03-22 17:39:21 -04:00
Jim Ferenczi	b8c352fc3f	Add support for fragment_length in the unified highlighter (#23431 ) * Add support for fragment_length in the unified highlighter This commit introduces a new break iterator (a BoundedBreakIterator) designed for the unified highlighter that is able to limit the size of fragments produced by a generic break iterator like `sentence`. The `unified` highlighter now supports `boundary_scanner`, which can be `word` or `sentence`. The `sentence` mode will use the bounded break iterator in order to limit the size of the sentence to `fragment_length`. When sentences bigger than `fragment_length` are produced, this mode will break the sentence at the next word boundary after `fragment_length` is reached.	2017-03-17 18:10:13 +01:00
Jack Conradson	8e04561c0d	Change params._source to params['_source'] in example.	2017-03-15 17:29:31 -07:00
Jack Conradson	4c11ebc8b9	Fix example in documentation for Painless using _source. (#21322 )	2017-03-15 17:18:34 -07:00
Christoph Büscher	cf35545e2d	Merge branch 'master' into feature/rank-eval	2017-03-13 17:36:13 -07:00
NFM	f8fa5c96aa	Fix indentation in sort docs This commit fixes the indentation in an example query in the sort docs. Relates #23561	2017-03-12 17:08:06 -07:00
Christoph Büscher	1f4c4d99b9	Merge branch 'master' into feature/rank-eval	2017-02-27 11:25:17 +01:00
Shai Erera	eeac6d27f2	Add BreakIteratorBoundaryScanner support for FVH (#23248 ) This commit adds a boundary_scanner property to the search highlight request so the user can specify different boundary scanners: * `chars` (default, current behavior) * `word` Use a WordBreakIterator * `sentence` Use a SentenceBreakIterator This commit also adds "boundary_scanner_locale" to define which locale should be used when scanning the text.	2017-02-23 23:32:22 +01:00
Christoph Büscher	cfa52f8b9a	Merge branch 'master' into feature/rank-eval	2017-02-16 10:39:07 +01:00
Adrien Grand	8d6a41f671	Nested queries should avoid adding unnecessary filters when possible. (#23079 ) When nested objects are present in the mappings, many queries get deoptimized due to the need to exclude documents that are not in the right space. For instance, a filter is applied to all queries that prevents them from matching non-root documents (`+: -_type:__`). Moreover, a filter is applied to all child queries of `nested` queries in order to make sure that the child query only matches child documents (`_type:__nested_path`), which is required by `ToParentBlockJoinQuery` (the Lucene query behind Elasticsearch's `nested` queries). These additional filters slow down `nested` queries. In 1.7-, the cost was somehow amortized by the fact that we cached filters very aggressively. However, this has proven to be a significant source of slowdowns since 2.0 for users of `nested` mappings and queries, see #20797. This change makes the filtering a bit smarter. For instance if the query is a `match_all` query, then we need to exclude nested docs. However, if the query is `foo: bar` then it may only match root documents since `foo` is a top-level field, so no additional filtering is required. Another improvement is to use a `FILTER` clause on all types rather than a `MUST_NOT` clause on all nested paths when possible since `FILTER` clauses are more efficient. Here are some examples of queries and how they get rewritten: ``` "match_all": {} ``` This query gets rewritten to `ConstantScore(+:* -_type:__)` on master and `ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})` with this change. The automaton is the complement of `_type:__` so it matches the same documents, but is faster since it is now a positive clause. Simplistic performance testing on a 10M index where each root document has 5 nested documents on average gave a latency of 420ms on master and 90ms with this change applied.
``` "term": { "foo": { "value": "0" } } ``` This query is rewritten to `+foo:0 #(ConstantScore(+: -_type:__))^0.0` on master and `foo:0` with this change: we do not need to filter nested docs out since the query cannot match nested docs. While doing performance testing in the same conditions as above, response times went from 250ms to 50ms. ``` "nested": { "path": "nested", "query": { "term": { "nested.foo": { "value": "0" } } } } ``` This query is rewritten to `+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+:* -_type:__))^0.0` on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The top-level filter (`-_type:__`) could be removed since `nested` queries only match documents of the parent space, as well as the child filter (`#_type:__nested`) since the child query may only match nested docs since the `nested` object has both `include_in_parent` and `include_in_root` set to `false`. While doing performance testing in the same conditions as above, response times went from 850ms to 270ms.	2017-02-14 16:05:19 +01:00
Tanguy Leroux	e2e5937455	Use `typed_keys` parameter to prefix suggester names by type in search responses (#23080 ) This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.	2017-02-10 10:53:38 +01:00
Tanguy Leroux	63ea6f7168	[Docs] Remove unnecessary // TEST[continued] in search-template doc It has been explained in `e39b96f257`	2017-02-10 10:08:24 +01:00
Jim Ferenczi	94087b3274	Removes ExpandCollapseSearchResponseListener, search response listeners and blocking calls This change removes the SearchResponseListener that was used by the ExpandCollapseSearchResponseListener to expand collapsed hits. The removal of SearchResponseListener is not a breaking change because it was never released. This change also replaces the blocking call in ExpandCollapseSearchResponseListener with a single asynchronous multi search request. The parallelism of the expand request can be set via CollapseBuilder#max_concurrent_group_searches Closes #23048	2017-02-09 18:06:10 +01:00
Tanguy Leroux	832952cb29	[Docs] Fix consoleify search-template.asciidoc It does not reproduce well, hopefully this will fix the failure on DELETE _search/template/<templatename>.	2017-02-08 21:23:38 +01:00
Jay Modi	7f3769c745	Remove ldjson support and document ndjson for bulk/msearch (#23049 ) This commit removes support for the `application/x-ldjson` Content-Type header as this was only used in the first draft of the spec and had very little uptake. Additionally, the docs for bulk and msearch have been updated to specifically call out ndjson and mention that the newline character may be preceded by a carriage return. Finally, the bulk request handling of the carriage return has been improved to remove this character from the source. Closes #23025	2017-02-08 11:55:50 -05:00

850 Commits