OpenSearch

Commit Graph

Author	SHA1	Message	Date
Andrew Kramarev	ef468327e9	mistyping in one of the highlighting examples comment -> content (#28139 )	2018-01-18 17:32:42 -05:00
Jim Ferenczi	defb53a0bc	add a note regarding rescore and sort (#28251 )	2018-01-18 09:23:19 +01:00
Christoph Büscher	39ff7b5a3f	[Docs] Correct response json in rank-eval.asciidoc	2018-01-11 15:52:11 +01:00
Andrew Banchich	e92acefba0	[Docs] Improvements in script-fields.asciidoc (#28174 )	2018-01-11 10:59:27 +01:00
Vlad Holubiev	31d4a4bf7c	[DOCS] Fix link formatting (#27990 )	2017-12-26 16:25:05 +00:00
Mayya Sharipova	cbd271e497	Limit the analyzed text for highlighting (#27934 ) * Limit the analyzed text for highlighting - Introduce index level settings to control the max number of character to be analyzed for highlighting - Throw an error if analysis is required on a larger text Closes #27517	2017-12-21 10:19:58 -05:00
Christoph Büscher	f3293879b5	[Docs] Improve rendering of ranking evaluation docs	2017-12-15 10:45:44 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	3d3a1d2a0d	Adding short description for experimental status in docs	2017-12-08 15:12:15 +01:00
Christoph Büscher	52cb6c8ef2	Merge branch 'master' into rankeval	2017-12-07 14:22:46 +01:00
Deb Adair	2f9a882061	[DOCS] Fixed typos and broken attribute.	2017-12-05 11:46:40 -08:00
Christoph Büscher	bbec33d35c	Merge branch 'master' into rankeval	2017-12-04 12:57:19 +01:00
olcbean	d25c9671de	Deprecate `jarowinkler` in favor of `jaro_winkler` (#27526 ) Jaro and Winkler are two people, so we should use the same naming convention as for Damerau–Levenshtein.	2017-11-30 12:49:34 +00:00
Martijn van Groningen	dbf17152d1	docs: use `doc_value_fields` fields as alternative for nested inner hits _source fetching instead of stored fields as doc values are more likely to be enabled by default	2017-11-29 17:31:39 +01:00
Christoph Büscher	35688f6441	Merge branch 'master' into rankeval	2017-11-29 15:24:06 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Martijn van Groningen	cb1204774b	Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response. Also include _type and _id for parent/child hits inside inner hits. In the case of top_hits aggregation the nested search hits are directly returned and are not grouped by a root or parent document, so it is important to include the _id and _index attributes in order to know to what documents these nested search hits belong to. Closes #27053	2017-11-28 14:05:29 +01:00
Christoph Büscher	5661b1c3df	Merge branch 'master' into rankeval	2017-11-24 16:25:05 +01:00
olcbean	fd564b10db	Deprecate `levenstein` in favor of `levenshtein` (#27409 ) Support both spellings thoughout 6.x, reporting the incorrect one as deprecated.	2017-11-23 12:53:47 +00:00
Christoph Büscher	5735477283	Fix some documentation typos	2017-11-23 12:31:25 +01:00
Simon Willnauer	fadbe0de08	Automatically prepare indices for splitting (#27451 ) Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routinng shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller amount of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificually changing the number of buckets in the consistent hashign space document might be hashed into different shards compared to previous versions. This is a 7.0 only change.	2017-11-23 09:48:54 +01:00
Christoph Büscher	d979ccace9	Merge branch 'master' into rankeval	2017-11-21 14:11:02 +01:00
Christoph Büscher	3348d2317f	Reworking javadocs, minor changes in some implementation classes	2017-11-21 14:09:04 +01:00
Christoph Büscher	5c65a59369	Extending rank_eval asciidocs	2017-11-21 14:08:42 +01:00
Christoph Büscher	d9e67a2c95	Extending `_rank_eval` documentation	2017-11-21 14:08:28 +01:00
Zachary Tong	6e9e07d6f8	Fix profiling naming issues (#27133 ) Some code-paths use anonymous classes (such as NonCollectingAggregator in terms agg), which messes up the display name of the profiler. If we encounter an anonymous class, we need to grab the super's name. Another naming issue was that ProfileAggs were not delegating to the wrapped agg's name for toString(), leading to ugly display. This PR also fixes up the profile documentation. Some of the examples were executing against empty indices, which shows different profile results than a populated index (and made for confusing examples). Finally, I switched the agg display names from the fully qualified name to the simple name, so that it's similar to how the query profiles work. Closes #26405	2017-11-06 16:37:33 -05:00
Shai Erera	bd0261916c	Fix Laplace scorer to multiply by alpha (and not add) (#27125 )	2017-10-31 13:08:44 +01:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context Due to a change happened via #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this, the _source of nested inner hits is now no longer wrapped by parent json objects, irregardless of whether the _source is included as is or source filtering is used. Internally source filtering and highlighting relies on the fact that the _source of nested inner hits are accessible by its full field path, so in order to now break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed to make sure the structure of _source of the nested inner hit is consistent irregardless if source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
Christoph Büscher	bea8451b2f	Merge branch 'master' into feature/rank-eval	2017-09-15 11:44:51 +02:00
Jim Ferenczi	401f4ba2ce	Fix percolator highlight sub fetch phase to not highlight query twice (#26622 ) * Fix percolator highlight sub fetch phase to not highlight query twice The PercolatorHighlightSubFetchPhase does not override hitExecute and since it extends HighlightPhase the search hits are highlighted twice (by the highlight phase and then by the percolator). This does not alter the results, the second highlighting just overrides the first one but this slow down the request because it duplicates the work.	2017-09-14 09:31:14 +02:00
Tanguy Leroux	7404221b55	[Docs] Clarify size parameter in Completion Suggester doc (#26617 )	2017-09-13 17:28:31 +02:00
Jim Ferenczi	d68d8c9cef	Expose duplicate removal in the completion suggester (#26496 ) This change exposes the duplicate removal option added in Lucene for the completion suggester with a new option called `skip_duplicates` (defaults to false). This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input. Closes #23364	2017-09-07 17:11:01 +02:00
Matt Weber	140395c83f	Multi-level Nested Sort with Filters (#26395 ) Multi-level Nested Sort with Filters Allow multiple levels of nested sorting where each level can have it's own filter. Backward compatible with previous single-level nested sort.	2017-08-30 18:52:56 +02:00
Martijn van Groningen	c821dce3fe	Revert "Multi-level Nested Sort with Filters" This reverts commit `6377afa6c3`.	2017-08-30 14:53:25 +02:00
Martijn van Groningen	6377afa6c3	Multi-level Nested Sort with Filters Allow multple levels of nested sorting where each level can have it's own filter. Backward compatible with previous single-level nested sort.	2017-08-30 14:30:20 +02:00
Tanguy Leroux	db54c4dc7c	[Docs] Convert more doc snippets (#26404 ) This commit converts some remaining doc snippets so that they are now testable.	2017-08-30 09:30:36 +02:00
Jim Ferenczi	86d97971a4	Remove the _all metadata field (#26356 ) * Remove the _all metadata field This change removes the `_all` metadata field. This field is deprecated in 6 and cannot be activated for indices created in 6 so it can be safely removed in the next major version (e.g. 7).	2017-08-28 17:43:59 +02:00
Christoph Büscher	62a7cac3a0	Merge branch 'master' into feature/rank-eval	2017-08-23 11:19:16 +02:00
Alexander Reelsen	483086220f	Docs: Add search response took time explanation (#26202 )	2017-08-15 08:43:26 +02:00
Martijn van Groningen	076167fbe5	inner hits: Unfiltered nested source should keep its full path like filtered nested source. Closes #23090	2017-08-10 15:58:29 +02:00
Christoph Büscher	18155ed69a	Merge branch 'master' into feature/rank-eval	2017-08-07 16:07:34 +02:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Christoph Büscher	6d999f074a	Merge branch 'master' into feature/rank-eval	2017-07-14 18:36:08 +02:00
Jim Ferenczi	fe383b7c27	More clarifications on the unified highlighter being the new default (#25668 ) * More clarifications on the unified highlighter being the new default	2017-07-13 15:38:58 +02:00
Deb Adair	ded9f55263	[DOCS] Incorporated feedback on the highlighting changes.	2017-07-12 16:36:33 -07:00
Ryan Ernst	70b2897bdf	Scripting: Deprecate stored search template apis (#25437 ) This commit deprecates the PUT, GET and DELETE search template apis. Instead, the stored script api should be used. closes #24596	2017-07-12 16:07:28 -07:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is not document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
Deb Adair	b5e81132cf	[DOCS] Reorganized the highlighting topic so it's less confusing.	2017-07-11 21:16:14 -07:00
Simon Willnauer	98c91a3bd0	Limit the number of concurrent shard requests per search request (#25632 ) This is a protection mechanism to prevent a single search request from hitting a large number of shards in the cluster concurrently. If a search is executed against all indices in the cluster this can easily overload the cluster causing rejections etc. which is not necessarily desirable. Instead this PR adds a per request limit of `max_concurrent_shard_requests` that throttles the number of concurrent initial phase requests to `256` by default. This limit can be increased per request and protects single search requests from overloading the cluster. Subsequent PRs can introduces addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of the number of nodes or sort shards iters such that we gain the best concurrency across nodes.	2017-07-11 16:23:10 +02:00

1 2 3 4 5 ...

707 Commits