OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-01 08:29:09 +00:00

Author	SHA1	Message	Date
Igor Motov	20af856abd	[7.x] EQL: Adds an ability to execute an asynchronous EQL search (#58192 ) Adds async support to EQL searches Closes #49638 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-25 14:11:57 -04:00
Nik Everett	03e6d1b535	Add Variable Width Histogram Aggregation (backport of #42035 ) (#58440 ) Implements a new histogram aggregation called `variable_width_histogram` which dynamically determines bucket intervals based on document groupings. These groups are determined by running a one-pass clustering algorithm on each shard and then reducing each shard's clusters using an agglomerative clustering algorithm. This PR addresses #9572. The shard-level clustering is done in one pass to minimize memory overhead. The algorithm was lightly inspired by [this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches a small number of documents to sample the data and determine initial clusters. Subsequent documents are then placed into one of these clusters, or a new one if they are an outlier. This algorithm is described in more detail in the aggregation's docs. At reduce time, a [hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304) continually merges the closest buckets from all shards (based on their centroids) until the target number of buckets is reached. The final values produced by this aggregation are approximate. Each bucket's min value is used as its key in the histogram. Furthermore, buckets are merged based on their centroids and not their bounds. So it is possible that adjacent buckets will overlap after reduction. Because each bucket's key is its min, this overlap is not shown in the final histogram. However, when such overlap occurs, we set the key of the bucket with the larger centroid to the midpoint between its minimum and the smaller bucket’s maximum: `min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to increase the accuracy of the clustering. Nodes are unable to share centroids during the shard-level clustering phase. In the future, resolving https://github.com/elastic/elasticsearch/issues/50863 would let us solve this issue. It doesn’t make sense for this aggregation to support the `min_doc_count` parameter, since clusters are determined dynamically. The `order` parameter is not supported here to keep this large PR from becoming too complex. Co-authored-by: James Dorfman <jamesdorfman@users.noreply.github.com>	2020-06-25 11:40:47 -04:00
James Rodewig	c3f4034199	[DOCS] Note that DS timestamp field mapping changes require reindex (#58444 ) (#58517 ) With #58096, data streams now track the timestamp field mapping outside of the template associated with the stream. This means you can no longer update the timestamp field mapping using template changes. This updates the associated data stream docs.	2020-06-24 17:21:26 -04:00
markharwood	837f2643eb	Docs - Added field capabilities breaking change (#58509 )	2020-06-24 18:39:01 +01:00
Russ Cam	441bc14d21	[DOCS] Update aliases to indicate array (#58469 ) Updates the aliases documentation to correct the parameter to an array.	2020-06-24 09:41:23 -04:00
markharwood	d5ac3bb87f	Field capabilities - make `keyword` a family of field types (#58315 ) (#58483 ) Introduces a new method on `MappedFieldType` to return a family type name which defaults to the field type. Changes `wildcard` and `constant_keyword` field types to return `keyword` for field capabilities. Relates to #53175	2020-06-24 12:32:14 +01:00
James Rodewig	afbf3bd33b	[DOCS] Add data streams to bulk, delete, and index API docs (#58340 ) (#58434 ) Updates existing docs for the bulk, delete and index APIs to make them aware of data streams.	2020-06-23 09:40:25 -04:00
James Rodewig	9d03204308	[DOCS] Prohibit deletion of composable template in use by data stream (#58347 ) (#58430 ) Notes that you cannot delete a composable template currently in use by a data stream. Relates to #57957.	2020-06-23 09:01:17 -04:00
James Rodewig	b213f0222c	[DOCS] Reword tip in data streams overview	2020-06-23 08:57:59 -04:00
István Zoltán Szabó	3169e4c70e	[DOCS] Updates screenshots in ML population analysis (#58318 )	2020-06-23 09:05:08 +02:00
Dan Hermann	c5f5cc4cf8	[DOCS] Prohibit cloning, splitting, and shrinking a data stream's write index (#58105 ) (#58401 )	2020-06-22 07:29:26 -05:00
Benjamin Trent	bf8641aa15	[7.x] [ML] calculate cache misses for inference and return in stats (#58252 ) (#58363 ) When a local model is constructed, the cache miss count is incremented. When a user calls _stats, we will include the sum of the cache miss counts across ALL nodes. This statistic is important for comparing against the inference_count. If the cache miss count is near the inference_count, it indicates that the cache is overburdened or inappropriately configured.	2020-06-19 09:46:51 -04:00
James Rodewig	d8dc638a67	[DOCS] Document get data stream API response body (#58344 ) (#58360 )	2020-06-18 16:42:05 -04:00
James Rodewig	b8fa90198b	[DOCS] Prohibit deletion of a data stream's write index (#58341 ) (#58358 )	2020-06-18 16:00:10 -04:00
Lisa Cawley	6680271691	[DOCS] Updates pull and issue release attributes (#58348 )	2020-06-18 12:55:02 -07:00
Tal Levy	11086d5c7d	add geo_shape documentation for supported aggregations (#58284 ) (#58354 ) This commit adds documentation for geo_shape fields in aggregations Closes #55495.	2020-06-18 12:36:24 -07:00
Stuart Tettemer	20abba8433	Scripting: Deprecate general cache settings (#55753 ) (#58283 ) Backport: ef543b0	2020-06-18 11:54:23 -06:00
Jason Tedor	be08268562	Allow follower indices to override leader settings (#58103 ) Today when creating a follower index via the put follow API, or via an auto-follow pattern, it is not possible to specify settings overrides for the follower index. Instead, we copy all of the leader index settings to the follower. Yet, there are cases where a user would want some different settings on the follower index such as the number of replicas, or allocation settings. This commit addresses this by allowing the user to specify settings overrides when creating a follower index via manual put follower calls, or via auto-follow patterns. Note that not all settings can be overridden (e.g., index.number_of_shards) so we also have detection that prevents attempting to override settings that must be equal between the leader and follower index. Note that we do not even allow specifying such settings in the overrides, even if they are specified to be equal between the leader and the follower index. Instead, they must be implicitly copied from the leader index, not explicitly set by the user.	2020-06-18 11:56:06 -04:00
James Rodewig	9ba1b1d067	[DOCS] Reformat data stream API docs (#58322 ) (#58334 )	2020-06-18 10:59:12 -04:00
Marios Trivyzas	50b391e91b	SQL: [Docs] Fix TIME_PARSE documentation (#58182 ) (#58317 ) TIME_PARSE works correctly if both date and time parts are specified, and a TIME object (that contains only time) is returned. Adjust docs and add a unit test that validates the behavior. Follows: #55223 (cherry picked from commit 9d6b679a5da88f3c131b9bdba49aa92c6c272abe)	2020-06-18 16:09:13 +02:00
Dan Hermann	3b511fd829	[DOCS] Add data stream APIs to main API page (#58204 ) (#58325 )	2020-06-18 08:41:43 -05:00
Dan Hermann	a2837097ff	[DOCS] Move some docs about data streams from the create page to the intro page	2020-06-18 08:24:06 -05:00
James Rodewig	64fb326637	[DOCS] Add data streams to search docs (#58278 ) (#58320 ) Changes: * Adds additional examples to the `Search a data stream` section of `Use a data stream` * Updates existing search docs to make them aware of data streams	2020-06-18 08:59:00 -04:00
Jim Ferenczi	82db0b575c	Allow index filtering in field capabilities API (#57276 ) (#58299 ) This change allows using an `index_filter` in the field capabilities API. Indices are filtered from the response if the provided query rewrites to `match_none` on every shard: ```` GET metrics-* { "index_filter": { "bool": { "must": [ { "range": { "@timestamp": { "gt": "2019" } } } ] } } } ```` The filtering is done on a best-effort basis: it uses the can match phase to rewrite queries to `match_none` instead of fully executing the request. The first shard that can match the filter is used to create the field capabilities response for the entire index. Closes #56195	2020-06-18 10:23:26 +02:00
Yannick Welsch	ffeff4090e	Add new flag to check whether alias exists on remove (#58100 ) This allows doing true CAS operations on aliases, making sure that an alias is actually properly moved from a given source index onto a given target index. This is useful to ensure that an alias is actually moved from a given index to another one, and not just added to another index.	2020-06-18 10:15:26 +02:00
James Rodewig	8b99a891a8	[DOCS] Fix typo in create data stream API docs	2020-06-17 17:15:50 -04:00
James Rodewig	4ab9aea965	[DOCS] Remove redundant links in data stream docs	2020-06-17 17:08:19 -04:00
James Rodewig	5e0b00f022	[DOCS] Fix routing param in search API docs (#58267 ) (#58288 )	2020-06-17 15:19:53 -04:00
James Rodewig	01043eb8aa	[7.x] [DOCS] Add 'update/delete docs in a data stream' tutorial (#58194 ) (#58264 ) Adds a tutorial for updating and deleting documents in the backing indices of a data stream.	2020-06-17 12:41:24 -04:00
Dan Hermann	4962a91157	Document that data stream write indices cannot be closed	2020-06-17 10:39:58 -05:00
Przemyslaw Gomulka	9894d90e0b	[doc] known issues - week based patterns not working in 7.6 (#58099 ) (#58227 ) relates #57128 # Conflicts: # docs/reference/release-notes/7.6.asciidoc	2020-06-17 10:54:22 +02:00
Lisa Cawley	46d797b1d9	[DOCS] Fixes license management links (#58213 )	2020-06-16 16:49:48 -07:00
debadair	cfef2b2bec	[DOCS] Removed unused pages (#58209 )	2020-06-16 15:55:56 -07:00
Stuart Tettemer	01795d1925	Revert "Scripting: Deprecate general cache settings (#55753 )" (#58201 ) This reverts commit 88e8b34fc2d672060a82979cb782b8cf491a3985.	2020-06-16 14:58:18 -06:00
James Rodewig	ce22e951f8	[DOCS] Add resolve index API check to DS setup tutorial (#58167 ) (#58197 ) Updates the set up a data stream tutorial to include a name check using the resolve index API.	2020-06-16 16:28:42 -04:00
James Rodewig	c548a87673	[DOCS] Add 'Change DS mappings and settings' tutorial (#58148 ) (#58195 ) Adds a tutorial for updating the mappings and index settings of a data stream's backing indices.	2020-06-16 16:20:32 -04:00
Stuart Tettemer	88e8b34fc2	Scripting: Deprecate general cache settings (#55753 ) Backport: ef543b0	2020-06-16 13:06:59 -06:00
debadair	276a4898ba	[DOCS] Fixes problematic terminology (#58184 ) * [DOCS] Fixes problematic terminology (#58178) * [DOCS] Fixes problematic terminology. * Update docs/reference/snapshot-restore/register-repository.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-16 11:43:22 -07:00
debadair	2edcd064fe	[DOCS] Fix bad xref (#58150 )	2020-06-15 15:50:49 -07:00
Jake Landis	dc7ffb154a	Update hh to HH in date processor example (#58089 ) (#58144 ) Co-authored-by: Leaf-Lin <39002973+Leaf-Lin@users.noreply.github.com>	2020-06-15 17:04:14 -05:00
Adam Locke	ad0364dc06	[DOCS] Add documentation for near real-time search (#57560 ) (#58138 ) * Adding documentation for near real-time search. * Adding link to NRT topic and clarifying some text. * Adding diagrams and incorporating changes from David T.	2020-06-15 16:42:57 -04:00
debadair	80524098fc	[DOCS] Reformat release highlights as What's new. (#58073 )	2020-06-15 13:26:03 -07:00
James Rodewig	e268a89ef2	[DOCS] Fix typo in data stream docs	2020-06-15 12:59:36 -04:00
Andrei Dan	3635bd741c	[DOCS] Make ILM documentation data stream aware (#58035 ) (#58110 ) Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> (cherry picked from commit 25cbbe56dd29fbee2efe8040e9c8b92d168cb670) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-15 15:16:14 +01:00
James Rodewig	0bc7c4f69e	[DOCS] Fix xref in data stream docs	2020-06-15 09:49:18 -04:00
Dan Hermann	8a910443c4	Add ignore_empty_value parameter in set ingest processor (#57030 ) (#58108 )	2020-06-15 08:35:08 -05:00
István Zoltán Szabó	c3e6aa65dc	[DOCS] Adds web session details example to painless transform examples (#57942 )	2020-06-15 15:19:02 +02:00
István Zoltán Szabó	3a5ee4476d	Merge branch '7.x' of github.com:elastic/elasticsearch into 7.x	2020-06-15 15:18:39 +02:00
James Rodewig	66c33c8c96	[DOCS] Update prohibited ops for ds write index Adds 'clone' and 'split' to a list of actions that are prohibited on a data stream's write index	2020-06-15 08:57:33 -04:00
James Rodewig	9392210cb5	[DOCS] Add get and delete steps to data stream setup tutorial (#58068 ) (#58107 ) Adds corresponding steps for the get and delete data stream APIs to the data stream setup tutorial. Also provides some guidance on how to determine the current write index for a data stream.	2020-06-15 08:52:43 -04:00

8906 Commits
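
The reduce-time merging described in the Variable Width Histogram commit message (03e6d1b535) can be sketched roughly as follows. This is a minimal illustration and not the code from that PR: the `Bucket` class and the functions `merge`, `reduce_buckets`, and `bucket_keys` are hypothetical names, and weighting the merged centroid by document count is an assumption. Only the "merge the closest centroids until the target bucket count is reached" loop and the midpoint rule `min[large] = (min[large] + max[small]) / 2` come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical shard-level cluster summary (not the PR's actual class)."""
    centroid: float
    min: float
    max: float
    doc_count: int


def merge(a: Bucket, b: Bucket) -> Bucket:
    # Assumption: the merged centroid is the doc-count-weighted mean.
    total = a.doc_count + b.doc_count
    centroid = (a.centroid * a.doc_count + b.centroid * b.doc_count) / total
    return Bucket(centroid, min(a.min, b.min), max(a.max, b.max), total)


def reduce_buckets(shard_buckets: list[Bucket], target: int) -> list[Bucket]:
    """Merge the two closest buckets (by centroid) until `target` remain."""
    buckets = sorted(shard_buckets, key=lambda b: b.centroid)
    while len(buckets) > target:
        # In one dimension the closest pair of centroids is always adjacent
        # once the buckets are sorted by centroid.
        i = min(
            range(len(buckets) - 1),
            key=lambda j: buckets[j + 1].centroid - buckets[j].centroid,
        )
        # The merged centroid lies between its two parents, so order is kept.
        buckets[i:i + 2] = [merge(buckets[i], buckets[i + 1])]
    return buckets


def bucket_keys(buckets: list[Bucket]) -> list[float]:
    """Use each bucket's min as its key, applying the midpoint rule on overlap."""
    keys = []
    for i, b in enumerate(buckets):
        key = b.min
        if i > 0 and buckets[i - 1].max > b.min:
            # Overlap with the bucket that has the smaller centroid.
            key = (b.min + buckets[i - 1].max) / 2
        keys.append(key)
    return keys


if __name__ == "__main__":
    from_two_shards = [
        Bucket(1.0, 0.0, 2.0, 10), Bucket(10.0, 9.0, 11.0, 10),  # shard A
        Bucket(2.0, 1.0, 3.0, 10), Bucket(20.0, 19.0, 21.0, 5),  # shard B
    ]
    reduced = reduce_buckets(from_two_shards, target=3)
    print(bucket_keys(reduced))
```

Because buckets are merged by centroid rather than by bounds, two reduced buckets can overlap; `bucket_keys` is where the sketch applies the midpoint adjustment to the bucket with the larger centroid.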