OpenSearch

Author	SHA1	Message	Date
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Christoph Büscher	978d1ed257	[Docs] Improve tuning for speed advice (#33315) This change merges two sections in the "Tune for search speed" documentation that recommend mapping numeric identifiers as keywords. Both sections contain mostly the same advice, so they can be merged. Closes #32733	2018-09-03 11:09:30 +02:00
DeDe Morton	ecd05d5be4	Use correct formatting for links (#29460)	2018-07-16 21:11:24 +02:00
Adrien Grand	21fe6159d4	Docs: remove notes on sparsity. (#30905) Sparsity is less of a concern since 6.0. Closes #30833	2018-06-05 08:58:52 +02:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224) Some features have been deprecated since `6.0`, like the `_parent` field or the ability to have multiple types per index. This allows removing quite a bit of code, which in turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
Sue Gallagher	3530a676e0	[Docs] Corrected spelling errors. (#28976)	2018-03-19 10:22:40 -07:00
Adrien Grand	89b4485511	Document how copy-to can help speed up queries by querying fewer fields. (#28373)	2018-01-31 15:03:54 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Adrien Grand	4e1ff8d086	Add documentation about disabling `_field_names`. (#26813) This field has significant index-time overhead. Closes #26779	2017-10-06 16:49:15 +02:00
Lee Hinman	cff904bf97	Enable adaptive replica selection by default (#26522) Relates to #24915	2017-09-07 09:25:05 -06:00
Lee Hinman	4157eead22	[DOCS] Add documentation for adaptive replica selection This adds a blurb for adaptive replica selection since it was previously undocumented. Relates to #24915	2017-09-01 09:53:22 -06:00
Jim Ferenczi	86d97971a4	Remove the _all metadata field (#26356) * Remove the _all metadata field This change removes the `_all` metadata field. This field is deprecated in 6 and cannot be activated for indices created in 6 so it can be safely removed in the next major version (e.g. 7).	2017-08-28 17:43:59 +02:00
Christoph Wurm	0120448f76	Expand How to tune for disk usage (#25562)	2017-08-21 12:07:54 -07:00
Clinton Gormley	25a89e613a	Broke recipes into separate pages	2017-07-17 18:21:39 +02:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658) Today if we search across a large number of shards we hit every shard. Yet, it's quite common to search across an index pattern for time-based indices but filtering will exclude all results outside a certain time range, i.e. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal a single dashboard in Kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the Kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of Elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
Adrien Grand	8c869e2a0b	More advices around search speed and disk usage. (#25252) It adds notes about: - how preference can help optimize cache usage - the fact that too many replicas can hurt search performance due to lower utilization of the filesystem cache - how index sorting can improve _source compression - how always putting fields in the same order in documents can improve _source compression	2017-06-16 11:23:40 +02:00
Adrien Grand	0c117145f6	Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222) This snapshot has faster range queries on range fields (LUCENE-7828), more accurate norms (LUCENE-7730) and the ability to use fake term frequencies (LUCENE-7854).	2017-06-15 09:52:07 +02:00
Adrien Grand	bbdf50f6bd	Docs: More search speed advices. (#24802)	2017-06-01 17:23:22 +02:00
Glen Smith	a590a22ea3	Add note and link to 'tune for disk usage' (#23252) * Add note and link to 'tune for disk usage' * Changed formatting as suggested Thanks, @clintongormley!	2017-02-20 20:31:19 +01:00
Elijah	3b92179e09	Improve wording in recipes docs This commit improves some of the wording the recipes docs. Relates #22661	2017-01-17 21:00:36 -05:00
Elijah	297b1b7d9a	Capitalize "Elasticsearch" in indexing speed docs This commit fixes the capitalization of "Elasticsearch" in the indexing speed docs. Relates #22659	2017-01-17 12:33:01 -05:00
Adrien Grand	52408fc389	Add a recommendation against large documents to the docs. (#21652)	2016-11-21 15:01:36 +01:00
Adrien Grand	68b0e395b2	Add recommendations about getting consistent scores despite shards and replicas. (#21167) This is a topic that has triggered many questions recently so it would be good to have these recommendations documented.	2016-11-02 10:50:38 +01:00
Adrien Grand	9cbbddb6dc	Add support for `quote_field_suffix` to `simple_query_string`. (#21060) Closes #18641	2016-10-28 09:11:57 +02:00
Pascal Borreli	fcb01deb34	Fixed typos (#20843)	2016-10-10 14:51:47 -06:00
Adrien Grand	cdc27b75b8	Add more information to the how-to docs. #20297 - use auto-generated ids for indexing #20211 - use rounded dates in queries #20115	2016-09-02 14:28:47 +02:00
Adrien Grand	398d70b567	Add `scaled_float`. #19264 This is an attempt to revive #15939 motivated by elastic/beats#1941. Half-floats are a pretty bad option for storing percentages. They would likely require 2 bytes all the time while they don't need more than one byte. So this PR exposes a new `scaled_float` type that requires a `scaling_factor` and internally indexes `value*scaling_factor` in a long field. Compared to the original PR it exposes a lower-level API so that the trade-offs are clearer and avoids any reference to fixed precision that might imply that this type is more accurate (actually it is *less* accurate). In addition to being more space-efficient for some use-cases that beats is interested in, this is also faster than `half_float` unless we can improve the efficiency of decoding half-float bits (which is currently done using software) or until Java gets first-class support for half-floats.	2016-07-18 12:36:23 +02:00
Jason Tedor	c05f818160	Fix casing of "Elasticsearch" in how-to docs	2016-07-07 12:33:27 -04:00
Adrien Grand	873661df17	Fix typo.	2016-07-07 17:49:01 +02:00
Adrien Grand	f295a218a0	Add notes about sparsity.	2016-07-07 17:47:19 +02:00
Tanguy Leroux	453a4b9647	Fix documentation typo in How-To docs	2016-06-27 14:49:37 +02:00
Adrien Grand	fbad3af352	Add a how-to section to the docs. #18998 This moves the "Performance Considerations for Elasticsearch Indexing" blog post to the reference guide and adds similar recommendations for tuning disk usage and search speed.	2016-06-24 10:58:33 +02:00

34 Commits