Revises the current 'How to avoid oversharding' docs to incorporate information from our [shard sizing blog post][0].

Changes:

* Streamlines introduction
* Adds "Things to remember" section to describe how shards work
* Adds "Guidelines" section based on blog tips
* Creates a "Fix an oversharded cluster" section

[0]: https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
parent be5edcfb26
commit b814d10063
@ -25,4 +25,4 @@ include::how-to/search-speed.asciidoc[]
include::how-to/disk-usage.asciidoc[]

include::how-to/avoid-oversharding.asciidoc[]
include::how-to/size-your-shards.asciidoc[]
@ -1,148 +0,0 @@
[[avoid-oversharding]]
== Avoid oversharding

In some cases, reducing the number of shards in a cluster while maintaining the
same amount of data leads to a more effective use of system resources
(CPU, RAM, IO). In these situations, we consider the cluster _oversharded_.

The number of shards where this inflection point occurs depends on a variety
of factors, including:

* available hardware
* indexing load
* data volume
* the types of queries executed against the cluster
* the rate at which these queries are issued
* the volume of data being queried

Testing against production data with production queries on production hardware
is the only way to calibrate optimal shard sizes. Shard sizes of tens of GB
are commonly used, and this may be a useful starting point from which to
experiment. {kib}'s {kibana-ref}/elasticsearch-metrics.html[{es} monitoring]
provides a useful view of historical cluster performance when evaluating the
impact of different shard sizes.

[discrete]
[[oversharding-inefficient]]
=== Why oversharding is inefficient

Each segment has metadata that needs to be kept in heap memory. This metadata
includes lists of fields, the number of documents, and terms dictionaries. As a
shard grows in size, the size of its segments generally grows because smaller
segments are <<index-modules-merge,merged>> into fewer, larger segments. This
typically reduces the amount of heap required by a shard's segment metadata for
a given data volume. At a minimum, shards should be larger than 1GB to make the
most efficient use of memory.

However, even though shards start to be more memory efficient at around 1GB,
a cluster full of 1GB shards will likely still perform poorly. This is because
having many small shards can also have a negative impact on search and
indexing operations. Each query or indexing operation runs in a single
thread per shard of the indices being queried or indexed into. The node that
receives a request from a client becomes responsible for distributing that
request to the appropriate shards and for reducing the results from those
individual shards into a single response. Even assuming that a cluster has
sufficient <<modules-threadpool,search threadpool threads>> available to
immediately process the requested action against all shards required by the
request, the overhead of making network requests to the nodes holding those
shards and of merging the results from many small shards can lead to increased
latency. This in turn can lead to exhaustion of the threadpool and, as a
result, decreased throughput.

[discrete]
[[reduce-shard-counts-increase-shard-size]]
=== How to reduce shard counts and increase shard size

Try these methods to reduce oversharding.

[discrete]
[[reduce-shards-for-new-indices]]
==== Reduce the number of shards for new indices

You can specify the `index.number_of_shards` setting for new indices created
with the <<indices-create-index,create index API>> or as part of
<<indices-templates,index templates>> for indices automatically created by
<<index-lifecycle-management,{ilm} ({ilm-init})>>.

You can override the `index.number_of_shards` setting when rolling over an
index using the <<rollover-index-api-example,rollover index API>>.

[discrete]
[[create-larger-shards-by-increasing-rollover-thresholds]]
==== Create larger shards by increasing rollover thresholds

You can roll over indices using the
<<indices-rollover-index,rollover index API>> or by specifying the
<<ilm-rollover-action,rollover action>> in an {ilm-init} policy. If using an
{ilm-init} policy, increase the rollover condition thresholds (`max_age`,
`max_docs`, `max_size`) to allow the indices to grow to a larger size
before being rolled over, which creates larger shards.

Take special note of any empty indices. These may be managed by an {ilm-init}
policy that is rolling over the indices because the `max_age` threshold is met.
In this case, you may need to adjust the policy to make use of the `max_docs`
or `max_size` properties to prevent the creation of these empty indices. One
example where this may happen is if one or more {beats} stop sending data. If
the {ilm-init}-managed indices for those {beats} are configured to roll over
daily, then new, empty indices will be generated each day. Empty indices can
be identified using the <<cat-count,cat count API>>.

[discrete]
[[create-larger-shards-with-index-patterns]]
==== Create larger shards by using index patterns spanning longer time periods

Creating indices covering longer time periods reduces index and shard counts
while increasing index sizes. For example, instead of daily indices, you can
create monthly, or even yearly indices.

If creating indices using {ls}, the
{logstash-ref}/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-index[index]
property of the {es} output can be modified to a
<<date-math-index-names,date math expression>> covering a longer time period.
For example, use `logstash-%{+YYYY.MM}` instead of `logstash-%{+YYYY.MM.dd}`
to create monthly, rather than daily, indices. {beats} also lets you change the
date math expression defined in the `index` property of the {es} output, such
as for {filebeat-ref}/elasticsearch-output.html#index-option-es[Filebeat].

[discrete]
[[shrink-existing-index-to-fewer-shards]]
==== Shrink an existing index to fewer shards

You can use the <<indices-shrink-index,shrink index API>> to shrink an
existing index down to fewer shards.

<<index-lifecycle-management,{ilm}>> also has a
<<ilm-shrink-action,shrink action>> available for indices in the warm phase.

[discrete]
[[reindex-an-existing-index-to-fewer-shards]]
==== Reindex an existing index to fewer shards

You can use the <<docs-reindex,reindex API>> to reindex from an existing index
to a new index with fewer shards. After the data has been reindexed, the
oversharded index can be deleted.

[discrete]
[[reindex-indices-from-shorter-periods-into-longer-periods]]
==== Reindex indices from shorter periods into longer periods

You can use the <<docs-reindex,reindex API>> to reindex multiple small indices
covering shorter time periods into a larger index covering a longer time period.
For example, daily indices from October with naming patterns such as
`foo-2019.10.11` could be combined into a monthly `foo-2019.10` index,
like this:

[source,console]
--------------------------------------------------
POST /_reindex
{
  "source": {
    "index": "foo-2019.10.*"
  },
  "dest": {
    "index": "foo-2019.10"
  }
}
--------------------------------------------------
@ -0,0 +1,288 @@
[[size-your-shards]]
== How to size your shards
++++
<titleabbrev>Size your shards</titleabbrev>
++++

To protect against hardware failure and increase capacity, {es} stores copies of
an index's data across multiple shards on multiple nodes. The number and size of
these shards can have a significant impact on your cluster's health. One common
problem is _oversharding_, a situation in which a cluster with a large number of
shards becomes unstable.

[discrete]
[[create-a-sharding-strategy]]
=== Create a sharding strategy

The best way to prevent oversharding and other shard-related issues
is to create a sharding strategy. A sharding strategy helps you determine and
maintain the optimal number of shards for your cluster while limiting the size
of those shards.

Unfortunately, there is no one-size-fits-all sharding strategy. A strategy that
works in one environment may not scale in another. A good sharding strategy must
account for your infrastructure, use case, and performance expectations.

The best way to create a sharding strategy is to benchmark your production data
on production hardware using the same queries and indexing loads you'd see in
production. For our recommended methodology, watch the
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing[quantitative
cluster sizing video]. As you test different shard configurations, use {kib}'s
{kibana-ref}/elasticsearch-metrics.html[{es} monitoring tools] to track your
cluster's stability and performance.

The following sections provide some reminders and guidelines you should consider
when designing your sharding strategy. If your cluster has shard-related
problems, see <<fix-an-oversharded-cluster>>.

[discrete]
[[shard-sizing-considerations]]
=== Sizing considerations

Keep the following things in mind when building your sharding strategy.

[discrete]
[[single-thread-per-shard]]
==== Searches run on a single thread per shard

Most searches hit multiple shards. Each shard runs the search on a single
CPU thread. While a shard can run multiple concurrent searches, searches across a
large number of shards can deplete a node's <<modules-threadpool,search
thread pool>>. This can result in low throughput and slow search speeds.
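
As a rough health check, you can look for queued or rejected search threads,
which often indicate searches fanning out across too many shards. This example
uses the <<cat-thread-pool,cat thread pool API>>; the column selection shown is
just one possible choice.

[source,console]
----
GET /_cat/thread_pool/search?v&h=node_name,name,active,queue,rejected
----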

[discrete]
[[each-shard-has-overhead]]
==== Each shard has overhead

Every shard uses memory and CPU resources. In most cases, a small
set of large shards uses fewer resources than many small shards.

Segments play a big role in a shard's resource usage. Most shards contain
several segments, which store its index data. {es} keeps segment metadata in
<<heap-size,heap memory>> so it can be quickly retrieved for searches. As a
shard grows, its segments are <<index-modules-merge,merged>> into fewer, larger
segments. This decreases the number of segments, which means less metadata is
kept in heap memory.
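
If you want to see this overhead for yourself, the <<cat-segments,cat segments
API>> is one way to list a shard's segments and the heap memory their metadata
uses. The index name below is a placeholder.

[source,console]
----
GET /_cat/segments/my-index-000001?v
----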

[discrete]
[[shard-auto-balance]]
==== {es} automatically balances shards within a data tier

A cluster's nodes are grouped into data tiers. Within each tier, {es}
attempts to spread an index's shards across as many nodes as possible. When you
add a new node or a node fails, {es} automatically rebalances the index's shards
across the tier's remaining nodes.
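
To see how shards are currently spread across your nodes, one option is the
<<cat-allocation,cat allocation API>>, which lists each node's shard count and
disk usage.

[source,console]
----
GET /_cat/allocation?v
----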

[discrete]
[[shard-size-best-practices]]
=== Best practices

Where applicable, use the following best practices as starting points for your
sharding strategy.

[discrete]
[[delete-indices-not-documents]]
==== Delete indices, not documents

Deleted documents aren't immediately removed from {es}'s file system.
Instead, {es} marks the document as deleted on each related shard. The marked
document will continue to use resources until it's removed during a periodic
<<index-modules-merge,segment merge>>.

When possible, delete entire indices instead. {es} can immediately remove
deleted indices directly from the file system and free up resources.
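
For example, rather than removing expired documents from a shared index with
the <<docs-delete-by-query,delete by query API>>, you could store each period's
data in its own index and drop the whole index once it ages out. The index name
here is only an example.

[source,console]
----
DELETE /my-index-2099.10.05
----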

[discrete]
[[use-ds-ilm-for-time-series]]
==== Use data streams and {ilm-init} for time series data

<<data-streams,Data streams>> let you store time series data across multiple,
time-based backing indices. You can use <<index-lifecycle-management,{ilm}
({ilm-init})>> to automatically manage these backing indices.

[role="screenshot"]
image:images/ilm/index-lifecycle-policies.png[]

One advantage of this setup is
<<getting-started-index-lifecycle-management,automatic rollover>>, which creates
a new write index when the current one meets a defined `max_age`, `max_docs`, or
`max_size` threshold. You can use these thresholds to create indices based on
your retention intervals. When an index is no longer needed, you can use
{ilm-init} to automatically delete it and free up resources.
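
As a sketch, a policy along the following lines rolls over the write index when
either threshold is met and removes backing indices a fixed time after rollover.
The policy name and threshold values are placeholders; choose values that match
your own retention and shard-size targets.

[source,console]
----
PUT _ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----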

{ilm-init} also makes it easy to change your sharding strategy over time:

* *Want to decrease the shard count for new indices?* +
Change the <<index-number-of-shards,`index.number_of_shards`>> setting in the
data stream's <<data-streams-change-mappings-and-settings,matching index
template>>, as shown in the example after this list.

* *Want larger shards?* +
Increase your {ilm-init} policy's <<ilm-rollover,rollover threshold>>.

* *Need indices that span shorter intervals?* +
Offset the increased shard count by deleting older indices sooner. You can do
this by lowering the `min_age` threshold for your policy's
<<ilm-index-lifecycle,delete phase>>.

Every new backing index is an opportunity to further tune your strategy.
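
For example, an index template along these lines would give each new backing
index a single primary shard. The template and data stream names are
placeholders, and the change only affects backing indices created after the
template is updated.

[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "template": {
    "settings": {
      "index.number_of_shards": 1
    }
  }
}
----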

[discrete]
[[shard-size-recommendation]]
==== Aim for shard sizes between 10GB and 50GB

Shards larger than 50GB may make a cluster less likely to recover from failure.
When a node fails, {es} rebalances the node's shards across the data tier's
remaining nodes. Shards larger than 50GB can be harder to move across a network
and may tax node resources.
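
To spot shards that have drifted outside this range, you can list shards sorted
by store size. The column and sort parameters shown here are just one option.

[source,console]
----
GET _cat/shards?v&h=index,shard,prirep,store&s=store:desc
----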

[discrete]
[[shard-count-recommendation]]
==== Aim for 20 shards or fewer per GB of heap memory

The number of shards a node can hold is proportional to the node's
<<heap-size,heap memory>>. For example, a node with 30GB of heap memory should
have at most 600 shards. The further below this limit you can keep your nodes,
the better. If you find your nodes exceeding 20 shards per GB, consider adding
another node. You can use the <<cat-shards,cat shards API>> to check the number
of shards per node.

[source,console]
----
GET _cat/shards
----
// TEST[setup:my_index]

To use compressed pointers and save memory, we
recommend each node have a maximum heap size of 32GB or 50% of the node's
available memory, whichever is lower. See <<heap-size>>.

[discrete]
[[avoid-node-hotspots]]
==== Avoid node hotspots

If too many shards are allocated to a specific node, the node can become a
hotspot. For example, if a single node contains too many shards for an index
with a high indexing volume, the node is likely to have issues.

To prevent hotspots, use the
<<total-shards-per-node,`index.routing.allocation.total_shards_per_node`>> index
setting to explicitly limit the number of shards on a single node. You can
configure `index.routing.allocation.total_shards_per_node` using the
<<indices-update-settings,update index settings API>>.

[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
  "index" : {
    "routing.allocation.total_shards_per_node" : 5
  }
}
--------------------------------------------------
// TEST[setup:my_index]

[discrete]
[[fix-an-oversharded-cluster]]
=== Fix an oversharded cluster

If your cluster is experiencing stability issues due to oversharded indices,
you can use one or more of the following methods to fix them.

[discrete]
[[reindex-indices-from-shorter-periods-into-longer-periods]]
==== Create time-based indices that cover longer periods

For time series data, you can create indices that cover longer time intervals.
For example, instead of daily indices, you can create indices on a monthly or
yearly basis.

If you're using {ilm-init}, you can do this by increasing the `max_age`
threshold for the <<ilm-rollover,rollover action>>.

If your retention policy allows it, you can also create larger indices by
omitting a `max_age` threshold and using `max_docs` and/or `max_size`
thresholds instead.
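
As a sketch, a hot phase that rolls over on size and document count only, with
no `max_age` threshold, might look like this. The policy name and values are
placeholders.

[source,console]
----
PUT _ilm/policy/my-size-based-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_docs": 200000000
          }
        }
      }
    }
  }
}
----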

[discrete]
[[delete-empty-indices]]
==== Delete empty or unneeded indices

If you're using {ilm-init} and roll over indices based on a `max_age` threshold,
you can inadvertently create indices with no documents. These empty indices
provide no benefit but still consume resources.

You can find these empty indices using the <<cat-count,cat count API>>.

[source,console]
----
GET /_cat/count/my-index-000001?v
----
// TEST[setup:my_index]

Once you have a list of empty indices, you can delete them using the
<<indices-delete-index,delete index API>>. You can also delete any other
unneeded indices.

[source,console]
----
DELETE /my-index-*
----
// TEST[setup:my_index]

[discrete]
[[force-merge-during-off-peak-hours]]
==== Force merge during off-peak hours

If you no longer write to an index, you can use the <<indices-forcemerge,force
merge API>> to <<index-modules-merge,merge>> smaller segments into larger ones.
This can reduce shard overhead and improve search speeds. However, force merges
are resource-intensive. If possible, run the force merge during off-peak hours.

[source,console]
----
POST /my-index-000001/_forcemerge
----
// TEST[setup:my_index]

[discrete]
[[shrink-existing-index-to-fewer-shards]]
==== Shrink an existing index to fewer shards

If you no longer write to an index, you can use the
<<indices-shrink-index,shrink index API>> to reduce its shard count.

[source,console]
----
POST /my-index-000001/_shrink/my-shrunken-index-000001
----
// TEST[s/^/PUT my-index-000001\n{"settings":{"index.number_of_shards":2,"blocks.write":true}}\n/]

{ilm-init} also has a <<ilm-shrink-action,shrink action>> for indices in the
warm phase.

[discrete]
[[combine-smaller-indices]]
==== Combine smaller indices

You can also use the <<docs-reindex,reindex API>> to combine indices
with similar mappings into a single large index. For time series data, you could
reindex indices for short time periods into a new index covering a
longer period. For example, you could reindex daily indices from October with a
shared index pattern, such as `my-index-2099.10.11`, into a monthly
`my-index-2099.10` index. After the reindex, delete the smaller indices.

[source,console]
----
POST /_reindex
{
  "source": {
    "index": "my-index-2099.10.*"
  },
  "dest": {
    "index": "my-index-2099.10"
  }
}
----
@ -9,6 +9,7 @@ shards evenly.
The following _dynamic_ setting allows you to specify a hard limit on the total
number of shards from a single index allowed per node:

[[total-shards-per-node]]
`index.routing.allocation.total_shards_per_node`::

The maximum number of shards (replicas and primaries) that will be
@ -1173,3 +1173,8 @@ For other searchable snapshot APIs, see <<searchable-snapshots-apis>>.
=== Point in time API

See <<point-in-time-api>>.

[role="exclude",id="avoid-oversharding"]
=== Avoid oversharding

See <<size-your-shards>>.