Nhat Nguyen 15aa3764a4
Reduce recovery time with compress or secure transport (#36981)
Today file chunks are sent sequentially, one by one, in peer recovery. This is
a correct choice since the implementation is straightforward and recovery is
network-bound most of the time. However, if the connection is encrypted, we
might not be able to saturate the network pipe because encrypting and
decrypting are CPU-bound rather than network-bound.

With this commit, a source node can send multiple file chunks (defaults to 2)
without waiting for acknowledgments from the target.
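
A rough sketch of the idea (illustrative only, not the actual implementation;
the `Transport` interface below is hypothetical) is to cap the number of
unacknowledged chunk requests with a semaphore:

```java
import java.util.concurrent.Semaphore;

// Minimal sketch of pipelined chunk sending; not the Elasticsearch code.
// Up to maxConcurrentChunks requests may be in flight (unacknowledged) at once.
class ChunkSender {
    private final Semaphore inFlight;

    ChunkSender(int maxConcurrentChunks) {     // e.g. 2 by default
        this.inFlight = new Semaphore(maxConcurrentChunks);
    }

    void sendFile(Iterable<byte[]> chunks, Transport transport) throws InterruptedException {
        for (byte[] chunk : chunks) {
            inFlight.acquire();                               // blocks only when the window is full
            transport.sendAsync(chunk, inFlight::release);    // free a slot when the target acks
        }
    }

    /** Hypothetical transport abstraction, used only for this sketch. */
    interface Transport {
        void sendAsync(byte[] chunk, Runnable onAck);
    }
}
```

The point is that `acquire()` only blocks once the window of unacknowledged
chunks is full, so the next chunk can be read and encrypted while earlier
chunks are still on the wire.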

Below are the benchmark results for the PMC and NYC_taxis datasets.

- PMC (20.2 GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| -------- | -------- | -------- | -------- |
| Plain     | 184s     | 137s     | 106s     | 105s     | 106s     |
| TLS       | 346s     | 294s     | 176s     | 153s     | 117s     |
| Compress  | 1556s    | 1407s    | 1193s    | 1183s    | 1211s    |

- NYC_taxis (38.6 GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| ---------| ---------| ---------| -------- |
| Plain     | 321s     | 249s     | 191s     |  *       | *        |
| TLS       | 618s     | 539s     | 323s     | 290s     | 213s     |
| Compress  | 2622s    | 2421s    | 2018s    | 2029s    | n/a      |

Relates #33844
2019-01-14 15:14:46 -05:00

[[recovery]]
=== Indices Recovery
<<cat-recovery,Peer recovery>> is the process used to build a new copy of a
shard on a node by copying data from the primary. {es} uses this peer recovery
process to rebuild shard copies that are lost when a node fails, and uses the
same process when migrating a shard copy between nodes to rebalance the
cluster or to honor any changes to the <<modules-cluster,shard allocation
settings>>.

The following _expert_ settings can be set to manage the resources consumed by
peer recoveries:

`indices.recovery.max_bytes_per_sec`::
Limits the total inbound and outbound peer recovery traffic on each node.
Since this limit applies to each node, but there may be many nodes
performing peer recoveries concurrently, the total amount of peer recovery
traffic within a cluster may be much higher than this limit. If you set
this limit too high then there is a risk that ongoing peer recoveries will
consume an excess of bandwidth (or other resources) which could destabilize
the cluster. Defaults to `40mb`.

`indices.recovery.max_concurrent_file_chunks`::
Controls the number of file chunk requests that can be sent in parallel per
recovery. As multiple recoveries already run in parallel (controlled by
`cluster.routing.allocation.node_concurrent_recoveries`), increasing this
expert-level setting might only help in situations where peer recovery of a
single shard does not reach the traffic limit set by
`indices.recovery.max_bytes_per_sec` but is CPU-bound instead, typically when
using transport-level security or compression. Defaults to `2`.

These settings can be dynamically updated on a live cluster with the
<<cluster-update-settings,cluster-update-settings>> API, as shown in the
example below.
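
For example, an illustrative request that updates both settings transiently
(the values shown here are arbitrary examples, not recommendations):

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "80mb",
    "indices.recovery.max_concurrent_file_chunks": 4
  }
}
--------------------------------------------------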