Add post 2.10 remote store tweaks (#5118)

* Add post 2.10 remote store tweaks.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix comment about node types.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Apply suggestions from code review

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
commit f5ec6b536e
parent 39c0358851
Author: Naarcha-AWS
Date: 2023-10-03 08:49:02 -05:00 (committed by GitHub)
3 changed files with 21 additions and 18 deletions


@@ -123,7 +123,7 @@ Your results may vary based on your cluster topology, hardware, shard count, and
 For these benchmarks, we used the following cluster, shard, and test configuration:
-* Node types: Three nodes---one data, one ingest, and one cluster manager node
+* Nodes: Three nodes, each using the data, ingest, and cluster manager roles
 * Node instance: Amazon EC2 r6g.xlarge
 * OpenSearch Benchmark host: Single Amazon EC2 m5.2xlarge instance
 * Shard configuration: Three shards with one replica
@@ -133,10 +133,10 @@ For these benchmarks, we used the following cluster, shard, and test configuration:
 The following table lists the benchmarking results for the `so` workload with a remote translog buffer interval of 250 ms.
 
-| | |8 bulk indexing clients (Default) |16 bulk indexing clients |24 bulk indexing clients |
-|--- |--- |--- |--- |--- |
-| | | Document replication | Remote enabled |Percent difference | Document replication | Remote enabled | Percent difference |Document replication | Remote enabled | Percent difference |
-|Indexing throughput |Mean |29582.5 |40667.4 |37.47 |31154.9 |47862.3 |53.63 |31777.2 |51123.2 |60.88 |
+| | | 8 bulk indexing clients (Default) | | | 16 bulk indexing clients | | | 24 bulk indexing clients | | |
+|--- |--- |--- |--- |--- | --- | --- | --- | --- | --- | --- |
+| | | Document replication | Remote enabled | Percent difference | Document replication | Remote enabled | Percent difference | Document replication | Remote enabled | Percent difference |
+|Indexing throughput |Mean |29582.5 | 40667.4 |37.47 |31154.9 |47862.3 |53.63 |31777.2 |51123.2 |60.88 |
 |P50 |28915.4 |40343.4 |39.52 |30406.4 |47472.5 |56.13 |30852.1 |50547.2 |63.84 |
 |Indexing latency |P90 |1716.34 |1469.5 |-14.38 |3709.77 |2799.82 |-24.53 |5768.68 |3794.13 |-34.23 |
@@ -144,8 +144,8 @@ The following table lists the benchmarking results for the `so` workload with a remote translog buffer interval of 250 ms.
 The following table lists the benchmarking results for the `http_logs` workload with a remote translog buffer interval of 200 ms.
 
-| | |8 bulk indexing clients (Default) |16 bulk indexing clients |24 bulk indexing clients |
-|--- |--- |--- |--- |--- |
+| | | 8 bulk indexing clients (Default) | | | 16 bulk indexing clients | | | 24 bulk indexing clients | | |
+|--- |--- |--- |--- |--- | --- | --- | --- | --- | --- | --- |
 | | | Document replication | Remote enabled |Percent difference | Document replication | Remote enabled | Percent difference |Document replication | Remote enabled | Percent difference |
 |Indexing throughput |Mean |149062 |82198.7 |-44.86 |134696 |148749 |10.43 |133050 |197239 |48.24 |
 |P50 |148123 |81656.1 |-44.87 |133591 |148859 |11.43 |132872 |197455 |48.61 |
@@ -155,8 +155,8 @@ The following table lists the benchmarking results for the `http_logs` workload with a remote translog buffer interval of 200 ms.
 The following table lists the benchmarking results for the `http_logs` workload with a remote translog buffer interval of 250 ms.
 
-| | |8 bulk indexing clients (Default) |16 bulk indexing clients |24 bulk indexing clients |
-|--- |--- |--- |--- |--- |
+| | | 8 bulk indexing clients (Default) | | | 16 bulk indexing clients | | | 24 bulk indexing clients | | |
+|--- |--- |--- |--- |--- | --- | --- | --- | --- | --- | --- |
 | | | Document replication | Remote enabled |Percent difference | Document replication | Remote enabled | Percent difference |Document replication | Remote enabled | Percent difference |
 |Indexing throughput |Mean |93383.9 |94186.1 |0.86 |91624.8 |125770 |37.27 |93627.7 |132006 |40.99 |
 |P50 |91645.1 |93906.7 |2.47 |89659.8 |125443 |39.91 |91120.3 |132166 |45.05 |
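
The "Percent difference" columns in these tables are consistent with the relative change from document replication to remote-enabled, that is, (Remote enabled - Document replication) / Document replication × 100. As a check against the first table, the `so` workload's mean indexing throughput gives (40667.4 - 29582.5) / 29582.5 × 100 ≈ 37.47, matching the reported value.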


@@ -8,19 +8,22 @@ grand_parent: Availability and recovery
 
 # Shallow snapshots
 
-Shallow copy snapshots allow you to reference data from an entire remote-backed segment instead of storing all of the data from the segment in a snapshot. This makes accessing segment data faster than using normal snapshots because segment data is not stored in the snapshot repository.
+Shallow copy snapshots allow you to reference data from an entire remote-backed repository instead of storing all of the data from the segment in a snapshot repository. This makes accessing segment data faster than using normal snapshots because segment data is not stored in the snapshot repository.
 
 ## Enabling shallow snapshots
 
-Use the [Cluster Settings API]({{site.url}}{{site.baseurl}}/api-reference/cluster-api/cluster-settings/) to enable the `remote_store_index_shallow_copy` repository setting, as shown in the following example:
+Use the [Snapshot API]({{site.url}}{{site.baseurl}}/api-reference/snapshots/create-repository/) and set the `remote_store_index_shallow_copy` repository setting to `true` to enable shallow snapshot copies, as shown in the following example:
 
 ```bash
-PUT _cluster/settings
+PUT /_snapshot/snap_repo
 {
-  "persistent":{
-    "remote_store_index_shallow_copy": true
-  }
-}
+  "type": "s3",
+  "settings": {
+    "bucket": "test-bucket",
+    "base_path": "daily-snaps",
+    "remote_store_index_shallow_copy": true
+  }
+}
 ```
 {% include copy-curl.html %}
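
Once the repository is registered with `remote_store_index_shallow_copy` set to `true`, snapshots of remote-backed indexes taken in that repository are shallow copies. As a minimal sketch (the snapshot name `snapshot_1` and the `wait_for_completion` flag are illustrative, while `snap_repo` comes from the example above), taking a shallow snapshot looks the same as taking a standard one:

```bash
PUT /_snapshot/snap_repo/snapshot_1?wait_for_completion=true
```
{% include copy-curl.html %}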
@@ -32,5 +35,5 @@ Consider the following before using shallow copy snapshots:
 - Shallow copy snapshots only work for remote-backed indexes.
 - All nodes in the cluster must use OpenSearch 2.10 or later to take advantage of shallow copy snapshots.
 - There is no difference in file size between standard shards and shallow copy snapshot shards because no segment data is stored in the snapshot itself.
+- The `incremental` file count and size between the current snapshot and the last snapshot is `0` when using shallow copy snapshots.
 - Searchable snapshots are not supported inside shallow copy snapshots.
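
One way to observe the `incremental` behavior noted in this list is the snapshot status API, which reports file counts per snapshot; for a shallow copy snapshot, the incremental file count and size should read `0`. This sketch reuses the hypothetical repository and snapshot names from the earlier example:

```bash
GET /_snapshot/snap_repo/snapshot_1/_status
```
{% include copy-curl.html %}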


@@ -110,7 +110,7 @@ When using segment replication, consider the following:
 1. [Cross-cluster replication](https://github.com/opensearch-project/OpenSearch/issues/4090) does not currently use segment replication to copy between clusters.
 1. Segment replication is not compatible with [document-level monitors]({{site.url}}{{site.baseurl}}/observing-your-data/alerting/api/#document-level-monitors), which are used with the [Alerting]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/) and [Security Analytics]({{site.url}}{{site.baseurl}}/security-analytics/index/) plugins. The plugins also use the latest available data on replica shards when using the `immediate` refresh policy, and segment replication can delay the policy's availability, resulting in stale replica shards.
 1. Segment replication leads to increased network congestion on primary shards using node-to-node replication because replica shards fetch updates from the primary shard. With remote-backed storage, the primary shard can upload segments to, and the replicas can fetch updates from, the remote-backed storage. This helps offload responsibilities from the primary shard to the remote-backed storage.
-1. Read-after-write guarantees: Segment replication does not currently support setting the refresh policy to `wait_for`. If you set the `refresh` query parameter to `wait_for` and then ingest documents, you'll get a response only after the primary node has refreshed and made those documents searchable. Replica shards will respond only after having written to their local translog. If real-time reads are needed, consider using the [`get`]({{site.url}}{{site.baseurl}}/api-reference/document-apis/get-documents/) or [`mget`]({{site.url}}{{site.baseurl}}/api-reference/document-apis/multi-get/) API operations.
+1. Read-after-write guarantees: Segment replication does not currently support setting the refresh policy to `wait_for` or `true`. If you set the `refresh` query parameter to `wait_for` or `true` and then ingest documents, you'll get a response only after the primary node has refreshed and made those documents searchable. Replica shards will respond only after having written to their local translog. If real-time reads are needed, consider using the [`get`]({{site.url}}{{site.baseurl}}/api-reference/document-apis/get-documents/) or [`mget`]({{site.url}}{{site.baseurl}}/api-reference/document-apis/multi-get/) API operations.
 1. As of OpenSearch 2.10, system indexes support segment replication.
 1. Get, MultiGet, TermVector, and MultiTermVector requests serve strong reads by routing requests to the primary shards. Routing more requests to the primary shards may degrade performance as compared to distributing requests across primary and replica shards. To improve performance in read-heavy clusters, we recommend setting the `realtime` parameter in these requests to `false`. For more information, see [Issue #8700](https://github.com/opensearch-project/OpenSearch/issues/8700).
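
For the read-after-write and strong-read items above, a minimal sketch (the index name `my-index` and document ID `1` are illustrative): a default real-time GET routes to the primary shard and reflects writes that have not yet been refreshed, while `realtime=false` serves the read from the last refreshed state, avoiding the extra load on primary shards:

```bash
GET /my-index/_doc/1?realtime=false
```
{% include copy-curl.html %}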