Add additional technical feedback to workloads (#4879)
* Add additional technical feedback to workloads

TO DO: Add Running Tasks in Parallel section.

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Fix indices table.

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Fix corpora page

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update _benchmark/workloads/corpora.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
parent f04a887a60
commit dc21de0f80

@@ -5,6 +5,8 @@ parent: Workload reference
 nav_order: 70
 ---
 
+# corpora
+
 The `corpora` element contains all the document corpora used by the workload. You can use document corpora across workloads by copying and pasting any corpora definitions.
 
 ## Example
@@ -32,23 +34,23 @@ Use the following options with `corpora`.
 
 Parameter | Required | Type | Description
 :--- | :--- | :--- | :---
-| `name` | Yes | String | The name of the document corpus. Because OpenSearch Benchmark uses this name in its directories, use only lowercase names without white spaces. |
-| `documents` | Yes | JSON array | An array of document files. |
-| `meta` | No | String | A mapping of key-value pairs with additional metadata for a corpus. |
+`name` | Yes | String | The name of the document corpus. Because OpenSearch Benchmark uses this name in its directories, use only lowercase names without white spaces.
+`documents` | Yes | JSON array | An array of document files.
+`meta` | No | String | A mapping of key-value pairs with additional metadata for a corpus.
 
 
 Each entry in the `documents` array consists of the following options.
 
 Parameter | Required | Type | Description
 :--- | :--- | :--- | :---
-| `source-file` | Yes | String | The file name containing the corresponding documents for the workload. When using OpenSearch Benchmark locally, documents are contained in a JSON file. When providing a `base_url`, use a compressed file format: `.zip`, `.bz2`, `.gz`, `.tar`, `.tar.gz`, `.tgz`, or `.tar.bz2`. The compressed file must have one JSON file containing the name. |
-| `document-count` | Yes | Integer | The number of documents in the `source-file`, which determines which client indices correlate to which parts of the document corpus. Each N client receives an Nth of the document corpus. When using a source that contains a document with a parent-child relationship, specify the number of parent documents. |
-| `base-url` | No | String | An http(s), Amazon Simple Storage Service (Amazon S3), or Google Cloud Storage URL that points to the root path where OpenSearch Benchmark can obtain the corresponding source file. |
-| `source-format` | No | String | Defines the format OpenSearch Benchmark uses to interpret the data file specified in `source-file`. Only `bulk` is supported. |
-| `compressed-bytes` | No | Integer | The size, in bytes, of the compressed source file, indicating how much data OpenSearch Benchmark downloads. |
-| `uncompressed-bytes` | No | Integer | The size, in bytes, of the source file after decompression, indicating how much disk space the decompressed source file needs. |
-| `target-index` | No | String | Defines the name of the index that the `bulk` operation should target. OpenSearch Benchmark automatically derives this value when only one index is defined in the `indices` element. The value of `target-index` is ignored when the `includes-action-and-meta-data` setting is `true`. |
-| `target-type` | No | String | Defines the document type of the target index targeted in bulk operations. OpenSearch Benchmark automatically derives this value when only one index is defined in the `indices` element and the index has only one type. The value of `target-type` is ignored when the `includes-action-and-meta-data` setting is `true`. |
-| `includes-action-and-meta-data` | No | Boolean | When set to `true`, indicates that the document's file already contains an `action` line and a `meta-data` line. When `false`, indicates that the document's file contains only documents. Default is `false`. |
-| `meta` | No | String | A mapping of key-value pairs with additional metadata for a corpus. |
+`source-file` | Yes | String | The file name containing the corresponding documents for the workload. When using OpenSearch Benchmark locally, documents are contained in a JSON file. When providing a `base_url`, use a compressed file format: `.zip`, `.bz2`, `.gz`, `.tar`, `.tar.gz`, `.tgz`, or `.tar.bz2`. The compressed file must have one JSON file containing the name.
+`document-count` | Yes | Integer | The number of documents in the `source-file`, which determines which client indexes correlate to which parts of the document corpus. Each N client receives an Nth of the document corpus. When using a source that contains a document with a parent-child relationship, specify the number of parent documents.
+`base-url` | No | String | An http(s), Amazon Simple Storage Service (Amazon S3), or Google Cloud Storage URL that points to the root path where OpenSearch Benchmark can obtain the corresponding source file.
+`source-format` | No | String | Defines the format OpenSearch Benchmark uses to interpret the data file specified in `source-file`. Only `bulk` is supported.
+`compressed-bytes` | No | Integer | The size, in bytes, of the compressed source file, indicating how much data OpenSearch Benchmark downloads.
+`uncompressed-bytes` | No | Integer | The size, in bytes, of the source file after decompression, indicating how much disk space the decompressed source file needs.
+`target-index` | No | String | Defines the name of the index that the `bulk` operation should target. OpenSearch Benchmark automatically derives this value when only one index is defined in the `indices` element. The value of `target-index` is ignored when the `includes-action-and-meta-data` setting is `true`.
+`target-type` | No | String | Defines the document type of the target index targeted in bulk operations. OpenSearch Benchmark automatically derives this value when only one index is defined in the `indices` element and the index has only one type. The value of `target-type` is ignored when the `includes-action-and-meta-data` setting is `true`.
+`includes-action-and-meta-data` | No | Boolean | When set to `true`, indicates that the document's file already contains an `action` line and a `meta-data` line. When `false`, indicates that the document's file contains only documents. Default is `false`.
+`meta` | No | String | A mapping of key-value pairs with additional metadata for a corpus.
 
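
The `document-count` row says that each of N clients receives an Nth of the document corpus. A minimal sketch of what such a split could look like follows. The function name `split_corpus` and the rule for handing out leftover documents are assumptions made for illustration; this is not OpenSearch Benchmark's actual implementation.

```python
def split_corpus(document_count: int, clients: int) -> list[tuple[int, int]]:
    """Return one (offset, size) pair per client, covering the corpus exactly once."""
    if clients < 1:
        raise ValueError("clients must be at least 1")
    base, extra = divmod(document_count, clients)
    ranges = []
    offset = 0
    for client in range(clients):
        # Assumed policy: the first `extra` clients each take one more document.
        size = base + (1 if client < extra else 0)
        ranges.append((offset, size))
        offset += size
    return ranges


# Hypothetical corpus of 1,000 documents shared by 8 clients.
for client, (offset, size) in enumerate(split_corpus(1000, 8)):
    print(f"client {client}: documents {offset}..{offset + size - 1}")
```

The split depends on `document-count` being accurate: a value larger than the real file would assign some clients ranges that do not exist.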
@@ -152,10 +152,10 @@ According to this schedule, the actions will run in the following order:
 2. The `cluster-health` operation assesses the health of the cluster before running the workload. In this example, the workload waits until the status of the cluster's health is `green`.
 - The `bulk` operation runs the `bulk` API to index `5000` documents simultaneously.
 - Before benchmarking, the workload waits until the specified `warmup-time-period` passes. In this example, the warmup period is `120` seconds.
-5. The `clients` option defines the number of clients that will run the remaining actions in the schedule concurrently.
+5. The `clients` field defines the number of clients that will run the remaining actions in the schedule concurrently.
 6. The `search` runs a `match_all` query to match all documents after they have been indexed by the `bulk` API using the 8 clients specified.
-- The `iterations` option indicates the number of times each client runs the `search` operation. The report generated by the benchmark automatically adjusts the percentile numbers based on this number. To generate a precise percentile, the benchmark needs to run at least 1,000 iterations.
-- Lastly, the `target-throughput` option defines the number of requests per second each client performs, which, when set, can help reduce the latency of the benchmark. For example, a `target-throughput` of 100 requests divided by 8 clients means that each client will issue 12 requests per second.
+- The `iterations` field indicates the number of times each client runs the `search` operation. The report generated by the benchmark automatically adjusts the percentile numbers based on this number. To generate a precise percentile, the benchmark needs to run at least 1,000 iterations.
+- Lastly, the `target-throughput` field defines the number of requests per second each client performs, which, when set, can help reduce the latency of the benchmark. For example, a `target-throughput` of 100 requests divided by 8 clients means that each client will issue 12 requests per second.
 
 
 ## More workload examples

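
The `target-throughput` example in the hunk above divides 100 requests per second across 8 clients. A short sketch of that arithmetic follows, assuming (as the example implies) that the target is a total shared evenly by all clients. The helper names are hypothetical. Note that 100 / 8 is 12.5, which the documentation text rounds down to 12.

```python
def per_client_rate(target_throughput: float, clients: int) -> float:
    """Requests per second each client issues when the target is shared evenly."""
    return target_throughput / clients


def seconds_between_requests(target_throughput: float, clients: int) -> float:
    """Average gap between two consecutive requests from the same client."""
    return clients / target_throughput


def total_requests(iterations: int, clients: int) -> int:
    """Total operations run when every client performs `iterations` runs."""
    return iterations * clients


rate = per_client_rate(100, 8)
gap = seconds_between_requests(100, 8)
print(f"{rate} requests/s per client, one request every {gap} s")
# 1,000 iterations is the minimum the text gives for precise percentiles.
print(total_requests(1000, 8), "search requests in total")
```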
@@ -5,6 +5,8 @@ parent: Workload reference
 nav_order: 65
 ---
 
+# indices
+
 The `indices` element contains a list of all indices used in the workload.
 
 ## Example
@@ -24,5 +26,5 @@ Use the following options with `indices`:
 
 Parameter | Required | Type | Description
 :--- | :--- | :--- | :---
-| `name` | Yes | String | The name of the index template. |
-| `body` | No | String | The file name corresponding to the index definition used in the body of the Create Index API. |
+`name` | Yes | String | The name of the index template.
+`body` | No | String | The file name corresponding to the index definition used in the body of the Create Index API.