Add editorial for both Benchmark sections (#4192)

* Benchmark editorial

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

Naarcha-AWS 2023-05-26 22:52:01 -05:00 committed by GitHub
parent 91ca0ffd55
commit 2a5864ed6b
9 changed files with 119 additions and 120 deletions


@ -7,11 +7,11 @@ parent: Command reference
# compare
The `compare` command helps you analyze the difference between two benchmark tests. This can help you analyze the performance impact of changes made from a previous test based on a specific Git revision.
## Usage
You can compare two different workload tests using their `TestExecution IDs`. To find a list of tests run from a specific workload, use `opensearch-benchmark list test_executions`. You should receive an output similar to the following:
```
@ -120,13 +120,13 @@ Query latency country_agg_cached (100.0 percentile) [ms] 3.42547 2.8681
## Options
You can use the following options to customize the results of your test comparison:
- `--baseline`: The baseline TestExecution ID used to compare the contender TestExecution.
- `--contender`: The TestExecution ID for the contender being compared to the baseline.
- `--results-format`: Defines the output format for the command line results, either `markdown` or `csv`. Default is `markdown`.
- `--results-number-align`: Defines the column number alignment for when the `compare` command outputs results. Default is `right`.
- `--results-file`: When provided a file path, writes the compare results to the file indicated in the path.
- `--show-in-results`: Determines whether or not to include the comparison in the results file.


@ -7,34 +7,34 @@ parent: Command reference
# download
Use the `download` command to select which OpenSearch distribution version to download.
## Usage
The following example downloads OpenSearch version 2.7.0:
```
opensearch-benchmark download --distribution-version=2.7.0
```
Benchmark then returns the location of the OpenSearch artifact:
```
{
"opensearch": "/Users/.benchmark/benchmarks/distributions/opensearch-2.7.0.tar.gz"
}
```
## Options
Use the following options to customize how OpenSearch Benchmark downloads OpenSearch:
- `--provision-config-repository`: Defines the repository from which OpenSearch Benchmark loads `provision-configs` and `provision-config-instances`.
- `--provision-config-revision`: Defines a specific Git revision in the `provision-config` that OpenSearch Benchmark should use.
- `--provision-config-path`: Defines the path to the `--provision-config-instance` and any OpenSearch plugin configurations to use.
- `--distribution-version`: Downloads the specified OpenSearch distribution based on version number. For a list of released OpenSearch versions, see [Version history](https://opensearch.org/docs/version-history/).
- `--distribution-repository`: Defines the repository from where the OpenSearch distribution should be downloaded. Default is `release`.
- `--provision-config-instance`: Defines the `--provision-config-instance` to use. You can view possible configuration instances using the command `opensearch-benchmark list provision-config-instances`.
- `--provision-config-instance-params`: A comma-separated list of key-value pairs injected verbatim as variables for the `provision-config-instance`.
- `--target-os`: The target operating system (OS) for which the OpenSearch artifact should be downloaded. Default is the current OS.
- `--target-arch`: The name of the CPU architecture for which an artifact should be downloaded.


@ -7,17 +7,17 @@ parent: Command reference
# execute-test
Whether you're using the included [OpenSearch Benchmark workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads) or a [custom workload]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/), use the `execute-test` command to gather data about the performance of your OpenSearch cluster according to the selected workload.
## Usage
The following example executes a test using the `geonames` workload in test mode:
```
opensearch-benchmark execute-test --workload=geonames --test-mode
```
After the test runs, OpenSearch Benchmark responds with a summary of the benchmark metrics:
```
------------------------------------------------------
@ -81,78 +81,78 @@ After the test runs, OpenSearch Benchmark responds with a summary of the benchma
## Options
Use the following options to customize the `execute-test` command for your use case. Options in this section are categorized by their use case.
### General settings
The following options shape how each test runs and how results appear:
- `--test-mode`: Runs the given workload in test mode, which is useful when checking a workload for errors.
- `--user-tag`: Defines user-specific key-value pairs to be used in metric record as meta information, for example, `intention:baseline-ticket-12345`.
- `--results-format`: Defines the output format for the command line results, either `markdown` or `csv`. Default is `markdown`.
- `--results-number-align`: Defines the column number alignment for when the `compare` command outputs results. Default is `right`.
- `--results-file`: When provided a file path, writes the compare results to the file indicated in the path.
- `--show-in-results`: Determines whether or not to include the comparison in the results file.
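Several of these options, such as `--user-tag`, take comma-separated `key:value` pairs. The following sketch illustrates how such a value decomposes; the helper is hypothetical and is not OpenSearch Benchmark's actual option parser:

```python
# Illustration of the comma-separated key:value format used by options such as
# --user-tag. Hypothetical helper, not OpenSearch Benchmark's actual parser.
def parse_key_value_pairs(option_value: str) -> dict:
    pairs = {}
    for item in option_value.split(","):
        # Split on the first colon only, so values may contain colons.
        key, _, value = item.partition(":")
        pairs[key.strip()] = value.strip()
    return pairs

print(parse_key_value_pairs("intention:baseline-ticket-12345,env:staging"))
# {'intention': 'baseline-ticket-12345', 'env': 'staging'}
```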
### Distributions
The following options set which version of OpenSearch and which OpenSearch plugins the benchmark test uses:
- `--distribution-version`: Downloads the specified OpenSearch distribution based on version number. For a list of released OpenSearch versions, see [Version history](https://opensearch.org/docs/version-history/).
- `--distribution-repository`: Defines the repository from where the OpenSearch distribution should be downloaded. Default is `release`.
- `--revision`: Defines the current source code revision to use for running a benchmark test. Default is `current`.
- `current`: Uses the source tree's current revision based on your OpenSearch distribution.
- `latest`: Fetches the latest revision from the main branch of the source tree.
- You can also use a timestamp or commit ID from the source tree. When using a timestamp, specify `@ts`, where "ts" is a valid ISO 8601 timestamp, for example, `@2013-07-27T10:37:00Z`.
- `--opensearch-plugins`: Defines which [OpenSearch plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/) to install. By default, no plugins are installed.
- `--plugin-params:` Defines a comma-separated list of key:value pairs that are injected verbatim into all plugins as variables.
- `--runtime-jdk`: The major version of JDK to use.
- `--client-options`: Defines a comma-separated list of clients to use. All options are passed to the OpenSearch Python client. Default is `timeout:60`.
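The `@ts` form of the `--revision` value can be produced from any point in time. The following is a minimal sketch (the helper name is hypothetical, not part of OpenSearch Benchmark) that formats a UTC datetime using the ISO 8601 layout from the example above:

```python
# Hypothetical helper: format a datetime as the @ts form of --revision,
# matching the ISO 8601 UTC example @2013-07-27T10:37:00Z.
from datetime import datetime, timezone

def revision_from_timestamp(dt: datetime) -> str:
    """Return an OpenSearch Benchmark --revision argument for a point in time."""
    return "@" + dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

arg = revision_from_timestamp(datetime(2013, 7, 27, 10, 37, tzinfo=timezone.utc))
print(arg)  # @2013-07-27T10:37:00Z
```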
### Cluster
The following option relates to the target cluster of the benchmark:
- `--target-hosts`: Defines a comma-separated list of host-port pairs that should be targeted if using the pipeline `benchmark-only`. Default is `localhost:9200`.
### Distributed workload generation
The following options help those who want to use multiple hosts to generate load on the benchmark cluster:
- `--load-worker-coordinator-hosts`: Defines a comma-separated list of hosts that coordinate loads. Default is `localhost`.
- `--enable-worker-coordinator-profiling`: Enables an analysis of the performance of OpenSearch Benchmark's worker coordinator. Default is `false`.
### Provisioning
The following options help customize how OpenSearch Benchmark provisions OpenSearch and workloads:
- `--provision-config-repository`: Defines the repository from which OpenSearch Benchmark loads `provision-configs` and `provision-config-instances`.
- `--provision-config-path`: Defines the path to the `--provision-config-instance` and any OpenSearch plugin configurations to use.
- `--provision-config-revision`: Defines a specific Git revision in the `provision-config` that OpenSearch Benchmark should use.
- `--provision-config-instance`: Defines the `--provision-config-instance` to use. You can view possible configuration instances using the command `opensearch-benchmark list provision-config-instances`.
- `--provision-config-instance-params`: A comma-separated list of key-value pairs injected verbatim as variables for the `provision-config-instance`.
### Workload
The following options determine which workload is used to run the test:
- `--workload-repository`: Defines the repository from which OpenSearch Benchmark loads workloads.
- `--workload-path`: Defines the path to a downloaded or custom workload.
- `--workload-revision`: Defines a specific revision from the workload source tree that OpenSearch Benchmark should use.
- `--workload`: Defines the workload to use based on the workload's name. You can find a list of preloaded workloads using `opensearch-benchmark list workloads`.
### Test procedures
The following options define what test procedures the test uses and which operations are contained inside the procedure:
- `--test-execution-id`: Defines a unique ID for this test run.
- `--test-procedure`: Defines a test procedure to use. You can find a list of test procedures using `opensearch-benchmark list test-procedures`.
- `--include-tasks`: Defines a comma-separated list of test procedure tasks to run. By default, all tasks listed in a test procedure array are run.
- `--exclude-tasks`: Defines a comma-separated list of test procedure tasks not to run.
- `--enable-assertions`: Enables assertion checks for tasks. Default is `false`.
### Pipelines
@ -161,18 +161,18 @@ The `--pipeline` option selects a pipeline to run. You can find a list of pipeli
### Telemetry
The following options enable telemetry devices on OpenSearch Benchmark:
- `--telemetry`: Enables the provided telemetry devices when the devices are provided using a comma-separated list. You can find a list of possible telemetry devices by using `opensearch-benchmark list telemetry`.
- `--telemetry-params`: Defines a comma-separated list of key-value pairs that are injected verbatim into the telemetry devices as parameters.
### Errors
The following options set how OpenSearch Benchmark handles errors when running tests:
- `--on-error`: Controls how OpenSearch Benchmark responds to errors. Default is `continue`.
- `continue`: Continues to run the test despite the error.
- `abort`: Aborts the test when an error occurs.
- `--preserve-install`: Keeps the Benchmark candidate and its index. Default is `false`.
- `--kill-running-processes`: When set to `true`, stops any OpenSearch Benchmark processes currently running and allows OpenSearch Benchmark to continue to run. Default is `false`.


@ -5,20 +5,20 @@ nav_order: 70
parent: Command reference
---
The `generate` command generates visualizations based on benchmark results.
## Usage
The following example generates a time-series chart, which outputs into the `.benchmark` directory:
```
opensearch-benchmark generate --chart-type="time-series"
```
## Options
The following options customize the visualization produced by the `generate` command:
- `--chart-spec-path`: Sets the path to the JSON files containing chart specifications that can be used to generate charts.
- `--chart-type`: Generates the indicated chart type, either `time-series` or `bar`. Default is `time-series`.
- `--output-path`: The path and name where the chart outputs. Default is `stdout`.


@ -9,18 +9,18 @@ has_children: true
This section provides a list of commands supported by OpenSearch Benchmark, including commonly used commands such as `execute-test` and `list`.
- [compare]({{site.url}}{{site.baseurl}}/benchmark/commands/compare/)
- [download]({{site.url}}{{site.baseurl}}/benchmark/commands/download/)
- [execute-test]({{site.url}}{{site.baseurl}}/benchmark/commands/execute-test/)
- [generate]({{site.url}}{{site.baseurl}}/benchmark/commands/generate/)
- [info]({{site.url}}{{site.baseurl}}/benchmark/commands/info/)
- [list]({{site.url}}{{site.baseurl}}/benchmark/commands/list/)
## List of common options
All OpenSearch Benchmark commands support the following options:
- `--h` or `--help`: Provides options and other useful information about each command.
- `--quiet`: Hides as much of the results output as possible. Default is `false`.
- `--offline`: Indicates whether OpenSearch Benchmark has a connection to the internet. Default is `false`.


@ -7,17 +7,17 @@ parent: Command reference
# info
The `info` command prints details about an OpenSearch Benchmark component.
## Usage
The following example returns information about a workload named `nyc_taxis`:
```
opensearch-benchmark info --workload=nyc_taxis
```
OpenSearch Benchmark returns information about the workload, as shown in the following example response:
```
____ _____ __ ____ __ __
@ -40,11 +40,11 @@ TestProcedure [searchable-snapshot]
Measuring performance for Searchable Snapshot feature. Based on the default test procedure 'append-no-conflicts'.
Schedule:
----------
1. delete-index
2. create-index
3. check-cluster-health
4. index (8 clients)
5. refresh-after-index
@ -57,17 +57,17 @@ Schedule:
12. wait-for-snapshot-creation
13. delete-local-index
14. restore-snapshot
15. default
16. range
17. distance_amount_agg
18. autohisto_agg
19. date_histogram_agg
====================================================
TestProcedure [append-no-conflicts] (run by default)
====================================================
Indexes the entire document corpus using a setup that will lead to a larger indexing throughput than the default settings and produce a smaller index (higher compression rate). Document IDs are unique, so all index operations are append only. After that, a couple of queries are run.
Schedule:
----------
@ -146,13 +146,13 @@ Schedule:
## Options
You can use the following options with the `info` command:
- `--workload-repository`: Defines the repository from which OpenSearch Benchmark loads workloads.
- `--workload-path`: Defines the path to a downloaded or custom workload.
- `--workload-revision`: Defines a specific revision from the workload source tree that OpenSearch Benchmark should use.
- `--workload`: Defines the workload to use based on the workload's name. You can find a list of preloaded workloads using `opensearch-benchmark list workloads`.
- `--test-procedure`: Defines a test procedure to use. You can find a list of test procedures using `opensearch-benchmark list test_procedures`.
- `--include-tasks`: Defines a comma-separated list of test procedure tasks to run. By default, all tasks listed in a test procedure array are run.
- `--exclude-tasks`: Defines a comma-separated list of test procedure tasks not to run.


@ -9,23 +9,23 @@ parent: Command reference
The `list` command lists the following elements used by OpenSearch Benchmark:
- `telemetry`: Telemetry devices
- `workloads`: Workloads
- `pipelines`: Pipelines
- `test_executions`: Single run of a workload
- `provision_config_instances`: Provisioned configuration instances
- `opensearch-plugins`: OpenSearch plugins
## Usage
The following example lists any workload test runs and detailed information about each test:
```
opensearch-benchmark list test_executions
```
OpenSearch Benchmark returns information about each test:
```
benchmark list test_executions
@ -60,11 +60,11 @@ ba643ed3-0db5-452e-a680-2b0dc0350cf2 20230522T224450Z geonames
## Options
You can use the following options with the `list` command:
- `--limit`: Limits the number of search results for recent test runs. Default is `10`.
- `--workload-repository`: Defines the repository from which OpenSearch Benchmark loads workloads.
- `--workload-path`: Defines the path to a downloaded or custom workload.
- `--workload-revision`: Defines a specific revision from the workload source tree that OpenSearch Benchmark should use.


@ -7,25 +7,25 @@ has_children: false
# Creating custom workloads
OpenSearch Benchmark includes a set of [workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads) that you can use to benchmark data from your cluster. Additionally, if you want to create a workload that is tailored to your own data, you can create a custom workload using one of the following options:
- [Creating a workload from an existing cluster](#creating-a-workload-from-an-existing-cluster)
- [Creating a workload without an existing cluster](#creating-a-workload-without-an-existing-cluster)
## Creating a workload from an existing cluster
If you already have an OpenSearch cluster with indexed data, use the following steps to create a custom workload for your cluster.
### Prerequisites
Before creating a custom workload, make sure you have the following prerequisites:
- An OpenSearch cluster with an index that contains 1,000 or more documents. If your cluster's index does not contain at least 1,000 documents, the workload can still run tests; however, you cannot run workloads using `--test-mode`.
- You must have the correct permissions to access your OpenSearch cluster. For more information about cluster permissions, see [Permissions]({{site.url}}{{site.baseurl}}/security/access-control/permissions/).
### Customizing the workload
To begin creating a custom workload, use the `opensearch-benchmark create-workload` command:
```
opensearch-benchmark create-workload \
@ -39,12 +39,12 @@ opensearch-benchmark create-workload \
Replace the following options in the preceding example with information specific to your existing cluster:
- `--workload`: A custom name for your custom workload.
- `--target-hosts`: A comma-separated list of host:port pairs for the cluster from which to extract data.
- `--client-options`: The basic authentication client options that OpenSearch Benchmark uses to access the cluster.
- `--indices`: One or more indexes inside your OpenSearch cluster that contain data.
- `--output-path`: The directory in which OpenSearch Benchmark creates the workload and its configuration files.
The following example response creates a workload named `movies` from a cluster with an index named `movies-info`. The `movies-info` index contains over 2,000 documents.
```
____ _____ __ ____ __ __
@ -67,13 +67,13 @@ Extracting documents for index [movies]... 2000/2000 docs [10
-------------------------------
```
As part of workload creation, OpenSearch Benchmark generates the following files. You can access them in the directory specified by the `--output-path` option.
- `workload.json`: Contains general workload specifications.
- `<index>.json`: Contains mappings and settings for the extracted indexes.
- `<index>-documents.json`: Contains the sources of every document from the extracted indexes. Any sources suffixed with `-1k` encompass only a fraction of the document corpus of the workload and are only used when running the workload in test mode.
By default, OpenSearch Benchmark does not contain a reference to generate queries. Because you have the best understanding of your data, we recommend adding a query to `workload.json` that matches your index's specifications. Use the following `match_all` query as an example of a query added to your workload:
```
{
@ -99,7 +99,7 @@ If you want to create a custom workload but do not have an existing OpenSearch c
To build a workload with source files, create a directory for your workload and perform the following steps:
1. Build a `<index>-documents.json` file that contains rows of documents that comprise the document corpora of the workload and houses all data to be ingested and queried into the cluster. The following example shows the first few rows of a `movies-documents.json` file that contains rows of documents about famous movies:
```json
# First few rows of movies-documents.json
@ -109,7 +109,7 @@ To build a workload with source files, create a directory for your workload and
{"title": "The Godfather: Part II", "director": "Francis Ford Coppola", "revenue": "$48,000,000 USD", "rating": "9 out of 10", "image_url": "https://imdb.com/images/7"}
```
2. In the same directory, build an `index.json` file. The workload uses this file as a reference for data mappings and index settings for the documents contained in `<index>-documents.json`. The following example creates mappings and settings specific to the `movie-documents.json` data from the previous step:
```json
{
@ -139,21 +139,21 @@ To build a workload with source files, create a directory for your workload and
}
```
3. Next, build a `workload.json` file that provides a high-level overview of your workload and determines how your workload runs benchmark tests. The `workload.json` file contains the following sections:
- `indices`: Defines the name of the index to be created in your OpenSearch cluster using the mappings from the workload's `index.json` file created in the previous step.
- `corpora`: Defines the corpora and the source file, including the following:
- `document-count`: The number of documents in `<index>-documents.json`. To get an accurate number of documents, run `wc -l <index>-documents.json`.
- `uncompressed-bytes`: The number of bytes inside the index. To get an accurate number of bytes, run `stat -f %z <index>-documents.json` on macOS or `stat -c %s <index>-documents.json` on GNU/Linux. Alternatively, run `ls -lrt | grep <index>-documents.json`.
- `schedule`: Defines the sequence of operations and available test procedures for the workload.
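Both corpus values can be generated directly from the source file. The following sketch creates a small stand-in `movies-documents.json` for illustration; in practice, run the same commands against your real `<index>-documents.json` file:

```shell
# Create a tiny stand-in corpus so the commands below have something to read.
printf '%s\n' \
  '{"title": "The Shawshank Redemption"}' \
  '{"title": "The Godfather"}' > movies-documents.json

# document-count: one JSON document per line, so the line count is the document count.
wc -l < movies-documents.json

# uncompressed-bytes: the file size in bytes (GNU/Linux).
# On macOS, use: stat -f %z movies-documents.json
stat -c %s movies-documents.json
```
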
The following example `workload.json` file provides the entry point for the `movies` workload. The `indices` section creates an index called `movies`. The `corpora` section refers to the source file created in step one, `movies-documents.json`, and provides the document count and the number of uncompressed bytes. Lastly, the `schedule` section defines a few operations the workload performs when invoked, including:
- Deleting any current index named `movies`.
- Creating an index named `movies` based on data from `movies-documents.json` and the mappings from `index.json`.
- Verifying that the cluster is in good health and can ingest the new index.
- Ingesting the data corpora from `workload.json` into the cluster.
- Querying the results.
```json
{
}
```
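A typo in any of the workload's JSON files will cause the workload to fail at load time, so it can help to confirm that each file parses cleanly first. This sketch uses Python's built-in `json.tool` module and assumes `python3` is on your path; the file name is a stand-in created for the example:

```shell
# Create a stand-in file; in practice, check workload.json, index.json,
# and <index>-documents.json from your workload directory.
cat > workload-check.json <<'EOF'
{"version": 2, "description": "example workload"}
EOF

# json.tool exits with a non-zero status on invalid JSON.
for f in workload-check.json; do
  python3 -m json.tool "$f" > /dev/null && echo "$f: valid JSON"
done
```
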
4. After creating all of the workload files, verify that the workload is functional by running a test. To do so, run the following command, replacing `--workload-path` with the path to your workload directory:
```
opensearch-benchmark list workloads --workload-path=</path/to/workload/>
```
## Invoking your custom workload
Use the `opensearch-benchmark execute-test` command to invoke your new workload and run a benchmark test against your OpenSearch cluster, as shown in the following example. Replace `--workload-path` with the path to your custom workload, `--target-host` with the `host:port` pairs for your cluster, and `--client-options` with any authorization options required to access the cluster.
```
opensearch-benchmark execute_test \
  --workload-path="/path/to/workload" \
  --target-host="<host>:<port>" \
  --client-options="<client-options>"
```
After using your custom workload several times, you might want to use the same workload but perform the workload's operations in a different order. Instead of creating a new workload or reorganizing the procedures directly, you can provide test procedures to vary workload operations.
To add variance to your workload operations, go to your `workload.json` file and replace the `schedule` section with a `test_procedures` array, as shown in the following example. Each item in the array contains the following:
- `name`: The name of the test procedure.
- `default`: When set to `true`, OpenSearch Benchmark defaults to the test procedure specified as `default` in the workload if no other test procedures are specified.
- `schedule`: All the operations the test procedure will run.
```json
{
  "test_procedures": [
    {
      "name": "index-and-query",
      "default": true,
      "schedule": [
        ...
      ]
    }
  ]
}
```
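Because OpenSearch Benchmark falls back to the procedure marked `default` when none is specified, it is worth double-checking which procedure that is in a workload that defines several. The following sketch reads the `test_procedures` array with a short `python3` one-liner; the file and procedure names are stand-ins:

```shell
# Stand-in workload file containing a test_procedures array.
cat > workload-procedures.json <<'EOF'
{
  "test_procedures": [
    {"name": "index-only", "default": false},
    {"name": "index-and-query", "default": true}
  ]
}
EOF

# Print the name of the default test procedure.
python3 -c '
import json
with open("workload-procedures.json") as f:
    workload = json.load(f)
for procedure in workload["test_procedures"]:
    if procedure.get("default"):
        print(procedure["name"])
'
```
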
### Separate operations and test procedures
If you want to make your `workload.json` file more readable, you can separate your operations and test procedures into different directories and reference the path to each in `workload.json`. To separate operations and procedures, perform the following steps:
1. Add all test procedures to a single file. You can give the file any name. Because the `movies` workload in the preceding example contains an index task and queries, this step names the test procedures file `index-and-query.json`.
2. Add all operations to a file named `operations.json`.
3. Reference the new files in `workloads.json` by adding the following syntax, replacing `parts` with the relative path to each file, as shown in the following example:
```json
"operations": [
    ...
],
"test_procedures": [
    ...
]
```
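After the split, a workload directory might look like the following. This sketch creates a hypothetical layout under `/tmp` and lists it; the directory and file names are illustrative assumptions, not requirements:

```shell
# Build a hypothetical separated-workload layout.
mkdir -p /tmp/movies-workload/operations /tmp/movies-workload/test_procedures
touch /tmp/movies-workload/workload.json \
      /tmp/movies-workload/index.json \
      /tmp/movies-workload/movies-documents.json \
      /tmp/movies-workload/operations/operations.json \
      /tmp/movies-workload/test_procedures/index-and-query.json

# List the resulting files.
find /tmp/movies-workload -type f | sort
```
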
## Next steps
- For more information about configuring OpenSearch Benchmark, see [Configuring OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/).
- To show a list of prepackaged workloads for OpenSearch Benchmark, see the [opensearch-benchmark-workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads) repository.
