Add new Benchmark IA (#5022)
* Rework Benchmark IA
* Add new Benchmark IA
* Add Quickstart steps and Sigv4 guide
* Add tutorial text
* Fix links
* Add technical feedback
* Apply suggestions from code review
* Remove section
* Update quickstart.md
* Update concepts.md

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
This commit is contained in:
parent 70be12b867
commit 796076bb04

@@ -17,17 +17,18 @@ OpenSearch Benchmark is a macrobenchmark utility provided by the [OpenSearch Pro
OpenSearch Benchmark can be installed directly on a compatible host running Linux or macOS. You can also run OpenSearch Benchmark in a Docker container. See [Installing OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/installing-benchmark/) for more information.

The OpenSearch Benchmark documentation is split into five sections:

- [Quickstart]({{site.url}}{{site.baseurl}}/benchmark/quickstart/): Learn how to quickly install and run OpenSearch Benchmark.
- [User guide]({{site.url}}{{site.baseurl}}/benchmark/user-guide/index/): Dive deep into how OpenSearch Benchmark can help you track the performance of your cluster.
- [Tutorials]({{site.url}}{{site.baseurl}}/benchmark/tutorials/index/): Use step-by-step guides for more advanced benchmarking configurations and functionality.
- [Commands]({{site.url}}{{site.baseurl}}/benchmark/commands/index/): A detailed reference of the commands and command options supported by OpenSearch Benchmark.
- [Workloads]({{site.url}}{{site.baseurl}}/benchmark/workloads/index/): A detailed reference of the options available for both default and custom workloads.

## Concepts

Before using OpenSearch Benchmark, familiarize yourself with the following concepts:

- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workload runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads inside the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/).

- **Pipeline**: A series of steps before and after a workload is run that determines benchmark results. OpenSearch Benchmark supports three pipelines:
  - `from-sources`: Builds and provisions OpenSearch, runs a benchmark, and then publishes the results.
  - `from-distribution`: Downloads an OpenSearch distribution, provisions it, runs a benchmark, and then publishes the results.
  - `benchmark-only`: The default pipeline. Assumes an already running OpenSearch instance, runs a benchmark on that instance, and then publishes the results.

- **Test**: A single invocation of the OpenSearch Benchmark binary.

The following diagram visualizes how OpenSearch Benchmark works when run against a local host:

![Benchmark workflow]({{site.url}}{{site.baseurl}}/images/benchmark/OSB-workflow.png)

@@ -0,0 +1,401 @@
---
layout: default
title: Quickstart
nav_order: 2
---

# OpenSearch Benchmark quickstart

This tutorial outlines how to quickly install OpenSearch Benchmark and run your first OpenSearch Benchmark workload.

## Prerequisites

To perform the quickstart steps, you'll need to fulfill the following prerequisites:

- A currently active OpenSearch cluster. For instructions on how to create an OpenSearch cluster, see [Creating a cluster]({{site.url}}{{site.baseurl}}/tuning-your-cluster/index/).
- Git 2.3 or greater.

Additional prerequisites are required, depending on your installation method:

- If you plan to install OpenSearch Benchmark with [PyPI](https://pypi.org/), install Python 3.8 or later.
- If you plan to install OpenSearch Benchmark using Docker, install Docker.
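
As a quick sanity check, you can confirm the required tool versions from a terminal before continuing. The Docker check applies only if you choose the Docker installation path:

```bash
git --version       # 2.3 or greater
python3 --version   # 3.8 or later, for a PyPI install
docker --version    # only required for a Docker install
```
{% include copy.html %}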

## Installing OpenSearch Benchmark

You can install OpenSearch Benchmark using either PyPI or Docker.

If you plan to run OpenSearch Benchmark with a cluster using AWS Signature Version 4, see [AWS Signature Version 4 support]({{site.url}}{{site.baseurl}}/benchmark/tutorials/sigv4/).

### PyPI

To install OpenSearch Benchmark with PyPI, enter the following `pip` command:

```bash
pip3 install opensearch-benchmark
```
{% include copy.html %}

After the installation completes, verify that OpenSearch Benchmark is running by entering the following command:

```bash
opensearch-benchmark --help
```
{% include copy.html %}

If successful, OpenSearch Benchmark returns the following response:

```bash
$ opensearch-benchmark --help
usage: opensearch-benchmark [-h] [--version] {execute-test,list,info,create-workload,generate,compare,download,install,start,stop} ...

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/   /_/|_|
    /_/

A benchmarking tool for OpenSearch

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

subcommands:
  {execute-test,list,info,create-workload,generate,compare,download,install,start,stop}
    execute-test        Run a benchmark
    list                List configuration options
    info                Show info about a workload
    create-workload     Create a Benchmark workload from existing data
    generate            Generate artifacts
    compare             Compare two test_executions
    download            Downloads an artifact
    install             Installs an OpenSearch node locally
    start               Starts an OpenSearch node locally
    stop                Stops an OpenSearch node locally

Find out more about Benchmark at https://opensearch.org/docs
```

### Docker

To pull the image from Docker Hub, run the following command:

```bash
docker pull opensearchproject/opensearch-benchmark:latest
```
{% include copy.html %}

Then run the Docker image:

```bash
docker run opensearchproject/opensearch-benchmark -h
```
{% include copy.html %}

## Running your first benchmark

You can now run your first benchmark. For your first benchmark, you'll use the [geonames](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/geonames) workload.

### Understanding workload command flags

Benchmarks are run using the [`execute-test`]({{site.url}}{{site.baseurl}}/benchmark/commands/execute-test/) command with the following command flags:

* `--pipeline=benchmark-only`: Informs OpenSearch Benchmark that the user will provide their own OpenSearch cluster.
* `--workload=geonames`: The name of the workload used by OpenSearch Benchmark.
* `--target-host="<OpenSearch Cluster Endpoint>"`: Indicates the target cluster or host that will be benchmarked. Enter the endpoint of your OpenSearch cluster here.
* `--client-options="basic_auth_user:'<Basic Auth Username>',basic_auth_password:'<Basic Auth Password>'"`: The username and password for your OpenSearch cluster.
* `--test-mode`: Allows you to run the workload without running it for the entire duration. When this flag is present, Benchmark runs the first thousand operations of each task in the workload. This is only meant for sanity checks---the metrics produced are meaningless.

For additional `execute-test` command flags, see the [execute-test]({{site.url}}{{site.baseurl}}/benchmark/commands/execute-test/) reference. Some commonly used options are `--workload-params`, `--exclude-tasks`, and `--include-tasks`.
{: .tip}

The `--distribution-version` flag indicates which OpenSearch version Benchmark will use when provisioning a new cluster. When run against an existing cluster, the `execute-test` command parses the correct distribution version when it connects to the OpenSearch cluster, so you don't need to pass it here.
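
If you instead want a PyPI-installed Benchmark to provision the cluster itself, you can pass the flag explicitly. A minimal sketch, assuming OpenSearch 2.3.0:

```bash
opensearch-benchmark execute-test --distribution-version=2.3.0 --workload=geonames --test-mode
```
{% include copy.html %}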

### Running the workload

If you installed Benchmark with PyPI, customize and use the following command:

```bash
opensearch-benchmark execute-test --pipeline=benchmark-only --workload=geonames --target-host="<OpenSearch Cluster Endpoint>" --client-options="basic_auth_user:'<Basic Auth Username>',basic_auth_password:'<Basic Auth Password>'" --test-mode
```
{% include copy.html %}

If you installed Benchmark with Docker, customize and use the following command:

```bash
docker run opensearchproject/opensearch-benchmark execute-test --pipeline=benchmark-only --workload=geonames --target-host="<OpenSearch Cluster Endpoint>" --client-options="basic_auth_user:'<Basic Auth Username>',basic_auth_password:'<Basic Auth Password>'" --test-mode
```
{% include copy.html %}

When the `execute-test` command runs, all tasks and operations in the `geonames` workload run sequentially.

### Understanding the results

Benchmark returns the following response once the benchmark completes:

```bash
-----------------------------------------------------
    _______            __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-------------------------------:|------------:|--------:|
| Cumulative indexing time of primary shards | | 0.0359333 | min |
| Min cumulative indexing time across primary shards | | 0.00453333 | min |
| Median cumulative indexing time across primary shards | | 0.00726667 | min |
| Max cumulative indexing time across primary shards | | 0.00878333 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0 | min |
| Cumulative merge count of primary shards | | 0 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.00728333 | min |
| Cumulative refresh count of primary shards | | 35 | |
| Min cumulative refresh time across primary shards | | 0.000966667 | min |
| Median cumulative refresh time across primary shards | | 0.00136667 | min |
| Max cumulative refresh time across primary shards | | 0.00236667 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 0 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0.01 | s |
| Total Young Gen GC count | | 1 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.00046468 | GB |
| Translog size | | 2.56114e-07 | GB |
| Heap used for segments | | 0.113216 | MB |
| Heap used for doc values | | 0.0171394 | MB |
| Heap used for terms | | 0.0777283 | MB |
| Heap used for norms | | 0.010437 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.00791168 | MB |
| Segment count | | 17 | |
| Min Throughput | index-append | 1879.5 | docs/s |
| Mean Throughput | index-append | 1879.5 | docs/s |
| Median Throughput | index-append | 1879.5 | docs/s |
| Max Throughput | index-append | 1879.5 | docs/s |
| 50th percentile latency | index-append | 505.028 | ms |
| 100th percentile latency | index-append | 597.718 | ms |
| 50th percentile service time | index-append | 505.028 | ms |
| 100th percentile service time | index-append | 597.718 | ms |
| error rate | index-append | 0 | % |
| Min Throughput | wait-until-merges-finish | 43.82 | ops/s |
| Mean Throughput | wait-until-merges-finish | 43.82 | ops/s |
| Median Throughput | wait-until-merges-finish | 43.82 | ops/s |
| Max Throughput | wait-until-merges-finish | 43.82 | ops/s |
| 100th percentile latency | wait-until-merges-finish | 22.2577 | ms |
| 100th percentile service time | wait-until-merges-finish | 22.2577 | ms |
| error rate | wait-until-merges-finish | 0 | % |
| Min Throughput | index-stats | 58.04 | ops/s |
| Mean Throughput | index-stats | 58.04 | ops/s |
| Median Throughput | index-stats | 58.04 | ops/s |
| Max Throughput | index-stats | 58.04 | ops/s |
| 100th percentile latency | index-stats | 24.891 | ms |
| 100th percentile service time | index-stats | 7.02568 | ms |
| error rate | index-stats | 0 | % |
| Min Throughput | node-stats | 51.21 | ops/s |
| Mean Throughput | node-stats | 51.21 | ops/s |
| Median Throughput | node-stats | 51.21 | ops/s |
| Max Throughput | node-stats | 51.21 | ops/s |
| 100th percentile latency | node-stats | 26.4279 | ms |
| 100th percentile service time | node-stats | 6.38569 | ms |
| error rate | node-stats | 0 | % |
| Min Throughput | default | 14.03 | ops/s |
| Mean Throughput | default | 14.03 | ops/s |
| Median Throughput | default | 14.03 | ops/s |
| Max Throughput | default | 14.03 | ops/s |
| 100th percentile latency | default | 78.9157 | ms |
| 100th percentile service time | default | 7.30501 | ms |
| error rate | default | 0 | % |
| Min Throughput | term | 59.96 | ops/s |
| Mean Throughput | term | 59.96 | ops/s |
| Median Throughput | term | 59.96 | ops/s |
| Max Throughput | term | 59.96 | ops/s |
| 100th percentile latency | term | 22.4626 | ms |
| 100th percentile service time | term | 5.38508 | ms |
| error rate | term | 0 | % |
| Min Throughput | phrase | 44.66 | ops/s |
| Mean Throughput | phrase | 44.66 | ops/s |
| Median Throughput | phrase | 44.66 | ops/s |
| Max Throughput | phrase | 44.66 | ops/s |
| 100th percentile latency | phrase | 27.4984 | ms |
| 100th percentile service time | phrase | 4.81552 | ms |
| error rate | phrase | 0 | % |
| Min Throughput | country_agg_uncached | 16.16 | ops/s |
| Mean Throughput | country_agg_uncached | 16.16 | ops/s |
| Median Throughput | country_agg_uncached | 16.16 | ops/s |
| Max Throughput | country_agg_uncached | 16.16 | ops/s |
| 100th percentile latency | country_agg_uncached | 67.5527 | ms |
| 100th percentile service time | country_agg_uncached | 5.40069 | ms |
| error rate | country_agg_uncached | 0 | % |
| Min Throughput | country_agg_cached | 49.31 | ops/s |
| Mean Throughput | country_agg_cached | 49.31 | ops/s |
| Median Throughput | country_agg_cached | 49.31 | ops/s |
| Max Throughput | country_agg_cached | 49.31 | ops/s |
| 100th percentile latency | country_agg_cached | 38.2485 | ms |
| 100th percentile service time | country_agg_cached | 17.6579 | ms |
| error rate | country_agg_cached | 0 | % |
| Min Throughput | scroll | 29.76 | pages/s |
| Mean Throughput | scroll | 29.76 | pages/s |
| Median Throughput | scroll | 29.76 | pages/s |
| Max Throughput | scroll | 29.76 | pages/s |
| 100th percentile latency | scroll | 93.1197 | ms |
| 100th percentile service time | scroll | 25.3068 | ms |
| error rate | scroll | 0 | % |
| Min Throughput | expression | 8.32 | ops/s |
| Mean Throughput | expression | 8.32 | ops/s |
| Median Throughput | expression | 8.32 | ops/s |
| Max Throughput | expression | 8.32 | ops/s |
| 100th percentile latency | expression | 127.701 | ms |
| 100th percentile service time | expression | 7.30691 | ms |
| error rate | expression | 0 | % |
| Min Throughput | painless_static | 6.2 | ops/s |
| Mean Throughput | painless_static | 6.2 | ops/s |
| Median Throughput | painless_static | 6.2 | ops/s |
| Max Throughput | painless_static | 6.2 | ops/s |
| 100th percentile latency | painless_static | 167.239 | ms |
| 100th percentile service time | painless_static | 5.76951 | ms |
| error rate | painless_static | 0 | % |
| Min Throughput | painless_dynamic | 19.56 | ops/s |
| Mean Throughput | painless_dynamic | 19.56 | ops/s |
| Median Throughput | painless_dynamic | 19.56 | ops/s |
| Max Throughput | painless_dynamic | 19.56 | ops/s |
| 100th percentile latency | painless_dynamic | 56.9046 | ms |
| 100th percentile service time | painless_dynamic | 5.50498 | ms |
| error rate | painless_dynamic | 0 | % |
| Min Throughput | decay_geo_gauss_function_score | 50.28 | ops/s |
| Mean Throughput | decay_geo_gauss_function_score | 50.28 | ops/s |
| Median Throughput | decay_geo_gauss_function_score | 50.28 | ops/s |
| Max Throughput | decay_geo_gauss_function_score | 50.28 | ops/s |
| 100th percentile latency | decay_geo_gauss_function_score | 25.9491 | ms |
| 100th percentile service time | decay_geo_gauss_function_score | 5.7773 | ms |
| error rate | decay_geo_gauss_function_score | 0 | % |
| Min Throughput | decay_geo_gauss_script_score | 28.96 | ops/s |
| Mean Throughput | decay_geo_gauss_script_score | 28.96 | ops/s |
| Median Throughput | decay_geo_gauss_script_score | 28.96 | ops/s |
| Max Throughput | decay_geo_gauss_script_score | 28.96 | ops/s |
| 100th percentile latency | decay_geo_gauss_script_score | 41.179 | ms |
| 100th percentile service time | decay_geo_gauss_script_score | 6.20007 | ms |
| error rate | decay_geo_gauss_script_score | 0 | % |
| Min Throughput | field_value_function_score | 52.97 | ops/s |
| Mean Throughput | field_value_function_score | 52.97 | ops/s |
| Median Throughput | field_value_function_score | 52.97 | ops/s |
| Max Throughput | field_value_function_score | 52.97 | ops/s |
| 100th percentile latency | field_value_function_score | 25.9004 | ms |
| 100th percentile service time | field_value_function_score | 6.68765 | ms |
| error rate | field_value_function_score | 0 | % |
| Min Throughput | field_value_script_score | 35.24 | ops/s |
| Mean Throughput | field_value_script_score | 35.24 | ops/s |
| Median Throughput | field_value_script_score | 35.24 | ops/s |
| Max Throughput | field_value_script_score | 35.24 | ops/s |
| 100th percentile latency | field_value_script_score | 34.2866 | ms |
| 100th percentile service time | field_value_script_score | 5.63202 | ms |
| error rate | field_value_script_score | 0 | % |
| Min Throughput | large_terms | 1.05 | ops/s |
| Mean Throughput | large_terms | 1.05 | ops/s |
| Median Throughput | large_terms | 1.05 | ops/s |
| Max Throughput | large_terms | 1.05 | ops/s |
| 100th percentile latency | large_terms | 1220.12 | ms |
| 100th percentile service time | large_terms | 256.856 | ms |
| error rate | large_terms | 0 | % |
| Min Throughput | large_filtered_terms | 4.11 | ops/s |
| Mean Throughput | large_filtered_terms | 4.11 | ops/s |
| Median Throughput | large_filtered_terms | 4.11 | ops/s |
| Max Throughput | large_filtered_terms | 4.11 | ops/s |
| 100th percentile latency | large_filtered_terms | 389.415 | ms |
| 100th percentile service time | large_filtered_terms | 137.216 | ms |
| error rate | large_filtered_terms | 0 | % |
| Min Throughput | large_prohibited_terms | 5.68 | ops/s |
| Mean Throughput | large_prohibited_terms | 5.68 | ops/s |
| Median Throughput | large_prohibited_terms | 5.68 | ops/s |
| Max Throughput | large_prohibited_terms | 5.68 | ops/s |
| 100th percentile latency | large_prohibited_terms | 352.926 | ms |
| 100th percentile service time | large_prohibited_terms | 169.633 | ms |
| error rate | large_prohibited_terms | 0 | % |
| Min Throughput | desc_sort_population | 42.48 | ops/s |
| Mean Throughput | desc_sort_population | 42.48 | ops/s |
| Median Throughput | desc_sort_population | 42.48 | ops/s |
| Max Throughput | desc_sort_population | 42.48 | ops/s |
| 100th percentile latency | desc_sort_population | 28.6485 | ms |
| 100th percentile service time | desc_sort_population | 4.82649 | ms |
| error rate | desc_sort_population | 0 | % |
| Min Throughput | asc_sort_population | 49.06 | ops/s |
| Mean Throughput | asc_sort_population | 49.06 | ops/s |
| Median Throughput | asc_sort_population | 49.06 | ops/s |
| Max Throughput | asc_sort_population | 49.06 | ops/s |
| 100th percentile latency | asc_sort_population | 30.7929 | ms |
| 100th percentile service time | asc_sort_population | 10.0023 | ms |
| error rate | asc_sort_population | 0 | % |
| Min Throughput | asc_sort_with_after_population | 55.9 | ops/s |
| Mean Throughput | asc_sort_with_after_population | 55.9 | ops/s |
| Median Throughput | asc_sort_with_after_population | 55.9 | ops/s |
| Max Throughput | asc_sort_with_after_population | 55.9 | ops/s |
| 100th percentile latency | asc_sort_with_after_population | 25.413 | ms |
| 100th percentile service time | asc_sort_with_after_population | 7.00911 | ms |
| error rate | asc_sort_with_after_population | 0 | % |
| Min Throughput | desc_sort_geonameid | 63.86 | ops/s |
| Mean Throughput | desc_sort_geonameid | 63.86 | ops/s |
| Median Throughput | desc_sort_geonameid | 63.86 | ops/s |
| Max Throughput | desc_sort_geonameid | 63.86 | ops/s |
| 100th percentile latency | desc_sort_geonameid | 21.3566 | ms |
| 100th percentile service time | desc_sort_geonameid | 5.41555 | ms |
| error rate | desc_sort_geonameid | 0 | % |
| Min Throughput | desc_sort_with_after_geonameid | 58.36 | ops/s |
| Mean Throughput | desc_sort_with_after_geonameid | 58.36 | ops/s |
| Median Throughput | desc_sort_with_after_geonameid | 58.36 | ops/s |
| Max Throughput | desc_sort_with_after_geonameid | 58.36 | ops/s |
| 100th percentile latency | desc_sort_with_after_geonameid | 24.3476 | ms |
| 100th percentile service time | desc_sort_with_after_geonameid | 6.81395 | ms |
| error rate | desc_sort_with_after_geonameid | 0 | % |
| Min Throughput | asc_sort_geonameid | 69.44 | ops/s |
| Mean Throughput | asc_sort_geonameid | 69.44 | ops/s |
| Median Throughput | asc_sort_geonameid | 69.44 | ops/s |
| Max Throughput | asc_sort_geonameid | 69.44 | ops/s |
| 100th percentile latency | asc_sort_geonameid | 19.4046 | ms |
| 100th percentile service time | asc_sort_geonameid | 4.72967 | ms |
| error rate | asc_sort_geonameid | 0 | % |
| Min Throughput | asc_sort_with_after_geonameid | 70.35 | ops/s |
| Mean Throughput | asc_sort_with_after_geonameid | 70.35 | ops/s |
| Median Throughput | asc_sort_with_after_geonameid | 70.35 | ops/s |
| Max Throughput | asc_sort_with_after_geonameid | 70.35 | ops/s |
| 100th percentile latency | asc_sort_with_after_geonameid | 18.664 | ms |
| 100th percentile service time | asc_sort_with_after_geonameid | 4.16119 | ms |
| error rate | asc_sort_with_after_geonameid | 0 | % |


--------------------------------
[INFO] SUCCESS (took 98 seconds)
--------------------------------
```

Each task run by the `geonames` workload represents a specific OpenSearch API operation---such as Bulk or Search---that was performed when the test was run. Each task in the output summary contains the following information:

* **Throughput:** The number of successful OpenSearch operations per second.
* **Latency:** The amount of time, including wait time, taken for the request and the response to be sent and received by Benchmark.
* **Service Time:** The amount of time, excluding wait time, taken for the request and the response to be sent and received by Benchmark.
* **Error Rate:** The percentage of operations run during the task that were unsuccessful or that did not return a 200 status code.
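
Benchmark stores each run under a unique test execution ID so that runs can be compared later. To see how two runs differ, you can use the `compare` subcommand listed in the help output. A sketch, assuming the `--baseline` and `--contender` options and using placeholder IDs:

```bash
opensearch-benchmark compare --baseline=<BASELINE TEST EXECUTION ID> --contender=<CONTENDER TEST EXECUTION ID>
```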

## Next steps

See the following resources to learn more about OpenSearch Benchmark:

- [User guide]({{site.url}}{{site.baseurl}}/benchmark/user-guide/index/): Dive deep into how OpenSearch Benchmark can help you track the performance of your cluster.
- [Tutorials]({{site.url}}{{site.baseurl}}/benchmark/tutorials/index/): Use step-by-step guides for more advanced benchmarking configurations and functionality.

@@ -0,0 +1,10 @@
---
layout: default
title: Tutorials
nav_order: 10
has_children: true
---

# Tutorials

This section of the OpenSearch Benchmark documentation provides a set of tutorials for those who want to learn more advanced OpenSearch Benchmark concepts.

@@ -0,0 +1,32 @@
---
layout: default
title: AWS Signature Version 4 support
nav_order: 70
parent: Tutorials
---

# Running OpenSearch Benchmark with AWS Signature Version 4

OpenSearch Benchmark supports AWS Signature Version 4 authentication. To run Benchmark with Signature Version 4, use the following steps:

1. Set up an [IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create.html) and provide it access to the OpenSearch cluster using Signature Version 4 authentication.

2. Export the following environment variables for your IAM user:

   ```bash
   export OSB_AWS_ACCESS_KEY_ID=<IAM USER AWS ACCESS KEY ID>
   export OSB_AWS_SECRET_ACCESS_KEY=<IAM USER AWS SECRET ACCESS KEY>
   export OSB_REGION=<YOUR REGION>
   export OSB_SERVICE=aos
   ```
   {% include copy.html %}

3. Customize and run the following `execute-test` command with the `--client-options=amazon_aws_log_in:environment` flag. This flag tells OpenSearch Benchmark the location of your exported credentials.

   ```bash
   opensearch-benchmark execute-test \
     --target-hosts=<CLUSTER ENDPOINT> \
     --pipeline=benchmark-only \
     --workload=geonames \
     --client-options=timeout:120,amazon_aws_log_in:environment
   ```
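
If the command fails with an authentication error, first confirm that the variables from step 2 are visible to your shell. Printing only the variable names avoids echoing the secret key:

```bash
env | grep '^OSB_' | cut -d= -f1
```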

@@ -0,0 +1,171 @@
---
layout: default
title: Concepts
nav_order: 3
parent: User guide
---

# Concepts

Before using OpenSearch Benchmark, familiarize yourself with the following concepts.

## Core concepts and definitions

- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workload runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload](#anatomy-of-a-workload). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/).

- **Pipeline**: A series of steps occurring before and after a workload is run that determines benchmark results. OpenSearch Benchmark supports three pipelines:
  - `from-sources`: Builds and provisions OpenSearch, runs a benchmark, and then publishes the results.
  - `from-distribution`: Downloads an OpenSearch distribution, provisions it, runs a benchmark, and then publishes the results.
  - `benchmark-only`: The default pipeline. Assumes an already running OpenSearch instance, runs a benchmark on that instance, and then publishes the results.

- **Test**: A single invocation of the OpenSearch Benchmark binary.
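
You can check which pipelines your installation supports by using the `list` subcommand. This is a quick sanity check; the available list targets can vary by version:

```bash
opensearch-benchmark list pipelines
```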

A workload is a specification of one or more benchmarking scenarios. A workload typically includes the following:

- One or more data streams that are ingested into indexes.
- A set of queries and operations that are invoked as part of the benchmark.

## Anatomy of a workload

The following example workload shows all of the essential elements needed to create a `workload.json` file. You can run this workload in your own benchmark configuration to understand how all of the elements work together:

```json
{
  "description": "Tutorial benchmark for OpenSearch Benchmark",
  "indices": [
    {
      "name": "movies",
      "body": "index.json"
    }
  ],
  "corpora": [
    {
      "name": "movies",
      "documents": [
        {
          "source-file": "movies-documents.json",
          "document-count": 11658903,
          "uncompressed-bytes": 1544799789
        }
      ]
    }
  ],
  "schedule": [
    {
      "operation": {
        "operation-type": "create-index"
      }
    },
    {
      "operation": {
        "operation-type": "cluster-health",
        "request-params": {
          "wait_for_status": "green"
        },
        "retry-until-success": true
      }
    },
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": 5000
      },
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "operation": {
        "name": "query-match-all",
        "operation-type": "search",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      },
      "iterations": 1000,
      "target-throughput": 100
    }
  ]
}
```

A workload usually includes the following elements:

- [indices]({{site.url}}{{site.baseurl}}/benchmark/workloads/indices/): Defines the relevant indexes and index templates used for the workload.
- [corpora]({{site.url}}{{site.baseurl}}/benchmark/workloads/corpora/): Defines all document corpora used for the workload.
- `schedule`: Defines the operations and the order in which they run inline. Alternatively, you can use `operations` to group operations and the `test_procedures` parameter to specify the order of operations, as sketched after this list.
- `operations`: **Optional**. Describes which operations are available for the workload and how they are parameterized.
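
For illustration, that alternative layout might look like the following fragment. This is a sketch that assumes operations are declared once by name and then referenced from a named test procedure; `index-docs` and `ingest-only` are hypothetical names:

```json
"operations": [
  {
    "name": "index-docs",
    "operation-type": "bulk",
    "bulk-size": 5000
  }
],
"test_procedures": [
  {
    "name": "ingest-only",
    "schedule": [
      {
        "operation": "index-docs",
        "warmup-time-period": 120,
        "clients": 8
      }
    ]
  }
]
```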

### Indices

To create an index, specify its `name`. To add definitions to your index, use the `body` option and point it to the JSON file containing the index definitions. For more information, see [indices]({{site.url}}{{site.baseurl}}/benchmark/workloads/indices/).
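
For example, the `index.json` file referenced by `body` in the example workload could contain settings and mappings such as the following. The fields shown are illustrative, not taken from an official workload:

```json
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "year": { "type": "integer" }
    }
  }
}
```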

### Corpora

The `corpora` element requires the name of the index containing the document corpus, for example, `movies`, and a list of parameters that define the document corpora. This list includes the following parameters:

- `source-file`: The file name that contains the workload's corresponding documents. When using OpenSearch Benchmark locally, documents are contained in a JSON file. When providing a `base_url`, use a compressed file format: `.zip`, `.bz2`, `.gz`, `.tar`, `.tar.gz`, `.tgz`, or `.tar.bz2`. The compressed file must contain a single JSON file with the same name.
- `document-count`: The number of documents in the `source-file`, which determines which client indexes correlate to which parts of the document corpus. Each of the N clients receives an Nth of the document corpus. When using a source that contains a document with a parent-child relationship, specify the number of parent documents.
- `uncompressed-bytes`: The size, in bytes, of the source file after decompression, indicating how much disk space the decompressed source file needs.
- `compressed-bytes`: The size, in bytes, of the source file before decompression. This can help you assess the amount of time needed for the cluster to ingest documents.
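
`document-count` and `uncompressed-bytes` are typically measured from the uncompressed source file itself. Assuming the corpus uses the one-JSON-document-per-line layout that Benchmark corpora expect, you can fetch both values from the command line:

```bash
# Number of documents (one JSON document per line)
wc -l < movies-documents.json

# Uncompressed size in bytes
wc -c < movies-documents.json
```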

### Operations

The `operations` element lists the OpenSearch API operations performed by the workload. For example, you can set an operation to `create-index`, which creates an index in the test cluster that OpenSearch Benchmark can write documents into. Operations are usually listed inside of `schedule`.

### Schedule

The `schedule` element contains a list of actions and operations that are run by the workload. Operations run according to the order in which they appear in the `schedule`. The following example illustrates a `schedule` with multiple operations, each defined by its `operation-type`:

```json
"schedule": [
  {
    "operation": {
      "operation-type": "create-index"
    }
  },
  {
    "operation": {
      "operation-type": "cluster-health",
      "request-params": {
        "wait_for_status": "green"
      },
      "retry-until-success": true
    }
  },
  {
    "operation": {
      "operation-type": "bulk",
      "bulk-size": 5000
    },
    "warmup-time-period": 120,
    "clients": 8
  },
  {
    "operation": {
      "name": "query-match-all",
      "operation-type": "search",
      "body": {
        "query": {
          "match_all": {}
        }
      }
    },
    "iterations": 1000,
    "target-throughput": 100
  }
]
```

According to this schedule, the actions will run in the following order:

1. The `create-index` operation creates an index. The index remains empty until the `bulk` operation adds documents with benchmarked data.
2. The `cluster-health` operation assesses the health of the cluster before running the workload. In this example, the workload waits until the status of the cluster's health is `green`.
3. The `bulk` operation runs the `bulk` API to index `5000` documents simultaneously.
4. Before benchmarking, the workload waits until the specified `warmup-time-period` passes. In this example, the warmup period is `120` seconds.
5. The `clients` field defines the number of clients, in this example `8`, that will run the remaining actions in the schedule concurrently.
6. The `search` operation runs a `match_all` query to match all documents after they have been indexed by the `bulk` API using the 8 clients specified.
7. The `iterations` field indicates the number of times each client runs the `search` operation. The report generated by the benchmark automatically adjusts the percentile numbers based on this number. To generate a precise percentile, the benchmark needs to run at least 1,000 iterations.
8. Lastly, the `target-throughput` field defines the number of requests per second each client performs, which, when set, can help reduce the latency of the benchmark. For example, a `target-throughput` of 100 requests divided by 8 clients means that each client will issue 12.5 requests per second.

@@ -2,7 +2,8 @@
layout: default
title: Configuring OpenSearch Benchmark
nav_order: 7
has_children: false
parent: User guide
redirect_from: /benchmark/configuring-benchmark/
---

# Configuring OpenSearch Benchmark

@@ -2,7 +2,8 @@
layout: default
title: Creating custom workloads
nav_order: 10
has_children: false
parent: User guide
redirect_from: /benchmark/creating-custom-workloads/
---

# Creating custom workloads

@@ -0,0 +1,10 @@
---
layout: default
title: User guide
nav_order: 5
has_children: true
---

# OpenSearch Benchmark User Guide

The OpenSearch Benchmark User Guide includes core [concepts]({{site.url}}{{site.baseurl}}/benchmark/user-guide/concepts/), [installation]({{site.url}}{{site.baseurl}}/benchmark/installing-benchmark/) instructions, and [configuration options]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/) to help you get the most out of OpenSearch Benchmark.

@@ -2,7 +2,8 @@
layout: default
title: Installing OpenSearch Benchmark
nav_order: 5
has_children: false
parent: User guide
redirect_from: /benchmark/installing-benchmark/
---

# Installing OpenSearch Benchmark

@@ -150,6 +151,59 @@ run -v $HOME/benchmarks:/opensearch-benchmark/.benchmark opensearchproject/opens

See [Configuring OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/) to learn more about the files and subdirectories located in `/opensearch-benchmark/.benchmark`.

## Provisioning an OpenSearch cluster with a test

OpenSearch Benchmark is compatible with JDK versions 17, 16, 15, 14, 13, 12, 11, and 8.
{: .note}

If you installed OpenSearch Benchmark with PyPI, you can also provision a new OpenSearch cluster by specifying a `distribution-version` in the `execute-test` command.

If you plan on having Benchmark provision a cluster, you'll need to inform Benchmark of the location of the `JAVA_HOME` path for the Benchmark cluster. To set the `JAVA_HOME` path and provision a cluster:

1. Find the `JAVA_HOME` path you're currently using. Open a terminal and enter `/usr/libexec/java_home`.

2. Set your corresponding JDK version environment variable by entering the path from the previous step. Enter `export JAVA17_HOME=<Java Path>`.

3. Run the `execute-test` command and indicate the distribution version of OpenSearch you want to use:

   ```bash
   opensearch-benchmark execute-test --distribution-version=2.3.0 --workload=geonames --test-mode
   ```
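
Putting the three steps together, a typical session might look like the following. The JDK path is illustrative (substitute the output from step 1), and `/usr/libexec/java_home` is the macOS locator; on Linux, `readlink -f "$(which java)"` is a common equivalent:

```bash
# Step 1: Locate the active JDK (macOS).
/usr/libexec/java_home

# Step 2: Point Benchmark at a JDK 17 installation (example path).
export JAVA17_HOME=/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home

# Step 3: Provision OpenSearch 2.3.0 and run the geonames workload in test mode.
opensearch-benchmark execute-test --distribution-version=2.3.0 --workload=geonames --test-mode
```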

## Directory structure

After running OpenSearch Benchmark for the first time, you can search through all related files, including configuration files, in the `~/.benchmark` directory. The directory includes the following file tree:

```
# ~/.benchmark Tree
.
├── benchmark.ini
├── benchmarks
│   ├── data
│   │   └── geonames
│   ├── distributions
│   │   ├── opensearch-1.0.0-linux-x64.tar.gz
│   │   └── opensearch-2.3.0-linux-x64.tar.gz
│   ├── test_executions
│   │   ├── 0279b13b-1e54-49c7-b1a7-cde0b303a797
│   │   └── 0279c542-a856-4e88-9cc8-04306378cd38
│   └── workloads
│       └── default
│           └── geonames
├── logging.json
├── logs
│   └── benchmark.log
```

* `benchmark.ini`: Contains any adjustable configurations for tests. For information about how to configure OpenSearch Benchmark, see [Configuring OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/).
* `data`: Contains all the data corpora and documents related to OpenSearch Benchmark's [official workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/geonames).
* `distributions`: Contains all the OpenSearch distributions downloaded from [OpenSearch.org](http://opensearch.org/) and used to provision clusters.
* `test_executions`: Contains a directory for each test `execution_id` from previous OpenSearch Benchmark runs.
* `workloads`: Contains all files related to workloads, except for the data corpora.
* `logging.json`: Contains all of the configuration options related to how logging is performed within OpenSearch Benchmark.
* `logs`: Contains all the logs from OpenSearch Benchmark runs. This can be helpful when you've encountered errors during runs.
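
When a run fails, the log in this tree is usually the fastest place to look; a standard check:

```bash
tail -n 100 ~/.benchmark/logs/benchmark.log
```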

## Next steps

- [Configuring OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/)

@@ -12,153 +12,12 @@ A workload is a specification of one or more benchmarking scenarios. A workload
- One or more data streams that are ingested into indices
- A set of queries and operations that are invoked as part of the benchmark

## Anatomy of a workload

This section provides a list of options and examples you can use when customizing or using a workload.

The following example workload shows all of the essential elements needed to create a `workload.json` file. You can run this workload in your own benchmark configuration in order to understand how all of the elements work together:

```json
{
  "description": "Tutorial benchmark for OpenSearch Benchmark",
  "indices": [
    {
      "name": "movies",
      "body": "index.json"
    }
  ],
  "corpora": [
    {
      "name": "movies",
      "documents": [
        {
          "source-file": "movies-documents.json",
          "document-count": 11658903,
          "uncompressed-bytes": 1544799789
        }
      ]
    }
  ],
  "schedule": [
    {
      "operation": {
        "operation-type": "create-index"
      }
    },
    {
      "operation": {
        "operation-type": "cluster-health",
        "request-params": {
          "wait_for_status": "green"
        },
        "retry-until-success": true
      }
    },
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": 5000
      },
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "operation": {
        "name": "query-match-all",
        "operation-type": "search",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      },
      "iterations": 1000,
      "target-throughput": 100
    }
  ]
}
```

A workload usually consists of the following elements:

- [indices]({{site.url}}{{site.baseurl}}/benchmark/workloads/indices/): Defines the relevant indices and index templates used for the workload.
- [corpora]({{site.url}}{{site.baseurl}}/benchmark/workloads/corpora/): Defines all document corpora used for the workload.
- `schedule`: Defines the operations and the order in which they run inline. Alternatively, you can use `operations` to group operations and the `test_procedures` parameter to specify the order of operations.
- `operations`: **Optional**. Describes which operations are available for the workload and how they are parameterized.

### Indices

To create an index, specify its `name`. To add definitions to your index, use the `body` option and point it to the JSON file containing the index definitions. For more information, see [indices]({{site.url}}{{site.baseurl}}/benchmark/workloads/indices/).

### Corpora

The `corpora` element requires the name of the index containing the document corpus, for example, `movies`, and a list of parameters that define the document corpora. This list includes the following parameters:

- `source-file`: The file name that contains the workload's corresponding documents. When using OpenSearch Benchmark locally, documents are contained in a JSON file. When providing a `base_url`, use a compressed file format: `.zip`, `.bz2`, `.gz`, `.tar`, `.tar.gz`, `.tgz`, or `.tar.bz2`. The compressed file must contain a single JSON file with the same name.
- `document-count`: The number of documents in the `source-file`, which determines which client indices correlate to which parts of the document corpus. Each of the N clients receives an Nth of the document corpus. When using a source that contains a document with a parent-child relationship, specify the number of parent documents.
- `uncompressed-bytes`: The size, in bytes, of the source file after decompression, indicating how much disk space the decompressed source file needs.
- `compressed-bytes`: The size, in bytes, of the source file before decompression. This can help you assess the amount of time needed for the cluster to ingest documents.

### Operations

The `operations` element lists the OpenSearch API operations performed by the workload. For example, you can set an operation to `create-index`, which creates an index in the test cluster that OpenSearch Benchmark can write documents into. Operations are usually listed inside of `schedule`.

### Schedule

The `schedule` element contains a list of actions and operations that are run by the workload. Operations run according to the order in which they appear in the `schedule`. The following example illustrates a `schedule` with multiple operations, each defined by its `operation-type`:

```json
"schedule": [
  {
    "operation": {
      "operation-type": "create-index"
    }
  },
  {
    "operation": {
      "operation-type": "cluster-health",
      "request-params": {
        "wait_for_status": "green"
      },
      "retry-until-success": true
    }
  },
  {
    "operation": {
      "operation-type": "bulk",
      "bulk-size": 5000
    },
    "warmup-time-period": 120,
    "clients": 8
  },
  {
    "operation": {
      "name": "query-match-all",
      "operation-type": "search",
      "body": {
        "query": {
          "match_all": {}
        }
      }
    },
    "iterations": 1000,
    "target-throughput": 100
  }
]
```

According to this schedule, the actions will run in the following order:

1. The `create-index` operation creates an index. The index remains empty until the `bulk` operation adds documents with benchmarked data.
2. The `cluster-health` operation assesses the health of the cluster before running the workload. In this example, the workload waits until the status of the cluster's health is `green`.
3. The `bulk` operation runs the `bulk` API to index `5000` documents simultaneously.
4. Before benchmarking, the workload waits until the specified `warmup-time-period` passes. In this example, the warmup period is `120` seconds.
5. The `clients` field defines the number of clients, in this example `8`, that will run the remaining actions in the schedule concurrently.
6. The `search` operation runs a `match_all` query to match all documents after they have been indexed by the `bulk` API using the 8 clients specified.
7. The `iterations` field indicates the number of times each client runs the `search` operation. The report generated by the benchmark automatically adjusts the percentile numbers based on this number. To generate a precise percentile, the benchmark needs to run at least 1,000 iterations.
8. Lastly, the `target-throughput` field defines the number of requests per second each client performs, which, when set, can help reduce the latency of the benchmark. For example, a `target-throughput` of 100 requests divided by 8 clients means that each client will issue 12.5 requests per second.

For more information about what comprises a workload, see [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/concepts#anatomy-of-a-workload).

## Workload examples

If you want to try certain workloads before creating your own, use the following examples.