From 6f779cef0c78efd3dc0f45f9dd30eee3339a65b4 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Thu, 29 Feb 2024 17:00:19 -0600 Subject: [PATCH] Add Contributing workloads page (#6430) * Add Contributing workloads page Signed-off-by: Archer * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update _benchmark/user-guide/contributing-workloads.md Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update contributing-workloads.md Signed-off-by: Melissa Vagi Signed-off-by: Melissa Vagi * Update contributing-workloads.md Signed-off-by: Melissa Vagi Signed-off-by: Melissa Vagi * Reconcile doc review and additional technical feedback. Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Heather Halter Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Heather Halter Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update contributing-workloads.md Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Add additonal technical feedback. Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update contributing-workloads.md Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --------- Signed-off-by: Archer Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: Melissa Vagi Co-authored-by: Melissa Vagi Co-authored-by: Heather Halter Co-authored-by: Nathan Bower --- .../user-guide/contributing-workloads.md | 57 +++++++++++++++++++ _benchmark/user-guide/distributed-load.md | 2 +- _benchmark/user-guide/telemetry.md | 2 +- 3 files changed, 59 insertions(+), 2 deletions(-) create mode 100644 _benchmark/user-guide/contributing-workloads.md diff --git a/_benchmark/user-guide/contributing-workloads.md b/_benchmark/user-guide/contributing-workloads.md new file mode 100644 index 00000000..e60f60ea --- /dev/null +++ b/_benchmark/user-guide/contributing-workloads.md @@ -0,0 +1,57 @@ +--- +layout: default +title: Sharing custom workloads +nav_order: 11 +parent: User guide +--- + +# Sharing custom workloads + +You can share a custom workload with other OpenSearch users by uploading it to the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/) on GitHub. + +Make sure that any data included in the workload's dataset does not contain proprietary data or personally identifiable information (PII). + +To share a custom workload, follow these steps. + +## Create a README.md + +Provide a detailed `README.MD` file that includes the following: + +- The purpose of the workload. When creating a description for the workload, consider its specific use and how the that use case differs from others in the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). +- An example document from the dataset that helps users understand the data's structure. +- The workload parameters that can be used to customize the workload. +- A list of default test procedures included in the workload as well as other test procedures that the workload can run. +- An output sample produced by the workload after a test is run. +- A copy of the open-source license that gives the user and OpenSearch Benchmark permission to use the dataset. + +For an example workload README file, go to the `http_logs` [README](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/http_logs/README.md). + +## Verify the workload's structure + +The workload must include the following files: + +- `workload.json` +- `index.json` +- `files.txt` +- `test_procedures/default.json` +- `operations/default.json` + +Both `default.json` file names can be customized to have a descriptive name. The workload can include an optional `workload.py` file to add more dynamic functionality. For more information about a file's contents, go to [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/). + +## Testing the workload + +All workloads contributed to OpenSearch Benchmark must fulfill the following testing requirements: + +- All tests run to explore and produce an example from the workload must target an OpenSearch cluster. +- The workload must pass all integration tests. Follow these steps to ensure that the workload passes the integration tests: + 1. Add the workload to your forked copy of the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). Make sure that you've forked both the `opensearch-benchmark-workloads` repository and the [OpenSeach Benchmark](https://github.com/opensearch-project/opensearch-benchmark) repository. + 3. In your forked OpenSearch Benchmark repository, update the `benchmark-os-it.ini` and `benchmark-in-memory.ini` files in the `/osbenchmark/it/resources` directory to point to the forked workloads repository containing your workload. + 4. After you've modified the `.ini` files, commit your changes to a branch for testing. + 6. Run your integration tests using GitHub actions by selecting the branch for which you committed your changes. Verify that the tests have run as expected. + 7. If your integration tests run as expected, go to your forked workloads repository and merge your workload changes into branches `1` and `2`. This allows for your workload to appear in both major versions of OpenSearch Benchmark. + +## Create a PR + +After testing the workload, create a pull request (PR) from your fork to the `opensearch-project` [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). Add a sample output and summary result to the PR description. The OpenSearch Benchmark maintainers will review the PR. + +Once the PR is approved, you must share the data corpora of your dataset. The OpenSearch Benchmark team can then add the dataset to a shared S3 bucket. If your data corpora is stored in an S3 bucket, you can use [AWS DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html) to share the data corpora. Otherwise, you must inform the maintainers of where the data corpora resides. diff --git a/_benchmark/user-guide/distributed-load.md b/_benchmark/user-guide/distributed-load.md index ec460919..60fc9850 100644 --- a/_benchmark/user-guide/distributed-load.md +++ b/_benchmark/user-guide/distributed-load.md @@ -1,7 +1,7 @@ --- layout: default title: Running distributed loads -nav_order: 10 +nav_order: 15 parent: User guide --- diff --git a/_benchmark/user-guide/telemetry.md b/_benchmark/user-guide/telemetry.md index 7cc7f6b7..d4c40c79 100644 --- a/_benchmark/user-guide/telemetry.md +++ b/_benchmark/user-guide/telemetry.md @@ -1,7 +1,7 @@ --- layout: default title: Enabling telemetry devices -nav_order: 15 +nav_order: 30 parent: User guide ---