Data Prepper - Getting Started - updates (#3284)

* data prepper changes-issue 1823

Signed-off-by: Heather Halter <hdhalter@amazon.com>

* version wording update

Signed-off-by: Heather Halter <hdhalter@amazon.com>

* Update _data-prepper/getting-started.md

Co-authored-by: Caroline <113052567+carolxob@users.noreply.github.com>

* Update _data-prepper/getting-started.md

Co-authored-by: Caroline <113052567+carolxob@users.noreply.github.com>

* Update _data-prepper/getting-started.md

Co-authored-by: Caroline <113052567+carolxob@users.noreply.github.com>

---------

Signed-off-by: Heather Halter <hdhalter@amazon.com>
Co-authored-by: Caroline <113052567+carolxob@users.noreply.github.com>
Heather Halter 2023-03-07 16:57:12 -08:00 committed by GitHub
parent c081034087
commit 5590918c49
1 changed files with 20 additions and 29 deletions


@ -10,40 +10,32 @@ redirect_from:
Data Prepper is an independent component, not an OpenSearch plugin, that converts data for use with OpenSearch. It's not bundled with the all-in-one OpenSearch installation packages.
If you are migrating from Open Distro Data Prepper, visit the [Migrating from Open Distro]({{site.url}}{{site.baseurl}}/data-prepper/migrate-open-distro/) page.
If you are migrating from Open Distro Data Prepper, see the [Migrating from Open Distro]({{site.url}}{{site.baseurl}}/data-prepper/migrate-open-distro/) page.
{: .note}
## 1. Installing Data Prepper
There are two ways to install Data Prepper:
There are two ways to install Data Prepper: you can run the Docker image or build from source.
1. Run the Docker image.
2. Build from source.
The easiest way to use Data Prepper is by running the Docker image. We suggest that you use this approach if you have [Docker](https://www.docker.com) available.
You can pull the Docker image:
The easiest way to use Data Prepper is by running the Docker image. We suggest that you use this approach if you have [Docker](https://www.docker.com) available. Do the following:
```
docker pull opensearchproject/data-prepper:latest
```
If you have special requirements that require you to build from source, or if you
want to contribute, see the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md).
If you have special requirements that require you to build from source, or if you want to contribute, see the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md).
## 2. Configuring Data Prepper
You must configure Data Prepper with a pipeline before running it.
You will configure two files:
You must configure Data Prepper with a pipeline before running it. You'll modify the following files:
* `data-prepper-config.yaml`
* `pipelines.yaml`
Depending on your use case, we have a few different guides to configuring Data Prepper.
To configure Data Prepper, see the following information for each use case:
* [Trace Analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/): Learn how to collect trace data and customize a pipeline that ingests and transforms that data.
* [Log Analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/): Learn how to set up Data Prepper for log observability.
* [Simple Pipeline](https://github.com/opensearch-project/data-prepper/blob/main/docs/simple_pipelines.md): Learn the basics of Data Prepper pipelines with some simple configurations.
* [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/): Learn how to collect trace data and customize a pipeline that ingests and transforms that data.
* [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/): Learn how to set up Data Prepper for log observability.
## 3. Defining a pipeline
@ -65,11 +57,12 @@ Run the following command with your pipeline configuration YAML.
```bash
docker run --name data-prepper \
-v /full/path/to/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \
-v /${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \
opensearchproject/data-prepper:latest
```
The preceding example pipeline configuration above demonstrates a simple pipeline with a source (`random`) sending data to a sink (`stdout`). For further detailed examples of more advanced pipeline configurations, see [Pipelines]({{site.url}}{{site.baseurl}}/clients/data-prepper/pipelines/).
The example pipeline configuration above demonstrates a simple pipeline with a source (`random`) sending data to a sink (`stdout`). For examples of more advanced pipeline configurations, see [Pipelines]({{site.url}}{{site.baseurl}}/clients/data-prepper/pipelines/).
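For reference, a pipeline of the shape described here might be created as follows. This is a minimal sketch, not necessarily the exact file from the documentation: the pipeline name `simple-sample-pipeline` and the choice to write the file into the current directory are assumptions made for illustration.

```shell
# Write a minimal pipelines.yaml into the current directory.
# The pipeline name "simple-sample-pipeline" is an illustrative assumption;
# the `random` source and `stdout` sink are the ones named in the text.
cat > pipelines.yaml <<'EOF'
simple-sample-pipeline:
  source:
    random:
  sink:
    - stdout:
EOF

# Show the file that the docker run command mounts into the container.
cat pipelines.yaml
```

Because the `docker run` command mounts `pipelines.yaml` from the current directory, run it from the same directory in which this file was written.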
After starting Data Prepper, you should see log output and some UUIDs after a few seconds:
@ -90,10 +83,10 @@ e51e700e-5cab-4f6d-879a-1c3235a77d18
b4ed2d7e-cf9c-4e9d-967c-b18e8af35c90
```
The remainder of this page provides examples for running Data Prepper from the Docker image. If you
built from source, refer to the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md) for more information.
built it from source, refer to the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md) for more information.
However you configure your pipeline, you will run Data Prepper the same way. You run the Docker
image and supply both the `pipelines.yaml` and `data-prepper-config.yaml` files.
However you configure your pipeline, you'll run Data Prepper the same way. You run the Docker
image and modify both the `pipelines.yaml` and `data-prepper-config.yaml` files.
For Data Prepper 2.0 or later, use this command:
@ -101,13 +94,13 @@ For Data Prepper 2.0 or later, use this command:
docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml opensearchproject/data-prepper:latest
```
For Data Prepper before version 2.0, use this command:
For Data Prepper versions earlier than 2.0, use this command:
```
docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml opensearchproject/data-prepper:1.x
```
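The two commands differ only in where the files are mounted inside the container and in the image tag. A small wrapper can make that choice explicit. This is a sketch under stated assumptions: the function name `data_prepper_run_cmd` and its major-version argument are hypothetical, and the function only prints the command rather than running it.

```shell
# Hypothetical helper: print the docker run command for a given
# Data Prepper major version (for example, 1 or 2).
data_prepper_run_cmd() {
  major="$1"
  base=/usr/share/data-prepper
  if [ "$major" -ge 2 ]; then
    # 2.0 or later: files live in the pipelines/ and config/ subdirectories.
    pipelines_dst="$base/pipelines/pipelines.yaml"
    config_dst="$base/config/data-prepper-config.yaml"
    tag=latest
  else
    # Earlier than 2.0: both files sit directly in the home directory.
    pipelines_dst="$base/pipelines.yaml"
    config_dst="$base/data-prepper-config.yaml"
    tag=1.x
  fi
  echo "docker run --name data-prepper -p 4900:4900" \
    "-v ${PWD}/pipelines.yaml:${pipelines_dst}" \
    "-v ${PWD}/data-prepper-config.yaml:${config_dst}" \
    "opensearchproject/data-prepper:${tag}"
}

data_prepper_run_cmd 2
data_prepper_run_cmd 1
```

Printing the command instead of executing it makes it easy to check the mount paths before starting a container.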
Once Data Prepper is running, it will process data until it is shut down. Once you are done, shut it down with the following command:
After Data Prepper starts, it processes data until it is shut down. When you are done, shut it down with the following command:
```
curl -X POST http://localhost:4900/shutdown
@ -115,15 +108,13 @@ curl -X POST http://localhost:4900/shutdown
### Additional configurations
For Data Prepper 2.0 or later, the Log4j 2 configuration file is read from `config/log4j2.properties` in the application's home directory.
By default, it uses `log4j2-rolling.properties` in the *shared-config* directory.
For Data Prepper 2.0 or later, the Log4j 2 configuration file is read from `config/log4j2.properties` in the application's home directory. By default, it uses `log4j2-rolling.properties` in the *shared-config* directory.
For Data Prepper 1.5 or earlier, optionally add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command if you would
like to pass a custom log4j2 properties file. If no properties file is provided, Data Prepper will default to the log4j2.properties file in the *shared-config* directory.
For Data Prepper 1.5 or earlier, optionally add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command if you would like to pass a custom Log4j 2 properties file. If no properties file is provided, Data Prepper defaults to the `log4j2.properties` file in the *shared-config* directory.
## Next steps
Trace Analytics is an important Data Prepper use case. If you haven't yet configured it, see the [Trace Analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/).
Trace analytics is an important Data Prepper use case. If you haven't yet configured it, see [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/).
Log ingestion is also an important Data Prepper use case. To learn more, see [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/).