From 8651556df8c7bd36a99c55a05d70e13c40f4898b Mon Sep 17 00:00:00 2001
From: Chen <19492223+chenqi0805@users.noreply.github.com>
Date: Thu, 28 Apr 2022 19:42:00 -0500
Subject: [PATCH 1/7] MAINT: update docs
Signed-off-by: Chen <19492223+chenqi0805@users.noreply.github.com>
---
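A minimal sketch of how the `record_type` option added in this patch pairs with the two processors, based on the option and processor descriptions in the diff below. The pipeline names are hypothetical, the buffer and sink sections are omitted, and the two fragments are alternatives rather than parts of one file.

```yml
# Alternative A: OTLP records (the documented default).
otlp-raw-pipeline:                # hypothetical name
  source:
    otel_trace_source:
      record_type: otlp           # each ExportTraceServiceRequest becomes one buffer record
  processor:
    - otel_trace_raw_prepper:
        trace_flush_interval: 180 # seconds; documented default

# Alternative B: event records.
event-raw-pipeline:               # hypothetical name
  source:
    otel_trace_source:
      record_type: event          # requests are decoded into Data Prepper spans
  processor:
    - otel_trace_raw:
        trace_flush_interval: 180 # seconds; documented default
```

In the event variant, the buffer capacity should be sized in proportion to the expected number of spans per request, as the option description recommends.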
.../data-prepper/data-prepper-reference.md | 43 +++++++------
_clients/data-prepper/pipelines.md | 61 +++++++++++++++++++
2 files changed, 86 insertions(+), 18 deletions(-)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index 8936e489..24366454 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -37,22 +37,23 @@ Sources define where your data comes from.
Source for the OpenTelemetry Collector.
-Option | Required | Type | Description
-:--- | :--- | :--- | :---
-port | No | Integer | The port OTel trace source is running on. Default is `21890`.
-request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
-health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
-proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
-unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
-thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
-max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
-ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
-sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
-sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
-useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
-acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
-awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
-authentication | No | Object| An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+Option | Required | Type | Description
+:--- |:--------------|:--------| :---
+port | No | Integer | The port OTel trace source is running on. Default is `21890`.
+request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
+health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
+proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
+unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
+thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
+max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
+ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
+sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
+useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
+acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
+awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+authentication | No | Object | An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+record_type | No | String | A string represents the supported record data type that will be written into the buffer plugin. Its value takes either `otlp` or `event`. Default is `otlp`.
- `otlp`: otel-trace-source will write each incoming ExportTraceServiceRequest as record data type into the buffer.
- `event`: otel-trace-source will decode each incoming ExportTraceServiceRequest into collection of Data Prepper internal spans serving as buffer items. To achieve better performance in this mode, it is recommended to set the buffer capacity proportional to the estimated number of spans in the incoming request payload.
### http_source
@@ -116,13 +117,19 @@ Prior to Data Prepper 1.3, Processors were named Preppers. Starting in Data Prep
### otel_trace_raw_prepper
-Converts OpenTelemetry data to OpenSearch-compatible JSON documents.
+Converts OpenTelemetry data to OpenSearch-compatible JSON documents and fills in trace group related fields in those JSON documents. It requires `record_type` to be set as `otlp` in `otel_trace_source`.
Option | Required | Type | Description
:--- | :--- | :--- | :---
-root_span_flush_delay | No | Integer | Represents the time interval in seconds to flush all the root spans in the processor together with their descendants. Default is 30.
trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180.
+### otel_trace_raw
+
+This processor is a Data Prepper event record type compatible version of `otel_trace_raw_prepper` that fills in trace group related fields into all incoming Data Prepper span records. It requires `record_type` to be set as `event` in `otel_trace_source`.
+
+Option | Required | Type | Description
+:--- | :--- | :--- | :---
+trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180.
### service_map_stateful
diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md
index b664d98a..8eb084a4 100644
--- a/_clients/data-prepper/pipelines.md
+++ b/_clients/data-prepper/pipelines.md
@@ -75,6 +75,8 @@ This example uses weak security. We strongly recommend securing all plugins whic
The following example demonstrates how to build a pipeline that supports the [Trace Analytics OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/observability-plugin/trace/ta-dashboards/). This pipeline takes data from the OpenTelemetry Collector and uses two other pipelines as sinks. These two separate pipelines index trace and the service map documents for the dashboard plugin.
+#### Classic
+
```yml
entry-pipeline:
delay: "100"
@@ -115,6 +117,65 @@ service-map-pipeline:
trace_analytics_service_map: true
```
+#### Event record type
+
+Starting with Data Prepper 1.4, the trace analytics pipeline source, buffer, and processors support the event record type.
+
+```yml
+entry-pipeline:
+  delay: "100"
+  source:
+    otel_trace_source:
+      ssl: false
+      record_type: event
+  buffer:
+    bounded_blocking:
+      buffer_size: 10240
+      batch_size: 160
+  sink:
+    - pipeline:
+        name: "raw-pipeline"
+    - pipeline:
+        name: "service-map-pipeline"
+raw-pipeline:
+  source:
+    pipeline:
+      name: "entry-pipeline"
+  buffer:
+    bounded_blocking:
+      buffer_size: 10240
+      batch_size: 160
+  processor:
+    - otel_trace_raw:
+  sink:
+    - opensearch:
+        hosts: ["https://localhost:9200"]
+        insecure: true
+        username: admin
+        password: admin
+        trace_analytics_raw: true
+service-map-pipeline:
+  delay: "100"
+  source:
+    pipeline:
+      name: "entry-pipeline"
+  buffer:
+    bounded_blocking:
+      buffer_size: 10240
+      batch_size: 160
+  processor:
+    - service_map_stateful:
+  sink:
+    - opensearch:
+        hosts: ["https://localhost:9200"]
+        insecure: true
+        username: admin
+        password: admin
+        trace_analytics_service_map: true
+```
+
+Note that it is recommended to scale the `buffer_size` and `batch_size` by the estimated maximum batch size in the client request payload to maintain similar ingestion throughput and latency as in [Classic](#classic).
+
## Migrating from Logstash
Data Prepper supports Logstash configuration files for a limited set of plugins. Simply use the logstash config to run Data Prepper.
From de422a31420460632d7649711bc6149504bf7e18 Mon Sep 17 00:00:00 2001
From: Chen <19492223+chenqi0805@users.noreply.github.com>
Date: Fri, 29 Apr 2022 11:08:37 -0500
Subject: [PATCH 2/7] MAINT: add user recommendation
Signed-off-by: Chen <19492223+chenqi0805@users.noreply.github.com>
---
_clients/data-prepper/pipelines.md | 2 ++
1 file changed, 2 insertions(+)
diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md
index 8eb084a4..3937b599 100644
--- a/_clients/data-prepper/pipelines.md
+++ b/_clients/data-prepper/pipelines.md
@@ -77,6 +77,8 @@ The following example demonstrates how to build a pipeline that supports the [Tr
#### Classic
+This pipeline definition will be deprecated in Data Prepper 2.0. We recommend using the [Event record type](#event-record-type) pipeline definition instead.
+
```yml
entry-pipeline:
delay: "100"
From a032aa13ae220b998c6e499dd1d86dad5e8b8911 Mon Sep 17 00:00:00 2001
From: Chen <19492223+chenqi0805@users.noreply.github.com>
Date: Fri, 29 Apr 2022 19:15:52 -0500
Subject: [PATCH 3/7] MAINT: update working
Signed-off-by: Chen <19492223+chenqi0805@users.noreply.github.com>
---
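A minimal sketch of the `authentication` option whose description this patch rewords, assuming the `http_basic` plugin takes `username` and `password` as nested keys, as the table text describes. The pipeline name and credentials are placeholders, the certificate paths are the examples from the option table, and the buffer, processor, and sink sections are omitted.

```yml
secured-entry-pipeline:           # hypothetical name
  source:
    otel_trace_source:
      ssl: true
      sslKeyCertChainFile: "config/demo-data-prepper.crt"
      sslKeyFile: "config/demo-data-prepper.key"
      authentication:
        http_basic:
          username: my-user       # placeholder
          password: my-password   # placeholder
```

Without an `authentication` block, the source starts an unauthenticated server, which is the default the reworded description refers to.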
_clients/data-prepper/data-prepper-reference.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index 24366454..4d869bc1 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -52,7 +52,7 @@ sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the se
useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
-authentication | No | Object | An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
record_type | No | String | A string represents the supported record data type that will be written into the buffer plugin. Its value takes either `otlp` or `event`. Default is `otlp`. - `otlp`: otel-trace-source will write each incoming ExportTraceServiceRequest as record data type into the buffer.
- `event`: otel-trace-source will decode each incoming ExportTraceServiceRequest into collection of Data Prepper internal spans serving as buffer items. To achieve better performance in this mode, it is recommended to set the buffer capacity proportional to the estimated number of spans in the incoming request payload.
### http_source
@@ -66,7 +66,7 @@ request_timeout | No | Integer | The request timeout in millis. Default is `10_0
thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
max_pending_requests | No | Integer | The maximum number of allowed tasks in ScheduledThreadPool work queue. Default is `1024`.
-authentication | No | Object | An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java).
+authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java).
### file
From 62d5ba4a57153cc9d2c123cc2a577dcf633a615b Mon Sep 17 00:00:00 2001
From: David Venable
Date: Fri, 13 May 2022 14:03:41 -0500
Subject: [PATCH 4/7] Documentation for Metrics ingestion in Data Prepper
1.4.0.
Signed-off-by: David Venable
---
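A minimal sketch of a metrics pipeline that uses the `otel_metrics_source` options documented in this patch, assuming ACM-based TLS is configured through `useAcmCertForSSL`, `acmCertificateArn`, and `awsRegion` as the option table describes. The certificate ARN and region are placeholders; the processor and sink are taken from the pipeline example in the diff below.

```yml
metrics-pipeline:
  source:
    otel_metrics_source:
      port: 21891                 # documented default
      useAcmCertForSSL: true
      acmCertificateArn: "arn:aws:acm:us-east-1:123456789012:certificate/example-id"  # placeholder
      awsRegion: "us-east-1"      # placeholder
  processor:
    - otel_metrics_raw_processor:
  sink:
    - opensearch:
        hosts: ["https://localhost:9200"]
        username: admin
        password: admin
```

For file-based certificates, `sslKeyCertChainFile` and `sslKeyFile` would replace the three ACM settings.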
.../data-prepper/data-prepper-reference.md | 23 ++++++++++++++++
_clients/data-prepper/pipelines.md | 27 +++++++++++++++++++
2 files changed, 50 insertions(+)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index 4d869bc1..1e937c25 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -68,6 +68,29 @@ max_connection_count | No | Integer | The maximum allowed number of open connect
max_pending_requests | No | Integer | The maximum number of allowed tasks in ScheduledThreadPool work queue. Default is `1024`.
authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java).
+### otel_metrics_source
+
+Source for the OpenTelemetry Collector for collecting Metric data.
+
+Option | Required | Type | Description
+:--- |:--------------|:--------| :---
+port | No | Integer | The port OTel metrics source is running on. Default is `21891`.
+request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
+health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
+proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
+unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
+thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
+max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
+ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
+sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
+useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
+acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
+awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+
+
+
### file
Source for flat file input.
diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md
index 3937b599..3f00afb6 100644
--- a/_clients/data-prepper/pipelines.md
+++ b/_clients/data-prepper/pipelines.md
@@ -178,6 +178,33 @@ service-map-pipeline:
Note that it is recommended to scale the `buffer_size` and `batch_size` by the estimated maximum batch size in the client request payload to maintain similar ingestion throughput and latency as in [Classic](#classic).
+### Metrics Pipeline
+
+Data Prepper supports metrics ingestion using OTel. It currently supports the following metric types:
+
+* Gauge
+* Sum
+* Summary
+* Histogram
+
+Other types are not supported, and Data Prepper drops them, including Exponential Histogram and Summary. Additionally,
+Data Prepper does not support Scope instrumentation.
+
+To set up a metrics pipeline:
+
+```yml
+metrics-pipeline:
+  source:
+    otel_metrics_source:
+  processor:
+    - otel_metrics_raw_processor:
+  sink:
+    - opensearch:
+        hosts: ["https://localhost:9200"]
+        username: admin
+        password: admin
+```
+
## Migrating from Logstash
Data Prepper supports Logstash configuration files for a limited set of plugins. Simply use the logstash config to run Data Prepper.
From d601d097d37667cf34e16e19b730db1e408ac727 Mon Sep 17 00:00:00 2001
From: keithhc2
Date: Mon, 16 May 2022 23:54:56 -0700
Subject: [PATCH 5/7] Language tweaks
Signed-off-by: keithhc2
---
.../data-prepper/data-prepper-reference.md | 78 ++++++++++---------
_clients/data-prepper/pipelines.md | 15 ++--
2 files changed, 47 insertions(+), 46 deletions(-)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index 1e937c25..dc02f1d2 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -37,23 +37,25 @@ Sources define where your data comes from.
Source for the OpenTelemetry Collector.
-Option | Required | Type | Description
-:--- |:--------------|:--------| :---
-port | No | Integer | The port OTel trace source is running on. Default is `21890`.
-request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
-health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
-proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
-unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
-thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
-max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
-ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
-sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
-sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
-useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
-acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
-awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
-authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
-record_type | No | String | A string represents the supported record data type that will be written into the buffer plugin. Its value takes either `otlp` or `event`. Default is `otlp`. - `otlp`: otel-trace-source will write each incoming ExportTraceServiceRequest as record data type into the buffer.
- `event`: otel-trace-source will decode each incoming ExportTraceServiceRequest into collection of Data Prepper internal spans serving as buffer items. To achieve better performance in this mode, it is recommended to set the buffer capacity proportional to the estimated number of spans in the incoming request payload.
+Option | Required | Type | Description
+:--- |:--- |:--- | :---
+port | No | Integer | The port OTel trace source is running on. Default is `21890`.
+request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
+health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
+proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
+unframed_requests | No | Boolean | Enables requests not framed using the gRPC wire protocol.
+thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
+max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
+ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
+sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if `ssl` is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`.
+useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
+acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. The ACM certificate takes precedence over an S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
+awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This parameter uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+record_type | No | String | A string that represents the record data type written into the buffer plugin. Value options are `otlp` or `event`. Default is `otlp`.
+`otlp` | No | String | Otel-trace-source writes each incoming `ExportTraceServiceRequest` request as record data type into the buffer.
+`event` | No | String | Otel-trace-source decodes each incoming `ExportTraceServiceRequest` request into a collection of Data Prepper internal spans serving as buffer items. To achieve better performance in this mode, we recommend setting buffer capacity proportional to the estimated number of spans in the incoming request payload.
### http_source
@@ -66,28 +68,28 @@ request_timeout | No | Integer | The request timeout in millis. Default is `10_0
thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
max_pending_requests | No | Integer | The maximum number of allowed tasks in ScheduledThreadPool work queue. Default is `1024`.
-authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java).
+authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This parameter uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication, use or create a plugin that implements [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java).
### otel_metrics_source
-Source for the OpenTelemetry Collector for collecting Metric data.
+Source for the OpenTelemetry Collector for collecting metric data.
-Option | Required | Type | Description
-:--- |:--------------|:--------| :---
-port | No | Integer | The port OTel metrics source is running on. Default is `21891`.
-request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
-health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
-proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
-unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
-thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
-max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
-ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
-sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
-sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
-useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
-acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
-awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
-authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication use or create a plugin which implements: [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
+Option | Required | Type | Description
+:--- |:--- |:--- | :---
+port | No | Integer | The port OTel metrics source is running on. Default is `21891`.
+request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
+health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
+proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`.
+unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol.
+thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
+max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
+ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
+sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
+useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
+acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
+awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide custom authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
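As a rough illustration of how the options in the table above fit together, the following fragment configures `otel_metrics_source` with TLS from local files and basic authentication. It is a minimal sketch, not a complete pipeline: the pipeline name, credentials, and file paths are placeholder values, and the nesting of `username` and `password` under `http_basic` is assumed from the option description.

```yml
metrics-pipeline:
  source:
    otel_metrics_source:
      port: 21891                  # default port listed in the table
      ssl: true                    # default; requires the two files below
      sslKeyCertChainFile: "config/demo-data-prepper.crt"
      sslKeyFile: "config/demo-data-prepper.key"
      authentication:
        http_basic:                # assumed nesting for the http_basic plugin
          username: "my-user"      # placeholder
          password: "my-password"  # placeholder
```

If the certificate and key are stored in S3 instead, the table indicates that `awsRegion` is also required.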
@@ -148,7 +150,7 @@ trace_flush_interval | No | Integer | Represents the time interval in seconds to
### otel_trace_raw
-This processor is a Data Prepper event record type compatible version of `otel_trace_raw_prepper` that fills in trace group related fields into all incoming Data Prepper span records. It requires `record_type` to be set as `event` in `otel_trace_source`.
+This processor is a version of `otel_trace_raw_prepper` that is compatible with the Data Prepper event record type. It fills in trace-group-related fields for all incoming Data Prepper span records. It requires `record_type` to be set to `event` in `otel_trace_source`.
Option | Required | Type | Description
:--- | :--- | :--- | :---
@@ -174,7 +176,7 @@ target_port | No | Integer | The destination port to forward requests to. Defaul
discovery_mode | No | String | Peer discovery mode to be used. Allowable values are `static`, `dns`, and `aws_cloud_map`. Defaults to `static`.
static_endpoints | No | List | List containing string endpoints of all Data Prepper instances.
domain_name | No | String | Single domain name to query DNS against. Typically used by creating multiple DNS A Records for the same domain.
-ssl | No | Boolean | Indicates whether TLS should be used. Default is true.
+ssl | No | Boolean | Indicates whether to use TLS. Default is true.
awsCloudMapNamespaceName | Conditionally | String | Name of your CloudMap Namespace. Required if `discovery_mode` is set to `aws_cloud_map`.
awsCloudMapServiceName | Conditionally | String | Service name within your CloudMap Namespace. Required if `discovery_mode` is set to `aws_cloud_map`.
sslKeyCertChainFile | Conditionally | String | Represents the SSL certificate chain file path or AWS S3 path. S3 path example `s3:///`. Required if `ssl` is set to `true`.
@@ -202,7 +204,7 @@ group_duration | No | String | The amount of time that a group should exist befo
### date
-Adds a default timestamp to the event or parses timestamp fields, and converts it to ISO 8601 format which can be used as event timestamp.
+Adds a default timestamp to the event or parses timestamp fields and converts them to ISO 8601 format, which can be used as the event timestamp.
Option | Required | Type | Description
:--- | :--- | :--- | :---
diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md
index 3f00afb6..167d870f 100644
--- a/_clients/data-prepper/pipelines.md
+++ b/_clients/data-prepper/pipelines.md
@@ -71,7 +71,7 @@ log-pipeline:
This example uses weak security. We strongly recommend securing all plugins which open external ports in production environments.
{: .note}
-### Trace Analytics pipeline
+### Trace analytics pipeline
The following example demonstrates how to build a pipeline that supports the [Trace Analytics OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/observability-plugin/trace/ta-dashboards/). This pipeline takes data from the OpenTelemetry Collector and uses two other pipelines as sinks. These two separate pipelines index trace and the service map documents for the dashboard plugin.
@@ -121,7 +121,7 @@ service-map-pipeline:
#### Event record type
-Starting from Data Prepper 1.4, we support event record type in trace analytics pipeline source, buffer and processors.
+Starting with Data Prepper 1.4, the trace analytics pipeline source, buffer, and processors support the event record type.
```yml
entry-pipeline:
@@ -176,9 +176,9 @@ service-map-pipeline:
trace_analytics_service_map: true
```
-Note that it is recommended to scale the `buffer_size` and `batch_size` by the estimated maximum batch size in the client request payload to maintain similar ingestion throughput and latency as in [Classic](#classic).
+To maintain ingestion throughput and latency similar to [Classic](#classic), scale `buffer_size` and `batch_size` by the estimated maximum batch size in the client request payload.
-### Metrics Pipeline
+### Metrics pipeline
Data Prepper supports metrics ingestion using OTel. It currently supports the following metric types:
@@ -187,12 +187,11 @@ Data Prepper supports metrics ingestion using OTel. It currently supports the fo
* Summary
* Histogram
-Other types are not support and Data Prepper will drop these types, including Exponential Histogram and Summary. Additionally,
-Data Prepper does not support Scope instrumentation.
+Other types are not supported. Data Prepper drops all other types, including Exponential Histogram. Additionally, Data Prepper does not support Scope instrumentation.
-To setup a Metrics pipeline:
+To set up a metrics pipeline:
-```
+```yml
metrics-pipeline:
source:
otel_trace_source:
From 28b3503be3c927bf3d9f148068547808d1caf7c8 Mon Sep 17 00:00:00 2001
From: keithhc2
Date: Mon, 16 May 2022 23:57:10 -0700
Subject: [PATCH 6/7] Style fixes
Signed-off-by: keithhc2
---
_clients/data-prepper/data-prepper-reference.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index dc02f1d2..12fb5c18 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -75,7 +75,7 @@ authentication | No | Object | An authentication configuration. By default, this
Source for the OpenTelemetry Collector for collecting metric data.
Option | Required | Type | Description
-:--- |:--- |:--- | :---
+:--- | :--- | :--- | :---
port | No | Integer | The port OTel metrics source is running on. Default is `21891`.
request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
From bb7320bb7db381b612895e318d9afc2990eea707 Mon Sep 17 00:00:00 2001
From: keithhc2
Date: Tue, 17 May 2022 12:47:45 -0700
Subject: [PATCH 7/7] Addressed comments
Signed-off-by: keithhc2
---
_clients/data-prepper/data-prepper-reference.md | 16 ++++++++--------
_clients/data-prepper/pipelines.md | 2 +-
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md
index 12fb5c18..6cc5fb09 100644
--- a/_clients/data-prepper/data-prepper-reference.md
+++ b/_clients/data-prepper/data-prepper-reference.md
@@ -14,7 +14,7 @@ This page lists all supported Data Prepper server, sources, buffers, processors,
Option | Required | Type | Description
:--- | :--- | :--- | :---
ssl | No | Boolean | Indicates whether TLS should be used for server APIs. Defaults to true.
-keyStoreFilePath | No | String | Path to a .jks or .p12 keystore file. Required if ssl is true.
+keyStoreFilePath | No | String | Path to a .jks or .p12 keystore file. Required if `ssl` is true.
keyStorePassword | No | String | Password for keystore. Optional, defaults to empty string.
privateKeyPassword | No | String | Password for private key within keystore. Optional, defaults to empty string.
serverPort | No | Integer | Port number to use for server APIs. Defaults to 4900
@@ -38,7 +38,7 @@ Sources define where your data comes from.
Source for the OpenTelemetry Collector.
Option | Required | Type | Description
-:--- |:--- |:--- | :---
+:--- | :--- | :--- | :---
port | No | Integer | The port OTel trace source is running on. Default is `21890`.
request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`.
health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`.
@@ -48,7 +48,7 @@ thread_count | No | Integer | The number of threads to keep in the ScheduledThre
max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if `ssl` is set to `true`.
-sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`.
useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
@@ -84,11 +84,11 @@ unframed_requests | No | Boolean | Enable requests not framed using the gRPC wir
thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`.
max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`.
ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`.
-sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if ssl is set to `true`.
-sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if ssl is set to `true`.
+sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if `ssl` is set to `true`.
+sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`.
useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
-acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
-awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificates take precedence over S3 or local file system certificates. Required if `useAcmCertForSSL` is set to `true`.
+awsRegion | Conditionally | String | Represents the AWS Region to use for ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide custom authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java).
@@ -181,7 +181,7 @@ awsCloudMapNamespaceName | Conditionally | String | Name of your CloudMap Namesp
awsCloudMapServiceName | Conditionally | String | Service name within your CloudMap Namespace. Required if `discovery_mode` is set to `aws_cloud_map`.
sslKeyCertChainFile | Conditionally | String | Represents the SSL certificate chain file path or AWS S3 path. S3 path example `s3:///`. Required if `ssl` is set to `true`.
useAcmCertForSSL | No | Boolean | Enables TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
-awsRegion | Conditionally | String | Represents the AWS region to use ACM, S3, or CloudMap. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
+awsRegion | Conditionally | String | Represents the AWS Region to use for ACM, S3, or CloudMap. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths.
acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`.
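
The discovery options above can be combined as in the following sketch. It assumes the table belongs to the `peer_forwarder` processor (the heading falls outside this hunk) and uses static discovery with TLS turned off for brevity. The pipeline name and endpoint host names are placeholders.

```yml
example-pipeline:
  processor:
    - peer_forwarder:              # assumed processor name; heading not shown in this hunk
        discovery_mode: "static"   # one of static, dns, aws_cloud_map
        static_endpoints: ["data-prepper-1", "data-prepper-2"]  # placeholder hosts
        ssl: false                 # default is true, which requires certificate settings
```

With `discovery_mode` set to `aws_cloud_map`, the table lists `awsCloudMapNamespaceName` and `awsCloudMapServiceName` as required in place of `static_endpoints`.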
### string_converter
diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md
index 167d870f..6f4a64a4 100644
--- a/_clients/data-prepper/pipelines.md
+++ b/_clients/data-prepper/pipelines.md
@@ -182,7 +182,7 @@ Note that it is recommended to scale the `buffer_size` and `batch_size` by the e
Data Prepper supports metrics ingestion using OTel. It currently supports the following metric types:
-* Guage
+* Gauge
* Sum
* Summary
* Histogram