From 4c885dd7677a0e8bef7bf3e36c902061a0640394 Mon Sep 17 00:00:00 2001 From: Caroline <113052567+carolxob@users.noreply.github.com> Date: Tue, 24 Jan 2023 14:25:24 -0700 Subject: [PATCH] Add Monitoring to doc website repo (#2018) * Adding file to new pull request to resolve problem with previous PR. Signed-off-by: carolxob * Minor formatting changes and added a hyperlink. Signed-off-by: carolxob * Minor updates based on doc review feedback. Signed-off-by: carolxob * Awaiting a couple of tech review comments. Doc review feedback incorporated. Signed-off-by: carolxob * Added comment regarding data-prepper-api link. Signed-off-by: carolxob * Checkin test. Signed-off-by: carolxob * Minor edits from technical and editorial feedback. Signed-off-by: carolxob * Minor edits. Signed-off-by: carolxob * Adjusted capitalization for one word. * Added comment for context in editorial review. Signed-off-by: carolxob * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate 
Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Update _data-prepper/monitoring.md Co-authored-by: Nate Bower * Removed markdown comment from file. Signed-off-by: carolxob * Slight word change. Signed-off-by: carolxob * Update _data-prepper/monitoring.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Made doc review edits. Signed-off-by: carolxob * Minor updates to phrasing. Signed-off-by: carolxob * Incorporated technical review feedback. Signed-off-by: carolxob * Minor updates based on tech review feedback. Signed-off-by: carolxob * Minor update to phrasing Signed-off-by: carolxob * Minor updates. Signed-off-by: carolxob * Minor updates. Signed-off-by: carolxob Signed-off-by: carolxob Co-authored-by: Nate Bower Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _data-prepper/monitoring.md | 58 +++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 _data-prepper/monitoring.md diff --git a/_data-prepper/monitoring.md b/_data-prepper/monitoring.md new file mode 100644 index 00000000..ea1c2a26 --- /dev/null +++ b/_data-prepper/monitoring.md @@ -0,0 +1,58 @@ +--- +layout: default +title: Monitoring +nav_order: 33 +--- + +# Monitoring Data Prepper with metrics + +You can monitor Data Prepper with metrics using [Micrometer](https://micrometer.io/). There are two types of metrics: JVM/system metrics and plugin metrics. [Prometheus](https://prometheus.io/) is used as the default metrics backend. + +## JVM and system metrics + +JVM and system metrics are runtime metrics that are used to monitor Data Prepper instances. They include metrics for classloaders, memory, garbage collection, threads, and others. For more information, see [JVM and system metrics](https://micrometer.io/docs/ref/jvm). 
+
+### Naming
+
+JVM and system metrics follow predefined names in [Micrometer](https://micrometer.io/docs/concepts#_naming_meters). For example, the Micrometer metric name for memory usage is `jvm.memory.used`. Micrometer adapts the name to the conventions of each metrics system. Following the same example, `jvm.memory.used` is reported to Prometheus as `jvm_memory_used` and to Amazon CloudWatch as `jvm.memory.used.value`.
+
+### Serving
+
+By default, metrics are served from the **/metrics/sys** endpoint on the Data Prepper server in Prometheus scrape format. You can configure Prometheus to scrape from the Data Prepper URL. Prometheus then polls Data Prepper for metrics and stores them in its database. To visualize the data, you can set up any frontend that accepts Prometheus metrics, such as [Grafana](https://prometheus.io/docs/visualization/grafana/). You can also update the configuration to serve metrics to other registries, such as Amazon CloudWatch. CloudWatch does not require Data Prepper to host the endpoint; instead, Data Prepper publishes the metrics directly to CloudWatch.
+
+## Plugin metrics
+
+Plugins report their own metrics. Data Prepper uses a naming convention to keep plugin metric names consistent. Plugin metrics do not use dimensions.
+
+
+1. AbstractBuffer
+    - Counter
+        - `recordsWritten`: The number of records written into a buffer
+        - `recordsRead`: The number of records read from a buffer
+        - `recordsProcessed`: The number of records read from a buffer and marked as processed
+        - `writeTimeouts`: The number of write timeouts in a buffer
+    - Gauge
+        - `recordsInBuffer`: The number of records in a buffer
+        - `recordsInFlight`: The number of records read from a buffer and being processed by downstream Data Prepper components (for example, a processor or sink)
+    - Timer
+        - `readTimeElapsed`: The time elapsed while reading from a buffer
+        - `checkpointTimeElapsed`: The time elapsed while checkpointing
+2. AbstractProcessor
+    - Counter
+        - `recordsIn`: The number of records ingressed into a processor
+        - `recordsOut`: The number of records egressed from a processor
+    - Timer
+        - `timeElapsed`: The time elapsed during initiation of a processor
+3. AbstractSink
+    - Counter
+        - `recordsIn`: The number of records ingressed into a sink
+    - Timer
+        - `timeElapsed`: The time elapsed during execution of a sink
+
+### Naming
+
+Metrics follow a naming convention of **PIPELINE_NAME_PLUGIN_NAME_METRIC_NAME**. For example, a **recordsIn** metric for the **opensearch-sink** plugin in a pipeline named **output-pipeline** has a qualified name of **output-pipeline_opensearch_sink_recordsIn**.
+
+### Serving
+
+By default, metrics are served from the **/metrics/sys** endpoint on the Data Prepper server in Prometheus scrape format. You can configure Prometheus to scrape from the Data Prepper URL. The Data Prepper server port has a default value of `4900`, which you can modify, and this port can be used by any frontend that accepts Prometheus metrics, such as [Grafana](https://prometheus.io/docs/visualization/grafana/). You can also update the configuration to serve metrics to other registries, such as Amazon CloudWatch. CloudWatch does not require Data Prepper to host the endpoint; instead, Data Prepper publishes the metrics directly to CloudWatch.
\ No newline at end of file
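
As a rough illustration of the serving behavior described above, the following Python sketch reads Prometheus scrape-format text and groups sample values by metric name. The port `4900` and the `/metrics/sys` path come from the text. The `http://localhost` scheme and host, the function names `parse_prometheus_text` and `fetch_metrics`, and every value in `SAMPLE` are assumptions made for this example. The metric names in `SAMPLE` are copied from the examples in the text, and a real server may render them differently.

```python
from urllib.request import urlopen

# Assumption for illustration: plain HTTP on localhost. The port (4900) and
# the /metrics/sys path are the defaults named in the text above.
DEFAULT_URL = "http://localhost:4900/metrics/sys"


def parse_prometheus_text(text):
    """Map each metric name to the list of sample values found in scrape text."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        if "{" in line:
            # Form: name{label="value",...} value [timestamp]
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            # Form: name value [timestamp]
            name, rest = line.split(None, 1)
        value = float(rest.split()[0])
        samples.setdefault(name, []).append(value)
    return samples


def fetch_metrics(url=DEFAULT_URL, timeout=5.0):
    """Fetch the endpoint once and parse the response body."""
    with urlopen(url, timeout=timeout) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))


# Hypothetical scrape output: names follow the examples in the text,
# labels and values are invented.
SAMPLE = """\
# TYPE jvm_memory_used gauge
jvm_memory_used{area="heap",id="G1 Eden Space"} 1048576.0
jvm_memory_used{area="nonheap",id="Metaspace"} 2097152.0
output-pipeline_opensearch_sink_recordsIn 128.0
"""

if __name__ == "__main__":
    for metric_name, values in parse_prometheus_text(SAMPLE).items():
        print(metric_name, values)
```

Calling `fetch_metrics()` against a running instance should return the same kind of mapping, provided the endpoint is reachable at the assumed URL without TLS or authentication.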