---
layout: default
title: Processors
has_children: true
parent: Pipelines
nav_order: 25
---
# Processors
Processors perform an action on your data, such as filtering, transforming, or enriching it.
Prior to Data Prepper 1.3, processors were named preppers. Starting in Data Prepper 1.3, the term "prepper" is deprecated in favor of "processor". Data Prepper continues to support the term "prepper" until version 2.0, in which it is removed.
{: .note }
## copy_values
Copy values within an event. `copy_values` is part of [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of entries to be copied. Each entry can contain the keys `from_key`, `to_key`, and `overwrite_if_to_key_exists`.
from_key | N/A | N/A | The key of the entry to be copied.
to_key | N/A | N/A | The key of the new entry to be added.
overwrite_if_to_key_exists | No | Boolean | If true, the existing value gets overwritten if the key already exists within the event. Default is `false`.
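
The options above might be combined as in the following minimal sketch. Only the `processor` section of a pipeline is shown, and the key names `message` and `newMessage` are placeholders chosen for illustration.

```yaml
processor:
  - copy_values:
      entries:
        # Copy the value stored under "message" into "newMessage".
        # Both key names are hypothetical.
        - from_key: "message"
          to_key: "newMessage"
          # Replace "newMessage" if the event already contains it.
          overwrite_if_to_key_exists: true
```

With this configuration, an event such as `{"message": "hello"}` would be expected to leave the processor as `{"message": "hello", "newMessage": "hello"}`.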
## delete_entries
Delete entries in an event. `delete_entries` is part of [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
with_keys | Yes | List | An array of keys of the entries to be deleted.
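
As an illustration, the following sketch deletes a single entry. The key name `message` is a placeholder, and only the `processor` section of a pipeline is shown.

```yaml
processor:
  - delete_entries:
      # Keys listed here are removed from each event; "message" is hypothetical.
      with_keys: ["message"]
```

An event such as `{"message": "hello", "message2": "goodbye"}` would be expected to become `{"message2": "goodbye"}`.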
## rename_keys
Rename keys in an event. `rename_keys` is part of [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of entries. Each entry can contain the keys `from_key`, `to_key`, and `overwrite_if_to_key_exists`. Renaming occurs in the order defined.
from_key | N/A | N/A | The key of the entry to be renamed.
to_key | N/A | N/A | The new key of the entry.
overwrite_if_to_key_exists | No | Boolean | If true, the existing value gets overwritten if `to_key` already exists in the event.
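
Because renaming occurs in the order defined, entries can be chained. The sketch below assumes placeholder key names (`message`, `message2`, `message3`) and shows only the `processor` section of a pipeline.

```yaml
processor:
  - rename_keys:
      entries:
        # First rename "message" to "message2" ...
        - from_key: "message"
          to_key: "message2"
        # ... then rename "message2" to "message3", using the result of the first entry.
        - from_key: "message2"
          to_key: "message3"
          overwrite_if_to_key_exists: true
```

Under these assumptions, an event `{"message": "hello"}` would end up as `{"message3": "hello"}`.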
## substitute_string
Matches a key's value against a regular expression and replaces all matches with a replacement string. `substitute_string` is part of [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of entries. Valid values are `source`, `from`, and `to`.
source | N/A | N/A | The key to modify.
from | N/A | N/A | The regular expression to match. Special regex characters such as `[` and `]` must be escaped with `\\` inside double quotes and with `\` inside single quotes. See [Java Patterns](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html) for more information.
to | N/A | N/A | The string to be substituted for each match of `from`.
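
A minimal sketch of a substitution follows. The key name `message` and the choice of characters are illustrative only, and the second entry is included to show the escaping rule for single quotes.

```yaml
processor:
  - substitute_string:
      entries:
        # Replace every ":" in the value of "message" with "-".
        - source: "message"
          from: ":"
          to: "-"
        # "[" is a special regex character, so it is escaped.
        # Inside single quotes a single backslash is used.
        - source: "message"
          from: '\['
          to: "("
```

With only the first entry applied, a value of `ab:cd:ab:cd` would be expected to become `ab-cd-ab-cd`.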
## split_string
Splits a field into an array using a delimiter character. `split_string` is part of [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of entries. Valid values are `source`, `delimiter`, and `delimiter_regex`.
source | N/A | N/A | The key to split.
delimiter | No | N/A | The separator character used for the split. Cannot be defined at the same time as `delimiter_regex`. Exactly one of `delimiter` or `delimiter_regex` must be defined.
delimiter_regex | No | N/A | The regex string used for the split. Cannot be defined at the same time as `delimiter`. Exactly one of `delimiter` or `delimiter_regex` must be defined.
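
The following sketch splits one field on a literal comma and another on a regular expression. The key names `message` and `tags` are hypothetical, and each entry sets only one of the two delimiter options.

```yaml
processor:
  - split_string:
      entries:
        # Split on a single literal character.
        - source: "message"
          delimiter: ","
        # Split on a regex instead: here, one or more whitespace characters.
        - source: "tags"
          delimiter_regex: '\s+'
```

For the first entry, a value of `hello,world` would be expected to become the array `["hello", "world"]`.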
## uppercase_string
Converts a string to its uppercase counterpart. `uppercase_string` is part of [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
with_keys | Yes | List | A list of keys to convert to uppercase.
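
A short sketch, assuming an event that carries a hypothetical `level` key:

```yaml
processor:
  - uppercase_string:
      # Values of the listed keys are converted to uppercase, e.g. "warn" to "WARN".
      with_keys:
        - "level"
```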
## lowercase_string
Converts a string to its lowercase counterpart. `lowercase_string` is part of [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
with_keys | Yes | List | A list of keys to convert to lowercase.
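
A short sketch, assuming hypothetical `hostname` and `method` keys whose values should be normalized to lowercase:

```yaml
processor:
  - lowercase_string:
      # Several keys can be listed; each value is lowercased independently.
      with_keys:
        - "hostname"
        - "method"
```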
## trim_string
Strips whitespace from the beginning and end of a key's value. `trim_string` is part of [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processors.
Option | Required | Type | Description
:--- | :--- | :--- | :---
with_keys | Yes | List | A list of keys to trim the whitespace from.
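
The mutate string processors can be listed one after another in the same pipeline. The sketch below assumes they run in the order listed, and trims a hypothetical `message` key before lowercasing it.

```yaml
processor:
  # Remove leading and trailing whitespace from the value of "message" ...
  - trim_string:
      with_keys:
        - "message"
  # ... then lowercase the trimmed value.
  - lowercase_string:
      with_keys:
        - "message"
```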
## csv
Takes in an Event and parses its CSV data into columns.
Option | Required | Type | Description
:--- | :--- | :--- | :---
source | No | String | The field in the Event that will be parsed. Default is `message`.
quote_character | No | String | The character used as a text qualifier for a single column of data. Default is double quote `"`.
delimiter | No | String | The character separating each column. Default is `,`.
delete_header | No | Boolean | If `true`, the header on the Event (`column_names_source_key`) is deleted after the Event is parsed. If there is no header on the Event, no action is taken. Default is `true`.
column_names_source_key | No | String | The field in the Event that specifies the CSV column names, which are detected automatically. If extra column names are needed, they are autogenerated according to their index. If `column_names` is also defined, the header in `column_names_source_key` can also be used to generate the Event fields. If too few columns are specified in this field, the remaining column names are autogenerated. If too many column names are specified in this field, the CSV processor omits the extra column names.
column_names | No | List | User-specified names for the CSV columns. Default is `[column1, column2, ..., columnN]` if there are N columns of data in the CSV record and `column_names_source_key` is not defined. If `column_names_source_key` is defined, the header in `column_names_source_key` generates the Event fields. If too few columns are specified in this field, the remaining column names are autogenerated. If too many column names are specified in this field, the CSV processor omits the extra column names.
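
As a sketch of how these options interact, the configuration below names only two columns. The names `col1` and `col2` are placeholders, and the other values simply restate the defaults from the table.

```yaml
processor:
  - csv:
      # Parse the "message" field (the default source).
      source: "message"
      delimiter: ","
      quote_character: '"'
      # Only two names are given; further columns fall back to autogenerated names.
      column_names: ["col1", "col2"]
```

Following the rules in the table, a `message` of `1,2,3` would be expected to yield `col1`, `col2`, and an autogenerated `column3`.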
## json
Takes in an Event and parses its JSON data, including any nested fields.
Option | Required | Type | Description
:--- | :--- | :--- | :---
source | No | String | The field in the `Event` that will be parsed. Default is `message`.
destination | No | String | The destination field of the parsed JSON. Defaults to the root of the `Event`. Cannot be `""`, `/`, or any whitespace-only `String` because these are not valid `Event` fields.
pointer | No | String | A JSON Pointer to the field to be parsed. There is no `pointer` by default, meaning the entire `source` is parsed. The `pointer` can also access JSON array indexes. If the JSON Pointer is invalid, the entire `source` data is parsed into the outgoing `Event`. If the pointed-to key already exists in the `Event` and the `destination` is the root, the pointer uses the entire path of the key.
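
The sketch below parses a JSON string into a nested field. It assumes the processor is registered under the plugin name `parse_json`, which is not stated on this page, and the destination name `parsed` is a placeholder.

```yaml
processor:
  # Assumption: the JSON processor is configured as "parse_json".
  - parse_json:
      # Field holding the raw JSON string (the default).
      source: "message"
      # Write the parsed object under "parsed" instead of the Event root.
      destination: "parsed"
```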
# Routes
Routes define conditions that can be used in sinks for conditional routing. Routes are specified at the same level as processors and sinks under the name `route` and consist of a list of key-value pairs, where the key is the name of a route and the value is a Data Prepper expression representing the routing condition.
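
A route definition and its use in sinks might look like the following sketch. The route names, the `/log_type` field, the host URL, and the index names are all illustrative assumptions.

```yaml
route:
  # Each entry maps a route name to a Data Prepper expression.
  - application-logs: '/log_type == "application"'
  - http-logs: '/log_type == "apache"'
sink:
  # Receives only events that match the "application-logs" route.
  - opensearch:
      hosts: ["https://opensearch:9200"]
      index: application_logs
      routes: [application-logs]
  # Receives only events that match the "http-logs" route.
  - opensearch:
      hosts: ["https://opensearch:9200"]
      index: http_logs
      routes: [http-logs]
```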
# Sinks
Sinks define where Data Prepper writes your data.