From e1a1f44dd6931a2f10844fe397cb18d01a01afb9 Mon Sep 17 00:00:00 2001
From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Date: Thu, 20 Apr 2023 13:48:36 -0500
Subject: [PATCH] Add list to map processor (#3806)

* Add list to map processor.
Signed-off-by: Naarcha-AWS

* Tweak one last file
Signed-off-by: Naarcha-AWS

* Fix typo
Signed-off-by: Naarcha-AWS

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update mutate-event.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Add Chris' feedback
Signed-off-by: Naarcha-AWS

* A couple more wording tweaks
Signed-off-by: Naarcha-AWS

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Nathan Bower

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md
Co-authored-by: Nathan Bower

* Apply suggestions from code review
Co-authored-by: Nathan Bower

---------

Signed-off-by: Naarcha-AWS
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower
---
 .../configuration/processors/add-entries.md   |  61 +++-
 .../configuration/processors/aggregate.md     |  11 +-
 .../processors/convert_entry_type.md          |  55 ++++
 .../configuration/processors/copy-values.md   |  60 +++-
 .../pipelines/configuration/processors/csv.md |  10 +-
 .../configuration/processors/date.md          |  11 +-
 .../processors/delete-entries.md              |  49 ++-
 .../configuration/processors/drop-events.md   |   5 +-
 .../configuration/processors/grok.md          |   9 +-
 .../configuration/processors/key-value.md     |   5 +-
 .../configuration/processors/list-to-map.md   | 305 +++++++++++++++++
 .../processors/lowercase-string.md            |   5 +-
 .../configuration/processors/mutate-event.md  | 306 +-----------------
 .../configuration/processors/mutate-string.md |   2 +-
 .../processors/otel-trace-raw.md              |  16 +-
 .../configuration/processors/parse-json.md    |   4 +-
 .../configuration/processors/rename-keys.md   |  98 +++++-
 .../configuration/processors/routes.md        |   6 +-
 .../processors/service-map-stateful.md        |  10 +-
 .../configuration/processors/sinks.md         |  21 --
 .../configuration/processors/split-string.md  |   5 +-
 .../processors/string-converter.md            |   5 +-
 .../processors/substitute-string.md           |  10 +-
 .../processors/trace-peer-forwarder.md        |   4 +-
 .../configuration/processors/trim-string.md   |   6 +-
 .../processors/uppercase-string.md            |   6 +-
 26 files changed, 657 insertions(+), 428 deletions(-)
 create mode 100644 _data-prepper/pipelines/configuration/processors/convert_entry_type.md
 create mode 100644 _data-prepper/pipelines/configuration/processors/list-to-map.md
 delete mode 100644 _data-prepper/pipelines/configuration/processors/sinks.md

diff --git a/_data-prepper/pipelines/configuration/processors/add-entries.md b/_data-prepper/pipelines/configuration/processors/add-entries.md
index 457cbb4d..fe69ae1b 100644
--- a/_data-prepper/pipelines/configuration/processors/add-entries.md
+++ b/_data-prepper/pipelines/configuration/processors/add-entries.md
@@ -1,25 +1,62 @@
 ---
 layout: default
-title: add_entries
+title: Add entries processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 40
 ---
 
 # add_entries
 
-## Overview
+The `add_entries` processor adds entries to an event.
 
-The `add_entries` processor adds an entry to the event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor. The following table describes the options you can use to configure the `add_entries` processor.
+### Configuration
 
-Option | Required | Type | Description
-:--- | :--- | :--- | :---
-entries | Yes | List | List of events to be added. Valid entries are `key`, `value`, and `overwrite_if_key_exists`.
-key | N/A | N/A | Key of the new event to be added.
-value | N/A | N/A | Value of the new entry to be added. Valid data types are strings, booleans, numbers, null, nested objects, and arrays containing the aforementioned data types.
-overwrite_if_key_exists | No | Boolean | If true, the existing value is overwritten if the key already exists within the event. Default value is `false`.
+You can configure the `add_entries` processor with the following options.
-
+
+| Option | Required | Description |
+| :--- | :--- | :--- |
+| `entries` | Yes | A list of entries to add to an event. |
+| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
+| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. |
+| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |
+
+### Usage
+
+To get started, create the following `pipeline.yaml` file:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - add_entries:
+        entries:
+        - key: "newMessage"
+          value: 3
+          overwrite_if_key_exists: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+
+Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+
+For example, before you run the `add_entries` processor, if the `logs_json.log` file contains the following event record:
+
+```json
+{"message": "hello"}
+```
+
+Then when you run the `add_entries` processor using the previous configuration, it adds a new entry `{"newMessage": 3}` to the existing event `{"message": "hello"}` so that the new event contains two entries in the final output:
+
+```json
+{"message": "hello", "newMessage": 3}
+```
+
+> If `newMessage` already exists, its existing value is overwritten with a value of `3`.
diff --git a/_data-prepper/pipelines/configuration/processors/aggregate.md b/_data-prepper/pipelines/configuration/processors/aggregate.md
index 51f174ae..36bf3dd8 100644
--- a/_data-prepper/pipelines/configuration/processors/aggregate.md
+++ b/_data-prepper/pipelines/configuration/processors/aggregate.md
@@ -1,16 +1,19 @@
 ---
 layout: default
-title: aggregate
+title: Aggregate processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 41
 ---
 
 # aggregate
 
-## Overview
+The `aggregate` processor groups events based on the keys provided and performs an action on each group.
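+
+As a quick illustration before the configuration details, the following minimal sketch groups events by two keys and removes duplicates within each group. The `identification_keys`, `action`, `remove_duplicates`, and `group_duration` names are assumptions drawn from the aggregate plugin rather than options documented on this page, so treat this as a sketch only:
+
+```yaml
+aggregate-pipeline:
+  source:
+    http:
+  processor:
+    - aggregate:
+        identification_keys: ["sourceIp", "destinationIp"]
+        action:
+          remove_duplicates:
+        group_duration: "180s"
+  sink:
+    - stdout:
+```
+{% include copy.html %}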
-The `aggregate` processor groups events based on the keys provided and performs an action on each group. The following table describes the options you can use to configure the `aggregate` processor.
+
+## Configuration
+
+The following table describes the options you can use to configure the `aggregate` processor.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
diff --git a/_data-prepper/pipelines/configuration/processors/convert_entry_type.md b/_data-prepper/pipelines/configuration/processors/convert_entry_type.md
new file mode 100644
index 00000000..b446ce6a
--- /dev/null
+++ b/_data-prepper/pipelines/configuration/processors/convert_entry_type.md
@@ -0,0 +1,55 @@
+---
+layout: default
+title: Convert entry type processor
+parent: Processors
+grand_parent: Pipelines
+nav_order: 47
+---
+
+# convert_entry_type
+
+The `convert_entry_type` processor converts a value type associated with the specified key in an event to the specified type. It is a casting processor that changes the types of some fields in events. Some data must be converted to a different type, such as an integer to a double or a string to an integer, so that the events can pass through condition-based processors or be routed conditionally.
+
+## Configuration
+
+You can configure the `convert_entry_type` processor with the following options.
+
+| Option | Required | Description |
+| :--- | :--- | :--- |
+| `key`| Yes | The key whose value needs to be converted to a different type. |
+| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |
+
+## Usage
+
+To get started, create the following `pipeline.yaml` file:
+
+```yaml
+type-conv-pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - convert_entry_type:
+        key: "response_status"
+        type: "integer"
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+
+For example, before you run the `convert_entry_type` processor, if the `logs_json.log` file contains the following event record:
+
+
+```json
+{"message": "value", "response_status":"200"}
+```
+
+The `convert_entry_type` processor converts the event to the following output, where the type of the `response_status` value changes from a string to an integer:
+
+```json
+{"message":"value","response_status":200}
+```
\ No newline at end of file
diff --git a/_data-prepper/pipelines/configuration/processors/copy-values.md b/_data-prepper/pipelines/configuration/processors/copy-values.md
index 7a9eed84..17db03fd 100644
--- a/_data-prepper/pipelines/configuration/processors/copy-values.md
+++ b/_data-prepper/pipelines/configuration/processors/copy-values.md
@@ -1,24 +1,60 @@
 ---
 layout: default
-title: copy_values
+title: Copy values processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 48
 ---
 
 # copy_values
 
-## Overview
+The `copy_values` processor copies values within an event and is a [mutate event]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/mutate-event/) processor.
-The `copy_values` processor copies values within an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor. The following table describes the options you can use to configure the `copy_values` processor.
+## Configuration
 
-Option | Required | Type | Description
-:--- | :--- | :--- | :---
-entries | Yes | List | The list of entries to be copied. Valid values are `from_key`, `to_key`, and `overwrite_if_key_exists`.
-from_key | N/A | N/A | The key of the entry to be copied.
-to_key | N/A | N/A | The key of the new entry to be added.
-overwrite_if_to_key_exists | No | Boolean | If true, the existing value is overwritten if the key already exists within the event. Default value is `false`.
+You can configure the `copy_values` processor with the following options.
-
+
+| Option | Required | Description |
+| :--- | :--- | :--- |
+| `entries` | Yes | A list of entries to be copied in an event. |
+| `from_key` | Yes | The key of the entry to be copied. |
+| `to_key` | Yes | The key of the new entry to be added. |
+| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |
+
+## Usage
+
+To get started, create the following `pipeline.yaml` file:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - copy_values:
+        entries:
+        - from_key: "message"
+          to_key: "newMessage"
+          overwrite_if_to_key_exists: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+
+For example, before you run the `copy_values` processor, if the `logs_json.log` file contains the following event record:
+
+```json
+{"message": "hello"}
+```
+
+When you run this processor, it parses the message into the following output:
+
+```json
+{"message": "hello", "newMessage": "hello"}
+```
+
+> If `newMessage` already exists, its existing value is overwritten with the value of `message`.
diff --git a/_data-prepper/pipelines/configuration/processors/csv.md b/_data-prepper/pipelines/configuration/processors/csv.md
index 5e2c8978..6475e5fb 100644
--- a/_data-prepper/pipelines/configuration/processors/csv.md
+++ b/_data-prepper/pipelines/configuration/processors/csv.md
@@ -1,16 +1,18 @@
 ---
 layout: default
-title: csv
+title: CSV processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 49
 ---
 
 # csv
 
-## Overview
+The `csv` processor parses comma-separated values (CSVs) from the event into columns.
 
-The `csv` processor parses comma-separated values (CSVs) from the event into columns. The following table describes the options you can use to configure the `csv` processor.
+## Configuration
+
+The following table describes the options you can use to configure the `csv` processor.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
diff --git a/_data-prepper/pipelines/configuration/processors/date.md b/_data-prepper/pipelines/configuration/processors/date.md
index 73215ca4..93ddb19d 100644
--- a/_data-prepper/pipelines/configuration/processors/date.md
+++ b/_data-prepper/pipelines/configuration/processors/date.md
@@ -1,16 +1,19 @@
 ---
 layout: default
-title: date
+title: Date
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 50
 ---
 
 # date
 
-## Overview
-The `date` processor adds a default timestamp to an event, parses timestamp fields, and converts timestamp information to the International Organization for Standardization (ISO) 8601 format. This timestamp information can be used as an event timestamp. The following table describes the options you can use to configure the `date` processor.
+The `date` processor adds a default timestamp to an event, parses timestamp fields, and converts timestamp information to the International Organization for Standardization (ISO) 8601 format. This timestamp information can be used as an event timestamp.
+
+## Configuration
+
+The following table describes the options you can use to configure the `date` processor.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
diff --git a/_data-prepper/pipelines/configuration/processors/delete-entries.md b/_data-prepper/pipelines/configuration/processors/delete-entries.md
index 0f8bfb5d..abdf80df 100644
--- a/_data-prepper/pipelines/configuration/processors/delete-entries.md
+++ b/_data-prepper/pipelines/configuration/processors/delete-entries.md
@@ -3,19 +3,52 @@ layout: default
 title: delete_entries
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 51
 ---
 
 # delete_entries
 
-## Overview
+The `delete_entries` processor deletes entries, such as key-value pairs, from an event. You can define the keys you want to delete in the `with_keys` field following `delete_entries` in the YAML configuration file. Those keys and their values are deleted.
 
-The `delete_entries` processor deletes entries in an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor. The following table describes the options you can use to configure the `delete-entries` processor.
+## Configuration
 
-Option | Required | Type | Description
-:--- | :--- | :--- | :---
-with_keys | Yes | List | An array of keys of the entries to be deleted.
+You can configure the `delete_entries` processor with the following options.
-
+
+| Option | Required | Description |
+| :--- | :--- | :--- |
+| `with_keys` | Yes | An array of keys for the entries to be deleted. |
+
+## Usage
+
+To get started, create the following `pipeline.yaml` file:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - delete_entries:
+        with_keys: ["message"]
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+
+For example, before you run the `delete_entries` processor, if the `logs_json.log` file contains the following event record:
+
+```json
+{"message": "hello", "message2": "goodbye"}
+```
+
+When you run the `delete_entries` processor, it parses the message into the following output:
+
+```json
+{"message2": "goodbye"}
+```
+
+> If `message` does not exist in the event, then no action occurs.
diff --git a/_data-prepper/pipelines/configuration/processors/drop-events.md b/_data-prepper/pipelines/configuration/processors/drop-events.md
index 8eadd636..4ba453ee 100644
--- a/_data-prepper/pipelines/configuration/processors/drop-events.md
+++ b/_data-prepper/pipelines/configuration/processors/drop-events.md
@@ -1,14 +1,13 @@
 ---
 layout: default
-title: drop_events
+title: Drop events processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 52
 ---
 
 # drop_events
 
-## Overview
 The `drop_events` processor drops all the events that are passed into it. The following table describes when events are dropped and how exceptions for dropping events are handled.
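+
+A minimal sketch of the processor in a pipeline follows. The `drop_when` option and the expression shown are assumptions based on the drop events plugin's conditional support rather than details drawn from this page:
+
+```yaml
+drop-pipeline:
+  source:
+    http:
+  processor:
+    - drop_events:
+        drop_when: '/loglevel == "DEBUG"'
+  sink:
+    - stdout:
+```
+{% include copy.html %}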
diff --git a/_data-prepper/pipelines/configuration/processors/grok.md b/_data-prepper/pipelines/configuration/processors/grok.md
index 62abc3be..91d23637 100644
--- a/_data-prepper/pipelines/configuration/processors/grok.md
+++ b/_data-prepper/pipelines/configuration/processors/grok.md
@@ -3,14 +3,17 @@ layout: default
 title: grok
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 53
 ---
 
 # grok
 
-## Overview
-The `Grok` processor takes unstructured data and utilizes pattern matching to structure and extract important keys. The following table describes options you can use with the `Grok` processor to structure your data and make your data easier to query.
+The `Grok` processor takes unstructured data and utilizes pattern matching to structure and extract important keys.
+
+## Configuration
+
+The following table describes options you can use with the `Grok` processor to structure your data and make your data easier to query.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
diff --git a/_data-prepper/pipelines/configuration/processors/key-value.md b/_data-prepper/pipelines/configuration/processors/key-value.md
index 957d7de5..8c4e71da 100644
--- a/_data-prepper/pipelines/configuration/processors/key-value.md
+++ b/_data-prepper/pipelines/configuration/processors/key-value.md
@@ -1,14 +1,13 @@
 ---
 layout: default
-title: key_value
+title: Key value processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 54
 ---
 
 # key_value
 
-## Overview
 The `key_value` processor parses a field into key/value pairs. The following table describes `key_value` processor options available that help you parse field information into pairs.
diff --git a/_data-prepper/pipelines/configuration/processors/list-to-map.md b/_data-prepper/pipelines/configuration/processors/list-to-map.md
new file mode 100644
index 00000000..53f10f2b
--- /dev/null
+++ b/_data-prepper/pipelines/configuration/processors/list-to-map.md
@@ -0,0 +1,305 @@
+---
+layout: default
+title: List to map processor
+parent: Processors
+grand_parent: Pipelines
+nav_order: 55
+---
+
+# list_to_map
+
+The `list_to_map` processor converts a list of objects from an event, where each object contains a `key` field, into a map of target keys.
+
+## Configuration
+
+The following table describes the configuration options used to generate target keys for the mappings.
+
+Option | Required | Type | Description
+:--- | :--- | :--- | :---
+`key` | Yes | String | The key of the fields to be extracted as keys in the generated mappings.
+`source` | Yes | String | The list of objects with `key` fields to be converted into keys for the generated map.
+`target` | No | String | The target for the generated map. When not specified, the generated map will be placed in the root node.
+`value_key` | No | String | When specified, the value of the given `value_key` in each object of the source list is extracted and used as the value in the generated map. When not specified, each object in the source list is kept whole as the value for its key.
+`flatten` | No | Boolean | When `true`, values in the generated map output flatten into single items based on the `flattened_element`. Otherwise, objects mapped to values from the generated map appear as lists.
+`flattened_element` | Conditionally | String | The element to keep, either `first` or `last`, when `flatten` is set to `true`.
+
+## Usage
+
+The following example shows how to test the usage of the `list_to_map` processor before using the processor on your own source.
+
+Create a source file named `logs_json.log`. Because the `file` source reads each line in the `.log` file as an event, the object list appears as one line even though it contains multiple objects:
+
+```json
+{"mylist":[{"name":"a","value":"val-a"},{"name":"b","value":"val-b1"},{"name":"b", "value":"val-b2"},{"name":"c","value":"val-c"}]}
+```
+{% include copy.html %}
+
+Next, create a `pipeline.yaml` file that uses the `logs_json.log` file as the `source` by pointing to the `.log` file's correct path:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - list_to_map:
+        key: "name"
+        source: "mylist"
+        value_key: "value"
+        flatten: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+Run the pipeline. If successful, the processor returns the generated map with objects mapped according to their `value_key`. Similar to the original source, which contains one line and therefore one event, the processor returns the following JSON as one line. For readability, the following example and all subsequent JSON examples have been adjusted to span multiple lines:
+
+```json
+{
+  "mylist": [
+    {
+      "name": "a",
+      "value": "val-a"
+    },
+    {
+      "name": "b",
+      "value": "val-b1"
+    },
+    {
+      "name": "b",
+      "value": "val-b2"
+    },
+    {
+      "name": "c",
+      "value": "val-c"
+    }
+  ],
+  "a": "val-a",
+  "b": "val-b1",
+  "c": "val-c"
+}
+```
+
+### Example: Maps set to `target`
+
+The following example `pipeline.yaml` file shows the `list_to_map` processor when set to a specified target, `mymap`:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - list_to_map:
+        key: "name"
+        source: "mylist"
+        target: "mymap"
+        value_key: "value"
+        flatten: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+The generated map appears under the target key:
+
+```json
+{
+  "mylist": [
+    {
+      "name": "a",
+      "value": "val-a"
+    },
+    {
+      "name": "b",
+      "value": "val-b1"
+    },
+    {
+      "name": "b",
+      "value": "val-b2"
+    },
+    {
+      "name": "c",
+      "value": "val-c"
+    }
+  ],
+  "mymap": {
+    "a": "val-a",
+    "b": "val-b1",
+    "c": "val-c"
+  }
+}
+```
+
+### Example: No `value_key` specified
+
+The following example `pipeline.yaml` file shows the `list_to_map` processor with no `value_key` specified. Because `key` is set to `name`, the processor extracts the object names to use as keys in the map.
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - list_to_map:
+        key: "name"
+        source: "mylist"
+        flatten: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+The values from the generated map appear as original objects from the `.log` source, as shown in the following example response:
+
+```json
+{
+  "mylist": [
+    {
+      "name": "a",
+      "value": "val-a"
+    },
+    {
+      "name": "b",
+      "value": "val-b1"
+    },
+    {
+      "name": "b",
+      "value": "val-b2"
+    },
+    {
+      "name": "c",
+      "value": "val-c"
+    }
+  ],
+  "a": {
+    "name": "a",
+    "value": "val-a"
+  },
+  "b": {
+    "name": "b",
+    "value": "val-b1"
+  },
+  "c": {
+    "name": "c",
+    "value": "val-c"
+  }
+}
+```
+
+### Example: `flattened_element` set to `last`
+
+The following example `pipeline.yaml` file sets the `flattened_element` to `last`, therefore flattening the processor output based on each value's last element:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - list_to_map:
+        key: "name"
+        source: "mylist"
+        target: "mymap"
+        value_key: "value"
+        flatten: true
+        flattened_element: "last"
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+The processor maps object `b` to value `val-b2` because `val-b2` is the last element in object `b`, as shown in the following output:
+
+```json
+{
+  "mylist": [
+    {
+      "name": "a",
+      "value": "val-a"
+    },
+    {
+      "name": "b",
+      "value": "val-b1"
+    },
+    {
+      "name": "b",
+      "value": "val-b2"
+    },
+    {
+      "name": "c",
+      "value": "val-c"
+    }
+  ],
+  "a": "val-a",
+  "b": "val-b2",
+  "c": "val-c"
+}
+```
+
+
+### Example: `flatten` set to `false`
+
+The following example `pipeline.yaml` file sets `flatten` to `false`, causing the processor to output values from the generated map as a list:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - list_to_map:
+        key: "name"
+        source: "mylist"
+        target: "mymap"
+        value_key: "value"
+        flatten: false
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+Some objects in the response may have more than one element in their values, as shown in the following response:
+
+```json
+{
+  "mylist": [
+    {
+      "name": "a",
+      "value": "val-a"
+    },
+    {
+      "name": "b",
+      "value": "val-b1"
+    },
+    {
+      "name": "b",
+      "value": "val-b2"
+    },
+    {
+      "name": "c",
+      "value": "val-c"
+    }
+  ],
+  "a": [
+    "val-a"
+  ],
+  "b": [
+    "val-b1",
+    "val-b2"
+  ],
+  "c": [
+    "val-c"
+  ]
+}
+```
\ No newline at end of file
diff --git a/_data-prepper/pipelines/configuration/processors/lowercase-string.md b/_data-prepper/pipelines/configuration/processors/lowercase-string.md
index c5a76db2..e72ab8f0 100644
--- a/_data-prepper/pipelines/configuration/processors/lowercase-string.md
+++ b/_data-prepper/pipelines/configuration/processors/lowercase-string.md
@@ -1,14 +1,13 @@
 ---
 layout: default
-title: lowercase_string
+title: Lowercase string processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 60
 ---
 
 # lowercase_string
 
-## Overview
 The `lowercase_string` processor converts a string to its lowercase counterpart and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes options for configuring the `lowercase_string` processor to convert strings to a lowercase format.
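+
+A minimal sketch of the processor in a pipeline follows; the `with_keys` option name is an assumption based on the mutate string plugin rather than an option shown in this excerpt:
+
+```yaml
+lowercase-pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - lowercase_string:
+        with_keys: ["message"]
+  sink:
+    - stdout:
+```
+{% include copy.html %}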
diff --git a/_data-prepper/pipelines/configuration/processors/mutate-event.md b/_data-prepper/pipelines/configuration/processors/mutate-event.md
index dffb64cb..032bc89f 100644
--- a/_data-prepper/pipelines/configuration/processors/mutate-event.md
+++ b/_data-prepper/pipelines/configuration/processors/mutate-event.md
@@ -3,311 +3,19 @@ layout: default
 title: Mutate event
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 65
 ---
 
 # Mutate event processors
 
 Mutate event processors allow you to modify events in Data Prepper. The following processors are available:
 
-* [AddEntries](#addentries) allows you to add entries to an event.
-* [CopyValues](#copyvalues) allows you to copy values within an event.
-* [DeleteEntry](#deleteentry) allows you to delete entries from an event.
-* [RenameKey](#renamekey) allows you to rename keys in an event.
-* [ConvertEntry](#convertentry) allows you to convert value types in an event.
+* [add_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/add-entries/) allows you to add entries to an event.
+* [copy_values]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/copy-values/) allows you to copy values within an event.
+* [delete_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/delete-entries/) allows you to delete entries from an event.
+* [rename_keys]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/rename-keys/) allows you to rename keys in an event.
+* [convert_entry_type]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/convert_entry_type/) allows you to convert value types in an event.
+* [list_to_map]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/list-to-map/) allows you to convert a list of objects from an event, where each object contains a `key` field, into a map of target keys.
 
-## AddEntries
-
-The `AddEntries` processor adds entries to an event.
-
-### Configuration
-
-You can configure the `AddEntries` processor with the following options.
-
-| Option | Required | Description |
-| :--- | :--- | :--- |
-| `entries` | Yes | A list of entries to add to an event. |
-| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
-| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. |
-| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |
-
-### Usage
-
-To get started, create the following `pipeline.yaml` file:
-
-```yaml
-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - add_entries:
-        entries:
-        - key: "newMessage"
-          value: 3
-          overwrite_if_key_exists: true
-  sink:
-    - stdout:
-```
-{% include copy.html %}
-
-
-Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
-
-For example, before you run the `AddEntries` processor, if the `logs_json.log` file contains the following event record:
-
-```json
-{"message": "hello"}
-```
-
-Then when you run the `AddEntries` processor using the previous configuration, it adds a new entry `{"newMessage": 3}` to the existing event `{"message": "hello"}` so that the new event contains two entries in the final output:
-
-```json
-{"message": "hello", "newMessage": 3}
-```
-
-> If `newMessage` already exists, its existing value is overwritten with a value of `3`.
-
-
-## CopyValues
-
-The `CopyValues` processor copies the values of an existing key within an event to another key.
-
-### Configuration
-
-You can configure the `CopyValues` processor with the following options.
-
-| Option | Required | Description |
-:--- | :--- | :---
-| `entries` | Yes | A list of entries to be copied in an event. |
-| `from_key` | Yes | The key of the entry to be copied. |
-| `to_key` | Yes | The key of the new entry to be added. |
-| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |
-
-### Usage
-
-To get started, create the following `pipeline.yaml` file:
-
-```yaml
-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - copy_values:
-        entries:
-        - from_key: "message"
-          to_key: "newMessage"
-          overwrite_if_to_key_exists: true
-  sink:
-    - stdout:
-```
-{% include copy.html %}
-
-Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
-
-For example, before you run the `CopyValues` processor, if the `logs_json.log` file contains the following event record:
-
-```json
-{"message": "hello"}
-```
-
-When you run this processor, it parses the message into the following output:
-
-```json
-{"message": "hello", "newMessage": "hello"}
-```
-
-> If `newMessage` already exists, its existing value is overwritten with `value`.
-
-
-## DeleteEntry
-
-The `DeleteEntry` processor deletes entries, such as key-value pairs, from an event. You can define the keys you want to delete in the `with-keys` field following `delete_entries` in the YAML configuration file. Those keys and their values are deleted.
-
-### Configuration
-
-You can configure the `DeleteEntry` processor with the following options.
-
-| Option | Required | Description |
-:--- | :--- | :---
-| `with_keys` | Yes | An array of keys for the entries to be deleted. |
-
-### Usage
-
-To get started, create the following `pipeline.yaml` file:
-
-```yaml
-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - delete_entries:
-        with_keys: ["message"]
-  sink:
-    - stdout:
-```
-{% include copy.html %}
-
-Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
-
-For example, before you run the `DeleteEntry` processor, if the `logs_json.log` file contains the following event record:
-
-```json
-{"message": "hello", "message2": "goodbye"}
-```
-
-When you run the `DeleteEntry` processor, it parses the message into the following output:
-
-```json
-{"message2": "goodbye"}
-```
-
-> If `message` does not exist in the event, then no action occurs.
-
-
-## RenameKey
-
-The `RenameKey` processor renames keys in an event.
-
-### Configuration
-
-You can configure the `RenameKey` processor with the following options.
-
-Option | Required | Description |
-| :--- | :--- | :--- |
-| `entries` | Yes | A list of event entries to rename. |
-| `from_key` | Yes | The key of the entry to be renamed. |
-| `to_key` | Yes | The new key of the entry. |
-| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |
-
-### Usage
-
-To get started, create the following `pipeline.yaml` file:
-
-```yaml
-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - rename_keys:
-        entries:
-        - from_key: "message"
-          to_key: "newMessage"
-          overwrite_if_to_key_exists: true
-  sink:
-    - stdout:
-```
-{% include copy.html %}
-
-
-Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
-
-For example, before you run the `RenameKey` processor, if the `logs_json.log` file contains the following event record:
-
-```json
-{"message": "hello"}
-```
-
-When you run the `RenameKey` processor, it parses the message into the following "newMessage" output:
-
-```json
-{"newMessage": "hello"}
-```
-
-> If `newMessage` already exists, its existing value is overwritten with `value`.
-
-
-
-### Special considerations
-
-Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `RenameKey` processor. See the following example `pipline.yaml` file:
-
-```yaml
-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - rename_key:
-        entries:
-        - from_key: "message"
-          to_key: "message2"
-        - from_key: "message2"
-          to_key: "message3"
-  sink:
-    - stdout:
-```
-
-Add the following contents to the `logs_json.log` file:
-
-```json
-{"message": "hello"}
-```
-{% include copy.html %}
-
-After the `RenameKey` processor runs, the following output appears:
-
-```json
-{"message3": "hello"}
-```
-
-## ConvertEntry
-
-The `ConvertEntry` processor converts a value type associated with the specified key in a event to the specified type. It is a casting processor that changes the types of some fields in events. Some data must be converted to a different type, such as an integer to a double, or a string to an integer, so that it will pass the events through condition-based processors or perform conditional routing.
-
-### Configuration
-
-You can configure the `ConvertEntry` processor with the following options.
-
-| Option | Required | Description |
-| :--- | :--- | :--- |
-| `key`| Yes | Keys whose value needs to be converted to a different type. |
-| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |
-
-### Usage
-
-To get started, create the following `pipeline.yaml` file:
-
-```yaml
-type-conv-pipeline:
-  source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - convert_entry_type:
-        key: "response_status"
-        type: "integer"
-  sink:
-    - stdout:
-```
-{% include copy.html %}
-
-Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
-
-For example, before you run the `ConvertEntry` processor, if the `logs_json.log` file contains the following event record:
-
-
-```json
-{"message": "value", "response_status":"200"}
-```
-
-The `ConvertEntry` processor converts the output received to the following output, where the type of `response_status` value changes from a string to an integer:
-
-```json
-{"message":"value","response_status":200}
-```
\ No newline at end of file
diff --git a/_data-prepper/pipelines/configuration/processors/mutate-string.md b/_data-prepper/pipelines/configuration/processors/mutate-string.md
index 67526012..05351f1e 100644
--- a/_data-prepper/pipelines/configuration/processors/mutate-string.md
+++ b/_data-prepper/pipelines/configuration/processors/mutate-string.md
@@ -3,7 +3,7 @@ layout: default
 title: Mutate string
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 70
 ---
 
 # Mutate string processors
diff --git a/_data-prepper/pipelines/configuration/processors/otel-trace-raw.md b/_data-prepper/pipelines/configuration/processors/otel-trace-raw.md
index 1d1bb95b..35efadca 100644
--- a/_data-prepper/pipelines/configuration/processors/otel-trace-raw.md
+++ b/_data-prepper/pipelines/configuration/processors/otel-trace-raw.md
@@ -1,31 +1,33 @@
 ---
 layout: default
-title: otel_trace_raw
+title: OTel trace raw processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 75
 ---
 
 # otel_trace_raw
 
-## Overview
-The `otel_trace_raw` processor completes trace-group-related fields in all incoming Data Prepper span records by state caching the root span information for each `tradeId`. This processor includes the following parameters.
+The `otel_trace_raw` processor completes trace-group-related fields in all incoming Data Prepper span records by state caching the root span information for each `traceId`.
+
+## Parameters
+
+This processor includes the following parameters.
 
 * `traceGroup`: Root span name
 * `endTime`: End time of the entire trace in International Organization for Standardization (ISO) 8601 format
 * `durationInNanos`: Duration of the entire trace in nanoseconds
 * `statusCode`: Status code for the entire trace in nanoseconds
 
+## Configuration
+
 The following table describes the options you can use to configure the `otel_trace_raw` processor.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
 trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180.
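+
+The following minimal sketch places the processor in a trace pipeline. Only `trace_flush_interval` comes from the table above; the `otel_trace_source` source is an assumption for illustration:
+
+```yaml
+trace-pipeline:
+  source:
+    otel_trace_source:
+  processor:
+    - otel_trace_raw:
+        trace_flush_interval: 180
+  sink:
+    - stdout:
+```
+{% include copy.html %}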
-
 
 ## Metrics
diff --git a/_data-prepper/pipelines/configuration/processors/parse-json.md b/_data-prepper/pipelines/configuration/processors/parse-json.md
index 306327a8..8711d72f 100644
--- a/_data-prepper/pipelines/configuration/processors/parse-json.md
+++ b/_data-prepper/pipelines/configuration/processors/parse-json.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: parse_json
+title: Parse JSON processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 80
 ---
 
 # parse_json
diff --git a/_data-prepper/pipelines/configuration/processors/rename-keys.md b/_data-prepper/pipelines/configuration/processors/rename-keys.md
index 34d3a80a..ded4df89 100644
--- a/_data-prepper/pipelines/configuration/processors/rename-keys.md
+++ b/_data-prepper/pipelines/configuration/processors/rename-keys.md
@@ -1,28 +1,98 @@
 ---
 layout: default
-title: rename_keys
+title: Rename keys processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 44
+nav_order: 85
 ---
 
 # rename_keys
 
-## Overview
+The `rename_keys` processor renames keys in an event.
 
-The `rename_keys` processor renames keys in an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor. The following table describes the options you can use to configure the `rename_keys` processor.
+## Configuration
 
-Option | Required | Type | Description
-:--- | :--- | :--- | :---
-entries | Yes | List | List of entries. Valid values are `from_key`, `to_key`, and `overwrite_if_key_exists`. Renaming occurs in the order defined.
-from_key | N/A | N/A | The key of the entry to be renamed.
-to_key | N/A | N/A | The new key of the entry.
-overwrite_if_to_key_exists | No | Boolean | If true, the existing value gets overwritten if `to_key` already exists in the event.
+You can configure the `rename_keys` processor with the following options.
-
\ No newline at end of file
+
+| Option | Required | Description |
+| :--- | :--- | :--- |
+| `entries` | Yes | A list of event entries to rename. |
+| `from_key` | Yes | The key of the entry to be renamed. |
+| `to_key` | Yes | The new key of the entry. |
+| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |
+
+## Usage
+
+To get started, create the following `pipeline.yaml` file:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - rename_keys:
+        entries:
+        - from_key: "message"
+          to_key: "newMessage"
+          overwrite_if_to_key_exists: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}
+
+
+Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+
+For example, before you run the `rename_keys` processor, if the `logs_json.log` file contains the following event record:
+
+```json
+{"message": "hello"}
+```
+
+When you run the `rename_keys` processor, it parses the message into the following "newMessage" output:
+
+```json
+{"newMessage": "hello"}
+```
+
+> If `newMessage` already exists, its existing value is overwritten with the value of `message`.
+
+
+
+## Special considerations
+
+Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipeline.yaml` file:
+
+```yaml
+pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - rename_keys:
+        entries:
+        - from_key: "message"
+          to_key: "message2"
+        - from_key: "message2"
+          to_key: "message3"
+  sink:
+    - stdout:
+```
+
+Add the following contents to the `logs_json.log` file:
+
+```json
+{"message": "hello"}
+```
+{% include copy.html %}
+
+After the `rename_keys` processor runs, the following output appears:
+
+```json
+{"message3": "hello"}
+```
\ No newline at end of file
diff --git a/_data-prepper/pipelines/configuration/processors/routes.md b/_data-prepper/pipelines/configuration/processors/routes.md
index 0e3c798d..06fe79fc 100644
--- a/_data-prepper/pipelines/configuration/processors/routes.md
+++ b/_data-prepper/pipelines/configuration/processors/routes.md
@@ -1,15 +1,13 @@
 ---
 layout: default
-title: routes
+title: Routes
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 90
 ---
 
 # Routes
 
-## Overview
-
 Routes define conditions that can be used in sinks for conditional routing. Routes are specified at the same level as processors and sinks under the name `route` and consist of a list of key-value pairs, where the key is the name of a route and the value is a Data Prepper expression representing the routing condition.
\ No newline at end of file
diff --git a/_data-prepper/pipelines/configuration/processors/split-string.md b/_data-prepper/pipelines/configuration/processors/split-string.md
index fac8657d..2139181a 100644
--- a/_data-prepper/pipelines/configuration/processors/split-string.md
+++ b/_data-prepper/pipelines/configuration/processors/split-string.md
@@ -1,14 +1,13 @@
 ---
 layout: default
-title: split_string
+title: Split string processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 100
 ---
 
 # split_string
 
-## Overview
 The `split_string` processor splits a field into an array using a delimiting character and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the options you can use to configure the `split_string` processor.
diff --git a/_data-prepper/pipelines/configuration/processors/string-converter.md b/_data-prepper/pipelines/configuration/processors/string-converter.md
index 8ab0cccc..ae71f956 100644
--- a/_data-prepper/pipelines/configuration/processors/string-converter.md
+++ b/_data-prepper/pipelines/configuration/processors/string-converter.md
@@ -1,14 +1,13 @@
 ---
 layout: default
-title: string_converter
+title: String converter processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 105
 ---
 
 # string_converter
 
-## Overview
 The `string_converter` processor converts a string to uppercase or lowercase. You can use it as an example for developing your own processor. The following table describes the option you can use to configure the `string_converter` processor.
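+
+A minimal sketch of the processor in a pipeline follows; the `upper_case` option name is an assumption based on the string converter plugin rather than an option shown in this excerpt:
+
+```yaml
+string-conv-pipeline:
+  source:
+    file:
+      path: "/full/path/to/logs_json.log"
+      record_type: "event"
+      format: "json"
+  processor:
+    - string_converter:
+        upper_case: true
+  sink:
+    - stdout:
+```
+{% include copy.html %}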
diff --git a/_data-prepper/pipelines/configuration/processors/substitute-string.md b/_data-prepper/pipelines/configuration/processors/substitute-string.md
index daa6b7b2..a48e98be 100644
--- a/_data-prepper/pipelines/configuration/processors/substitute-string.md
+++ b/_data-prepper/pipelines/configuration/processors/substitute-string.md
@@ -1,16 +1,18 @@
 ---
 layout: default
-title: substitute_string
+title: Substitute string processors
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 110
 ---
 
 # substitute_string
 
-## Overview
+The `substitute_string` processor matches a key's value against a regular expression and replaces all matches with a replacement string. `substitute_string` is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor.
 
-The `substitute_string` processor matches a key's value against a regular expression and replaces all matches with a replacement string. `substitute_string` is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the options you can use to configure the `substitue_string` processor.
+## Configuration
+
+The following table describes the options you can use to configure the `substitute_string` processor.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
diff --git a/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md b/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md
index 33cf1319..4214bc34 100644
--- a/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md
+++ b/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: trace peer forwarder
+title: Trace peer forwarder processors
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 115
 ---
 
 # trace peer forwarder
diff --git a/_data-prepper/pipelines/configuration/processors/trim-string.md b/_data-prepper/pipelines/configuration/processors/trim-string.md
index 2167590e..bb1defc6 100644
--- a/_data-prepper/pipelines/configuration/processors/trim-string.md
+++ b/_data-prepper/pipelines/configuration/processors/trim-string.md
@@ -1,15 +1,13 @@
 ---
 layout: default
-title: trim_string
+title: Trim string processors
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 120
 ---
 
 # trim_string
 
-## Overview
-
 The `trim_string` processor removes whitespace from the beginning and end of a key and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the option you can use to configure the `trim_string` processor.
 Option | Required | Type | Description
diff --git a/_data-prepper/pipelines/configuration/processors/uppercase-string.md b/_data-prepper/pipelines/configuration/processors/uppercase-string.md
index a8236834..57853ba3 100644
--- a/_data-prepper/pipelines/configuration/processors/uppercase-string.md
+++ b/_data-prepper/pipelines/configuration/processors/uppercase-string.md
@@ -1,15 +1,13 @@
 ---
 layout: default
-title: uppercase_string
+title: Uppercase string processor
 parent: Processors
 grand_parent: Pipelines
-nav_order: 45
+nav_order: 125
 ---
 
 # uppercase_string
 
-## Overview
-
 The `uppercase_string` processor converts an entire string to uppercase and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the option you can use to configure the `uppercase_string` processor.
 
 Option | Required | Type | Description