Add list to map processor (#3806)

* Add list to map processor.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Tweak one last file

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix typo

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update mutate-event.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Add Chris' feedback

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* A couple more wording tweaks

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Naarcha-AWS 2023-04-20 13:48:36 -05:00, committed by GitHub
parent 586cb3d2ff, commit e1a1f44dd6
26 changed files with 657 additions and 428 deletions

@@ -1,25 +1,62 @@
---
layout: default
title: Add entries processor
parent: Processors
grand_parent: Pipelines
nav_order: 40
---
# add_entries
The `add_entries` processor adds entries to an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor.
### Configuration
You can configure the `add_entries` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `entries` | Yes | A list of entries to add to an event. |
| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. |
| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |
### Usage
To get started, create the following `pipeline.yaml` file:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - add_entries:
        entries:
          - key: "newMessage"
            value: 3
            overwrite_if_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}
Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
For example, before you run the `add_entries` processor, if the `logs_json.log` file contains the following event record:
```json
{"message": "hello"}
```
Then when you run the `add_entries` processor using the previous configuration, it adds a new entry `{"newMessage": 3}` to the existing event `{"message": "hello"}` so that the new event contains two entries in the final output:
```json
{"message": "hello", "newMessage": 3}
```
> If `newMessage` already exists, its existing value is overwritten with a value of `3`.

@@ -1,16 +1,19 @@
---
layout: default
title: Aggregate processor
parent: Processors
grand_parent: Pipelines
nav_order: 41
---
# aggregate
The `aggregate` processor groups events based on the keys provided and performs an action on each group.
## Configuration
The following table describes the options you can use to configure the `aggregate` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
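For illustration, a minimal `pipeline.yaml` sketch using the `aggregate` processor appears below. The `identification_keys`, `group_duration`, and `action` option names and the `count` action are assumptions based on the Data Prepper aggregate plugin rather than options taken from the table above:
```yaml
aggregate-pipeline:
  source:
    http:
  processor:
    - aggregate:
        # Events sharing the same values for these keys form one group (assumed option)
        identification_keys: ["sourceIp", "destinationIp"]
        # How long to collect events in a group before concluding it (assumed option)
        group_duration: "30s"
        # Emit one event per group containing a count of its members (assumed action)
        action:
          count:
  sink:
    - stdout:
```
{% include copy.html %}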

@@ -0,0 +1,55 @@
---
layout: default
title: Convert entry type processor
parent: Processors
grand_parent: Pipelines
nav_order: 47
---
# convert_entry_type
The `convert_entry_type` processor converts the type of the value associated with the specified key in an event to the specified type. It is a casting processor that changes the types of certain fields in events. Some data must be converted to a different type, such as an integer to a double or a string to an integer, so that the events can pass through condition-based processors or be routed conditionally.
## Configuration
You can configure the `convert_entry_type` processor with the following options.
| Option | Required | Description |
| :--- | :--- | :--- |
| `key`| Yes | The key whose value needs to be converted to a different type. |
| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |
## Usage
To get started, create the following `pipeline.yaml` file:
```yaml
type-conv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - convert_entry_type:
        key: "response_status"
        type: "integer"
  sink:
    - stdout:
```
{% include copy.html %}
Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
For example, before you run the `convert_entry_type` processor, if the `logs_json.log` file contains the following event record:
```json
{"message": "value", "response_status":"200"}
```
When you run the `convert_entry_type` processor, it produces the following output, in which the value of `response_status` has been converted from a string to an integer:
```json
{"message":"value","response_status":200}
```

@@ -1,24 +1,60 @@
---
layout: default
title: Copy values processor
parent: Processors
grand_parent: Pipelines
nav_order: 48
---
# copy_values
The `copy_values` processor copies values within an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor.
## Configuration
You can configure the `copy_values` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `entries` | Yes | A list of entries to be copied in an event. |
| `from_key` | Yes | The key of the entry to be copied. |
| `to_key` | Yes | The key of the new entry to be added. |
| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |
## Usage
To get started, create the following `pipeline.yaml` file:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - copy_values:
        entries:
          - from_key: "message"
            to_key: "newMessage"
            overwrite_if_to_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}
Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
For example, before you run the `copy_values` processor, if the `logs_json.log` file contains the following event record:
```json
{"message": "hello"}
```
When you run this processor, it parses the message into the following output:
```json
{"message": "hello", "newMessage": "hello"}
```
> If `newMessage` already exists, its existing value is overwritten with the value of `message`.

@@ -1,16 +1,18 @@
---
layout: default
title: CSV processor
parent: Processors
grand_parent: Pipelines
nav_order: 49
---
# csv
The `csv` processor parses comma-separated values (CSVs) from the event into columns.
## Configuration
The following table describes the options you can use to configure the `csv` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
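As a hedged sketch, a pipeline using the `csv` processor might look like the following; the `source` and `column_names` option names are assumptions based on the Data Prepper CSV plugin and are not taken from the table above:
```yaml
csv-pipeline:
  source:
    file:
      path: "/full/path/to/ingest.csv"
      record_type: "event"
  processor:
    - csv:
        # The field containing the raw CSV text (assumed option)
        source: "message"
        # Names assigned to the parsed columns, in order (assumed option)
        column_names: ["ip", "user", "status"]
  sink:
    - stdout:
```
{% include copy.html %}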

@@ -1,16 +1,19 @@
---
layout: default
title: Date
parent: Processors
grand_parent: Pipelines
nav_order: 50
---
# date
The `date` processor adds a default timestamp to an event, parses timestamp fields, and converts timestamp information to the International Organization for Standardization (ISO) 8601 format. This timestamp information can be used as an event timestamp.
## Configuration
The following table describes the options you can use to configure the `date` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
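As a hedged sketch, a pipeline using the `date` processor might look like the following; the `match` and `destination` option names and the pattern syntax are assumptions based on the Data Prepper date plugin and are not taken from the table above:
```yaml
date-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - date:
        # Parse the "timestamp" field using the given pattern (assumed options)
        match:
          - key: "timestamp"
            patterns: ["dd/MMM/yyyy:HH:mm:ss"]
        # Write the resulting ISO 8601 timestamp to this field (assumed option)
        destination: "@timestamp"
  sink:
    - stdout:
```
{% include copy.html %}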

@@ -3,19 +3,52 @@ layout: default
title: delete_entries
parent: Processors
grand_parent: Pipelines
nav_order: 51
---
# delete_entries
The `delete_entries` processor deletes entries, such as key-value pairs, from an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor.
## Configuration
You can configure the `delete_entries` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `with_keys` | Yes | An array of keys for the entries to be deleted. |
## Usage
To get started, create the following `pipeline.yaml` file:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - delete_entries:
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}
Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
For example, before you run the `delete_entries` processor, if the `logs_json.log` file contains the following event record:
```json
{"message": "hello", "message2": "goodbye"}
```
When you run the `delete_entries` processor, it parses the message into the following output:
```json
{"message2": "goodbye"}
```
> If `message` does not exist in the event, then no action occurs.

@@ -1,14 +1,13 @@
---
layout: default
title: Drop events processor
parent: Processors
grand_parent: Pipelines
nav_order: 52
---
# drop_events
The `drop_events` processor drops all the events that are passed into it. The following table describes when events are dropped and how exceptions for dropping events are handled.
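As a hedged sketch, a pipeline that drops events conditionally might look like the following; the `drop_when` option name and the expression syntax are assumptions based on the Data Prepper drop-events plugin and are not shown on this page:
```yaml
drop-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - drop_events:
        # Drop any event whose "status" field equals 404 (assumed option and syntax)
        drop_when: "/status == 404"
  sink:
    - stdout:
```
{% include copy.html %}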

@@ -3,14 +3,17 @@ layout: default
title: grok
parent: Processors
grand_parent: Pipelines
nav_order: 53
---
# grok
The `Grok` processor takes unstructured data and utilizes pattern matching to structure and extract important keys.
## Configuration
The following table describes options you can use with the `Grok` processor to structure your data and make your data easier to query.
Option | Required | Type | Description
:--- | :--- | :--- | :---
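As a hedged sketch, a pipeline using the `Grok` processor might look like the following; the `match` option name and the pattern syntax are assumptions based on the Data Prepper grok plugin and are not taken from the table above:
```yaml
grok-pipeline:
  source:
    file:
      path: "/full/path/to/apache.log"
      record_type: "event"
  processor:
    - grok:
        # Apply the built-in Apache common log pattern to the "message" field (assumed option)
        match:
          message: ["%{COMMONAPACHELOG}"]
  sink:
    - stdout:
```
{% include copy.html %}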

@@ -1,14 +1,13 @@
---
layout: default
title: Key value processor
parent: Processors
grand_parent: Pipelines
nav_order: 54
---
# key_value
The `key_value` processor parses a field into key/value pairs. The following table describes the `key_value` processor options that you can use to parse field information into pairs.
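As a hedged sketch, a pipeline using the `key_value` processor might look like the following; the `source`, `field_split_characters`, and `value_split_characters` option names are assumptions based on the Data Prepper key-value plugin:
```yaml
kv-pipeline:
  source:
    file:
      path: "/full/path/to/query_strings.log"
      record_type: "event"
  processor:
    - key_value:
        # Parse strings such as "key1=value1&key2=value2" (assumed options)
        source: "message"
        field_split_characters: "&"
        value_split_characters: "="
  sink:
    - stdout:
```
{% include copy.html %}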

@@ -0,0 +1,305 @@
---
layout: default
title: List to map processor
parent: Processors
grand_parent: Pipelines
nav_order: 55
---
# list_to_map
The `list_to_map` processor converts a list of objects from an event, where each object contains a `key` field, into a map of target keys.
## Configuration
The following table describes the configuration options used to generate target keys for the mappings.
Option | Required | Type | Description
:--- | :--- | :--- | :---
`key` | Yes | String | The key of the fields to be extracted as keys in the generated mappings.
`source` | Yes | String | The list of objects with `key` fields to be converted into keys for the generated map.
`target` | No | String | The target for the generated map. When not specified, the generated map will be placed in the root node.
`value_key` | No | String | When specified, the values of the given `value_key` in the objects contained in the source list are extracted and used as the values in the generated map. When not specified, the objects contained in the source list are kept whole as the mapped values.
`flatten` | No | Boolean | When `true`, values in the generated map output flatten into single items based on the `flattened_element`. Otherwise, objects mapped to values from the generated map appear as lists.
`flattened_element` | Conditionally | String | The element to keep, either `first` or `last`, when `flatten` is set to `true`.
## Usage
The following example shows how to test the usage of the `list_to_map` processor before using the processor on your own source.
Create a source file named `logs_json.log`. Because the `file` source reads each line in the `.log` file as an event, the object list appears as one line even though it contains multiple objects:
```json
{"mylist":[{"name":"a","value":"val-a"},{"name":"b","value":"val-b1"},{"name":"b", "value":"val-b2"},{"name":"c","value":"val-c"}]}
```
{% include copy.html %}
Next, create a `pipeline.yaml` file that uses the `logs_json.log` file as the `source` by pointing to the `.log` file's correct path:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - list_to_map:
        key: "name"
        source: "mylist"
        value_key: "value"
        flatten: true
  sink:
    - stdout:
```
{% include copy.html %}
Run the pipeline. If successful, the processor returns the generated map with objects mapped according to their `value_key`. Similar to the original source, which contains one line and therefore one event, the processor returns the following JSON as one line. For readability, the following example and all subsequent JSON examples have been adjusted to span multiple lines:
```json
{
  "mylist": [
    {
      "name": "a",
      "value": "val-a"
    },
    {
      "name": "b",
      "value": "val-b1"
    },
    {
      "name": "b",
      "value": "val-b2"
    },
    {
      "name": "c",
      "value": "val-c"
    }
  ],
  "a": "val-a",
  "b": "val-b1",
  "c": "val-c"
}
```
### Example: Maps set to `target`
The following example `pipeline.yaml` file shows the `list_to_map` processor when set to a specified target, `mymap`:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - list_to_map:
        key: "name"
        source: "mylist"
        target: "mymap"
        value_key: "value"
        flatten: true
  sink:
    - stdout:
```
{% include copy.html %}
The generated map appears under the target key:
```json
{
  "mylist": [
    {
      "name": "a",
      "value": "val-a"
    },
    {
      "name": "b",
      "value": "val-b1"
    },
    {
      "name": "b",
      "value": "val-b2"
    },
    {
      "name": "c",
      "value": "val-c"
    }
  ],
  "mymap": {
    "a": "val-a",
    "b": "val-b1",
    "c": "val-c"
  }
}
```
### Example: No `value_key` specified
The following example `pipeline.yaml` file shows the `list_to_map` processor with no `value_key` specified. Because `key` is set to `name`, the processor extracts the object names to use as keys in the map.
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - list_to_map:
        key: "name"
        source: "mylist"
        flatten: true
  sink:
    - stdout:
```
{% include copy.html %}
The values in the generated map are the original objects from the `.log` source, as shown in the following example response:
```json
{
  "mylist": [
    {
      "name": "a",
      "value": "val-a"
    },
    {
      "name": "b",
      "value": "val-b1"
    },
    {
      "name": "b",
      "value": "val-b2"
    },
    {
      "name": "c",
      "value": "val-c"
    }
  ],
  "a": {
    "name": "a",
    "value": "val-a"
  },
  "b": {
    "name": "b",
    "value": "val-b1"
  },
  "c": {
    "name": "c",
    "value": "val-c"
  }
}
```
### Example: `flattened_element` set to `last`
The following example `pipeline.yaml` file sets `flattened_element` to `last`, therefore flattening the processor output based on each value's last element:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - list_to_map:
        key: "name"
        source: "mylist"
        target: "mymap"
        value_key: "value"
        flatten: true
        flattened_element: "last"
  sink:
    - stdout:
```
{% include copy.html %}
The processor maps key `b` to `val-b2` because `val-b2` is the last value for `b`, as shown in the following output:
```json
{
  "mylist": [
    {
      "name": "a",
      "value": "val-a"
    },
    {
      "name": "b",
      "value": "val-b1"
    },
    {
      "name": "b",
      "value": "val-b2"
    },
    {
      "name": "c",
      "value": "val-c"
    }
  ],
  "a": "val-a",
  "b": "val-b2",
  "c": "val-c"
}
```
### Example: `flatten` set to `false`
The following example `pipeline.yaml` file sets `flatten` to `false`, causing the processor to output values from the generated map as a list:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - list_to_map:
        key: "name"
        source: "mylist"
        target: "mymap"
        value_key: "value"
        flatten: false
  sink:
    - stdout:
```
{% include copy.html %}
Because the output is not flattened, some keys map to lists that contain more than one value, as shown in the following response:
```json
{
  "mylist": [
    {
      "name": "a",
      "value": "val-a"
    },
    {
      "name": "b",
      "value": "val-b1"
    },
    {
      "name": "b",
      "value": "val-b2"
    },
    {
      "name": "c",
      "value": "val-c"
    }
  ],
  "a": [
    "val-a"
  ],
  "b": [
    "val-b1",
    "val-b2"
  ],
  "c": [
    "val-c"
  ]
}
```

@@ -1,14 +1,13 @@
---
layout: default
title: Lowercase string processor
parent: Processors
grand_parent: Pipelines
nav_order: 60
---
# lowercase_string
The `lowercase_string` processor converts a string to its lowercase counterpart and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes options for configuring the `lowercase_string` processor to convert strings to a lowercase format.
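As a hedged sketch, a pipeline using the `lowercase_string` processor might look like the following; the `with_keys` option name is an assumption based on the Data Prepper mutate string plugins:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - lowercase_string:
        # Convert the value of each listed key to lowercase (assumed option)
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}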

@@ -3,311 +3,19 @@ layout: default
title: Mutate event
parent: Processors
grand_parent: Pipelines
nav_order: 65
---
# Mutate event processors
Mutate event processors allow you to modify events in Data Prepper. The following processors are available:
* [add_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/add-entries/) allows you to add entries to an event.
* [copy_values]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/copy-values/) allows you to copy values within an event.
* [delete_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/delete-entries/) allows you to delete entries from an event.
* [rename_keys]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/rename-keys/) allows you to rename keys in an event.
* [convert_entry_type]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/convert_entry_type/) allows you to convert value types in an event.
* [list_to_map]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/list-to-map) allows you to convert a list of objects from an event, where each object contains a `key` field, into a map of target keys.

@@ -3,7 +3,7 @@ layout: default
title: Mutate string
parent: Processors
grand_parent: Pipelines
nav_order: 70
---
# Mutate string processors

@@ -1,31 +1,33 @@
---
layout: default
title: OTel trace raw processor
parent: Processors
grand_parent: Pipelines
nav_order: 75
---
# otel_trace_raw
The `otel_trace_raw` processor completes trace-group-related fields in all incoming Data Prepper span records by state caching the root span information for each `traceId`.
## Parameters
This processor includes the following parameters.
* `traceGroup`: Root span name
* `endTime`: End time of the entire trace in International Organization for Standardization (ISO) 8601 format
* `durationInNanos`: Duration of the entire trace in nanoseconds
* `statusCode`: Status code for the entire trace
## Configuration
The following table describes the options you can use to configure the `otel_trace_raw` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180.
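As a hedged sketch, a trace pipeline using this processor and the `trace_flush_interval` option described above might look like the following; the `otel_trace_source` source is an assumption and may require additional configuration:
```yaml
trace-pipeline:
  source:
    otel_trace_source:
  processor:
    - otel_trace_raw:
        # Flush descendant spans without a root span every 180 seconds (documented above)
        trace_flush_interval: 180
  sink:
    - stdout:
```
{% include copy.html %}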
## Metrics

@@ -1,9 +1,9 @@
---
layout: default
title: Parse JSON processor
parent: Processors
grand_parent: Pipelines
nav_order: 80
---
# parse_json
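As a hedged sketch, a pipeline using the `parse_json` processor might look like the following; the `source` option name is an assumption based on the Data Prepper parse-json plugin:
```yaml
json-pipeline:
  source:
    stdin:
  processor:
    - parse_json:
        # Parse the JSON string held in the "message" field into event fields (assumed option)
        source: "message"
  sink:
    - stdout:
```
{% include copy.html %}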

@@ -1,28 +1,98 @@
---
layout: default
title: Rename keys processor
parent: Processors
grand_parent: Pipelines
nav_order: 85
---
# rename_keys
The `rename_keys` processor renames keys in an event and is a [mutate event](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-event-processors#mutate-event-processors) processor.
## Configuration
You can configure the `rename_keys` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `entries` | Yes | A list of event entries to rename. |
| `from_key` | Yes | The key of the entry to be renamed. |
| `to_key` | Yes | The new key of the entry. |
| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |

## Usage
To get started, create the following `pipeline.yaml` file:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - rename_keys:
        entries:
          - from_key: "message"
            to_key: "newMessage"
            overwrite_if_to_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}
Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
For example, before you run the `rename_keys` processor, if the `logs_json.log` file contains the following event record:
```json
{"message": "hello"}
```
When you run the `rename_keys` processor, it renames `message` to `newMessage` in the following output:
```json
{"newMessage": "hello"}
```
> If `newMessage` already exists, its existing value is overwritten with the value of `message`.
## Special considerations
Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipeline.yaml` file:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - rename_keys:
        entries:
          - from_key: "message"
            to_key: "message2"
          - from_key: "message2"
            to_key: "message3"
  sink:
    - stdout:
```
Add the following contents to the `logs_json.log` file:
```json
{"message": "hello"}
```
{% include copy.html %}
After the `rename_keys` processor runs, the following output appears:
```json
{"message3": "hello"}
```

@@ -1,15 +1,13 @@
---
layout: default
title: Routes
parent: Processors
grand_parent: Pipelines
nav_order: 90
---
# Routes
Routes define conditions that can be used in sinks for conditional routing. Routes are specified at the same level as processors and sinks under the name `route` and consist of a list of key-value pairs, where the key is the name of a route and the value is a Data Prepper expression representing the routing condition.
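As a hedged sketch, a pipeline using routes might look like the following; the expression syntax and the `routes` sink option are assumptions based on Data Prepper conditional routing and are not shown on this page:
```yaml
routing-pipeline:
  source:
    http:
  route:
    # Each route pairs a name with a Data Prepper expression (assumed syntax)
    - error_logs: '/severity == "ERROR"'
    - info_logs: '/severity == "INFO"'
  sink:
    - stdout:
        # This sink receives only events matching the named route (assumed option)
        routes: ["error_logs"]
```
{% include copy.html %}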

@@ -1,16 +1,18 @@
---
layout: default
title: Service map stateful processor
parent: Processors
grand_parent: Pipelines
nav_order: 95
---
# service_map_stateful
The `service_map_stateful` processor uses OpenTelemetry data to create a distributed service map for visualization in OpenSearch Dashboards.
## Configuration
The following table describes the option you can use to configure the `service_map_stateful` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
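As a hedged sketch, a trace pipeline using the `service_map_stateful` processor might look like the following; the `window_duration` option name and the `otel_trace_source` source are assumptions not taken from the table above:
```yaml
service-map-pipeline:
  source:
    otel_trace_source:
  processor:
    - service_map_stateful:
        # Window, in seconds, over which to evaluate service relationships (assumed option)
        window_duration: 180
  sink:
    - stdout:
```
{% include copy.html %}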

@@ -1,21 +0,0 @@
---
layout: default
title: service_map_stateful
parent: sinks
grand_parent: Pipelines
nav_order: 45
---
# Sinks
## Overview
Sinks define where Data Prepper writes your data to.
<!---## Configuration
Content will be added to this section.
## Metrics
Content will be added to this section.--->

@@ -1,14 +1,13 @@
---
layout: default
title: Split string processor
parent: Processors
grand_parent: Pipelines
nav_order: 100
---
# split_string
The `split_string` processor splits a field into an array using a delimiting character and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the options you can use to configure the `split_string` processor.
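As a hedged sketch, a pipeline using the `split_string` processor might look like the following; the `entries`, `source`, and `delimiter` option names are assumptions based on the Data Prepper mutate string plugins:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - split_string:
        entries:
          # Split the "message" value on commas into an array (assumed options)
          - source: "message"
            delimiter: ","
  sink:
    - stdout:
```
{% include copy.html %}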

@@ -1,14 +1,13 @@
---
layout: default
title: String converter processor
parent: Processors
grand_parent: Pipelines
nav_order: 105
---
# string_converter
The `string_converter` processor converts a string to uppercase or lowercase. You can use it as an example for developing your own processor. The following table describes the option you can use to configure the `string_converter` processor.
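As a hedged sketch, a pipeline using the `string_converter` processor might look like the following; the `upper_case` option name is an assumption based on the Data Prepper string-converter plugin:
```yaml
pipeline:
  source:
    stdin:
  processor:
    - string_converter:
        # Convert the string to uppercase; false would convert to lowercase (assumed option)
        upper_case: true
  sink:
    - stdout:
```
{% include copy.html %}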

@@ -1,16 +1,18 @@
---
layout: default
title: Substitute string processor
parent: Processors
grand_parent: Pipelines
nav_order: 110
---
# substitute_string
The `substitute_string` processor matches a key's value against a regular expression and replaces all matches with a replacement string. `substitute_string` is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor.
## Configuration
The following table describes the options you can use to configure the `substitute_string` processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
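As a hedged sketch, a pipeline using the `substitute_string` processor might look like the following; the `entries`, `source`, `from`, and `to` option names are assumptions based on the Data Prepper mutate string plugins and are not taken from the table above:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - substitute_string:
        entries:
          # Replace every ":" in the "message" value with "-" (assumed options)
          - source: "message"
            from: ":"
            to: "-"
  sink:
    - stdout:
```
{% include copy.html %}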

@@ -1,9 +1,9 @@
---
layout: default
title: Trace peer forwarder processor
parent: Processors
grand_parent: Pipelines
nav_order: 115
---
# trace peer forwarder
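As a hedged sketch, the processor typically appears in a trace pipeline such as the following; the `otel_trace_source` source and the assumption that `trace_peer_forwarder` takes no required options (peer forwarding being configured in the core Data Prepper configuration) are not taken from this page:
```yaml
trace-pipeline:
  source:
    otel_trace_source:
  processor:
    # Forward spans to the peer responsible for their trace (assumed usage)
    - trace_peer_forwarder:
  sink:
    - stdout:
```
{% include copy.html %}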

@@ -1,15 +1,13 @@
---
layout: default
title: Trim string processor
parent: Processors
grand_parent: Pipelines
nav_order: 120
---
# trim_string
The `trim_string` processor removes whitespace from the beginning and end of a key and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the option you can use to configure the `trim_string` processor.
Option | Required | Type | Description
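As a hedged sketch, a pipeline using the `trim_string` processor might look like the following; the `with_keys` option name is an assumption based on the Data Prepper mutate string plugins:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - trim_string:
        # Trim leading and trailing whitespace from each listed key's value (assumed option)
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}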

@@ -1,15 +1,13 @@
---
layout: default
title: Uppercase string processor
parent: Processors
grand_parent: Pipelines
nav_order: 125
---
# uppercase_string
The `uppercase_string` processor converts an entire string to uppercase and is a [mutate string](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/mutate-string-processors#mutate-string-processors) processor. The following table describes the option you can use to configure the `uppercase_string` processor.
Option | Required | Type | Description
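As a hedged sketch, a pipeline using the `uppercase_string` processor might look like the following; the `with_keys` option name is an assumption based on the Data Prepper mutate string plugins:
```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - uppercase_string:
        # Convert the value of each listed key to uppercase (assumed option)
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}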