Language tweaks

Signed-off-by: keithhc2 <keithhc2@users.noreply.github.com>
keithhc2 2022-03-21 18:59:52 -07:00
parent 29b0ced4d4
commit 67ab805204
1 changed files with 39 additions and 22 deletions


@ -75,7 +75,7 @@ Option | Required | Type | Description
:--- | :--- | :--- | :---
path | Yes | String | Path to the input file (e.g. `logs/my-log.log`).
format | No | String | Format of each line in the file. Valid options are `json` or `plain`. Default is `plain`.
record_type | No | String | The record type that will be stored. Valid options are `string` or `event`. Default is `string`. If you would like to use the file source for log analytics use cases like grok, set this option to `event`.
record_type | No | String | The record type to store. Valid options are `string` or `event`. Default is `string`. If you would like to use the file source for log analytics use cases like grok, set this option to `event`.
### pipeline
@ -109,7 +109,8 @@ batch_size | No | Integer | The maximum number of records the buffer drains afte
## Processors
Processors perform some action on your data: filter, transform, enrich, etc.
Prior to Data Prepper 1.3, Processors were named Preppers. Starting in Data Prepper 1.3, the term Prepper is deprecated in favor or Processor. Data Prepper will continue to support the term "Prepper" until 2.0 where it will be removed.
Prior to Data Prepper 1.3, Processors were named Preppers. Starting in Data Prepper 1.3, the term Prepper is deprecated in favor of Processor. Data Prepper will continue to support the term "Prepper" until 2.0, where it will be removed.
{: .note }
@ -175,12 +176,12 @@ Adds a default timestamp to the event or parses timestamp fields, and converts i
Option | Required | Type | Description
:--- | :--- | :--- | :---
match | Conditionally | List | List of `key` and `patterns` where patterns is a list. The list of match can have exactly one `key` and `patterns`. There is no default value. This cannot be defined at the same time as `from_time_received`. Include multiple date processors in your pipeline if both options should be used.
from_time_received | Conditionally | Boolean | A boolean that is used for adding default timestamp to event data from event metadata which is the time when source receives the event. Default value is `false`. This cannot be defined at the same time as `match`. Include multiple date processors in your pipeline if both options should be used.
match | Conditionally | List | List of `key` and `patterns`, where `patterns` is a list. The `match` list can have exactly one `key` and `patterns` pair. There is no default value. This option cannot be defined at the same time as `from_time_received`. Include multiple date processors in your pipeline if both options should be used.
from_time_received | Conditionally | Boolean | A boolean that adds a default timestamp to event data from event metadata, which is the time when the source receives the event. Default value is `false`. This option cannot be defined at the same time as `match`. Include multiple date processors in your pipeline if both options should be used.
destination | No | String | Field to store the timestamp parsed by date processor. It can be used with both `match` and `from_time_received`. Default value is `@timestamp`.
source_timezone | No | String | Timezone used for parsing dates. It will be used in case of zone or offset cannot be extracted from value. If zone or offset is part of the value timezone will be ignored. Find all the available timezones [here](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List) in "TZ database name" column.
source_timezone | No | String | Time zone used to parse dates. It is used when the zone or offset cannot be extracted from the value. If the zone or offset is part of the value, then `source_timezone` is ignored. For the available time zones, see the "TZ database name" column in [the list of tz database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List).
destination_timezone | No | String | Time zone used for storing the timestamp in the `destination` field. The available time zone values are the same as for `source_timezone`.
locale | No | String | Locale is used for parsing dates. It's commonly used for parsing month names(`MMM`). It can have language, country and variant fields using IETF BCP 47 or String representation of [Locale](https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html) object. For example `en-US` for IETF BCP 47 and `en_US` for string representation of Locale. Full list of locale fields which includes language, country and variant can be found [here](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry). Default value is `Locale.ROOT`.
locale | No | String | Locale used for parsing dates. It's commonly used for parsing month names (`MMM`). It can have language, country, and variant fields using IETF BCP 47 or the string representation of a [Locale](https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html) object. For example, `en-US` for IETF BCP 47 and `en_US` for the string representation of Locale. The full list of locale fields, which includes language, country, and variant, can be found in [the language subtag registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry). Default value is `Locale.ROOT`.
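
As a rough illustration of how the options in this table might fit together, the following fragment sketches a `date` processor that uses `match`. The field name `timestamp`, the pattern, and the time zone values are assumptions chosen for the example, not values taken from this page.

```yaml
processor:
  - date:
      # `match` and `from_time_received` cannot be combined in one processor
      match:
        - key: timestamp                        # hypothetical field in the event
          patterns: ["dd/MMM/yyyy:HH:mm:ss"]    # assumed format of that field
      destination: "@timestamp"                 # the documented default
      source_timezone: "America/Los_Angeles"    # used only if the value has no zone or offset
      destination_timezone: "America/Chicago"
      locale: "en_US"                           # string representation of Locale
```

To also record the time the source received the event, a second `date` processor with `from_time_received: true` would be added to the pipeline, as the table describes.
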
### drop_events
@ -188,8 +189,8 @@ Drops all the events that are passed into this processor.
Option | Required | Type | Description
:--- | :--- | :--- | :---
drop_when | Yes | String | Accepts a Data Prepper Expression string following the [Data Prepper Expression Syntax](https://github.com/opensearch-project/data-prepper/blob/main/docs/expression_syntax.md). Configuring `drop_events` with `drop_when: true` will drop all the events received.
handle_failed_events | No | Enum | Used when an exception occurs while evaluating an event which specifies how the exception will be handled. Default value is `drop`. Available options are `drop`, `drop_silenly`, `skip`, `skip_silently`. For more information, see [handle_failed_events](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/drop-events-processor#handle_failed_events).
drop_when | Yes | String | Accepts a Data Prepper Expression string following the [Data Prepper Expression Syntax](https://github.com/opensearch-project/data-prepper/blob/main/docs/expression_syntax.md). Configuring `drop_events` with `drop_when: true` drops all the events received.
handle_failed_events | No | Enum | Specifies how exceptions are handled when an exception occurs while evaluating an event. Default value is `drop`, which drops the event so it doesn't get sent to OpenSearch. Available options are `drop`, `drop_silently`, `skip`, `skip_silently`. For more information, see [handle_failed_events](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/drop-events-processor#handle_failed_events).
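
A minimal sketch of a `drop_events` configuration follows. The `/loglevel` key and the value `"DEBUG"` are hypothetical. The expression is written on the assumption that it follows the linked expression syntax, so check it against that document before use.

```yaml
processor:
  - drop_events:
      # Drop only events whose (hypothetical) loglevel field equals DEBUG
      drop_when: '/loglevel == "DEBUG"'
      # One of the documented options: drop, drop_silently, skip, skip_silently
      handle_failed_events: skip
```
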
### grok_prepper
@ -214,16 +215,16 @@ Takes in a field and parses it into key/value pairs.
Option | Required | Type | Description
:--- | :--- | :--- | :---
source | No | String | The key in the event that will be parsed. Default value is `message`.
destination | No | String | The key where parsed source will be output to. This will overwrite value of the key if it exists. Default value is `parsed_message`
field_delimiter_regex | Conditionally | String | A regex specifying the delimiter between key/value pairs. Special regex characters such as `[` and `]` must be escaped using `\\`. This cannot be defined at the same time as `field_split_characters`.
field_split_characters | Conditionally | String | A string of characters to split between key/value pairs. Special regex characters such as `[` and `]` must be escaped using `\\`. Default value is `&`. This cannot be defined at the same time as `field_delimiter_regex`.
key_value_delimiter_regex| Conditionally | String | A regex specifying the delimiter between a key and a value. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value. This cannot be defined at the same time as `value_split_characters`.
value_split_characters | Conditionally | String | A string of characters to split between keys and values. Special regex characters such as `[` and `]` must be escaped using `\\`. Default value is `=`. This cannot be defined at the same time as `key_value_delimiter_regex`.
non_match_value | No | String | When a key/value cannot be successfully split, the key/value will be placed in the key field and the specified value in the value field. Default value is `null`.
source | No | String | The key in the event that is parsed. Default value is `message`.
destination | No | String | The key to which the parsed source is output. Doing so overwrites the value of the key if it exists. Default value is `parsed_message`.
field_delimiter_regex | Conditionally | String | A regex specifying the delimiter between key/value pairs. Special regex characters such as `[` and `]` must be escaped using `\\`. This option cannot be defined at the same time as `field_split_characters`.
field_split_characters | Conditionally | String | A string of characters to split between key/value pairs. Special regex characters such as `[` and `]` must be escaped using `\\`. Default value is `&`. This option cannot be defined at the same time as `field_delimiter_regex`.
key_value_delimiter_regex| Conditionally | String | A regex specifying the delimiter between a key and a value. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value. This option cannot be defined at the same time as `value_split_characters`.
value_split_characters | Conditionally | String | A string of characters to split between keys and values. Special regex characters such as `[` and `]` must be escaped using `\\`. Default value is `=`. This option cannot be defined at the same time as `key_value_delimiter_regex`.
non_match_value | No | String | When a key/value pair cannot be successfully split, the key/value pair is placed in the key field and the specified value in the value field. Default value is `null`.
prefix | No | String | A prefix given to all keys. Default value is empty string.
delete_key_regex | No | String | A regex that will be used to delete characters from the key. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value.
delete_value_regex | No | String | A regex that will be used to delete characters from the value. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value.
delete_key_regex | No | String | A regex used to delete characters from the key. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value.
delete_value_regex | No | String | A regex used to delete characters from the value. Special regex characters such as `[` and `]` must be escaped using `\\`. There is no default value.
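
The options above could be combined roughly as in the following sketch. The processor name `key_value` is assumed because the section heading falls outside this hunk, and the `query_` prefix is invented for the example. The remaining values are the documented defaults, spelled out for clarity.

```yaml
processor:
  - key_value:                        # assumed processor name
      source: message                 # documented default
      destination: parsed_message     # documented default
      # Use either the *_split_characters or the *_regex variants, not both
      field_split_characters: "&"     # documented default
      value_split_characters: "="     # documented default
      prefix: "query_"                # hypothetical prefix added to every key
```

With this configuration, a `message` value such as `a=1&b=2` would be expected to produce the keys `query_a` and `query_b` under `parsed_message`.
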
### add_entries
@ -231,7 +232,10 @@ Adds an entry to event. `add_entries` is part of [mutate event](https://github.c
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of `key`: key of new event to be added <br> `value`: value of new entry to be added. Strings, booleans, numbers, null, nested objects, and arrays containing the aforementioned data types are valid to use. <br> `overwrite_if_key_exists`: boolean and optional, when set to `true`, if `key` already exists in the event, then the existing value will be overwritten. The default is `false`.
entries | Yes | List | List of entries to be added. Each entry accepts `key`, `value`, and `overwrite_if_key_exists`.
key | N/A | N/A | Key of the new entry to be added.
value | N/A | N/A | Value of the new entry to be added. Valid data types are strings, booleans, numbers, null, nested objects, and arrays containing the aforementioned data types.
overwrite_if_key_exists | No | Boolean | If `true`, the existing value gets overwritten if `key` already exists within the event. Default is `false`.
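
For illustration, an `add_entries` configuration might look like the following sketch. The key name `environment` and its value are hypothetical.

```yaml
processor:
  - add_entries:
      entries:
        - key: "environment"              # hypothetical key
          value: "staging"                # hypothetical value
          overwrite_if_key_exists: true   # replace the value if the key is already present
```
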
### copy_values
@ -239,7 +243,11 @@ Copy values within an event. `copy_values` is part of [mutate event](https://gi
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of `from_key`: The key of the entry to be copied <br> `to_key`: The key of the new entry to be added <br> `overwrite_if_to_key_exists`: boolean and optional, when set to `true`, if `to_key` already exists in the event, then the existing value will be overwritten. The default is `false`.
entries | Yes | List | List of entries to be copied. Each entry accepts `from_key`, `to_key`, and `overwrite_if_to_key_exists`.
from_key | N/A | N/A | The key of the entry to be copied.
to_key | N/A | N/A | The key of the new entry to be added.
overwrite_if_to_key_exists | No | Boolean | If `true`, the existing value gets overwritten if `to_key` already exists within the event. Default is `false`.
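
A minimal sketch of `copy_values` follows, assuming an event that contains a `message` key. Both key names are hypothetical.

```yaml
processor:
  - copy_values:
      entries:
        - from_key: "message"                  # hypothetical existing key
          to_key: "original_message"           # hypothetical new key
          overwrite_if_to_key_exists: false    # the documented default
```
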
### delete_entries
@ -255,7 +263,10 @@ Rename keys in an event. `rename_keys` is part of [mutate event](https://github.
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of `from_key`: The key of the entry to be renamed <br> `to_key`: The new key of the entry <br> `overwrite_if_to_key_exists`: boolean and optional, when set to `true`, if `to_key` already exists in the event, then the existing value will be overwritten. The default is `false`. <br> Renaming operation occurs in the order defined.
entries | Yes | List | List of entries. Each entry accepts `from_key`, `to_key`, and `overwrite_if_to_key_exists`. Renaming occurs in the order defined.
from_key | N/A | N/A | The key of the entry to be renamed.
to_key | N/A | N/A | The new key of the entry.
overwrite_if_to_key_exists | No | Boolean | If `true`, the existing value gets overwritten if `to_key` already exists in the event. Default is `false`.
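
Because renaming occurs in the order defined, entries can be chained. The following sketch uses hypothetical key names to show two renames applied in sequence.

```yaml
processor:
  - rename_keys:
      entries:
        # Applied first: msg -> message
        - from_key: "msg"
          to_key: "message"
          overwrite_if_to_key_exists: true
        # Applied second, on the result of the first rename: message -> log_message
        - from_key: "message"
          to_key: "log_message"
```
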
### substitute_string
@ -263,7 +274,10 @@ Matches a key's value against a regular expression and replaces all matches with
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of `source`: The key to be modified. <br> `from`: The Regex String to be replaced. Special regex characters such as `[` and `]` must be escaped using `\\` when using double quotes and `\ ` when using single quotes. See [here](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html) for more information. <br>`to`: The String to be substituted for each match of `from`.
entries | Yes | List | List of entries. Valid values are `source`, `from`, and `to`.
source | N/A | N/A | The key to modify.
from | N/A | N/A | The Regex String to be replaced. Special regex characters such as `[` and `]` must be escaped using `\\` when using double quotes and `\ ` when using single quotes. See [Java Patterns](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html) for more information.
to | N/A | N/A | The String to be substituted for each match of `from`.
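
As a sketch of the escaping rule described above, the following `substitute_string` fragment replaces every literal `[` in a hypothetical `message` key. The value is double-quoted, so the bracket is escaped with `\\`.

```yaml
processor:
  - substitute_string:
      entries:
        - source: "message"   # hypothetical key to modify
          from: "\\["         # regex matching a literal "[" (double-quoted, so \\ is needed)
          to: "("
```
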
### split_string
@ -271,7 +285,10 @@ Splits a field into an array using a delimiter character. `split_string` is part
Option | Required | Type | Description
:--- | :--- | :--- | :---
entries | Yes | List | List of `source`: The key to be split <br> `delimiter` (optional): The separator character responsible for the split. Cannot be defined at the same time as delimiter_regex. At least `delimiter` or `delimiter_regex` must be defined. <br> `delimiter_regex` (optional): A regex string responsible for the split. Cannot be defined at the same time as delimiter. At least `delimiter` or `delimiter_regex` must be defined.
entries | Yes | List | List of entries. Valid values are `source`, `delimiter`, and `delimiter_regex`.
source | N/A | N/A | The key to split.
delimiter | No | N/A | The separator character responsible for the split. Cannot be defined at the same time as `delimiter_regex`. Either `delimiter` or `delimiter_regex` must be defined.
delimiter_regex | No | N/A | The regex string responsible for the split. Cannot be defined at the same time as `delimiter`. Either `delimiter` or `delimiter_regex` must be defined.
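
A minimal `split_string` sketch follows, assuming a hypothetical `tags` key that holds a comma-separated string. Exactly one of `delimiter` or `delimiter_regex` is set.

```yaml
processor:
  - split_string:
      entries:
        - source: "tags"    # hypothetical key holding a value like "a,b,c"
          delimiter: ","    # do not combine with delimiter_regex
```
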
### uppercase_string