Added documentation for grok_when option (#5912)
* Added documentation for grok_when option

Signed-off-by: Kondaka <krishkdk@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/grok.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update grok.md

Address editorial review feedback.

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update grok.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

---------

Signed-off-by: Kondaka <krishkdk@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
parent 76276b2eb4
commit e63b67b873
@@ -1,27 +1,27 @@
 ---
 layout: default
-title: grok
+title: Grok
 parent: Processors
 grand_parent: Pipelines
 nav_order: 54
 ---
 
-# grok
+# Grok
 
-
-The `Grok` processor takes unstructured data and utilizes pattern matching to structure and extract important keys.
+The Grok processor uses pattern matching to structure and extract important keys from unstructured data.
 
 ## Configuration
 
-The following table describes options you can use with the `Grok` processor to structure your data and make your data easier to query.
+The following table describes options you can use with the Grok processor to structure your data and make your data easier to query.
 
 Option | Required | Type | Description
 :--- | :--- | :--- | :---
-match | No | Map | Specifies which keys to match specific patterns against. Default value is an empty body.
-keep_empty_captures | No | Boolean | Enables preserving `null` captures. Default value is `false`.
-named_captures_only | No | Boolean | Specifies whether to keep only named captures. Default value is `true`.
 break_on_match | No | Boolean | Specifies whether to match all patterns or stop once the first successful match is found. Default value is `true`.
+grok_when | No | String | Specifies under what condition the `Grok` processor should perform matching. Default is no condition.
+keep_empty_captures | No | Boolean | Enables the preservation of `null` captures. Default value is `false`.
 keys_to_overwrite | No | List | Specifies which existing keys will be overwritten if there is a capture with the same key value. Default value is `[]`.
+match | No | Map | Specifies which keys to match specific patterns against. Default value is an empty body.
+named_captures_only | No | Boolean | Specifies whether to keep only named captures. Default value is `true`.
 pattern_definitions | No | Map | Allows for custom pattern use inline. Default value is an empty body.
 patterns_directories | No | List | Specifies the path of directories that contain custom pattern files. Default value is an empty list.
 pattern_files_glob | No | String | Specifies which pattern files to use from the directories specified for `patterns_directories`. Default value is `*`.
@@ -32,6 +32,22 @@ timeout_millis | No | Integer | The maximum amount of time during which matching
 
 Content will be added to this section.--->
 
+## Conditional grok
+
+The Grok processor can be configured to run conditionally by using the `grok_when` option. The following is an example Grok processor configuration that uses `grok_when`:
+```
+processor:
+  - grok:
+      grok_when: '/type == "ipv4"'
+      match:
+        message: ['%{IPV4:clientip} %{WORD:request} %{POSINT:bytes}']
+  - grok:
+      grok_when: '/type == "ipv6"'
+      match:
+        message: ['%{IPV6:clientip} %{WORD:request} %{POSINT:bytes}']
+```
+The `grok_when` option can take a conditional expression. This expression is detailed in the [Expression syntax](https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/) documentation.
 
 ## Metrics
 
 The following table describes common [Abstract processor](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/processor/AbstractProcessor.java) metrics.
@@ -42,7 +58,7 @@ The following table describes common [Abstract processor](https://github.com/ope
 | `recordsOut` | Counter | Metric representing the egress of records from a pipeline component. |
 | `timeElapsed` | Timer | Metric representing the time elapsed during execution of a pipeline component. |
 
-The `Grok` processor includes the following custom metrics.
+The Grok processor includes the following custom metrics.
 
 ### Counter
 
@@ -53,4 +69,4 @@ The `Grok` processor includes the following custom metrics.
 
 ### Timer
 
-* `grokProcessingTime`: The time taken by individual records to match against patterns from `match`. The `avg` metric is the most useful metric for this timer because it provides you with an average value of the time it takes records to match.
+* `grokProcessingTime`: The time taken by individual records to match against `match` patterns. The `avg` metric is the most useful metric for this timer because it provides the average time taken to match records.
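
For context, the `grok_when` configuration documented in this commit can be placed inside a complete pipeline definition. The following is a minimal sketch and is not part of the commit: the pipeline name `grok-when-pipeline`, the `http` source, and the `stdout` sink are assumptions chosen only for illustration, and the sample events in the comments are hypothetical.

```yaml
# Hypothetical pipeline name (assumption, not from the commit).
grok-when-pipeline:
  source:
    http:          # assumed source; any source that emits a `message` key would do
  processor:
    - grok:
        # Evaluated per event. Matching is attempted only when /type equals "ipv4",
        # for example a hypothetical event such as:
        #   {"type": "ipv4", "message": "192.168.0.1 GET 512"}
        grok_when: '/type == "ipv4"'
        match:
          message: ['%{IPV4:clientip} %{WORD:request} %{POSINT:bytes}']
    - grok:
        # Same pattern shape, but only attempted when /type equals "ipv6".
        grok_when: '/type == "ipv6"'
        match:
          message: ['%{IPV6:clientip} %{WORD:request} %{POSINT:bytes}']
  sink:
    - stdout:      # assumed sink, used here only to inspect the result
```

Under these assumptions, an event whose `type` is neither `ipv4` nor `ipv6` would be expected to pass through both processors without any grok matching, because neither condition is satisfied.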