Add CIDR and User Agent processors (#4585)

* Add CIDR and User Agent processors

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add CIDR functions

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix typos

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add technical feedback

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update user-agent.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update _data-prepper/pipelines/expression-syntax.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update user-agent.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
This commit is contained in:
Naarcha-AWS 2023-07-27 14:58:47 -05:00 committed by GitHub
parent 6badff1b33
commit c14b8b2a07
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 100 additions and 0 deletions

View File

@ -0,0 +1,63 @@
---
layout: default
title: user_agent
parent: Processors
grand_parent: Pipelines
nav_order: 130
---
# user_agent
The `user_agent` processor parses any user agent (UA) string in an event and then adds the parsing results to the event's write data.
## Usage
In this example, the `user_agent` processor calls the source that contains the UA string, the `ua` field, and indicates the key to which the parsed string will write, `user_agent`, as shown in the following example:
```yaml
processor:
- user_agent:
source: "ua"
target: "user_agent"
```
The following example event contains the `ua` field with a string that provides information about a user:
```json
{
"ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
}
```
The `user_agent` processor parses the string into a format compatible with Elastic Common Schema (ECS) and then adds the result to the specified target, as shown in the following example:
```json
{
"user_agent": {
"original": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1",
"os": {
"version": "13.5.1",
"full": "iOS 13.5.1",
"name": "iOS"
},
"name": "Mobile Safari",
"version": "13.1.1",
"device": {
"name": "iPhone"
}
},
"ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
}
```
## Configuration options
You can use the following configuration options with the `user_agent` processor.
| Option | Required | Description |
| :--- | :--- | :--- |
| `source` | Yes | The field in the event that will be parsed.
| `target` | No | The field to which the parsed event will write. Default is `user_agent`.
| `exclude_original` | No | Determines whether to exclude the original UA string from the parsing result. Defaults to `false`.
| `cache_size` | No | The cache size of the parser in megabytes. Defaults to `1000`. |
| `tags_on_parse_failure` | No | The tag to add to an event if the `user_agent` processor fails to parse the UA string. |

View File

@ -22,6 +22,7 @@ Operators are listed in order of precedence (top to bottom, left to right).
| `and`, `or` | Conditional Expression | left-to-right |
## Reserved for possible future functionality
Reserved symbol set: `^`, `*`, `/`, `%`, `+`, `-`, `xor`, `=`, `+=`, `-=`, `*=`, `/=`, `%=`, `++`, `--`, `${<text>}`
## Set initializer
@ -33,15 +34,19 @@ The set initializer defines a set or term and/or expressions.
The following are examples of set initializer syntax.
#### HTTP status codes
```
{200, 201, 202}
```
#### HTTP response payloads
```
{"Created", "Accepted"}
```
#### Handle multiple event types with different keys
```
{/request_payload, /request_message}
```
@ -57,9 +62,11 @@ A priority expression identifies an expression that will be evaluated at the hig
```
## Relational operators
Relational operators are used to test the relationship of two numeric values. The operands must be numbers or JSON Pointers that resolve to numbers.
### Syntax
```
<Number | JSON Pointer> < <Number | JSON Pointer>
<Number | JSON Pointer> <= <Number | JSON Pointer>
@ -68,11 +75,13 @@ Relational operators are used to test the relationship of two numeric values. Th
```
### Example
```
/status_code >= 200 and /status_code < 300
```
## Equality operators
Equality operators are used to test whether two values are equivalent.
### Syntax
@ -106,6 +115,7 @@ null != /response
```
#### Conditional expression
A conditional expression is used to chain together multiple expressions and/or values.
#### Syntax
@ -208,3 +218,30 @@ White space is **required** surrounding set initializers, priority expressions,
| `==`, `!=` | Equality operators | No | `/status == 200`<br>`/status_code==200` | |
| `and`, `or`, `not` | Conditional operators | Yes | `/a<300 and /b>200` | `/b<300and/b>200` |
| `,` | Set value delimiter | No | `/a in {200, 202}`<br>`/a in {200,202}`<br>`/a in {200 , 202}` | `/a in {200,}` |
## Functions
Data Prepper supports the following built-in functions that can be used in an expression.
### `length()`
The `length()` function takes one argument of the JSON pointer type and returns the length of the value passed. For example, `length(/message)` returns a length of `10` when a key message exists in the event and has a value of `1234567890`.
### `hasTags()`
The `hastags()` function takes one or more string type arguments and returns `true` if all the arguments passed are present in an event's tags. When an argument does not exist in the event's tags, the function returns `false`. For example, if you use the expression `hasTags("tag1")` and the event contains `tag1`, Data Prepper returns `true`. If you use the expression `hasTags("tag2")` but the event only contains a `tag1` tag, Data Prepper returns `false`.
### `getMetadata()`
The `getMetadata()` function takes one literal string argument to look up specific keys in a an event's metadata. If the key contains a `/`, then the function looks up the metadata recursively. When passed, the expression returns the value corresponding to the key. The value returned can be of any type. For example, if the metadata contains `{"key1": "value2", "key2": 10}`, then the function, `getMetadata("key1")`, returns `value2`. The function, `getMetadata("key2")`, returns 10.
### `contains()`
The `contains()` function takes two string arguments and determines whether either a literal string or a JSON pointer is contained within an event. When the second argument contains a substring of the first argument, such as `contains("abcde", "abcd")`, the function returns `true`. If the second argument does not contain any substrings, such as `contains("abcde", "xyz")`, it returns `false`.
### `cidrContains()`
The `cidrContains()` function takes two or more arguments. The first argument is a JSON pointer, which represents the key to the IP address that is checked. It supports both IPv4 and IPv6 addresses. Every argument that comes after the key is a string type that represents CIDR blocks that are checked against.
If the IP address in the first argument is in the range of any of the given CIDR blocks, the function returns `true`. If the IP address is not in the range of the CIDR blocks, the function returns `false`. For example, `cidrContains(/sourceIp,"192.0.2.0/24","10.0.1.0/16")` will return `true` if the `sourceIp` field indicated in the JSON pointer has a value of `192.0.2.5`.