Merge pull request #443 from opensearch-project/observe-ppl-1.3

[1.3] Add Observability changes
Naarcha-AWS 2022-03-17 13:14:26 -05:00 committed by GitHub
commit 808857224e
2 changed files with 103 additions and 3 deletions

@ -151,6 +151,10 @@ search source=accounts | dedup gender consecutive=true | fields account_number,
13 | F
18 | M
### Limitations
The `dedup` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
## eval
The `eval` command evaluates an expression and appends its result to the search result.
@ -211,6 +215,11 @@ search source=accounts | eval doubleAge = age * 2, ddAge = doubleAge * 2 | field
| 28 | 56 | 112
| 33 | 66 | 132
### Limitations
The `eval` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
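The chained `eval` example above (`doubleAge = age * 2, ddAge = doubleAge * 2`) can be mimicked in plain Python to show how each expression appends a field and how a later expression can read an earlier one. This is only an illustrative sketch, not how the plugin evaluates queries; the helper name `eval_fields` is hypothetical, and the two sample ages are taken from the result rows shown above.

```python
def eval_fields(row):
    """Append doubleAge and ddAge to a copy of the row, in that order."""
    out = dict(row)  # leave the input row untouched
    out["doubleAge"] = out["age"] * 2
    # The second expression reads the field created by the first one.
    out["ddAge"] = out["doubleAge"] * 2
    return out

rows = [{"age": 28}, {"age": 33}]
evaluated = [eval_fields(r) for r in rows]
for r in evaluated:
    print(r["age"], r["doubleAge"], r["ddAge"])
```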
## fields
Use the `fields` command to keep or remove fields from a search result.
@ -256,6 +265,80 @@ search source=accounts | fields account_number, firstname, lastname | fields - a
| Nanette | Bates
| Dale | Adams
## parse
Use the `parse` command to parse a text field using a regular expression and append the result to the search result.
### Syntax
```sql
parse <field> <regular-expression>
```
Field | Description | Required
:--- | :--- |:---
field | A text field. | Yes
regular-expression | The regular expression used to extract new fields from the given text field. If a new field name already exists, the new field replaces the original field. | Yes
The regular expression is matched against the whole text field of each document using the Java regex engine. Each named capture group in the expression becomes a new `STRING` field.
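The matching rule described above can be tried outside OpenSearch. The following is a minimal Python sketch, not the plugin's implementation. It assumes whole-field matching behaves like Python's `re.fullmatch`, and it spells named groups as `(?P<name>...)` because Python does not accept the Java `(?<name>...)` form that PPL uses. The helper name `parse_field` is hypothetical.

```python
import re

def parse_field(doc, field, pattern):
    """Return a copy of doc with one string field per named capture group."""
    result = dict(doc)
    regex = re.compile(pattern)
    value = doc.get(field)
    # The pattern has to match the whole field value, not just a substring.
    match = regex.fullmatch(value) if isinstance(value, str) else None
    for name in regex.groupindex:
        # A null field yields an empty string, as the documentation states.
        # Treating a non-matching value the same way is an assumption.
        result[name] = (match.group(name) or "") if match else ""
    return result

doc = {"email": "amberduke@pyrami.com"}
parsed = parse_field(doc, "email", r".+@(?P<host>.+)")
print(parsed["host"])
```

Because a group name that matches an existing key simply overwrites it in `result`, the same helper also mirrors the override behavior described in the table.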
*Example 1*: Create a new field
The example shows how to create a new field, `host`, for each document. `host` is the hostname that follows `@` in the `email` field. Parsing a null field will return an empty string.
```sql
os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
fetched rows / total rows = 4/4
```
| email | host
:--- | :--- |
| amberduke@pyrami.com | pyrami.com
| hattiebond@netagy.com | netagy.com
| null | null
| daleadams@boink.com | boink.com
*Example 2*: Override an existing field
The example shows how to override the existing `address` field with the street number removed.
```sql
os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
fetched rows / total rows = 4/4
```
| address
:--- |
| Holmes Lane
| Bristol Street
| Madison Street
| Hutchinson Court
*Example 3*: Filter and sort by a cast parsed field
The example shows how to filter for street numbers higher than 500 in the `address` field and sort the results by street number.
```sql
os> source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where cast(streetNumber as int) > 500 | sort num(streetNumber) | fields streetNumber, street ;
fetched rows / total rows = 3/3
```
| streetNumber | street
:--- | :--- |
| 671 | Bristol Street
| 789 | Madison Street
| 880 | Holmes Lane
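Example 3 casts `streetNumber` before comparing and sorting because fields produced by `parse` are strings. The Python sketch below illustrates the same idea under stated assumptions: it is not the plugin's code, it uses Python's `(?P<name>...)` group syntax, and the street number for Hutchinson Court is a made-up value below 500, since the example output does not show it.

```python
import re

# Sample addresses; "123" for Hutchinson Court is a hypothetical value.
addresses = [
    "880 Holmes Lane",
    "671 Bristol Street",
    "789 Madison Street",
    "123 Hutchinson Court",
]

pattern = re.compile(r"(?P<streetNumber>\d+) (?P<street>.+)")

# Each match yields string fields, just as parse creates STRING fields.
rows = [m.groupdict() for a in addresses if (m := pattern.fullmatch(a))]

# Cast to int before filtering and sorting so the comparison is numeric.
filtered = [r for r in rows if int(r["streetNumber"]) > 500]
ordered = sorted(filtered, key=lambda r: int(r["streetNumber"]))

for r in ordered:
    print(r["streetNumber"], r["street"])
```

Without the cast, the comparison would be lexicographic: as strings, `"1000" > "500"` is false, even though 1000 is numerically greater than 500.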
### Limitations
The following limitations apply when using the `parse` command:
- Fields defined by `parse` cannot be parsed again. For example, `source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)' ;` does not return any results.
- Fields defined by `parse` cannot be overridden by other commands. For example, in `source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ;`, `where` does not match any documents because `street` cannot be overridden.
- The text field used by `parse` cannot be overridden. For example, in `source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ;`, `street` is not parsed because `address` is overridden.
- Fields defined by `parse` cannot be filtered or sorted after they are used in the `stats` command. For example, in `source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ;`, `where` cannot filter on the parsed `host` field.
## rename
Use the `rename` command to rename one or more fields in the search result.
@ -301,6 +384,10 @@ search source=accounts | rename account_number as an, employer as emp | fields a
| 13 | Quility
| 18 | null
### Limitations
The `rename` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
## sort
Use the `sort` command to sort search results by a specified field.
@ -547,6 +634,10 @@ search source=accounts | fields firstname, age | head 2;
| Amber | 32
| Hattie | 36
### Limitations
The `head` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
## rare
Use the `rare` command to find the least common values of all fields in a field list.
@ -590,6 +681,10 @@ search source=accounts | rare age by gender;
| M | 32
| M | 33
### Limitations
The `rare` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
## top {#top-command}
Use the `top` command to find the most common values of all fields in the field list.
@ -644,6 +739,10 @@ search source=accounts | top 1 age by gender;
| F | 28
| M | 32
### Limitations
The `top` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
## match
Use the `match` command to search documents that match a `string`, `number`, `date`, or `boolean` value for a given field.

@ -23,6 +23,7 @@ double | double | DOUBLE
keyword | string | VARCHAR
text | text | VARCHAR
date | timestamp | TIMESTAMP
date_nanos | timestamp | TIMESTAMP
ip | ip | VARCHAR
date | timestamp | TIMESTAMP
binary | binary | VARBINARY
@ -54,7 +55,7 @@ The `time` type represents the time of a clock regardless of its timezone. The `
| Type | Syntax | Range | Type | Syntax | Range
:--- | :--- | :--- :--- | :--- | :---
time | `hh:mm:ss[.fraction]` | `00:00:00.000000` to `23:59:59.999999` time | `hh:mm:ss[.fraction]` | `00:00:00.0000000000` to `23:59:59.9999999999`
### Datetime ### Datetime
@ -62,7 +63,7 @@ The `datetime` type is a combination of date and time. It doesn't contain timezo
| Type | Syntax | Range | Type | Syntax | Range
:--- | :--- | :--- :--- | :--- | :---
datetime | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:00.000000` to `9999-12-31 23:59:59.999999` datetime | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:00.0000000000` to `9999-12-31 23:59:59.9999999999`
### Timestamp ### Timestamp
@ -72,7 +73,7 @@ The `timestamp` type is stored differently from the other types. It's converted
| Type | Syntax | Range | Type | Syntax | Range
:--- | :--- | :--- :--- | :--- | :---
timestamp | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:01.000000` UTC to `9999-12-31 23:59:59.999999` timestamp | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:01.9999999999` UTC to `9999-12-31 23:59:59.9999999999`
### Interval ### Interval