diff --git a/_observability-plugin/ppl/commands.md b/_observability-plugin/ppl/commands.md
index 7b5725b8..f237ead2 100644
--- a/_observability-plugin/ppl/commands.md
+++ b/_observability-plugin/ppl/commands.md
@@ -151,6 +151,10 @@ search source=accounts | dedup gender consecutive=true | fields account_number,
 13 | F
 18 | M
 
+### Limitations
+
+The `dedup` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## eval
 
 The `eval` command evaluates an expression and appends its result to the search result.
@@ -211,6 +215,11 @@ search source=accounts | eval doubleAge = age * 2, ddAge = doubleAge * 2 | field
 | 28 | 56 | 112
 | 33 | 66 | 132
 
+
+### Limitations
+
+The `eval` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## fields
 
 Use the `fields` command to keep or remove fields from a search result.
@@ -256,6 +265,80 @@ search source=accounts | fields account_number, firstname, lastname | fields - a
 | Nanette | Bates
 | Dale | Adams
 
+
+## parse
+
+Use the `parse` command to parse a text field using a regular expression and append the result to the search result.
+
+### Syntax
+
+```sql
+parse <field> <regular-expression>
+```
+
+Field | Description | Required
+:--- | :--- |:---
+field | A text field. | Yes
+regular-expression | The regular expression used to extract new fields from the given text field. If a new field name exists, it will replace the original field. | Yes
+
+The regular expression is used to match the whole text field of each document with the Java regex engine. Each named capture group in the expression will become a new `STRING` field.
+
+*Example 1*: Create a new field
+
+The example shows how to create a new field `host` for each document. `host` will be the hostname after `@` in the `email` field. Parsing a null field will return an empty string.
+
+```sql
+os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
+fetched rows / total rows = 4/4
+```
+
+| email | host
+:--- | :--- |
+| amberduke@pyrami.com | pyrami.com
+| hattiebond@netagy.com | netagy.com
+| null | null
+| daleadams@boink.com | boink.com
+
+*Example 2*: Override the existing field
+
+The example shows how to override the existing `address` field with the street number removed.
+
+```sql
+os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
+fetched rows / total rows = 4/4
+```
+
+| address
+:--- |
+| Holmes Lane
+| Bristol Street
+| Madison Street
+| Hutchinson Court
+
+*Example 3*: Filter and sort by a cast parsed field
+
+The example shows how to sort street numbers that are higher than 500 in the `address` field.
+
+```sql
+os> source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where cast(streetNumber as int) > 500 | sort num(streetNumber) | fields streetNumber, street ;
+fetched rows / total rows = 3/3
+```
+
+| streetNumber | street
+:--- | :--- |
+| 671 | Bristol Street
+| 789 | Madison Street
+| 880 | Holmes Lane
+
+### Limitations
+
+A few limitations exist when using the `parse` command:
+
+- Fields defined by parse cannot be parsed again. For example, `source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)' ;` will fail to return any expressions.
+- Fields defined by parse cannot be overridden with other commands. For example, when entering `source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ;`, `where` will not match any documents because `street` cannot be overridden.
+- The text field used by parse cannot be overridden. For example, when entering `source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ;`, `street` will not be parsed because `address` is overridden.
+- Fields defined by parse cannot be filtered or sorted after using them in the `stats` command. For example, in `source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ;`, `where` will not parse the domain listed.
+
 ## rename
 
 Use the `rename` command to rename one or more fields in the search result.
@@ -301,6 +384,10 @@ search source=accounts | rename account_number as an, employer as emp | fields a
 | 13 | Quility
 | 18 | null
 
+### Limitations
+
+The `rename` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## sort
 
 Use the `sort` command to sort search results by a specified field.
@@ -547,6 +634,10 @@ search source=accounts | fields firstname, age | head 2;
 | Amber | 32
 | Hattie | 36
 
+### Limitations
+
+The `head` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## rare
 
 Use the `rare` command to find the least common values of all fields in a field list.
@@ -590,6 +681,10 @@ search source=accounts | rare age by gender;
 | M | 32
 | M | 33
 
+### Limitations
+
+The `rare` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## top {#top-command}
 
 Use the `top` command to find the most common values of all fields in the field list.
@@ -644,6 +739,10 @@ search source=accounts | top 1 age by gender;
 | F | 28
 | M | 32
 
+### Limitations
+
+The `top` command is not rewritten to OpenSearch DSL; it is executed only on the coordinating node.
+
 ## match
 
 Use the `match` command to search documents that match a `string`, `number`, `date`, or `boolean` value for a given field.
diff --git a/_search-plugins/sql/datatypes.md b/_search-plugins/sql/datatypes.md
index 6927082a..a11fa233 100644
--- a/_search-plugins/sql/datatypes.md
+++ b/_search-plugins/sql/datatypes.md
@@ -23,6 +23,7 @@ double | double | DOUBLE
 keyword | string | VARCHAR
 text | text | VARCHAR
 date | timestamp | TIMESTAMP
+date_nanos | timestamp | TIMESTAMP
 ip | ip | VARCHAR
 date | timestamp | TIMESTAMP
 binary | binary | VARBINARY
@@ -54,7 +55,7 @@ The `time` type represents the time of a clock regardless of its timezone. The `
 
 | Type | Syntax | Range
 :--- | :--- | :---
-time | `hh:mm:ss[.fraction]` | `00:00:00.000000` to `23:59:59.999999`
+time | `hh:mm:ss[.fraction]` | `00:00:00.0000000000` to `23:59:59.9999999999`
 
 ### Datetime
 
@@ -62,7 +63,7 @@ The `datetime` type is a combination of date and time. It doesn't contain timezo
 
 | Type | Syntax | Range
 :--- | :--- | :---
-datetime | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:00.000000` to `9999-12-31 23:59:59.999999`
+datetime | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:00.0000000000` to `9999-12-31 23:59:59.9999999999`
 
 ### Timestamp
 
@@ -72,7 +73,7 @@ The `timestamp` type is stored differently from the other types. It's converted
 
 | Type | Syntax | Range
 :--- | :--- | :---
-timestamp | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:01.000000` UTC to `9999-12-31 23:59:59.999999`
+timestamp | `yyyy-MM-dd hh:mm:ss[.fraction]` | `0001-01-01 00:00:01.9999999999` UTC to `9999-12-31 23:59:59.9999999999`
 
 ### Interval
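
For reviewers who want to sanity-check the `parse` semantics described in the `commands.md` change, the following is a minimal Python sketch, not the plugin's implementation. It assumes whole-field matching can be approximated with `re.fullmatch`, and that null or non-matching input yields empty strings (the diff states this only for null fields). The helper name `parse_field` and the `12 Example Road` address are hypothetical. Python spells named groups `(?P<name>...)`, whereas the Java engine used by PPL spells them `(?<name>...)`.

```python
import re

# Python named-group syntax; the PPL equivalents are '.+@(?<host>.+)'
# and '(?<streetNumber>\d+) (?<street>.+)'.
HOST = re.compile(r".+@(?P<host>.+)")
STREET = re.compile(r"(?P<streetNumber>\d+) (?P<street>.+)")


def parse_field(value, pattern):
    """Return one string per named capture group, matching the whole value.

    Assumption: null or non-matching input yields empty strings.
    """
    names = list(pattern.groupindex)
    match = pattern.fullmatch(value) if value is not None else None
    if match is None:
        return {name: "" for name in names}
    return {name: match.group(name) or "" for name in names}


# Example 1 analogue: extract the host part of an email address.
emails = ["amberduke@pyrami.com", None]
hosts = [parse_field(email, HOST)["host"] for email in emails]

# Example 3 analogue: parsed fields are strings, so cast before
# filtering and sorting numerically.
addresses = [
    "880 Holmes Lane",
    "671 Bristol Street",
    "789 Madison Street",
    "12 Example Road",  # hypothetical row, included to show the filter
]
parsed = [parse_field(address, STREET) for address in addresses]
over_500 = sorted(
    (row for row in parsed if int(row["streetNumber"]) > 500),
    key=lambda row: int(row["streetNumber"]),
)
```

Every captured value stays a string, which is why the documented query wraps `streetNumber` in `cast(... as int)` before comparing it with 500.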