From 894efa3fb6eaeecf0c3ec1d63af0cc97176aa009 Mon Sep 17 00:00:00 2001 From: Tal Levy Date: Mon, 25 Jan 2016 12:06:39 -0800 Subject: [PATCH] update ingest docs - move ingest plugin docs to core reference docs - move geoip processor docs to plugins/ingest-geoip.asciidoc - add missing options tables for some processors - add description of pipeline definition - add description of processor definitions including common parameters like "tag" and "on_failure" --- docs/plugins/ingest-geoip.asciidoc | 64 ++++++ .../ingest}/ingest.asciidoc | 216 ++++++++++++------ 2 files changed, 212 insertions(+), 68 deletions(-) create mode 100644 docs/plugins/ingest-geoip.asciidoc rename docs/{plugins => reference/ingest}/ingest.asciidoc (87%) diff --git a/docs/plugins/ingest-geoip.asciidoc b/docs/plugins/ingest-geoip.asciidoc new file mode 100644 index 00000000000..539c29971a4 --- /dev/null +++ b/docs/plugins/ingest-geoip.asciidoc @@ -0,0 +1,64 @@ +[[ingest-geoip]] +== Ingest Geoip Processor Plugin + +The GeoIP processor adds information about the geographical location of IP addresses, based on data from the Maxmind databases. +This processor adds this information by default under the `geoip` field. + +The ingest plugin ships by default with the GeoLite2 City and GeoLite2 Country geoip2 databases from Maxmind made available +under the CCA-ShareAlike 3.0 license. For more details see, http://dev.maxmind.com/geoip/geoip2/geolite2/ + +The GeoIP processor can run with other geoip2 databases from Maxmind. The files must be copied into the geoip config directory +and the `database_file` option should be used to specify the filename of the custom database. The geoip config directory +is located at `$ES_HOME/config/ingest/geoip` and holds the shipped databases too. + +[[geoip-options]] +.Geoip options +[options="header"] +|====== +| Name | Required | Default | Description +| `source_field` | yes | - | The field to get the ip address or hostname from for the geographical lookup. 
+| `target_field` | no | geoip | The field that will hold the geographical information looked up from the Maxmind database. +| `database_file` | no | GeoLite2-City.mmdb | The database filename in the geoip config directory. The ingest plugin ships with the GeoLite2-City.mmdb and GeoLite2-Country.mmdb files. +| `fields` | no | [`continent_name`, `country_iso_code`, `region_name`, `city_name`, `location`] <1> | Controls what properties are added to the `target_field` based on the geoip lookup. +|====== + +<1> Depends on what is available in `database_file`: +* If the GeoLite2 City database is used then the following fields may be added under the `target_field`: `ip`, +`country_iso_code`, `country_name`, `continent_name`, `region_name`, `city_name`, `timezone`, `latitude`, `longitude` +and `location`. The fields actually added depend on what has been found and which fields were configured in `fields`. +* If the GeoLite2 Country database is used then the following fields may be added under the `target_field`: `ip`, +`country_iso_code`, `country_name` and `continent_name`. The fields actually added depend on what has been found and which fields were configured in `fields`. 
+ +An example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field: + +[source,js] +-------------------------------------------------- +{ + "description" : "...", + "processors" : [ + { + "geoip" : { + "source_field" : "ip" + } + } + ] +} +-------------------------------------------------- + +An example that uses the default country database and adds the geographical information to the `geo` field based on the `ip` field: + +[source,js] +-------------------------------------------------- +{ + "description" : "...", + "processors" : [ + { + "geoip" : { + "source_field" : "ip", + "target_field" : "geo", + "database_file" : "GeoLite2-Country.mmdb" + } + } + ] +} +-------------------------------------------------- diff --git a/docs/plugins/ingest.asciidoc b/docs/reference/ingest/ingest.asciidoc similarity index 87% rename from docs/plugins/ingest.asciidoc rename to docs/reference/ingest/ingest.asciidoc index 72336086b23..0c049f82b69 100644 --- a/docs/plugins/ingest.asciidoc +++ b/docs/reference/ingest/ingest.asciidoc @@ -28,12 +28,59 @@ PUT /my-index/my-type/my-id?pipeline=my_pipeline_id -------------------------------------------------- // AUTOSENSE +=== Pipeline Definition + +A pipeline is a definition of a series of processors that are to be +executed in the same sequential order as they are declared. +[source,js] +-------------------------------------------------- +{ + "description" : "...", + "processors" : [ ... ] +} +-------------------------------------------------- + +The `description` is a special field to store a helpful description of +what the pipeline attempts to achieve. + +The `processors` parameter defines a list of processors to be executed in +order. + === Processors +All processors are defined in the following way within a pipeline definition: + +[source,js] +-------------------------------------------------- +{ + "PROCESSOR_NAME" : { + ... processor configuration options ... 
+ } +} +-------------------------------------------------- + +Each processor defines its own configuration parameters, but all processors have +the ability to declare `tag` and `on_failure` fields. These fields are optional. + +A `tag` is simply a string identifier of the specific instantiation of a certain +processor in a pipeline. The `tag` field does not affect any processor's behavior, +but is very useful for bookkeeping and tracing errors to specific processors. + +See <> to learn more about the `on_failure` field and error handling in pipelines. + ==== Set processor Sets one field and associates it with the specified value. If the field already exists, its value will be replaced with the provided one. +[[set-options]] +.Set Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to insert, upsert, or update +| `value` | yes | - | The value to be set for the field +|====== + [source,js] -------------------------------------------------- { @@ -50,6 +97,15 @@ Converts a scalar to an array and appends one or more values to it if the field Creates an array containing the provided values if the fields doesn't exist. Accepts a single value or an array of values. +[[append-options]] +.Append Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to be appended to +| `value` | yes | - | The value to be appended +|====== + [source,js] -------------------------------------------------- { @@ -63,6 +119,14 @@ Accepts a single value or an array of values. ==== Remove processor Removes an existing field. If the field doesn't exist, an exception will be thrown +[[remove-options]] +.Remove Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to be removed +|====== + [source,js] -------------------------------------------------- { @@ -76,6 +140,15 @@ Removes an existing field. 
If the field doesn't exist, an exception will be thro Renames an existing field. If the field doesn't exist, an exception will be thrown. Also, the new field name must not exist. +[[rename-options]] +.Rename Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to be renamed +| `to` | yes | - | The new name of the field +|====== + [source,js] -------------------------------------------------- { @@ -96,6 +169,15 @@ The supported types include: `integer`, `float`, `string`, and `boolean`. `boolean` will set the field to true if its string value is equal to `true` (ignore case), to false if its string value is equal to `false` (ignore case) and it will throw exception otherwise. +[[convert-options]] +.Convert Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field whose value is to be converted +| `type` | yes | - | The type to convert the existing value to +|====== + [source,js] -------------------------------------------------- { @@ -110,9 +192,15 @@ false if its string value is equal to `false` (ignore case) and it will throw ex Converts a string field by applying a regular expression and a replacement. If the field is not a string, the processor will throw an exception. -This configuration takes a `field` for the field name, `pattern` for the -pattern to be replaced, and `replacement` for the string to replace the matching patterns with. - +[[gsub-options]] +.Gsub Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to apply the replacement to +| `pattern` | yes | - | The pattern to be replaced +| `replacement` | yes | - | The string to replace the matching patterns with. 
+|====== [source,js] -------------------------------------------------- @@ -129,6 +217,15 @@ pattern to be replaced, and `replacement` for the string to replace the matching Joins each element of an array into a single string using a separator character between each element. Throws error when the field is not an array. +[[join-options]] +.Join Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The array field whose elements are to be joined +| `separator` | yes | - | The separator character +|====== + [source,js] -------------------------------------------------- { @@ -142,6 +239,14 @@ Throws error when the field is not an array. ==== Split processor Split a field to an array using a separator character. Only works on string fields. +[[split-options]] +.Split Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to split +|====== + [source,js] -------------------------------------------------- { @@ -154,6 +259,14 @@ Split a field to an array using a separator character. Only works on string fiel ==== Lowercase processor Converts a string to its lowercase equivalent. +[[lowercase-options]] +.Lowercase Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to lowercase +|====== + [source,js] -------------------------------------------------- { @@ -166,6 +279,14 @@ Converts a string to its lowercase equivalent. ==== Uppercase processor Converts a string to its uppercase equivalent. +[[uppercase-options]] +.Uppercase Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The field to uppercase +|====== + [source,js] -------------------------------------------------- { @@ -178,6 +299,14 @@ Converts a string to its uppercase equivalent. ==== Trim processor Trims whitespace from field. NOTE: this only works on leading and trailing whitespaces. 
+[[trim-options]] +.Trim Options +[options="header"] +|====== +| Name | Required | Default | Description +| `field` | yes | - | The string-valued field to trim whitespace from +|====== + [source,js] -------------------------------------------------- { @@ -346,71 +475,6 @@ An example of a pipeline specifying custom pattern definitions: } -------------------------------------------------- - -==== Geoip processor - -The GeoIP processor adds information about the geographical location of IP addresses, based on data from the Maxmind databases. -This processor adds this information by default under the `geoip` field. - -The ingest plugin ships by default with the GeoLite2 City and GeoLite2 Country geoip2 databases from Maxmind made available -under the CCA-ShareAlike 3.0 license. For more details see, http://dev.maxmind.com/geoip/geoip2/geolite2/ - -The GeoIP processor can run with other geoip2 databases from Maxmind. The files must be copied into the geoip config directory -and the `database_file` option should be used to specify the filename of the custom database. The geoip config directory -is located at `$ES_HOME/config/ingest/geoip` and holds the shipped databases too. - -[[geoip-options]] -.Geoip options -[options="header"] -|====== -| Name | Required | Default | Description -| `source_field` | yes | - | The field to get the ip address or hostname from for the geographical lookup. -| `target_field` | no | geoip | The field that will hold the geographical information looked up from the Maxmind database. -| `database_file` | no | GeoLite2-City.mmdb | The database filename in the geoip config directory. The ingest plugin ships with the GeoLite2-City.mmdb and GeoLite2-Country.mmdb files. -| `fields` | no | [`continent_name`, `country_iso_code`, `region_name`, `city_name`, `location`] <1> | Controls what properties are added to the `target_field` based on the geoip lookup. 
-|====== - -<1> Depends on what is available in `database_field`: -* If the GeoLite2 City database is used then the following fields may be added under the `target_field`: `ip`, -`country_iso_code`, `country_name`, `continent_name`, `region_name`, `city_name`, `timezone`, `latitude`, `longitude` -and `location`. The fields actually added depend on what has been found and which fields were configured in `fields`. -* If the GeoLite2 Country database is used then the following fields may be added under the `target_field`: `ip`, -`country_iso_code`, `country_name` and `continent_name`.The fields actually added depend on what has been found and which fields were configured in `fields`. - -An example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field: - -[source,js] --------------------------------------------------- -{ - "description" : "...", - "processors" : [ - { - "geoip" : { - "source_field" : "ip" - } - } - ] -} --------------------------------------------------- - -An example that uses the default country database and add the geographical information to the `geo` field based on the `ip` field`: - -[source,js] --------------------------------------------------- -{ - "description" : "...", - "processors" : [ - { - "geoip" : { - "source_field" : "ip", - "target_field" : "geo", - "database_file" : "GeoLite2-Country.mmdb" - } - } - ] -} --------------------------------------------------- - ==== Date processor The date processor is used for parsing dates from fields, and then using that date or timestamp as the timestamp for that document. @@ -454,6 +518,14 @@ The Fail Processor is used to raise an exception. This is useful for when a user expects a pipeline to fail and wishes to relay a specific message to the requester. 
+[[fail-options]] +.Fail Options +[options="header"] +|====== +| Name | Required | Default | Description +| `message` | yes | - | The error message of the `FailException` thrown by the processor +|====== + [source,js] -------------------------------------------------- { @@ -467,6 +539,14 @@ to the requester. The DeDot Processor is used to remove dots (".") from field names and replace them with a specific `separator` string. +[[dedot-options]] +.DeDot Options +[options="header"] +|====== +| Name | Required | Default | Description +| `separator` | yes | "_" | The string to replace dots with in all field names +|====== + [source,js] -------------------------------------------------- {