Merge pull request #16620 from dedemorton/add_ingest_doc
Add ingest docs to the build
This commit is contained in:
commit 16e87bbe14
@@ -41,6 +41,8 @@ include::modules.asciidoc[]
 
 include::index-modules.asciidoc[]
 
+include::ingest.asciidoc[]
+
 include::testing.asciidoc[]
 
 include::glossary.asciidoc[]
@@ -0,0 +1,34 @@
+[[ingest]]
+= Ingest Node
+
+[partintro]
+--
+Ingest node can be used to pre-process documents before the actual indexing takes place.
+This pre-processing happens by an ingest node that intercepts bulk and index requests, applies the
+transformations, and then passes the documents back to the index or bulk APIs.
+
+Ingest node is enabled by default. In order to disable ingest, the following
+setting should be configured in the elasticsearch.yml file:
+
+[source,yaml]
+--------------------------------------------------
+node.ingest: false
+--------------------------------------------------
+
+It is possible to enable ingest on any node or to have dedicated ingest nodes.
+
+In order to pre-process documents before indexing, the `pipeline` parameter should be used
+on an index or bulk request to tell Ingest which pipeline is going to be used.
+
+[source,js]
+--------------------------------------------------
+PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
+{
+  ...
+}
+--------------------------------------------------
+// AUTOSENSE
+
+--
+
+include::ingest/ingest-node.asciidoc[]
@@ -1,33 +1,5 @@
-[[ingest]]
-== Ingest Node
+[[pipe-line]]
+== Pipeline Definition
 
-Ingest node can be used to pre-process documents before the actual indexing takes place.
-This pre-processing happens by an ingest node that intercepts bulk and index requests, applies the
-transformations and then passes the documents back to the index or bulk APIs.
-
-Ingest node is enabled by default. In order to disable ingest the following
-setting should be configured in the elasticsearch.yml file:
-
-[source,yaml]
---------------------------------------------------
-node.ingest: false
---------------------------------------------------
-
-It is possible to enable ingest on any node or have dedicated ingest nodes.
-
-In order to pre-process document before indexing the `pipeline` parameter should be used
-on an index or bulk request to tell Ingest what pipeline is going to be used.
-
-[source,js]
---------------------------------------------------
-PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
-{
-  ...
-}
---------------------------------------------------
-// AUTOSENSE
-
-=== Pipeline Definition
-
 A pipeline is a definition of a series of processors that are to be
 executed in the same sequential order as they are declared.
@@ -45,7 +17,7 @@ what the pipeline attempts to achieve.
 The `processors` parameter defines a list of processors to be executed in
 order.
 
-=== Processors
+== Processors
 
 All processors are defined in the following way within a pipeline definition:
 
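The generic form elided from this hunk can be sketched as a single-key object that names the processor, with its configuration options nested inside (the placeholder names below are illustrative, not a real processor):

[source,js]
--------------------------------------------------
{
  "PROCESSOR_NAME" : {
    ... processor configuration options ...
  }
}
--------------------------------------------------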
@@ -67,7 +39,7 @@ but is very useful for bookkeeping and tracing errors to specific processors.
 
 See <<handling-failure-in-pipelines>> to learn more about the `on_failure` field and error handling in pipelines.
 
-==== Set processor
+=== Set processor
 Sets one field and associates it with the specified value. If the field already exists,
 its value will be replaced with the provided one.
 
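As a sketch, a minimal set processor could look like this (the field name and value are illustrative):

[source,js]
--------------------------------------------------
{
  "set": {
    "field": "field1",
    "value": 582.1
  }
}
--------------------------------------------------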
@@ -90,7 +62,7 @@ its value will be replaced with the provided one.
 }
 --------------------------------------------------
 
-==== Append processor
+=== Append processor
 Appends one or more values to an existing array if the field already exists and it is an array.
 Converts a scalar to an array and appends one or more values to it if the field exists and it is a scalar.
 Creates an array containing the provided values if the field doesn't exist.
@@ -115,7 +87,7 @@ Accepts a single value or an array of values.
 }
 --------------------------------------------------
 
-==== Remove processor
+=== Remove processor
 Removes an existing field. If the field doesn't exist, an exception will be thrown.
 
 [[remove-options]]
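A minimal remove processor could be sketched as follows, assuming `field` is its only required option (the field name is illustrative):

[source,js]
--------------------------------------------------
{
  "remove": {
    "field": "foo"
  }
}
--------------------------------------------------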
@@ -135,7 +107,7 @@ Removes an existing field. If the field doesn't exist, an exception will be thrown.
 }
 --------------------------------------------------
 
-==== Rename processor
+=== Rename processor
 Renames an existing field. If the field doesn't exist, an exception will be thrown. Also, the new field
 name must not exist.
 
@@ -159,7 +131,7 @@ name must not exist.
 --------------------------------------------------
 
 
-==== Convert processor
+=== Convert processor
 Converts an existing field's value to a different type, like turning a string to an integer.
 If the field value is an array, all members will be converted.
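A convert processor could be sketched like this, assuming `field` and `type` options where `type` names the target type (field name and type are illustrative):

[source,js]
--------------------------------------------------
{
  "convert": {
    "field": "bytes",
    "type": "integer"
  }
}
--------------------------------------------------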
@@ -187,7 +159,7 @@ false if its string value is equal to `false` (ignore case), and it will throw an exception otherwise.
 }
 --------------------------------------------------
 
-==== Gsub processor
+=== Gsub processor
 Converts a string field by applying a regular expression and a replacement.
 If the field is not a string, the processor will throw an exception.
 
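A gsub sketch, assuming `field`, `pattern`, and `replacement` options (the values shown replace dots with dashes and are illustrative):

[source,js]
--------------------------------------------------
{
  "gsub": {
    "field": "field1",
    "pattern": "\\.",
    "replacement": "-"
  }
}
--------------------------------------------------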
@@ -212,7 +184,7 @@ If the field is not a string, the processor will throw an exception.
 }
 --------------------------------------------------
 
-==== Join processor
+=== Join processor
 Joins each element of an array into a single string using a separator character between each element.
 Throws an error when the field is not an array.
 
@@ -235,7 +207,7 @@ Throws an error when the field is not an array.
 }
 --------------------------------------------------
 
-==== Split processor
+=== Split processor
 Splits a field into an array using a separator character. Only works on string fields.
 
 [[split-options]]
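A split sketch, assuming `field` and `separator` options (field name and separator are illustrative):

[source,js]
--------------------------------------------------
{
  "split": {
    "field": "message",
    "separator": ","
  }
}
--------------------------------------------------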
@@ -255,7 +227,7 @@ Splits a field into an array using a separator character. Only works on string fields.
 }
 --------------------------------------------------
 
-==== Lowercase processor
+=== Lowercase processor
 Converts a string to its lowercase equivalent.
 
 [[lowercase-options]]
@@ -275,7 +247,7 @@ Converts a string to its lowercase equivalent.
 }
 --------------------------------------------------
 
-==== Uppercase processor
+=== Uppercase processor
 Converts a string to its uppercase equivalent.
 
 [[uppercase-options]]
@@ -295,7 +267,7 @@ Converts a string to its uppercase equivalent.
 }
 --------------------------------------------------
 
-==== Trim processor
+=== Trim processor
 Trims whitespace from a field. NOTE: this only works on leading and trailing whitespace.
 
 [[trim-options]]
@@ -315,7 +287,7 @@ Trims whitespace from a field.
 }
 --------------------------------------------------
 
-==== Grok Processor
+=== Grok Processor
 
 The Grok Processor extracts structured fields out of a single text field within a document. You choose which field to
 extract matched fields from, as well as the Grok Pattern you expect will match. A Grok Pattern is like a regular
@@ -330,7 +302,7 @@ Here, you can add your own custom grok pattern files with custom grok expressions.
 If you need help building patterns to match your logs, you will find the <http://grokdebug.herokuapp.com> and
 <http://grokconstructor.appspot.com/> applications quite useful!
 
-===== Grok Basics
+==== Grok Basics
 
 Grok sits on top of regular expressions, so any regular expressions are valid in grok as well.
 The regular expression library is Oniguruma, and you can see the full supported regexp syntax
@@ -367,7 +339,7 @@ Grok expression.
 %{NUMBER:duration} %{IP:client}
 --------------------------------------------------
 
-===== Custom Patterns and Pattern Files
+==== Custom Patterns and Pattern Files
 
 The Grok Processor comes pre-packaged with a base set of pattern files. These patterns may not always have
 what you are looking for. These pattern files have a very basic format. Each line describes a named pattern with
@@ -393,7 +365,7 @@ SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
 TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
 --------------------------------------------------
 
-===== Using Grok Processor in a Pipeline
+==== Using Grok Processor in a Pipeline
 
 [[grok-options]]
 .Grok Options
@@ -417,7 +389,7 @@ a document.
 
 The pattern for this could be
 
-[source]
+[source,js]
 --------------------------------------------------
 %{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
 --------------------------------------------------
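Putting that pattern to work, a grok processor inside a pipeline could be sketched as follows; the `pattern` option name and the `message` field are assumptions here, only the pattern itself comes from the example above:

[source,js]
--------------------------------------------------
{
  "grok": {
    "field": "message",
    "pattern": "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
  }
}
--------------------------------------------------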
@@ -474,7 +446,7 @@ An example of a pipeline specifying custom pattern definitions:
 }
 --------------------------------------------------
 
-==== Date processor
+=== Date processor
 
 The date processor is used for parsing dates from fields, and then using that date or timestamp as the timestamp for that document.
 The date processor adds the parsed date as a new field called `@timestamp` by default, configurable by setting the `target_field`
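A date processor could be sketched like this; the option names (`match_field`, `match_formats`, `target_field`, `timezone`) and values are assumptions for illustration:

[source,js]
--------------------------------------------------
{
  "date" : {
    "match_field" : "initial_date",
    "target_field" : "timestamp",
    "match_formats" : ["dd/MM/yyyy hh:mm:ss"],
    "timezone" : "Europe/Amsterdam"
  }
}
--------------------------------------------------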
|
@ -512,7 +484,7 @@ An example that adds the parsed date to the `timestamp` field based on the `init
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
==== Fail processor
|
=== Fail processor
|
||||||
The Fail Processor is used to raise an exception. This is useful for when
|
The Fail Processor is used to raise an exception. This is useful for when
|
||||||
a user expects a pipeline to fail and wishes to relay a specific message
|
a user expects a pipeline to fail and wishes to relay a specific message
|
||||||
to the requester.
|
to the requester.
|
||||||
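A fail processor sketch, assuming a single `message` option carrying the text relayed to the requester:

[source,js]
--------------------------------------------------
{
  "fail": {
    "message": "an error message"
  }
}
--------------------------------------------------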
|
@ -534,7 +506,7 @@ to the requester.
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
==== Foreach processor
|
=== Foreach processor
|
||||||
All processors can operate on elements inside an array, but if all elements of an array need to
|
All processors can operate on elements inside an array, but if all elements of an array need to
|
||||||
be processed in the same way defining a processor for each element becomes cumbersome and tricky
|
be processed in the same way defining a processor for each element becomes cumbersome and tricky
|
||||||
because it is likely that the number of elements in an array are unknown. For this reason the `foreach`
|
because it is likely that the number of elements in an array are unknown. For this reason the `foreach`
|
||||||
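A foreach sketch; the `field`/`processors` option names and the `_value` placeholder for the current array element are assumptions for illustration:

[source,js]
--------------------------------------------------
{
  "foreach" : {
    "field" : "values",
    "processors" : [
      {
        "uppercase" : {
          "field" : "_value"
        }
      }
    ]
  }
}
--------------------------------------------------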
|
@ -680,7 +652,7 @@ In this example if the `remove` processor does fail then
|
||||||
the array elements that have been processed thus far will
|
the array elements that have been processed thus far will
|
||||||
be updated.
|
be updated.
|
||||||
|
|
||||||
=== Accessing data in pipelines
|
== Accessing data in pipelines
|
||||||
|
|
||||||
Processors in pipelines have read and write access to documents that pass through the pipeline.
|
Processors in pipelines have read and write access to documents that pass through the pipeline.
|
||||||
The fields in the source of a document and its metadata fields are accessible.
|
The fields in the source of a document and its metadata fields are accessible.
|
||||||
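As a sketch of such access, a set processor might read existing source fields through template snippets when building a new value (the field names and the `{{ }}` template syntax are assumptions for illustration):

[source,js]
--------------------------------------------------
{
  "set": {
    "field": "field_c",
    "value": "{{field_a}} {{field_b}}"
  }
}
--------------------------------------------------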
|
@ -781,7 +753,8 @@ to depends on the field in the source with name `geoip.country_iso_code`.
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
==== Handling Failure in Pipelines
|
[[handling-failure-in-pipelines]]
|
||||||
|
=== Handling Failure in Pipelines
|
||||||
|
|
||||||
In its simplest case, pipelines describe a list of processors which
|
In its simplest case, pipelines describe a list of processors which
|
||||||
are executed sequentially and processing halts at the first exception. This
|
are executed sequentially and processing halts at the first exception. This
|
||||||
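A sketch of a processor with an `on_failure` block; the processor choice, option names, and field values are illustrative assumptions:

[source,js]
--------------------------------------------------
{
  "rename" : {
    "field" : "foo",
    "to" : "bar",
    "on_failure" : [
      {
        "set" : {
          "field" : "error",
          "value" : "field \"foo\" does not exist, cannot rename to \"bar\""
        }
      }
    ]
  }
}
--------------------------------------------------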
|
@ -845,7 +818,7 @@ the index for which failed documents get sent.
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
===== Accessing Error Metadata From Processors Handling Exceptions
|
==== Accessing Error Metadata From Processors Handling Exceptions
|
||||||
|
|
||||||
Sometimes you may want to retrieve the actual error message that was thrown
|
Sometimes you may want to retrieve the actual error message that was thrown
|
||||||
by a failed processor. To do so you can access metadata fields called
|
by a failed processor. To do so you can access metadata fields called
|
||||||
|
@@ -878,9 +851,9 @@ of manually setting it.
 --------------------------------------------------
 
 
-=== Ingest APIs
+== Ingest APIs
 
-==== Put pipeline API
+=== Put pipeline API
 
 The put pipeline api adds pipelines and updates existing pipelines in the cluster.
 
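A put pipeline request could be sketched as follows (the pipeline id matches the hunk context; the description and processor body are illustrative):

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}
--------------------------------------------------
// AUTOSENSE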
@@ -904,7 +877,7 @@ PUT _ingest/pipeline/my-pipeline-id
 NOTE: The put pipeline api also instructs all ingest nodes to reload their in-memory representation of pipelines, so that
 pipeline changes take effect immediately.
 
-==== Get pipeline API
+=== Get pipeline API
 
 The get pipeline api returns pipelines based on id. This api always returns a local reference of the pipeline.
 
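A get pipeline request could be sketched as (pipeline id illustrative):

[source,js]
--------------------------------------------------
GET _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE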
@@ -940,7 +913,7 @@ For each returned pipeline the source and the version are returned.
 The version is useful for knowing what version of the pipeline the node has.
 Multiple ids can be provided at the same time. Also, wildcards are supported.
 
-==== Delete pipeline API
+=== Delete pipeline API
 
 The delete pipeline api deletes pipelines by id.
 
@@ -950,7 +923,7 @@ DELETE _ingest/pipeline/my-pipeline-id
 --------------------------------------------------
 // AUTOSENSE
 
-==== Simulate pipeline API
+=== Simulate pipeline API
 
 The simulate pipeline api executes a specific pipeline against
 the set of documents provided in the body of the request.
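A simulate request could be sketched as follows; the inline `pipeline` and `docs` bodies are illustrative placeholders:

[source,js]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "a test pipeline",
    "processors" : [
      {
        "set" : {
          "field" : "foo",
          "value" : "bar"
        }
      }
    ]
  },
  "docs" : [
    {
      "_source" : {
        "message" : "a sample document"
      }
    }
  ]
}
--------------------------------------------------
// AUTOSENSE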