Merge pull request #16620 from dedemorton/add_ingest_doc
Add ingest docs to the build
commit 16e87bbe14
@ -41,6 +41,8 @@ include::modules.asciidoc[]

include::index-modules.asciidoc[]

include::ingest.asciidoc[]

include::testing.asciidoc[]

include::glossary.asciidoc[]
@ -0,0 +1,34 @@
[[ingest]]
= Ingest Node

[partintro]
--
Ingest node can be used to pre-process documents before the actual indexing takes place.
This pre-processing happens on an ingest node, which intercepts bulk and index requests, applies the
transformations, and then passes the documents back to the index or bulk APIs.

Ingest node is enabled by default. To disable ingest on a node, configure the following
setting in the elasticsearch.yml file:

[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------

It is possible to enable ingest on any node or have dedicated ingest nodes.
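
For orientation only (not part of this commit), a dedicated ingest node is usually created by disabling the other node roles in elasticsearch.yml; this sketch assumes the standard `node.master` and `node.data` role settings:

[source,yaml]
--------------------------------------------------
# sketch of a dedicated ingest node: disable the other roles
node.master: false
node.data: false
node.ingest: true
--------------------------------------------------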

To pre-process documents before indexing, set the `pipeline` parameter
on an index or bulk request to tell the ingest node which pipeline to use.

[source,js]
--------------------------------------------------
PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
{
  ...
}
--------------------------------------------------
// AUTOSENSE
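
For the bulk case, a hypothetical request might look like the following sketch; the index, type, and pipeline names are placeholders, not taken from this commit:

[source,js]
--------------------------------------------------
POST _bulk?pipeline=my_pipeline_id
{ "index" : { "_index" : "my-index", "_type" : "my-type", "_id" : "1" } }
{ "message" : "some value to pre-process" }
--------------------------------------------------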

--

include::ingest/ingest-node.asciidoc[]

@ -1,33 +1,5 @@
[[ingest]]
== Ingest Node

Ingest node can be used to pre-process documents before the actual indexing takes place.
This pre-processing happens by an ingest node that intercepts bulk and index requests, applies the
transformations and then passes the documents back to the index or bulk APIs.

Ingest node is enabled by default. In order to disable ingest the following
setting should be configured in the elasticsearch.yml file:

[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------

It is possible to enable ingest on any node or have dedicated ingest nodes.

In order to pre-process document before indexing the `pipeline` parameter should be used
on an index or bulk request to tell Ingest what pipeline is going to be used.

[source,js]
--------------------------------------------------
PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
{
  ...
}
--------------------------------------------------
// AUTOSENSE

=== Pipeline Definition
[[pipe-line]]
== Pipeline Definition

A pipeline is a definition of a series of processors that are to be
executed in the same sequential order as they are declared.
@ -45,7 +17,7 @@ what the pipeline attempts to achieve.
The `processors` parameter defines a list of processors to be executed in
order.
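
As a rough sketch (the field name and value below are invented for illustration), a pipeline body therefore has this shape:

[source,js]
--------------------------------------------------
{
  "description" : "describe what the pipeline does",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}
--------------------------------------------------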

=== Processors
== Processors

All processors are defined in the following way within a pipeline definition:

@ -67,7 +39,7 @@ but is very useful for bookkeeping and tracing errors to specific processors.
See <<handling-failure-in-pipelines>> to learn more about the `on_failure` field and error handling in pipelines.

==== Set processor
=== Set processor
Sets one field and associates it with the specified value. If the field already exists,
its value will be replaced with the provided one.
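
A minimal sketch of a `set` processor configuration (field name and value are placeholders, not taken from the commit):

[source,js]
--------------------------------------------------
{
  "set" : {
    "field" : "environment",
    "value" : "production"
  }
}
--------------------------------------------------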

@ -90,7 +62,7 @@ its value will be replaced with the provided one.
}
--------------------------------------------------

==== Append processor
=== Append processor
Appends one or more values to an existing array if the field already exists and it is an array.
Converts a scalar to an array and appends one or more values to it if the field exists and it is a scalar.
Creates an array containing the provided values if the field doesn't exist.
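
A minimal sketch of an `append` processor; the `tags` field and values are illustrative, and `value` accepts either a single value or an array:

[source,js]
--------------------------------------------------
{
  "append" : {
    "field" : "tags",
    "value" : ["production", "frontend"]
  }
}
--------------------------------------------------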

@ -115,7 +87,7 @@ Accepts a single value or an array of values.
}
--------------------------------------------------

==== Remove processor
=== Remove processor
Removes an existing field. If the field doesn't exist, an exception will be thrown.
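
A minimal sketch of a `remove` processor (the field name is a placeholder):

[source,js]
--------------------------------------------------
{
  "remove" : {
    "field" : "user_agent"
  }
}
--------------------------------------------------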

[[remove-options]]

@ -135,7 +107,7 @@ Removes an existing field. If the field doesn't exist, an exception will be thro
}
--------------------------------------------------

==== Rename processor
=== Rename processor
Renames an existing field. If the field doesn't exist, an exception will be thrown. Also, the new field
name must not already exist.
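
A rough sketch of a `rename` processor configuration; the option name for the new field name (`to` here) is an assumption, and the rename options table in the full file is authoritative:

[source,js]
--------------------------------------------------
{
  "rename" : {
    "field" : "hostname",
    "to" : "host"
  }
}
--------------------------------------------------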

@ -159,7 +131,7 @@ name must not exist.
--------------------------------------------------

==== Convert processor
=== Convert processor
Converts an existing field's value to a different type, like turning a string to an integer.
If the field value is an array, all members will be converted.
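
A minimal sketch of a `convert` processor, assuming the `field` and `type` options (the field name is illustrative):

[source,js]
--------------------------------------------------
{
  "convert" : {
    "field" : "bytes",
    "type" : "integer"
  }
}
--------------------------------------------------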

@ -187,7 +159,7 @@ false if its string value is equal to `false` (ignore case) and it will throw ex
}
--------------------------------------------------

==== Gsub processor
=== Gsub processor
Converts a string field by applying a regular expression and a replacement.
If the field is not a string, the processor will throw an exception.
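
A minimal sketch of a `gsub` processor, assuming `pattern` and `replacement` options (the field name and pattern are illustrative):

[source,js]
--------------------------------------------------
{
  "gsub" : {
    "field" : "phone_number",
    "pattern" : "-",
    "replacement" : "."
  }
}
--------------------------------------------------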

@ -212,7 +184,7 @@ If the field is not a string, the processor will throw an exception.
}
--------------------------------------------------

==== Join processor
=== Join processor
Joins each element of an array into a single string using a separator character between each element.
Throws an error when the field is not an array.
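
A minimal sketch of a `join` processor, assuming a `separator` option (the field name and separator are illustrative):

[source,js]
--------------------------------------------------
{
  "join" : {
    "field" : "tags",
    "separator" : ", "
  }
}
--------------------------------------------------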

@ -235,7 +207,7 @@ Throws error when the field is not an array.
}
--------------------------------------------------

==== Split processor
=== Split processor
Splits a field into an array using a separator character. Only works on string fields.
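
A minimal sketch of a `split` processor, assuming a `separator` option (the field name and separator are illustrative):

[source,js]
--------------------------------------------------
{
  "split" : {
    "field" : "message",
    "separator" : ","
  }
}
--------------------------------------------------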

[[split-options]]

@ -255,7 +227,7 @@ Split a field to an array using a separator character. Only works on string fiel
}
--------------------------------------------------

==== Lowercase processor
=== Lowercase processor
Converts a string to its lowercase equivalent.
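
A minimal sketch of a `lowercase` processor, which, like the uppercase and trim processors below, operates on a single `field` (the field name is illustrative):

[source,js]
--------------------------------------------------
{
  "lowercase" : {
    "field" : "email"
  }
}
--------------------------------------------------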

[[lowercase-options]]

@ -275,7 +247,7 @@ Converts a string to its lowercase equivalent.
}
--------------------------------------------------

==== Uppercase processor
=== Uppercase processor
Converts a string to its uppercase equivalent.

[[uppercase-options]]

@ -295,7 +267,7 @@ Converts a string to its uppercase equivalent.
}
--------------------------------------------------

==== Trim processor
=== Trim processor
Trims whitespace from a field. NOTE: this only removes leading and trailing whitespace.

[[trim-options]]

@ -315,7 +287,7 @@ Trims whitespace from field. NOTE: this only works on leading and trailing white
}
--------------------------------------------------

==== Grok Processor
=== Grok Processor

The Grok Processor extracts structured fields out of a single text field within a document. You choose which field to
extract matched fields from, as well as the Grok Pattern you expect will match. A Grok Pattern is like a regular

@ -330,7 +302,7 @@ Here, you can add your own custom grok pattern files with custom grok expression
If you need help building patterns to match your logs, you will find the <http://grokdebug.herokuapp.com> and
<http://grokconstructor.appspot.com/> applications quite useful!

===== Grok Basics
==== Grok Basics

Grok sits on top of regular expressions, so any regular expressions are valid in grok as well.
The regular expression library is Oniguruma, and you can see the full supported regexp syntax

@ -367,7 +339,7 @@ Grok expression.
%{NUMBER:duration} %{IP:client}
--------------------------------------------------

===== Custom Patterns and Pattern Files
==== Custom Patterns and Pattern Files

The Grok Processor comes pre-packaged with a base set of pattern files. These patterns may not always have
what you are looking for. These pattern files have a very basic format. Each line describes a named pattern with

@ -393,7 +365,7 @@ SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
--------------------------------------------------

===== Using Grok Processor in a Pipeline
==== Using Grok Processor in a Pipeline

[[grok-options]]
.Grok Options

@ -417,7 +389,7 @@ a document.

The pattern for this could be:

[source]
[source,js]
--------------------------------------------------
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
--------------------------------------------------

@ -474,7 +446,7 @@ An example of a pipeline specifying custom pattern definitions:
}
--------------------------------------------------

==== Date processor
=== Date processor

The date processor is used for parsing dates from fields, and then using that date or timestamp as the timestamp for that document.
By default, the date processor adds the parsed date as a new field called `@timestamp`; this can be changed by setting the `target_field`

@ -512,7 +484,7 @@ An example that adds the parsed date to the `timestamp` field based on the `init
}
--------------------------------------------------

==== Fail processor
=== Fail processor
The Fail Processor is used to raise an exception. This is useful when
a user expects a pipeline to fail and wishes to relay a specific message
to the requester.
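
A minimal sketch of a `fail` processor, assuming a `message` option (the message text is illustrative):

[source,js]
--------------------------------------------------
{
  "fail" : {
    "message" : "custom error message for the requester"
  }
}
--------------------------------------------------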

@ -534,7 +506,7 @@ to the requester.
}
--------------------------------------------------

==== Foreach processor
=== Foreach processor
All processors can operate on elements inside an array, but if all elements of an array need to
be processed in the same way, defining a processor for each element becomes cumbersome and tricky
because the number of elements in an array is often unknown. For this reason the `foreach`

@ -680,7 +652,7 @@ In this example if the `remove` processor does fail then
the array elements that have been processed thus far will
be updated.

=== Accessing data in pipelines
== Accessing data in pipelines

Processors in pipelines have read and write access to documents that pass through the pipeline.
The fields in the source of a document and its metadata fields are accessible.

@ -781,7 +753,8 @@ to depends on the field in the source with name `geoip.country_iso_code`.
}
--------------------------------------------------

==== Handling Failure in Pipelines
[[handling-failure-in-pipelines]]
=== Handling Failure in Pipelines

In its simplest case, pipelines describe a list of processors which
are executed sequentially and processing halts at the first exception. This

@ -845,7 +818,7 @@ the index for which failed documents get sent.
--------------------------------------------------

===== Accessing Error Metadata From Processors Handling Exceptions
==== Accessing Error Metadata From Processors Handling Exceptions

Sometimes you may want to retrieve the actual error message that was thrown
by a failed processor. To do so you can access metadata fields called

@ -878,9 +851,9 @@ of manually setting it.
--------------------------------------------------

=== Ingest APIs
== Ingest APIs

==== Put pipeline API
=== Put pipeline API

The put pipeline api adds pipelines and updates existing pipelines in the cluster.

@ -904,7 +877,7 @@ PUT _ingest/pipeline/my-pipeline-id
NOTE: The put pipeline api also instructs all ingest nodes to reload their in-memory representation of pipelines, so that
pipeline changes take effect immediately.

==== Get pipeline API
=== Get pipeline API

The get pipeline api returns pipelines based on id. This api always returns a local reference of the pipeline.
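
A minimal sketch of such a request (the pipeline id is a placeholder):

[source,js]
--------------------------------------------------
GET _ingest/pipeline/my-pipeline-id
--------------------------------------------------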

@ -940,7 +913,7 @@ For each returned pipeline the source and the version is returned.
The version is useful for knowing what version of the pipeline the node has.
Multiple ids can be provided at the same time, and wildcards are also supported.

==== Delete pipeline API
=== Delete pipeline API

The delete pipeline api deletes pipelines by id.

@ -950,7 +923,7 @@ DELETE _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE

==== Simulate pipeline API
=== Simulate pipeline API

The simulate pipeline api executes a specific pipeline against
the set of documents provided in the body of the request.
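
A rough sketch of the request shape, assuming an inline `pipeline` definition and a `docs` array; all names and values here are placeholders:

[source,js]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "inline pipeline for testing",
    "processors" : [
      {
        "set" : {
          "field" : "foo",
          "value" : "bar"
        }
      }
    ]
  },
  "docs" : [
    {
      "_source" : {
        "message" : "some document to run through the pipeline"
      }
    }
  ]
}
--------------------------------------------------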