Merge pull request #16620 from dedemorton/add_ingest_doc

Add ingest docs to the build
DeDe Morton 2016-02-12 07:33:06 -08:00
commit 16e87bbe14
3 changed files with 67 additions and 58 deletions


@@ -41,6 +41,8 @@ include::modules.asciidoc[]
include::index-modules.asciidoc[]
include::ingest.asciidoc[]
include::testing.asciidoc[]
include::glossary.asciidoc[]


@@ -0,0 +1,34 @@
[[ingest]]
= Ingest Node
[partintro]
--
Ingest node can be used to pre-process documents before the actual indexing takes place.
This pre-processing happens on an ingest node, which intercepts bulk and index requests, applies the
transformations, and then passes the documents back to the index or bulk APIs.
Ingest is enabled by default on every node. To disable ingest on a node, configure the following
setting in the `elasticsearch.yml` file:
[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------
It is possible to enable ingest on any node or have dedicated ingest nodes.
To pre-process documents before indexing, set the `pipeline` parameter
on an index or bulk request to tell the ingest node which pipeline to use.
[source,js]
--------------------------------------------------
PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
{
...
}
--------------------------------------------------
// AUTOSENSE
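The `pipeline` parameter can also be set on a bulk request. A minimal sketch, assuming the bulk API accepts the same URL parameter (the index, type, and document shown are illustrative):
[source,js]
--------------------------------------------------
POST /my-index/my-type/_bulk?pipeline=my_pipeline_id
{ "index" : { "_id" : "1" } }
{ "message" : "some value" }
--------------------------------------------------
// AUTOSENSE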
--
include::ingest/ingest-node.asciidoc[]


@@ -1,33 +1,5 @@
[[ingest]]
== Ingest Node
Ingest node can be used to pre-process documents before the actual indexing takes place.
This pre-processing happens on an ingest node, which intercepts bulk and index requests, applies the
transformations, and then passes the documents back to the index or bulk APIs.
Ingest is enabled by default on every node. To disable ingest on a node, configure the following
setting in the `elasticsearch.yml` file:
[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------
It is possible to enable ingest on any node or have dedicated ingest nodes.
To pre-process documents before indexing, set the `pipeline` parameter
on an index or bulk request to tell the ingest node which pipeline to use.
[source,js]
--------------------------------------------------
PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
{
...
}
--------------------------------------------------
// AUTOSENSE
=== Pipeline Definition
[[pipe-line]]
== Pipeline Definition
A pipeline is a definition of a series of processors that are to be
executed in the same sequential order as they are declared.
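For example, a pipeline body has the following general shape (a minimal sketch; the description text and the processor shown are illustrative):
[source,js]
--------------------------------------------------
{
  "description" : "describes what this pipeline does",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}
--------------------------------------------------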
@@ -45,7 +17,7 @@ what the pipeline attempts to achieve.
The `processors` parameter defines a list of processors to be executed in
order.
=== Processors
== Processors
All processors are defined in the following way within a pipeline definition:
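A sketch of that general shape, with placeholder names:
[source,js]
--------------------------------------------------
{
  "PROCESSOR_NAME" : {
    ... processor configuration options ...
  }
}
--------------------------------------------------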
@@ -67,7 +39,7 @@ but is very useful for bookkeeping and tracing errors to specific processors.
See <<handling-failure-in-pipelines>> to learn more about the `on_failure` field and error handling in pipelines.
==== Set processor
=== Set processor
Sets one field and associates it with the specified value. If the field already exists,
its value will be replaced with the provided one.
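For example, a `set` processor that writes a literal value (the field name and value are illustrative):
[source,js]
--------------------------------------------------
{
  "set" : {
    "field" : "host",
    "value" : "web-01"
  }
}
--------------------------------------------------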
@@ -90,7 +62,7 @@ its value will be replaced with the provided one.
}
--------------------------------------------------
==== Append processor
=== Append processor
Appends one or more values to an existing array if the field already exists and it is an array.
Converts a scalar to an array and appends one or more values to it if the field exists and it is a scalar.
Creates an array containing the provided values if the field doesn't exist.
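For example, an `append` processor that adds tags (the field name and values are illustrative):
[source,js]
--------------------------------------------------
{
  "append" : {
    "field" : "tags",
    "value" : ["production", "web"]
  }
}
--------------------------------------------------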
@@ -115,7 +87,7 @@ Accepts a single value or an array of values.
}
--------------------------------------------------
==== Remove processor
=== Remove processor
Removes an existing field. If the field doesn't exist, an exception will be thrown.
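For example (the field name is illustrative):
[source,js]
--------------------------------------------------
{
  "remove" : {
    "field" : "temporary_field"
  }
}
--------------------------------------------------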
[[remove-options]]
@@ -135,7 +107,7 @@ Removes an existing field. If the field doesn't exist, an exception will be thro
}
--------------------------------------------------
==== Rename processor
=== Rename processor
Renames an existing field. If the field doesn't exist, an exception will be thrown. Also, the new field
name must not exist.
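A sketch, assuming a `to` option names the new field (the option and field names here are assumptions; check the rename options table for the exact names):
[source,js]
--------------------------------------------------
{
  "rename" : {
    "field" : "hostname",
    "to" : "host"
  }
}
--------------------------------------------------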
@@ -159,7 +131,7 @@ name must not exist.
--------------------------------------------------
==== Convert processor
=== Convert processor
Converts an existing field's value to a different type, like turning a string to an integer.
If the field value is an array, all members will be converted.
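For example, a `convert` processor that turns a string field into an integer (the field name is illustrative):
[source,js]
--------------------------------------------------
{
  "convert" : {
    "field" : "load_time",
    "type" : "integer"
  }
}
--------------------------------------------------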
@@ -187,7 +159,7 @@ false if its string value is equal to `false` (ignore case) and it will throw ex
}
--------------------------------------------------
==== Gsub processor
=== Gsub processor
Converts a string field by applying a regular expression and a replacement.
If the field is not a string, the processor will throw an exception.
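For example, a `gsub` processor that replaces all dots in a field with dashes (the field name is illustrative):
[source,js]
--------------------------------------------------
{
  "gsub" : {
    "field" : "field1",
    "pattern" : "\\.",
    "replacement" : "-"
  }
}
--------------------------------------------------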
@@ -212,7 +184,7 @@ If the field is not a string, the processor will throw an exception.
}
--------------------------------------------------
==== Join processor
=== Join processor
Joins each element of an array into a single string using a separator character between each element.
Throws an error when the field is not an array.
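For example (the field name and separator are illustrative):
[source,js]
--------------------------------------------------
{
  "join" : {
    "field" : "joined_array_field",
    "separator" : "-"
  }
}
--------------------------------------------------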
@@ -235,7 +207,7 @@ Throws error when the field is not an array.
}
--------------------------------------------------
==== Split processor
=== Split processor
Splits a field into an array using a separator character. Only works on string fields.
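For example, a `split` processor that splits on commas (the field name is illustrative):
[source,js]
--------------------------------------------------
{
  "split" : {
    "field" : "my_field",
    "separator" : ","
  }
}
--------------------------------------------------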
[[split-options]]
@@ -255,7 +227,7 @@ Split a field to an array using a separator character. Only works on string fiel
}
--------------------------------------------------
==== Lowercase processor
=== Lowercase processor
Converts a string to its lowercase equivalent.
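For example (the field name is illustrative; the `uppercase` and `trim` processors below take a single `field` option in the same way):
[source,js]
--------------------------------------------------
{
  "lowercase" : {
    "field" : "foo"
  }
}
--------------------------------------------------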
[[lowercase-options]]
@@ -275,7 +247,7 @@ Converts a string to its lowercase equivalent.
}
--------------------------------------------------
==== Uppercase processor
=== Uppercase processor
Converts a string to its uppercase equivalent.
[[uppercase-options]]
@@ -295,7 +267,7 @@ Converts a string to its uppercase equivalent.
}
--------------------------------------------------
==== Trim processor
=== Trim processor
Trims whitespace from a field. NOTE: this only works on leading and trailing whitespace.
[[trim-options]]
@@ -315,7 +287,7 @@ Trims whitespace from field. NOTE: this only works on leading and trailing white
}
--------------------------------------------------
==== Grok Processor
=== Grok Processor
The Grok Processor extracts structured fields out of a single text field within a document. You choose which field to
extract matched fields from, as well as the Grok Pattern you expect will match. A Grok Pattern is like a regular
@@ -330,7 +302,7 @@ Here, you can add your own custom grok pattern files with custom grok expression
If you need help building patterns to match your logs, you will find the <http://grokdebug.herokuapp.com> and
<http://grokconstructor.appspot.com/> applications quite useful!
===== Grok Basics
==== Grok Basics
Grok sits on top of regular expressions, so any regular expressions are valid in grok as well.
The regular expression library is Oniguruma, and you can see the full supported regexp syntax
@@ -367,7 +339,7 @@ Grok expression.
%{NUMBER:duration} %{IP:client}
--------------------------------------------------
===== Custom Patterns and Pattern Files
==== Custom Patterns and Pattern Files
The Grok Processor comes pre-packaged with a base set of pattern files. These patterns may not always have
what you are looking for. These pattern files have a very basic format. Each line describes a named pattern with
@@ -393,7 +365,7 @@ SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
--------------------------------------------------
===== Using Grok Processor in a Pipeline
==== Using Grok Processor in a Pipeline
[[grok-options]]
.Grok Options
@@ -417,7 +389,7 @@ a document.
The pattern for this could be
[source]
[source,js]
--------------------------------------------------
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
--------------------------------------------------
@@ -474,7 +446,7 @@ An example of a pipeline specifying custom pattern definitions:
}
--------------------------------------------------
==== Date processor
=== Date processor
The date processor is used for parsing dates from fields, and then using that date or timestamp as the timestamp for that document.
By default, the date processor adds the parsed date as a new field called `@timestamp`, configurable by setting the `target_field`
@@ -512,7 +484,7 @@ An example that adds the parsed date to the `timestamp` field based on the `init
}
--------------------------------------------------
==== Fail processor
=== Fail processor
The Fail Processor is used to raise an exception. This is useful when
a user expects a pipeline to fail and wishes to relay a specific message
to the requester.
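For example (the message text is illustrative):
[source,js]
--------------------------------------------------
{
  "fail" : {
    "message" : "custom error message for the requester"
  }
}
--------------------------------------------------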
@@ -534,7 +506,7 @@ to the requester.
}
--------------------------------------------------
==== Foreach processor
=== Foreach processor
All processors can operate on elements inside an array, but if all elements of an array need to
be processed in the same way, defining a processor for each element becomes cumbersome and tricky
because the number of elements in an array is often unknown. For this reason the `foreach`
@@ -680,7 +652,7 @@ In this example if the `remove` processor does fail then
the array elements that have been processed thus far will
be updated.
=== Accessing data in pipelines
== Accessing data in pipelines
Processors in pipelines have read and write access to documents that pass through the pipeline.
The fields in the source of a document and its metadata fields are accessible.
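For example, a `set` processor can write to a field in the source, including a nested field addressed with dot notation (a sketch; the field names are illustrative):
[source,js]
--------------------------------------------------
{
  "set" : {
    "field" : "my_object.my_sub_field",
    "value" : 100
  }
}
--------------------------------------------------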
@@ -781,7 +753,8 @@ to depends on the field in the source with name `geoip.country_iso_code`.
}
--------------------------------------------------
==== Handling Failure in Pipelines
[[handling-failure-in-pipelines]]
=== Handling Failure in Pipelines
In its simplest case, a pipeline describes a list of processors that
are executed sequentially, and processing halts at the first exception. This
@@ -845,7 +818,7 @@ the index for which failed documents get sent.
--------------------------------------------------
===== Accessing Error Metadata From Processors Handling Exceptions
==== Accessing Error Metadata From Processors Handling Exceptions
Sometimes you may want to retrieve the actual error message that was thrown
by a failed processor. To do so you can access metadata fields called
@@ -878,9 +851,9 @@ of manually setting it.
--------------------------------------------------
=== Ingest APIs
== Ingest APIs
==== Put pipeline API
=== Put pipeline API
The put pipeline api adds pipelines and updates existing pipelines in the cluster.
@@ -904,7 +877,7 @@ PUT _ingest/pipeline/my-pipeline-id
NOTE: The put pipeline api also instructs all ingest nodes to reload their in-memory representation of pipelines, so that
pipeline changes take effect immediately.
==== Get pipeline API
=== Get pipeline API
The get pipeline api returns pipelines based on id. This api always returns a local reference of the pipeline.
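A sketch of a get pipeline request (the pipeline id is illustrative):
[source,js]
--------------------------------------------------
GET _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE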
@@ -940,7 +913,7 @@ For each returned pipeline the source and the version is returned.
The version is useful for knowing what version of the pipeline the node has.
Multiple ids can be provided at the same time. Wildcards are also supported.
==== Delete pipeline API
=== Delete pipeline API
The delete pipeline api deletes pipelines by id.
@@ -950,7 +923,7 @@ DELETE _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE
==== Simulate pipeline API
=== Simulate pipeline API
The simulate pipeline api executes a specific pipeline against
the set of documents provided in the body of the request.
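A sketch of a simulate request with an inline pipeline definition and a single test document (all names and values are illustrative):
[source,js]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "a test pipeline",
    "processors" : [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs" : [
    {
      "_index" : "index",
      "_type" : "type",
      "_id" : "id",
      "_source" : {
        "foo" : "bar"
      }
    }
  ]
}
--------------------------------------------------
// AUTOSENSE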