Merge pull request #16203 from talevy/ingest_docs_error

[Ingest] add docs for on_failure support in ingest pipelines
Tal Levy 2016-01-25 06:50:02 -08:00
commit 00b92c4929
1 changed files with 97 additions and 0 deletions

@@ -578,6 +578,103 @@ to depends on the field in the source with name `geoip.country_iso_code`.
}
--------------------------------------------------
==== Handling Failure in Pipelines
In the simplest case, a pipeline describes a list of processors that are
executed sequentially, and processing halts at the first exception. This
may not be desirable when failures are expected. For example, not all of your logs
may match a given grok expression, and you may wish to index the documents that
fail to match into a separate index.

To enable this behavior, use the `on_failure` parameter. `on_failure`
defines a list of processors to be executed immediately after the failed processor.
This parameter can be supplied at the pipeline level as well as at the processor
level. If a processor has an `on_failure` configuration option, whether
it is empty or not, any exceptions thrown by that processor are caught and the
pipeline continues executing the remaining processors. Because the processors listed
in an `on_failure` block can define `on_failure` blocks of their own, failure handling can be nested.

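As a minimal sketch of nesting, the handler below is itself given an `on_failure`
block: if the inner `set` processor throws, a second `set` records a fallback marker
instead. The field names `error` and `error_fallback` and both values are hypothetical,
chosen only for illustration.

[source,js]
--------------------------------------------------
{
  "description" : "sketch of nested failure handling (field names are illustrative)",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "to" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "could not rename \"foo\" to \"bar\"",
              "on_failure" : [
                {
                  "set" : {
                    "field" : "error_fallback",
                    "value" : "failure handler for rename also failed"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  ]
}
--------------------------------------------------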
Example: The following pipeline attempts to rename the field `foo` to `bar`.
If the document does not contain the `foo` field, the `on_failure` processor
attaches an error message to the document for later analysis within
Elasticsearch.

[source,js]
--------------------------------------------------
{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "to" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "field \"foo\" does not exist, cannot rename to \"bar\""
            }
          }
        ]
      }
    }
  ]
}
--------------------------------------------------
Example: Here we define an `on_failure` block on a whole pipeline to change
the index to which failed documents are sent.

[source,js]
--------------------------------------------------
{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [ ... ],
  "on_failure" : [
    {
      "set" : {
        "field" : "_index",
        "value" : "failed-{{ _index }}"
      }
    }
  ]
}
--------------------------------------------------
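The two levels can presumably be combined. In this sketch, the first `rename`
handles its own failure and the pipeline carries on, while the second `rename` has
no handler of its own, so its failure would be expected to fall through to the
pipeline-level `on_failure` block. The field names `baz` and `qux` are hypothetical,
and the fall-through behavior is an assumption based on the description above rather
than something this section states outright.

[source,js]
--------------------------------------------------
{
  "description" : "sketch combining processor-level and pipeline-level handlers",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "to" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "field \"foo\" does not exist, cannot rename to \"bar\""
            }
          }
        ]
      }
    },
    {
      "rename" : {
        "field" : "baz",
        "to" : "qux"
      }
    }
  ],
  "on_failure" : [
    {
      "set" : {
        "field" : "_index",
        "value" : "failed-{{ _index }}"
      }
    }
  ]
}
--------------------------------------------------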
===== Accessing Error Metadata From Processors Handling Exceptions
Sometimes you may want to retrieve the actual error message that was thrown
by a failed processor. To do so, you can access the metadata fields
`on_failure_message` and `on_failure_processor`. These fields are only accessible
from within the context of an `on_failure` block. Here is an updated version of
our first example that uses `on_failure_message` to provide the error message instead
of setting it manually.

[source,js]
--------------------------------------------------
{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "to" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
--------------------------------------------------
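A handler could also record which processor failed alongside the message. This
sketch assumes that `on_failure_processor` is exposed under `_ingest` in the same
way as `on_failure_message`. The target field names `error_message` and
`error_processor` are hypothetical.

[source,js]
--------------------------------------------------
{
  "description" : "sketch recording both failure metadata fields (target names are illustrative)",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "to" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error_message",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          },
          {
            "set" : {
              "field" : "error_processor",
              "value" : "{{ _ingest.on_failure_processor }}"
            }
          }
        ]
      }
    }
  ]
}
--------------------------------------------------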
=== Ingest APIs
==== Put pipeline API