Merge pull request #22300 from dadoonet/doc/ingest-attachment

Adds more information about ingest attachment properties extraction
David Pilato 2016-12-23 11:30:19 +01:00 committed by GitHub
commit 2511442a92
1 changed file with 21 additions and 1 deletion

@@ -52,7 +52,7 @@ The node must be stopped before removing the plugin.
| `field` | yes | - | The field to get the base64-encoded data from
| `target_field` | no | attachment | The field that will hold the attachment information
| `indexed_chars` | no | 100000 | The maximum number of characters used for extraction, to prevent huge fields. Use `-1` for no limit.
-| `properties` | no | all | Properties to select to be stored. Can be `content`, `title`, `name`, `author`, `keywords`, `date`, `content_type`, `content_length`, `language`
+| `properties` | no | all properties | Array of properties to select to be stored. Can be `content`, `title`, `name`, `author`, `keywords`, `date`, `content_type`, `content_length`, `language`
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
|======
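The options in the table above can be assembled programmatically. The following is a minimal sketch, not part of the commit: `build_attachment_processor` and `ALLOWED_PROPERTIES` are hypothetical names introduced here for illustration, and the defaults simply mirror the table. It only builds the JSON body and does not talk to a cluster.

```python
import json

# Property names listed in the `properties` row of the options table.
ALLOWED_PROPERTIES = {
    "content", "title", "name", "author", "keywords",
    "date", "content_type", "content_length", "language",
}


def build_attachment_processor(field, target_field="attachment",
                               indexed_chars=100000, properties=None,
                               ignore_missing=False):
    """Return a dict shaped like an `attachment` processor definition.

    Hypothetical helper for illustration only; it is not part of any
    Elasticsearch client library.
    """
    processor = {
        "field": field,
        "target_field": target_field,
        "indexed_chars": indexed_chars,
        "ignore_missing": ignore_missing,
    }
    # Omitting `properties` keeps the documented default (all properties).
    if properties is not None:
        unknown = sorted(set(properties) - ALLOWED_PROPERTIES)
        if unknown:
            raise ValueError("unknown properties: " + ", ".join(unknown))
        processor["properties"] = list(properties)  # must be an array
    return {"attachment": processor}


pipeline = {
    "description": "Extract attachment information",
    "processors": [
        build_attachment_processor("data", properties=["content", "title"]),
    ],
}

# Serialized body that could be sent with `PUT _ingest/pipeline/attachment`.
pipeline_body = json.dumps(pipeline, indent=2)
```

Validating the property names on the client side is a design choice of this sketch; it catches typos before the request is sent.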
@@ -102,6 +102,26 @@ Returns this:
--------------------------------------------------
// TESTRESPONSE
To extract only specific properties, list them in the `properties` array:
[source,js]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{
"description" : "Extract attachment information",
"processors" : [
{
"attachment" : {
"field" : "data",
"properties": [ "content", "title" ]
}
}
]
}
--------------------------------------------------
// CONSOLE
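Because the processor reads base64 encoded data from `field`, a client has to encode the binary content before indexing. The following is a minimal sketch of that step, assuming the pipeline above with `field` set to `data`; `build_document` is a hypothetical helper name, and the sample bytes stand in for a real file's contents.

```python
import base64
import json


def build_document(raw_bytes, field="data"):
    """Wrap binary content as a document the attachment processor can read.

    Hypothetical helper for illustration: it base64-encodes the bytes and
    stores the resulting ASCII string under `field`.
    """
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field: encoded}


# Placeholder bytes; in practice these would come from reading a file in
# binary mode.
document = build_document(b"Hello attachment")

# JSON body for an index request that names the `attachment` pipeline.
document_body = json.dumps(document)
```

With `properties` restricted to `content` and `title`, only those two keys would be expected under the target field of the indexed document.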
NOTE: Extracting contents from binary data is a resource-intensive operation.
It is highly recommended to run pipelines using this processor on a
dedicated ingest node.