[DOCS] Add CBOR example to ingest attachment docs (#60919) (#60964)

James Rodewig 2020-08-11 10:28:22 -04:00 committed by GitHub
parent d544528c7b
commit a1100bb770
1 changed files with 47 additions and 0 deletions

@ -100,6 +100,53 @@ NOTE: Extracting contents from binary data is a resource-intensive operation.
It is highly recommended to run pipelines using this processor on a dedicated
ingest node.
[[ingest-attachment-cbor]]
==== Use the attachment processor with CBOR

To avoid base64-encoding binary data for a JSON request body and decoding it
again, you can instead pass CBOR data to the attachment processor. For example,
the following request creates the `cbor-attachment` pipeline, which uses the
attachment processor.

[source,console]
----
PUT _ingest/pipeline/cbor-attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data"
      }
    }
  ]
}
----
The following Python script passes CBOR data to an HTTP indexing request that
includes the `cbor-attachment` pipeline. The HTTP request headers use
a `content-type` of `application/cbor`.

NOTE: Not all {es} clients support custom HTTP request headers.

[source,python]
----
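import cbor2
import requests

file = 'my-file'
headers = {'content-type': 'application/cbor'}

# Read the file as raw bytes; CBOR carries them without base64 encoding.
with open(file, 'rb') as f:
    doc = {
        'data': f.read()
    }
    # Index the CBOR-encoded document through the `cbor-attachment` pipeline.
    requests.put(
        'http://localhost:9200/my-index-000001/_doc/my_id?pipeline=cbor-attachment',
        data=cbor2.dumps(doc),
        headers=headers
    )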
----
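As a rough illustration of what a CBOR request body saves, the following sketch
hand-encodes the same single-field document as CBOR and compares its size with
the JSON equivalent, which must base64-encode the bytes. The helper functions
`cbor_head`, `encode_attachment_doc`, and `json_base64_doc` are hypothetical
names used only for this example, and the 1,024-byte payload stands in for a
real file. In practice, a library such as `cbor2` performs the encoding, as in
the script above.

```python
import base64
import json
import struct


def cbor_head(major_type: int, length: int) -> bytes:
    """Encode a CBOR initial byte plus its length argument (RFC 8949)."""
    prefix = major_type << 5
    if length < 24:
        return bytes([prefix | length])
    if length < 2**8:
        return bytes([prefix | 24]) + struct.pack(">B", length)
    if length < 2**16:
        return bytes([prefix | 25]) + struct.pack(">H", length)
    if length < 2**32:
        return bytes([prefix | 26]) + struct.pack(">I", length)
    return bytes([prefix | 27]) + struct.pack(">Q", length)


def encode_attachment_doc(payload: bytes) -> bytes:
    """Hand-encode {"data": payload} as CBOR (illustrative only)."""
    key = "data".encode("utf-8")
    return (
        cbor_head(5, 1)                         # map with one key/value pair
        + cbor_head(3, len(key)) + key          # text string key
        + cbor_head(2, len(payload)) + payload  # byte string value, stored raw
    )


def json_base64_doc(payload: bytes) -> bytes:
    """Build the JSON alternative, where the bytes must be base64-encoded."""
    encoded = base64.b64encode(payload).decode("ascii")
    return json.dumps({"data": encoded}).encode("utf-8")


payload = bytes(range(256)) * 4  # 1024 bytes standing in for a real file
cbor_body = encode_attachment_doc(payload)
json_body = json_base64_doc(payload)
print(len(payload), len(cbor_body), len(json_body))
```

The CBOR body stores the file bytes as-is after a few header bytes, while the
base64 string in the JSON body is about a third larger than the original bytes.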
[[ingest-attachment-extracted-chars]]
==== Limit the number of extracted chars