OpenSearch/docs/reference/ingest/apis/simulate-pipeline.asciidoc

433 lines
8.5 KiB
Plaintext

[[simulate-pipeline-api]]
=== Simulate pipeline API
++++
<titleabbrev>Simulate pipeline</titleabbrev>
++++
Executes an ingest pipeline against
a set of provided documents.
////
[source,console]
----
PUT /_ingest/pipeline/my-pipeline-id
{
"description" : "example pipeline to simulate",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value"
}
}
]
}
----
// TESTSETUP
////
[source,console]
----
POST /_ingest/pipeline/my-pipeline-id/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
[[simulate-pipeline-api-request]]
==== {api-request-title}
POST /_ingest/pipeline/<pipeline>/_simulate
GET /_ingest/pipeline/<pipeline>/_simulate
POST /_ingest/pipeline/_simulate
GET /_ingest/pipeline/_simulate
[[simulate-pipeline-api-desc]]
==== {api-description-title}
The simulate pipeline API executes a specific pipeline
against a set of documents provided in the body of the request.
You can either specify an existing pipeline
to execute against the provided documents
or supply a pipeline definition in the body of the request.
[[simulate-pipeline-api-path-params]]
==== {api-path-parms-title}
`<pipeline>`::
(Optional, string)
Pipeline ID used to simulate an ingest.
[[simulate-pipeline-api-query-params]]
==== {api-query-parms-title}
`verbose`::
(Optional, boolean)
If `true`,
the response includes output data
for each processor in the executed pipeline.
[[simulate-pipeline-api-request-body]]
==== {api-request-body-title}
`description`::
(Optional, string)
Description of the ingest pipeline.
`processors`::
+
--
(Optional, array of <<ingest-processors,processor objects>>)
Array of processors used to pre-process documents
during ingest.
Processors are executed in the order provided.
See <<ingest-processors>> for processor object definitions
and a list of built-in processors.
--
`docs`::
+
--
(Required, array)
Array of documents
ingested by the pipeline.
Document object parameters include:
`_index`::
(Optional, string)
Name of the index containing the document.
`_id`::
(Optional, string)
Unique identifier for the document.
This ID is only unique within the index.
`_source`::
(Required, object)
JSON body for the document.
--
[[simulate-pipeline-api-example]]
==== {api-examples-title}
[[simulate-pipeline-api-path-parm-ex]]
===== Specify a pipeline as a path parameter
[source,console]
----
POST /_ingest/pipeline/my-pipeline-id/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.187Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.188Z"
}
}
}
]
}
----
// TESTRESPONSE[s/"2017-05-04T22:30:03.187Z"/$body.docs.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:30:03.188Z"/$body.docs.1.doc._ingest.timestamp/]
[[simulate-pipeline-api-request-body-ex]]
===== Specify a pipeline in the request body
[source,console]
----
POST /_ingest/pipeline/_simulate
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.187Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.188Z"
}
}
}
]
}
----
// TESTRESPONSE[s/"2017-05-04T22:30:03.187Z"/$body.docs.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:30:03.188Z"/$body.docs.1.doc._ingest.timestamp/]
[[ingest-verbose-param]]
===== View verbose results
You can use the simulate pipeline API
to see how each processor affects the ingest document
as it passes through the pipeline.
To see the intermediate results
of each processor in the simulate request,
you can add the `verbose` parameter to the request.
[source,console]
----
POST /_ingest/pipeline/_simulate?verbose
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value2"
}
},
{
"set" : {
"field" : "field3",
"value" : "_value3"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"processor_results": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value2",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:46:09.674Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field3": "_value3",
"field2": "_value2",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:46:09.675Z"
}
}
}
]
},
{
"processor_results": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value2",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:46:09.676Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field3": "_value3",
"field2": "_value2",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:46:09.677Z"
}
}
}
]
}
]
}
----
// TESTRESPONSE[s/"2017-05-04T22:46:09.674Z"/$body.docs.0.processor_results.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:46:09.675Z"/$body.docs.0.processor_results.1.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:46:09.676Z"/$body.docs.1.processor_results.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:46:09.677Z"/$body.docs.1.processor_results.1.doc._ingest.timestamp/]
////
[source,console]
----
DELETE /_ingest/pipeline/*
----
[source,console-result]
----
{
"acknowledged": true
}
----
////