OpenSearch/docs/reference/ingest/apis/simulate-pipeline.asciidoc
Jake Landis d2e5f2f532
[7.x] Enhance the ingest node simulate verbose output (#60433) (#60678)
This commit enhances the verbose output for the
`_ingest/pipeline/_simulate?verbose` api. Specifically
this adds the following:
* the pipeline processor is now included in the output
* the conditional (if) and result is now included in the output iff it was defined
* a status field is always displayed. the possible values of status are
  * `success` - if the processor ran with out errors
  * `error` - if the processor ran but threw an error that was not ingored
  * `error_ignored` - if the processor ran but threw an error that was ingored
  * `skipped` - if the process did not run (currently only possible if the if condition evaluates to false)
  * `dropped` - if the the `drop` processor ran and dropped the document
* a `processor_type` field for the type of processor (e.g. set, rename, etc.)
* throw a better error if trying to simulate with a pipeline that does not exist

closes #56004
2020-08-27 16:53:09 -05:00

446 lines
8.6 KiB
Plaintext

[[simulate-pipeline-api]]
=== Simulate pipeline API
++++
<titleabbrev>Simulate pipeline</titleabbrev>
++++
Executes an ingest pipeline against
a set of provided documents.
////
[source,console]
----
PUT /_ingest/pipeline/my-pipeline-id
{
"description" : "example pipeline to simulate",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value"
}
}
]
}
----
// TESTSETUP
////
[source,console]
----
POST /_ingest/pipeline/my-pipeline-id/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
[[simulate-pipeline-api-request]]
==== {api-request-title}
`POST /_ingest/pipeline/<pipeline>/_simulate`
`GET /_ingest/pipeline/<pipeline>/_simulate`
`POST /_ingest/pipeline/_simulate`
`GET /_ingest/pipeline/_simulate`
[[simulate-pipeline-api-desc]]
==== {api-description-title}
The simulate pipeline API executes a specific pipeline
against a set of documents provided in the body of the request.
You can either specify an existing pipeline
to execute against the provided documents
or supply a pipeline definition in the body of the request.
[[simulate-pipeline-api-path-params]]
==== {api-path-parms-title}
`<pipeline>`::
(Optional, string)
Pipeline ID used to simulate an ingest.
[[simulate-pipeline-api-query-params]]
==== {api-query-parms-title}
`verbose`::
(Optional, boolean)
If `true`,
the response includes output data
for each processor in the executed pipeline.
[[simulate-pipeline-api-request-body]]
==== {api-request-body-title}
`description`::
(Optional, string)
Description of the ingest pipeline.
`processors`::
+
--
(Optional, array of <<ingest-processors,processor objects>>)
Array of processors used to pre-process documents
during ingest.
Processors are executed in the order provided.
See <<ingest-processors>> for processor object definitions
and a list of built-in processors.
--
`docs`::
+
--
(Required, array)
Array of documents
ingested by the pipeline.
Document object parameters include:
`_index`::
(Optional, string)
Name of the index containing the document.
`_id`::
(Optional, string)
Unique identifier for the document.
This ID is only unique within the index.
`_source`::
(Required, object)
JSON body for the document.
--
[[simulate-pipeline-api-example]]
==== {api-examples-title}
[[simulate-pipeline-api-path-parm-ex]]
===== Specify a pipeline as a path parameter
[source,console]
----
POST /_ingest/pipeline/my-pipeline-id/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.187Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.188Z"
}
}
}
]
}
----
// TESTRESPONSE[s/"2017-05-04T22:30:03.187Z"/$body.docs.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:30:03.188Z"/$body.docs.1.doc._ingest.timestamp/]
[[simulate-pipeline-api-request-body-ex]]
===== Specify a pipeline in the request body
[source,console]
----
POST /_ingest/pipeline/_simulate
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "bar"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.187Z"
}
}
},
{
"doc": {
"_id": "id",
"_index": "index",
"_type": "_doc",
"_source": {
"field2": "_value",
"foo": "rab"
},
"_ingest": {
"timestamp": "2017-05-04T22:30:03.188Z"
}
}
}
]
}
----
// TESTRESPONSE[s/"2017-05-04T22:30:03.187Z"/$body.docs.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2017-05-04T22:30:03.188Z"/$body.docs.1.doc._ingest.timestamp/]
[[ingest-verbose-param]]
===== View verbose results
You can use the simulate pipeline API
to see how each processor affects the ingest document
as it passes through the pipeline.
To see the intermediate results
of each processor in the simulate request,
you can add the `verbose` parameter to the request.
[source,console]
----
POST /_ingest/pipeline/_simulate?verbose
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"set" : {
"field" : "field2",
"value" : "_value2"
}
},
{
"set" : {
"field" : "field3",
"value" : "_value3"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"foo": "rab"
}
}
]
}
----
The API returns the following response:
[source,console-result]
----
{
"docs": [
{
"processor_results": [
{
"processor_type": "set",
"status": "success",
"doc": {
"_index": "index",
"_type": "_doc",
"_id": "id",
"_source": {
"field2": "_value2",
"foo": "bar"
},
"_ingest": {
"pipeline": "_simulate_pipeline",
"timestamp": "2020-07-30T01:21:24.251836Z"
}
}
},
{
"processor_type": "set",
"status": "success",
"doc": {
"_index": "index",
"_type": "_doc",
"_id": "id",
"_source": {
"field3": "_value3",
"field2": "_value2",
"foo": "bar"
},
"_ingest": {
"pipeline": "_simulate_pipeline",
"timestamp": "2020-07-30T01:21:24.251836Z"
}
}
}
]
},
{
"processor_results": [
{
"processor_type": "set",
"status": "success",
"doc": {
"_index": "index",
"_type": "_doc",
"_id": "id",
"_source": {
"field2": "_value2",
"foo": "rab"
},
"_ingest": {
"pipeline": "_simulate_pipeline",
"timestamp": "2020-07-30T01:21:24.251863Z"
}
}
},
{
"processor_type": "set",
"status": "success",
"doc": {
"_index": "index",
"_type": "_doc",
"_id": "id",
"_source": {
"field3": "_value3",
"field2": "_value2",
"foo": "rab"
},
"_ingest": {
"pipeline": "_simulate_pipeline",
"timestamp": "2020-07-30T01:21:24.251863Z"
}
}
}
]
}
]
}
----
// TESTRESPONSE[s/"2020-07-30T01:21:24.251836Z"/$body.docs.0.processor_results.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2020-07-30T01:21:24.251836Z"/$body.docs.0.processor_results.1.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2020-07-30T01:21:24.251863Z"/$body.docs.1.processor_results.0.doc._ingest.timestamp/]
// TESTRESPONSE[s/"2020-07-30T01:21:24.251863Z"/$body.docs.1.processor_results.1.doc._ingest.timestamp/]
////
[source,console]
----
DELETE /_ingest/pipeline/*
----
[source,console-result]
----
{
"acknowledged": true
}
----
////