Add Ingest API to reference

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Naarcha-AWS 2022-02-04 09:37:42 -06:00
parent f4b537fee1
commit 8392cd19ee
5 changed files with 382 additions and 0 deletions


@ -0,0 +1,75 @@
---
layout: default
title: Create or update ingest pipeline
parent: Ingest APIs
grand_parent: REST API reference
nav_order: 11
---
# Create or update a pipeline
The create ingest pipeline API operation creates or updates an ingest pipeline. Each pipeline requires an ingest definition that specifies how each processor transforms your documents.
## Example
```
PUT _ingest/pipeline/{id}
{
"description" : "A description for your pipeline",
"processors" : [
{
"set" : {
"field": "field-name",
"value": "value"
}
}
]
}
```
## Path and HTTP methods
```
PUT _ingest/pipeline/{id}
```
## Request body fields
Field | Type | Description
:--- | :--- | :---
`description` (optional) | string | Description of your ingest pipeline
`processors` | array of processor objects | The processors that transform your documents. Processors run in the order specified. The transformed data appears in the index after the processors have run.
```json
{
"description" : "A description for your pipeline",
"processors" : [
{
"set" : {
"field": "field-name",
"value": "value"
}
}
]
}
```
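As a rough illustration, the request body above could be assembled in Python before you send it with an HTTP client of your choice. The helper name `build_pipeline_body` is hypothetical and not part of any OpenSearch client. It only builds the JSON and sends nothing.

```python
import json


def build_pipeline_body(processors, description=None):
    """Assemble a create-pipeline request body (hypothetical helper)."""
    body = {}
    # `description` is optional, so include it only when provided.
    if description is not None:
        body["description"] = description
    # Processors run in the order listed, so the list order is preserved.
    body["processors"] = list(processors)
    return body


set_processor = {"set": {"field": "field-name", "value": "value"}}
body = build_pipeline_body([set_processor], "A description for your pipeline")
print(json.dumps(body, indent=2))
```

Keeping the processors in a plain list makes the execution order explicit in the calling code.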
## URL parameters
Parameter | Type | Description
:--- | :--- | :---
master_timeout | time | Explicit operation timeout for connection to master node
timeout | time | Explicit operation timeout
## Response
```json
{
"acknowledged" : true
}
```


@ -0,0 +1,33 @@
---
layout: default
title: Delete a pipeline
parent: Ingest APIs
grand_parent: REST API reference
nav_order: 14
---
# Delete a pipeline
If you no longer want to use an ingest pipeline, use the delete ingest pipeline API operation.
## Example
```
DELETE _ingest/pipeline/{id}
```
## URL parameters
Parameter | Type | Description
:--- | :--- | :---
master_timeout | time | Explicit operation timeout for connection to master node
timeout | time | Explicit operation timeout
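The following sketch shows one way to attach these URL parameters to the delete request path in Python. The function name `delete_pipeline_path`, the pipeline ID `my-pipeline`, and the `30s` time value are assumptions made for illustration. The code builds a string only and does not contact a cluster.

```python
from urllib.parse import quote, urlencode


def delete_pipeline_path(pipeline_id, master_timeout=None, timeout=None):
    """Build the DELETE path and query string (hypothetical helper)."""
    path = "_ingest/pipeline/" + quote(pipeline_id, safe="")
    # Both parameters are optional time values; "30s" is an assumed format.
    params = {}
    if master_timeout is not None:
        params["master_timeout"] = master_timeout
    if timeout is not None:
        params["timeout"] = timeout
    return path + ("?" + urlencode(params) if params else "")


print("DELETE", delete_pipeline_path("my-pipeline", timeout="30s"))
```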
## Response
```json
{
"acknowledged" : true
}
```


@ -0,0 +1,58 @@
---
layout: default
title: Get ingest pipeline
parent: Ingest APIs
grand_parent: REST API reference
nav_order: 10
---
# Get ingest pipeline
After you create a pipeline, use the get ingest pipeline API operation to return all the information about a specific ingest pipeline.
## Example
```
GET _ingest/pipeline/{id}
```
## Path and HTTP methods
Returns all ingest pipelines.
```
GET _ingest/pipeline
```
Returns a single ingest pipeline based on the pipeline's ID.
```
GET _ingest/pipeline/{id}
```
## URL parameters
All parameters are optional.
Parameter | Type | Description
:--- | :--- | :---
master_timeout | time | Explicit operation timeout for connection to master node
## Response
```json
{
"pipeline-id" : {
"description" : "A description for your pipeline",
"processors" : [
{
"set" : {
"field" : "field-name",
"value" : "value"
}
}
]
}
}
```
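Because the response is keyed by pipeline ID, a client can walk it to see which processors each pipeline runs. The sketch below assumes a response shaped like the example above. The function name `summarize_pipelines` is hypothetical.

```python
import json

# A response shaped like the example above.
response_text = """
{
  "pipeline-id": {
    "description": "A description for your pipeline",
    "processors": [
      {"set": {"field": "field-name", "value": "value"}}
    ]
  }
}
"""


def summarize_pipelines(text):
    """Map each pipeline ID to the processor types it runs, in order."""
    pipelines = json.loads(text)
    return {
        # Each processor object has a single key naming its type.
        pipeline_id: [next(iter(p)) for p in definition.get("processors", [])]
        for pipeline_id, definition in pipelines.items()
    }


summary = summarize_pipelines(response_text)
print(summary)
```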


@ -0,0 +1,15 @@
---
layout: default
title: Ingest APIs
parent: REST API reference
has_children: true
nav_order: 3
redirect_from:
- /opensearch/rest-api/ingest-apis/
---
# Ingest APIs
Before you index your data, OpenSearch's ingest APIs help transform your data through the creation of ingest pipelines. Pipelines are made up of processors, customizable tasks that run in succession. The transformed data appears in your data stream or index after all of the processors complete.
Ingest pipelines in OpenSearch are managed using ingest API operations. In production environments, your cluster should contain at least one node with the `node.roles: [ingest]` setting. For more information on setting up node roles within a cluster, see [Cluster Formation]({{site.url}}{{site.baseurl}}/cluster/).
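To make the idea of processors running in succession concrete, here is a toy model in plain Python. It is a conceptual sketch only and not how OpenSearch implements pipelines. The names `set_processor` and `run_pipeline` and the `status` field are invented for illustration.

```python
def set_processor(field, value):
    """Return a function that sets `field` to `value` on a copy of a document."""
    def run(doc):
        updated = dict(doc)
        updated[field] = value
        return updated
    return run


def run_pipeline(processors, doc):
    """Apply each processor to the output of the previous one."""
    for processor in processors:
        doc = processor(doc)
    return doc


# Two toy processors that run one after the other.
pipeline = [set_processor("field-name", "value"), set_processor("status", "seen")]
result = run_pipeline(pipeline, {"location": "document-name"})
print(result)
```

Each processor receives the document produced by the one before it, which is why the order of the list matters.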


@ -0,0 +1,201 @@
---
layout: default
title: Simulate an ingest pipeline
parent: Ingest APIs
grand_parent: REST API reference
nav_order: 13
---
# Simulate a pipeline
Simulates an ingest pipeline with any example document set you specify.
## Example
```
POST /_ingest/pipeline/{id}/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"location": "document-name"
}
},
{
"_index": "index",
"_id": "id",
"_source": {
"location": "document-name"
}
}
]
}
```
## Path and HTTP methods
Simulate the last ingest pipeline created.
```
GET _ingest/pipeline/_simulate
POST _ingest/pipeline/_simulate
```
Simulate a single pipeline based on the pipeline's ID.
```
GET _ingest/pipeline/{id}/_simulate
POST _ingest/pipeline/{id}/_simulate
```
## URL parameters
Parameter | Type | Description
:--- | :--- | :---
verbose | boolean | Verbose mode. Displays the data output for each processor in the executed pipeline.
## Request body fields
Field | Type | Description
:--- | :--- | :---
`pipeline` | object | The pipeline you want to simulate. When included without the pipeline `{id}` inside the request path, the response simulates the last pipeline created
`docs` | array of objects | The documents you want to use to test the pipeline
The `docs` field can include the following subfields:
Field | Type | Description
:--- | :--- | :---
`_id` (optional) | string | An identifier for the document. Cannot be used elsewhere in the index.
`_index` (optional) | string | The index where the document's transformed data will be stored.
`_source` | object | The document's JSON body.
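As a sketch of how these fields fit together, the following Python builds a simulate request body from a list of document sources, using the same `_index`, `_id`, and `_source` keys as the example request at the top of this page. The helper `build_simulate_body` and the generated IDs (`"1"`, `"2"`, and so on) are assumptions for illustration. Nothing is sent to a cluster.

```python
import json


def build_simulate_body(sources, index="index", pipeline=None):
    """Build a _simulate request body from document sources (hypothetical helper)."""
    docs = [
        # IDs are generated here only so that each test document is distinct.
        {"_index": index, "_id": str(i), "_source": source}
        for i, source in enumerate(sources, start=1)
    ]
    body = {"docs": docs}
    # Include a pipeline definition only when one is supplied in the body.
    if pipeline is not None:
        body["pipeline"] = pipeline
    return body


simulate_body = build_simulate_body(
    [{"location": "document-name"}, {"location": "document-name"}]
)
print(json.dumps(simulate_body, indent=2))
```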
## Response
### Specify pipeline in request body
```json
{
"docs" : [
{
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"location" : "new-new",
"field2" : "_value"
},
"_ingest" : {
"timestamp" : "2022-02-03T23:12:11.337706671Z"
}
}
},
{
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"location" : "new-new",
"field2" : "_value"
},
"_ingest" : {
"timestamp" : "2022-02-03T23:12:11.337721296Z"
}
}
}
]
}
```
### Specify pipeline ID inside path
```json
{
"docs" : [
{
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"field-name" : "value",
"location" : "document-name"
},
"_ingest" : {
"timestamp" : "2022-02-03T21:47:05.382744877Z"
}
}
},
{
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"field-name" : "value",
"location" : "document-name"
},
"_ingest" : {
"timestamp" : "2022-02-03T21:47:05.382803544Z"
}
}
}
]
}
```
### Receive verbose response
With the `verbose` parameter set to `true`, the response shows how each processor transforms the specified document.
```json
{
"docs" : [
{
"processor_results" : [
{
"processor_type" : "set",
"status" : "success",
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"field-name" : "value",
"location" : "document-name"
},
"_ingest" : {
"pipeline" : "35678",
"timestamp" : "2022-02-03T21:45:09.414049004Z"
}
}
}
]
},
{
"processor_results" : [
{
"processor_type" : "set",
"status" : "success",
"doc" : {
"_index" : "index",
"_type" : "_doc",
"_id" : "id",
"_source" : {
"field-name" : "value",
"location" : "document-name"
},
"_ingest" : {
"pipeline" : "35678",
"timestamp" : "2022-02-03T21:45:09.414093212Z"
}
}
}
]
}
]
}
```
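A verbose response can be checked programmatically, for example to confirm that every processor succeeded for every document. The sketch below assumes a response shaped like the one above, trimmed to a single document. The function name `processor_statuses` is hypothetical.

```python
import json

# A verbose response shaped like the example above, trimmed to one document.
verbose_response = """
{
  "docs": [
    {
      "processor_results": [
        {
          "processor_type": "set",
          "status": "success",
          "doc": {
            "_index": "index",
            "_id": "id",
            "_source": {"field-name": "value", "location": "document-name"}
          }
        }
      ]
    }
  ]
}
"""


def processor_statuses(text):
    """Return, per document, a list of (processor_type, status) pairs."""
    response = json.loads(text)
    return [
        [(r["processor_type"], r["status"]) for r in doc["processor_results"]]
        for doc in response["docs"]
    ]


statuses = processor_statuses(verbose_response)
all_succeeded = all(s == "success" for doc in statuses for _, s in doc)
print(statuses, all_succeeded)
```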