---
layout: default
title: Transforms APIs
nav_order: 45
parent: Index transforms
has_toc: true
---
# Transforms APIs
In addition to OpenSearch Dashboards, you can use the REST API to create, start, stop, and perform other operations on transform jobs.
#### Table of contents
- TOC
{:toc}
## Create a transform job
Introduced 1.0
{: .label .label-purple }
Creates a transform job.
**Sample Request**
```json
PUT _plugins/_transform/<transform_id>
{
  "transform": {
    "enabled": true,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "description": "Sample transform job",
    "source_index": "sample_index",
    "target_index": "sample_target",
    "data_selection_query": {
      "match_all": {}
    },
    "page_size": 1,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
**Sample Response**
```json
{
  "_id": "sample",
  "_version": 7,
  "_seq_no": 13,
  "_primary_term": 1,
  "transform": {
    "transform_id": "sample",
    "schema_version": 7,
    "schedule": {
      "interval": {
        "start_time": 1621467964243,
        "period": 1,
        "unit": "Minutes"
      }
    },
    "metadata_id": null,
    "updated_at": 1621467964243,
    "enabled": true,
    "enabled_at": 1621467964243,
    "description": "Sample transform job",
    "source_index": "sample_index",
    "data_selection_query": {
      "match_all": {
        "boost": 1.0
      }
    },
    "target_index": "sample_target",
    "roles": [],
    "page_size": 1,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
You can specify the following options in the HTTP request body:
Option | Data Type | Description | Required
:--- | :--- | :--- | :---
enabled | Boolean | If true, the transform job is enabled at creation. | No
schedule | Object | The schedule the transform job runs on. | Yes
start_time | Integer | The Unix epoch time of the transform job's start time. | Yes
description | String | Describes the transform job. | No
metadata_id | String | Any metadata to be associated with the transform job. | No
source_index | String | The source index containing the data to transform. | Yes
target_index | String | The target index the newly transformed data is added into. You can create a new index or update an existing one. | Yes
data_selection_query | Object | The query DSL to use to filter a subset of the source index for the transform job. See [query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl) for more information. | Yes
page_size | Integer | The number of buckets IM processes and indexes concurrently. A higher number means better performance but requires more memory. If your machine runs out of memory, IM automatically adjusts this field and retries until the operation succeeds. | Yes
groups | Array | Specifies the grouping(s) to use in the transform job. Supported groups are `terms`, `histogram`, and `date_histogram`. For more information, see [Bucket Aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg). | Yes if not using aggregations
source_field | String | The field(s) to transform. | Yes
aggregations | Object | The aggregations to use in the transform job. Supported aggregations are `sum`, `max`, `min`, `value_count`, `avg`, `scripted_metric`, and `percentiles`. For more information, see [Metric Aggregations]({{site.url}}{{site.baseurl}}/opensearch/metric-agg). | Yes if not using groups
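
As a rough illustration of how these options fit together, the following Python sketch assembles a request body and enforces the rule from the table that at least one of `groups` or `aggregations` must be present. The helper name `build_transform_body` and its defaults are hypothetical and not part of the plugin. The sketch only builds a dictionary and does not send a request.

```python
import json


def build_transform_body(source_index, target_index, start_time,
                         groups=None, aggregations=None,
                         period=1, unit="Minutes", page_size=1,
                         query=None, description=None, enabled=True):
    """Assemble the JSON body for PUT _plugins/_transform/<transform_id>.

    Hypothetical helper: it only builds a dict and sends nothing.
    """
    # The options table marks each of these as required when the other is absent.
    if not groups and not aggregations:
        raise ValueError("provide at least one of 'groups' or 'aggregations'")

    transform = {
        "enabled": enabled,
        "schedule": {
            "interval": {"period": period, "unit": unit, "start_time": start_time}
        },
        "source_index": source_index,
        "target_index": target_index,
        # Default to selecting every document, as in the sample request.
        "data_selection_query": query if query is not None else {"match_all": {}},
        "page_size": page_size,
    }
    if description is not None:
        transform["description"] = description
    if groups:
        transform["groups"] = groups
    if aggregations:
        transform["aggregations"] = aggregations
    return {"transform": transform}


body = build_transform_body(
    "sample_index", "sample_target", start_time=1602100553,
    groups=[{"terms": {"source_field": "customer_gender", "target_field": "gender"}}],
    aggregations={"quantity": {"sum": {"field": "total_quantity"}}},
    description="Sample transform job",
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent with any HTTP client as the body of the `PUT` request shown in the sample request.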
## Update a transform job
Introduced 1.0
{: .label .label-purple }
Updates a transform job if `transform_id` already exists.
**Sample Request**
```json
PUT _plugins/_transform/<transform_id>
{
  "transform": {
    "enabled": true,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "description": "Sample transform job",
    "source_index": "sample_index",
    "target_index": "sample_target",
    "data_selection_query": {
      "match_all": {}
    },
    "page_size": 1,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
**Sample Response**
```json
{
  "_id": "sample",
  "_version": 2,
  "_seq_no": 14,
  "_primary_term": 1,
  "transform": {
    "transform_id": "sample",
    "schema_version": 7,
    "schedule": {
      "interval": {
        "start_time": 1602100553,
        "period": 1,
        "unit": "Minutes"
      }
    },
    "metadata_id": null,
    "updated_at": 1621889843874,
    "enabled": true,
    "enabled_at": 1621889843874,
    "description": "Sample transform job",
    "source_index": "sample_index",
    "data_selection_query": {
      "match_all": {
        "boost": 1.0
      }
    },
    "target_index": "sample_target",
    "roles": [],
    "page_size": 1,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
The update operation supports the following URL parameters:
Parameter | Description | Required
:---| :--- | :---
`if_seq_no` | Only perform the transform operation if the last operation that changed the transform job has the specified sequence number. | Yes
`if_primary_term` | Only perform the transform operation if the last operation that changed the transform job has the specified primary term. | Yes
You can update the following fields:
Option | Data Type | Description
:--- | :--- | :---
schedule | Object | The schedule the transform job runs on. Contains the fields `interval.start_time`, `interval.period`, and `interval.unit`.
start_time | Integer | The Unix epoch start time of the transform job.
period | Integer | How often to execute the transform job.
unit | String | The unit of time associated with the execution period. Available options are `Minutes`, `Hours`, and `Days`.
description | String | Describes the transform job.
page_size | Integer | The number of buckets IM processes and indexes concurrently. A higher number means better performance but requires more memory. If your machine runs out of memory, IM automatically adjusts this field and retries until the operation succeeds.
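
Because `if_seq_no` and `if_primary_term` are URL parameters, they are appended to the request path as a query string. The following Python sketch shows one way to build that path from the `_seq_no` and `_primary_term` values returned by an earlier response. The helper name `update_transform_path` is an assumption made for illustration only.

```python
from urllib.parse import quote, urlencode


def update_transform_path(transform_id, seq_no, primary_term):
    """Build the path and query string for an update request (hypothetical helper)."""
    query = urlencode({"if_seq_no": seq_no, "if_primary_term": primary_term})
    # Percent-encode the ID so that unusual characters cannot break the path.
    return f"_plugins/_transform/{quote(transform_id, safe='')}?{query}"


# Values taken from the "_seq_no" and "_primary_term" fields of the create response.
path = update_transform_path("sample", 13, 1)
print("PUT " + path)
```

If another change has modified the job in the meantime, the sequence number no longer matches and the update is expected to be rejected rather than silently overwrite that change.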
## Get a transform job's details
Introduced 1.0
{: .label .label-purple }
Returns a transform job's details.
**Sample Request**
```json
GET _plugins/_transform/<transform_id>
```
**Sample Response**
```json
{
  "_id": "sample",
  "_version": 7,
  "_seq_no": 13,
  "_primary_term": 1,
  "transform": {
    "transform_id": "sample",
    "schema_version": 7,
    "schedule": {
      "interval": {
        "start_time": 1621467964243,
        "period": 1,
        "unit": "Minutes"
      }
    },
    "metadata_id": null,
    "updated_at": 1621467964243,
    "enabled": true,
    "enabled_at": 1621467964243,
    "description": "Sample transform job",
    "source_index": "sample_index",
    "data_selection_query": {
      "match_all": {
        "boost": 1.0
      }
    },
    "target_index": "sample_target",
    "roles": [],
    "page_size": 1,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
You can also get details of all transform jobs by omitting `transform_id`.
**Sample Request**
```json
GET _plugins/_transform/
```
**Sample Response**
```json
{
  "total_transforms": 1,
  "transforms": [
    {
      "_id": "sample",
      "_seq_no": 13,
      "_primary_term": 1,
      "transform": {
        "transform_id": "sample",
        "schema_version": 7,
        "schedule": {
          "interval": {
            "start_time": 1621467964243,
            "period": 1,
            "unit": "Minutes"
          }
        },
        "metadata_id": null,
        "updated_at": 1621467964243,
        "enabled": true,
        "enabled_at": 1621467964243,
        "description": "Sample transform job",
        "source_index": "sample_index",
        "data_selection_query": {
          "match_all": {
            "boost": 1.0
          }
        },
        "target_index": "sample_target",
        "roles": [],
        "page_size": 1,
        "groups": [
          {
            "terms": {
              "source_field": "customer_gender",
              "target_field": "gender"
            }
          },
          {
            "terms": {
              "source_field": "day_of_week",
              "target_field": "day"
            }
          }
        ],
        "aggregations": {
          "quantity": {
            "sum": {
              "field": "total_quantity"
            }
          }
        }
      }
    }
  ]
}
```
You can specify the following options as URL parameters of the `GET` API operation to filter results:
Parameter | Description | Required
:--- | :--- | :---
from | The starting index to search from. Default is 0. | No
size | Specifies the number of results to return. Default is 10. | No
search | The search term to use to filter results. | No
sortField | The field to sort results with. | No
sortDirection | Specifies the direction to sort results in. Can be `ASC` or `DESC`. Default is `ASC`. | No
For example, the following request returns two results, starting from index 8.
**Sample Request**
```json
GET _plugins/_transform?size=2&from=8
```
**Sample Response**
```json
{
  "total_transforms": 18,
  "transforms": [
    {
      "_id": "sample8",
      "_seq_no": 93,
      "_primary_term": 1,
      "transform": {
        "transform_id": "sample8",
        "schema_version": 7,
        "schedule": {
          "interval": {
            "start_time": 1622063596812,
            "period": 1,
            "unit": "Minutes"
          }
        },
        "metadata_id": "y4hFAB2ZURQ2dzY7BAMxWA",
        "updated_at": 1622063657233,
        "enabled": false,
        "enabled_at": null,
        "description": "Sample transform job",
        "source_index": "sample_index3",
        "data_selection_query": {
          "match_all": {
            "boost": 1.0
          }
        },
        "target_index": "sample_target3",
        "roles": [],
        "page_size": 1,
        "groups": [
          {
            "terms": {
              "source_field": "customer_gender",
              "target_field": "gender"
            }
          },
          {
            "terms": {
              "source_field": "day_of_week",
              "target_field": "day"
            }
          }
        ],
        "aggregations": {
          "quantity": {
            "sum": {
              "field": "total_quantity"
            }
          }
        }
      }
    },
    {
      "_id": "sample9",
      "_seq_no": 98,
      "_primary_term": 1,
      "transform": {
        "transform_id": "sample9",
        "schema_version": 7,
        "schedule": {
          "interval": {
            "start_time": 1622063598065,
            "period": 1,
            "unit": "Minutes"
          }
        },
        "metadata_id": "x8tCIiYMTE3veSbIJkit5A",
        "updated_at": 1622063658388,
        "enabled": false,
        "enabled_at": null,
        "description": "Sample transform job",
        "source_index": "sample_index4",
        "data_selection_query": {
          "match_all": {
            "boost": 1.0
          }
        },
        "target_index": "sample_target4",
        "roles": [],
        "page_size": 1,
        "groups": [
          {
            "terms": {
              "source_field": "customer_gender",
              "target_field": "gender"
            }
          },
          {
            "terms": {
              "source_field": "day_of_week",
              "target_field": "day"
            }
          }
        ],
        "aggregations": {
          "quantity": {
            "sum": {
              "field": "total_quantity"
            }
          }
        }
      }
    }
  ]
}
```
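
The `from` and `size` parameters described in this section can be combined to page through every transform job. The following Python sketch builds the request paths for each page. The function names are hypothetical, and the total of 18 is taken from the `total_transforms` field of the sample response.

```python
from urllib.parse import urlencode


def list_transforms_path(from_=0, size=10, search=None,
                         sort_field=None, sort_direction=None):
    """Build the path for listing transform jobs (hypothetical helper)."""
    params = {"from": from_, "size": size}
    # Optional filters are only added when the caller supplies them.
    if search is not None:
        params["search"] = search
    if sort_field is not None:
        params["sortField"] = sort_field
    if sort_direction is not None:
        params["sortDirection"] = sort_direction
    return "_plugins/_transform?" + urlencode(params)


def page_offsets(total, size):
    """Return the `from` offset of every page needed to cover `total` results."""
    return list(range(0, total, size))


# Nine pages of two results each cover the 18 jobs in the sample response.
paths = [list_transforms_path(from_=offset, size=2) for offset in page_offsets(18, 2)]
for path in paths:
    print("GET " + path)
```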
## Start a transform job
Introduced 1.0
{: .label .label-purple }
Transform jobs created using the API are automatically enabled. If you need to enable a job that is not currently enabled, use the `start` API operation.
**Sample Request**
```json
POST _plugins/_transform/<transform_id>/_start
```
**Sample Response**
```json
{
  "acknowledged": true
}
```
## Stop a transform job
Introduced 1.0
{: .label .label-purple }
Stops (disables) a transform job.
**Sample Request**
```json
POST _plugins/_transform/<transform_id>/_stop
```
**Sample Response**
```json
{
  "acknowledged": true
}
```
## Get the status of a transform job
Introduced 1.0
{: .label .label-purple }
Returns the status and metadata of a transform job.
**Sample Request**
```json
GET _plugins/_transform/<transform_id>/_explain
```
**Sample Response**
```json
{
  "sample": {
    "metadata_id": "PzmjweME5xbgkenl9UpsYw",
    "transform_metadata": {
      "transform_id": "sample",
      "last_updated_at": 1621883525873,
      "status": "finished",
      "failure_reason": "null",
      "stats": {
        "pages_processed": 0,
        "documents_processed": 0,
        "documents_indexed": 0,
        "index_time_in_millis": 0,
        "search_time_in_millis": 0
      }
    }
  }
}
```
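
The explain response is keyed by transform ID, so a client has to look up the job's entry before reading its status. The following Python sketch reads the `status` and `stats` fields from a trimmed copy of the sample response above. The helper name `transform_status` is hypothetical, and treating a missing entry as "no status" is an assumption of this sketch, not documented plugin behavior.

```python
import json

# Trimmed copy of the sample response shown above.
EXPLAIN_RESPONSE = """
{
  "sample": {
    "metadata_id": "PzmjweME5xbgkenl9UpsYw",
    "transform_metadata": {
      "transform_id": "sample",
      "status": "finished",
      "stats": {
        "pages_processed": 0,
        "documents_processed": 0,
        "documents_indexed": 0
      }
    }
  }
}
"""


def transform_status(explain_response, transform_id):
    """Return (status, stats) for one job from a parsed explain response."""
    entry = explain_response.get(transform_id) or {}
    metadata = entry.get("transform_metadata") or {}
    # Both values fall back to "empty" when the job or its metadata is absent.
    return metadata.get("status"), metadata.get("stats", {})


status, stats = transform_status(json.loads(EXPLAIN_RESPONSE), "sample")
print(status, stats["documents_indexed"])
```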
## Preview a transform job's results
Introduced 1.0
{: .label .label-purple }
Returns a preview of what a transformed index would look like.
**Sample Request**
```json
POST _plugins/_transform/_preview
{
  "transform": {
    "enabled": false,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "description": "test transform",
    "source_index": "sample_index",
    "target_index": "sample_target",
    "data_selection_query": {
      "match_all": {}
    },
    "page_size": 10,
    "groups": [
      {
        "terms": {
          "source_field": "customer_gender",
          "target_field": "gender"
        }
      },
      {
        "terms": {
          "source_field": "day_of_week",
          "target_field": "day"
        }
      }
    ],
    "aggregations": {
      "quantity": {
        "sum": {
          "field": "total_quantity"
        }
      }
    }
  }
}
```
**Sample Response**
```json
{
  "documents" : [
    {
      "quantity" : 862.0,
      "gender" : "FEMALE",
      "day" : "Friday"
    },
    {
      "quantity" : 682.0,
      "gender" : "FEMALE",
      "day" : "Monday"
    },
    {
      "quantity" : 772.0,
      "gender" : "FEMALE",
      "day" : "Saturday"
    },
    {
      "quantity" : 669.0,
      "gender" : "FEMALE",
      "day" : "Sunday"
    },
    {
      "quantity" : 887.0,
      "gender" : "FEMALE",
      "day" : "Thursday"
    }
  ]
}
```
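
To make the shape of the preview documents easier to read, the following Python sketch mimics the combination of two `terms` groups and a `sum` aggregation on a few in-memory records. It is a conceptual illustration only and says nothing about how the plugin computes its results. The function name `preview_locally` and the three input records are invented for the example.

```python
from collections import defaultdict


def preview_locally(docs, groups, sum_field, target_name):
    """Group `docs` by the source fields in `groups` and sum `sum_field`.

    `groups` is a list of (source_field, target_field) pairs, mirroring the
    `terms` entries in the request body.
    """
    totals = defaultdict(float)
    for doc in docs:
        key = tuple(doc[source] for source, _ in groups)
        totals[key] += doc[sum_field]

    results = []
    for key in sorted(totals):
        row = {target_name: totals[key]}
        # Rename each grouped value to its target field.
        for (_, target), value in zip(groups, key):
            row[target] = value
        results.append(row)
    return results


# Invented sample records, not taken from a real index.
records = [
    {"customer_gender": "FEMALE", "day_of_week": "Friday", "total_quantity": 2},
    {"customer_gender": "FEMALE", "day_of_week": "Friday", "total_quantity": 3},
    {"customer_gender": "MALE", "day_of_week": "Monday", "total_quantity": 1},
]
documents = preview_locally(
    records,
    groups=[("customer_gender", "gender"), ("day_of_week", "day")],
    sum_field="total_quantity",
    target_name="quantity",
)
print(documents)
```

Each output row has one key per `target_field` plus one key per named aggregation, which is the same layout as the entries in the `documents` array of the sample response.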
## Delete a transform job
Introduced 1.0
{: .label .label-purple }
Deletes a transform job. This operation does not delete the source or target indices.
**Sample Request**
```json
DELETE _plugins/_transform/<transform_id>
```
**Sample Response**
```json
{
  "took": 205,
  "errors": false,
  "items": [
    {
      "delete": {
        "_index": ".opensearch-ism-config",
        "_type": "_doc",
        "_id": "sample",
        "_version": 4,
        "result": "deleted",
        "forced_refresh": true,
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 6,
        "_primary_term": 1,
        "status": 200
      }
    }
  ]
}
```
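
The delete response has a bulk-style layout, with a top-level `errors` flag and one entry per item. The following Python sketch checks that flag and collects the IDs whose `result` is `deleted`. The helper name `deleted_ids` is hypothetical, and raising an exception when `errors` is true is a design choice of this sketch.

```python
def deleted_ids(delete_response):
    """Return the IDs reported as deleted in a parsed delete response."""
    if delete_response.get("errors"):
        raise RuntimeError("the delete request reported errors")
    return [
        item["delete"]["_id"]
        for item in delete_response.get("items", [])
        if item.get("delete", {}).get("result") == "deleted"
    ]


# Trimmed copy of the sample response shown above.
response = {
    "took": 205,
    "errors": False,
    "items": [
        {"delete": {"_id": "sample", "result": "deleted", "status": 200}}
    ],
}
print(deleted_ids(response))
```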