OpenSearch/docs/java-rest/high-level/dataframe/put_data_frame.asciidoc

--
:api: put-data-frame-transform
:request: PutDataFrameTransformRequest
:response: AcknowledgedResponse
--
[id="{upid}-{api}"]
=== Put Data Frame Transform API

The Put Data Frame Transform API is used to create a new {dataframe-transform}.

The API accepts a +{request}+ object as a request and returns a +{response}+.

[id="{upid}-{api}-request"]
==== Put Data Frame Request

A +{request}+ requires the following argument:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------
<1> The configuration of the {dataframe-transform} to create

[id="{upid}-{api}-config"]
==== Data Frame Transform Configuration

The `DataFrameTransformConfig` object holds the complete {dataframe-transform}
configuration and takes the following arguments:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-config]
--------------------------------------------------
<1> The {dataframe-transform} ID
<2> The source indices and query from which to gather data
<3> The destination index and optional pipeline
<4> How often to check for updates to the source indices
<5> The `PivotConfig`
<6> An optional free-text description of the transform

[id="{upid}-{api}-query-config"]
==== SourceConfig

The indices and the query from which to collect data.
If query is not set, a `match_all` query is used by default.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-source-config]
--------------------------------------------------
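
The default-query behavior described above can be illustrated with a minimal,
self-contained sketch. `SourceSketch` and its members are hypothetical names
used only for illustration; this is not the client's `SourceConfig` class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a source configuration; NOT the client's SourceConfig.
final class SourceSketch {
    // JSON form of a query that matches every document.
    static final String MATCH_ALL = "{\"match_all\":{}}";

    private final List<String> indices;
    private final String queryJson; // null means "no query was set"

    SourceSketch(List<String> indices, String queryJson) {
        this.indices = indices;
        this.queryJson = queryJson;
    }

    List<String> indices() {
        return indices;
    }

    // Falls back to match_all when no query was supplied.
    String effectiveQuery() {
        return queryJson == null ? MATCH_ALL : queryJson;
    }

    public static void main(String[] args) {
        SourceSketch noQuery = new SourceSketch(Arrays.asList("reviews"), null);
        System.out.println(noQuery.effectiveQuery());
    }
}
```

In this sketch an unset query is modeled as `null`; how the real client
represents an unset query is not shown here.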

==== DestConfig

The index to write the data to and the optional pipeline
through which the documents are indexed.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-dest-config]
--------------------------------------------------

===== QueryConfig

The query with which to select data from the source.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-query-config]
--------------------------------------------------

==== PivotConfig

Defines the `group by` fields of the pivot function and the aggregations used to reduce the data.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-pivot-config]
--------------------------------------------------
<1> The `GroupConfig` to use in the pivot
<2> The aggregations to use
<3> The maximum paging size for the transform when pulling data
from the source. The size adjusts dynamically while the transform
is running to recover from and prevent out-of-memory (OOM) issues.
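
To make the idea of a dynamically adjusting page size concrete, here is a
minimal sketch. The policy shown (halve on memory pressure, double after a
successful page, stay within fixed bounds) is an assumption chosen for
illustration; the policy the transform actually applies is not described
here. `PageSizeSketch` and its methods are hypothetical names.

```java
// Hypothetical illustration only; the transform's real adjustment policy may differ.
final class PageSizeSketch {
    private final int maxPageSize;
    private final int minPageSize;
    private int current;

    PageSizeSketch(int maxPageSize, int minPageSize) {
        if (minPageSize < 1 || maxPageSize < minPageSize) {
            throw new IllegalArgumentException("require 1 <= minPageSize <= maxPageSize");
        }
        this.maxPageSize = maxPageSize;
        this.minPageSize = minPageSize;
        this.current = maxPageSize; // start at the configured maximum
    }

    int current() {
        return current;
    }

    // Assumed policy: halve the page size after a memory-related failure,
    // but never go below the minimum.
    int onMemoryPressure() {
        current = Math.max(minPageSize, current / 2);
        return current;
    }

    // Assumed policy: double the page size after a successful page,
    // but never exceed the configured maximum.
    int onSuccess() {
        current = (int) Math.min((long) maxPageSize, (long) current * 2);
        return current;
    }

    public static void main(String[] args) {
        PageSizeSketch pages = new PageSizeSketch(500, 10);
        System.out.println(pages.onMemoryPressure()); // 250
        System.out.println(pages.onMemoryPressure()); // 125
        System.out.println(pages.onSuccess());        // 250
    }
}
```

The configured maximum acts as a ceiling in this sketch: the working page
size can shrink below it and recover, but never grows past it.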

===== GroupConfig

The grouping terms. Defines the group by and destination fields
which are produced by the pivot function. There are 3 types of
groups:

* Terms
* Histogram
* Date Histogram

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-group-config]
--------------------------------------------------
<1> The destination field
<2> Group by values of the `user_id` field
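
The three group types differ in how a field value is mapped to a group key.
The sketch below shows that mapping in plain Java, independent of the client:
a terms group uses the value itself, a histogram rounds a number down to the
start of its interval, and a date histogram rounds a timestamp down to the
start of its time unit. `GroupKeySketch` and its method names are
hypothetical, and offsets and time zones other than UTC are left out for
brevity.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper showing how each group type derives a bucket key.
final class GroupKeySketch {
    // Terms: each distinct field value is its own group.
    static String termsKey(String value) {
        return value;
    }

    // Histogram: numeric values are rounded down to the start of their interval.
    static double histogramKey(double value, double interval) {
        return Math.floor(value / interval) * interval;
    }

    // Date histogram: timestamps are rounded down to the start of the unit (UTC).
    static Instant dateHistogramKey(Instant timestamp, ChronoUnit unit) {
        return timestamp.truncatedTo(unit);
    }

    public static void main(String[] args) {
        System.out.println(termsKey("alice"));
        System.out.println(histogramKey(7.5, 5.0));
        System.out.println(dateHistogramKey(Instant.parse("2019-03-14T14:57:12Z"), ChronoUnit.DAYS));
    }
}
```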

===== AggregationConfig

Defines the aggregations for the group fields.
// TODO link to the supported aggregations

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-agg-config]
--------------------------------------------------
<1> Aggregate the average star rating
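
Conceptually, a pivot that groups by `user_id` and averages the star rating
reduces many source documents to one row per user. The following plain-Java
sketch mimics that reduction in memory; it does not call the client or the
cluster. The `Review` class, the `stars` field name, and the sample data are
assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration of "group by user_id, average the star rating".
final class PivotSketch {
    // Hypothetical source document: one star rating given by one user.
    static final class Review {
        final String userId;
        final double stars;

        Review(String userId, double stars) {
            this.userId = userId;
            this.stars = stars;
        }
    }

    // Groups reviews by user ID, then reduces each group to its average rating.
    // A TreeMap keeps the result sorted by user ID for stable output.
    static Map<String, Double> averageStarsByUser(List<Review> reviews) {
        return reviews.stream().collect(Collectors.groupingBy(
                r -> r.userId,
                TreeMap::new,
                Collectors.averagingDouble(r -> r.stars)));
    }

    public static void main(String[] args) {
        List<Review> reviews = Arrays.asList(
                new Review("alice", 4.0),
                new Review("alice", 5.0),
                new Review("bob", 3.0));
        System.out.println(averageStarsByUser(reviews)); // prints {alice=4.5, bob=3.0}
    }
}
```

Each map entry corresponds to one document the transform would write to the
destination index: the key plays the role of the group-by field and the value
the role of the aggregation result.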

include::../execution.asciidoc[]

[id="{upid}-{api}-response"]
==== Response

The returned +{response}+ acknowledges the successful creation of
the new {dataframe-transform}. If the configuration is invalid, an error
is returned instead.