OpenSearch/docs/java-rest/high-level/dataframe/put_data_frame.asciidoc
David Roberts cb62d4acdf [ML-DataFrame] Add a frequency option to transform config, default 1m (#44120)
Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.
2019-07-10 09:59:00 +01:00

120 lines
3.9 KiB
Plaintext

--
:api: put-data-frame-transform
:request: PutDataFrameTransformRequest
:response: AcknowledgedResponse
--
[id="{upid}-{api}"]
=== Put Data Frame Transform API
The Put Data Frame Transform API is used to create a new {dataframe-transform}.
The API accepts a +{request}+ object as a request and returns a +{response}+.
[id="{upid}-{api}-request"]
==== Put Data Frame Request
A +{request}+ requires the following argument:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------
<1> The configuration of the {dataframe-job} to create
[id="{upid}-{api}-config"]
==== Data Frame Transform Configuration
The `DataFrameTransformConfig` object contains all the details about the {dataframe-transform}
configuration and contains the following arguments:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-config]
--------------------------------------------------
<1> The {dataframe-transform} ID
<2> The source indices and query from which to gather data
<3> The destination index and optional pipeline
<4> How often to check for updates to the source indices
<5> The PivotConfig
<6> Optional free text description of the transform
[id="{upid}-{api}-query-config"]
==== SourceConfig
The indices and the query from which to collect data.
If query is not set, a `match_all` query is used by default.
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-source-config]
--------------------------------------------------
==== DestConfig
The index where to write the data and the optional pipeline
through which the docs should be indexed
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-dest-config]
--------------------------------------------------
===== QueryConfig
The query with which to select data from the source.
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-query-config]
--------------------------------------------------
==== PivotConfig
Defines the pivot function `group by` fields and the aggregation to reduce the data.
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-pivot-config]
--------------------------------------------------
<1> The `GroupConfig` to use in the pivot
<2> The aggregations to use
<3> The maximum paging size for the transform when pulling data
from the source. The size dynamically adjusts as the transform
is running to recover from and prevent OOM issues.
===== GroupConfig
The grouping terms. Defines the group by and destination fields
which are produced by the pivot function. There are 3 types of
groups
* Terms
* Histogram
* Date Histogram
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-group-config]
--------------------------------------------------
<1> The destination field
<2> Group by values of the `user_id` field
===== AggregationConfig
Defines the aggregations for the group fields.
// TODO link to the supported aggregations
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-agg-config]
--------------------------------------------------
<1> Aggregate the average star rating
include::../execution.asciidoc[]
[id="{upid}-{api}-response"]
==== Response
The returned +{response}+ acknowledges the successful creation of
the new {dataframe-transform} or an error if the configuration is invalid.