//lcawley Verified example output 2017-04-11
[[ml-datafeed-resource]]
==== Data Feed Resources
A data feed resource has the following properties:
`aggregations`::
(object) If set, the data feed performs aggregation searches.
For syntax information, see {ref}/search-aggregations.html[Aggregations].
Support for aggregations is limited: TBD.
For example:
`{"@timestamp": {"histogram": {"field": "@timestamp",
"interval": 30000, "offset": 0, "order": {"_key": "asc"}, "keyed": false,
"min_doc_count": 0}, "aggregations": {"events_per_min": {"sum": {
"field": "events_per_min"}}}}}`.
`chunking_config`::
(object) The chunking configuration, which specifies how data searches are
chunked. See <<ml-datafeed-chunking-config>>.
For example: `{"mode": "manual", "time_span": "3h"}`.
`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.
`frequency`::
(time units) The interval at which scheduled queries are made while the data
feed runs in real time. The default value is either the bucket span for short
bucket spans, or, for longer bucket spans, a sensible fraction of the bucket
span. For example: "150s".
`indexes` (required)::
(array) An array of index names. For example: ["it_ops_metrics"].
`job_id` (required)::
(string) The unique identifier for the job to which the data feed sends data.
`query`::
(object) The Elasticsearch query domain-specific language (DSL). This value
corresponds to the query object in an Elasticsearch search POST body. All the
options that are supported by Elasticsearch can be used, as this object is
passed verbatim to Elasticsearch. If this property is not specified, the
default value is `{"match_all": {"boost": 1}}`, which is equivalent to
`{"match_all": {}}`.
`query_delay`::
(time units) The number of seconds behind real time that data is queried. For
example, if data from 10:04 a.m. might not be searchable in Elasticsearch
until 10:06 a.m., set this property to 120 seconds. The default value is 60
seconds. For example: "60s".
`scroll_size`::
(unsigned integer) The `size` parameter that is used in Elasticsearch searches.
The default value is `1000`.
`types` (required)::
(array) A list of types to search for within the specified indices.
For example: ["network","sql","kpi"].
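
The properties above can also be assembled programmatically. The following is a minimal sketch in Python, not part of the API: the helper `build_datafeed`, its validation rule, and the identifiers `datafeed-it-ops` and `it-ops-kpi` are assumptions made for illustration. The remaining values come from the examples and defaults listed above.

```python
import json

# Properties marked "(required)" in the list above.
REQUIRED = ("indexes", "job_id", "types")


def build_datafeed(datafeed_id, **properties):
    """Return a data feed resource as a dict (hypothetical helper).

    Raises ValueError if a required property is missing or empty.
    """
    missing = [name for name in REQUIRED if not properties.get(name)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    resource = {"datafeed_id": datafeed_id}
    resource.update(properties)
    # Defaults as described in this section.
    resource.setdefault("query", {"match_all": {"boost": 1}})
    resource.setdefault("query_delay", "60s")
    resource.setdefault("scroll_size", 1000)
    return resource


datafeed = build_datafeed(
    "datafeed-it-ops",      # hypothetical identifier
    job_id="it-ops-kpi",    # hypothetical job identifier
    indexes=["it_ops_metrics"],
    types=["network", "sql", "kpi"],
    frequency="150s",
    chunking_config={"mode": "manual", "time_span": "3h"},
)
print(json.dumps(datafeed, indent=2))
```

Filling in the defaults on the client side is a choice made only for readability in this sketch; omitting those properties leaves the defaults to the server.
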
[[ml-datafeed-chunking-config]]
===== Chunking Configuration Objects
A chunking configuration object has the following properties:
`mode` (required)::
There are three available modes: +
`auto`::: The chunk size will be dynamically calculated.
`manual`::: Chunking will be applied according to the specified `time_span`.
`off`::: No chunking will be applied.
`time_span`::
(time units) The time span that each search will be querying.
This setting is only applicable when the mode is set to `manual`.
For example: "3h".
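
To illustrate what the `manual` and `off` modes mean for a search, the following Python sketch splits a time range into the spans that each search would cover. It is an illustration under stated assumptions, not the server's implementation: the function names are hypothetical, only the `s`, `m`, `h`, and `d` unit suffixes are handled, and the dynamic calculation used by `auto` is deliberately not modeled.

```python
import re

# Seconds per unit for a small subset of time-unit suffixes (assumed for this sketch).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_time_span(value):
    """Convert a string such as "3h" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError("unsupported time span: %r" % value)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def chunk_ranges(start, end, chunking_config):
    """Split [start, end), in epoch seconds, into the range of each search."""
    mode = chunking_config["mode"]
    if mode == "off":
        # No chunking: one search covers the whole range.
        return [(start, end)]
    if mode == "manual":
        step = parse_time_span(chunking_config["time_span"])
        return [(lo, min(lo + step, end)) for lo in range(start, end, step)]
    if mode == "auto":
        # The chunk size is calculated dynamically; this sketch does not model that.
        raise NotImplementedError("auto chunk size is not modeled here")
    raise ValueError("unknown mode: %r" % mode)


# An 8-hour range split with the example configuration from above.
ranges = chunk_ranges(0, 8 * 3600, {"mode": "manual", "time_span": "3h"})
print(ranges)  # [(0, 10800), (10800, 21600), (21600, 28800)]
```
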
[float]
[[ml-datafeed-counts]]
==== Data Feed Counts
The get data feed statistics API provides information about the operational
progress of a data feed. For example:
`assignment_explanation`::
TBD. For example: " "
`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.
`node`::
(object) TBD
The node that is running the query?
`id`::: TBD. For example, "0-o0tOoRTwKFZifatTWKNw".
`name`::: TBD. For example, "0-o0tOo".
`ephemeral_id`::: TBD. For example, "DOZltLxLS_SzYpW6hQ9hyg".
`transport_address`::: TBD. For example, "127.0.0.1:9300".
`attributes`::: TBD. For example, {"max_running_jobs": "10"}.
`state`::
(string) The status of the data feed, which can be one of the following values: +
`started`::: The data feed is actively receiving data.
`stopped`::: The data feed is stopped and will not receive data until it is restarted.
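
A client might read these fields as in the following Python sketch. The sample document is assembled from the example values in this section and is not verified API output; the identifier `datafeed-it-ops`, the helper `describe`, and the handling of a missing `node` object are assumptions made for illustration.

```python
import json

# Sample assembled from the example values above; "datafeed-it-ops" is hypothetical.
SAMPLE = """
{
  "datafeed_id": "datafeed-it-ops",
  "state": "started",
  "node": {
    "id": "0-o0tOoRTwKFZifatTWKNw",
    "name": "0-o0tOo",
    "ephemeral_id": "DOZltLxLS_SzYpW6hQ9hyg",
    "transport_address": "127.0.0.1:9300",
    "attributes": {"max_running_jobs": "10"}
  }
}
"""


def describe(stats):
    """Return a one-line summary of a data feed's state (hypothetical helper)."""
    state = stats["state"]
    if state not in ("started", "stopped"):
        raise ValueError("unexpected state: %r" % state)
    # Assumption: the node object may be absent, so it is read defensively.
    node = stats.get("node")
    where = " on %s (%s)" % (node["name"], node["transport_address"]) if node else ""
    return "%s is %s%s" % (stats["datafeed_id"], state, where)


summary = describe(json.loads(SAMPLE))
print(summary)  # datafeed-it-ops is started on 0-o0tOo (127.0.0.1:9300)
```
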