//lcawley Verified example output 2017-04-11
[[ml-datafeed-resource]]
==== Data Feed Resources
A data feed resource has the following properties:
`aggregations`::
(object) If set, the datafeed performs aggregation searches.
For syntax information, see {ref}search-aggregations.html[Aggregations].
Support for aggregations is limited: TBD.
For example:
`{"@timestamp": {"histogram": {"field": "@timestamp",
"interval": 30000,"offset": 0,"order": {"_key": "asc"},"keyed": false,
"min_doc_count": 0}, "aggregations": {"events_per_min": {"sum": {
"field": "events_per_min"}}}}}`.
`chunking_config`::
(object) The chunking configuration, which specifies how data searches
will be chunked. See <<ml-datafeed-chunking-config,chunking configuration objects>>.
For example: `{"mode": "manual", "time_span": "3h"}`
`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.
`frequency`::
(time units) The interval at which scheduled queries are made while the datafeed
runs in real time. The default value is either the bucket span for short bucket spans, or,
for longer bucket spans, a sensible fraction of the bucket span.
For example: "150s"
`indexes` (required)::
(array) An array of index names. For example: ["it_ops_metrics"]
`job_id` (required)::
(string) The ID of the job to which the datafeed sends data.
`query`::
(object) The Elasticsearch query DSL. This value corresponds to the query object in an
Elasticsearch search POST body. All options supported by Elasticsearch can be used, because
this object is passed verbatim to Elasticsearch.
By default, this property has the following value: `{"match_all": {"boost": 1}}`.
`query_delay`::
(time units) The number of seconds behind real time that data is queried. For example,
if data from 10:04 a.m. might not be searchable in Elasticsearch until 10:06 a.m., set this
property to 120 seconds. The default value is 60 seconds. For example: "60s"
`scroll_size`::
(unsigned integer) The `size` parameter that is used in Elasticsearch searches.
The default value is `1000`.
`types` (required)::
(array) List of types to search for within the specified indexes.
For example: ["network","sql","kpi"]
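
The properties above can be assembled into a datafeed resource as sketched below. This is a minimal illustration in Python, not part of any client library: `build_datafeed` is a hypothetical helper, and the job ID `it-ops-kpi` is invented for the example. The defaults for `query`, `query_delay`, and `scroll_size` are the ones stated in the property list. The `frequency` default is left out because it depends on the bucket span of the job.

```python
import copy
import json

# Properties marked "(required)" in the property list.
REQUIRED = ("indexes", "job_id", "types")

# Defaults stated in the property list. The helper itself is hypothetical.
DEFAULTS = {
    "query": {"match_all": {"boost": 1}},
    "query_delay": "60s",
    "scroll_size": 1000,
}


def build_datafeed(**properties):
    """Return a datafeed resource dict with the documented defaults filled in."""
    missing = [name for name in REQUIRED if name not in properties]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    resource = copy.deepcopy(DEFAULTS)
    resource.update(properties)
    return resource


datafeed = build_datafeed(
    job_id="it-ops-kpi",  # hypothetical job ID
    indexes=["it_ops_metrics"],
    types=["network", "sql", "kpi"],
    frequency="150s",
    chunking_config={"mode": "manual", "time_span": "3h"},
)
print(json.dumps(datafeed, indent=2, sort_keys=True))
```

Calling `build_datafeed` without `indexes`, `job_id`, or `types` raises a `ValueError` that names the missing properties.
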
[[ml-datafeed-chunking-config]]
===== Chunking Configuration Objects
A chunking configuration object has the following properties:
`mode` (required)::
There are three available modes: +
`auto`::: the chunk size will be dynamically calculated.
`manual`::: chunking will be applied according to the specified `time_span`.
`off`::: no chunking will be applied.
`time_span`::
(time units) The time span that each search will be querying.
This setting is only applicable when the mode is set to `manual`.
For example: "3h"
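
The rules for chunking configuration objects can be checked with a small validator such as the following sketch. `check_chunking_config` is a hypothetical helper written for this page, not an API of the product. It assumes that a `time_span` supplied with any mode other than `manual` should be flagged, because the setting only applies in `manual` mode. Whether `manual` mode requires a `time_span` is not stated here, so the sketch does not enforce it.

```python
VALID_MODES = ("auto", "manual", "off")


def check_chunking_config(config):
    """Return a list of problems found in a chunking configuration dict."""
    problems = []
    mode = config.get("mode")
    if mode is None:
        problems.append("mode is required")
    elif mode not in VALID_MODES:
        problems.append("unknown mode: %r" % (mode,))
    # time_span is only applicable when the mode is manual.
    if "time_span" in config and mode != "manual":
        problems.append("time_span only applies when mode is manual")
    return problems


print(check_chunking_config({"mode": "manual", "time_span": "3h"}))
print(check_chunking_config({"mode": "auto", "time_span": "3h"}))
```

An empty list means that no problems were found.
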
[float]
[[ml-datafeed-counts]]
==== Data Feed Counts
The get data feed statistics API provides information about the operational
progress of a data feed. For example, it reports the following properties:
`assignment_explanation`::
TBD. For example: " "
`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.
`node`::
(object) TBD
The node that is running the query?
`id`::: TBD. For example, "0-o0tOoRTwKFZifatTWKNw".
`name`::: TBD. For example, "0-o0tOo".
`ephemeral_id`::: TBD. For example, "DOZltLxLS_SzYpW6hQ9hyg".
`transport_address`::: TBD. For example, "127.0.0.1:9300".
`attributes`::: TBD. For example, {"max_running_jobs": "10"}.
`state`::
(string) The status of the data feed, which can be one of the following values: +
`started`::: The data feed is actively receiving data.
`stopped`::: The data feed is stopped and will not receive data until it is restarted.
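
As an illustration of how these properties might be read, the sketch below summarizes one statistics object in Python. `describe_datafeed` is a hypothetical helper, and the `datafeed_id` value is invented. The node values are the example values from the list above. The sketch assumes that the `node` object can be absent, for example when the data feed is stopped, which this page does not confirm.

```python
def describe_datafeed(stats):
    """Summarize the state of one data feed statistics object."""
    datafeed_id = stats["datafeed_id"]
    state = stats["state"]
    node = stats.get("node")  # assumed to be optional
    if state == "started" and node:
        return "%s is started on node %s (%s)" % (
            datafeed_id, node["name"], node["transport_address"])
    return "%s is %s" % (datafeed_id, state)


stats = {
    "datafeed_id": "datafeed-it-ops-kpi",  # hypothetical ID
    "state": "started",
    "node": {
        "id": "0-o0tOoRTwKFZifatTWKNw",
        "name": "0-o0tOo",
        "ephemeral_id": "DOZltLxLS_SzYpW6hQ9hyg",
        "transport_address": "127.0.0.1:9300",
        "attributes": {"max_running_jobs": "10"},
    },
}
print(describe_datafeed(stats))
```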