//lcawley Verified example output 2017-04-11
[[ml-datafeed-resource]]
==== Data Feed Resources

A data feed resource has the following properties:

`aggregations`::
(object) If set, the data feed performs aggregation searches.
For syntax information, see {ref}/search-aggregations.html[Aggregations].
Support for aggregations is limited; use them only with low-cardinality data.
For example:
`{"@timestamp": {"histogram": {"field": "@timestamp",
"interval": 30000, "offset": 0, "order": {"_key": "asc"}, "keyed": false,
"min_doc_count": 0}, "aggregations": {"events_per_min": {"sum": {
"field": "events_per_min"}}}}}`.

//TBD link to a Working with aggregations page
`chunking_config`::
(object) Specifies how data searches are split into time chunks.
See <<ml-datafeed-chunking-config>>.
For example: `{"mode": "manual", "time_span": "3h"}`.

`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.

`frequency`::
(time units) The interval at which scheduled queries are made while the data
feed runs in real time. The default value is either the bucket span for short
bucket spans, or, for longer bucket spans, a sensible fraction of the bucket
span. For example: "150s".

`indexes` (required)::
(array) An array of index names. For example: `["it_ops_metrics"]`.

`job_id` (required)::
(string) The unique identifier for the job to which the data feed sends data.

`query`::
(object) The {es} query domain-specific language (DSL). This value
corresponds to the query object in an {es} search POST body. All the
options that are supported by {es} can be used, as this object is
passed verbatim to {es}. By default, this property has the following
value: `{"match_all": {"boost": 1}}`.

`query_delay`::
(time units) The number of seconds behind real time that data is queried. For
example, if data from 10:04 a.m. might not be searchable in {es} until
10:06 a.m., set this property to 120 seconds. The default value is `60s`.

`scroll_size`::
(unsigned integer) The `size` parameter that is used in {es} searches.
The default value is `1000`.

`types` (required)::
(array) A list of types to search for within the specified indices.
For example: `["network","sql","kpi"]`.
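
As a rough illustration, the properties above might be assembled into a single configuration object as follows. This is a minimal sketch in Python: the helper name `build_datafeed_config`, the job identifier, and the presence check are assumptions made for illustration and are not part of the API. The property names, example values, and defaults come from the list above.

```python
import json

# Properties marked "(required)" in the list above.
REQUIRED_PROPERTIES = ("indexes", "job_id", "types")


def build_datafeed_config(**properties):
    """Return a data feed configuration dict, checking required properties.

    Hypothetical helper: it only checks that the required keys are present
    and non-empty. It does not contact Elasticsearch.
    """
    missing = [name for name in REQUIRED_PROPERTIES if not properties.get(name)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return dict(properties)


config = build_datafeed_config(
    job_id="it-ops-kpi",  # hypothetical job identifier
    indexes=["it_ops_metrics"],
    types=["network", "sql", "kpi"],
    query={"match_all": {"boost": 1}},  # documented default value
    query_delay="60s",  # documented default value
    scroll_size=1000,  # documented default value
    chunking_config={"mode": "manual", "time_span": "3h"},
)

# Serialize the configuration as JSON for inspection.
print(json.dumps(config, indent=2))
```

Omitting any of `indexes`, `job_id`, or `types` makes the helper raise a `ValueError` that names the missing properties.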

[[ml-datafeed-chunking-config]]
===== Chunking Configuration Objects

Data feeds might be required to search over long time periods, such as several
months or years. The search is split into time chunks to keep the load on {es}
manageable. The chunking configuration, an advanced option, controls how the
size of these time chunks is calculated.

A chunking configuration object has the following properties:

`mode` (required)::
There are three available modes: +
`auto`::: The chunk size is calculated dynamically. This is the default
and recommended value.
`manual`::: Chunking is applied according to the specified `time_span`.
`off`::: No chunking is applied.

`time_span`::
(time units) The time span that each search queries.
This setting is only applicable when the mode is set to `manual`.
For example: "3h".
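
To illustrate what `manual` mode implies, the following Python sketch splits a search time range into consecutive chunks of one `time_span` each. The function names `parse_time_span` and `split_into_chunks`, the half-open interval convention, and the small set of supported unit suffixes are assumptions made for illustration. The chunking logic that the data feed actually uses is not described here.

```python
from datetime import datetime, timedelta

# Assumed subset of time-unit suffixes, enough for values such as "3h" or "150s".
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_time_span(value):
    """Convert a string such as "3h" into a timedelta."""
    number, unit = value[:-1], value[-1]
    if unit not in UNIT_SECONDS or not number.isdigit():
        raise ValueError("unsupported time span: " + repr(value))
    return timedelta(seconds=int(number) * UNIT_SECONDS[unit])


def split_into_chunks(start, end, time_span):
    """Yield (chunk_start, chunk_end) pairs that cover [start, end)."""
    step = parse_time_span(time_span)
    if step <= timedelta(0):
        raise ValueError("time span must be positive")
    chunk_start = start
    while chunk_start < end:
        # The final chunk is shortened so that it never runs past `end`.
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


chunks = list(split_into_chunks(
    datetime(2017, 4, 11, 0, 0),
    datetime(2017, 4, 11, 8, 0),
    "3h",
))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "->", chunk_end.isoformat())
```

With an eight-hour range and a `3h` span, the sketch produces three searches; the last one covers only the remaining two hours.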

[float]
[[ml-datafeed-counts]]
==== Data Feed Counts

The get data feed statistics API provides information about the operational
progress of a data feed. For example:

`assignment_explanation`::
(string) For started data feeds only, contains messages relating to the
selection of a node.

`datafeed_id`::
(string) A numerical character string that uniquely identifies the data feed.

`node`::
(object) The node upon which the data feed is started. The data feed and
job will be on the same node.
`id`::: The unique identifier of the node. For example,
"0-o0tOoRTwKFZifatTWKNw".
`name`::: The node name. For example, "0-o0tOo".
`ephemeral_id`::: The node ephemeral ID.
`transport_address`::: The host and port where transport connections are
accepted. For example, "127.0.0.1:9300".
`attributes`::: For example, `{"max_running_jobs": "10"}`.

`state`::
(string) The status of the data feed, which can be one of the following values: +
`started`::: The data feed is actively receiving data.
`stopped`::: The data feed is stopped and will not receive data until it is
restarted.
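
The following Python sketch shows one way a client might read these properties. The `describe_datafeed` helper and the sample dictionary are hypothetical: the sample reuses the example values above, the data feed name is invented, and the sketch assumes that `node` may be absent when the data feed is not started. It does not call the API.

```python
def describe_datafeed(stats):
    """Return a one-line summary of a data feed statistics entry."""
    feed_id = stats["datafeed_id"]
    state = stats["state"]
    if state == "started":
        # Assumption: node details may be missing, so fall back to placeholders.
        node = stats.get("node", {})
        return "{} is started on {} ({})".format(
            feed_id,
            node.get("name", "an unknown node"),
            node.get("transport_address", "address unknown"),
        )
    if state == "stopped":
        return "{} is stopped".format(feed_id)
    raise ValueError("unexpected state: " + repr(state))


# Hypothetical sample built from the example values above.
sample = {
    "datafeed_id": "datafeed-it-ops-kpi",
    "state": "started",
    "node": {
        "id": "0-o0tOoRTwKFZifatTWKNw",
        "name": "0-o0tOo",
        "transport_address": "127.0.0.1:9300",
        "attributes": {"max_running_jobs": "10"},
    },
}

print(describe_datafeed(sample))
```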