//lcawley Verified example output 2017-04-11
[[ml-datafeed-resource]]
==== {dfeed-cap} Resources

A {dfeed} resource has the following properties:

`aggregations`::
  (object) If set, the {dfeed} performs aggregation searches.
  For syntax information, see {ref}/search-aggregations.html[Aggregations].
  Support for aggregations is limited and should only be used with
  low cardinality data. For example:
  `{"@timestamp": {"histogram": {"field": "@timestamp",
  "interval": 30000,"offset": 0,"order": {"_key": "asc"},"keyed": false,
  "min_doc_count": 0}, "aggregations": {"events_per_min": {"sum": {
  "field": "events_per_min"}}}}}`.

//TBD link to a Working with aggregations page
`chunking_config`::
  (object) Specifies how data searches are split into time chunks.
  See <<ml-datafeed-chunking-config>>.
  For example: {"mode": "manual", "time_span": "3h"}.

`datafeed_id`::
  (string) A numerical character string that uniquely identifies the {dfeed}.

`frequency`::
  (time units) The interval at which scheduled queries are made while the
  {dfeed} runs in real time. The default value is either the bucket span for
  short bucket spans, or, for longer bucket spans, a sensible fraction of the
  bucket span. For example: "150s".

`indices`::
  (array) An array of index names. For example: ["it_ops_metrics"]

`job_id`::
  (string) The unique identifier for the job to which the {dfeed} sends data.

`query`::
  (object) The {es} query domain-specific language (DSL). This value
  corresponds to the query object in an {es} search POST body. All the
  options that are supported by {es} can be used, as this object is
  passed verbatim to {es}. By default, this property has the following
  value: `{"match_all": {"boost": 1}}`.

`query_delay`::
  (time units) The number of seconds behind real time that data is queried. For
  example, if data from 10:04 a.m. might not be searchable in {es} until
  10:06 a.m., set this property to 120 seconds. The default value is `60s`.

`scroll_size`::
  (unsigned integer) The `size` parameter that is used in {es} searches.
  The default value is `1000`.

`types`::
  (array) A list of types to search for within the specified indices.
  For example: ["network","sql","kpi"].

[[ml-datafeed-chunking-config]]
===== Chunking Configuration Objects

{dfeeds-cap} might be required to search over long time periods, for several
months or years. This search is split into time chunks to ensure that the load
on {es} is managed. Chunking configuration controls how the size of these time
chunks is calculated and is an advanced configuration option.

A chunking configuration object has the following properties:

`mode`::
  There are three available modes: +
  `auto`::: The chunk size will be dynamically calculated. This is the default
  and recommended value.
  `manual`::: Chunking will be applied according to the specified `time_span`.
  `off`::: No chunking will be applied.

`time_span`::
  (time units) The time span that each search will be querying.
  This setting is only applicable when the mode is set to `manual`.
  For example: "3h".

[float]
[[ml-datafeed-counts]]
==== {dfeed-cap} Counts

The get {dfeed} statistics API provides information about the operational
progress of a {dfeed}. For example:

`assignment_explanation`::
  (string) For started {dfeeds} only, contains messages relating to the
  selection of a node.

`datafeed_id`::
  (string) A numerical character string that uniquely identifies the {dfeed}.

`node`::
  (object) The node upon which the {dfeed} is started. The {dfeed} and job will
  be on the same node.
  `id`::: The unique identifier of the node. For example,
  "0-o0tOoRTwKFZifatTWKNw".
  `name`::: The node name. For example, "0-o0tOo".
  `ephemeral_id`::: The node ephemeral ID.
  `transport_address`::: The host and port where transport HTTP connections are
  accepted. For example, "127.0.0.1:9300".
  `attributes`::: For example, {"max_running_jobs": "10"}.

`state`::
  (string) The status of the {dfeed}, which can be one of the following values: +
  `started`::: The {dfeed} is actively receiving data.
  `stopped`::: The {dfeed} is stopped and will not receive data until it is
  restarted.
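
To show how the {dfeed} resource properties and the chunking configuration rules described earlier fit together, here is a minimal Python sketch that assembles a {dfeed} resource as a dictionary. It is an illustration only, not part of any client library: the function names, `datafeed-it-ops`, and `it-ops-job` are hypothetical. Only the property names, the three chunking modes, and the default values come from the descriptions above. The sketch also assumes that a `manual` mode without a `time_span`, or a `time_span` combined with another mode, should be treated as an error; whether {es} itself is this strict is not stated here.

```python
import json

# The three chunking modes listed under "Chunking Configuration Objects".
VALID_CHUNKING_MODES = ("auto", "manual", "off")


def validate_chunking_config(chunking_config):
    """Check a chunking configuration object (hypothetical helper).

    Assumption: a missing time_span in manual mode, or a time_span given
    with any other mode, is rejected. The text only says that time_span
    is applicable when the mode is `manual`.
    """
    mode = chunking_config.get("mode", "auto")  # `auto` is the documented default
    if mode not in VALID_CHUNKING_MODES:
        raise ValueError(f"unknown chunking mode: {mode!r}")
    has_time_span = "time_span" in chunking_config
    if mode == "manual" and not has_time_span:
        raise ValueError("manual mode needs a time_span, for example '3h'")
    if mode != "manual" and has_time_span:
        raise ValueError("time_span is only applicable when mode is 'manual'")
    validated = dict(chunking_config)
    validated["mode"] = mode
    return validated


def build_datafeed(datafeed_id, job_id, indices, types, chunking_config=None):
    """Assemble a datafeed resource dictionary (hypothetical helper)."""
    datafeed = {
        "datafeed_id": datafeed_id,
        "job_id": job_id,
        "indices": list(indices),
        "types": list(types),
        "query": {"match_all": {"boost": 1}},  # documented default query
        "query_delay": "60s",                  # documented default
        "scroll_size": 1000,                   # documented default
    }
    if chunking_config is not None:
        datafeed["chunking_config"] = validate_chunking_config(chunking_config)
    return datafeed


if __name__ == "__main__":
    feed = build_datafeed(
        "datafeed-it-ops",            # hypothetical identifier
        "it-ops-job",                 # hypothetical job identifier
        ["it_ops_metrics"],
        ["network", "sql", "kpi"],
        {"mode": "manual", "time_span": "3h"},
    )
    print(json.dumps(feed, indent=2))
```

Validating the chunking configuration before building the resource keeps the `mode` and `time_span` rule in one place, so a mismatch is reported before any request is sent.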