//lcawley Verified example output 2017-04-11
[[ml-datafeed-resource]]
=== {dfeed-cap} Resources

A {dfeed} resource has the following properties:

`aggregations`::
  (object) If set, the {dfeed} performs aggregation searches.
  Aggregation support is limited; use aggregations only with low cardinality
  data. For more information, see <>.

`chunking_config`::
  (object) Specifies how data searches are split into time chunks.
  See <<ml-datafeed-chunking-config>>.
  For example: `{"mode": "manual", "time_span": "3h"}`

`datafeed_id`::
  (string) A numerical character string that uniquely identifies the {dfeed}.

`frequency`::
  (time units) The interval at which scheduled queries are made while the
  {dfeed} runs in real time. The default value is either the bucket span for
  short bucket spans, or, for longer bucket spans, a sensible fraction of the
  bucket span. For example: `150s`.

`indices`::
  (array) An array of index names. For example: `["it_ops_metrics"]`

`job_id`::
  (string) The unique identifier for the job to which the {dfeed} sends data.

`query`::
  (object) The {es} query domain-specific language (DSL). This value
  corresponds to the query object in an {es} search POST body. All the
  options that are supported by {es} can be used, as this object is
  passed verbatim to {es}. By default, this property has the following
  value: `{"match_all": {"boost": 1}}`.

`query_delay`::
  (time units) The number of seconds behind real time that data is queried. For
  example, if data from 10:04 a.m. might not be searchable in {es} until
  10:06 a.m., set this property to 120 seconds. The default value is `60s`.

`script_fields`::
  (object) Specifies scripts that evaluate custom expressions and return
  script fields to the {dfeed}.
  The detector configuration objects in a job can contain functions that use
  these script fields.
  For more information, see {ref}/search-request-script-fields.html[Script Fields].
  For example:
+
--
[source,js]
----------------------------------
{
  "script_fields": {
    "total_error_count": {
      "script": {
        "lang": "painless",
        "source": "doc['error_count'].value + doc['aborted_count'].value"
      }
    }
  }
}
----------------------------------
--

`scroll_size`::
  (unsigned integer) The `size` parameter that is used in {es} searches.
  The default value is `1000`.

`types`::
  (array) A list of types to search for within the specified indices.
  For example: `["network","sql","kpi"]`.

[[ml-datafeed-chunking-config]]
==== Chunking Configuration Objects

{dfeeds-cap} might be required to search over long time periods, for several months
or years. This search is split into time chunks to manage the load on {es}.
Chunking configuration controls how the size of these time chunks is calculated
and is an advanced configuration option.

A chunking configuration object has the following properties:

`mode`::
  There are three available modes: +
  `auto`::: The chunk size will be dynamically calculated. This is the default
  and recommended value.
  `manual`::: Chunking will be applied according to the specified `time_span`.
  `off`::: No chunking will be applied.

`time_span`::
  (time units) The time span that each search will be querying.
  This setting is only applicable when the mode is set to `manual`.
  For example: `3h`.

[float]
[[ml-datafeed-counts]]
==== {dfeed-cap} Counts

The get {dfeed} statistics API provides information about the operational
progress of a {dfeed}. For example:

`assignment_explanation`::
  (string) For started {dfeeds} only, contains messages relating to the
  selection of a node.

`datafeed_id`::
  (string) A numerical character string that uniquely identifies the {dfeed}.

`node`::
  (object) The node upon which the {dfeed} is started. The {dfeed} and job will
  be on the same node.
  `id`::: The unique identifier of the node. For example,
  `0-o0tOoRTwKFZifatTWKNw`.
  `name`::: The node name. For example, `0-o0tOo`.
  `ephemeral_id`::: The node ephemeral ID.
  `transport_address`::: The host and port where transport HTTP connections are
  accepted. For example, `127.0.0.1:9300`.
  `attributes`::: For example, `{"max_running_jobs": "10"}`.

`state`::
  (string) The status of the {dfeed}, which can be one of the following values: +
  `started`::: The {dfeed} is actively receiving data.
  `stopped`::: The {dfeed} is stopped and will not receive data until it is
  restarted.
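
As a rough illustration of how these properties fit together, the statistics
for a started {dfeed} might resemble the following sketch. It is assembled from
the example values listed in this section and is not verified API output. The
`datafeed_id`, `ephemeral_id`, and `assignment_explanation` values are
hypothetical placeholders, and an actual response may be wrapped in additional
structure or contain other fields.

[source,js]
----------------------------------
{
  "datafeed_id": "<hypothetical-datafeed-id>",
  "state": "started",
  "node": {
    "id": "0-o0tOoRTwKFZifatTWKNw",
    "name": "0-o0tOo",
    "ephemeral_id": "<hypothetical-ephemeral-id>",
    "transport_address": "127.0.0.1:9300",
    "attributes": {
      "max_running_jobs": "10"
    }
  },
  "assignment_explanation": "<hypothetical-message>"
}
----------------------------------

According to the descriptions in this section, the `node` object and the
`assignment_explanation` message apply to a {dfeed} that has been started, so a
`stopped` {dfeed} would presumably report little more than its `datafeed_id`
and `state`.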