OpenSearch/docs/reference/ml/df-analytics/apis/dfanalyticsresources.asciidoc

[role="xpack"]
[testenv="platinum"]
[[ml-dfanalytics-resources]]
=== {dfanalytics-cap} job resources

{dfanalytics-cap} resources relate to APIs such as <<put-dfanalytics>> and
<<get-dfanalytics>>.

[discrete]
[[ml-dfanalytics-properties]]
==== {api-definitions-title}

`analysis`::
  (object) The type of analysis that is performed on the `source`. For example: 
  `outlier_detection`. For more information, see <<dfanalytics-types>>.
  
`analyzed_fields`::
  (object) You can specify `includes` patterns, `excludes` patterns, or both. 
  If `analyzed_fields` is not set, only the relevant fields are included; for 
  example, all the numeric fields for {oldetection}.
  
  `analyzed_fields.includes`:::
    (array) An array of strings that defines the fields that will be included in 
    the analysis.
    
  `analyzed_fields.excludes`:::
    (array) An array of strings that defines the fields that will be excluded 
    from the analysis.
  

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
    }
  },
  "analyzed_fields": {
    "includes": [ "request.bytes", "response.counts.error" ],
    "excludes": [ "source.geo" ]
  }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:setup_logdata]

`description`::
  (Optional, string) A description of the job.

`dest`::
  (object) The destination configuration of the analysis.
  
  `index`:::
    (Required, string) Defines the _destination index_ to store the results of 
    the {dfanalytics-job}.
  
  `results_field`:::
    (Optional, string) Defines the name of the field in which to store the 
    results of the analysis. Defaults to `ml`.

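As a minimal sketch of the `dest` object, the following request stores the 
results under a custom field instead of the default `ml`. The job ID and the 
`results_field` value are hypothetical names chosen for illustration; the 
indices are reused from the `analyzed_fields` example.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics-custom-dest <1>
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out",
    "results_field": "outlier_results" <2>
  },
  "analysis": {
    "outlier_detection": {
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical job ID.
<2> Hypothetical field name. If `results_field` is omitted, the results are 
stored in `ml`.
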
`id`::
  (string) The unique identifier for the {dfanalytics-job}. This identifier can 
  contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and 
  underscores. It must start and end with alphanumeric characters. This property 
  is informational; you cannot change the identifier for existing jobs.
  
`model_memory_limit`::
  (string) The approximate maximum amount of memory resources that are 
  permitted for analytical processing. The default value for {dfanalytics-jobs} 
  is `1gb`. If your `elasticsearch.yml` file contains an 
  `xpack.ml.max_model_memory_limit` setting, an error occurs when you try to 
  create {dfanalytics-jobs} that have `model_memory_limit` values greater than 
  that setting. For more information, see <<ml-settings>>.

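The following sketch shows where `model_memory_limit` sits in a job 
configuration. The job ID and the `500mb` value are arbitrary, illustrative 
choices, not recommendations. If `xpack.ml.max_model_memory_limit` is set to a 
lower value on your cluster, a request like this one is expected to fail as 
described above.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics-small <1>
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
    }
  },
  "model_memory_limit": "500mb" <2>
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical job ID.
<2> Illustrative value that is lower than the `1gb` default.
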
`source`::
  (object) The source configuration, consisting of an `index` and optionally a 
  `query` object.
  
  `index`:::
    (Required, string or array) Index or indices on which to perform the 
    analysis. It can be a single index or index pattern as well as an array of 
    indices or patterns.
    
  `query`:::
    (Optional, object) The {es} query domain-specific language 
    (<<query-dsl,DSL>>). This value corresponds to the query object in an {es} 
    search POST body. All the options that are supported by {es} can be used, 
    as this object is passed verbatim to {es}. By default, this property has 
    the following value: `{"match_all": {}}`.

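As a sketch of a `source` object that uses both properties, the following 
request reads from an index and an index pattern and restricts the analysis to 
documents that match a `range` query. The job ID, the `logdata-archive-*` 
pattern, and the threshold of `1024` are assumptions made for illustration; the 
`request.bytes` field is taken from the `analyzed_fields` example.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics-filtered <1>
{
  "source": {
    "index": [ "logdata", "logdata-archive-*" ], <2>
    "query": { <3>
      "range": {
        "request.bytes": {
          "gte": 1024
        }
      }
    }
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical job ID.
<2> An array that mixes a single index with a hypothetical index pattern.
<3> Any query that is valid in an {es} search body can be used here. If `query` 
is omitted, `{"match_all": {}}` is used.
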
[[dfanalytics-types]]
==== Analysis objects

{dfanalytics-cap} resources contain `analysis` objects. For example, when you
create a {dfanalytics-job}, you must define the type of analysis it performs. 
Currently, `outlier_detection` is the only available type of analysis. Other 
types, for example `regression`, will be added.
  
[discrete]
[[oldetection-resources]]
==== {oldetection-cap} configuration objects 

An {oldetection} configuration object has the following properties:

`n_neighbors`::
  (integer) The number of nearest neighbors that each method of {oldetection} 
  uses to calculate its {olscore}. When the value is not set, the system 
  dynamically detects an appropriate value.

`method`::
  (string) Sets the method that {oldetection} uses. If the method is not set, 
  {oldetection} uses an ensemble of different methods, then normalizes and 
  combines their individual {olscores} to obtain the overall {olscore}. We 
  recommend using the ensemble method. Available methods are `lof`, `ldof`, 
  `distance_kth_nn`, and `distance_knn`.

`feature_influence_threshold`:: 
  (double) The minimum {olscore} that a document must have for its {fiscore} 
  to be calculated. 
  Value range: 0-1 (`0.1` by default).
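
The following sketch sets all three {oldetection} properties explicitly. The 
job ID and the property values are arbitrary and only illustrate the syntax. 
Leaving `method` and `n_neighbors` unset, so that the ensemble and the 
dynamically detected neighbor count are used, remains the recommended 
configuration.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics-lof <1>
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
      "n_neighbors": 20, <2>
      "method": "lof", <3>
      "feature_influence_threshold": 0.2 <4>
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical job ID.
<2> Illustrative value. Omit the property to let the system detect a value.
<3> Uses a single method instead of the ensemble.
<4> Illustrative value within the 0-1 range. The default is `0.1`.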