[DOCS] Adds data frame analytics API and evaluate API resource documentation (#43972)
This PR adds the resource documentation of the data frame analytics APIs and the evaluate API to the ML API doc pool.
parent
5f22370b6b
commit
2171b6b47f
|
@ -0,0 +1,108 @@
|
||||||
|
[role="xpack"]
|
||||||
|
[testenv="platinum"]
|
||||||
|
[[ml-dfanalytics-resources]]
|
||||||
|
=== {dfanalytics-cap} job resources
|
||||||
|
|
||||||
|
{dfanalytics-cap} resources relate to APIs such as <<put-dfanalytics>> and
|
||||||
|
<<get-dfanalytics>>.
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[ml-dfanalytics-properties]]
|
||||||
|
==== {api-definitions-title}
|
||||||
|
|
||||||
|
`analysis`::
|
||||||
|
(object) The type of analysis that is performed on the `source`. For example:
|
||||||
|
`outlier_detection`. For more information, see <<dfanalytics-types>>.
|
||||||
|
|
||||||
|
`analyzed_fields`::
|
||||||
|
(object) You can specify `includes` patterns, `excludes` patterns, or both. If
|
||||||
|
`analyzed_fields` is not set, only the relevant fields will be included. For
|
||||||
|
example, all the numeric fields for {oldetection}.
|
||||||
|
|
||||||
|
`dest`::
|
||||||
|
(object) The destination configuration of the analysis. For more information,
|
||||||
|
see <<dfanalytics-dest-resources>>.
|
||||||
|
|
||||||
|
`id`::
|
||||||
|
(string) The unique identifier for the {dfanalytics-job}. This identifier can
|
||||||
|
contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and
|
||||||
|
underscores. It must start and end with alphanumeric characters. This property
|
||||||
|
is informational; you cannot change the identifier for existing jobs.
|
||||||
|
|
||||||
|
`model_memory_limit`::
|
||||||
|
(string) The approximate maximum amount of memory resources that are
|
||||||
|
permitted for analytical processing. The default value for {dfanalytics-jobs}
|
||||||
|
is `1gb`. If your `elasticsearch.yml` file contains an
|
||||||
|
`xpack.ml.max_model_memory_limit` setting, an error occurs when you try to
|
||||||
|
create {dfanalytics-jobs} that have `model_memory_limit` values greater than
|
||||||
|
that setting. For more information, see <<ml-settings>>.
|
||||||
|
|
||||||
|
`source`::
|
||||||
|
(object) The source configuration, consisting of `index` and optionally a
|
||||||
|
`query`. For more information, see <<dfanalytics-source-resources>>.
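
As a rough illustration of the properties above, the following Python sketch assembles a job resource as a plain dictionary and checks the `id` rules described in this section. The helper name `is_valid_job_id`, the job ID, the index names, and the nested shape of `analysis` are assumptions made for the example. The regular expression is one reading of the stated rules, not the check that {es} itself performs.

```python
import re

# One reading of the documented rule: lowercase alphanumerics, hyphens and
# underscores, starting and ending with an alphanumeric character.
_JOB_ID = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id satisfies the identifier rules sketched above."""
    return _JOB_ID.fullmatch(job_id) is not None


# Hypothetical job resource; the ID and index names are placeholders, and
# the nested shape of "analysis" is assumed for illustration.
job = {
    "id": "loganalytics",
    "source": {"index": ["logdata"]},
    "dest": {"index": "logdata_out"},
    "analysis": {"outlier_detection": {}},
    "model_memory_limit": "1gb",
}

print(is_valid_job_id(job["id"]))
```

Validating the identifier on the client side is optional; it only avoids a round trip for IDs that would be rejected anyway.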
|
||||||
|
|
||||||
|
[[dfanalytics-types]]
|
||||||
|
==== Analysis objects
|
||||||
|
|
||||||
|
{dfanalytics-cap} resources contain `analysis` objects. For example, when you
|
||||||
|
create a {dfanalytics-job}, you must define the type of analysis it performs.
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[oldetection-resources]]
|
||||||
|
===== {oldetection-cap} configuration objects
|
||||||
|
|
||||||
|
An {oldetection} configuration object has the following properties:
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[oldetection-properties]]
|
||||||
|
==== {api-definitions-title}
|
||||||
|
|
||||||
|
`n_neighbors`::
|
||||||
|
(integer) The number of nearest neighbors that each method of
|
||||||
|
{oldetection} uses to calculate its {olscore}. When the value is
|
||||||
|
not set, the system will dynamically detect an appropriate value.
|
||||||
|
|
||||||
|
`method`::
|
||||||
|
(string) Sets the method that {oldetection} uses. If the method is not set,
|
||||||
|
{oldetection} uses an ensemble of different methods and normalises and
|
||||||
|
combines their individual {olscores} to obtain the overall {olscore}.
|
||||||
|
Available methods are `lof`, `ldof`, `distance_kth_nn`, and `distance_knn`.
|
||||||
|
|
||||||
|
`feature_influence_threshold`::
|
||||||
|
(double) The minimum {olscore} that a document needs to have in order to
|
||||||
|
calculate its {fiscore}.
|
||||||
|
Value range: 0-1 (`0.1` by default).
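
The constraints above can be sketched as a small Python helper that builds an {oldetection} configuration object and rejects values outside the documented ranges. The function name `outlier_detection_config` and the choice to omit unset properties are assumptions made for this example, not part of the API.

```python
from typing import Optional

# Methods listed in this section.
METHODS = {"lof", "ldof", "distance_kth_nn", "distance_knn"}


def outlier_detection_config(
    n_neighbors: Optional[int] = None,
    method: Optional[str] = None,
    feature_influence_threshold: Optional[float] = None,
) -> dict:
    """Build an outlier detection object, omitting properties left unset."""
    config = {}
    if n_neighbors is not None:
        config["n_neighbors"] = n_neighbors
    if method is not None:
        if method not in METHODS:
            raise ValueError(f"unknown method: {method}")
        config["method"] = method
    if feature_influence_threshold is not None:
        # Documented value range is 0-1.
        if not 0.0 <= feature_influence_threshold <= 1.0:
            raise ValueError("feature_influence_threshold must be in [0, 1]")
        config["feature_influence_threshold"] = feature_influence_threshold
    return {"outlier_detection": config}
```

Leaving every argument unset yields an empty configuration, which corresponds to letting the system pick `n_neighbors` and use the ensemble of methods.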
|
||||||
|
|
||||||
|
[[dfanalytics-dest-resources]]
|
||||||
|
==== Dest configuration objects
|
||||||
|
|
||||||
|
{dfanalytics-cap} resources contain `dest` objects. For example, when you
|
||||||
|
create a {dfanalytics-job}, you must define its destination.
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[dfanalytics-dest-properties]]
|
||||||
|
==== {api-definitions-title}
|
||||||
|
|
||||||
|
`index`::
|
||||||
|
(string) The name of the index in which to store the results of the
|
||||||
|
{dfanalytics-job}.
|
||||||
|
|
||||||
|
`results_field`::
|
||||||
|
(string) The name of the field in which to store the results of the analysis.
|
||||||
|
The default value is `ml`.
|
||||||
|
|
||||||
|
[[dfanalytics-source-resources]]
|
||||||
|
==== Source configuration objects
|
||||||
|
|
||||||
|
The `source` configuration object has the following properties:
|
||||||
|
|
||||||
|
`index`::
|
||||||
|
(array) An array of index names on which to perform the analysis. It can be a
|
||||||
|
single index or index pattern as well as an array of indices or patterns.
|
||||||
|
|
||||||
|
`query`::
|
||||||
|
(object) The {es} query domain-specific language (DSL). This value
|
||||||
|
corresponds to the query object in an {es} search POST body. All the
|
||||||
|
options that are supported by {es} can be used, as this object is
|
||||||
|
passed verbatim to {es}. By default, this property has the following
|
||||||
|
value: `{"match_all": {"boost": 1}}`.
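
To show how the documented defaults for `source` and `dest` fit together, here is a minimal Python sketch that fills them in explicitly. The helper names `source_config` and `dest_config` are hypothetical; {es} applies these defaults itself, so the sketch only makes the resulting objects visible.

```python
from typing import Optional, Sequence, Union


def source_config(
    index: Union[str, Sequence[str]], query: Optional[dict] = None
) -> dict:
    """Return a source object with the documented default query filled in."""
    # A single index name or pattern is wrapped into a one-element array.
    indices = [index] if isinstance(index, str) else list(index)
    if query is None:
        query = {"match_all": {"boost": 1}}
    return {"index": indices, "query": query}


def dest_config(index: str, results_field: str = "ml") -> dict:
    """Return a dest object; results_field defaults to "ml" as documented."""
    return {"index": index, "results_field": results_field}
```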
|
|
@ -8,15 +8,9 @@
|
||||||
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
|
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
|
||||||
++++
|
++++
|
||||||
|
|
||||||
experimental[]
|
Evaluates the {dfanalytics} for an annotated index.
|
||||||
|
|
||||||
Evaluates the executed analysis on an index that is already annotated with a
|
experimental[]
|
||||||
field that contains the results of the analytics (the `ground truth`) for each
|
|
||||||
{dataframe} row. Evaluation is typically done via calculating a set of metrics
|
|
||||||
that capture various aspects of the quality of the results over the data for
|
|
||||||
which we have the `ground truth`. For different types of analyses different
|
|
||||||
metrics are suitable. This API packages together commonly used metrics for
|
|
||||||
various analyses.
|
|
||||||
|
|
||||||
[[ml-evaluate-dfanalytics-request]]
|
[[ml-evaluate-dfanalytics-request]]
|
||||||
==== {api-request-title}
|
==== {api-request-title}
|
||||||
|
@ -30,6 +24,19 @@ various analyses.
|
||||||
information, see {stack-ov}/security-privileges.html[Security privileges] and
|
information, see {stack-ov}/security-privileges.html[Security privileges] and
|
||||||
{stack-ov}/built-in-roles.html[Built-in roles].
|
{stack-ov}/built-in-roles.html[Built-in roles].
|
||||||
|
|
||||||
|
[[ml-evaluate-dfanalytics-desc]]
|
||||||
|
==== {api-description-title}
|
||||||
|
|
||||||
|
This API evaluates the executed analysis on an index that is already annotated
|
||||||
|
with a field that contains the results of the analytics (the `ground truth`)
|
||||||
|
for each {dataframe} row.
|
||||||
|
|
||||||
|
Evaluation is typically done by calculating a set of metrics that capture various aspects of the quality of the results over the data for which you have the
|
||||||
|
`ground truth`.
|
||||||
|
|
||||||
|
Different types of analyses call for different metrics. This API
|
||||||
|
packages together commonly used metrics for various analyses.
|
||||||
|
|
||||||
[[ml-evaluate-dfanalytics-request-body]]
|
[[ml-evaluate-dfanalytics-request-body]]
|
||||||
==== {api-request-body-title}
|
==== {api-request-body-title}
|
||||||
|
|
||||||
|
@ -38,8 +45,22 @@ information, see {stack-ov}/security-privileges.html[Security privileges] and
|
||||||
|
|
||||||
`evaluation` (Required)::
|
`evaluation` (Required)::
|
||||||
(object) Defines the type of evaluation you want to perform. For example:
|
(object) Defines the type of evaluation you want to perform. For example:
|
||||||
`binary_soft_classification`.
|
`binary_soft_classification`. See <<ml-evaluate-dfanalytics-resources>>.
|
||||||
See Evaluate API resources.
|
|
||||||
|
[[ml-evaluate-dfanalytics-results]]
|
||||||
|
==== {api-response-body-title}
|
||||||
|
|
||||||
|
`binary_soft_classification`::
|
||||||
|
(object) If you chose to do binary soft classification, the API returns the
|
||||||
|
following evaluation metrics:
|
||||||
|
|
||||||
|
`auc_roc`::: TBD
|
||||||
|
|
||||||
|
`confusion_matrix`::: TBD
|
||||||
|
|
||||||
|
`precision`::: TBD
|
||||||
|
|
||||||
|
`recall`::: TBD
|
||||||
|
|
||||||
[[ml-evaluate-dfanalytics-example]]
|
[[ml-evaluate-dfanalytics-example]]
|
||||||
==== {api-examples-title}
|
==== {api-examples-title}
|
||||||
|
|
|
@ -0,0 +1,63 @@
|
||||||
|
[role="xpack"]
|
||||||
|
[testenv="platinum"]
|
||||||
|
[[ml-evaluate-dfanalytics-resources]]
|
||||||
|
=== {dfanalytics-cap} evaluation resources
|
||||||
|
|
||||||
|
Evaluation configuration objects relate to the <<evaluate-dfanalytics>>.
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[ml-evaluate-dfanalytics-properties]]
|
||||||
|
==== {api-definitions-title}
|
||||||
|
|
||||||
|
`evaluation`::
|
||||||
|
(object) Defines the type of evaluation you want to perform. The value of this
|
||||||
|
object can be different depending on the type of evaluation you want to
|
||||||
|
perform. For example, it can contain <<binary-sc-resources>>.
|
||||||
|
|
||||||
|
[[binary-sc-resources]]
|
||||||
|
==== Binary soft classification configuration objects
|
||||||
|
|
||||||
|
Binary soft classification evaluates the results of an analysis which outputs
|
||||||
|
the probability that each {dataframe} row belongs to a certain class. For
|
||||||
|
example, in the context of outlier detection, the analysis outputs the
|
||||||
|
probability that each row is an outlier.
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[binary-sc-resources-properties]]
|
||||||
|
===== {api-definitions-title}
|
||||||
|
|
||||||
|
`actual_field`::
|
||||||
|
(string) The field of the `index` which contains the `ground
|
||||||
|
truth`. The data type of this field can be boolean or integer. If the data
|
||||||
|
type is integer, the value has to be either `0` (false) or `1` (true).
|
||||||
|
|
||||||
|
`predicted_probability_field`::
|
||||||
|
(string) The field of the `index` that defines the probability of whether the
|
||||||
|
item belongs to the class in question or not. It's the field that contains the
|
||||||
|
results of the analysis.
|
||||||
|
|
||||||
|
`metrics`::
|
||||||
|
(object) Specifies the metrics that are used for the evaluation. Available
|
||||||
|
metrics:
|
||||||
|
|
||||||
|
`auc_roc`::
|
||||||
|
(object) The AUC ROC (area under the curve of the receiver operating
|
||||||
|
characteristic) score and optionally the curve.
|
||||||
|
Default value is `{"includes_curve": false}`.
|
||||||
|
|
||||||
|
`precision`::
|
||||||
|
(object) Sets the thresholds of the {olscore} at which the metric
|
||||||
|
is calculated.
|
||||||
|
Default value is `{"at": [0.25, 0.50, 0.75]}`.
|
||||||
|
|
||||||
|
`recall`::
|
||||||
|
(object) Sets the thresholds of the {olscore} at which the metric
|
||||||
|
is calculated.
|
||||||
|
Default value is `{"at": [0.25, 0.50, 0.75]}`.
|
||||||
|
|
||||||
|
`confusion_matrix`::
|
||||||
|
(object) Sets the thresholds of the {olscore} at which the metrics
|
||||||
|
(`tp` - true positive, `fp` - false positive, `tn` - true negative, `fn` -
|
||||||
|
false negative) are calculated.
|
||||||
|
Default value is `{"at": [0.25, 0.50, 0.75]}`.
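
To make the threshold-based metrics more concrete, here is a minimal Python sketch that computes a confusion matrix, precision, and recall at the default thresholds for a handful of made-up rows. It assumes that a row counts as a predicted positive when its score is greater than or equal to the threshold. That comparison rule, the function names, and the sample data are illustrative assumptions rather than a description of how {es} computes these values.

```python
def confusion_at(actual, predicted_probability, threshold):
    """Count tp, fp, tn, fn, treating probability >= threshold as positive.

    The >= comparison is an assumption made for this sketch.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, prob in zip(actual, predicted_probability):
        positive = prob >= threshold
        if positive and truth:
            counts["tp"] += 1
        elif positive:
            counts["fp"] += 1
        elif truth:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision(c):
    """tp / (tp + fp), or None when nothing was predicted positive."""
    denom = c["tp"] + c["fp"]
    return c["tp"] / denom if denom else None


def recall(c):
    """tp / (tp + fn), or None when there are no actual positives."""
    denom = c["tp"] + c["fn"]
    return c["tp"] / denom if denom else None


# Made-up ground truth (1 = outlier) and outlier scores for five rows.
actual = [1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2, 0.1]

for at in (0.25, 0.50, 0.75):
    c = confusion_at(actual, scores, at)
    print(at, c, precision(c), recall(c))
```

With this sample data, raising the threshold trades recall for precision, which is why the metrics are reported at several thresholds rather than one.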
|
||||||
|
|
|
@ -45,6 +45,10 @@ You can get information for all {dfanalytics-jobs} by using _all, by specifying
|
||||||
(string) Identifier for the {dfanalytics-job}. If you do not specify one of
|
(string) Identifier for the {dfanalytics-job}. If you do not specify one of
|
||||||
these options, the API returns information for the first hundred
|
these options, the API returns information for the first hundred
|
||||||
{dfanalytics-jobs}.
|
{dfanalytics-jobs}.
|
||||||
|
|
||||||
|
`allow_no_match` (Optional)::
|
||||||
|
(boolean) If `false` and the `data_frame_analytics_id` does not match any
|
||||||
|
{dfanalytics-job}, an error is returned. The default value is `true`.
|
||||||
|
|
||||||
[[ml-get-dfanalytics-query-params]]
|
[[ml-get-dfanalytics-query-params]]
|
||||||
==== {api-query-parms-title}
|
==== {api-query-parms-title}
|
||||||
|
@ -60,6 +64,13 @@ You can get information for all {dfanalytics-jobs} by using _all, by specifying
|
||||||
`size` (Optional)::
|
`size` (Optional)::
|
||||||
(integer) Specifies the maximum number of {dfanalytics-jobs} to obtain. The
|
(integer) Specifies the maximum number of {dfanalytics-jobs} to obtain. The
|
||||||
default value is `100`.
|
default value is `100`.
|
||||||
|
|
||||||
|
[[ml-get-dfanalytics-results]]
|
||||||
|
==== {api-response-body-title}
|
||||||
|
|
||||||
|
`data_frame_analytics`::
|
||||||
|
(array) An array of {dfanalytics-job} resources. For more information, see
|
||||||
|
<<ml-dfanalytics-resources>>.
|
||||||
|
|
||||||
[[ml-get-dfanalytics-example]]
|
[[ml-get-dfanalytics-example]]
|
||||||
==== {api-examples-title}
|
==== {api-examples-title}
|
||||||
|
|
|
@ -56,24 +56,23 @@ and mappings.
|
||||||
|
|
||||||
[[ml-put-dfanalytics-request-body]]
|
[[ml-put-dfanalytics-request-body]]
|
||||||
==== {api-request-body-title}
|
==== {api-request-body-title}
|
||||||
|
|
||||||
`analysis` (Required)::
|
`analysis` (Required)::
|
||||||
(object) Defines the type of {dfanalytics} you want to perform on your source
|
(object) Defines the type of {dfanalytics} you want to perform on your source
|
||||||
index. For example: `outlier_detection`.
|
index. For example: `outlier_detection`. See <<dfanalytics-types>>.
|
||||||
See {oldetection} resources.
|
|
||||||
|
|
||||||
`analyzed_fields` (Optional)::
|
`analyzed_fields` (Optional)::
|
||||||
(object) You can specify both `includes` and/or `excludes` patterns. If
|
(object) You can specify both `includes` and/or `excludes` patterns. If
|
||||||
`analyzed_fields` is not set, only the relevant fileds will be included. For
|
`analyzed_fields` is not set, only the relevant fields will be included. For
|
||||||
example all the numeric fields for {oldetection}.
|
example, all the numeric fields for {oldetection}.
|
||||||
|
|
||||||
`dest` (Required)::
|
`dest` (Required)::
|
||||||
(object) The destination configuration, consisting of `index` and optionally
|
(object) The destination configuration, consisting of `index` and optionally
|
||||||
`results_field` (`ml` by default).
|
`results_field` (`ml` by default). See <<dfanalytics-dest-resources>>.
|
||||||
|
|
||||||
`source` (Required)::
|
`source` (Required)::
|
||||||
(object) The source configuration, consisting of `index` and optionally a
|
(object) The source configuration, consisting of `index` and optionally a
|
||||||
`query`.
|
`query`. See <<dfanalytics-source-resources>>.
|
||||||
|
|
||||||
[[ml-put-dfanalytics-example]]
|
[[ml-put-dfanalytics-example]]
|
||||||
==== {api-examples-title}
|
==== {api-examples-title}
|
||||||
|
|
|
@ -8,7 +8,9 @@ These resource definitions are used in APIs related to {ml-features} and
|
||||||
* <<ml-calendar-resource,Calendars>>
|
* <<ml-calendar-resource,Calendars>>
|
||||||
* <<ml-datafeed-resource,{dfeeds-cap}>>
|
* <<ml-datafeed-resource,{dfeeds-cap}>>
|
||||||
* <<ml-datafeed-counts,{dfeed-cap} counts>>
|
* <<ml-datafeed-counts,{dfeed-cap} counts>>
|
||||||
|
* <<ml-dfanalytics-resources,{dfanalytics-cap}>>
|
||||||
* <<data-frame-transform-resource,{dataframe-transforms-cap}>>
|
* <<data-frame-transform-resource,{dataframe-transforms-cap}>>
|
||||||
|
* <<ml-evaluate-dfanalytics-resources,Evaluate {dfanalytics}>>
|
||||||
* <<ml-filter-resource,Filters>>
|
* <<ml-filter-resource,Filters>>
|
||||||
* <<ml-job-resource,Jobs>>
|
* <<ml-job-resource,Jobs>>
|
||||||
* <<ml-jobstats,Job statistics>>
|
* <<ml-jobstats,Job statistics>>
|
||||||
|
@ -19,7 +21,9 @@ These resource definitions are used in APIs related to {ml-features} and
|
||||||
|
|
||||||
include::{es-repo-dir}/ml/apis/calendarresource.asciidoc[]
|
include::{es-repo-dir}/ml/apis/calendarresource.asciidoc[]
|
||||||
include::{es-repo-dir}/ml/apis/datafeedresource.asciidoc[]
|
include::{es-repo-dir}/ml/apis/datafeedresource.asciidoc[]
|
||||||
|
include::{es-repo-dir}/ml/apis/dfanalyticsresources.asciidoc[]
|
||||||
include::{es-repo-dir}/data-frames/apis/transformresource.asciidoc[]
|
include::{es-repo-dir}/data-frames/apis/transformresource.asciidoc[]
|
||||||
|
include::{es-repo-dir}/ml/apis/evaluateresources.asciidoc[]
|
||||||
include::{es-repo-dir}/ml/apis/filterresource.asciidoc[]
|
include::{es-repo-dir}/ml/apis/filterresource.asciidoc[]
|
||||||
include::{es-repo-dir}/ml/apis/jobresource.asciidoc[]
|
include::{es-repo-dir}/ml/apis/jobresource.asciidoc[]
|
||||||
include::{es-repo-dir}/ml/apis/jobcounts.asciidoc[]
|
include::{es-repo-dir}/ml/apis/jobcounts.asciidoc[]
|
||||||
|
|