[DOCS] Collapses nested objects in data frame analytics APIs (#54472) (#54526)

Lisa Cawley 2020-03-31 12:51:04 -07:00 committed by GitHub
parent c9db2de41d
commit 922ec8e961
5 changed files with 228 additions and 172 deletions


@ -57,14 +57,13 @@ they are not included in the explanation.
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-data-frame-analytics]
[[ml-explain-dfanalytics-request-body]]
==== {api-request-body-title}
A {dataframe-analytics-config} as described in <<put-dfanalytics>>.
Note that `id` and `dest` don't need to be provided in the context of this API.
[role="child_attributes"]
[[ml-explain-dfanalytics-results]]
==== {api-response-body-title}


@ -70,7 +70,7 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=from]
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=size]
[role="child_attributes"]
[[ml-get-dfanalytics-results]]
==== {api-response-body-title}


@ -78,7 +78,7 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=size]
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=tags]
[role="child_attributes"]
[[ml-get-inference-results]]
==== {api-response-body-title}


@ -80,7 +80,7 @@ using 4-fold cross validation.
(Required, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-data-frame-analytics-define]
[role="child_attributes"]
[[ml-put-dfanalytics-request-body]]
==== {api-request-body-title}
@ -88,184 +88,197 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-data-frame-analytics-define]
(Optional, boolean)
include::{docdir}/ml/ml-shared.asciidoc[tag=allow-lazy-start]
//Begin analysis
`analysis`::
(Required, object)
The analysis configuration, which contains the information necessary to perform
one of the following types of analysis: {classification}, {oldetection}, or
{regression}.
`analysis`.`classification`:::
+
.Properties of `analysis`
[%collapsible%open]
====
//Begin classification
`classification`:::
(Required^*^, object)
The configuration information necessary to perform
{ml-docs}/dfa-classification.html[{classification}].
+
--
TIP: Advanced parameters are for fine-tuning {classanalysis}. They are set
automatically by <<ml-hyperparam-optimization,hyperparameter optimization>>
to give minimum validation error. It is highly recommended to use the default
values unless you fully understand the function of these parameters.
--
`analysis`.`classification`.`dependent_variable`::::
+
.Properties of `classification`
[%collapsible%open]
=====
`dependent_variable`::::
(Required, string)
+
--
include::{docdir}/ml/ml-shared.asciidoc[tag=dependent-variable]
+
The data type of the field must be numeric (`integer`, `short`, `long`, `byte`),
categorical (`ip`, `keyword`, `text`), or boolean.
--
`analysis`.`classification`.`eta`::::
`eta`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=eta]
`analysis`.`classification`.`feature_bag_fraction`::::
`feature_bag_fraction`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=feature-bag-fraction]
`analysis`.`classification`.`max_trees`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=max-trees]
`analysis`.`classification`.`gamma`::::
`gamma`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=gamma]
`analysis`.`classification`.`lambda`::::
`lambda`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=lambda]
`analysis`.`classification`.`class_assignment_objective`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=class-assignment-objective]
`max_trees`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=max-trees]
`analysis`.`classification`.`num_top_classes`::::
`num_top_classes`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=num-top-classes]
`analysis`.`classification`.`prediction_field_name`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=prediction-field-name]
`analysis`.`classification`.`randomize_seed`::::
(Optional, long)
include::{docdir}/ml/ml-shared.asciidoc[tag=randomize-seed]
`analysis`.`classification`.`num_top_feature_importance_values`::::
`num_top_feature_importance_values`::::
(Optional, integer)
Advanced configuration option. Specifies the maximum number of
{ml-docs}/dfa-classification.html#dfa-classification-feature-importance[feature
importance] values per document to return. By default, it is zero and no feature importance
calculation occurs.
`analysis`.`classification`.`training_percent`::::
`prediction_field_name`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=prediction-field-name]
`randomize_seed`::::
(Optional, long)
include::{docdir}/ml/ml-shared.asciidoc[tag=randomize-seed]
`training_percent`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=training-percent]
`analysis`.`outlier_detection`:::
//End classification
=====
//Begin outlier_detection
`outlier_detection`:::
(Required^*^, object)
The configuration information necessary to perform
{ml-docs}/dfa-outlier-detection.html[{oldetection}]:
`analysis`.`outlier_detection`.`compute_feature_influence`::::
+
.Properties of `outlier_detection`
[%collapsible%open]
=====
`compute_feature_influence`::::
(Optional, boolean)
include::{docdir}/ml/ml-shared.asciidoc[tag=compute-feature-influence]
`analysis`.`outlier_detection`.`feature_influence_threshold`::::
`feature_influence_threshold`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=feature-influence-threshold]
`analysis`.`outlier_detection`.`method`::::
`method`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=method]
`analysis`.`outlier_detection`.`n_neighbors`::::
`n_neighbors`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=n-neighbors]
`analysis`.`outlier_detection`.`outlier_fraction`::::
`outlier_fraction`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=outlier-fraction]
`analysis`.`outlier_detection`.`standardization_enabled`::::
`standardization_enabled`::::
(Optional, boolean)
include::{docdir}/ml/ml-shared.asciidoc[tag=standardization-enabled]
`analysis`.`regression`:::
//End outlier_detection
=====
//Begin regression
`regression`:::
(Required^*^, object)
The configuration information necessary to perform
{ml-docs}/dfa-regression.html[{regression}].
+
--
TIP: Advanced parameters are for fine-tuning {reganalysis}. They are set
automatically by <<ml-hyperparam-optimization,hyperparameter optimization>>
to give minimum validation error. It is highly recommended to use the default
values unless you fully understand the function of these parameters.
--
`analysis`.`regression`.`dependent_variable`::::
+
.Properties of `regression`
[%collapsible%open]
=====
`dependent_variable`::::
(Required, string)
+
--
include::{docdir}/ml/ml-shared.asciidoc[tag=dependent-variable]
+
The data type of the field must be numeric.
--
`analysis`.`regression`.`eta`::::
`eta`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=eta]
`analysis`.`regression`.`feature_bag_fraction`::::
`feature_bag_fraction`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=feature-bag-fraction]
`analysis`.`regression`.`max_trees`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=max-trees]
`analysis`.`regression`.`gamma`::::
`gamma`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=gamma]
`analysis`.`regression`.`lambda`::::
`lambda`::::
(Optional, double)
include::{docdir}/ml/ml-shared.asciidoc[tag=lambda]
`analysis`.`regression`.`prediction_field_name`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=prediction-field-name]
`max_trees`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=max-trees]
`analysis`.`regression`.`num_top_feature_importance_values`::::
`num_top_feature_importance_values`::::
(Optional, integer)
Advanced configuration option. Specifies the maximum number of
{ml-docs}/dfa-regression.html#dfa-regression-feature-importance[feature importance]
values per document to return. By default, it is zero and no feature importance calculation
occurs.
values per document to return. By default, it is zero and no feature importance
calculation occurs.
`analysis`.`regression`.`training_percent`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=training-percent]
`prediction_field_name`::::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=prediction-field-name]
`analysis`.`regression`.`randomize_seed`::::
`randomize_seed`::::
(Optional, long)
include::{docdir}/ml/ml-shared.asciidoc[tag=randomize-seed]
`training_percent`::::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=training-percent]
=====
//End regression
====
//End analysis
//Begin analyzed_fields
`analyzed_fields`::
(Optional, object)
include::{docdir}/ml/ml-shared.asciidoc[tag=analyzed-fields]
`analyzed_fields`.`excludes`:::
+
.Properties of `analyzed_fields`
[%collapsible%open]
====
`excludes`:::
(Optional, array)
include::{docdir}/ml/ml-shared.asciidoc[tag=analyzed-fields-excludes]
`analyzed_fields`.`includes`:::
(Optional, array)
`includes`:::
(Optional, array)
include::{docdir}/ml/ml-shared.asciidoc[tag=analyzed-fields-includes]
//End analyzed_fields
====
`description`::
(Optional, string)
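
The nested layout documented above (`analysis` containing one analysis type, plus optional `analyzed_fields`) can be sketched as a plain request body. This is a minimal illustration, not part of the commit: the helper name `build_job_body`, the index names, and the field names are hypothetical, and the check that exactly one analysis type is present is an assumption drawn from the "one of the following types" wording.

```python
import json

# Analysis types named in the `analysis` property description.
ANALYSIS_TYPES = ("classification", "outlier_detection", "regression")


def build_job_body(source_index, dest_index, analysis, excludes=None):
    """Assemble a request body dict for a data frame analytics job.

    `analysis` is expected to hold exactly one key, which must be one of
    ANALYSIS_TYPES (an assumption based on the documented wording).
    """
    if len(analysis) != 1 or next(iter(analysis)) not in ANALYSIS_TYPES:
        raise ValueError("analysis must contain exactly one of %s" % (ANALYSIS_TYPES,))
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "analysis": analysis,
    }
    if excludes:
        # `analyzed_fields.excludes` is an optional array of field names.
        body["analyzed_fields"] = {"excludes": list(excludes)}
    return body


# Hypothetical index and field names, for illustration only.
body = build_job_body(
    "my-source-index",
    "my-dest-index",
    {"classification": {"dependent_variable": "label", "training_percent": 80}},
    excludes=["internal_id"],
)
print(json.dumps(body, indent=2))
```

Only `dependent_variable` is required inside `classification`; the other parameters are left out here so that the documented defaults apply.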


@ -443,32 +443,46 @@ end::data-description[]
tag::data-frame-analytics[]
An array of {dfanalytics-job} resources, which are sorted by the `id` value in
ascending order.
+
.Properties of {dfanalytics-job} resources
[%collapsible%open]
====
`analysis`:::
(object) The type of analysis that is performed on the `source`.
//Begin analyzed_fields
`analyzed_fields`:::
(object) Contains `includes` and/or `excludes` patterns that select which fields
are included in the analysis.
`analyzed_fields`.`excludes`:::
+
.Properties of `analyzed_fields`
[%collapsible%open]
=====
`excludes`:::
(Optional, array) An array of strings that defines the fields that are excluded
from the analysis.
`analyzed_fields`.`includes`:::
`includes`:::
(Optional, array) An array of strings that defines the fields that are included
in the analysis.
=====
//End analyzed_fields
//Begin dest
`dest`:::
(object) The destination configuration of the analysis.
`dest`.`index`:::
+
.Properties of `dest`
[%collapsible%open]
=====
`index`:::
(string) The _destination index_ that stores the results of the
{dfanalytics-job}.
`dest`.`results_field`:::
`results_field`:::
(string) The name of the field that stores the results of the analysis. Defaults
to `ml`.
=====
//End dest
`id`:::
(string) The unique identifier of the {dfanalytics-job}.
@ -479,29 +493,40 @@ to `ml`.
`source`:::
(object) The configuration of how the analysis data is sourced. It has an
`index` parameter and optionally a `query` and a `_source`.
`source`.`index`:::
+
.Properties of `source`
[%collapsible%open]
=====
`index`:::
(array) Index or indices on which to perform the analysis. It can be a single
index or index pattern as well as an array of indices or patterns.
`source`.`query`:::
`query`:::
(object) The query that has been specified for the {dfanalytics-job}, written
in the {es} query domain-specific language (<<query-dsl,DSL>>). This value
corresponds to the query object in an {es} search POST body. By default, this
property has the following value: `{"match_all": {}}`.
`source`.`_source`:::
`_source`:::
(object) Contains the specified `includes` and/or `excludes` patterns that
select which fields are present in the destination. Fields that are excluded
cannot be included in the analysis.
`source`.`_source`.`excludes`:::
+
.Properties of `_source`
[%collapsible%open]
======
`excludes`:::
(array) An array of strings that defines the fields that are excluded from the
destination.
`source`.`_source`.`includes`:::
`includes`:::
(array) An array of strings that defines the fields that are included in the
destination.
======
//End of _source
=====
//End source
====
end::data-frame-analytics[]
tag::data-frame-analytics-stats[]
@ -970,16 +995,20 @@ A description of the job.
end::description-dfa[]
tag::dest[]
The destination configuration, consisting of `index` and
optionally `results_field` (`ml` by default).
`index`:::
(Required, string) Defines the _destination index_ to store the results of
the {dfanalytics-job}.
The destination configuration, consisting of `index` and optionally
`results_field` (`ml` by default).
+
.Properties of `dest`
[%collapsible%open]
====
`index`:::
(Required, string) Defines the _destination index_ to store the results of the
{dfanalytics-job}.
`results_field`:::
(Optional, string) Defines the name of the field in which to store the
results of the analysis. Default to `ml`.
`results_field`:::
(Optional, string) Defines the name of the field in which to store the results
of the analysis. Defaults to `ml`.
====
end::dest[]
tag::detector-description[]
@ -1045,14 +1074,11 @@ end::feature-influence-threshold[]
tag::field-selection[]
An array of objects that explain selection for each field, sorted by
the field names. Each object in the array has the following properties:
`name`:::
(string) The field name.
`mapping_types`:::
(string) The mapping types of the field.
the field names.
+
.Properties of `field_selection` objects
[%collapsible%open]
====
`is_included`:::
(boolean) Whether the field is selected to be included in the analysis.
@ -1063,8 +1089,15 @@ the field names. Each object in the array has the following properties:
(string) The feature type of this field for the analysis. May be `categorical`
or `numerical`.
`mapping_types`:::
(string) The mapping types of the field.
`name`:::
(string) The field name.
`reason`:::
(string) The reason a field is not selected to be included in the analysis.
====
end::field-selection[]
tag::filter[]
@ -1297,18 +1330,21 @@ allowed to contain. The maximum value is 2000.
end::max-trees[]
tag::memory-estimation[]
An object containing the memory estimates. The object has the
following properties:
`expected_memory_without_disk`:::
(string) Estimated memory usage under the assumption that the whole
{dfanalytics} should happen in memory (i.e. without overflowing to disk).
An object containing the memory estimates.
+
.Properties of `memory_estimation`
[%collapsible%open]
====
`expected_memory_with_disk`:::
(string) Estimated memory usage under the assumption that overflowing to disk is
allowed during {dfanalytics}. `expected_memory_with_disk` is usually smaller
than `expected_memory_without_disk` because using disk limits the main memory
needed to perform {dfanalytics}.
`expected_memory_without_disk`:::
(string) Estimated memory usage under the assumption that the whole
{dfanalytics} should happen in memory (i.e. without overflowing to disk).
====
end::memory-estimation[]
tag::method[]
@ -1648,38 +1684,44 @@ A numerical character string that uniquely identifies the model snapshot.
end::snapshot-id[]
tag::source-put-dfa[]
The configuration of how to source the analysis data. It requires an
`index`. Optionally, `query` and `_source` may be specified.
`index`:::
(Required, string or array) Index or indices on which to perform the
analysis. It can be a single index or index pattern as well as an array of
indices or patterns.
The configuration of how to source the analysis data. It requires an `index`.
Optionally, `query` and `_source` may be specified.
+
.Properties of `source`
[%collapsible%open]
====
`index`:::
(Required, string or array) Index or indices on which to perform the analysis.
It can be a single index or index pattern as well as an array of indices or
patterns.
+
--
WARNING: If your source indices contain documents with the same IDs, only the
document that is indexed last appears in the destination index.
--
`query`:::
(Optional, object) The {es} query domain-specific language
(<<query-dsl,DSL>>). This value corresponds to the query object in an {es}
search POST body. All the options that are supported by {es} can be used,
as this object is passed verbatim to {es}. By default, this property has
the following value: `{"match_all": {}}`.
(Optional, object) The {es} query domain-specific language (<<query-dsl,DSL>>).
This value corresponds to the query object in an {es} search POST body. All the
options that are supported by {es} can be used, as this object is passed
verbatim to {es}. By default, this property has the following value:
`{"match_all": {}}`.
`_source`:::
(Optional, object) Specify `includes` and/or `excludes` patterns to select
which fields will be present in the destination. Fields that are excluded
cannot be included in the analysis.
`includes`::::
(array) An array of strings that defines the fields that will be
included in the destination.
(Optional, object) Specify `includes` and/or `excludes` patterns to select which
fields will be present in the destination. Fields that are excluded cannot be
included in the analysis.
+
.Properties of `_source`
[%collapsible%open]
=====
`includes`::::
(array) An array of strings that defines the fields that will be included in the
destination.
`excludes`::::
(array) An array of strings that defines the fields that will be
excluded from the destination.
`excludes`::::
(array) An array of strings that defines the fields that will be excluded from
the destination.
=====
====
end::source-put-dfa[]
tag::sparse-bucket-count[]
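
The defaults described for `source` and `dest` (`query` falls back to `{"match_all": {}}`, `results_field` falls back to `ml`) can be made explicit with a small client-side sketch. The helper `apply_documented_defaults` and the index and field names are hypothetical. The server applies these defaults itself, so this only illustrates what the effective configuration would look like.

```python
import copy


def apply_documented_defaults(config):
    """Return a copy of a job config with the documented defaults filled in."""
    if "index" not in config.get("source", {}):
        raise ValueError("source.index is required")
    if "index" not in config.get("dest", {}):
        raise ValueError("dest.index is required")
    result = copy.deepcopy(config)  # leave the caller's dict untouched
    result["source"].setdefault("query", {"match_all": {}})
    result["dest"].setdefault("results_field", "ml")
    return result


# Hypothetical configuration, for illustration only.
config = {
    "source": {
        "index": ["logs-a", "logs-b"],
        "_source": {"excludes": ["raw_payload"]},
    },
    "dest": {"index": "logs-outliers"},
}
effective = apply_documented_defaults(config)
print(effective["source"]["query"])
print(effective["dest"]["results_field"])
```

Fields listed under `_source.excludes` never reach the destination index, so they cannot be included in the analysis either.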
@ -1811,32 +1853,27 @@ end::total-partition-field-count[]
tag::trained-model-configs[]
An array of trained model resources, which are sorted by the `model_id` value in
ascending order.
`model_id`:::
(string)
Identifier for the trained model.
+
.Properties of trained model resources
[%collapsible%open]
====
`created_by`:::
(string)
Information on the creator of the trained model.
`version`:::
(string)
The {es} version number in which the trained model was created.
`create_time`:::
(<<time-units,time units>>)
The time when the trained model was created.
`tags`:::
(string)
A comma-delimited string of tags. A {infer} model can have many tags, or none.
`metadata`:::
`default_field_map` :::
(object)
An object containing metadata about the trained model. For example, models
created by {dfanalytics} contain an `analysis_config` and an `input`
object.
A string to string object that contains the default field map to use
when inferring against the model. For example, data frame analytics
may train the model on a specific multi-field `foo.keyword`.
The analytics job would then supply a default field map entry for
`"foo" : "foo.keyword"`.
+
Any field map described in the inference configuration takes precedence.
`estimated_heap_memory_usage_bytes`:::
(integer)
@ -1850,16 +1887,23 @@ The estimated number of operations to use the trained model.
(string)
The license level of the trained model.
`default_field_map` :::
`metadata`:::
(object)
A string to string object that contains the default field map to use
when inferring against the model. For example, data frame analytics
may train the model on a specific multi-field `foo.keyword`.
The analytics job would then supply a default field map entry for
`"foo" : "foo.keyword"`.
An object containing metadata about the trained model. For example, models
created by {dfanalytics} contain `analysis_config` and `input` objects.
Any field map described in the inference configuration takes precedence.
`model_id`:::
(string)
Identifier for the trained model.
`tags`:::
(string)
A comma-delimited string of tags. A {infer} model can have many tags, or none.
`version`:::
(string)
The {es} version number in which the trained model was created.
====
end::trained-model-configs[]
tag::training-percent[]
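
The `default_field_map` description states that any field map in the inference configuration takes precedence. One way to read that rule is as a per-key override, sketched below. The helper `effective_field_map` is hypothetical, and the per-key merge is an assumption; the text does not say whether precedence is applied per entry or to the map as a whole.

```python
def effective_field_map(default_field_map, inference_field_map=None):
    """Merge two field maps; entries from the inference configuration win."""
    merged = dict(default_field_map or {})
    merged.update(inference_field_map or {})
    return merged


# The default entry mirrors the `"foo" : "foo.keyword"` example in the text;
# the override values are made up for illustration.
defaults = {"foo": "foo.keyword"}
print(effective_field_map(defaults))
print(effective_field_map(defaults, {"foo": "foo.raw", "bar": "bar.keyword"}))
```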