[DOCS] Removes unshared sections from ml-shared.asciidoc (#55192)
Parent: f49354b7d7 · Commit: 2910d01179
@@ -71,11 +71,50 @@ The API returns a response that contains the following:
 
 `field_selection`::
 (array)
-include::{docdir}/ml/ml-shared.asciidoc[tag=field-selection]
+An array of objects that explain selection for each field, sorted by
+the field names.
++
+.Properties of `field_selection` objects
+[%collapsible%open]
+====
+`is_included`:::
+(boolean) Whether the field is selected to be included in the analysis.
+
+`is_required`:::
+(boolean) Whether the field is required.
+
+`feature_type`:::
+(string) The feature type of this field for the analysis. May be `categorical`
+or `numerical`.
+
+`mapping_types`:::
+(string) The mapping types of the field.
+
+`name`:::
+(string) The field name.
+
+`reason`:::
+(string) The reason a field is not selected to be included in the analysis.
+====
 
 `memory_estimation`::
 (object)
-include::{docdir}/ml/ml-shared.asciidoc[tag=memory-estimation]
+An object containing the memory estimates.
++
+.Properties of `memory_estimation`
+[%collapsible%open]
+====
+`expected_memory_with_disk`:::
+(string) Estimated memory usage under the assumption that overflowing to disk is
+allowed during {dfanalytics}. `expected_memory_with_disk` is usually smaller
+than `expected_memory_without_disk` because using disk allows limiting the main
+memory needed to perform {dfanalytics}.
+
+`expected_memory_without_disk`:::
+(string) Estimated memory usage under the assumption that the whole
+{dfanalytics} should happen in memory (i.e. without overflowing to disk).
+====
 
 
 [[ml-explain-dfanalytics-example]]
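The `field_selection` and `memory_estimation` objects added above are plain JSON in the API response, so they are easy to consume programmatically. A minimal sketch; the response values here are illustrative, not output from a real cluster:

```python
# Hypothetical (trimmed) response from the explain data frame analytics API,
# containing only the fields documented above. Values are made up.
response = {
    "field_selection": [
        {"name": "price", "is_included": True, "is_required": False,
         "feature_type": "numerical", "mapping_types": ["float"]},
        {"name": "id", "is_included": False, "is_required": False,
         "mapping_types": ["keyword"],
         "reason": "unique values: exceeds max"},
    ],
    "memory_estimation": {
        "expected_memory_with_disk": "16mb",
        "expected_memory_without_disk": "128mb",
    },
}

# Collect fields excluded from the analysis, keyed by the documented `reason`.
excluded = {f["name"]: f["reason"]
            for f in response["field_selection"] if not f["is_included"]}
print(excluded)  # {'id': 'unique values: exceeds max'}
print(response["memory_estimation"]["expected_memory_without_disk"])  # 128mb
```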
@@ -76,7 +76,92 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=size]
 
 `data_frame_analytics`::
 (array)
-include::{docdir}/ml/ml-shared.asciidoc[tag=data-frame-analytics]
+An array of {dfanalytics-job} resources, which are sorted by the `id` value in
+ascending order.
++
+.Properties of {dfanalytics-job} resources
+[%collapsible%open]
+====
+`analysis`:::
+(object) The type of analysis that is performed on the `source`.
+
+//Begin analyzed_fields
+`analyzed_fields`:::
+(object) Contains `includes` and/or `excludes` patterns that select which fields
+are included in the analysis.
++
+.Properties of `analyzed_fields`
+[%collapsible%open]
+=====
+`excludes`:::
+(Optional, array) An array of strings that defines the fields that are excluded
+from the analysis.
+
+`includes`:::
+(Optional, array) An array of strings that defines the fields that are included
+in the analysis.
+=====
+//End analyzed_fields
+//Begin dest
+`dest`:::
+(string) The destination configuration of the analysis.
++
+.Properties of `dest`
+[%collapsible%open]
+=====
+`index`:::
+(string) The _destination index_ that stores the results of the
+{dfanalytics-job}.
+
+`results_field`:::
+(string) The name of the field that stores the results of the analysis. Defaults
+to `ml`.
+=====
+//End dest
+
+`id`:::
+(string) The unique identifier of the {dfanalytics-job}.
+
+`model_memory_limit`:::
+(string) The `model_memory_limit` that has been set for the {dfanalytics-job}.
+
+`source`:::
+(object) The configuration of how the analysis data is sourced. It has an
+`index` parameter and optionally a `query` and a `_source`.
++
+.Properties of `source`
+[%collapsible%open]
+=====
+`index`:::
+(array) Index or indices on which to perform the analysis. It can be a single
+index or index pattern as well as an array of indices or patterns.
+
+`query`:::
+(object) The query that has been specified for the {dfanalytics-job}. The {es}
+query domain-specific language (<<query-dsl,DSL>>). This value corresponds to
+the query object in an {es} search POST body. By default, this property has the
+following value: `{"match_all": {}}`.
+
+`_source`:::
+(object) Contains the specified `includes` and/or `excludes` patterns that
+select which fields are present in the destination. Fields that are excluded
+cannot be included in the analysis.
++
+.Properties of `_source`
+[%collapsible%open]
+======
+`excludes`:::
+(array) An array of strings that defines the fields that are excluded from the
+destination.
+
+`includes`:::
+(array) An array of strings that defines the fields that are included in the
+destination.
+======
+//End of _source
+=====
+//End source
+====
 
 
 [[ml-get-dfanalytics-response-codes]]
@@ -60,7 +60,8 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=allow-no-match]
 
 `decompress_definition`::
 (Optional, boolean)
-include::{docdir}/ml/ml-shared.asciidoc[tag=decompress-definition]
+Specifies whether the included model definition should be returned as a JSON map
+(`true`) or in a custom compressed format (`false`). Defaults to `true`.
 
 `from`::
 (Optional, integer)
@@ -68,7 +69,9 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=from]
 
 `include_model_definition`::
 (Optional, boolean)
-include::{docdir}/ml/ml-shared.asciidoc[tag=include-model-definition]
+Specifies if the model definition should be returned in the response. Defaults
+to `false`. When `true`, only a single model must match the ID patterns
+provided, otherwise a bad request is returned.
 
 `size`::
 (Optional, integer)
@@ -84,7 +87,59 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=tags]
 
 `trained_model_configs`::
 (array)
-include::{docdir}/ml/ml-shared.asciidoc[tag=trained-model-configs]
+An array of trained model resources, which are sorted by the `model_id` value in
+ascending order.
++
+.Properties of trained model resources
+[%collapsible%open]
+====
+`created_by`:::
+(string)
+Information on the creator of the trained model.
+
+`create_time`:::
+(<<time-units,time units>>)
+The time when the trained model was created.
+
+`default_field_map`:::
+(object)
+A string to string object that contains the default field map to use
+when inferring against the model. For example, data frame analytics
+may train the model on a specific multi-field `foo.keyword`.
+The analytics job would then supply a default field map entry for
+`"foo" : "foo.keyword"`.
++
+Any field map described in the inference configuration takes precedence.
+
+`estimated_heap_memory_usage_bytes`:::
+(integer)
+The estimated heap usage in bytes to keep the trained model in memory.
+
+`estimated_operations`:::
+(integer)
+The estimated number of operations to use the trained model.
+
+`license_level`:::
+(string)
+The license level of the trained model.
+
+`metadata`:::
+(object)
+An object containing metadata about the trained model. For example, models
+created by {dfanalytics} contain `analysis_config` and `input` objects.
+
+`model_id`:::
+(string)
+Identifier for the trained model.
+
+`tags`:::
+(string)
+A comma delimited string of tags. An {infer} model can have many tags, or none.
+
+`version`:::
+(string)
+The {es} version number in which the trained model was created.
+====
 
 [[ml-get-inference-response-codes]]
 ==== {api-response-codes-title}
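The `default_field_map` precedence rule described above can be illustrated with a short sketch. How the map is applied internally is an assumption here; the map contents and the document are hypothetical:

```python
# The model's default field map (from `default_field_map`) and a field map
# supplied in the inference configuration. Both are hypothetical examples.
default_field_map = {"foo": "foo.keyword"}   # from the trained model resource
config_field_map = {"bar": "bar.keyword"}    # from the inference configuration

# As documented above, any field map in the inference configuration takes
# precedence over the model's default map, so merge with config last.
effective_map = {**default_field_map, **config_field_map}

doc = {"foo.keyword": "a", "bar.keyword": "b"}
# Resolve the training-time field names from the incoming document's fields.
features = {trained: doc[actual] for trained, actual in effective_map.items()}
print(features)  # {'foo': 'a', 'bar': 'b'}
```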
@@ -183,27 +183,41 @@ The configuration information necessary to perform
 =====
 `compute_feature_influence`::::
 (Optional, boolean)
-include::{docdir}/ml/ml-shared.asciidoc[tag=compute-feature-influence]
+If `true`, the feature influence calculation is enabled. Defaults to `true`.
 
 `feature_influence_threshold`::::
 (Optional, double)
-include::{docdir}/ml/ml-shared.asciidoc[tag=feature-influence-threshold]
+The minimum {olscore} that a document needs to have in order to calculate its
+{fiscore}. Value range: 0-1 (`0.1` by default).
 
 `method`::::
 (Optional, string)
-include::{docdir}/ml/ml-shared.asciidoc[tag=method]
+Sets the method that {oldetection} uses. If the method is not set, {oldetection}
+uses an ensemble of different methods and normalises and combines their
+individual {olscores} to obtain the overall {olscore}. We recommend using the
+ensemble method. Available methods are `lof`, `ldof`, `distance_kth_nn`, and
+`distance_knn`.
 
 `n_neighbors`::::
 (Optional, integer)
-include::{docdir}/ml/ml-shared.asciidoc[tag=n-neighbors]
+Defines the value for how many nearest neighbors each method of
+{oldetection} will use to calculate its {olscore}. When the value is not set,
+different values will be used for different ensemble members. This helps
+improve diversity in the ensemble. Therefore, only override this if you are
+confident that the value you choose is appropriate for the data set.
 
 `outlier_fraction`::::
 (Optional, double)
-include::{docdir}/ml/ml-shared.asciidoc[tag=outlier-fraction]
+Sets the proportion of the data set that is assumed to be outlying prior to
+{oldetection}. For example, 0.05 means it is assumed that 5% of values are real
+outliers and 95% are inliers.
 
 `standardization_enabled`::::
 (Optional, boolean)
-include::{docdir}/ml/ml-shared.asciidoc[tag=standardization-enabled]
+If `true`, then the following operation is performed on the columns before
+computing outlier scores: `(x_i - mean(x_i)) / sd(x_i)`. Defaults to `true`. For
+more information, see
+https://en.wikipedia.org/wiki/Feature_scaling#Standardization_(Z-score_Normalization)[this wiki page about standardization].
 //End outlier_detection
 =====
 //Begin regression
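The operation described for `standardization_enabled` is ordinary z-score scaling. A minimal sketch; whether `sd` is the sample or population standard deviation is not specified above, so sample standard deviation is assumed here:

```python
import statistics

# Z-score standardization applied per column before outlier scoring:
# (x_i - mean(x_i)) / sd(x_i). Sample standard deviation is an assumption.
def standardize(column):
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(values)

# The standardized column has (near-)zero mean and unit sample standard
# deviation, up to floating point error.
print(abs(statistics.mean(z)) < 1e-9, abs(statistics.stdev(z) - 1.0) < 1e-9)
# prints: True True
```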
@@ -334,11 +348,54 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=dest]
 
 `model_memory_limit`::
 (Optional, string)
-include::{docdir}/ml/ml-shared.asciidoc[tag=model-memory-limit-dfa]
+The approximate maximum amount of memory resources that are permitted for
+analytical processing. The default value for {dfanalytics-jobs} is `1gb`. If
+your `elasticsearch.yml` file contains an `xpack.ml.max_model_memory_limit`
+setting, an error occurs when you try to create {dfanalytics-jobs} that have
+`model_memory_limit` values greater than that setting. For more information, see
+<<ml-settings>>.
 
 `source`::
 (object)
-include::{docdir}/ml/ml-shared.asciidoc[tag=source-put-dfa]
+The configuration of how to source the analysis data. It requires an `index`.
+Optionally, `query` and `_source` may be specified.
++
+.Properties of `source`
+[%collapsible%open]
+====
+`index`:::
+(Required, string or array) Index or indices on which to perform the analysis.
+It can be a single index or index pattern as well as an array of indices or
+patterns.
++
+WARNING: If your source indices contain documents with the same IDs, only the
+document that is indexed last appears in the destination index.
+
+`query`:::
+(Optional, object) The {es} query domain-specific language (<<query-dsl,DSL>>).
+This value corresponds to the query object in an {es} search POST body. All the
+options that are supported by {es} can be used, as this object is passed
+verbatim to {es}. By default, this property has the following value:
+`{"match_all": {}}`.
+
+`_source`:::
+(Optional, object) Specify `includes` and/or `excludes` patterns to select which
+fields will be present in the destination. Fields that are excluded cannot be
+included in the analysis.
++
+.Properties of `_source`
+[%collapsible%open]
+=====
+`includes`::::
+(array) An array of strings that defines the fields that will be included in the
+destination.
+
+`excludes`::::
+(array) An array of strings that defines the fields that will be excluded from
+the destination.
+=====
+====
 
 
 
 [[ml-put-dfanalytics-example]]
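The `source` object documented above composes with `dest`, `analysis`, and `model_memory_limit` into a create-job request body. A minimal sketch of assembling one; the index names and the excluded field pattern are hypothetical:

```python
import json

# A minimal request body for creating a data frame analytics job, built from
# the parameters documented above. Index/field names are illustrative only.
body = {
    "source": {
        "index": ["customer-data"],          # single index, pattern, or array
        "query": {"match_all": {}},          # the documented default query
        "_source": {
            # Fields excluded here cannot be included in the analysis.
            "excludes": ["internal.*"],
        },
    },
    "dest": {"index": "customer-outliers", "results_field": "ml"},
    "analysis": {"outlier_detection": {}},
    "model_memory_limit": "1gb",             # the documented default
}

print(json.dumps(body["source"]["query"]))  # {"match_all": {}}
```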
@@ -278,10 +278,6 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=time-span]
 ====
 end::chunking-config[]
 
-tag::compute-feature-influence[]
-If `true`, the feature influence calculation is enabled. Defaults to `true`.
-end::compute-feature-influence[]
-
 tag::custom-rules[]
 An array of custom rule objects, which enable you to customize the way detectors
 operate. For example, a rule may dictate to the detector conditions under which
@@ -375,95 +371,6 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=time-format]
 ====
 end::data-description[]
 
-tag::data-frame-analytics[]
-An array of {dfanalytics-job} resources, which are sorted by the `id` value in
-ascending order.
-+
-.Properties of {dfanalytics-job} resources
-[%collapsible%open]
-====
-`analysis`:::
-(object) The type of analysis that is performed on the `source`.
-
-//Begin analyzed_fields
-`analyzed_fields`:::
-(object) Contains `includes` and/or `excludes` patterns that select which fields
-are included in the analysis.
-+
-.Properties of `analyzed_fields`
-[%collapsible%open]
-=====
-`excludes`:::
-(Optional, array) An array of strings that defines the fields that are excluded
-from the analysis.
-
-`includes`:::
-(Optional, array) An array of strings that defines the fields that are included
-in the analysis.
-=====
-//End analyzed_fields
-//Begin dest
-`dest`:::
-(string) The destination configuration of the analysis.
-+
-.Properties of `dest`
-[%collapsible%open]
-=====
-`index`:::
-(string) The _destination index_ that stores the results of the
-{dfanalytics-job}.
-
-`results_field`:::
-(string) The name of the field that stores the results of the analysis. Defaults
-to `ml`.
-=====
-//End dest
-
-`id`:::
-(string) The unique identifier of the {dfanalytics-job}.
-
-`model_memory_limit`:::
-(string) The `model_memory_limit` that has been set to the {dfanalytics-job}.
-
-`source`:::
-(object) The configuration of how the analysis data is sourced. It has an
-`index` parameter and optionally a `query` and a `_source`.
-+
-.Properties of `source`
-[%collapsible%open]
-=====
-`index`:::
-(array) Index or indices on which to perform the analysis. It can be a single
-index or index pattern as well as an array of indices or patterns.
-
-`query`:::
-(object) The query that has been specified for the {dfanalytics-job}. The {es}
-query domain-specific language (<<query-dsl,DSL>>). This value corresponds to
-the query object in an {es} search POST body. By default, this property has the
-following value: `{"match_all": {}}`.
-
-`_source`:::
-(object) Contains the specified `includes` and/or `excludes` patterns that
-select which fields are present in the destination. Fields that are excluded
-cannot be included in the analysis.
-+
-.Properties of `_source`
-[%collapsible%open]
-======
-`excludes`:::
-(array) An array of strings that defines the fields that are excluded from the
-destination.
-
-`includes`:::
-(array) An array of strings that defines the fields that are included in the
-destination.
-======
-//End of _source
-=====
-//End source
-====
-end::data-frame-analytics[]
-
 tag::data-frame-analytics-stats[]
 An array of statistics objects for {dfanalytics-jobs}, which are
 sorted by the `id` value in ascending order.
@@ -906,11 +813,6 @@ category. (Dead categories are a side effect of the way categorization has no
 prior training.)
 end::dead-category-count[]
 
-tag::decompress-definition[]
-Specifies whether the included model definition should be returned as a JSON map
-(`true`) or in a custom compressed format (`false`). Defaults to `true`.
-end::decompress-definition[]
-
 tag::delayed-data-check-config[]
 Specifies whether the {dfeed} checks for missing data and the size of the
 window. For example: `{"enabled": true, "check_window": "1h"}`.
@@ -1029,39 +931,6 @@ Advanced configuration option. Defines the fraction of features that will be
 used when selecting a random bag for each candidate split.
 end::feature-bag-fraction[]
 
-tag::feature-influence-threshold[]
-The minimum {olscore} that a document needs to have in order to calculate its
-{fiscore}. Value range: 0-1 (`0.1` by default).
-end::feature-influence-threshold[]
-
-tag::field-selection[]
-An array of objects that explain selection for each field, sorted by
-the field names.
-+
-.Properties of `field_selection` objects
-[%collapsible%open]
-====
-`is_included`:::
-(boolean) Whether the field is selected to be included in the analysis.
-
-`is_required`:::
-(boolean) Whether the field is required.
-
-`feature_type`:::
-(string) The feature type of this field for the analysis. May be `categorical`
-or `numerical`.
-
-`mapping_types`:::
-(string) The mapping types of the field.
-
-`name`:::
-(string) The field name.
-
-`reason`:::
-(string) The reason a field is not selected to be included in the analysis.
-====
-end::field-selection[]
-
 tag::filter[]
 One or more <<analysis-tokenfilters,token filters>>. In addition to the built-in
 token filters, other plugins can provide more token filters. This property is
@@ -1114,12 +983,6 @@ tag::groups[]
 A list of job groups. A job can belong to no groups or many.
 end::groups[]
 
-tag::include-model-definition[]
-Specifies if the model definition should be returned in the response. Defaults
-to `false`. When `true`, only a single model must match the ID patterns
-provided, otherwise a bad request is returned.
-end::include-model-definition[]
-
 tag::indices[]
 An array of index names. Wildcards are supported. For example:
 `["it_ops_metrics", "server*"]`.
@@ -1319,32 +1182,6 @@ Advanced configuration option. Defines the maximum number of trees the forest is
 allowed to contain. The maximum value is 2000.
 end::max-trees[]
 
-tag::memory-estimation[]
-An object containing the memory estimates.
-+
-.Properties of `memory_estimation`
-[%collapsible%open]
-====
-`expected_memory_with_disk`:::
-(string) Estimated memory usage under the assumption that overflowing to disk is
-allowed during {dfanalytics}. `expected_memory_with_disk` is usually smaller
-than `expected_memory_without_disk` as using disk allows to limit the main
-memory needed to perform {dfanalytics}.
-
-`expected_memory_without_disk`:::
-(string) Estimated memory usage under the assumption that the whole
-{dfanalytics} should happen in memory (i.e. without overflowing to disk).
-====
-end::memory-estimation[]
-
-tag::method[]
-Sets the method that {oldetection} uses. If the method is not set {oldetection}
-uses an ensemble of different methods and normalises and combines their
-individual {olscores} to obtain the overall {olscore}. We recommend to use the
-ensemble method. Available methods are `lof`, `ldof`, `distance_kth_nn`,
-`distance_knn`.
-end::method[]
-
 tag::missing-field-count[]
 The number of input documents that are missing a field that the {anomaly-job} is
 configured to analyze. Input documents with missing fields are still processed
@@ -1411,15 +1248,6 @@ tag::model-memory-limit-anomaly-jobs[]
 The upper limit for model memory usage, checked on increasing values.
 end::model-memory-limit-anomaly-jobs[]
 
-tag::model-memory-limit-dfa[]
-The approximate maximum amount of memory resources that are permitted for
-analytical processing. The default value for {dfanalytics-jobs} is `1gb`. If
-your `elasticsearch.yml` file contains an `xpack.ml.max_model_memory_limit`
-setting, an error occurs when you try to create {dfanalytics-jobs} that have
-`model_memory_limit` values greater than that setting. For more information, see
-<<ml-settings>>.
-end::model-memory-limit-dfa[]
-
 tag::model-memory-status[]
 The status of the mathematical models, which can have one of the following
 values:
@@ -1496,14 +1324,6 @@ NOTE: To use the `multivariate_by_fields` property, you must also specify
 --
 end::multivariate-by-fields[]
 
-tag::n-neighbors[]
-Defines the value for how many nearest neighbors each method of
-{oldetection} will use to calculate its {olscore}. When the value is not set,
-different values will be used for different ensemble members. This helps
-improve diversity in the ensemble. Therefore, only override this if you are
-confident that the value you choose is appropriate for the data set.
-end::n-neighbors[]
-
 tag::node-address[]
 The network address of the node.
 end::node-address[]
@@ -1538,12 +1358,6 @@ order documents are discarded, since jobs require time series data to be in
 ascending chronological order.
 end::out-of-order-timestamp-count[]
 
-tag::outlier-fraction[]
-Sets the proportion of the data set that is assumed to be outlying prior to
-{oldetection}. For example, 0.05 means it is assumed that 5% of values are real
-outliers and 95% are inliers.
-end::outlier-fraction[]
-
 tag::over-field-name[]
 The field used to split the data. In particular, this property is used for
 analyzing the splits with respect to the history of all splits. It is used for
@@ -1666,60 +1480,12 @@ tag::snapshot-id[]
 A numerical character string that uniquely identifies the model snapshot.
 end::snapshot-id[]
 
-tag::source-put-dfa[]
-The configuration of how to source the analysis data. It requires an `index`.
-Optionally, `query` and `_source` may be specified.
-+
-.Properties of `source`
-[%collapsible%open]
-====
-`index`:::
-(Required, string or array) Index or indices on which to perform the analysis.
-It can be a single index or index pattern as well as an array of indices or
-patterns.
-+
-WARNING: If your source indices contain documents with the same IDs, only the
-document that is indexed last appears in the destination index.
-
-`query`:::
-(Optional, object) The {es} query domain-specific language (<<query-dsl,DSL>>).
-This value corresponds to the query object in an {es} search POST body. All the
-options that are supported by {es} can be used, as this object is passed
-verbatim to {es}. By default, this property has the following value:
-`{"match_all": {}}`.
-
-`_source`:::
-(Optional, object) Specify `includes` and/or `excludes` patterns to select which
-fields will be present in the destination. Fields that are excluded cannot be
-included in the analysis.
-+
-.Properties of `_source`
-[%collapsible%open]
-=====
-`includes`::::
-(array) An array of strings that defines the fields that will be included in the
-destination.
-
-`excludes`::::
-(array) An array of strings that defines the fields that will be excluded from
-the destination.
-=====
-====
-end::source-put-dfa[]
-
 tag::sparse-bucket-count[]
 The number of buckets that contained few data points compared to the expected
 number of data points. If your data contains many sparse buckets, consider using
 a longer `bucket_span`.
 end::sparse-bucket-count[]
 
-tag::standardization-enabled[]
-If `true`, then the following operation is performed on the columns before
-computing outlier scores: (x_i - mean(x_i)) / sd(x_i). Defaults to `true`. For
-more information, see
-https://en.wikipedia.org/wiki/Feature_scaling#Standardization_(Z-score_Normalization)[this wiki page about standardization].
-end::standardization-enabled[]
-
 tag::state-anomaly-job[]
 The status of the {anomaly-job}, which can be one of the following values:
 +
@@ -1833,62 +1599,6 @@ The number of `partition` field values that were analyzed by the models. This
 value is cumulative for all detectors in the job.
 end::total-partition-field-count[]
 
-tag::trained-model-configs[]
-An array of trained model resources, which are sorted by the `model_id` value in
-ascending order.
-+
-.Properties of trained model resources
-[%collapsible%open]
-====
-`created_by`:::
-(string)
-Information on the creator of the trained model.
-
-`create_time`:::
-(<<time-units,time units>>)
-The time when the trained model was created.
-
-`default_field_map` :::
-(object)
-A string to string object that contains the default field map to use
-when inferring against the model. For example, data frame analytics
-may train the model on a specific multi-field `foo.keyword`.
-The analytics job would then supply a default field map entry for
-`"foo" : "foo.keyword"`.
-+
-Any field map described in the inference configuration takes precedence.
-
-`estimated_heap_memory_usage_bytes`:::
-(integer)
-The estimated heap usage in bytes to keep the trained model in memory.
-
-`estimated_operations`:::
-(integer)
-The estimated number of operations to use the trained model.
-
-`license_level`:::
-(string)
-The license level of the trained model.
-
-`metadata`:::
-(object)
-An object containing metadata about the trained model. For example, models
-created by {dfanalytics} contain `analysis_config` and `input` objects.
-
-`model_id`:::
-(string)
-Idetifier for the trained model.
-
-`tags`:::
-(string)
-A comma delimited string of tags. A {infer} model can have many tags, or none.
-
-`version`:::
-(string)
-The {es} version number in which the trained model was created.
-====
-end::trained-model-configs[]
-
 tag::training-percent[]
 Defines what percentage of the eligible documents that will
 be used for training. Documents that are ignored by the analysis (for example