[7.x][DOCS] Update example and nesting in get data frame analytics job stats API (#55612)

Lisa Cawley 2020-04-22 10:58:26 -07:00 committed by GitHub
parent 6e1b958069
commit 314ca78e31
4 changed files with 617 additions and 463 deletions


@ -82,17 +82,26 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=node-datafeeds]
--
[%collapsible%open]
====
`id`:::
include::{docdir}/ml/ml-shared.asciidoc[tag=node-id]
`name`::: The node name. For example, `0-o0tOo`.
`attributes`:::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-attributes]
`ephemeral_id`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
`transport_address`::: The host and port where transport HTTP connections are
accepted. For example, `127.0.0.1:9300`.
`attributes`::: For example, `{"ml.machine_memory": "17179869184"}`.
`id`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-id]
`name`:::
(string)
The node name. For example, `0-o0tOo`.
`transport_address`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-transport-address]
====
--


@ -281,8 +281,8 @@ available only for open jobs.
[%collapsible%open]
====
`attributes`:::
(object) Lists node attributes. For example,
`{"ml.machine_memory": "17179869184"}`.
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-attributes]
`ephemeral_id`:::
(string)
@ -293,10 +293,12 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
include::{docdir}/ml/ml-shared.asciidoc[tag=node-id]
`name`:::
(string) The node name.
(string)
The node name.
`transport_address`:::
(string) The host and port where transport HTTP connections are accepted.
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-transport-address]
====
//End node


@ -61,12 +61,442 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=size]
[[ml-get-dfanalytics-stats-response-body]]
==== {api-response-body-title}
The API returns the following information:
`data_frame_analytics`::
(array)
include::{docdir}/ml/ml-shared.asciidoc[tag=data-frame-analytics-stats]
An array of objects that contain usage information for {dfanalytics-jobs}, which
are sorted by the `id` value in ascending order.
+
.Properties of {dfanalytics-job} usage resources
[%collapsible%open]
====
//Begin analysis_stats
`analysis_stats`:::
(object)
An object containing information about the analysis job.
+
.Properties of `analysis_stats`
[%collapsible%open]
=====
//Begin classification_stats
`classification_stats`::::
(object)
An object containing information about the {classanalysis} job.
+
.Properties of `classification_stats`
[%collapsible%open]
======
//Begin class_hyperparameters
`hyperparameters`::::
(object)
An object containing the parameters of the {classanalysis} job.
+
.Properties of `hyperparameters`
[%collapsible%open]
=======
`alpha`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-alpha]
`class_assignment_objective`::::
(string)
Defines whether class assignment maximizes the accuracy or the minimum recall
metric. Possible values are `maximize_accuracy` and `maximize_minimum_recall`.
`downsample_factor`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-downsample-factor]
`eta`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta]
`eta_growth_rate_per_tree`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta-growth]
`feature_bag_fraction`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-feature-bag-fraction]
`gamma`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-gamma]
`lambda`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-lambda]
`max_attempts_to_add_tree`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-attempts]
`max_optimization_rounds_per_hyperparameter`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-optimization-rounds]
`max_trees`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-trees]
`num_folds`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-folds]
`num_splits_per_feature`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-splits]
`soft_tree_depth_limit`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-limit]
`soft_tree_depth_tolerance`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-tolerance]
=======
//End class_hyperparameters
`iteration`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-iteration]
`timestamp`::::
(date)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timestamp]
//Begin class_timing_stats
`timing_stats`::::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats]
+
.Properties of `timing_stats`
[%collapsible%open]
=======
`elapsed_time`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-elapsed]
`iteration_time`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-iteration]
=======
//End class_timing_stats
//Begin class_validation_loss
`validation_loss`::::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss]
+
.Properties of `validation_loss`
[%collapsible%open]
=======
`fold_values`::::
(array of strings)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-fold]
`loss_type`::::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-type]
=======
//End class_validation_loss
======
//End classification_stats
//Begin outlier_detection_stats
`outlier_detection_stats`::::
(object)
An object containing information about the {oldetection} job.
+
.Properties of `outlier_detection_stats`
[%collapsible%open]
======
//Begin parameters
`parameters`::::
(object)
The list of job parameters specified by the user or determined by algorithmic
heuristics.
+
.Properties of `parameters`
[%collapsible%open]
=======
`compute_feature_influence`::::
(boolean)
If true, feature influence calculation is enabled.
`feature_influence_threshold`::::
(double)
The minimum {olscore} that a document needs to have to calculate its feature
influence score.
`method`::::
(string)
The method that {oldetection} uses. Possible values are `lof`, `ldof`,
`distance_kth_nn`, `distance_knn`, and `ensemble`.
`n_neighbors`::::
(integer)
The value for how many nearest neighbors each method of {oldetection} uses to
calculate its outlier score.
`outlier_fraction`::::
(double)
The proportion of the data set that is assumed to be outlying prior to
{oldetection}.
`standardization_enabled`::::
(boolean)
If true, then the following operation is performed on the columns before
computing {olscores}: (x_i - mean(x_i)) / sd(x_i).
=======
//End parameters
`timestamp`::::
(date)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timestamp]
//Begin od_timing_stats
`timing_stats`::::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats]
+
.Property of `timing_stats`
[%collapsible%open]
=======
`elapsed_time`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-elapsed]
=======
//End od_timing_stats
======
//End outlier_detection_stats
//Begin regression_stats
`regression_stats`::::
(object)
An object containing information about the {reganalysis}.
+
.Properties of `regression_stats`
[%collapsible%open]
======
//Begin reg_hyperparameters
`hyperparameters`::::
(object)
An object containing the parameters of the {reganalysis}.
+
.Properties of `hyperparameters`
[%collapsible%open]
=======
`alpha`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-alpha]
`downsample_factor`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-downsample-factor]
`eta`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta]
`eta_growth_rate_per_tree`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta-growth]
`feature_bag_fraction`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-feature-bag-fraction]
`gamma`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-gamma]
`lambda`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-lambda]
`max_attempts_to_add_tree`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-attempts]
`max_optimization_rounds_per_hyperparameter`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-optimization-rounds]
`max_trees`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-trees]
`num_folds`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-folds]
`num_splits_per_feature`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-splits]
`soft_tree_depth_limit`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-limit]
`soft_tree_depth_tolerance`::::
(double)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-tolerance]
=======
//End reg_hyperparameters
`iteration`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-iteration]
`timestamp`::::
(date)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timestamp]
//Begin reg_timing_stats
`timing_stats`::::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats]
+
.Properties of `timing_stats`
[%collapsible%open]
=======
`elapsed_time`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-elapsed]
`iteration_time`::::
(integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-iteration]
=======
//End reg_timing_stats
//Begin reg_validation_loss
`validation_loss`::::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss]
+
.Properties of `validation_loss`
[%collapsible%open]
=======
`fold_values`::::
(array of strings)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-fold]
`loss_type`::::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-type]
=======
//End reg_validation_loss
======
//End regression_stats
=====
//End analysis_stats
`assignment_explanation`:::
(string)
For running jobs only, contains messages relating to the selection of a node to
run the job.
//Begin data_counts
`data_counts`:::
(object)
An object that provides counts for the quantity of documents skipped, used in
training, or available for testing.
+
.Properties of `data_counts`
[%collapsible%open]
=====
`skipped_docs_count`:::
(integer)
The number of documents that are skipped during the analysis because they
contained values that are not supported by the analysis. For example,
{oldetection} does not support missing fields so it skips documents with missing
fields. Likewise, all types of analysis skip documents that contain arrays with
more than one element.
`test_docs_count`:::
(integer)
The number of documents that are not used for training the model and can be used
for testing.
`training_docs_count`:::
(integer)
The number of documents that are used for training the model.
=====
//End data_counts
`id`:::
(string)
The unique identifier of the {dfanalytics-job}.
`memory_usage`:::
(Optional, object)
An object describing memory usage of the analytics. It is present only after the
job is started and memory usage is reported.
+
.Properties of `memory_usage`
[%collapsible%open]
=====
`peak_usage_bytes`:::
(long)
The number of bytes used at the highest peak of memory usage.
`timestamp`:::
(date)
The timestamp when memory usage was calculated.
=====
`node`:::
(object)
Contains properties for the node that runs the job. This information is
available only for running jobs.
+
.Properties of `node`
[%collapsible%open]
=====
`attributes`:::
(object)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-attributes]
`ephemeral_id`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
`id`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-id]
`name`:::
(string)
The node name.
`transport_address`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=node-transport-address]
=====
`progress`:::
(array) The progress report of the {dfanalytics-job} by phase.
+
.Properties of phase objects
[%collapsible%open]
=====
`phase`:::
(string) Defines the phase of the {dfanalytics-job}. Possible phases:
`reindexing`, `loading_data`, `analyzing`, and `writing_results`.
`progress_percent`:::
(integer) The progress that the {dfanalytics-job} has made, expressed as a
percentage.
=====
`state`:::
(string) The status of the {dfanalytics-job}, which can be one of the following
values: `analyzing`, `failed`, `reindexing`, `started`, `starting`, `stopped`,
or `stopping`.
====
//End of data_frame_analytics
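The `standardization_enabled` description above gives the operation as
(x_i - mean(x_i)) / sd(x_i). The following Python sketch shows that arithmetic
on a single column. It is an illustration only: the function name
`standardize_column` is hypothetical, and the use of the population standard
deviation is an assumption, because the description does not say which
estimator is applied.

```python
from statistics import fmean, pstdev


def standardize_column(values):
    """Return (x_i - mean(x)) / sd(x) for every x_i in one column.

    Assumption: sd is the population standard deviation. The property
    description does not state which estimator is used.
    """
    mean = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant column has no spread. Returning zeros is a choice
        # made for this sketch only.
        return [0.0 for _ in values]
    return [(x - mean) / sd for x in values]


# Made-up column values, used only to exercise the function.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = standardize_column(column)
```

After this step the column has a mean of zero and unit spread, so no single
field dominates distance-based scores merely because of its scale.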
[[ml-get-dfanalytics-stats-response-codes]]
==== {api-response-codes-title}
@ -79,11 +509,14 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=data-frame-analytics-stats]
[[ml-get-dfanalytics-stats-example]]
==== {api-examples-title}
The following API retrieves usage information for the
{ml-docs}/ecommerce-outliers.html[{oldetection} {dfanalytics-job} example]:
[source,console]
--------------------------------------------------
GET _ml/data_frame/analytics/loganalytics/_stats
GET _ml/data_frame/analytics/ecommerce/_stats
--------------------------------------------------
// TEST[skip:TBD]
// TEST[skip:Kibana sample data]
The API returns the following results:
@ -91,30 +524,55 @@ The API returns the following results:
[source,console-result]
----
{
"count": 1,
"data_frame_analytics": [
"count" : 1,
"data_frame_analytics" : [
{
"id" : "ecommerce",
"state" : "stopped",
"progress" : [
{
"id": "loganalytics",
"state": "stopped",
"progress": [
{
"phase": "reindexing",
"progress_percent": 0
},
{
"phase": "loading_data",
"progress_percent": 0
},
{
"phase": "analyzing",
"progress_percent": 0
},
{
"phase": "writing_results",
"progress_percent": 0
}
]
"phase" : "reindexing",
"progress_percent" : 100
},
{
"phase" : "loading_data",
"progress_percent" : 100
},
{
"phase" : "analyzing",
"progress_percent" : 100
},
{
"phase" : "writing_results",
"progress_percent" : 100
}
]
],
"data_counts" : {
"training_docs_count" : 3321,
"test_docs_count" : 0,
"skipped_docs_count" : 0
},
"memory_usage" : {
"timestamp" : 1586905058000,
"peak_usage_bytes" : 279484
},
"analysis_stats" : {
"outlier_detection_stats" : {
"timestamp" : 1586905058000,
"parameters" : {
"n_neighbors" : 0,
"method" : "ensemble",
"compute_feature_influence" : true,
"feature_influence_threshold" : 0.1,
"outlier_fraction" : 0.05,
"standardization_enabled" : true
},
"timing_stats" : {
"elapsed_time" : 245
}
}
}
}
]
}
----
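As a rough sketch of how a client might read such a response, the following
Python snippet summarizes one entry of `data_frame_analytics`. The helper name
`summarize_job` and the keys of the summary are hypothetical. The JSON is a
trimmed copy of the `ecommerce` result, and `memory_usage` is treated as
optional, as described in the response body.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the example response for the `ecommerce` job.
RESPONSE_TEXT = """
{
  "count": 1,
  "data_frame_analytics": [
    {
      "id": "ecommerce",
      "state": "stopped",
      "progress": [
        {"phase": "reindexing", "progress_percent": 100},
        {"phase": "loading_data", "progress_percent": 100},
        {"phase": "analyzing", "progress_percent": 100},
        {"phase": "writing_results", "progress_percent": 100}
      ],
      "data_counts": {
        "training_docs_count": 3321,
        "test_docs_count": 0,
        "skipped_docs_count": 0
      },
      "memory_usage": {
        "timestamp": 1586905058000,
        "peak_usage_bytes": 279484
      }
    }
  ]
}
"""


def summarize_job(job):
    """Build a small summary dict from one job stats object (hypothetical helper)."""
    counts = job.get("data_counts", {})
    memory = job.get("memory_usage")  # optional: absent until usage is reported
    reported_at = None
    if memory is not None:
        # Timestamps are in milliseconds since the epoch.
        reported_at = datetime.fromtimestamp(
            memory["timestamp"] / 1000, tz=timezone.utc
        ).isoformat()
    return {
        "id": job["id"],
        "state": job["state"],
        "all_phases_complete": all(
            phase["progress_percent"] == 100 for phase in job.get("progress", [])
        ),
        "docs_seen": sum(counts.values()),
        "memory_reported_at": reported_at,
    }


response = json.loads(RESPONSE_TEXT)
summaries = [summarize_job(job) for job in response["data_frame_analytics"]]
```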


@ -371,429 +371,6 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=time-format]
====
end::data-description[]
tag::data-frame-analytics-stats[]
An array of statistics objects for {dfanalytics-jobs}, which are
sorted by the `id` value in ascending order.
//Begin analysis_stats
`analysis_stats`::
(object)
An object containing statistical data about the analysis.
+
.Properties of `analysis_stats`
[%collapsible%open]
====
//Begin classification_stats
`classification_stats`:::
(object)
An object containing statistical data about the {classanalysis}.
+
.Properties of `classification_stats`
[%collapsible%open]
=====
//Begin class_hyperparameters
`hyperparameters`::::
(object)
An object containing the parameters of the {classanalysis}.
+
.Properties of `hyperparameters`
[%collapsible%open]
======
tag::dfas-alpha[]
`alpha`::::
(double)
Regularization factor to penalize deeper trees when training decision trees.
end::dfas-alpha[]
`class_assignment_objective`::::
(string)
Defines whether class assignment maximizes the accuracy or the minimum recall
metric. Possible values are `maximize_accuracy` and `maximize_minimum_recall`.
tag::dfas-downsample-factor[]
`downsample_factor`::::
(double)
The value of the downsample factor.
end::dfas-downsample-factor[]
tag::dfas-eta[]
`eta`::::
(double)
The value of the eta hyperparameter.
end::dfas-eta[]
tag::dfas-eta-growth[]
`eta_growth_rate_per_tree`::::
(double)
Specifies the rate at which the `eta` increases for each new tree that is added to the
forest. For example, a rate of `1.05` increases `eta` by 5%.
end::dfas-eta-growth[]
tag::dfas-feature-bag-fraction[]
`feature_bag_fraction`::::
(double)
The fraction of features that is used when selecting a random bag for each
candidate split.
end::dfas-feature-bag-fraction[]
tag::dfas-gamma[]
`gamma`::::
(double)
Regularization factor to penalize trees with large numbers of nodes.
end::dfas-gamma[]
tag::dfas-lambda[]
`lambda`::::
(double)
Regularization factor to penalize large leaf weights.
end::dfas-lambda[]
tag::dfas-max-attempts[]
`max_attempts_to_add_tree`::::
(integer)
If the algorithm fails to determine a non-trivial tree (more than a single
leaf), this parameter determines how many of such consecutive failures are
tolerated. Once the number of attempts exceeds the threshold, the forest
training stops.
end::dfas-max-attempts[]
tag::dfas-max-optimization-rounds[]
`max_optimization_rounds_per_hyperparameter`::::
(integer)
A multiplier responsible for determining the maximum number of
hyperparameter optimization steps in the Bayesian optimization procedure.
The maximum number of steps is determined based on the number of undefined hyperparameters
times the maximum optimization rounds per hyperparameter.
end::dfas-max-optimization-rounds[]
tag::dfas-max-trees[]
`max_trees`::::
(integer)
The maximum number of trees in the forest.
end::dfas-max-trees[]
tag::dfas-num-folds[]
`num_folds`::::
(integer)
The maximum number of folds for the cross-validation procedure.
end::dfas-num-folds[]
tag::dfas-num-splits[]
`num_splits_per_feature`::::
(integer)
Determines the maximum number of splits for every feature that can occur in a
decision tree when the tree is trained.
end::dfas-num-splits[]
tag::dfas-soft-limit[]
`soft_tree_depth_limit`::::
(double)
Tree depth limit is used for calculating the tree depth penalty. This is a soft
limit; it can be exceeded.
end::dfas-soft-limit[]
tag::dfas-soft-tolerance[]
`soft_tree_depth_tolerance`::::
(double)
Tree depth tolerance is used for calculating the tree depth penalty. This is a
soft limit; it can be exceeded.
end::dfas-soft-tolerance[]
======
//End class_hyperparameters
tag::dfas-iteration[]
`iteration`::::
(integer)
The number of iterations on the analysis.
end::dfas-iteration[]
tag::dfas-timestamp[]
`timestamp`::::
(date)
The timestamp when the statistics were reported in milliseconds since the epoch.
end::dfas-timestamp[]
//Begin class_timing_stats
tag::dfas-timing-stats[]
`timing_stats`::::
(object)
An object containing time statistics about the {dfanalytics-job}.
end::dfas-timing-stats[]
+
.Properties of `timing_stats`
[%collapsible%open]
======
tag::dfas-timing-stats-elapsed[]
`elapsed_time`::::
(integer)
Runtime of the analysis in milliseconds.
end::dfas-timing-stats-elapsed[]
tag::dfas-timing-stats-iteration[]
`iteration_time`::::
(integer)
Runtime of the latest iteration of the analysis in milliseconds.
end::dfas-timing-stats-iteration[]
======
//End class_timing_stats
//Begin class_validation_loss
tag::dfas-validation-loss[]
`validation_loss`::::
(object)
An object containing information about validation loss.
end::dfas-validation-loss[]
+
.Properties of `validation_loss`
[%collapsible%open]
======
tag::dfas-validation-loss-type[]
`loss_type`::::
(string)
The type of the loss metric. For example, `binomial_logistic`.
end::dfas-validation-loss-type[]
tag::dfas-validation-loss-fold[]
`fold_values`::::
(array of strings)
Validation loss values for every added decision tree during the forest growing
procedure.
end::dfas-validation-loss-fold[]
======
//End class_validation_loss
=====
//End classification_stats
//Begin outlier_detection_stats
`outlier_detection_stats`:::
(object)
An object containing statistical data about the {oldetection} job.
+
.Properties of `outlier_detection_stats`
[%collapsible%open]
=====
//Begin parameters
`parameters`::::
(object)
The list of job parameters specified by the user or determined by algorithmic
heuristics.
+
.Properties of `parameters`
[%collapsible%open]
======
`compute_feature_influence`::::
(boolean)
If true, feature influence calculation is enabled.
`feature_influence_threshold`::::
(double)
The minimum {olscore} that a document needs to have to calculate its feature
influence score.
`method`::::
(string)
The method that {oldetection} uses. Possible values are `lof`, `ldof`,
`distance_kth_nn`, `distance_knn`, and `ensemble`.
`n_neighbors`::::
(integer)
The value for how many nearest neighbors each method of {oldetection} uses to
calculate its outlier score.
`outlier_fraction`::::
(double)
The proportion of the data set that is assumed to be outlying prior to
{oldetection}.
`standardization_enabled`::::
(boolean)
If true, then the following operation is performed on the columns before
computing {olscores}: (x_i - mean(x_i)) / sd(x_i).
======
//End parameters
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timestamp]
//Begin od_timing_stats
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats]
+
.Property of `timing_stats`
[%collapsible%open]
======
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-elapsed]
======
//End od_timing_stats
=====
//End outlier_detection_stats
//Begin regression_stats
`regression_stats`:::
(object)
An object containing statistical data about the {reganalysis}.
+
.Properties of `regression_stats`
[%collapsible%open]
=====
//Begin reg_hyperparameters
`hyperparameters`::::
(object)
An object containing the parameters of the {reganalysis}.
+
.Properties of `hyperparameters`
[%collapsible%open]
======
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-alpha]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-downsample-factor]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-eta-growth]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-feature-bag-fraction]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-gamma]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-lambda]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-attempts]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-optimization-rounds]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-max-trees]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-folds]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-num-splits]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-limit]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-soft-tolerance]
======
//End reg_hyperparameters
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-iteration]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timestamp]
//Begin reg_timing_stats
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats]
+
.Properties of `timing_stats`
[%collapsible%open]
======
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-elapsed]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-timing-stats-iteration]
======
//End reg_timing_stats
//Begin reg_validation_loss
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss]
+
.Properties of `validation_loss`
[%collapsible%open]
======
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-type]
include::{docdir}/ml/ml-shared.asciidoc[tag=dfas-validation-loss-fold]
======
//End reg_validation_loss
=====
//End regression_stats
====
//End analysis_stats
`assignment_explanation`:::
(string)
For running jobs only, contains messages relating to the selection of a node to
run the job.
//Begin data_counts
`data_counts`:::
(object)
An object containing statistical data about the documents in the analysis.
+
.Properties of `data_counts`
[%collapsible%open]
====
`skipped_docs_count`:::
(integer)
The number of documents that are skipped during the analysis because they
contained values that are not supported by the analysis. For example,
{oldetection} does not support missing fields so it skips documents with missing
fields. Likewise, all types of analysis skip documents that contain arrays with
more than one element.
`test_docs_count`:::
(integer)
The number of documents that are not used for training the model and can be used
for testing.
`training_docs_count`:::
(integer)
The number of documents that are used for training the model.
====
//End data_counts
`id`:::
(string)
The unique identifier of the {dfanalytics-job}.
`memory_usage`:::
(Optional, object)
An object describing memory usage of the analytics. It is present only after the
job is started and memory usage is reported.
`memory_usage`.`peak_usage_bytes`:::
(long)
The number of bytes used at the highest peak of memory usage.
`memory_usage`.`timestamp`:::
(date)
The timestamp when memory usage was calculated.
`node`:::
(object)
Contains properties for the node that runs the job. This information is
available only for running jobs.
`node`.`attributes`:::
(object)
Lists node attributes such as `ml.machine_memory`, `ml.max_open_jobs`, and
`xpack.installed`.
`node`.`ephemeral_id`:::
(string)
The ephemeral id of the node.
`node`.`id`:::
(string)
The unique identifier of the node.
`node`.`name`:::
(string)
The node name.
`node`.`transport_address`:::
(string)
The host and port where transport HTTP connections are accepted.
`progress`:::
(array) The progress report of the {dfanalytics-job} by phase.
`progress`.`phase`:::
(string) Defines the phase of the {dfanalytics-job}. Possible phases:
`reindexing`, `loading_data`, `analyzing`, and `writing_results`.
`progress`.`progress_percent`:::
(integer) The progress that the {dfanalytics-job} has made, expressed as a
percentage.
`state`:::
(string) Current state of the {dfanalytics-job}.
end::data-frame-analytics-stats[]
tag::datafeed-id[]
A numerical character string that uniquely identifies the
{dfeed}. This identifier can contain lowercase alphanumeric characters (a-z
@ -894,6 +471,106 @@ A unique identifier for the detector. This identifier is based on the order of
the detectors in the `analysis_config`, starting at zero.
end::detector-index[]
tag::dfas-alpha[]
Regularization factor to penalize deeper trees when training decision trees.
end::dfas-alpha[]
tag::dfas-downsample-factor[]
The value of the downsample factor.
end::dfas-downsample-factor[]
tag::dfas-eta[]
The value of the eta hyperparameter.
end::dfas-eta[]
tag::dfas-eta-growth[]
Specifies the rate at which the `eta` increases for each new tree that is added to the
forest. For example, a rate of `1.05` increases `eta` by 5%.
end::dfas-eta-growth[]
tag::dfas-feature-bag-fraction[]
The fraction of features that is used when selecting a random bag for each
candidate split.
end::dfas-feature-bag-fraction[]
tag::dfas-gamma[]
Regularization factor to penalize trees with large numbers of nodes.
end::dfas-gamma[]
tag::dfas-iteration[]
The number of iterations on the analysis.
end::dfas-iteration[]
tag::dfas-lambda[]
Regularization factor to penalize large leaf weights.
end::dfas-lambda[]
tag::dfas-max-attempts[]
If the algorithm fails to determine a non-trivial tree (more than a single
leaf), this parameter determines how many of such consecutive failures are
tolerated. Once the number of attempts exceeds the threshold, the forest
training stops.
end::dfas-max-attempts[]
tag::dfas-max-optimization-rounds[]
A multiplier responsible for determining the maximum number of
hyperparameter optimization steps in the Bayesian optimization procedure.
The maximum number of steps is determined based on the number of undefined hyperparameters
times the maximum optimization rounds per hyperparameter.
end::dfas-max-optimization-rounds[]
tag::dfas-max-trees[]
The maximum number of trees in the forest.
end::dfas-max-trees[]
tag::dfas-num-folds[]
The maximum number of folds for the cross-validation procedure.
end::dfas-num-folds[]
tag::dfas-num-splits[]
Determines the maximum number of splits for every feature that can occur in a
decision tree when the tree is trained.
end::dfas-num-splits[]
tag::dfas-soft-limit[]
Tree depth limit is used for calculating the tree depth penalty. This is a soft
limit; it can be exceeded.
end::dfas-soft-limit[]
tag::dfas-soft-tolerance[]
Tree depth tolerance is used for calculating the tree depth penalty. This is a
soft limit; it can be exceeded.
end::dfas-soft-tolerance[]
tag::dfas-timestamp[]
The timestamp when the statistics were reported in milliseconds since the epoch.
end::dfas-timestamp[]
tag::dfas-timing-stats[]
An object containing time statistics about the {dfanalytics-job}.
end::dfas-timing-stats[]
tag::dfas-timing-stats-elapsed[]
Runtime of the analysis in milliseconds.
end::dfas-timing-stats-elapsed[]
tag::dfas-timing-stats-iteration[]
Runtime of the latest iteration of the analysis in milliseconds.
end::dfas-timing-stats-iteration[]
tag::dfas-validation-loss[]
An object containing information about validation loss.
end::dfas-validation-loss[]
tag::dfas-validation-loss-fold[]
Validation loss values for every added decision tree during the forest growing
procedure.
end::dfas-validation-loss-fold[]
tag::dfas-validation-loss-type[]
The type of the loss metric. For example, `binomial_logistic`.
end::dfas-validation-loss-type[]
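The arithmetic in the `dfas-eta-growth` and `dfas-max-optimization-rounds`
descriptions above can be sketched in Python as follows. Both function names
are hypothetical, and the compounding of `eta` over several trees is an
assumption. The text states only that a rate of `1.05` increases `eta` by 5%
for each new tree.

```python
def max_optimization_steps(undefined_hyperparameters, rounds_per_hyperparameter):
    """Number of undefined hyperparameters times the rounds per hyperparameter."""
    return undefined_hyperparameters * rounds_per_hyperparameter


def eta_after_trees(initial_eta, growth_rate_per_tree, trees_added):
    """Value of eta after a number of trees have been added.

    Assumption: the growth rate is applied multiplicatively once per tree,
    so the increases compound.
    """
    return initial_eta * growth_rate_per_tree ** trees_added


# Illustrative values only; they do not come from a real job.
steps = max_optimization_steps(undefined_hyperparameters=3, rounds_per_hyperparameter=2)
eta_one_tree = eta_after_trees(initial_eta=0.1, growth_rate_per_tree=1.05, trees_added=1)
```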
tag::earliest-record-timestamp[]
The timestamp of the chronologically earliest input document.
end::earliest-record-timestamp[]
@ -1334,6 +1011,10 @@ tag::node-address[]
The network address of the node.
end::node-address[]
tag::node-attributes[]
Lists node attributes such as `ml.machine_memory` or `ml.max_open_jobs` settings.
end::node-attributes[]
tag::node-datafeeds[]
For started {dfeeds} only, this information pertains to the node upon which the
{dfeed} is started.
@ -1352,6 +1033,10 @@ Contains properties for the node that runs the job. This information is
available only for open jobs.
end::node-jobs[]
tag::node-transport-address[]
The host and port where transport HTTP connections are accepted.
end::node-transport-address[]
tag::open-time[]
For open jobs only, the elapsed time for which the job has been open.
end::open-time[]