OpenSearch/docs/reference/ml/anomaly-detection/apis/get-snapshot.asciidoc

252 lines
7.2 KiB
Plaintext

[role="xpack"]
[testenv="platinum"]
[[ml-get-snapshot]]
=== Get model snapshots API
++++
<titleabbrev>Get model snapshots</titleabbrev>
++++
Retrieves information about model snapshots.
[[ml-get-snapshot-request]]
==== {api-request-title}
`GET _ml/anomaly_detectors/<job_id>/model_snapshots` +
`GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>`
[[ml-get-snapshot-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have `monitor_ml`,
`monitor`, `manage_ml`, or `manage` cluster privileges to use this API. See
<<security-privileges>>.
[[ml-get-snapshot-path-parms]]
==== {api-path-parms-title}
`<job_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
`<snapshot_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=snapshot-id]
+
--
If you do not specify this optional parameter, the API returns information about
all model snapshots.
--
[[ml-get-snapshot-request-body]]
==== {api-request-body-title}
`desc`::
(Optional, boolean) If true, the results are sorted in descending order.
`end`::
(Optional, date) Returns snapshots with timestamps earlier than this time.
`from`::
(Optional, integer) Skips the specified number of snapshots.
`size`::
(Optional, integer) Specifies the maximum number of snapshots to obtain.
`sort`::
(Optional, string) Specifies the sort field for the requested snapshots. By
default, the snapshots are sorted by their timestamp.
`start`::
(Optional, string) Returns snapshots with timestamps after this time.
[role="child_attributes"]
[[ml-get-snapshot-results]]
==== {api-response-body-title}
The API returns an array of model snapshot objects, which have the following
properties:
`description`::
(string) An optional description of the job.
`job_id`::
(string) A numerical character string that uniquely identifies the job that
the snapshot was created for.
`latest_record_time_stamp`::
(date) The timestamp of the latest processed record.
`latest_result_time_stamp`::
(date) The timestamp of the latest bucket result.
`min_version`::
(string) The minimum version required to be able to restore the model snapshot.
//Begin model_size_stats
`model_size_stats`::
(object) Summary information describing the model.
+
.Properties of `model_size_stats`
[%collapsible%open]
====
`bucket_allocation_failures_count`:::
(long) The number of buckets for which entities were not processed due to memory
limit constraints.
`categorized_doc_count`:::
(long) The number of documents that have had a field categorized.
`categorization_status`:::
(string) The status of categorization for this job.
Contains one of the following values.
+
--
* `ok`: Categorization is performing acceptably well (or not being
used at all).
* `warn`: Categorization is detecting a distribution of categories
that suggests the input data is inappropriate for categorization.
Problems could be that there is only one category, more than 90% of
categories are rare, the number of categories is greater than 50% of
the number of categorized documents, there are no frequently
matched categories, or more than 50% of categories are dead.
--
`dead_category_count`:::
(long) The number of categories created by categorization that will
never be assigned again because another category's definition
makes it a superset of the dead category. (Dead categories are a
side effect of the way categorization has no prior training.)
`failed_category_count`:::
(long)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=failed-category-count]
`frequent_category_count`:::
(long) The number of categories that match more than 1% of categorized
documents.
`job_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
`log_time`:::
(date) The timestamp that the `model_size_stats` were recorded, according to
server-time.
`memory_status`:::
(string) The status of the memory in relation to its `model_memory_limit`.
Contains one of the following values.
+
--
* `hard_limit`: The internal models require more space that the configured
memory limit. Some incoming data could not be processed.
* `ok`: The internal models stayed below the configured value.
* `soft_limit`: The internal models require more than 60% of the configured
memory limit and more aggressive pruning will be performed in order to try to
reclaim space.
--
`model_bytes`:::
(long) An approximation of the memory resources required for this analysis.
`model_bytes_exceeded`:::
(long) The number of bytes over the high limit for memory usage at the last allocation failure.
`model_bytes_memory_limit`:::
(long) The upper limit for memory usage, checked on increasing values.
`rare_category_count`:::
(long) The number of categories that match just one categorized document.
`result_type`:::
(string) Internal. This value is always `model_size_stats`.
`timestamp`:::
(date) The timestamp that the `model_size_stats` were recorded, according to the
bucket timestamp of the data.
`total_by_field_count`:::
(long) The number of _by_ field values analyzed. Note that these are counted
separately for each detector and partition.
`total_category_count`:::
(long) The number of categories created by categorization.
`total_over_field_count`:::
(long) The number of _over_ field values analyzed. Note that these are counted
separately for each detector and partition.
`total_partition_field_count`:::
(long) The number of _partition_ field values analyzed.
====
//End model_size_stats
`retain`::
(boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=retain]
`snapshot_id`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=snapshot-id]
`snapshot_doc_count`::
(long) For internal use only.
`timestamp`::
(date) The creation timestamp for the snapshot.
[[ml-get-snapshot-example]]
==== {api-examples-title}
[source,console]
--------------------------------------------------
GET _ml/anomaly_detectors/high_sum_total_sales/model_snapshots
{
"start": "1575402236000"
}
--------------------------------------------------
// TEST[skip:Kibana sample data]
In this example, the API provides a single result:
[source,js]
----
{
"count" : 1,
"model_snapshots" : [
{
"job_id" : "high_sum_total_sales",
"min_version" : "6.4.0",
"timestamp" : 1575402237000,
"description" : "State persisted due to job close at 2019-12-03T19:43:57+0000",
"snapshot_id" : "1575402237",
"snapshot_doc_count" : 1,
"model_size_stats" : {
"job_id" : "high_sum_total_sales",
"result_type" : "model_size_stats",
"model_bytes" : 1638816,
"model_bytes_exceeded" : 0,
"model_bytes_memory_limit" : 10485760,
"total_by_field_count" : 3,
"total_over_field_count" : 3320,
"total_partition_field_count" : 2,
"bucket_allocation_failures_count" : 0,
"memory_status" : "ok",
"categorized_doc_count" : 0,
"total_category_count" : 0,
"frequent_category_count" : 0,
"rare_category_count" : 0,
"dead_category_count" : 0,
"categorization_status" : "ok",
"log_time" : 1575402237000,
"timestamp" : 1576965600000
},
"latest_record_time_stamp" : 1576971072000,
"latest_result_time_stamp" : 1576965600000,
"retain" : false
}
]
}
----