127 lines
4.4 KiB
Plaintext
127 lines
4.4 KiB
Plaintext
[[ml-revert-snapshot]]
|
|
==== Revert Model Snapshots
|
|
|
|
The revert model snapshot API allows you to revert to a specific snapshot.
|
|
|
|
===== Request
|
|
|
|
`POST _xpack/ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_revert`
|
|
|
|
|
|
===== Description
|
|
|
|
The {ml} feature in {xpack} reacts quickly to anomalous input, learning new behaviors in data.
|
|
Highly anomalous input increases the variance in the models whilst the system learns
|
|
whether this is a new step-change in behavior or a one-off event. In the case
|
|
where this anomalous input is known to be a one-off, then it might be appropriate
|
|
to reset the model state to a time before this event. For example, you might
|
|
consider reverting to a saved snapshot after Black Friday
|
|
or a critical system failure.
|
|
|
|
////
|
|
To revert to a saved snapshot, you must follow this sequence:
|
|
. Close the job
|
|
. Revert to a snapshot
|
|
. Open the job
|
|
. Send new data to the job
|
|
|
|
When reverting to a snapshot, there is a choice to make about whether or not
|
|
you want to keep the results that were created between the time of the snapshot
|
|
and the current time. In the case of Black Friday for instance, you might want
|
|
to keep the results and carry on processing data from the current time,
|
|
though without the models learning the one-off behavior and compensating for it.
|
|
However, say in the event of a critical system failure and you decide to reset
|
|
and models to a previous known good state and process data from that time,
|
|
it makes sense to delete the intervening results for the known bad period and
|
|
resend data from that earlier time.
|
|
|
|
Any gaps in data since the snapshot time will be treated as nulls and not modeled.
|
|
If there is a partial bucket at the end of the snapshot and/or at the beginning
|
|
of the new input data, then this will be ignored and treated as a gap.
|
|
|
|
For jobs with many entities, the model state may be very large.
|
|
If a model state is several GB, this could take 10-20 mins to revert depending
|
|
upon machine spec and resources. If this is the case, please ensure this time
|
|
is planned for.
|
|
Model size (in bytes) is available as part of the Job Resource Model Size Stats.
|
|
////
|
|
IMPORTANT: Before you revert to a saved snapshot, you must close the job.
|
|
Sending data to a closed job changes its status to `open`, so you must also
|
|
ensure that you do not expect data imminently.
|
|
|
|
===== Path Parameters
|
|
|
|
`job_id` (required)::
|
|
(+string+) Identifier for the job
|
|
|
|
`snapshot_id` (required)::
|
|
(+string+) Identifier for the model snapshot
|
|
|
|
===== Request Body
|
|
|
|
`delete_intervening_results`::
|
|
(+boolean+) If true, deletes the results in the time period between the
|
|
latest results and the time of the reverted snapshot. It also resets the
|
|
model to accept records for this time period.
|
|
|
|
NOTE: If you choose not to delete intervening results when reverting a snapshot,
|
|
the job will not accept input data that is older than the current time.
|
|
If you want to resend data, then delete the intervening results.
|
|
|
|
////
|
|
===== Responses
|
|
|
|
TBD
|
|
200
|
|
(EmptyResponse) The cluster has been successfully deleted
|
|
404
|
|
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
|
|
412
|
|
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
|
|
////
|
|
|
|
===== Examples
|
|
|
|
The following example reverts to the `1491856080` snapshot for the
|
|
`it_ops_new_kpi` job:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
POST
|
|
_xpack/ml/anomaly_detectors/it_ops_new_kpi/model_snapshots/1491856080/_revert
|
|
{
|
|
"delete_intervening_results": true
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[skip:todo]
|
|
|
|
When the operation is complete, you receive the following results:
|
|
----
|
|
{
|
|
"acknowledged": true,
|
|
"model": {
|
|
"job_id": "it_ops_new_kpi",
|
|
"timestamp": 1491856080000,
|
|
"description": "State persisted due to job close at 2017-04-10T13:28:00-0700",
|
|
"snapshot_id": "1491856080",
|
|
"snapshot_doc_count": 1,
|
|
"model_size_stats": {
|
|
"job_id": "it_ops_new_kpi",
|
|
"result_type": "model_size_stats",
|
|
"model_bytes": 29518,
|
|
"total_by_field_count": 3,
|
|
"total_over_field_count": 0,
|
|
"total_partition_field_count": 2,
|
|
"bucket_allocation_failures_count": 0,
|
|
"memory_status": "ok",
|
|
"log_time": 1491856080000,
|
|
"timestamp": 1455318000000
|
|
},
|
|
"latest_record_time_stamp": 1455318669000,
|
|
"latest_result_time_stamp": 1455318000000,
|
|
"retain": false
|
|
}
|
|
}
|
|
----
|