cdc3a260af
* Add retention to Snapshot Lifecycle Management (#46407)
This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria.
An example policy would look like:
```
PUT /_slm/policy/snapshot-every-day
{
  "schedule": "0 30 2 * * ?",
  "name": "<production-snap-{now/d}>",
  "repository": "my-s3-repository",
  "config": {
    "indices": ["foo-*", "important"]
  },
  // Newly configured retention options
  "retention": {
    // Snapshots should be deleted after 14 days
    "expire_after": "14d",
    // Keep a maximum of thirty snapshots
    "max_count": 30,
    // Keep a minimum of the four most recent snapshots
    "min_count": 4
  }
}
```
SLM retention runs on a schedule configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions run for a configurable amount of time, bounded by the `slm.retention_duration` setting, which defaults to 1 hour.
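Both settings can be adjusted on a running cluster; a sketch using the standard cluster settings API (the values here are arbitrary examples, not defaults):
```
PUT /_cluster/settings
{
  "persistent": {
    // run retention at 1:30am instead of the configured default
    "slm.retention_schedule": "0 30 1 * * ?",
    // allow retention to spend up to two hours deleting snapshots
    "slm.retention_duration": "2h"
  }
}
```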
Included in this work is a new SLM stats API endpoint available through
``` json
GET /_slm/stats
```
This returns statistics about snapshots taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API.
* Add base framework for snapshot retention (#43605)
* Add base framework for snapshot retention
This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask`
to start as the basis for SLM's retention implementation.
Relates to #38461
* Remove extraneous 'public'
* Use a local var instead of reading class var repeatedly
* Add SnapshotRetentionConfiguration for retention configuration (#43777)
* Add SnapshotRetentionConfiguration for retention configuration
This commit adds the `SnapshotRetentionConfiguration` class and its HLRC
counterpart to encapsulate the configuration for SLM retention.
Currently only a single parameter is supported as an example (we still
need to discuss the different options we want to support and their
names) to keep the size of the PR down. It also does not yet include version serialization checks
since the original SLM branch has not yet been merged.
Relates to #43663
* Fix REST tests
* Fix more documentation
* Use Objects.equals to avoid NPE
* Put `randomSnapshotLifecyclePolicy` in only one place
* Occasionally return retention with no configuration
* Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764)
* Implement SnapshotRetentionTask's snapshot filtering and deletion
This commit implements the snapshot filtering and deletion for
`SnapshotRetentionTask`. Currently only the expire-after age is used for
determining whether a snapshot is eligible for deletion.
Relates to #43663
* Fix deletes running on the wrong thread
* Handle missing or null policy in snap metadata differently
* Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>>
* Use the `OriginSettingClient` to work with security, enhance logging
* Prevent NPE in test by mocking Client
* Allow empty/missing SLM retention configuration (#45018)
Semi-related to #44465, this allows the `"retention"` configuration map
to be missing.
Relates to #43663
* Add min_count and max_count as SLM retention predicates (#44926)
This adds the configuration options for `min_count` and `max_count` as
well as the logic for determining whether a snapshot meets this criteria
to SLM's retention feature.
These options are optional and one, two, or all three can be specified
in an SLM policy.
Relates to #43663
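Since the predicates are independent, any subset of them can appear in a policy's retention block; a minimal sketch of a fragment using two of the three:
```
"retention": {
  // delete snapshots older than 30 days...
  "expire_after": "30d",
  // ...but always keep at least the five most recent
  "min_count": 5
}
```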
* Time-bound deletion of snapshots in retention delete function (#45065)
* Time-bound deletion of snapshots in retention delete function
With a cluster that has a large number of snapshots, it's possible that
snapshot deletion can take a very long time (especially since deletes
currently have to happen in a serial fashion). To prevent snapshot
deletion from taking forever in a cluster and blocking other operations,
this commit adds a setting to allow configuring a maximum time to spend
deleting snapshots during retention. This dynamic setting defaults to 1
hour and is best-effort, meaning that it doesn't hard-stop a deletion
at the hour mark, but ensures that once the time has passed, all
subsequent deletions are deferred until the next retention cycle.
Relates to #43663
* Wow snapshots suuuure can take a long time.
* Use a LongSupplier instead of actually sleeping
* Remove TestLogging annotation
* Remove rate limiting
* Add SLM metrics gathering and endpoint (#45362)
* Add SLM metrics gathering and endpoint
This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster
takes. These actions are stored in `SnapshotLifecycleStats` and persisted in cluster state. The
stats stored include the number of snapshots taken, failed, deleted, the number of retention runs,
as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount
of time spent deleting snapshots from SLM retention.
This commit also adds an endpoint for retrieving all stats (further commits will expose this in the
SLM get-policy API) that looks like:
```
GET /_slm/stats
{
  "retention_runs" : 13,
  "retention_failed" : 0,
  "retention_timed_out" : 0,
  "retention_deletion_time" : "1.4s",
  "retention_deletion_time_millis" : 1404,
  "policy_metrics" : {
    "daily-snapshots2" : {
      "snapshots_taken" : 7,
      "snapshots_failed" : 0,
      "snapshots_deleted" : 6,
      "snapshot_deletion_failures" : 0
    },
    "daily-snapshots" : {
      "snapshots_taken" : 12,
      "snapshots_failed" : 0,
      "snapshots_deleted" : 12,
      "snapshot_deletion_failures" : 6
    }
  },
  "total_snapshots_taken" : 19,
  "total_snapshots_failed" : 0,
  "total_snapshots_deleted" : 18,
  "total_snapshot_deletion_failures" : 6
}
```
This does not yet include HLRC for this, as this commit is quite large on its own. That will be
added in a subsequent commit.
Relates to #43663
* Version qualify serialization
* Initialize counters outside constructor
* Use computeIfAbsent instead of being too verbose
* Move part of XContent generation into subclass
* Fix REST action for master merge
* Unused import
* Record history of SLM retention actions (#45513)
This commit records the deletion of snapshots by the retention component
of SLM into the SLM history index for the purposes of reviewing operations
taken by SLM and alerting.
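Retention deletions can then be reviewed with an ordinary search; a hypothetical query, assuming the history is written to indices matching `.slm-history-*` and that deletions are recorded with an `operation` value of `DELETE`:
```
GET /.slm-history-*/_search
{
  "query": {
    // "operation" and "DELETE" are assumed field/value names
    "term": { "operation": "DELETE" }
  }
}
```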
* Retry SLM retention after currently running snapshot completes (#45802)
* Retry SLM retention after currently running snapshot completes
This commit adds a ClusterStateObserver to wait until the currently
running snapshot is complete before proceeding with snapshot deletion.
SLM retention waits up to the maximum allowed deletion time for the
snapshot to complete; however, the waiting time is not factored into
the limit on actual deletions.
Relates to #43663
* Increase timeout waiting for snapshot completion
* Apply patch from 2374316f0d.patch
* Rename test variables
* [TEST] Be less strict for stats checking
* Skip SLM retention if ILM is STOPPING or STOPPED (#45869)
This adds a check to ensure we take no action during SLM retention if
ILM is currently stopped or in the process of stopping.
Relates to #43663
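For reference, the ILM operation mode that this check consults is exposed through the existing ILM status API:
```
GET /_ilm/status
```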
* Check all actions preventing snapshot delete during retention (#45992)
* Check all actions preventing snapshot delete during retention run
Previously we only checked to see if a snapshot was currently running,
but it turns out that more things can block snapshot deletion. This
changes the check to be a check for:
- a snapshot currently running
- a deletion already in progress
- a repo cleanup in progress
- a restore currently running
This was found by CI, where a third-party delete in a test caused SLM
retention deletion to throw an exception.
Relates to #43663
* Add unit test for okayToDeleteSnapshots
* Fix bug where SLM retention task would be scheduled on every node
* Enhance test logging
* Ignore if snapshot is already deleted
* Missing import
* Fix SnapshotRetentionServiceTests
* Expose SLM policy stats in get SLM policy API (#45989)
This also adds support for the SLM stats endpoint to the high level rest client.
Retrieving a policy now looks like:
```json
{
  "daily-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<daily-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["data-*", "important"],
        "ignore_unavailable": false,
        "include_global_state": false
      },
      "retention": {}
    },
    "stats": {
      "snapshots_taken": 0,
      "snapshots_failed": 0,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    },
    "next_execution": "2019-04-24T01:30:00.000Z",
    "next_execution_millis": 1556048160000
  }
}
```
Relates to #43663
* Rewrite SnapshotLifecycleIT as an ESIntegTestCase (#46356)
* Rewrite SnapshotLifecycleIT as an ESIntegTestCase
This commit splits `SnapshotLifecycleIT` into two different tests.
`SnapshotLifecycleRestIT` which includes the tests that do not require
slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an
integration test using `MockRepository` to simulate a snapshot being in
progress.
Relates to #43663
Resolves #46205
* Add error logging when exceptions are thrown
* Update serialization versions
* Fix type inference
* Use non-Cancellable HLRC return value
* Fix Client mocking in test
* Fix SLMSnapshotBlockingIntegTests for 7.x branch
* Update SnapshotRetentionTask for non-multi-repo snapshot retrieval
* Add serialization guards for SnapshotLifecyclePolicy
[role="xpack"]
|
|
[testenv="basic"]
|
|
[[snapshot-lifecycle-management-api]]
|
|
== Snapshot lifecycle management API
|
|
|
|
The Snapshot Lifecycle Management APIs are used to manage policies for the time
|
|
and frequency of automatic snapshots. Snapshot Lifecycle Management is related
|
|
to <<index-lifecycle-management,Index Lifecycle Management>>, however, instead
|
|
of managing a lifecycle of actions that are performed on a single index, SLM
|
|
allows configuring policies spanning multiple indices.
|
|
|
|
SLM policy management is split into three different CRUD APIs, a way to put or update
|
|
policies, a way to retrieve policies, and a way to delete unwanted policies, as
|
|
well as a separate API for immediately invoking a snapshot based on a policy.
|
|
|
|
Since SLM falls under the same category as ILM, it is stopped and started by
|
|
using the <<start-stop-ilm,start and stop>> ILM APIs.
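For example, stopping and later restarting SLM (together with ILM) uses those
same endpoints:

[source,console]
--------------------------------------------------
POST /_ilm/stop

POST /_ilm/start
--------------------------------------------------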

[[slm-api-put]]
=== Put Snapshot Lifecycle Policy API

Creates or updates a snapshot policy. If the policy already exists, the version
is incremented. Only the latest version of a policy is stored.

When a policy is created, it is immediately scheduled based on the schedule of
the policy. When a policy is updated, its schedule changes are immediately
applied.

==== Path Parameters

`policy_id` (required)::
  (string) Identifier (id) for the policy.

==== Request Parameters

include::{docdir}/rest-api/common-parms.asciidoc[tag=timeoutparms]

==== Authorization

You must have the `manage_slm` cluster privilege to use this API. You must also
have the `manage` index privilege on all indices being managed by `policy`. All
operations executed by {slm} for a policy are executed as the user that put the
latest version of a policy. For more information, see
{stack-ov}/security-privileges.html[Security Privileges].
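
As a concrete sketch (the role name here is arbitrary and the index patterns
mirror the example policy below), a role granting these privileges might be
created with the security create-role API:

[source,console]
--------------------------------------------------
POST /_security/role/slm-admin
{
  "cluster": ["manage_slm"],
  "indices": [
    {
      "names": ["data-*", "important"],
      "privileges": ["manage"]
    }
  ]
}
--------------------------------------------------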

==== Example

The following creates a snapshot lifecycle policy with an id of
`daily-snapshots`:

[source,console]
--------------------------------------------------
PUT /_slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?", <1>
  "name": "<daily-snap-{now/d}>", <2>
  "repository": "my_repository", <3>
  "config": { <4>
    "indices": ["data-*", "important"], <5>
    "ignore_unavailable": false,
    "include_global_state": false
  },
  "retention": {}
}
--------------------------------------------------
// TEST[setup:setup-repository]

<1> When the snapshot should be taken, in this case, 1:30am daily
<2> The name each snapshot should be given
<3> Which repository to take the snapshot in
<4> Any extra snapshot configuration
<5> Which indices the snapshot should contain

The top-level keys that the policy supports are described below:

|==================
| Key | Description

| `schedule` | A periodic or absolute time schedule. Supports all values
supported by the cron scheduler:
{xpack-ref}/trigger-schedule.html#schedule-cron[Cron scheduler configuration]

| `name` | A name automatically given to each snapshot performed by this policy.
Supports the same <<date-math-index-names,date math>> supported in index
names. A UUID is automatically appended to the end of the name to prevent
conflicting snapshot names.

| `repository` | The snapshot repository that will contain snapshots created by
this policy. The repository must exist prior to the policy's creation and can
be created with the <<modules-snapshots,snapshot repository API>>.

| `config` | Configuration for each snapshot that will be created by this
policy. Any configuration is included with <<modules-snapshots,create snapshot
requests>> issued by this policy.
|==================

To update an existing policy, simply use the put snapshot lifecycle policy API
with the same policy id as an existing policy.

[[slm-api-get]]
=== Get Snapshot Lifecycle Policy API

Once a policy is in place, you can retrieve one or more of the policies using
the get snapshot lifecycle policy API. This also includes information about
each policy's latest successful and failed snapshot invocations.

==== Path Parameters

`policy_ids` (optional)::
  (string) Comma-separated ids of policies to retrieve.

==== Examples

To retrieve a policy, perform a `GET` with the policy's id:

[source,console]
--------------------------------------------------
GET /_slm/policy/daily-snapshots?human
--------------------------------------------------
// TEST[continued]

The output looks similar to the following:

[source,console-result]
--------------------------------------------------
{
  "daily-snapshots" : {
    "version": 1, <1>
    "modified_date": "2019-04-23T01:30:00.000Z", <2>
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<daily-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["data-*", "important"],
        "ignore_unavailable": false,
        "include_global_state": false
      },
      "retention": {}
    },
    "stats": {
      "snapshots_taken": 0,
      "snapshots_failed": 0,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    },
    "next_execution": "2019-04-24T01:30:00.000Z", <3>
    "next_execution_millis": 1556048160000
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"modified_date": "2019-04-23T01:30:00.000Z"/"modified_date": $body.daily-snapshots.modified_date/ s/"modified_date_millis": 1556048137314/"modified_date_millis": $body.daily-snapshots.modified_date_millis/ s/"next_execution": "2019-04-24T01:30:00.000Z"/"next_execution": $body.daily-snapshots.next_execution/ s/"next_execution_millis": 1556048160000/"next_execution_millis": $body.daily-snapshots.next_execution_millis/]

<1> The version of the snapshot policy; only the latest version is stored, and it is incremented when the policy is updated
<2> The last time this policy was modified
<3> The next time this policy will be executed

Or, to retrieve all policies:

[source,console]
--------------------------------------------------
GET /_slm/policy
--------------------------------------------------
// TEST[continued]

[[slm-api-execute]]
=== Execute Snapshot Lifecycle Policy API

Sometimes it can be useful to immediately execute a snapshot based on a policy,
perhaps before an upgrade or before performing other maintenance on indices. The
execute snapshot policy API allows you to perform a snapshot immediately without
waiting for a policy's scheduled invocation.

==== Path Parameters

`policy_id` (required)::
  (string) Id of the policy to execute

==== Example

To take an immediate snapshot using a policy, use the following:

[source,console]
--------------------------------------------------
PUT /_slm/policy/daily-snapshots/_execute
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]

This API will immediately return with the generated snapshot name:

[source,console-result]
--------------------------------------------------
{
  "snapshot_name": "daily-snap-2019.04.24-gwrqoo2xtea3q57vvg0uea"
}
--------------------------------------------------
// TESTRESPONSE[skip:we can't handle snapshots from docs tests]

The snapshot will be taken in the background; you can use the
<<modules-snapshots,snapshot APIs>> to monitor the status of the snapshot.
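
For example, one way to check on the snapshot started above is the snapshot
status API, using the repository from the policy and the returned snapshot
name:

[source,console]
--------------------------------------------------
GET /_snapshot/my_repository/daily-snap-2019.04.24-gwrqoo2xtea3q57vvg0uea/_status
--------------------------------------------------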

Once a snapshot has been kicked off, you can see the latest successful or failed
snapshot using the get snapshot lifecycle policy API:

[source,console]
--------------------------------------------------
GET /_slm/policy/daily-snapshots?human
--------------------------------------------------
// TEST[skip:we already tested get policy above, the last_failure may not be present though]

Which, in this case, shows an error because the index did not exist:

[source,console-result]
--------------------------------------------------
{
  "daily-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<daily-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["data-*", "important"],
        "ignore_unavailable": false,
        "include_global_state": false
      },
      "retention": {}
    },
    "stats": {
      "snapshots_taken": 0,
      "snapshots_failed": 1,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    },
    "last_failure": { <1>
      "snapshot_name": "daily-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
      "time_string": "2019-04-02T01:30:00.000Z",
      "time": 1556042030000,
      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
    },
    "next_execution": "2019-04-24T01:30:00.000Z",
    "next_execution_millis": 1556048160000
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure is asynchronous and will be present for users, but is untestable]

<1> The last failed snapshot initiated by this policy, along with the details of its failure

In this case, it failed due to the "important" index not existing and the
`ignore_unavailable` setting being set to `false`.

Updating the policy to change the `ignore_unavailable` setting is done using the
same put snapshot lifecycle policy API:

[source,console]
--------------------------------------------------
PUT /_slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["data-*", "important"],
    "ignore_unavailable": true,
    "include_global_state": false
  }
}
--------------------------------------------------
// TEST[continued]

Another snapshot can immediately be executed to ensure the new policy works:

[source,console]
--------------------------------------------------
PUT /_slm/policy/daily-snapshots/_execute
--------------------------------------------------
// TEST[skip:we can't handle snapshots in docs tests]

[source,console-result]
--------------------------------------------------
{
  "snapshot_name": "daily-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a"
}
--------------------------------------------------
// TESTRESPONSE[skip:we can't handle snapshots in docs tests]

Now retrieving the policy shows that the policy has successfully been executed:

[source,console]
--------------------------------------------------
GET /_slm/policy/daily-snapshots?human
--------------------------------------------------
// TEST[skip:we already tested this above and the output may not be available yet]

Which now includes the successful snapshot information:

[source,console-result]
--------------------------------------------------
{
  "daily-snapshots" : {
    "version": 2, <1>
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<daily-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["data-*", "important"],
        "ignore_unavailable": true,
        "include_global_state": false
      },
      "retention": {}
    },
    "stats": {
      "snapshots_taken": 1,
      "snapshots_failed": 1,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    },
    "last_success": { <2>
      "snapshot_name": "daily-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a",
      "time_string": "2019-04-24T16:43:49.316Z",
      "time": 1556124229316
    },
    "last_failure": {
      "snapshot_name": "daily-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
      "time_string": "2019-04-02T01:30:00.000Z",
      "time": 1556042030000,
      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
    },
    "next_execution": "2019-04-24T01:30:00.000Z",
    "next_execution_millis": 1556048160000
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]

<1> The policy's version has been incremented because it was updated
<2> The last successfully initiated snapshot information

It is a good idea to test policies using the execute API to ensure they work.

[[slm-get-stats]]
=== Get Snapshot Lifecycle Stats API

SLM stores statistics on a global and per-policy level about actions taken. These stats can be
retrieved by using the following API:

==== Example

[source,console]
--------------------------------------------------
GET /_slm/stats
--------------------------------------------------
// TEST[continued]

Which returns a response similar to:

[source,console-result]
--------------------------------------------------
{
  "retention_runs": 13,
  "retention_failed": 0,
  "retention_timed_out": 0,
  "retention_deletion_time": "1.4s",
  "retention_deletion_time_millis": 1404,
  "policy_metrics": {
    "daily-snapshots": {
      "snapshots_taken": 1,
      "snapshots_failed": 1,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    }
  },
  "total_snapshots_taken": 1,
  "total_snapshots_failed": 1,
  "total_snapshots_deleted": 0,
  "total_snapshot_deletion_failures": 0
}
--------------------------------------------------
// TESTRESPONSE[s/runs": 13/runs": $body.retention_runs/ s/_failed": 0/_failed": $body.retention_failed/ s/_timed_out": 0/_timed_out": $body.retention_timed_out/ s/"1.4s"/$body.retention_deletion_time/ s/1404/$body.retention_deletion_time_millis/]

[[slm-api-delete]]
=== Delete Snapshot Lifecycle Policy API

A policy can be deleted by issuing a delete request with the policy id. Note
that this prevents any future snapshots from being taken, but does not cancel
any currently ongoing snapshots or remove any previously taken snapshots.

==== Path Parameters

`policy_id` (required)::
  (string) Id of the policy to remove.

==== Example

[source,console]
--------------------------------------------------
DELETE /_slm/policy/daily-snapshots
--------------------------------------------------
// TEST[continued]