[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)
* Add SnapshotLifecycleService and related CRUD APIs
This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.
This also includes the get, put, and delete APIs for these policies
Relates to #38461
* Make scheduledJobIds return an immutable set
* Use Object.equals for SnapshotLifecyclePolicy
* Remove unneeded TODO
* Implement ToXContentFragment on SnapshotLifecyclePolicyItem
* Copy contents of the scheduledJobIds
* Handle snapshot lifecycle policy updates and deletions (#40062)
(Note this is a PR against the `snapshot-lifecycle-management` feature branch)
This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.
Relates to #38461
* Take a snapshot for the policy when the SLM policy is triggered (#40383)
(This is a PR for the `snapshot-lifecycle-management` branch)
This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.
This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.
Relates to #38461
* Record most recent snapshot policy success/failure (#40619)
Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.
This is the first step toward writing history in a more permanent
fashion.
* Validate snapshot lifecycle policies (#40654)
(This is a PR against the `snapshot-lifecycle-management` branch)
With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.
Part of #38461
* Hook SLM into ILM's start and stop APIs (#40871)
(This pull request is for the `snapshot-lifecycle-management` branch)
This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.
Relates to #38461
* Add tests for SnapshotLifecyclePolicyItem (#40912)
Adds serialization tests for SnapshotLifecyclePolicyItem.
* Fix improper import in build.gradle after master merge
* Add human readable version of modified date for snapshot lifecycle policy (#41035)
* Add human readable version of modified date for snapshot lifecycle policy
This small change changes it from:
```
...
"modified_date": 1554843903242,
...
```
To
```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```
Including the `"modified_date"` field when the `?human` field is used.
Relates to #38461
* Fix test
* Add API to execute SLM policy on demand (#41038)
This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.
```json
PUT /_ilm/snapshot/<policy>/_execute
```
And it returns the response with the generated snapshot name:
```json
{
"snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```
Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.
Relates to #38461
* Add next_execution to SLM policy metadata (#41221)
* Add next_execution to SLM policy metadata
This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:
```json
GET /_ilm/snapshot?human
{
"production" : {
"version" : 1,
"modified_date" : "2019-04-15T21:16:21.865Z",
"modified_date_millis" : 1555362981865,
"policy" : {
"name" : "<production-snap-{now/d}>",
"schedule" : "*/30 * * * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"foo-*",
"important"
],
"ignore_unavailable" : true,
"include_global_state" : false
}
},
"next_execution" : "2019-04-15T21:16:30.000Z",
"next_execution_millis" : 1555362990000
},
"other" : {
"version" : 1,
"modified_date" : "2019-04-15T21:12:19.959Z",
"modified_date_millis" : 1555362739959,
"policy" : {
"name" : "<other-snap-{now/d}>",
"schedule" : "0 30 2 * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"other"
],
"ignore_unavailable" : false,
"include_global_state" : true
}
},
"next_execution" : "2019-04-16T02:30:00.000Z",
"next_execution_millis" : 1555381800000
}
}
```
Relates to #38461
* Fix and enhance tests
* Figured out how to Cron
* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)
This commit changes the endpoint for snapshot lifecycle management from:
```
GET /_ilm/snapshot/<policy>
```
to:
```
GET /_slm/policy/<policy>
```
It mimics the ILM path only using `slm` instead of `ilm`.
Relates to #38461
* Add initial documentation for SLM (#41510)
* Add initial documentation for SLM
This adds the initial documentation for snapshot lifecycle management.
It also includes the REST spec API json files since they're sort of
documentation.
Relates to #38461
* Add `manage_slm` and `read_slm` roles (#41607)
* Add `manage_slm` and `read_slm` roles
This adds two more built in roles -
`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.
`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.
Relates to #38461
* Add execute to the test
* Fix ilm -> slm typo in test
* Record SLM history into an index (#41707)
It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.
This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.
Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s. Watcher would cause the same problem, but it
is already disabled (for the same reason).
* High Level Rest Client support for SLM (#41767)
* High Level Rest Client support for SLM
This commit add HLRC support for SLM.
Relates to #38461
* Fill out documentation tests with tags
* Add more callouts and asciidoc for HLRC
* Update javadoc links to real locations
* Add security test testing SLM cluster privileges (#42678)
* Add security test testing SLM cluster privileges
This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.
Relates to #38461
* Don't redefine vars
* Add Getting Started Guide for SLM (#42878)
This commit adds a basic Getting Started Guide for SLM.
* Include SLM policy name in Snapshot metadata (#43132)
Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.
* Fix compilation after master merge
* [TEST] Move exception wrapping for devious exception throwing
Fixes an issue where an exception was created from one line and thrown in another.
* Fix SLM for the change to AcknowledgedResponse
* Add Snapshot Lifecycle Management Package Docs (#43535)
* Fix compilation for transport actions now that task is required
* Add a note mentioning the privileges needed for SLM (#43708)
* Add a note mentioning the privileges needed for SLM
This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.
Relates to #38461
* Mention that you can create snapshots for indices you can't read
* Fix REST tests for new number of cluster privileges
* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)
* Fix SnapshotHistoryStoreTests after merge
* Remove overridden newResponse functions that have been removed
* Fix compilation for backport
* Fix get snapshot output parsing in test
* [DOCS] Add redirects for removed autogen anchors (#44380)
* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
|
|
|
[role="xpack"]
|
|
|
|
[testenv="basic"]
|
|
|
|
[[snapshot-lifecycle-management-api]]
|
2019-07-19 14:35:36 -04:00
|
|
|
== Snapshot lifecycle management API
|
[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)
* Add SnapshotLifecycleService and related CRUD APIs
This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.
This also includes the get, put, and delete APIs for these policies
Relates to #38461
* Make scheduledJobIds return an immutable set
* Use Object.equals for SnapshotLifecyclePolicy
* Remove unneeded TODO
* Implement ToXContentFragment on SnapshotLifecyclePolicyItem
* Copy contents of the scheduledJobIds
* Handle snapshot lifecycle policy updates and deletions (#40062)
(Note this is a PR against the `snapshot-lifecycle-management` feature branch)
This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.
Relates to #38461
* Take a snapshot for the policy when the SLM policy is triggered (#40383)
(This is a PR for the `snapshot-lifecycle-management` branch)
This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.
This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.
Relates to #38461
* Record most recent snapshot policy success/failure (#40619)
Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.
This is the first step toward writing history in a more permanent
fashion.
* Validate snapshot lifecycle policies (#40654)
(This is a PR against the `snapshot-lifecycle-management` branch)
With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.
Part of #38461
* Hook SLM into ILM's start and stop APIs (#40871)
(This pull request is for the `snapshot-lifecycle-management` branch)
This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.
Relates to #38461
* Add tests for SnapshotLifecyclePolicyItem (#40912)
Adds serialization tests for SnapshotLifecyclePolicyItem.
* Fix improper import in build.gradle after master merge
* Add human readable version of modified date for snapshot lifecycle policy (#41035)
* Add human readable version of modified date for snapshot lifecycle policy
This small change changes it from:
```
...
"modified_date": 1554843903242,
...
```
To
```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```
Including the `"modified_date"` field when the `?human` field is used.
Relates to #38461
* Fix test
* Add API to execute SLM policy on demand (#41038)
This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.
```json
PUT /_ilm/snapshot/<policy>/_execute
```
And it returns the response with the generated snapshot name:
```json
{
"snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```
Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.
Relates to #38461
* Add next_execution to SLM policy metadata (#41221)
* Add next_execution to SLM policy metadata
This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:
```json
GET /_ilm/snapshot?human
{
"production" : {
"version" : 1,
"modified_date" : "2019-04-15T21:16:21.865Z",
"modified_date_millis" : 1555362981865,
"policy" : {
"name" : "<production-snap-{now/d}>",
"schedule" : "*/30 * * * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"foo-*",
"important"
],
"ignore_unavailable" : true,
"include_global_state" : false
}
},
"next_execution" : "2019-04-15T21:16:30.000Z",
"next_execution_millis" : 1555362990000
},
"other" : {
"version" : 1,
"modified_date" : "2019-04-15T21:12:19.959Z",
"modified_date_millis" : 1555362739959,
"policy" : {
"name" : "<other-snap-{now/d}>",
"schedule" : "0 30 2 * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"other"
],
"ignore_unavailable" : false,
"include_global_state" : true
}
},
"next_execution" : "2019-04-16T02:30:00.000Z",
"next_execution_millis" : 1555381800000
}
}
```
Relates to #38461
* Fix and enhance tests
* Figured out how to Cron
* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)
This commit changes the endpoint for snapshot lifecycle management from:
```
GET /_ilm/snapshot/<policy>
```
to:
```
GET /_slm/policy/<policy>
```
It mimics the ILM path only using `slm` instead of `ilm`.
Relates to #38461
* Add initial documentation for SLM (#41510)
* Add initial documentation for SLM
This adds the initial documentation for snapshot lifecycle management.
It also includes the REST spec API json files since they're sort of
documentation.
Relates to #38461
* Add `manage_slm` and `read_slm` roles (#41607)
* Add `manage_slm` and `read_slm` roles
This adds two more built in roles -
`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.
`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.
Relates to #38461
* Add execute to the test
* Fix ilm -> slm typo in test
* Record SLM history into an index (#41707)
It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.
This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.
Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s. Watcher would cause the same problem, but it
is already disabled (for the same reason).
* High Level Rest Client support for SLM (#41767)
* High Level Rest Client support for SLM
This commit add HLRC support for SLM.
Relates to #38461
* Fill out documentation tests with tags
* Add more callouts and asciidoc for HLRC
* Update javadoc links to real locations
* Add security test testing SLM cluster privileges (#42678)
* Add security test testing SLM cluster privileges
This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.
Relates to #38461
* Don't redefine vars
* Add Getting Started Guide for SLM (#42878)
This commit adds a basic Getting Started Guide for SLM.
* Include SLM policy name in Snapshot metadata (#43132)
Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.
* Fix compilation after master merge
* [TEST] Move exception wrapping for devious exception throwing
Fixes an issue where an exception was created from one line and thrown in another.
* Fix SLM for the change to AcknowledgedResponse
* Add Snapshot Lifecycle Management Package Docs (#43535)
* Fix compilation for transport actions now that task is required
* Add a note mentioning the privileges needed for SLM (#43708)
* Add a note mentioning the privileges needed for SLM
This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.
Relates to #38461
* Mention that you can create snapshots for indices you can't read
* Fix REST tests for new number of cluster privileges
* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)
* Fix SnapshotHistoryStoreTests after merge
* Remove overridden newResponse functions that have been removed
* Fix compilation for backport
* Fix get snapshot output parsing in test
* [DOCS] Add redirects for removed autogen anchors (#44380)
* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
|
|
|
|
|
|
|
The Snapshot Lifecycle Management APIs are used to manage policies for the time
|
|
|
|
and frequency of automatic snapshots. Snapshot Lifecycle Management is related
|
|
|
|
to <<index-lifecycle-management,Index Lifecycle Management>>, however, instead
|
|
|
|
of managing a lifecycle of actions that are performed on a single index, SLM
|
|
|
|
allows configuring policies spanning multiple indices.
|
|
|
|
|
|
|
|
SLM policy management is split into three different CRUD APIs, a way to put or update
|
|
|
|
policies, a way to retrieve policies, and a way to delete unwanted policies, as
|
|
|
|
well as a separate API for immediately invoking a snapshot based on a policy.
|
|
|
|
|
|
|
|
Since SLM falls under the same category as ILM, it is stopped and started by
|
|
|
|
using the <<start-stop-ilm,start and stop>> ILM APIs.
|
|
|
|
|
|
|
|
[[slm-api-put]]
|
|
|
|
=== Put Snapshot Lifecycle Policy API
|
|
|
|
|
|
|
|
Creates or updates a snapshot policy. If the policy already exists, the version
|
|
|
|
is incremented. Only the latest version of a policy is stored.
|
|
|
|
|
|
|
|
When a policy is created it is immediately scheduled based on the schedule of
|
|
|
|
the policy, when a policy is updated its schedule changes are immediately
|
|
|
|
applied.
|
|
|
|
|
|
|
|
==== Path Parameters
|
|
|
|
|
|
|
|
`policy_id` (required)::
|
|
|
|
(string) Identifier (id) for the policy.
|
|
|
|
|
|
|
|
==== Request Parameters
|
|
|
|
|
2019-08-02 08:42:33 -04:00
|
|
|
include::{docdir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
|
[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)
* Add SnapshotLifecycleService and related CRUD APIs
This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.
This also includes the get, put, and delete APIs for these policies
Relates to #38461
* Make scheduledJobIds return an immutable set
* Use Object.equals for SnapshotLifecyclePolicy
* Remove unneeded TODO
* Implement ToXContentFragment on SnapshotLifecyclePolicyItem
* Copy contents of the scheduledJobIds
* Handle snapshot lifecycle policy updates and deletions (#40062)
(Note this is a PR against the `snapshot-lifecycle-management` feature branch)
This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.
Relates to #38461
* Take a snapshot for the policy when the SLM policy is triggered (#40383)
(This is a PR for the `snapshot-lifecycle-management` branch)
This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.
This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.
Relates to #38461
* Record most recent snapshot policy success/failure (#40619)
Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.
This is the first step toward writing history in a more permanent
fashion.
* Validate snapshot lifecycle policies (#40654)
(This is a PR against the `snapshot-lifecycle-management` branch)
With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.
Part of #38461
* Hook SLM into ILM's start and stop APIs (#40871)
(This pull request is for the `snapshot-lifecycle-management` branch)
This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.
Relates to #38461
* Add tests for SnapshotLifecyclePolicyItem (#40912)
Adds serialization tests for SnapshotLifecyclePolicyItem.
* Fix improper import in build.gradle after master merge
* Add human readable version of modified date for snapshot lifecycle policy (#41035)
* Add human readable version of modified date for snapshot lifecycle policy
This small change changes it from:
```
...
"modified_date": 1554843903242,
...
```
To
```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```
Including the `"modified_date"` field when the `?human` field is used.
Relates to #38461
* Fix test
* Add API to execute SLM policy on demand (#41038)
This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.
```json
PUT /_ilm/snapshot/<policy>/_execute
```
And it returns the response with the generated snapshot name:
```json
{
"snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```
Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.
Relates to #38461
* Add next_execution to SLM policy metadata (#41221)
* Add next_execution to SLM policy metadata
This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:
```json
GET /_ilm/snapshot?human
{
"production" : {
"version" : 1,
"modified_date" : "2019-04-15T21:16:21.865Z",
"modified_date_millis" : 1555362981865,
"policy" : {
"name" : "<production-snap-{now/d}>",
"schedule" : "*/30 * * * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"foo-*",
"important"
],
"ignore_unavailable" : true,
"include_global_state" : false
}
},
"next_execution" : "2019-04-15T21:16:30.000Z",
"next_execution_millis" : 1555362990000
},
"other" : {
"version" : 1,
"modified_date" : "2019-04-15T21:12:19.959Z",
"modified_date_millis" : 1555362739959,
"policy" : {
"name" : "<other-snap-{now/d}>",
"schedule" : "0 30 2 * * ?",
"repository" : "repo",
"config" : {
"indices" : [
"other"
],
"ignore_unavailable" : false,
"include_global_state" : true
}
},
"next_execution" : "2019-04-16T02:30:00.000Z",
"next_execution_millis" : 1555381800000
}
}
```
Relates to #38461
* Fix and enhance tests
* Figured out how to Cron
* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)
This commit changes the endpoint for snapshot lifecycle management from:
```
GET /_ilm/snapshot/<policy>
```
to:
```
GET /_slm/policy/<policy>
```
It mimics the ILM path only using `slm` instead of `ilm`.
Relates to #38461
* Add initial documentation for SLM (#41510)
* Add initial documentation for SLM
This adds the initial documentation for snapshot lifecycle management.
It also includes the REST spec API json files since they're sort of
documentation.
Relates to #38461
* Add `manage_slm` and `read_slm` roles (#41607)
* Add `manage_slm` and `read_slm` roles
This adds two more built in roles -
`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.
`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.
Relates to #38461
* Add execute to the test
* Fix ilm -> slm typo in test
* Record SLM history into an index (#41707)
It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.
This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.
Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s. Watcher would cause the same problem, but it
is already disabled (for the same reason).
* High Level Rest Client support for SLM (#41767)
* High Level Rest Client support for SLM
This commit add HLRC support for SLM.
Relates to #38461
* Fill out documentation tests with tags
* Add more callouts and asciidoc for HLRC
* Update javadoc links to real locations
* Add security test testing SLM cluster privileges (#42678)
* Add security test testing SLM cluster privileges
This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.
Relates to #38461
* Don't redefine vars
* Add Getting Started Guide for SLM (#42878)
This commit adds a basic Getting Started Guide for SLM.
* Include SLM policy name in Snapshot metadata (#43132)
Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.
* Fix compilation after master merge
* [TEST] Move exception wrapping for devious exception throwing
Fixes an issue where an exception was created from one line and thrown in another.
* Fix SLM for the change to AcknowledgedResponse
* Add Snapshot Lifecycle Management Package Docs (#43535)
* Fix compilation for transport actions now that task is required
* Add a note mentioning the privileges needed for SLM (#43708)
* Add a note mentioning the privileges needed for SLM
This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.
Relates to #38461
* Mention that you can create snapshots for indices you can't read
* Fix REST tests for new number of cluster privileges
* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)
* Fix SnapshotHistoryStoreTests after merge
* Remove overridden newResponse functions that have been removed
* Fix compilation for backport
* Fix get snapshot output parsing in test
* [DOCS] Add redirects for removed autogen anchors (#44380)
* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
|
|
|
|
|
|
|
==== Authorization
|
|
|
|
|
|
|
|
You must have the `manage_slm` cluster privilege to use this API. You must also
|
|
|
|
have the `manage` index privilege on all indices being managed by `policy`. All
|
|
|
|
operations executed by {slm} for a policy are executed as the user that put the
|
|
|
|
latest version of a policy. For more information, see
|
|
|
|
{stack-ov}/security-privileges.html[Security Privileges].
|
|
|
|
|
|
|
|
==== Example
|
|
|
|
|
|
|
|
The following creates a snapshot lifecycle policy with an id of
|
|
|
|
`daily-snapshots`:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
PUT /_slm/policy/daily-snapshots
|
|
|
|
{
|
|
|
|
"schedule": "0 30 1 * * ?", <1>
|
|
|
|
"name": "<daily-snap-{now/d}>", <2>
|
|
|
|
"repository": "my_repository", <3>
|
|
|
|
"config": { <4>
|
|
|
|
"indices": ["data-*", "important"], <5>
|
|
|
|
"ignore_unavailable": false,
|
|
|
|
"include_global_state": false
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[setup:setup-repository]
|
|
|
|
<1> When the snapshot should be taken, in this case, 1:30am daily
|
|
|
|
<2> The name each snapshot should be given
|
|
|
|
<3> Which repository to take the snapshot in
|
|
|
|
<4> Any extra snapshot configuration
|
|
|
|
<5> Which indices the snapshot should contain
|
|
|
|
|
|
|
|
The top-level keys that the policy supports are described below:
|
|
|
|
|
|
|
|
|==================
|
|
|
|
| Key | Description
|
|
|
|
|
|
|
|
| `schedule` | A periodic or absolute time schedule. Supports all values
|
|
|
|
supported by the cron scheduler:
|
|
|
|
{xpack-ref}/trigger-schedule.html#schedule-cron[Cron scheduler configuration]
|
|
|
|
|
|
|
|
| `name` | A name automatically given to each snapshot performed by this policy.
|
|
|
|
Supports the same <<date-math-index-names,date math>> supported in index
|
|
|
|
names. A UUID is automatically appended to the end of the name to prevent
|
|
|
|
conflicting snapshot names.
|
|
|
|
|
|
|
|
| `repository` | The snapshot repository that will contain snapshots created by
|
|
|
|
this policy. The repository must exist prior to the policy's creation and can
|
|
|
|
be created with the <<modules-snapshots,snapshot repository API>>.
|
|
|
|
|
|
|
|
| `config` | Configuration for each snapshot that will be created by this
|
|
|
|
policy. Any configuration is included with <<modules-snapshots,create snapshot
|
|
|
|
requests>> issued by this policy.
|
|
|
|
|==================
|
|
|
|
|
|
|
|
To update an existing policy, simply use the put snapshot lifecycle policy API
|
|
|
|
with the same policy id as an existing policy.
|
|
|
|
|
|
|
|
[[slm-api-get]]
|
|
|
|
=== Get Snapshot Lifecycle Policy API
|
|
|
|
|
|
|
|
Once a policy is in place, you can retrieve one or more of the policies using
|
|
|
|
the get snapshot lifecycle policy API. This also includes information about the
|
|
|
|
latest successful and failed invocation that the automatic snapshots have taken.
|
|
|
|
|
|
|
|
==== Path Parameters
|
|
|
|
|
|
|
|
`policy_ids` (optional)::
|
|
|
|
(string) Comma-separated ids of policies to retrieve.
|
|
|
|
|
|
|
|
==== Examples
|
|
|
|
|
|
|
|
To retrieve a policy, perform a `GET` with the policy's id
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
GET /_slm/policy/daily-snapshots?human
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[continued]
|
|
|
|
|
|
|
|
The output looks similar to the following:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"daily-snapshots" : {
|
|
|
|
"version": 1, <1>
|
|
|
|
"modified_date": "2019-04-23T01:30:00.000Z", <2>
|
|
|
|
"modified_date_millis": 1556048137314,
|
|
|
|
"policy" : {
|
|
|
|
"schedule": "0 30 1 * * ?",
|
|
|
|
"name": "<daily-snap-{now/d}>",
|
|
|
|
"repository": "my_repository",
|
|
|
|
"config": {
|
|
|
|
"indices": ["data-*", "important"],
|
|
|
|
"ignore_unavailable": false,
|
|
|
|
"include_global_state": false
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"next_execution": "2019-04-24T01:30:00.000Z", <3>
|
|
|
|
"next_execution_millis": 1556048160000
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// TESTRESPONSE[s/"modified_date": "2019-04-23T01:30:00.000Z"/"modified_date": $body.daily-snapshots.modified_date/ s/"modified_date_millis": 1556048137314/"modified_date_millis": $body.daily-snapshots.modified_date_millis/ s/"next_execution": "2019-04-24T01:30:00.000Z"/"next_execution": $body.daily-snapshots.next_execution/ s/"next_execution_millis": 1556048160000/"next_execution_millis": $body.daily-snapshots.next_execution_millis/]
|
|
|
|
<1> The version of the snapshot policy, only the latest verison is stored and incremented when the policy is updated
|
|
|
|
<2> The last time this policy was modified
|
|
|
|
<3> The next time this policy will be executed
|
|
|
|
|
|
|
|
Or, to retrieve all policies:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
GET /_slm/policy
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[continued]
|
|
|
|
|
|
|
|
[[slm-api-execute]]
|
|
|
|
=== Execute Snapshot Lifecycle Policy API
|
|
|
|
|
|
|
|
Sometimes it can be useful to immediately execute a snapshot based on policy,
|
|
|
|
perhaps before an upgrade or before performing other maintenance on indices. The
|
|
|
|
execute snapshot policy API allows you to perform a snapshot immediately without
|
|
|
|
waiting for a policy's scheduled invocation.
|
|
|
|
|
|
|
|
==== Path Parameters
|
|
|
|
|
|
|
|
`policy_id` (required)::
|
|
|
|
(string) Id of the policy to execute
|
|
|
|
|
|
|
|
==== Example
|
|
|
|
|
|
|
|
To take an immediate snapshot using a policy, use the following
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
PUT /_slm/policy/daily-snapshots/_execute
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[skip:we can't easily handle snapshots from docs tests]
|
|
|
|
|
|
|
|
This API will immediately return with the generated snapshot name
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"snapshot_name": "daily-snap-2019.04.24-gwrqoo2xtea3q57vvg0uea"
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// TESTRESPONSE[skip:we can't handle snapshots from docs tests]
|
|
|
|
|
|
|
|
The snapshot will be taken in the background, you can use the
|
|
|
|
<<modules-snapshots,snapshot APIs>> to monitor the status of the snapshot.
|
|
|
|
|
|
|
|
Once a snapshot has been kicked off, you can see the latest successful or failed
|
|
|
|
snapshot using the get snapshot lifecycle policy API:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
GET /_slm/policy/daily-snapshots?human
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[skip:we already tested get policy above, the last_failure may not be present though]
|
|
|
|
|
|
|
|
Which, in this case shows an error because the index did not exist:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"daily-snapshots" : {
|
|
|
|
"version": 1,
|
|
|
|
"modified_date": "2019-04-23T01:30:00.000Z",
|
|
|
|
"modified_date_millis": 1556048137314,
|
|
|
|
"policy" : {
|
|
|
|
"schedule": "0 30 1 * * ?",
|
|
|
|
"name": "<daily-snap-{now/d}>",
|
|
|
|
"repository": "my_repository",
|
|
|
|
"config": {
|
|
|
|
"indices": ["data-*", "important"],
|
|
|
|
"ignore_unavailable": false,
|
|
|
|
"include_global_state": false
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"last_failure": { <1>
|
|
|
|
"snapshot_name": "daily-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
|
|
|
|
"time_string": "2019-04-02T01:30:00.000Z",
|
|
|
|
"time": 1556042030000,
|
|
|
|
"details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
|
|
|
|
} ,
|
|
|
|
"next_execution": "2019-04-24T01:30:00.000Z",
|
|
|
|
"next_execution_millis": 1556048160000
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// TESTRESPONSE[skip:the presence of last_failure is asynchronous and will be present for users, but is untestable]
|
|
|
|
<1> The last unsuccessfully initiated snapshot by this policy, along with the details of its failure
|
|
|
|
|
|
|
|
In this case, it failed due to the "important" index not existing and
|
|
|
|
`ignore_unavailable` setting being set to `false`.
|
|
|
|
|
|
|
|
Updating the policy to change the `ignore_unavailable` setting is done using the
|
|
|
|
same put snapshot lifecycle policy API:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
PUT /_slm/policy/daily-snapshots
|
|
|
|
{
|
|
|
|
"schedule": "0 30 1 * * ?",
|
|
|
|
"name": "<daily-snap-{now/d}>",
|
|
|
|
"repository": "my_repository",
|
|
|
|
"config": {
|
|
|
|
"indices": ["data-*", "important"],
|
|
|
|
"ignore_unavailable": true,
|
|
|
|
"include_global_state": false
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[continued]
|
|
|
|
|
|
|
|
Another snapshot can immediately be executed to ensure the new policy works:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
PUT /_slm/policy/daily-snapshots/_execute
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[skip:we can't handle snapshots in docs tests]
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"snapshot_name": "daily-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a"
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// TESTRESPONSE[skip:we can't handle snapshots in docs tests]
|
|
|
|
|
|
|
|
Now retriving the policy shows that the policy has successfully been executed:
|
|
|
|
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
GET /_slm/policy/daily-snapshots?human
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[skip:we already tested this above and the output may not be available yet]
|
|
|
|
|
|
|
|
Which now includes the successful snapshot information:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"daily-snapshots" : {
|
|
|
|
"version": 2, <1>
|
|
|
|
"modified_date": "2019-04-23T01:30:00.000Z",
|
|
|
|
"modified_date_millis": 1556048137314,
|
|
|
|
"policy" : {
|
|
|
|
"schedule": "0 30 1 * * ?",
|
|
|
|
"name": "<daily-snap-{now/d}>",
|
|
|
|
"repository": "my_repository",
|
|
|
|
"config": {
|
|
|
|
"indices": ["data-*", "important"],
|
|
|
|
"ignore_unavailable": true,
|
|
|
|
"include_global_state": false
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"last_success": { <2>
|
|
|
|
"snapshot_name": "daily-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a",
|
|
|
|
"time_string": "2019-04-24T16:43:49.316Z",
|
|
|
|
"time": 1556124229316
|
|
|
|
} ,
|
|
|
|
"last_failure": {
|
|
|
|
"snapshot_name": "daily-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
|
|
|
|
"time_string": "2019-04-02T01:30:00.000Z",
|
|
|
|
"time": 1556042030000,
|
|
|
|
"details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
|
|
|
|
} ,
|
|
|
|
"next_execution": "2019-04-24T01:30:00.000Z",
|
|
|
|
"next_execution_millis": 1556048160000
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
|
|
|
|
<1> The policy's version has been incremented because it was updated
|
|
|
|
<2> The last successfully initiated snapshot information
|
|
|
|
|
|
|
|
It is a good idea to test policies using the execute API to ensure they work.
|
|
|
|
|
|
|
|
[[slm-api-delete]]
|
|
|
|
=== Delete Snapshot Lifecycle Policy API
|
|
|
|
|
|
|
|
A policy can be deleted by issuing a delete request with the policy id. Note
|
|
|
|
that this prevents any future snapshots from being taken, but does not cancel
|
|
|
|
any currently ongoing snapshots or remove any previously taken snapshots.
|
|
|
|
|
|
|
|
==== Path Parameters
|
|
|
|
|
|
|
|
`policy_id` (optional)::
|
|
|
|
(string) Id of the policy to remove.
|
|
|
|
|
|
|
|
==== Example
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
DELETE /_slm/policy/daily-snapshots
|
|
|
|
--------------------------------------------------
|
|
|
|
// CONSOLE
|
|
|
|
// TEST[continued]
|